Topic

Mahalanobis distance

About: Mahalanobis distance is a research topic. Over its lifetime, 4,616 publications have been published within this topic, receiving 95,294 citations.


Papers
Proceedings ArticleDOI
10 Dec 2002
TL;DR: Efficient post-processing techniques, namely noise removal, shape criteria, elliptic curve fitting, and face/non-face classification, are proposed in order to further refine skin segmentation results for the purpose of face detection.
Abstract: This paper presents a new human skin color model in YCbCr color space and its application to human face detection. Skin colors are modeled by a set of three Gaussian clusters, each of which is characterized by a centroid and a covariance matrix. The centroids and covariance matrices are estimated from a large set of training samples after a k-means clustering process. Pixels in a color input image can be classified as skin or non-skin based on their Mahalanobis distances to the three clusters. Efficient post-processing techniques, namely noise removal, shape criteria, elliptic curve fitting, and face/non-face classification, are proposed in order to further refine skin segmentation results for the purpose of face detection.
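The pixel-classification step described above can be sketched as follows. The cluster parameters and the distance threshold below are illustrative placeholders, not the paper's fitted values, which would come from k-means on a large training set:

```python
import numpy as np

def mahalanobis_sq(x, centroid, cov):
    """Squared Mahalanobis distance from a pixel to one Gaussian skin cluster."""
    d = x - centroid
    return float(d @ np.linalg.inv(cov) @ d)

def is_skin(pixel_cbcr, clusters, threshold=9.0):
    """Classify a (Cb, Cr) pixel as skin if it lies within `threshold` of any
    cluster. The threshold value here is a hypothetical choice, not the paper's."""
    return any(mahalanobis_sq(pixel_cbcr, c, s) < threshold for c, s in clusters)

# Illustrative (centroid, covariance) pairs -- placeholders for the three
# Gaussian skin clusters the paper estimates from training data.
clusters = [
    (np.array([110.0, 150.0]), np.array([[25.0, 5.0], [5.0, 20.0]])),
    (np.array([115.0, 145.0]), np.array([[30.0, 2.0], [2.0, 15.0]])),
    (np.array([105.0, 155.0]), np.array([[20.0, 4.0], [4.0, 25.0]])),
]
print(is_skin(np.array([111.0, 149.0]), clusters))  # True: near the first centroid
```

Using the covariance (rather than plain Euclidean distance) lets each cluster's decision region stretch along the directions in which skin chrominance naturally varies.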

287 citations

Journal ArticleDOI
01 Jan 1996
TL;DR: Experiments show that the HEC network leads to a significant improvement in the clustering results over the K-means algorithm with Euclidean distance, and results on real data sets indicate that hyperellipsoidal shaped clusters are often encountered in practice.
Abstract: We propose a self-organizing network for hyperellipsoidal clustering (HEC). It consists of two layers. The first employs a number of principal component analysis subnetworks to estimate the hyperellipsoidal shapes of currently formed clusters. The second performs competitive learning using the cluster shape information from the first. The network performs partitional clustering using the proposed regularized Mahalanobis distance, which was designed to deal with the problems in estimating the Mahalanobis distance when the number of patterns in a cluster is less than, or not considerably larger than, the dimensionality of the feature space during clustering. This distance also achieves a tradeoff between hyperspherical and hyperellipsoidal cluster shapes so as to prevent the HEC network from producing unusually large or small clusters. The significance level of the Kolmogorov-Smirnov test on the distribution of the Mahalanobis distances of patterns in a cluster to the cluster center, under the Gaussian cluster assumption, is used as a compactness measure. The HEC network has been tested on a number of artificial and real data sets. We also apply the HEC network to texture segmentation problems. Experiments show that the HEC network leads to a significant improvement in the clustering results over the K-means algorithm with Euclidean distance. Our results on real data sets also indicate that hyperellipsoidal shaped clusters are often encountered in practice.
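The regularization idea described here interpolates between hyperellipsoidal (full covariance) and hyperspherical (identity-like) cluster shapes. A minimal sketch of that idea, shrinking the covariance toward a scaled identity; the blending parameter and shrinkage target are illustrative, not the paper's exact formulation:

```python
import numpy as np

def regularized_mahalanobis(x, mean, cov, lam=0.5):
    """Distance under a covariance shrunk toward a scaled identity.
    lam=0 gives the plain Mahalanobis distance (hyperellipsoidal);
    lam=1 gives a scaled Euclidean distance (hyperspherical).
    The shrinkage target and lam are illustrative assumptions."""
    k = cov.shape[0]
    target = (np.trace(cov) / k) * np.eye(k)   # spherical covariance, same scale
    cov_reg = (1.0 - lam) * cov + lam * target
    diff = x - mean
    return float(np.sqrt(diff @ np.linalg.inv(cov_reg) @ diff))

x = np.array([3.0, 1.0])
mean = np.array([1.0, 1.0])
cov = np.array([[4.0, 0.0], [0.0, 1.0]])
print(regularized_mahalanobis(x, mean, cov, lam=0.0))  # 1.0 (plain Mahalanobis)
```

Shrinking toward the identity keeps the regularized covariance well-conditioned even when a cluster holds fewer patterns than the feature dimensionality, which is exactly the failure mode the paper's distance is designed to avoid.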

287 citations

Journal ArticleDOI
TL;DR: This paper describes the development and implementation of a line-segment-based token tracker that combines prediction and matching steps; its performance is illustrated in several experiments on noisy synthetic data and real scenes obtained from the INRIA mobile robot.

287 citations

Journal ArticleDOI
TL;DR: A method that enables scalable similarity search for learned metrics and an indirect solution that enables metric learning and hashing for vector spaces whose high dimensionality makes it infeasible to learn an explicit transformation over the feature dimensions.
Abstract: We introduce a method that enables scalable similarity search for learned metrics. Given pairwise similarity and dissimilarity constraints between some examples, we learn a Mahalanobis distance function that captures the examples' underlying relationships well. To allow sublinear time similarity search under the learned metric, we show how to encode the learned metric parameterization into randomized locality-sensitive hash functions. We further formulate an indirect solution that enables metric learning and hashing for vector spaces whose high dimensionality makes it infeasible to learn an explicit transformation over the feature dimensions. We demonstrate the approach applied to a variety of image data sets, as well as a systems data set. The learned metrics improve accuracy relative to commonly used metric baselines, while our hashing construction enables efficient indexing with learned distances and very large databases.
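One common way to realize such a scheme, sketched under the standard assumption that a learned Mahalanobis metric factors as M = AᵀA: map points through A and apply random-hyperplane LSH in the transformed space. The matrices below are random placeholders, not metrics learned from similarity constraints as in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical learned metric M = A^T A; here A is random rather than learned.
d, n_bits = 4, 16
A = rng.normal(size=(d, d))
R = rng.normal(size=(n_bits, d))  # random hyperplanes for sign-based LSH

def mahalanobis_dist(x, y):
    """Learned-metric distance: ||A(x - y)|| = sqrt((x-y)^T M (x-y))."""
    return float(np.linalg.norm(A @ (x - y)))

def hash_code(x):
    """Random-hyperplane LSH applied to Ax, so collision probability
    tracks angles under the learned metric rather than the raw features."""
    return tuple(bool(b) for b in (R @ (A @ x)) > 0)

x = rng.normal(size=d)
y = x + 0.01 * rng.normal(size=d)  # near-duplicate of x
z = rng.normal(size=d)             # unrelated point
# Near pairs agree on most hash bits; distant pairs typically differ on more,
# which is what makes sublinear-time candidate retrieval possible.
print(mahalanobis_dist(x, y) < mahalanobis_dist(x, z))  # True
```

Storing only the bit codes lets a query probe a hash table instead of scanning the whole database, which is the sublinear-time search the paper targets.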

281 citations

Book
01 Jan 2002
TL;DR: This book presents the Mahalanobis-Taguchi System (MTS) and the Gram-Schmidt-based MTGS method for constructing multivariate measurement scales, compares them with established multivariate techniques, and illustrates them with industrial and medical case studies.
Abstract: Preface. Acknowledgments. Terms and Symbols. Definitions of Mathematical and Statistical Terms.
1 Introduction. 1.1 The Goal. 1.2 The Nature of a Multidimensional System. 1.2.1 Description of Multidimensional Systems. 1.2.2 Correlations between the Variables. 1.2.3 Mahalanobis Distance. 1.2.4 Robust Engineering/Taguchi Methods. 1.3 Multivariate Diagnosis: The State of the Art. 1.3.1 Principal Component Analysis. 1.3.2 Discrimination and Classification Method. 1.3.3 Stepwise Regression. 1.3.4 Test of Additional Information (Rao's Test). 1.3.5 Multiple Regression. 1.3.6 Multivariate Process Control Charts. 1.3.7 Artificial Neural Networks. 1.4 Approach. 1.4.1 Classification versus Measurement. 1.4.2 Normals versus Abnormals. 1.4.3 Probabilistic versus Data Analytic. 1.4.4 Dimensionality Reduction. 1.5 Refining the Solution Strategy. 1.6 Guide to This Book.
2 MTS and MTGS. 2.1 A Discussion of Mahalanobis Distance. 2.2 Objectives of MTS and MTGS. 2.2.1 Mahalanobis Distance (Inverse Matrix Method). 2.2.2 Gram-Schmidt Orthogonalization Process. 2.2.3 Proof That Equations 2.2 and 2.3 Are the Same. 2.2.4 Calculation of the Mean of the Mahalanobis Space. 2.3 Steps in MTS. 2.4 Steps in MTGS. 2.5 Discussion of Medical Diagnosis Data: Use of MTGS and MTS Methods. 2.6 Conclusions.
3 Advantages and Limitations of MTS and MTGS. 3.1 Direction of Abnormalities. 3.1.1 The Gram-Schmidt Process. 3.1.2 Identification of the Direction of Abnormals. 3.1.3 Decision Rule for Higher Dimensions. 3.2 Example of a Graduate Admission System. 3.3 Multicollinearity. 3.4 A Discussion of Partial Correlations. 3.5 Conclusions.
4 Role of Orthogonal Arrays and Signal-to-Noise Ratios in Multivariate Diagnosis. 4.1 Role of Orthogonal Arrays. 4.2 Role of S/N Ratios. 4.3 Advantages of S/N Ratios. 4.3.1 S/N Ratio as a Simple Measure to Identify Useful Variables. 4.3.2 S/N Ratio as a Measure of Functionality of the System. 4.3.3 S/N Ratio to Predict the Given Conditions. 4.4 Conclusions.
5 Treatment of Categorical Data in MTS/MTGS Methods. 5.1 MTS/MTGS with Categorical Data. 5.2 A Sales and Marketing Application. 5.2.1 Selection of Suitable Variables. 5.2.2 Description of the Variables. 5.2.3 Construction of Mahalanobis Space. 5.2.4 Validation of the Measurement Scale. 5.2.5 Identification of Useful Variables (Developing Stage). 5.2.6 S/N Ratio of the System (Before and After). 5.3 Conclusions.
6 MTS/MTGS under a Noise Environment. 6.1 MTS/MTGS with Noise Factors. 6.1.1 Treat Each Level of the Noise Factor Separately. 6.1.2 Include the Noise Factor as One of the Variables. 6.1.3 Combine Variables of Different Levels of the Noise Factor. 6.1.4 Do Not Consider the Noise Factor If It Cannot Be Measured. 6.2 Conclusions.
7 Determination of Thresholds: A Loss Function Approach. 7.1 Why a Threshold Is Required in MTS/MTGS. 7.2 Quadratic Loss Function. 7.2.1 QLF for the Nominal-the-Best Characteristic. 7.2.2 QLF for the Larger-the-Better Characteristic. 7.2.3 QLF for the Smaller-the-Better Characteristic. 7.3 QLF for MTS/MTGS. 7.3.1 Determination of Threshold. 7.3.2 When Only Good Abnormals Are Present. 7.4 Examples. 7.4.1 Medical Diagnosis Case. 7.4.2 A Student Admission System. 7.5 Conclusions.
8 Standard Error of the Measurement Scale. 8.1 Why Mahalanobis Distance Is Used for Constructing the Measurement Scale. 8.2 Standard Error of the Measurement Scale. 8.3 Standard Error for the Medical Diagnosis Example. 8.4 Conclusions.
9 Advanced Topics in Multivariate Diagnosis. 9.1 Multivariate Diagnosis Using the Adjoint Matrix Method. 9.1.1 Related Topics of Matrix Theory. 9.1.2 Adjoint Matrix Method for Handling Multicollinearity. 9.2 Examples for the Adjoint Matrix Method. 9.2.1 Example 1. 9.2.2 Example 2. 9.3 Beta Adjustment Method for Small Correlations. 9.4 Subset Selection Using the Multiple Mahalanobis Distance Method. 9.4.1 Steps in the MMD Method. 9.4.2 Example. 9.5 Selection of Mahalanobis Space from Historical Data. 9.6 Conclusions.
10 MTS/MTGS versus Other Methods. 10.1 Principal Component Analysis. 10.2 Discrimination and Classification Method. 10.2.1 Fisher's Discriminant Function. 10.2.2 Use of Mahalanobis Distance. 10.3 Stepwise Regression. 10.4 Test of Additional Information (Rao's Test). 10.5 Multiple Regression Analysis. 10.6 Multivariate Process Control. 10.7 Artificial Neural Networks. 10.7.1 Feed-Forward (Backpropagation) Method. 10.7.2 Theoretical Comparison. 10.7.3 Medical Diagnosis Data Analysis. 10.8 Conclusions.
11 Case Studies. 11.1 American Case Studies. 11.1.1 Auto Marketing Case Study. 11.1.2 Gear Motor Assembly Case Study. 11.1.3 ASQ Research Fellowship Grant Case Study. 11.1.4 Improving the Transmission Inspection System Using MTS. 11.2 Japanese Case Studies. 11.2.1 Improvement of the Utility Rate of Nitrogen While Brewing Soy Sauce. 11.2.2 Application of MTS for Measuring Oil in Water Emulsion. 11.2.3 Prediction of Fasting Plasma Glucose (FPG) from Repetitive Annual Health Checkup Data. 11.3 Conclusions.
12 Concluding Remarks. 12.1 Important Points of the Proposed Methods. 12.2 Scientific Contributions from MTS/MTGS Methods. 12.3 Limitations of the Proposed Methods. 12.4 Recommendations for Future Research.
Bibliography. Appendixes. A.1 ASI Data Set. A.2 Principal Component Analysis (MINITAB Output). A.3 Discriminant and Classification Analysis (MINITAB Output). A.4 Results of Stepwise Regression (MINITAB Output). A.5 Multiple Regression Analysis (MINITAB Output). A.6 Neural Network Analysis (MATLAB Output). A.7 Variables for Auto Marketing Case Study. Index.
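The core MTS computation outlined in Chapter 2 (the inverse correlation matrix method) can be sketched as follows. This is a minimal illustration on synthetic data, not one of the book's worked examples:

```python
import numpy as np

def mts_scaled_md(data):
    """Scaled Mahalanobis distances (inverse correlation matrix method):
    MD_i = z_i^T R^{-1} z_i / k, where z_i is the standardized i-th
    observation, R the correlation matrix, and k the number of variables.
    A minimal sketch of building the MTS measurement scale from a group
    of "normal" observations; the book also develops a Gram-Schmidt
    (MTGS) variant of the same computation."""
    mean = data.mean(axis=0)
    std = data.std(axis=0, ddof=1)
    z = (data - mean) / std                          # standardize each variable
    r_inv = np.linalg.inv(np.corrcoef(z, rowvar=False))
    k = data.shape[1]
    return np.einsum('ij,jk,ik->i', z, r_inv, z) / k # one scaled MD per row

rng = np.random.default_rng(1)
normals = rng.normal(size=(200, 3))   # synthetic "normal" group
md = mts_scaled_md(normals)
# For the group that defines the Mahalanobis space, the average scaled MD
# works out to (n - 1) / n, i.e. approximately 1 -- cf. Section 2.2.4.
print(round(md.mean(), 3))  # 0.995
```

New observations are then standardized with the same means and standard deviations and scored against the same inverse correlation matrix; large distances flag abnormals.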

280 citations


Network Information
Related Topics (5)
Cluster analysis
146.5K papers, 2.9M citations
79% related
Artificial neural network
207K papers, 4.5M citations
79% related
Feature extraction
111.8K papers, 2.1M citations
77% related
Convolutional neural network
74.7K papers, 2M citations
77% related
Image processing
229.9K papers, 3.5M citations
76% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2024    1
2023    208
2022    452
2021    232
2020    239
2019    249