
Showing papers on "Mahalanobis distance" published in 2002


Journal ArticleDOI
TL;DR: An active shape model segmentation scheme is presented that is steered by optimal local features, contrary to normalized first order derivative profiles, as in the original formulation, using a nonlinear kNN-classifier to find optimal displacements for landmarks.
Abstract: An active shape model segmentation scheme is presented that is steered by optimal local features, contrary to normalized first order derivative profiles, as in the original formulation [Cootes and Taylor, 1995, 1999, and 2001]. A nonlinear kNN-classifier is used, instead of the linear Mahalanobis distance, to find optimal displacements for landmarks. For each of the landmarks that describe the shape, at each resolution level taken into account during the segmentation optimization procedure, a distinct set of optimal features is determined. The selection of features is automatic, using the training images and sequential feature forward and backward selection. The new approach is tested on synthetic data and in four medical segmentation tasks: segmenting the right and left lung fields in a database of 230 chest radiographs, and segmenting the cerebellum and corpus callosum in a database of 90 slices from MRI brain images. In all cases, the new method produces significantly better results in terms of an overlap error measure (p<0.001 using a paired T-test) than the original active shape model scheme.

592 citations


Book ChapterDOI
28 May 2002
TL;DR: It is shown that EMICP robustly aligns the barycenters and inertia moments with a high variance, while it tends toward the accurate ICP for a small variance, and is used in a multi-scale approach using an annealing scheme on this parameter to combine robustness and accuracy.
Abstract: We investigate in this article the rigid registration of large sets of points, generally sampled from surfaces. We formulate this problem as a general Maximum-Likelihood (ML) estimation of the transformation and the matches. We show that, in the specific case of a Gaussian noise, it corresponds to the Iterative Closest Point algorithm (ICP) with the Mahalanobis distance. Then, considering matches as a hidden variable, we obtain a slightly more complex criterion that can be efficiently solved using Expectation-Maximization (EM) principles. In the case of a Gaussian noise, this new method corresponds to an ICP with multiple matches weighted by normalized Gaussian weights, giving birth to the EM-ICP acronym of the method. The variance of the Gaussian noise is a new parameter that can be viewed as a "scale or blurring factor" on our point clouds. We show that EM-ICP robustly aligns the barycenters and inertia moments with a high variance, while it tends toward the accurate ICP for a small variance. Thus, the idea is to use a multi-scale approach using an annealing scheme on this parameter to combine robustness and accuracy. Moreover, we show that at each "scale", the criterion can be efficiently approximated using a simple decimation of one point set, which drastically speeds up the algorithm. Experiments on real data demonstrate a spectacular improvement of the performances of EM-ICP w.r.t. the standard ICP algorithm in terms of robustness (a factor of 3 to 4) and speed (a factor of 10 to 20), with similar performances in precision. Though the multiscale scheme is only justified with EM, it can also be used to improve ICP, in which case its performance reaches that of EM when the data are not too noisy.
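The role of the variance as a "scale" parameter can be illustrated with a minimal sketch of the normalized Gaussian match weights. It assumes isotropic noise (so the Mahalanobis distance reduces to a Euclidean distance scaled by sigma); the function name `match_weights` and the sample points are hypothetical and are not taken from the paper.

```python
import math

def match_weights(point, model_points, sigma):
    """Normalized Gaussian weights of one scene point against every model point.

    Assumes isotropic Gaussian noise with standard deviation `sigma`.
    """
    sq = [sum((p - q) ** 2 for p, q in zip(point, m)) for m in model_points]
    # Subtract the smallest squared distance before exponentiating;
    # this leaves the normalized weights unchanged and avoids 0/0.
    base = min(sq)
    raw = [math.exp(-(d - base) / (2.0 * sigma ** 2)) for d in sq]
    total = sum(raw)
    return [r / total for r in raw]

model = [(0.1, 0.0), (1.0, 0.0)]
sharp = match_weights((0.0, 0.0), model, sigma=0.01)    # behaves like closest-point ICP
blurred = match_weights((0.0, 0.0), model, sigma=100.0)  # nearly uniform weights
```

With a small sigma almost all of the weight falls on the closest model point, while a large sigma spreads the weight almost evenly, which is consistent with the behaviour the abstract describes at the two ends of the annealing schedule.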

470 citations


Proceedings ArticleDOI
10 Dec 2002
TL;DR: Efficient post-processing techniques, namely noise removal, shape criteria, elliptic curve fitting and face/non-face classification, are proposed in order to further refine skin segmentation results for the purpose of face detection.
Abstract: This paper presents a new human skin color model in YCbCr color space and its application to human face detection. Skin colors are modeled by a set of three Gaussian clusters, each of which is characterized by a centroid and a covariance matrix. The centroids and covariance matrices are estimated from a large set of training samples after a k-means clustering process. Pixels in a color input image can be classified into skin or non-skin based on the Mahalanobis distances to the three clusters. Efficient post-processing techniques, namely noise removal, shape criteria, elliptic curve fitting and face/non-face classification, are proposed in order to further refine skin segmentation results for the purpose of face detection.
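The pixel classification step might look roughly like the following sketch. It assumes a two-dimensional (Cb, Cr) feature and a fixed distance threshold; the cluster centroids, covariances, threshold, and function names are all hypothetical placeholders, since the abstract does not give the trained values or the exact decision rule.

```python
def mahalanobis_sq_2d(x, mean, cov):
    """Squared Mahalanobis distance of a 2-D point from a Gaussian (mean, cov)."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance must be positive definite")
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # The inverse of [[a, b], [c, d]] is (1/det) * [[d, -b], [-c, a]].
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det

# Hypothetical (Cb, Cr) clusters; real values would come from k-means on training pixels.
CLUSTERS = [
    ((110.0, 150.0), ((40.0, -10.0), (-10.0, 30.0))),
    ((100.0, 160.0), ((25.0, 5.0), (5.0, 35.0))),
    ((120.0, 140.0), ((30.0, 0.0), (0.0, 20.0))),
]

def is_skin(cbcr, clusters=CLUSTERS, threshold_sq=9.0):
    """Label a pixel as skin if it lies within the threshold of any cluster."""
    return min(mahalanobis_sq_2d(cbcr, m, c) for m, c in clusters) <= threshold_sq
```

For example, `is_skin((110.0, 150.0))` is true because the pixel coincides with the first hypothetical centroid, while a pixel far from all three clusters is rejected.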

287 citations


Book
01 Jan 2002
TL;DR: In this article, the authors present an overview of the state of the art in multidimensional systems and their application in the medical domain, including the use of MTS and MTGS.
Abstract: Preface. Acknowledgments. Terms and Symbols. Definitions of Mathematical and Statistical Terms. 1 Introduction. 1.1 The Goal. 1.2 The Nature of a Multidimensional System. 1.2.1 Description of Multidimensional Systems. 1.2.2 Correlations between the Variables. 1.2.3 Mahalanobis Distance. 1.2.4 Robust Engineering/Taguchi Methods. 1.3 Multivariate Diagnosis: The State of the Art. 1.3.1 Principal Component Analysis. 1.3.2 Discrimination and Classification Method. 1.3.3 Stepwise Regression. 1.3.4 Test of Additional Information (Rao's Test). 1.3.5 Multiple Regression. 1.3.6 Multivariate Process Control Charts. 1.3.7 Artificial Neural Networks. 1.4 Approach. 1.4.1 Classification versus Measurement. 1.4.2 Normals versus Abnormals. 1.4.3 Probabilistic versus Data Analytic. 1.4.4 Dimensionality Reduction. 1.5 Refining the Solution Strategy. 1.6 Guide to This Book. 2 MTS and MTGS. 2.1 A Discussion of Mahalanobis Distance. 2.2 Objectives of MTS and MTGS. 2.2.1 Mahalanobis Distance (Inverse Matrix Method). 2.2.2 Gram-Schmidt Orthogonalization Process. 2.2.3 Proof That Equations 2.2 and 2.3 Are the Same. 2.2.4 Calculation of the Mean of the Mahalanobis Space. 2.3 Steps in MTS. 2.4 Steps in MTGS. 2.5 Discussion of Medical Diagnosis Data: Use of MTGS and MTS Methods. 2.6 Conclusions. 3 Advantages and Limitations of MTS and MTGS. 3.1 Direction of Abnormalities. 3.1.1 The Gram-Schmidt Process. 3.1.2 Identification of the Direction of Abnormals. 3.1.3 Decision Rule for Higher Dimensions. 3.2 Example of a Graduate Admission System. 3.3 Multicollinearity. 3.4 A Discussion of Partial Correlations. 3.5 Conclusions. 4 Role of Orthogonal Arrays and Signal to Noise Ratios in Multivariate Diagnosis. 4.1 Role of Orthogonal Arrays. 4.2 Role of S/N Ratios. 4.3 Advantages of S/N Ratios. 4.3.1 S/N Ratio as a Simple Measure to Identify Useful Variables. 4.3.2 S/N Ratio as a Measure of Functionality of the System. 4.3.3 S/N Ratio to Predict the Given Conditions. 4.4 Conclusions.
5 Treatment of Categorical Data in MTS/MTGS Methods. 5.1 MTS/MTGS with Categorical Data. 5.2 A Sales and Marketing Application. 5.2.1 Selection of Suitable Variables. 5.2.2 Description of the Variables. 5.2.3 Construction of Mahalanobis Space. 5.2.4 Validation of the Measurement Scale. 5.2.5 Identification of Useful Variables (Developing Stage). 5.2.6 S/N Ratio of the System (Before and After). 5.3 Conclusions. 6 MTS/MTGS under a Noise Environment. 6.1 MTS/MTGS with Noise Factors. 6.1.1 Treat Each Level of the Noise Factor Separately. 6.1.2 Include the Noise Factor as One of the Variables. 6.1.3 Combine Variables of Different Levels of the Noise Factor. 6.1.4 Do Not Consider the Noise Factor If It Cannot Be Measured. 6.2 Conclusions. 7 Determination of Thresholds: A Loss Function Approach. 7.1 Why Threshold Is Required in MTS/MTGS. 7.2 Quadratic Loss Function. 7.2.1 QLF for the Nominal the Best Characteristic. 7.2.2 QLF for the Larger the Better Characteristic. 7.2.3 QLF for the Smaller the Better Characteristic. 7.3 QLF for MTS/MTGS. 7.3.1 Determination of Threshold. 7.3.2 When Only Good Abnormals Are Present. 7.4 Examples. 7.4.1 Medical Diagnosis Case. 7.4.2 A Student Admission System. 7.5 Conclusions. 8 Standard Error of the Measurement Scale. 8.1 Why Mahalanobis Distance Is Used for Constructing the Measurement Scale. 8.2 Standard Error of the Measurement Scale. 8.3 Standard Error for the Medical Diagnosis Example. 8.4 Conclusions. 9 Advance Topics in Multivariate Diagnosis. 9.1 Multivariate Diagnosis Using the Adjoint Matrix Method. 9.1.1 Related Topics of Matrix Theory. 9.1.2 Adjoint Matrix Method for Handling Multicollinearity. 9.2 Examples for the Adjoint Matrix Method. 9.2.1 Example 1. 9.2.2 Example 2. 9.3 beta Adjustment Method for Small Correlations. 9.4 Subset Selection Using the Multiple Mahalanobis Distance Method. 9.4.1 Steps in the MMD Method. 9.4.2 Example. 9.5 Selection of Mahalanobis Space from Historical Data. 9.6 Conclusions.
10 MTS/MTGS versus Other Methods. 10.1 Principal Component Analysis. 10.2 Discrimination and Classification Method. 10.2.1 Fisher's Discriminant Function. 10.2.2 Use of Mahalanobis Distance. 10.3 Stepwise Regression. 10.4 Test of Additional Information (Rao's Test). 10.5 Multiple Regression Analysis. 10.6 Multivariate Process Control. 10.7 Artificial Neural Networks. 10.7.1 Feed Forward (Backpropagation) Method. 10.7.2 Theoretical Comparison. 10.7.3 Medical Diagnosis Data Analysis. 10.8 Conclusions. 11 Case Studies. 11.1 American Case Studies. 11.1.1 Auto Marketing Case Study. 11.1.2 Gear Motor Assembly Case Study. 11.1.3 ASQ Research Fellowship Grant Case Study. 11.1.4 Improving the Transmission Inspection System Using MTS. 11.2 Japanese Case Studies. 11.2.1 Improvement of the Utility Rate of Nitrogen While Brewing Soy Sauce. 11.2.2 Application of MTS for Measuring Oil in Water Emulsion. 11.2.3 Prediction of Fasting Plasma Glucose (FPG) from Repetitive Annual Health Checkup Data. 11.3 Conclusions. 12 Concluding Remarks. 12.1 Important Points of the Proposed Methods. 12.2 Scientific Contributions from MTS/MTGS Methods. 12.3 Limitations of the Proposed Methods. 12.4 Recommendations for Future Research. Bibliography. Appendixes. A.1 ASI Data Set. A.2 Principal Component Analysis (MINITAB Output). A.3 Discriminant and Classification Analysis (MINITAB Output). A.4 Results of Stepwise Regression (MINITAB Output). A.5 Multiple Regression Analysis (MINITAB Output). A.6 Neural Network Analysis (MATLAB Output). A.7 Variables for Auto Marketing Case Study. Index.

280 citations


Journal ArticleDOI
TL;DR: A new approach to covariance-weighted factorization, which can factor noisy feature correspondences with high degree of directional uncertainty into structure and motion and provides a unified approach for treating corner-like points together with points along linear structures in the image.
Abstract: Factorization using Singular Value Decomposition (SVD) is often used for recovering 3D shape and motion from feature correspondences across multiple views. SVD is powerful at finding the global solution to the associated least-square-error minimization problem. However, this is the correct error to minimize only when the x and y positional errors in the features are uncorrelated and identically distributed. But this is rarely the case in real data. Uncertainty in feature position depends on the underlying spatial intensity structure in the image, which has strong directionality to it. Hence, the proper measure to minimize is covariance-weighted squared-error (or the Mahalanobis distance). In this paper, we describe a new approach to covariance-weighted factorization, which can factor noisy feature correspondences with high degree of directional uncertainty into structure and motion. Our approach is based on transforming the raw-data into a covariance-weighted data space, where the components of noise in the different directions are uncorrelated and identically distributed. Applying SVD to the transformed data now minimizes a meaningful objective function in this new data space. This is followed by a linear but suboptimal second step to recover the shape and motion in the original data space. We empirically show that our algorithm gives very good results for varying degrees of directional uncertainty. In particular, we show that unlike other SVD-based factorization algorithms, our method does not degrade with increase in directionality of uncertainty, even in the extreme when only normal-flow data is available. It thus provides a unified approach for treating corner-like points together with points along linear structures in the image.
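The idea of a covariance-weighted data space can be illustrated for a single 2-D feature position. This is only a sketch of the whitening step under the assumption of a known, positive-definite 2x2 covariance; it is not the paper's factorization algorithm, and `whiten_2d` is a hypothetical helper name.

```python
import math

def whiten_2d(x, mean, cov):
    """Map a 2-D point into the space where noise with covariance `cov`
    becomes uncorrelated with unit variance."""
    (a, b), (_, d) = cov  # symmetric matrix: cov[0][1] == cov[1][0]
    # Cholesky factor L of cov, with cov = L L^T.
    l00 = math.sqrt(a)
    l10 = b / l00
    l11 = math.sqrt(d - l10 * l10)
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Solve L z = (dx, dy) by forward substitution.
    z0 = dx / l00
    z1 = (dy - l10 * z0) / l11
    return z0, z1

z = whiten_2d((2.0, 1.0), (0.0, 0.0), ((4.0, 2.0), (2.0, 3.0)))
squared_norm = z[0] ** 2 + z[1] ** 2
```

The squared Euclidean norm of the whitened point equals the squared Mahalanobis distance of the original point, which is why an ordinary least-squares method applied after such a transform minimizes a covariance-weighted error.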

132 citations


Proceedings ArticleDOI
09 Oct 2002
TL;DR: The Euclidean distance and the elastic-matching distance are employed as the measures of distance between pairs of feature vectors and performed fairly well in retrieving models having similar shape from a database of VRML models.
Abstract: In this paper, we propose a method for shape-similarity search of 3D polygonal-mesh models. The system accepts triangular meshes, but tolerates degenerate polygons, disconnected components, and other anomalies. As the feature vector, the method uses a combination of three vectors: (1) the moment of inertia, (2) the average distance of the surface from the axis, and (3) the variance of the distance of the surface from the axis. Values in each vector are discretely parameterized along each of the three principal axes of inertia of the model. We employed the Euclidean distance and the elastic-matching distance as the measures of distance between pairs of feature vectors. Experiments showed that the proposed shape features and distance measures perform fairly well in retrieving models having similar shape from a database of VRML models.

132 citations


Journal ArticleDOI
TL;DR: In this article, the Mahalanobis distance classifier was employed to determine the best eight-band combination for two multispectral, multitemporal and multisensor image datasets.
Abstract: Determination of the 'best' bands that are assigned to the input neurons of an artificial neural network (ANN) is one of the critical steps in designing the ANN for a particular problem. A large number of inputs reduces the network's generalization capabilities and introduces redundant and perhaps irrelevant information, while a small number of inputs could be insufficient for the network to learn the characteristics of the training data. The number of input bands defines the complexity of the problem. Methods used to select the optimum inputs are known as feature selection techniques. Their use in the context of artificial neural networks was investigated in this study. Statistical separability measures, specifically Wilks' Λ and Hotelling's T², and separability indices were employed to determine the best eight-band combination for two multispectral, multitemporal and multisensor image datasets. The Mahalanobis distance classifier was employed in the determination of the 'best' subset solution. In the s...

111 citations


Patent
01 Oct 2002
TL;DR: A means of supporting diagnosis is disclosed that quantitatively measures the presence of an abnormal condition in the heart of a person under test, and its cause, from magnetic field strengths measured at a plurality of measuring positions.
Abstract: Disclosed is a means for supporting diagnosis by quantitatively measuring the presence of an abnormal condition in the heart of a person under test, and the factor thereof (the cause of the disease), from magnetic field strength data measured at a plurality of measuring positions. Feature parameters are automatically extracted from the measured data to calculate their Mahalanobis distances, and any abnormal function of the heart is detected from the magnitude of those distances. Further, the chief factors that cause an increase in the Mahalanobis distance are analyzed to identify the cause of a disease.

99 citations


Journal ArticleDOI
TL;DR: An improved search procedure is proposed that is more robust against outlier configurations in the boundary target points by requiring subsequent shape changes to be smooth, which is imposed by a smoothness constraint on the displacement of neighbouring target points at each iteration and implemented by a minimal cost path approach.

99 citations


Book
01 Jan 2002

82 citations


Journal ArticleDOI
TL;DR: The combination of the convex hull and the uncertainty estimation offers a practical way of detecting outliers in prediction; by adding the potential function method, inliers can also be detected.

Proceedings ArticleDOI
22 Sep 2002
TL;DR: An unsupervised method to recognize and classify QRS complexes was developed in order to create an automatic cardiac beat classifier in real time, and four features extracted from the QRS complex in the time domain were selected as giving the best results.
Abstract: An unsupervised method to recognize and classify QRS complexes was developed in order to create an automatic cardiac beat classifier in real time. After exhaustive analysis, four features extracted from the QRS complex in the time domain were selected as the ones presenting the best results: width, total sum of the areas under the positive and negative curves, total sum of the absolute values of sample variations and total amplitude. Preliminary studies indicated these features follow a normal distribution, allowing the use of the Mahalanobis distance as their classification criterion. After an initial learning period, the algorithm extracts the four features from every new QRS complex and calculates the Mahalanobis distance between its feature set and the centroids of all existing classes to determine the class to which the new QRS belongs. If a predefined distance is surpassed, a new class is created. Using 44 records from the MIT-BIH database, we obtained 90.74% sensitivity, 96.55% positive predictivity and 0.242% false positives.
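The assign-or-create rule described above can be sketched as follows. For brevity the sketch assumes a diagonal covariance shared by all classes, in which case the Mahalanobis distance reduces to a variance-scaled Euclidean distance; the abstract does not state how the covariance is estimated or how centroids are updated, so `variances`, `max_distance`, and both function names are hypothetical.

```python
import math

def mahalanobis_diag(x, centroid, variances):
    """Mahalanobis distance under an assumed diagonal covariance."""
    return math.sqrt(sum((xi - ci) ** 2 / v
                         for xi, ci, v in zip(x, centroid, variances)))

def assign_beat(features, centroids, variances, max_distance):
    """Return the class index for `features`; open a new class if none is close enough.

    `centroids` is modified in place when a new class is created.
    """
    if centroids:
        dists = [mahalanobis_diag(features, c, variances) for c in centroids]
        best = min(range(len(dists)), key=dists.__getitem__)
        if dists[best] <= max_distance:
            return best
    centroids.append(list(features))
    return len(centroids) - 1

centroids = []
variances = [1.0, 1.0, 1.0, 1.0]  # one per feature (width, area, variation, amplitude)
labels = [assign_beat(f, centroids, variances, max_distance=3.0)
          for f in ([0.0, 0.0, 0.0, 0.0], [0.5, 0.0, 0.0, 0.0], [10.0, 0.0, 0.0, 0.0])]
```

Here the first two beats fall into the same class and the third, lying beyond the threshold, starts a new one.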

Proceedings ArticleDOI
07 Aug 2002
TL;DR: A modification of SFAM is proposed, which uses activation and matching functions based on the Mahalanobis distance, which considerably reduces the network size and increases the efficiency in training and classification.
Abstract: Simplified fuzzy ARTMAP networks (SFAM) typically generate a large number of output neurons and require a large number of input neurons due to the input complementation. We propose a modification of SFAM, which uses activation and matching functions based on the Mahalanobis distance. This modification considerably reduces the network size and increases the efficiency in training and classification. The new network has shown an excellent performance in classification of prehensile motions based on EMG patterns.

Journal ArticleDOI
TL;DR: It is shown that by utilizing the correlation structure the multivariate method, in addition to the genes found by the one-dimensional criteria, finds genes whose differential expression is not detectable marginally.
Abstract: An important problem addressed using cDNA microarray data is the detection of genes differentially expressed in two tissues of interest. Currently used approaches ignore the multidimensional structure of the data. However it is well known that correlation among covariates can enhance the ability to detect less pronounced differences. We use the Mahalanobis distance between vectors of gene expressions as a criterion for simultaneously comparing a set of genes and develop an algorithm for maximizing it. To overcome the problem of instability of covariance matrices we propose a new method of combining data from small-scale random search experiments. We show that by utilizing the correlation structure the multivariate method, in addition to the genes found by the one-dimensional criteria, finds genes whose differential expression is not detectable marginally.

Proceedings ArticleDOI
08 May 2002
TL;DR: Comparisons with existing clustering methods show several advantages of the proposed methodology, and data from a highly nonlinear acetone-butanol fermentation example are clustered to demonstrate the effectiveness.
Abstract: A new methodology for clustering multivariate time-series data is proposed. The methodology is based on calculation of the degree of similarity between multivariate time-series datasets using two similarity factors. One similarity factor is based on principal component analysis and the angles between the principal component subspaces while the other is based on the Mahalanobis distance between the datasets. The standard K-means algorithm is modified to cluster multivariate time-series datasets using similarity factors. Data from a highly nonlinear acetone-butanol fermentation example are clustered to demonstrate the effectiveness of the proposed methodology. Comparisons with existing clustering methods show several advantages of the proposed methodology.

Proceedings Article
01 Jan 2002
TL;DR: The proposed weighted metric-based technique detects the speaker change points in a multi-speaker audio stream using segmentation and classification techniques and new weights are originated from Fisher Linear Discriminant Analysis and Mel Cepstrum feature vectors.
Abstract: Speaker change detection is a key pre-requisite to speaker tracking and speaker adaptation. It detects the points where a speaker identity changes in a multi-speaker audio stream. We first extract the speech segments from an audio stream by segmentation and classification techniques. Using the extracted speech segments, the proposed weighted metric-based technique detects the speaker change points. New weights are originated from Fisher Linear Discriminant Analysis and, when used with Mel Cepstrum feature vectors, it has an effect of subband processing. Experiments were performed with HUB-4 Broadcast News Evaluation English Test Material (1999) and a movie audio track. Results showed that our technique gave about 37.7% improvement compared with Euclidean distance on the broadcast news data and about 27.1% on the movie data; with Mahalanobis distance, the improvements were 37.7% and 25.3% for broadcast news and movie data, respectively.

Patent
25 Jan 2002
TL;DR: In this article, a Mahalanobis distance measure is used to identify a query image among plural images in a database, and the measure may be used to rank the similarity of one or more images to the query image.
Abstract: A Mahalanobis distance measure is used to identify a query image among plural images in a database. The measure may be used to rank the similarity of one or more images to the query image. A variance-covariance matrix is calculated for all images in the database. The variance-covariance matrix is used to calculate the Mahalanobis distance between the query image and one or more images in the database. A range tree may be used to identify likely image candidates for performing the Mahalanobis distance measurement.

Journal ArticleDOI
TL;DR: In this paper, the authors extend the definition of the influence function to functionals of more than one distribution, and derive useful results such as an asymptotic variance formula for estimators of the Mahalanobis distance between two populations and linear discriminant function coefficients.

Journal ArticleDOI
TL;DR: The quadratic classifier was able to detect EEG activity related to imagination of movement with an affordable accuracy by using only C3 and C4 electrodes, interesting for the use of Mahalanobis-based classifiers in the brain computer interface area.
Abstract: Objectives: In this paper, we explored the use of quadratic classifiers based on Mahalanobis distance to detect mental EEG patterns from a reduced set of scalp recording electrodes. Methods: Electrodes are placed in scalp centro-parietal zones (C3, P3, C4 and P4 positions of the international 10-20 system). A Mahalanobis distance classifier based on the use of full covariance matrix was used. Results: The quadratic classifier was able to detect EEG activity related to imagination of movement with an affordable accuracy (97% correct classification, on average) by using only C3 and C4 electrodes. Conclusions: Such a result is interesting for the use of Mahalanobis-based classifiers in the brain computer interface area.

Book ChapterDOI
TL;DR: This paper describes experiences and results applying Support Vector Machine (SVM) to a Computer Intrusion Detection (CID) dataset, emphasizing incorporation of anomaly detection in the modeling and prediction of cyber-attacks.
Abstract: This paper describes experiences and results applying Support Vector Machine (SVM) to a Computer Intrusion Detection (CID) dataset. This is the second stage of work with this dataset, emphasizing incorporation of anomaly detection in the modeling and prediction of cyber-attacks. The SVM method for classification is used as a benchmark method (from previous study [1]), and the anomaly detection approaches compare so-called "one class" SVMs with a thresholded Mahalanobis distance to define support regions. Results compare the performance of the methods, and investigate joint performance of classification and anomaly detection. The dataset used is the DARPA/KDD-99 publicly available dataset of features from network packets classified into non-attack and four attack categories.

Journal Article
TL;DR: The procedure for data assembly provides for the preservation of some aspects of the nuclear structure of a two-dimensional gene expression pattern of Drosophila melanogaster embryos based on creating an averaged model that reproduces the spatial distribution of nuclei over the embryo image.
Abstract: We apply the fast redundant dyadic wavelet transform to the spatial registration of two-dimensional gene expression patterns of 736 Drosophila melanogaster embryos. This method is superior to the Fourier transform or windowed Fourier transform because of its ability to reduce noise and is of high resolution. In registration of the dataset we use two cost functions based on computing the Euclidean or Mahalanobis distance. The algorithm shows a high level of accuracy. For early temporal classes the cost function based on Mahalanobis distance gives better results. We have reported a method for construction of an integrated dataset elsewhere. In this paper the method is extended to the two-dimensional case. The procedure for data assembly provides for the preservation of some aspects of the nuclear structure of a two-dimensional gene expression pattern. It is based on creating an averaged model that reproduces the spatial distribution of nuclei over the embryo image. The average concentrations of each protein in each averaged nucleus are computed from the series of embryos of the same age.

Book ChapterDOI
01 Jan 2002
TL;DR: The method described aims to remove the dependence of Mahalanobis-distance-based outlier identification rules on multivariate normality of the bulk of the data.
Abstract: Outlier identification is important in many applications of multivariate analysis, either because there is some specific interest in finding anomalous observations or as a pre-processing task before the application of some multivariate method, in order to preserve the results from possible harmful effects of those observations. It is also of great interest in supervised classification (or discriminant analysis) if, when predicting group membership, one wants to have the possibility of labelling an observation as “does not belong to any of the available groups”. The identification of outliers in multivariate data is usually based on Mahalanobis distance. The use of robust estimates of the mean and the covariance matrix is advised in order to avoid the masking effect (Rousseeuw and Leroy, 1985; Rousseeuw and van Zomeren, 1990; Rocke and Woodruff, 1996; Becker and Gather, 1999). However, the performance of these rules is still highly dependent on multivariate normality of the bulk of the data. The aim of the method here described is to remove this dependence.

Proceedings ArticleDOI
08 Jul 2002
TL;DR: This paper introduces a comprehensive theory of distance metrics for multitarget (and, more generally, multi-object) systems, and shows that this theory extends an optimal-assignment approach proposed by O. Drummond.
Abstract: The concept of miss distance (Euclidean, Mahalanobis, etc.) is a fundamental, far-reaching, and taken-for-granted element of the engineering theory and practice of single-sensor, single-target systems. One might expect that multisensor, multitarget information fusion theory and applications would already rest upon a similarly fundamental concept, namely miss distance between multi-object systems (i.e., systems in which not only individual objects can vary, but their number as well). However, this has not been the case. Consequently, in this paper we introduce a comprehensive theory of distance metrics for multitarget (and, more generally, multi-object) systems. We show that this theory extends an optimal-assignment approach proposed by O. Drummond. We describe tractable computational approaches for computing such metrics, as well as some potentially far-reaching implications for applications such as sensor management.

Journal ArticleDOI
TL;DR: The DD plot is a plot of classical vs. robust Mahalanobis distances (MDi vs. RDi) that can be used as a diagnostic for multivariate normality and elliptical symmetry, and to assess the success of numerical transformations towards elliptical symmetry.
Abstract: The DD plot, introduced by Rousseeuw and Van Driessen (1999), is a plot of classical vs. robust Mahalanobis distances: MDi vs. RDi. The DD plot can be used as a diagnostic for multivariate normality and elliptical symmetry, and to assess the success of numerical transformations towards elliptical symmetry. In the regression context, many procedures can be adversely affected if strong nonlinearities are present in the predictors. Even if strong nonlinearities are present, the robust distances can be used to help visualize important regression models such as generalized linear models.

Journal ArticleDOI
TL;DR: A new three-stage verification system which is based on three types of features: global features; local features of the corner points; and function features that contain information of each point of the signatures is presented.
Abstract: This paper presents a new three-stage verification system which is based on three types of features: global features; local features of the corner points; and function features that contain information of each point of the signatures. The first verification stage implements a parameter-based method, in which the Mahalanobis distance is used as a dissimilarity measure between the signatures. The second verification stage involves corner extraction and corner matching. It also performs signature segmentation. The third verification stage implements a function-based method, which is based on an elastic matching algorithm establishing a point-to-point correspondence between the compared signatures. By combining the three different types of verification, a high security level can be reached. According to our experiments, the rates of false rejection and false acceptance are, respectively, 5.8% and 0%.

Patent
19 Apr 2002
TL;DR: A Mahalanobis space of plural manufacturing control parameters is generated from first sampled data, and the Mahalanobis distance of second sampled data from that space is compared with a threshold to determine whether a manufacturing process is operating under a malfunction condition.
Abstract: In a method of controlling a manufacturing process, a Mahalanobis space of plural manufacturing control parameters is generated on the basis of first sampled data. Then, a Mahalanobis distance between the Mahalanobis space and second sampled data is calculated. A manufacturing process is determined to be under a malfunction operating condition by comparing the Mahalanobis distance and a threshold value.
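The three steps (build the reference space, compute the distance of a new sample, compare with a threshold) can be sketched for two control parameters. The sample values, the threshold, and the function names are hypothetical; the patent abstract does not specify how the space is estimated or how the threshold is chosen, and the sketch simply uses the sample mean and sample covariance.

```python
def fit_space(samples):
    """Estimate mean and covariance of 2-D reference data (the 'Mahalanobis space')."""
    n = len(samples)
    mx = sum(s[0] for s in samples) / n
    my = sum(s[1] for s in samples) / n
    sxx = sum((s[0] - mx) ** 2 for s in samples) / (n - 1)
    syy = sum((s[1] - my) ** 2 for s in samples) / (n - 1)
    sxy = sum((s[0] - mx) * (s[1] - my) for s in samples) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def distance_sq(x, space):
    """Squared Mahalanobis distance of a new 2-D sample from the reference space."""
    (mx, my), (sxx, sxy, syy) = space
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("reference data give a singular covariance")
    dx, dy = x[0] - mx, x[1] - my
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det

def is_malfunction(x, space, threshold_sq):
    """Flag the process when the new sample lies beyond the threshold."""
    return distance_sq(x, space) > threshold_sq

# First sampled data: hypothetical readings taken under normal operation.
space = fit_space([(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)])
```

With this reference data, a second sample at the mean `(1.0, 1.0)` has distance zero, while `(3.0, 1.0)` has a squared distance of about 3 and would be flagged for any squared threshold below that.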

Journal ArticleDOI
TL;DR: It is shown that the incorporation of nonlinear dynamical measures into a multivariate discrimination provides a signal classification system that is robust to additive noise and can be achieved in cases where spectral measures are known to fail.
Abstract: In this contribution, we show that the incorporation of nonlinear dynamical measures into a multivariate discrimination provides a signal classification system that is robust to additive noise. The signal library was composed of nine groups of signals. Four groups were generated computationally from deterministic systems (van der Pol, Lorenz, Rossler and Henon). Four groups were generated computationally from different stochastic systems. The ninth group contained inter-decay interval sequences from radioactive cobalt. Two classification criteria (minimum Mahalanobis distance and maximum Bayesian likelihood) were tested. In the absence of additive noise, no errors occurred in a within-library classification. Normally distributed random numbers were added to produce signal to noise ratios of 10, 5 and 0 dB. When the minimum Mahalanobis distance was used as the classification criterion, the corresponding error rates were 2.2%, 4.4% and 20% (Expected Error Rate = 89%). When Bayesian maximum likelihood was the criterion, the error rates were 1.1%, 4.4% and 21% respectively. Using nonlinear measures an effective discrimination can be achieved in cases where spectral measures are known to fail. Most classification errors occurred at low signal to noise ratios when a stochastic signal was misclassified into a different group of stochastic signals. When the within-library classification exercise is limited to the four groups of deterministic signals, no classification errors occurred with clean data, at SNR = 10 dB, or at SNR = 5 dB. A single classification error (Observed Error Rate = 2.5%, Expected Error Rate = 75%) occurred with both classification criteria at SNR = 0 dB.

Journal ArticleDOI
TL;DR: A three-step-approach is suggested to find clusters in large datasets of spectra from the Hamburg/ESO survey by means of fixed point clustering, a method to find a single cluster at a time based on the Mahalanobis distance.

Proceedings ArticleDOI
23 Oct 2002
TL;DR: A real-time diagnosis system, based on Motorola's 56311 digital signal processor (DSP), is used for the classification of lung sounds into two classes: healthy and pathological.
Abstract: A real-time diagnosis system, based on Motorola's 56311 digital signal processor (DSP), is used for the classification of lung sounds into two classes: healthy and pathological. The instrument has two inputs, the first of which is from a microphone placed on the chest of the patient while the other is from a flowmeter that is used to label the lung sounds as belonging to the inspiration or expiration cycle. The sampled lung sound of a full respiration cycle is divided into 6 phases with the help of the flowmeter signal, and each phase is divided further into 10 overlapping segments. Each segment is modeled by an autoregressive (AR) model of order 6 by means of the Levinson-Durbin algorithm. The classification process is done using two classifiers: a k-nearest neighbor (k-NN) classifier with Itakura and Euclidean distance measures, and a minimum distance classifier with the Mahalanobis distance measure. The software was written entirely in assembly language and the result of the classification process is displayed on a character display (LCD).

Journal ArticleDOI
TL;DR: It is shown that smaller subregions reduce both the bias and the variance of the estimated gradient and the Mahalanobis distance function, and offers a new direction for future research.