Mahalanobis distance

About: Mahalanobis distance is a research topic. Over its lifetime, 4,616 publications have appeared within this topic, receiving 95,294 citations.


Papers
Journal Article
05 Mar 2003-Analyst
TL;DR: In this paper, pyrolysis-gas chromatography-mass spectrometry (Py-GC-MS) is used to discriminate wet granulation and direct compression with the help of chemometric techniques.
Abstract: Wet granulation and direct compression are two processes employed in tablet preparation. In this paper, pyrolysis-gas chromatography-mass spectrometry (Py-GC-MS) is used to discriminate these processes with the help of chemometric techniques. The data analysis procedure is as follows. First, deconvolute the Py-GC-MS data of each sample into concentration profiles and spectra, and then construct a matrix with each compound corresponding to one column; those contained only in a small number of samples are then removed. Second, the main principal components are kept after excluding three variables and one sample, and further processed by Fisher discriminant analysis. Third, the resultant data are assigned to classes using unsupervised and supervised classification methods. Results from cross-validation show that only 3 of 20 samples are misclassified by the Mahalanobis distance measure.
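The final step described above, assigning samples to classes by Mahalanobis distance and checking the result by cross-validation, can be sketched as follows. This is a minimal illustration and not the paper's procedure: the two-class synthetic data, the pooled within-class covariance, the leave-one-out scheme, and all function names are assumptions, and the deconvolution, PCA, and Fisher discriminant steps are omitted.

```python
import numpy as np

def mahalanobis_sq(x, mean, cov_inv):
    """Squared Mahalanobis distance of x from mean under the given inverse covariance."""
    d = x - mean
    return float(d @ cov_inv @ d)

def fit_classes(X, y):
    """Per-class means plus the inverse of the pooled within-class covariance."""
    classes = np.unique(y)
    means = {c: X[y == c].mean(axis=0) for c in classes}
    n, p = X.shape
    pooled = np.zeros((p, p))
    for c in classes:
        Xc = X[y == c] - means[c]
        pooled += Xc.T @ Xc
    pooled /= n - len(classes)
    # pinv tolerates a (near-)singular covariance in small samples
    return means, np.linalg.pinv(pooled)

def predict(x, means, cov_inv):
    """Assign x to the class whose mean is nearest in Mahalanobis distance."""
    return min(means, key=lambda c: mahalanobis_sq(x, means[c], cov_inv))

def loo_errors(X, y):
    """Leave-one-out cross-validation: count held-out samples that are misclassified."""
    errors = 0
    for i in range(len(X)):
        keep = np.arange(len(X)) != i
        means, cov_inv = fit_classes(X[keep], y[keep])
        if predict(X[i], means, cov_inv) != y[i]:
            errors += 1
    return errors

# Synthetic stand-in for 20 samples from two tablet processes (assumed data).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (10, 2)), rng.normal(3.0, 1.0, (10, 2))])
y = np.array([0] * 10 + [1] * 10)
print(loo_errors(X, y), "of", len(X), "samples misclassified")
```

Pooling the covariance across classes keeps the estimate stable when each class has only a handful of samples; per-class covariances would be an equally reasonable choice with more data.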

26 citations

Journal Article
TL;DR: A new algorithm called fuzzy-minimals is obtained, which detects the possible prototypes of the groups of a sample and applies the theoretical results using the Euclidean distance.

26 citations

Proceedings Article
29 Jul 2010
TL;DR: The intent was to find an accurate estimate of the correlation of sensor data to build up a robust PCA model that could then be used for fault detection, and results clearly show that the approach outperforms existing methods in terms of accuracy even when processing corrupted data.
Abstract: To address the problem of outlier detection in wireless sensor networks, we propose a technique based on robust principal component analysis (PCA) to detect anomalous or faulty sensor data in a distributed wireless sensor network, with a focus on data integrity and accuracy. Its key features are that it considers the correlation among the sensor data in order to disclose anomalies that span a number of neighboring sensors, does not require error-free data for PCA model construction, and operates in a distributed fashion. The algorithm has two steps. First, we find an accurate estimate of the correlation of sensor data to build a robust PCA model that can then be used for fault detection. This locally developed, correlation-based robust PCA model tends to accentuate the contribution of close observations in comparison with distant observations and does not impose any constraints on model design. Second, we use Mahalanobis distance, a multivariate distance metric, to determine the similarity between the current sensor readings and the developed sensor data model. Combined with principal component analysis, Mahalanobis distance is extended to examine whether a sensor node is an outlier with respect to a model defined by principal components. We examined the algorithm's performance using simulation with synthetic and real sensor data streams. The results clearly show that our approach outperforms existing methods in terms of accuracy even when processing corrupted data.
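The second step, scoring a reading by its Mahalanobis distance in the space of principal components, can be sketched as below. This is only an illustration under stated assumptions: it uses ordinary PCA rather than the robust, locally weighted PCA the abstract describes, the sensor data are synthetic, and the names `fit_pca`, `pc_mahalanobis`, and the value of `THRESHOLD` are hypothetical.

```python
import numpy as np

def fit_pca(X, n_components):
    """Ordinary (non-robust) PCA via SVD of the centered data."""
    mean = X.mean(axis=0)
    _, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    components = Vt[:n_components]                    # rows are principal directions
    variances = S[:n_components] ** 2 / (len(X) - 1)  # variance along each direction
    return mean, components, variances

def pc_mahalanobis(x, mean, components, variances):
    """Mahalanobis distance of x from the model, computed on its PCA scores.

    The scores are uncorrelated, so the covariance is diagonal and the
    distance reduces to a variance-scaled Euclidean norm.
    """
    scores = components @ (x - mean)
    return float(np.sqrt(np.sum(scores ** 2 / variances)))

# Synthetic data: three neighboring sensors tracking one shared signal (assumed).
rng = np.random.default_rng(0)
shared = rng.normal(20.0, 2.0, size=200)
X = np.column_stack([shared + rng.normal(0.0, 0.1, size=200) for _ in range(3)])

model = fit_pca(X, n_components=3)
THRESHOLD = 3.0  # hypothetical cut-off, not taken from the paper

normal_reading = np.array([20.0, 20.0, 20.0])
faulty_reading = np.array([20.0, 25.0, 20.0])  # one sensor breaks the correlation
for name, reading in [("normal", normal_reading), ("faulty", faulty_reading)]:
    d = pc_mahalanobis(reading, *model)
    print(name, round(d, 2), "outlier" if d > THRESHOLD else "ok")
```

The faulty reading is within the usual range of each individual sensor's neighbors but violates their correlation, which is the kind of anomaly a per-sensor range check would miss. The sketch keeps all components; truncating to the leading ones would discard exactly the low-variance directions in which such a violation shows up.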

26 citations

Journal Article
Robin D. Thomas
TL;DR: This article provides a formal definition for a sensitivity measure, d′g, between two multivariate stimuli, and reveals shortcomings in the diagonal d′ test and demonstrates that the assumptions behind equating perceptual independence to dimensional orthogonality are too weak.
Abstract: This article provides a formal definition for a sensitivity measure, d′g, between two multivariate stimuli. In recent attempts to assess perceptual representations using qualitative tests on response probabilities, the concept of a d′ between two multidimensional stimuli has played a central role. For example, Kadlec and Townsend (1992a, 1992b) proposed several tests based on multidimensional signal detection theory that allow conclusions concerning the perceptual and/or decisional interactions of stimulus dimensions. One proposition, referred to as the diagonal d′ test, relies on specific stimulus subsets of a feature-complete factorial identification task to infer perceptual separability. Also, Ashby and Townsend (1986), in a similar manner, attempted to relate perceptual independence to dimensional orthogonality in Tanner’s (1956) model, which also involves d′ between two multivariate signals. An analysis of the proposed d′g reveals shortcomings in the diagonal d′ test and also demonstrates that the assumptions behind equating perceptual independence to dimensional orthogonality are too weak. This d′g can be related to a common measure of statistical distance, Mahalanobis distance, in the special case of equal covariance matrices.
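For orientation, the Mahalanobis distance between two stimulus distributions that share a covariance matrix can be written as below. The symbols are assumptions introduced here (mean vectors μ₁ and μ₂, common covariance Σ); the abstract does not reproduce the article's own definition of d′g, so this shows only the standard equal-covariance form that such a measure would be compared against.

```latex
D_M = \sqrt{(\boldsymbol{\mu}_1 - \boldsymbol{\mu}_2)^{\top}\,
            \boldsymbol{\Sigma}^{-1}\,
            (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_2)}
```

In one dimension, with Σ = σ², this reduces to |μ₁ − μ₂| / σ, the familiar univariate d′ of signal detection theory.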

26 citations

Posted Content
TL;DR: It is shown that one-vs-all formulations can improve calibration on image classification tasks, while matching the predictive performance of softmax without incurring any additional training or test-time complexity.
Abstract: Accurate estimation of predictive uncertainty in modern neural networks is critical to achieve well calibrated predictions and detect out-of-distribution (OOD) inputs. The most promising approaches have been predominantly focused on improving model uncertainty (e.g. deep ensembles and Bayesian neural networks) and post-processing techniques for OOD detection (e.g. ODIN and Mahalanobis distance). However, there has been relatively little investigation into how the parametrization of the probabilities in discriminative classifiers affects the uncertainty estimates, and the dominant method, softmax cross-entropy, results in misleadingly high confidences on OOD data and under covariate shift. We investigate alternative ways of formulating probabilities using (1) a one-vs-all formulation to capture the notion of "none of the above", and (2) a distance-based logit representation to encode uncertainty as a function of distance to the training manifold. We show that one-vs-all formulations can improve calibration on image classification tasks, while matching the predictive performance of softmax without incurring any additional training or test-time complexity.
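A toy sketch can show why the parametrization matters. It is not the paper's model: the two-dimensional feature space, the class centroids, the bias value, and the use of negative squared Euclidean distance as the logit are all assumptions made for illustration.

```python
import numpy as np

def softmax(z):
    """Softmax over class logits; the outputs always sum to 1."""
    e = np.exp(z - z.max())
    return e / e.sum()

def sigmoid(z):
    """Independent per-class probability, as in a one-vs-all formulation."""
    return 1.0 / (1.0 + np.exp(-z))

def distance_logits(x, centroids, bias):
    """Hypothetical distance-based logit: bias minus squared distance to each centroid."""
    return bias - np.sum((centroids - x) ** 2, axis=1)

centroids = np.array([[0.0, 0.0], [4.0, 0.0]])  # assumed class centers
bias = 2.0                                       # assumed offset

x_in = np.array([0.2, 0.0])     # close to class 0
x_ood = np.array([-20.0, 0.0])  # far from both classes

logits_in = distance_logits(x_in, centroids, bias)
logits_ood = distance_logits(x_ood, centroids, bias)

print("in-distribution  softmax:", softmax(logits_in), " one-vs-all:", sigmoid(logits_in))
print("out-of-distribution softmax:", softmax(logits_ood), " one-vs-all:", sigmoid(logits_ood))
```

Because softmax normalizes across classes, the far-away point still receives a confident label for whichever centroid is relatively nearer. The one-vs-all probabilities are not forced to sum to 1, so every class can score low at once, which is one way to express "none of the above".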

26 citations


Network Information
Related Topics (5)
Cluster analysis: 146.5K papers, 2.9M citations, 79% related
Artificial neural network: 207K papers, 4.5M citations, 79% related
Feature extraction: 111.8K papers, 2.1M citations, 77% related
Convolutional neural network: 74.7K papers, 2M citations, 77% related
Image processing: 229.9K papers, 3.5M citations, 76% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2024    1
2023    208
2022    452
2021    232
2020    239
2019    249