Topic

Mahalanobis distance

About: Mahalanobis distance is a research topic. Over the lifetime of the topic, 4,616 publications have been published, receiving 95,294 citations.


Papers
Journal ArticleDOI
TL;DR: The Mahalanobis distance, in the original and in the principal component (PC) space, is examined and interpreted in relation to the Euclidean distance (ED).

1,802 citations
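The ED/Mahalanobis relation in the PC space is easy to verify numerically: the Mahalanobis distance equals the Euclidean distance of PC scores standardized by the square roots of the eigenvalues of the covariance matrix. A minimal NumPy sketch, with illustrative data and variable names not taken from the paper:

```python
# A minimal sketch of Mahalanobis vs. Euclidean distance; illustrative only.
import numpy as np

rng = np.random.default_rng(0)
# Correlated 2-D data: the Mahalanobis distance accounts for this covariance.
S_true = np.array([[2.0, 1.5],
                   [1.5, 2.0]])
X = rng.multivariate_normal(mean=[0.0, 0.0], cov=S_true, size=500)

mean = X.mean(axis=0)
S = np.cov(X, rowvar=False)
diff = np.array([2.0, -2.0]) - mean   # offset of a query point from the mean

euclidean = np.sqrt(diff @ diff)
mahalanobis = np.sqrt(diff @ np.linalg.inv(S) @ diff)

# In PC space, the Mahalanobis distance equals the Euclidean distance of
# PC scores standardized by the square roots of the eigenvalues of S.
eigvals, eigvecs = np.linalg.eigh(S)
z = (eigvecs.T @ diff) / np.sqrt(eigvals)

print(f"ED:          {euclidean:.3f}")
print(f"Mahalanobis: {mahalanobis:.3f}")
print(f"Via PCs:     {np.sqrt(z @ z):.3f}")  # matches the Mahalanobis value
```

The query point lies against the direction of correlation, so its Mahalanobis distance exceeds its Euclidean distance.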

Proceedings ArticleDOI
01 Dec 2005
TL;DR: This paper proposes and analyzes parametric hard and soft clustering algorithms based on a large class of distortion functions known as Bregman divergences, and shows that there is a bijection between regular exponential families and a large class of Bregman divergences, called regular Bregman divergences.
Abstract: A wide variety of distortion functions, such as squared Euclidean distance, Mahalanobis distance, Itakura-Saito distance and relative entropy, have been used for clustering. In this paper, we propose and analyze parametric hard and soft clustering algorithms based on a large class of distortion functions known as Bregman divergences. The proposed algorithms unify centroid-based parametric clustering approaches, such as classical k-means, the Linde-Buzo-Gray (LBG) algorithm and information-theoretic clustering, which arise by special choices of the Bregman divergence. The algorithms maintain the simplicity and scalability of the classical k-means algorithm, while generalizing the method to a large class of clustering loss functions. This is achieved by first posing the hard clustering problem in terms of minimizing the loss in Bregman information, a quantity motivated by rate distortion theory, and then deriving an iterative algorithm that monotonically decreases this loss. In addition, we show that there is a bijection between regular exponential families and a large class of Bregman divergences, that we call regular Bregman divergences. This result enables the development of an alternative interpretation of an efficient EM scheme for learning mixtures of exponential family distributions, and leads to a simple soft clustering algorithm for regular Bregman divergences. Finally, we discuss the connection between rate distortion theory and Bregman clustering and present an information theoretic analysis of Bregman clustering algorithms in terms of a trade-off between compression and loss in Bregman information.

1,723 citations
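One property from this abstract is worth making concrete: for every Bregman divergence the arithmetic mean is the optimal cluster representative, so the Lloyd-style k-means loop generalizes unchanged. The sketch below is an illustrative reimplementation under that assumption, not the authors' code; the Itakura-Saito divergence requires strictly positive data, and empty-cluster handling is kept minimal:

```python
# A minimal sketch of Bregman hard clustering: Lloyd-style iterations with
# assignments under an arbitrary Bregman divergence; for every Bregman
# divergence, the arithmetic mean is the optimal centroid.
import numpy as np

def itakura_saito(X, c):
    # Bregman divergence generated by -sum(log); requires positive data.
    r = X / c
    return np.sum(r - np.log(r) - 1.0, axis=-1)

def bregman_kmeans(X, k, divergence, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: nearest centroid under the chosen divergence.
        d = np.stack([divergence(X, c) for c in centroids], axis=1)
        labels = d.argmin(axis=1)
        # Update step: cluster means (keep old centroid if a cluster empties).
        centroids = np.stack([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
    return labels, centroids

rng = np.random.default_rng(1)
X = np.abs(rng.normal(loc=5.0, size=(300, 2))) + 0.1   # strictly positive
labels, centroids = bregman_kmeans(X, k=3, divergence=itakura_saito)
```

Swapping in squared Euclidean distance recovers classical k-means; relative entropy on probability vectors gives information-theoretic clustering.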

Journal ArticleDOI
TL;DR: To avoid the masking effect, this work proposes computing distances based on very robust estimates of location and covariance, which are better suited to expose the outliers in a multivariate point cloud.
Abstract: Detecting outliers in a multivariate point cloud is not trivial, especially when there are several outliers. The classical identification method does not always find them, because it is based on the sample mean and covariance matrix, which are themselves affected by the outliers. That is how the outliers get masked. To avoid the masking effect, we propose to compute distances based on very robust estimates of location and covariance. These robust distances are better suited to expose the outliers. In the case of regression data, the classical least squares approach masks outliers in a similar way. Also here, the outliers may be unmasked by using a highly robust regression method. Finally, a new display is proposed in which the robust regression residuals are plotted versus the robust distances. This plot classifies the data into regular observations, vertical outliers, good leverage points, and bad leverage points. Several examples are discussed.

1,419 citations
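The robust-distance idea is available off the shelf: scikit-learn's MinCovDet fits the Minimum Covariance Determinant, one standard highly robust estimate of location and scatter in the spirit of the estimators the paper uses. A sketch on synthetic data:

```python
# A minimal sketch contrasting classical and robust Mahalanobis distances;
# the data are synthetic and the chi-square cutoff is the usual convention.
import numpy as np
from scipy.stats import chi2
from sklearn.covariance import EmpiricalCovariance, MinCovDet

rng = np.random.default_rng(0)
X = rng.multivariate_normal([0, 0], [[1.0, 0.6], [0.6, 1.0]], size=200)
X[:10] += 6.0                        # plant a small cluster of outliers

d2_classical = EmpiricalCovariance().fit(X).mahalanobis(X)  # squared distances
d2_robust = MinCovDet(random_state=0).fit(X).mahalanobis(X)

cutoff = chi2.ppf(0.975, df=2)       # 97.5% quantile for 2-D Gaussian data
print("flagged by classical distances:", int(np.sum(d2_classical > cutoff)))
print("flagged by robust distances:   ", int(np.sum(d2_robust > cutoff)))
```

Because the planted outliers inflate the sample mean and covariance, the classical distances tend to understate how far out they are (the masking effect), while the robust distances expose them.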

Proceedings Article
01 Jan 2018
TL;DR: This paper proposes a simple yet effective method for detecting any abnormal samples, applicable to any pre-trained softmax neural classifier, and obtains the class-conditional Gaussian distributions with respect to (low- and upper-level) features of the deep models under Gaussian discriminant analysis.
Abstract: Detecting test samples drawn sufficiently far away from the training distribution, statistically or adversarially, is a fundamental requirement for deploying a good classifier in many real-world machine learning applications. However, deep neural networks with the softmax classifier are known to produce highly overconfident posterior distributions even for such abnormal samples. In this paper, we propose a simple yet effective method for detecting any abnormal samples, which is applicable to any pre-trained softmax neural classifier. We obtain the class-conditional Gaussian distributions with respect to (low- and upper-level) features of the deep models under Gaussian discriminant analysis, which result in a confidence score based on the Mahalanobis distance. While most prior methods have been evaluated for detecting either out-of-distribution or adversarial samples, but not both, the proposed method achieves state-of-the-art performance for both cases in our experiments. Moreover, we found that our proposed method is more robust in harsh cases, e.g., when the training dataset has noisy labels or a small number of samples. Finally, we show that the proposed method enjoys broader usage by applying it to class-incremental learning: whenever out-of-distribution samples are detected, our classification rule can incorporate new classes well without further training deep models.

1,022 citations
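The confidence score itself is simple to sketch: fit class-conditional Gaussians with a tied covariance to features extracted from a trained network, then score a sample by its Mahalanobis distance to the nearest class mean. The sketch below uses synthetic stand-in features; feature extraction from a real network, and the paper's input preprocessing and multi-layer feature ensembling, are omitted:

```python
# A minimal sketch of the Mahalanobis confidence score under Gaussian
# discriminant analysis with a shared (tied) covariance. `features` here
# are synthetic stand-ins for penultimate-layer activations.
import numpy as np

def fit_tied_gaussians(features, labels, n_classes):
    means = np.stack([features[labels == c].mean(axis=0)
                      for c in range(n_classes)])
    centered = features - means[labels]
    precision = np.linalg.inv(centered.T @ centered / len(features))
    return means, precision

def mahalanobis_confidence(f, means, precision):
    diffs = means - f
    d2 = np.einsum('cd,de,ce->c', diffs, precision, diffs)
    return -d2.min()   # higher score = closer to some class mean

rng = np.random.default_rng(0)
n_classes, dim = 3, 8
labels = rng.integers(0, n_classes, size=600)
features = rng.normal(size=(600, dim)) + 3.0 * labels[:, None]

means, precision = fit_tied_gaussians(features, labels, n_classes)
print(mahalanobis_confidence(features[0], means, precision))       # in-dist
print(mahalanobis_confidence(rng.normal(size=dim) + 50.0,
                             means, precision))                    # far OOD
```

Thresholding this score separates in-distribution samples (high score) from out-of-distribution or adversarial ones (low score).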

Journal ArticleDOI
TL;DR: This article proposes a set of multidimensional distance measures, including economic, financial, political, administrative, cultural, demographic, knowledge, and global connectedness, as well as geographic distance.
Abstract: Cross-national distance is a key concept in the field of management. Previous research has conceptualized and measured cross-national differences mostly in terms of dyadic cultural distance, and has used the Euclidean approach to measuring it. In contrast, our goal is to disaggregate the construct of distance by proposing a set of multidimensional measures, including economic, financial, political, administrative, cultural, demographic, knowledge, and global connectedness as well as geographic distance. We ground our analysis and choice of empirical dimensions on institutional theories of national business, governance, and innovation systems. In order to overcome the methodological limitations of the Euclidean approach, we calculate dyadic distances using the Mahalanobis method, which is scale-invariant and takes into consideration the variance–covariance matrix. We empirically analyze four different foreign expansion choices of US companies to illustrate the importance of disaggregating the distance construct and the usefulness of our distance calculations, which we make freely available to managers and scholars.

981 citations
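A dyadic Mahalanobis distance of the kind the article advocates can be computed directly with SciPy; the country vectors below are fabricated placeholders, not the authors' dataset:

```python
# A minimal sketch of dyadic Mahalanobis distances between countries.
# Rows are countries, columns are distance dimensions (e.g., economic,
# political, demographic); the numbers are illustrative only.
import numpy as np
from scipy.spatial.distance import mahalanobis

X = np.array([
    [ 1.2,  0.4, -0.3],
    [ 0.9,  0.1, -0.1],
    [-1.1,  1.5,  0.8],
    [-0.8, -1.2,  1.0],
    [ 0.3, -0.6,  0.5],
    [-0.5, -0.2, -1.9],
])

# Inverse variance-covariance matrix: this is what makes the measure
# scale-invariant and correlation-aware, unlike the Euclidean approach.
VI = np.linalg.inv(np.cov(X, rowvar=False))

for i, j in [(0, 1), (0, 2), (2, 3)]:
    print(f"d({i},{j}) = {mahalanobis(X[i], X[j], VI):.3f}")
```

Unlike the Euclidean distance, rescaling any one dimension (say, measuring GDP in dollars rather than thousands of dollars) leaves these dyadic distances unchanged.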


Network Information

Related Topics (5)

Topic                          Papers    Citations   Relatedness
Cluster analysis               146.5K    2.9M        79%
Artificial neural network      207K      4.5M        79%
Feature extraction             111.8K    2.1M        77%
Convolutional neural network   74.7K     2M          77%
Image processing               229.9K    3.5M        76%
Performance Metrics

No. of papers in the topic in previous years:

Year    Papers
2024    1
2023    208
2022    452
2021    232
2020    239
2019    249