scispace - formally typeset
Topic

Mahalanobis distance

About: Mahalanobis distance is a research topic. Over its lifetime, 4,616 publications have appeared within this topic, receiving 95,294 citations.


Papers
Book Chapter
Charu C. Aggarwal
01 Jan 2013
TL;DR: The proximity of a data point may be defined in a variety of ways, which are subtly different from one another, but are similar enough to merit a unified treatment within a single chapter.
Abstract: Proximity-based techniques define a data point as an outlier if its locality (or proximity) is sparsely populated. The proximity of a data point may be defined in a variety of ways, which are subtly different from one another, but are similar enough to merit a unified treatment within a single chapter.

22 citations

Book Chapter
01 Jan 1996
TL;DR: An iterative and a non-iterative method for calculating the estimates in a method-performance study are presented, and a new method based on a score function makes it possible to characterise the performance of laboratories both as groups and individually.
Abstract: Interlaboratory analytical study is the general term for an experiment organised by a committee and involving several laboratories to achieve a common goal. Two important types of studies are method-performance studies and laboratory-performance studies. The purpose of a method-performance study is to determine the precision and bias characteristics of an analytical test method. A laboratory-performance study ascertains whether the laboratories conform to stated standards in their testing activities. An iterative and a non-iterative method for calculating the estimates in a method-performance study are presented, and a new method based on a score function makes it possible to characterise the performance of laboratories both as groups and individually. This score is a squared Mahalanobis distance with robust estimates of means and covariances. In determining these estimates, the specific structure of the interlaboratory-test data is taken into account. Instructive graphical displays support the classification of the laboratories.
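The score described in this abstract is a squared Mahalanobis distance, d² = (x − μ)ᵀ Σ⁻¹ (x − μ). The sketch below is only an illustration of that formula: it uses the ordinary sample mean and covariance as a stand-in, whereas the chapter uses robust estimates adapted to interlaboratory data, which are not reproduced here. The function names and the `labs` data are hypothetical.

```python
from statistics import mean

def mean_and_covariance(rows):
    """Sample mean vector and unbiased covariance matrix (non-robust stand-in)."""
    n, d = len(rows), len(rows[0])
    mu = [mean(row[j] for row in rows) for j in range(d)]
    cov = [[sum((row[i] - mu[i]) * (row[j] - mu[j]) for row in rows) / (n - 1)
            for j in range(d)] for i in range(d)]
    return mu, cov

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def squared_mahalanobis(x, mu, cov):
    """(x - mu)^T cov^{-1} (x - mu), computed without forming the inverse."""
    diff = [xi - mi for xi, mi in zip(x, mu)]
    z = solve(cov, diff)  # z = cov^{-1} diff
    return sum(d * zi for d, zi in zip(diff, z))

# Hypothetical data: one row per laboratory, one column per measured quantity.
labs = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
mu, cov = mean_and_covariance(labs)
scores = [squared_mahalanobis(row, mu, cov) for row in labs]
```

With an identity covariance the score reduces to the squared Euclidean distance, which is a convenient sanity check. A laboratory with an unusually large score would be a candidate for closer inspection.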

22 citations

Journal Article
TL;DR: A relatively new bee-inspired optimization algorithm, the bumble bees mating optimization algorithm, is used to implement a feature subset selection procedure, while the nearest neighbor classification method is used for the classification task.
Abstract: The feature selection problem is an interesting and important topic which is relevant for a variety of database applications. This paper utilizes a relatively new bee-inspired optimization algorithm, the bumble bees mating optimization algorithm, to implement a feature subset selection procedure, while the nearest neighbor classification method is used for the classification task. Several metrics are used in the nearest neighbor classification method, such as the Euclidean distance, the standardized Euclidean distance, the Mahalanobis distance, the city block metric, the cosine distance and the correlation distance, in order to identify the most significant metric for the nearest neighbor classifier. The performance of the proposed algorithm is tested using various benchmark data sets from the UCI machine learning repository. The algorithm is compared with two other bee-inspired algorithms: one based on the foraging behavior of bees, the discrete artificial bee colony, and the other based on the mating behavior of bees, the honey bees mating optimization algorithm. The algorithm is also compared with a particle swarm optimization algorithm, an ant colony optimization algorithm, a genetic algorithm and a number of algorithms from the literature.
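To illustrate why the choice of metric matters for a nearest neighbor classifier, here is a minimal sketch of a 1-nearest-neighbor rule with a pluggable distance function. It covers only three of the metrics listed in the abstract (Euclidean, city block, cosine); the data-dependent metrics are left out because they need variance or covariance estimates. The data, labels and function names are hypothetical, and this is not the paper's feature selection procedure.

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cityblock(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))

def cosine_distance(u, v):
    # 1 - cosine similarity; assumes neither vector is all zeros.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

METRICS = {
    "euclidean": euclidean,
    "cityblock": cityblock,
    "cosine": cosine_distance,
}

def nearest_neighbor(train, labels, query, metric):
    """Return the label of the training point closest to `query` under `metric`."""
    best = min(range(len(train)), key=lambda i: metric(train[i], query))
    return labels[best]

# Hypothetical two-point training set and a single query.
train = [[1.0, 1.0], [10.0, 0.0]]
labels = ["a", "b"]
query = [5.0, 0.5]

predictions = {name: nearest_neighbor(train, labels, query, fn)
               for name, fn in METRICS.items()}
```

In this toy case the Euclidean and city block metrics pick the first training point, while the cosine distance picks the second because the query points in almost the same direction as it.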

22 citations

Journal Article
01 Jan 2020 - Energy
TL;DR: The implementation of the k-Nearest Neighbour (k-NN) classification model shows that k-NN can serve as a handy tool for biomass resources classification irrespective of the sources and origins.

22 citations

Journal Article
TL;DR: This study shows that anthropometrics can be extremely useful in assessing population structure and history, differential gene flow into populations can have a major impact on local genetic structure, and microevolutionary processes can have different effects on biological characters and surnames.
Abstract: The analysis of anthropometric data often allows investigation of patterns of genetic structure in historical populations. This paper focuses on interpopulational anthropometric variation in seven populations in Ireland using data collected in the 1890s. The seven populations were located within a 120-km range along the west coast of Ireland and include islands and mainland isolates. Two of the populations (the Aran Islands and Inishbofin) have a known history of English admixture in earlier centuries. Ten anthropometric measures (head length, breadth, and height; nose length and breadth; bizygomatic and bigonial breadth; stature; hand length; and forearm length) on 259 adult Irish males were analyzed following age adjustment. Discriminant and canonical variates analysis were used to determine the degree and pattern of among-group variation. Mahalanobis' distance measure, D², was computed between each pair of populations and compared to distance measures based on geographic distance and English admixture (a binary measure indicating whether either of a pair of populations had historical indications of admixture). In addition, surname frequencies were used to construct distance measures based on random isonymy. Correlations were computed between distance measures, and their probabilities were derived using the Mantel matrix permutation method. English admixture has the greatest effect on anthropometric variation among these populations, followed by geographic distance. The correlation between anthropometric distance and geographic distance is not significant (r = -0.081, P = .590), but the correlation of admixture and anthropometric distance is significant (r = 0.829, P = .047). When the two admixed populations are removed from the analysis the correlation between geographic and anthropometric distance becomes significant (r = 0.718, P = .025). Isonymy distance shows a significant correlation with geographic distance (r = 0.425, P = .046) but not with admixture distance (r = -0.052, P = .524). The fact that anthropometrics show past patterns of gene flow and surnames do not reflects the greater impact of stochastic processes on surnames, along with the continued extinction of surnames. This study shows that 1) anthropometrics can be extremely useful in assessing population structure and history, 2) differential gene flow into populations can have a major impact on local genetic structure, and 3) microevolutionary processes can have different effects on biological characters and surnames.
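The Mantel matrix permutation method mentioned in this abstract can be sketched as follows: correlate the off-diagonal entries of two distance matrices, then repeatedly relabel the rows and columns of one matrix together and count how often the permuted correlation reaches the observed one. This is a generic illustration under stated assumptions (a one-sided upper-tail test, random rather than exhaustive permutations, and made-up matrices), not a reconstruction of the study's analysis.

```python
import math
import random

def upper_triangle(matrix, order):
    """Off-diagonal entries above the diagonal, with rows/columns relabelled by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mantel_test(d1, d2, n_perm=999, seed=0):
    """Return (observed r, one-sided permutation p-value) for two distance matrices."""
    rng = random.Random(seed)
    identity = list(range(len(d1)))
    x = upper_triangle(d1, identity)
    observed = pearson(x, upper_triangle(d2, identity))
    hits = 0
    for _ in range(n_perm):
        order = identity[:]
        rng.shuffle(order)  # permute rows and columns of d2 jointly
        if pearson(x, upper_triangle(d2, order)) >= observed - 1e-12:
            hits += 1
    # Count the observed arrangement itself as one of the permutations.
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical example: six sites on a line and a second distance that tracks the first.
positions = [0, 1, 3, 7, 15, 31]
geo = [[abs(a - b) for b in positions] for a in positions]
other = [[2 * abs(a - b) + 1 if a != b else 0 for b in positions] for a in positions]
r, p = mantel_test(geo, other)
```

Permuting whole rows and columns, rather than individual entries, respects the fact that the entries of a distance matrix are not independent observations.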

22 citations


Network Information
Related Topics (5)
Cluster analysis
146.5K papers, 2.9M citations
79% related
Artificial neural network
207K papers, 4.5M citations
79% related
Feature extraction
111.8K papers, 2.1M citations
77% related
Convolutional neural network
74.7K papers, 2M citations
77% related
Image processing
229.9K papers, 3.5M citations
76% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2024    1
2023    208
2022    452
2021    232
2020    239
2019    249