scispace - formally typeset

Mahalanobis distance

About: Mahalanobis distance is a research topic. Over the lifetime, 4616 publications have been published within this topic receiving 95294 citations.
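For reference, the Mahalanobis distance of a point x from a distribution with mean vector mu and covariance matrix Sigma is sqrt((x - mu)^T Sigma^-1 (x - mu)). A minimal NumPy sketch (function and variable names are illustrative, not from any of the papers below):

```python
import numpy as np

def mahalanobis(x, mean, cov):
    """Mahalanobis distance of point x from a distribution
    with the given mean vector and covariance matrix."""
    diff = x - mean
    # Solve cov @ y = diff rather than explicitly inverting cov
    return float(np.sqrt(diff @ np.linalg.solve(cov, diff)))

# With identity covariance the distance reduces to Euclidean distance
mean = np.zeros(2)
cov = np.eye(2)
print(mahalanobis(np.array([3.0, 4.0]), mean, cov))  # 5.0
```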


Papers
Journal ArticleDOI
TL;DR: A novel pairwise similarity measure that advances existing models by i) expanding traditional linear projections into affine transformations and ii) fusing affine Mahalanobis distance and Cosine similarity by a data-driven combination is presented.
Abstract: Cross-domain visual data matching is one of the fundamental problems in many real-world vision tasks, e.g., matching persons across ID photos and surveillance videos. Conventional approaches to this problem usually involves two steps: i) projecting samples from different domains into a common space, and ii) computing (dis-)similarity in this space based on a certain distance. In this paper, we present a novel pairwise similarity measure that advances existing models by i) expanding traditional linear projections into affine transformations and ii) fusing affine Mahalanobis distance and Cosine similarity by a data-driven combination. Moreover, we unify our similarity measure with feature representation learning via deep convolutional neural networks. Specifically, we incorporate the similarity measure matrix into the deep architecture, enabling an end-to-end way of model optimization. We extensively evaluate our generalized similarity model in several challenging cross-domain matching tasks: person re-identification under different views and face verification over different modalities (i.e., faces from still images and videos, older and younger faces, and sketch and photo portraits). The experimental results demonstrate superior performance of our model over other state-of-the-art methods.

143 citations
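The fusion idea in the abstract above can be sketched as a weighted combination of a Mahalanobis-style distance and cosine similarity. In the paper both the metric matrix and the mixing weight are learned from data; here they are fixed placeholders:

```python
import numpy as np

def fused_similarity(x, y, M, alpha):
    """Toy fusion of a squared Mahalanobis distance and cosine similarity.
    M is a positive semi-definite matrix and alpha a mixing weight; both
    are data-driven in the paper, fixed by hand here."""
    diff = x - y
    mahal = diff @ M @ diff  # squared Mahalanobis distance (lower = more similar)
    cos = (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))  # higher = more similar
    return -alpha * mahal + (1.0 - alpha) * cos

x = np.array([1.0, 0.0])
y = np.array([0.9, 0.1])
print(fused_similarity(x, y, np.eye(2), 0.5))
```

As expected, a vector is more similar to itself than to a nearby but distinct vector under this score.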

Book ChapterDOI
08 Oct 2016
TL;DR: In this article, a metric learning formulation called Weighted Approximate Rank Component Analysis (WARCA) is proposed to optimize the precision at top ranks by combining the WARP loss with a regularizer that favors orthonormal linear mappings and avoids rank-deficient embeddings.
Abstract: We are interested in the large-scale learning of Mahalanobis distances, with a particular focus on person re-identification. We propose a metric learning formulation called Weighted Approximate Rank Component Analysis (WARCA). WARCA optimizes the precision at top ranks by combining the WARP loss with a regularizer that favors orthonormal linear mappings and avoids rank-deficient embeddings. Using this new regularizer allows us to adapt the large-scale WSABIE procedure and to leverage the Adam stochastic optimization algorithm, which results in an algorithm that scales gracefully to very large data-sets. Also, we derive a kernelized version which allows us to take advantage of state-of-the-art features for re-identification when data-set size permits kernel computation. Benchmarks on recent and standard re-identification data-sets show that our method beats existing state-of-the-art techniques both in terms of accuracy and speed. We also provide experimental analysis to shed light on the properties of the regularizer we use, and how it improves performance.

142 citations
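A regularizer that "favors orthonormal linear mappings" can be illustrated with a Frobenius-norm penalty on W W^T minus the identity, which is zero exactly when the rows of the mapping W are orthonormal. The exact form below is illustrative, not WARCA's precise objective:

```python
import numpy as np

def orthonormality_penalty(W):
    """Penalty ||W W^T - I||_F^2: zero when the rows of W are
    orthonormal, large for rank-deficient or badly scaled mappings.
    Illustrative stand-in for the kind of regularizer WARCA uses."""
    WWt = W @ W.T
    return float(np.sum((WWt - np.eye(W.shape[0])) ** 2))

print(orthonormality_penalty(np.eye(3)))      # 0.0 for an orthonormal map
print(orthonormality_penalty(2 * np.eye(3)))  # positive for a non-orthonormal map
```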

Journal ArticleDOI
TL;DR: A suite of supervised metric learning approaches that answer the above questions about patient similarity assessment and a clinical decision support prototype system powered by the proposed patient similarity methods are presented.
Abstract: Patient similarity assessment is an important task in the context of patient cohort identification for comparative effectiveness studies and clinical decision support applications. The goal is to derive a clinically meaningful distance metric to measure the similarity between patients represented by their key clinical indicators. How to incorporate physician feedback with regard to the retrieval results? How to interactively update the underlying similarity measure based on the feedback? Moreover, different physicians often have different understandings of patient similarity based on their patient cohorts. The distance metric learned for each individual physician often leads to a limited view of the true underlying distance metric. How to integrate the individual distance metrics from each physician into a globally consistent unified metric? We describe a suite of supervised metric learning approaches that answer the above questions. In particular, we present Locally Supervised Metric Learning (LSML) to learn a generalized Mahalanobis distance that is tailored toward physician feedback. Then we describe the interactive metric learning (iMet) method, which can incrementally update an existing metric based on physician feedback in an online fashion. To combine multiple similarity measures from multiple physicians, we present the Composite Distance Integration (Comdi) method. In this approach we first construct discriminative neighborhoods from each individual metric, then combine them into a single optimal distance metric. Finally, we present a clinical decision support prototype system powered by the proposed patient similarity methods, and evaluate the proposed methods using real EHR data against several baselines.

142 citations
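One simple way to merge several per-physician Mahalanobis matrices into a single metric is a convex combination, which preserves positive semi-definiteness. This is only a stand-in for Comdi's actual neighborhood-based optimization:

```python
import numpy as np

def combine_metrics(Ms, weights):
    """Convex combination of several PSD Mahalanobis matrices.
    A convex sum of PSD matrices is PSD, so the result is still a
    valid metric matrix. (Simplified stand-in for Comdi.)"""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return sum(w * M for w, M in zip(weights, Ms))

M = combine_metrics([np.eye(2), 2 * np.eye(2)], [1.0, 1.0])
print(M)  # diagonal matrix with 1.5 on the diagonal
```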

Journal ArticleDOI
TL;DR: A LogDet divergence-based metric learning with triplet constraint model which can learn Mahalanobis matrix with high precision and robustness is established.
Abstract: Multivariate time series (MTS) datasets broadly exist in numerous fields, including health care, multimedia, finance, and biometrics. How to classify MTS accurately has become a hot research topic since it is an important element in many computer vision and pattern recognition applications. In this paper, we propose a Mahalanobis distance-based dynamic time warping (DTW) measure for MTS classification. The Mahalanobis distance builds an accurate relationship between each variable and its corresponding category. It is utilized to calculate the local distance between vectors in MTS. Then we use DTW to align those MTS which are out of synchronization or with different lengths. After that, how to learn an accurate Mahalanobis distance function becomes another key problem. This paper establishes a LogDet divergence-based metric learning with triplet constraint model which can learn Mahalanobis matrix with high precision and robustness. Furthermore, the proposed method is applied on nine MTS datasets selected from the University of California, Irvine machine learning repository and Robert T. Olszewski’s homepage, and the results demonstrate the improved performance of the proposed approach.

140 citations
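The idea of plugging a Mahalanobis local distance into dynamic time warping can be sketched with the standard DTW recurrence. Here the metric matrix M is fixed; in the paper it is learned via LogDet divergence metric learning with triplet constraints:

```python
import numpy as np

def dtw_mahalanobis(A, B, M):
    """DTW alignment cost between two multivariate series A (n x d)
    and B (m x d), using a Mahalanobis-style local distance with a
    fixed PSD matrix M (learned from data in the paper)."""
    n, m = len(A), len(B)

    def local(a, b):
        diff = a - b
        return np.sqrt(diff @ M @ diff)

    # Standard DTW dynamic program over the accumulated-cost matrix
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = local(A[i - 1], B[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

A = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
B = np.array([[0.0, 0.0], [2.0, 2.0]])
print(dtw_mahalanobis(A, B, np.eye(2)))
```

With M set to the identity the local distance reduces to Euclidean, so the alignment cost here comes only from matching the middle point of A to its nearest counterpart in B.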

Journal ArticleDOI
TL;DR: Both the proposed resampling by the half-means method and the smallest half-volume method are simple to use, are conceptually clear, and provide results superior to MVT and the current best-performing technique, MCD.
Abstract: The unreliability of multivariate outlier detection techniques such as Mahalanobis distance and hat matrix leverage has been known in the statistical community for well over a decade. However, only within the past few years has a serious effort been made to introduce robust methods for the detection of multivariate outliers into the chemical literature. Techniques such as the minimum volume ellipsoid (MVE), multivariate trimming (MVT), and M-estimators (e.g., PROP), and others similar to them, such as the minimum covariance determinant (MCD), rely upon algorithms that are difficult to program and may require significant processing times. While MCD and MVE have been shown to be statistically sound, we found MVT unreliable due to the method's use of the Mahalanobis distance measure in its initial step. We examined the performance of MCD and MVT on selected data sets and in simulations and compared the results with two methods of our own devising. Both the proposed resampling by the half-means method and the smallest half-volume method are simple to use, are conceptually clear, and provide results superior to MVT and the current best-performing technique, MCD. Either proposed method is recommended for the detection of multiple outliers in multivariate data.

139 citations
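The classical, non-robust detector that the abstract above critiques flags points whose squared Mahalanobis distance from the sample mean exceeds a chi-square quantile. A minimal sketch (the 97.5% cutoff for d = 2 is a conventional choice, not from the paper); note that the mean and covariance are estimated from the contaminated data themselves, which is exactly the masking weakness robust methods address:

```python
import numpy as np

def classical_outliers(X, cutoff_sq=7.378):
    """Flag rows of X (n x d) whose squared Mahalanobis distance from
    the sample mean exceeds a chi-square cutoff (7.378 is the 97.5%
    quantile of chi-square with 2 degrees of freedom).
    Non-robust: the estimates are contaminated by the outliers."""
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    diffs = X - mean
    # Squared Mahalanobis distance of each row
    d2 = np.einsum('ij,jk,ik->i', diffs, np.linalg.inv(cov), diffs)
    return d2 > cutoff_sq

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
X[0] = [10.0, 10.0]                 # plant one gross outlier
print(classical_outliers(X)[0])     # the planted point is flagged
```

A single gross outlier is still caught, but clusters of outliers can inflate the covariance enough to mask one another, which motivates the resampling-by-half-means and smallest-half-volume alternatives.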


Network Information
Related Topics (5)
Cluster analysis
146.5K papers, 2.9M citations
79% related
Artificial neural network
207K papers, 4.5M citations
79% related
Feature extraction
111.8K papers, 2.1M citations
77% related
Convolutional neural network
74.7K papers, 2M citations
77% related
Image processing
229.9K papers, 3.5M citations
76% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2024    1
2023    208
2022    452
2021    232
2020    239
2019    249