Posted Content•

Scalable Large-Margin Mahalanobis Distance Metric Learning

TL;DR: This work proposes a fast and scalable algorithm to learn a Mahalanobis distance metric and suggests that, compared with state-of-the-art metric learning algorithms, this algorithm can achieve a comparable classification accuracy with reduced computational complexity.
Abstract: For many machine learning algorithms such as $k$-Nearest Neighbor ($k$-NN) classifiers and $k$-means clustering, success often depends heavily on the metric used to calculate distances between different data points. An effective solution for defining such a metric is to learn it from a set of labeled training samples. In this work, we propose a fast and scalable algorithm to learn a Mahalanobis distance metric. By employing the principle of margin maximization to achieve better generalization performance, this algorithm formulates metric learning as a convex optimization problem in which a positive semidefinite (psd) matrix is the unknown variable. A specialized gradient descent method is proposed to solve it. Our algorithm is much more efficient and scales better than existing methods. Experiments on benchmark data sets suggest that, compared with state-of-the-art metric learning algorithms, our algorithm can achieve comparable classification accuracy with reduced computational complexity.
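The core objects here are the Mahalanobis distance $d_M(x, y) = \sqrt{(x - y)^\top M (x - y)}$ with $M \succeq 0$, a large-margin hinge loss, and a projection onto the psd cone. Below is a minimal Python sketch of that general recipe (projected gradient descent on a triplet hinge loss); it illustrates the problem structure only, not the paper's specialized solver, and all names and hyperparameters are illustrative.

```python
# A minimal sketch of large-margin Mahalanobis metric learning with
# projected gradient descent. This is NOT the paper's exact algorithm;
# it only shows the shared structure: triplet hinge loss + PSD projection.
import numpy as np

def project_psd(M):
    """Project a symmetric matrix onto the PSD cone by eigenvalue clipping."""
    w, V = np.linalg.eigh((M + M.T) / 2)
    return (V * np.clip(w, 0.0, None)) @ V.T

def sq_dist(M, a, b):
    d = a - b
    return d @ M @ d

def learn_metric(X, triplets, lr=0.01, epochs=50):
    """triplets: list of (i, j, k) where x_i should be closer to x_j than to x_k."""
    n, dim = X.shape
    M = np.eye(dim)                      # start from the Euclidean metric
    for _ in range(epochs):
        G = np.zeros((dim, dim))
        for i, j, k in triplets:
            # Hinge: penalize triplets whose margin d(i,k) - d(i,j) is below 1.
            if 1.0 + sq_dist(M, X[i], X[j]) - sq_dist(M, X[i], X[k]) > 0:
                dij = (X[i] - X[j])[:, None]
                dik = (X[i] - X[k])[:, None]
                G += dij @ dij.T - dik @ dik.T
        M = project_psd(M - lr * G)      # gradient step, then PSD projection
    return M
```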
Citations
Journal Article•DOI•
Bo Du, Liangpei Zhang
TL;DR: This paper proposes a new anomaly detection method by effectively exploiting a robust anomaly degree metric for increasing the separability between anomaly pixels and other background pixels, using discriminative information.
Abstract: Due to the high spectral resolution, anomaly detection from hyperspectral images provides a new way to locate potential targets in a scene, especially those targets that are spectrally different from the majority of the data set. Conventional Mahalanobis-distance-based anomaly detection methods depend on the background statistics to construct the anomaly detection metric. One of the main problems with these methods is that the Gaussian distribution assumption of the background may not be reasonable. Furthermore, these methods are also susceptible to contamination of the conventional background covariance matrix by anomaly pixels. This paper proposes a new anomaly detection method by effectively exploiting a robust anomaly degree metric for increasing the separability between anomaly pixels and other background pixels, using discriminative information. First, the manifold feature is used so as to divide the pixels into the potential anomaly part and the potential background part. This procedure is called discriminative information learning. A metric learning method is then performed to obtain the robust anomaly degree measurements. Experiments with three hyperspectral data sets reveal that the proposed method outperforms other current anomaly detection methods. The sensitivity of the method to several important parameters is also investigated.
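For context, the conventional baseline this abstract argues against is the Mahalanobis-distance (RX-style) detector, which scores each pixel by its distance from a global Gaussian background model. A minimal sketch, with illustrative shapes and names (not code from the paper):

```python
# RX-style anomaly scoring: Mahalanobis distance of each pixel from the
# global background statistics. Shapes and names are illustrative.
import numpy as np

def rx_anomaly_scores(pixels):
    """pixels: (n_pixels, n_bands) hyperspectral data, flattened."""
    mu = pixels.mean(axis=0)
    cov = np.cov(pixels, rowvar=False)
    cov_inv = np.linalg.pinv(cov)        # pinv guards against a singular covariance
    diff = pixels - mu
    # Squared Mahalanobis distance of each pixel from the background model.
    return np.einsum('ij,jk,ik->i', diff, cov_inv, diff)
```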

253 citations

Journal Article•DOI•
Yanni Dong, Bo Du, Liangpei Zhang
TL;DR: This paper proposes an efficient metric learning detector based on random forests, named the random forest metric learning (RFML) algorithm, which combines semimultiple metrics with random forests to better separate the desired targets and background.
Abstract: Target detection is aimed at detecting and identifying target pixels based on specific spectral signatures, and is of great interest in hyperspectral image (HSI) processing. Target detection can essentially be considered a binary classification problem. Random forests have been effectively applied to the classification of HSI data. However, random forests need a huge amount of labeled data to achieve a good performance, which can be difficult to obtain in target detection. In this paper, we propose an efficient metric learning detector based on random forests, named the random forest metric learning (RFML) algorithm, which combines semimultiple metrics with random forests to better separate the desired targets and the background. The experimental results demonstrate that the proposed method outperforms both the state-of-the-art target detection algorithms and the other classical metric learning methods.

112 citations


Cites background or methods from "Scalable Large-Margin Mahalanobis D..."

  • ...The large margin nearest neighbor (LMNN) method [42], [43] trains the metric via a maximum margin framework....


  • ...Meanwhile, NCA is also sensitive to the initial points, and it cannot obtain the optimal value if the parameters are not selected appropriately [43]....


Journal Article•DOI•
TL;DR: This paper proposes an ensemble discriminative local metric learning (EDLML) algorithm for HSI analysis that aims to learn a subspace that keeps all the samples in the same class as near as possible, while separating those from different classes.
Abstract: The high-dimensional data space of hyperspectral images (HSIs) often results in ill-conditioned formulations, which leads to much of the high-dimensional feature space being empty, with the useful data lying primarily in a subspace. To avoid these problems, we use distance metric learning for dimensionality reduction. The goal of distance metric learning is to incorporate abundant discriminative information by reducing the dimensionality of the data. Considering that global metric learning is not appropriate for all training samples, this paper proposes an ensemble discriminative local metric learning (EDLML) algorithm for HSI analysis. The EDLML algorithm learns robust local metrics from both the training samples and their relative neighborhoods, and accounts for the different local discriminative distance metrics by dealing with the data region by region. It aims to learn a subspace that keeps all the samples in the same class as near as possible, while separating those from different classes. The learned local metrics are then used to build an ensemble metric. Experiments on a number of different hyperspectral data sets confirm the effectiveness of the proposed EDLML algorithm compared with other dimensionality reduction methods.
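The abstract gives no implementation details, but one plausible reading of the region-by-region idea is: partition the data, fit a simple discriminative metric per region, and blend the local metrics into an ensemble distance. The sketch below is speculative in every detail (clustering choice, local metric, blending weights) and stands in for, rather than reproduces, EDLML:

```python
# Speculative ensemble-of-local-metrics sketch: cluster the data into
# regions, use the inverse within-class scatter as a simple local metric,
# and blend metrics by proximity of the pair's midpoint to each region.
import numpy as np
from sklearn.cluster import KMeans

def local_metric(Xr, yr, reg=1e-3):
    # Inverse within-class scatter: a simple discriminative local metric.
    d = Xr.shape[1]
    Sw = reg * np.eye(d)
    for c in np.unique(yr):
        Xc = Xr[yr == c]
        if len(Xc) > 1:
            Sw += np.cov(Xc, rowvar=False) * (len(Xc) - 1)
    return np.linalg.inv(Sw)

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 6)); y = rng.integers(0, 3, size=300)
km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)
metrics = [local_metric(X[km.labels_ == r], y[km.labels_ == r]) for r in range(4)]

def ensemble_dist(a, b):
    # Weight each region's metric by the pair midpoint's proximity to it.
    mid = (a + b) / 2
    w = np.exp(-np.linalg.norm(km.cluster_centers_ - mid, axis=1))
    w /= w.sum()
    M = sum(wi * Mi for wi, Mi in zip(w, metrics))
    d = a - b
    return np.sqrt(d @ M @ d)
```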

103 citations

Posted Content•
TL;DR: In this paper, a kernel classification framework is established, which can not only generalize many popular metric learning methods such as LMNN and ITML, but also suggest new metric learning methods that, interestingly, can be efficiently implemented using standard SVM solvers.
Abstract: Learning a distance metric from the given training samples plays a crucial role in many machine learning tasks, and various models and optimization algorithms have been proposed in the past decade. In this paper, we generalize several state-of-the-art metric learning methods, such as large margin nearest neighbor (LMNN) and information theoretic metric learning (ITML), into a kernel classification framework. First, doublets and triplets are constructed from the training samples, and a family of degree-2 polynomial kernel functions is proposed for pairs of doublets or triplets. Then, a kernel classification framework is established, which not only generalizes many popular metric learning methods such as LMNN and ITML, but also suggests new metric learning methods that, interestingly, can be efficiently implemented using standard support vector machine (SVM) solvers. Two novel metric learning methods, namely doublet-SVM and triplet-SVM, are then developed under the proposed framework. Experimental results show that doublet-SVM and triplet-SVM achieve classification accuracies competitive with state-of-the-art metric learning methods such as ITML and LMNN, but with significantly less training time.
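A hedged sketch of the doublet construction with an off-the-shelf SVM solver follows; the kernel form $K(z, z') = ((x_i - x_j)^\top (x_i' - x_j'))^2$ and the label convention are our assumptions based on the abstract, not verified details of doublet-SVM:

```python
# Illustrative doublet construction + degree-2 polynomial kernel on
# difference vectors, fed to sklearn's SVC with a precomputed kernel.
# Labels, sampling, and normalization are assumptions, not the paper's.
import numpy as np
from sklearn.svm import SVC

def build_doublets(X, y, rng, n_pairs=200):
    diffs, labels = [], []
    for _ in range(n_pairs):
        i, j = rng.integers(0, len(X), size=2)
        if i == j:
            continue
        diffs.append(X[i] - X[j])
        labels.append(1 if y[i] == y[j] else -1)  # +1: similar pair
    return np.array(diffs), np.array(labels)

def doublet_kernel(D1, D2):
    # K(z, z') = ((x_i - x_j)^T (x'_i - x'_j))^2 -- degree-2 polynomial
    return (D1 @ D2.T) ** 2

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5)); y = rng.integers(0, 3, size=100)
D, h = build_doublets(X, y, rng)
svm = SVC(kernel='precomputed').fit(doublet_kernel(D, D), h)
```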

83 citations

Proceedings Article•DOI•
12 Aug 2012
TL;DR: This work adopts a new angle on the metric learning problem and learns a single metric that is able to implicitly adapt its distance function throughout the feature space and is an order of magnitude faster than state of the art multi-metric methods.
Abstract: Metric learning makes it plausible to learn semantically meaningful distances for complex distributions of data using label or pairwise constraint information. However, to date, most metric learning methods are based on a single Mahalanobis metric, which cannot handle heterogeneous data well. Those that learn multiple metrics throughout the feature space have demonstrated superior accuracy, but at a severe cost to computational efficiency. Here, we adopt a new angle on the metric learning problem and learn a single metric that is able to implicitly adapt its distance function throughout the feature space. This metric adaptation is accomplished by using a random forest-based classifier to underpin the distance function and incorporate both absolute pairwise position and standard relative position into the representation. We have implemented and tested our method against state of the art global and multi-metric methods on a variety of data sets. Overall, the proposed method outperforms both types of method in terms of accuracy (consistently ranked first) and is an order of magnitude faster than state of the art multi-metric methods (16x faster in the worst case).
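One way to realize a forest-backed distance that sees both absolute and relative position is to encode each pair as $[\,|x - y|;\ (x + y)/2\,]$, train a random forest to classify similar versus dissimilar pairs, and use the forest's dissimilarity vote as the distance. The encoding and vote rule below are our reading of the abstract, not a verified implementation:

```python
# Illustrative random-forest distance: pairwise features carry relative
# position |x - y| and absolute position (x + y)/2; the forest's
# "dissimilar" probability serves as the learned distance.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def pair_features(x, y):
    return np.concatenate([np.abs(x - y), (x + y) / 2.0])

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4)); labels = rng.integers(0, 2, size=200)

pairs = [(i, j) for i in range(50) for j in range(i + 1, 50)]
F = np.array([pair_features(X[i], X[j]) for i, j in pairs])
t = np.array([0 if labels[i] == labels[j] else 1 for i, j in pairs])

rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(F, t)

def rf_distance(x, y):
    # Probability the forest calls the pair "dissimilar" acts as a distance.
    return rf.predict_proba(pair_features(x, y).reshape(1, -1))[0, 1]
```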

65 citations


References
Book•
01 Mar 2004
TL;DR: This book gives a comprehensive introduction to convex optimization, with a focus on recognizing convex optimization problems and then finding the most appropriate technique for solving them.
Abstract: Convex optimization problems arise frequently in many different fields. A comprehensive introduction to the subject, this book shows in detail how such problems can be solved numerically with great efficiency. The focus is on recognizing convex optimization problems and then finding the most appropriate technique for solving them. The text contains many worked examples and homework exercises and will appeal to students, researchers and practitioners in fields such as engineering, computer science, mathematics, statistics, finance, and economics.

33,341 citations

01 Jan 1998
TL;DR: Presenting a method for determining the necessary and sufficient conditions for consistency of the learning process, the author covers function estimates from small data pools, applying these estimates to real-life problems, and much more.
Abstract: A comprehensive look at learning and generalization theory. The statistical theory of learning and generalization concerns the problem of choosing desired functions on the basis of empirical data. Highly applicable to a variety of computer science and robotics fields, this book offers lucid coverage of the theory as a whole. Presenting a method for determining the necessary and sufficient conditions for consistency of the learning process, the author covers function estimates from small data pools, applying these estimates to real-life problems, and much more.

26,531 citations


"Scalable Large-Margin Mahalanobis D..." refers background in this paper

  • ...It has been shown in the statistical learning theory [23] that increasing the margin between different classes helps to reduce the generalization error....


Book•
01 Nov 2008
TL;DR: Numerical Optimization presents a comprehensive and up-to-date description of the most effective methods in continuous optimization, responding to the growing interest in optimization in engineering, science, and business by focusing on the methods that are best suited to practical problems.
Abstract: Numerical Optimization presents a comprehensive and up-to-date description of the most effective methods in continuous optimization. It responds to the growing interest in optimization in engineering, science, and business by focusing on the methods that are best suited to practical problems. For this new edition the book has been thoroughly updated throughout. There are new chapters on nonlinear interior methods and derivative-free methods for optimization, both of which are used widely in practice and the focus of much current research. Because of the emphasis on practical methods, as well as the extensive illustrations and exercises, the book is accessible to a wide audience. It can be used as a graduate text in engineering, operations research, mathematics, computer science, and business. It also serves as a handbook for researchers and practitioners in the field. The authors have strived to produce a text that is pleasant to read, informative, and rigorous - one that reveals both the beautiful nature of the discipline and its practical side.

17,420 citations

Proceedings Article•DOI•
20 Sep 1999
TL;DR: Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.
Abstract: An object recognition system has been developed that uses a new class of local image features. The features are invariant to image scaling, translation, and rotation, and partially invariant to illumination changes and affine or 3D projection. These features share similar properties with neurons in inferior temporal cortex that are used for object recognition in primate vision. Features are efficiently detected through a staged filtering approach that identifies stable points in scale space. Image keys are created that allow for local geometric deformations by representing blurred image gradients in multiple orientation planes and at multiple scales. The keys are used as input to a nearest neighbor indexing method that identifies candidate object matches. Final verification of each match is achieved by finding a low residual least squares solution for the unknown model parameters. Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.
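The pipeline described here (scale-space keypoints, local descriptors, nearest-neighbor matching) is what later became known as SIFT. A modern sketch using OpenCV's implementation, with placeholder file names:

```python
# Modern OpenCV sketch of the described pipeline; the paper predates this
# API, and 'object.png' / 'scene.png' are placeholders.
import cv2

img1 = cv2.imread('object.png', cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread('scene.png', cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)   # stable points in scale space
kp2, des2 = sift.detectAndCompute(img2, None)

# Nearest-neighbor indexing with Lowe's ratio test to reject weak matches.
matcher = cv2.BFMatcher(cv2.NORM_L2)
matches = matcher.knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]
```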

16,989 citations


"Scalable Large-Margin Mahalanobis D..." refers methods in this paper

  • ...For each image, a number of interest regions are identified by the Harris-Affine detector [13] and the visual content in each region is characterized by the SIFT descriptor [12]....


01 Jan 1998

12,940 citations


"Scalable Large-Margin Mahalanobis D..." refers methods in this paper

  • ...The Wine, Balance, Vehicle, Breast-Cancer and Diabetes data sets are obtained from University of California, Irvine, Machine Learning Repository [16]; USPS is from S. Roweis’ website;1 MNIST and Letter are from Libsvm [3]....


  • ...The Wine, Balance, Vehicle, Breast-Cancer and Diabetes data sets are obtained from UCI Machine Learning Repository [14], and USPS, MNIST and Letter are from LibSVM [3]. For MNIST, we only use its test data in our experiment....
