
Showing papers by "Fumin Shen published in 2013"


Proceedings ArticleDOI
23 Jun 2013
TL;DR: This work describes an efficient, inductive solution to the out-of-sample data problem and a process by which non-parametric manifold learning may be used as the basis of a hashing method, allowing the development of a range of new hashing techniques that exploit the flexibility of the wide variety of manifold learning approaches available.
Abstract: Learning based hashing methods have attracted considerable attention due to their ability to greatly increase the scale at which existing algorithms may operate. Most of these methods are designed to generate binary codes that preserve the Euclidean distance in the original space. Manifold learning techniques, in contrast, are better able to model the intrinsic structure embedded in the original high-dimensional data. The complexity of these models, and the problems with out-of-sample data, have previously rendered them unsuitable for application to large-scale embedding, however. In this work, we consider how to learn compact binary embeddings on their intrinsic manifolds. In order to address the above-mentioned difficulties, we describe an efficient, inductive solution to the out-of-sample data problem, and a process by which non-parametric manifold learning may be used as the basis of a hashing method. Our proposed approach thus allows the development of a range of new hashing techniques exploiting the flexibility of the wide variety of manifold learning approaches available. We particularly show that hashing on the basis of t-SNE [29] outperforms state-of-the-art hashing methods on large-scale benchmark datasets, and is very effective for image classification with very short code lengths.
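The inductive out-of-sample idea can be illustrated in a few lines: a new point is embedded as a kernel-weighted average of the precomputed low-dimensional embeddings of a base set, then binarized by thresholding. The sketch below is a minimal illustration under assumed names (`embed_new`, `to_binary`, a Gaussian kernel, toy data), not the paper's full method, which builds on t-SNE embeddings of a base set:

```python
import numpy as np

def embed_new(x, base_X, base_Y, sigma=1.0):
    """Inductively embed a new point as a kernel-weighted average of the
    precomputed embeddings (base_Y) of a set of base points (base_X)."""
    w = np.exp(-np.sum((base_X - x) ** 2, axis=1) / (2 * sigma ** 2))
    w /= w.sum()
    return w @ base_Y  # low-dimensional embedding of x

def to_binary(y, thresholds):
    """Binarize an embedding by thresholding each dimension."""
    return (y > thresholds).astype(np.uint8)

# toy base set: 2-D points with a (pretend) 1-D manifold embedding
base_X = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
base_Y = np.array([[-1.0], [0.0], [1.0]])

y = embed_new(np.array([1.5, 0.0]), base_X, base_Y)
code = to_binary(y, thresholds=np.zeros(1))
```

Because the embedding of a new point needs only kernel weights against the base set, no re-running of the manifold learner is required at query time, which is what makes the extension inductive and cheap.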

235 citations


Proceedings ArticleDOI
TL;DR: In this article, an efficient, inductive solution to the out-of-sample data problem and a process by which non-parametric manifold learning may be used as the basis of a hashing method were proposed.
Abstract: Learning based hashing methods have attracted considerable attention due to their ability to greatly increase the scale at which existing algorithms may operate. Most of these methods are designed to generate binary codes that preserve the Euclidean distance in the original space. Manifold learning techniques, in contrast, are better able to model the intrinsic structure embedded in the original high-dimensional data. The complexity of these models, and the problems with out-of-sample data, have previously rendered them unsuitable for application to large-scale embedding, however. In this work, we consider how to learn compact binary embeddings on their intrinsic manifolds. In order to address the above-mentioned difficulties, we describe an efficient, inductive solution to the out-of-sample data problem, and a process by which non-parametric manifold learning may be used as the basis of a hashing method. Our proposed approach thus allows the development of a range of new hashing techniques exploiting the flexibility of the wide variety of manifold learning approaches available. We particularly show that hashing on the basis of t-SNE outperforms state-of-the-art hashing methods on large-scale benchmark datasets, and is very effective for image classification with very short code lengths.

44 citations


Journal ArticleDOI
TL;DR: This work shows that LTS can be formulated as a concave minimization problem and proposes an iterative procedure that instead approximately solves a convex complementary problem, which can be efficiently solved as a second-order cone program.
Abstract: The least trimmed sum of squares (LTS) regression estimation criterion is a robust statistical method for model fitting in the presence of outliers. Compared with the classical least squares estimator, which uses the entire data set for regression and is consequently sensitive to outliers, LTS identifies the outliers and fits to the remaining data points for improved accuracy. Exactly solving an LTS problem is NP-hard, but as we show here, LTS can be formulated as a concave minimization problem. Since it is usually tractable to globally solve a convex minimization or concave maximization problem in polynomial time, we instead solve LTS's approximate complementary problem, which is a convex minimization problem. We show that this complementary problem can be efficiently solved as a second-order cone program. We thus propose an iterative procedure to approximately solve the original LTS problem. Our extensive experiments demonstrate that the proposed method is robust, efficient and scalable in dealing with problems where data are contaminated with outliers. We show several applications of our method in image analysis.
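The LTS criterion itself can be illustrated with a simple alternating scheme: fit least squares, keep the h points with the smallest squared residuals, and refit. This is a common trimming heuristic, not the paper's SOCP-based procedure; the function name, the single fixed starting subset, and the toy data are all illustrative (practical trimming solvers use many random restarts):

```python
import numpy as np

def lts_fit(X, y, h, n_iter=20):
    """Least trimmed squares via alternation: OLS on the current subset,
    then re-select the h points with the smallest squared residuals.
    (An illustration of the LTS criterion, not the paper's SOCP solver;
    a single fixed starting subset is used for simplicity.)"""
    idx = np.arange(len(y))[:h]            # arbitrary initial subset
    for _ in range(n_iter):
        beta, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        r2 = (y - X @ beta) ** 2
        new_idx = np.argsort(r2)[:h]       # keep h smallest residuals
        if np.array_equal(np.sort(new_idx), np.sort(idx)):
            break                          # subset stabilized
        idx = new_idx
    return beta

# line y = 2x with one gross outlier at the last point
X = np.column_stack([np.ones(6), np.arange(6.0)])
y = 2.0 * np.arange(6.0)
y[5] = 100.0
beta = lts_fit(X, y, h=5)
```

Here the outlier never enters the fitted subset, so the trimmed fit recovers the underlying line, whereas plain least squares over all six points would be pulled far off by the contaminated observation.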

35 citations


Journal ArticleDOI
TL;DR: Based on the locality-constrained representation, three algorithms are presented which outperform the state-of-the-art on the AR and Extended Yale B datasets for both the occlusion and single sample per person (SSPP) problems.

19 citations


Posted Content
TL;DR: The main finding of this work is that the standard image classification pipeline, which consists of dictionary learning, feature encoding, spatial pyramid pooling and linear classification, outperforms all state-of-the-art face recognition methods on the tested benchmark datasets.
Abstract: The main finding of this work is that the standard image classification pipeline, which consists of dictionary learning, feature encoding, spatial pyramid pooling and linear classification, outperforms all state-of-the-art face recognition methods on the tested benchmark datasets (we have tested on AR, Extended Yale B, the challenging FERET, and LFW-a datasets). This surprising and prominent result suggests that those advances in generic image classification can be directly applied to improve face recognition systems. In other words, face recognition may not need to be viewed as a separate object classification problem. While recently a large body of residual based face recognition methods focus on developing complex dictionary learning algorithms, in this work we show that a dictionary of randomly extracted patches (even from non-face images) can achieve very promising results using the image classification pipeline. That is, the choice of dictionary learning methods may not be important. Instead, we find that learning multiple dictionaries using different low-level image features often improves the final classification accuracy. Our proposed face recognition approach offers the best reported results on the widely-used face recognition benchmark datasets. In particular, on the challenging FERET and LFW-a datasets, we improve the best reported accuracies in the literature by about 20% and 30%, respectively.
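A rough sketch of that pipeline, assuming small grayscale arrays as images: a dictionary of randomly extracted patches (no learning), soft-threshold feature encoding of every patch, and max-pooling over one level of a spatial grid. The function names, patch size, and parameter values below are illustrative, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_patch_dictionary(images, n_atoms=32, p=4):
    """Dictionary of randomly extracted, normalized p x p patches --
    no dictionary learning, as the abstract suggests may suffice."""
    D = []
    for _ in range(n_atoms):
        img = images[rng.integers(len(images))]
        i, j = rng.integers(0, img.shape[0] - p, 2)
        patch = img[i:i + p, j:j + p].ravel()
        D.append(patch / (np.linalg.norm(patch) + 1e-8))
    return np.array(D)                     # shape (n_atoms, p*p)

def encode_and_pool(img, D, p=4, alpha=0.1, grid=2):
    """Soft-threshold encoding of every patch, then max-pooling over a
    grid x grid spatial cell layout (one pyramid level)."""
    H, W = img.shape
    pooled = np.zeros((grid, grid, len(D)))
    for i in range(H - p + 1):
        for j in range(W - p + 1):
            z = np.maximum(D @ img[i:i + p, j:j + p].ravel() - alpha, 0)
            gi = i * grid // (H - p + 1)   # which spatial cell
            gj = j * grid // (W - p + 1)
            pooled[gi, gj] = np.maximum(pooled[gi, gj], z)
    return pooled.ravel()                  # feature vector for a linear classifier

images = [rng.random((12, 12)) for _ in range(5)]
D = random_patch_dictionary(images)
feat = encode_and_pool(images[0], D)
```

The resulting fixed-length feature vector would then be fed to an ordinary linear classifier, which is the last stage of the pipeline the abstract describes.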

2 citations


Posted Content
TL;DR: It is shown how the $L_\infty$ norm minimization problem can be broken up into smaller sub-problems, which can then be solved extremely efficiently; experiments demonstrate a radical reduction in computation time, along with robustness against large numbers of outliers in several model-fitting problems.
Abstract: Minimization of the $L_\infty$ norm, which can be viewed as approximately solving the non-convex least median estimation problem, is a powerful method for outlier removal and hence robust regression. However, current techniques for solving the problem at the heart of $L_\infty$ norm minimization are slow, and therefore cannot scale to large problems. A new method for the minimization of the $L_\infty$ norm is presented here, which provides a speedup of multiple orders of magnitude for data with high dimension. This method, termed Fast $L_\infty$ Minimization, allows robust regression to be applied to a class of problems which were previously inaccessible. It is shown how the $L_\infty$ norm minimization problem can be broken up into smaller sub-problems, which can then be solved extremely efficiently. Experimental results demonstrate the radical reduction in computation time, along with robustness against large numbers of outliers in a few model-fitting problems.
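For reference, minimizing the $L_\infty$ norm of a regression residual is the classical Chebyshev-approximation problem and can be written as a linear program: minimize t subject to |y - Xb| <= t. The sketch below uses that standard LP via SciPy, not the paper's fast decomposition; the function name and toy data are illustrative:

```python
import numpy as np
from scipy.optimize import linprog

def linf_regression(X, y):
    """Minimize max_i |y_i - x_i . b| as a linear program over (b, t):
    min t  s.t.  Xb - t <= y  and  -Xb - t <= -y.
    (Standard Chebyshev-regression LP, not the paper's fast solver.)"""
    n, k = X.shape
    c = np.zeros(k + 1)
    c[-1] = 1.0                                     # objective: minimize t
    A = np.block([[ X, -np.ones((n, 1))],
                  [-X, -np.ones((n, 1))]])
    b = np.concatenate([y, -y])
    res = linprog(c, A_ub=A, b_ub=b,
                  bounds=[(None, None)] * k + [(0, None)])
    return res.x[:k], res.x[-1]                     # coefficients, max residual

# exactly y = 2x, so the optimal max residual is zero
X = np.column_stack([np.ones(4), np.array([0.0, 1.0, 2.0, 3.0])])
y = np.array([0.0, 2.0, 4.0, 6.0])
beta, t = linf_regression(X, y)
```

The LP has one row pair per data point, which is why generic solvers scale poorly with problem size and why a decomposition into smaller sub-problems, as the abstract describes, pays off.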

1 citation


Proceedings ArticleDOI
01 Jan 2013
TL;DR: A boosting-like algorithm framework is proposed that embeds semi-supervised subspace learning methods and selects weak classifiers based on class-separability; the resulting methods outperform their supervised counterparts and AdaBoost.
Abstract: Boosting algorithms have attracted great attention since the first real-time face detector by Viola & Jones, which performs feature selection and strong classifier learning simultaneously. Researchers have since proposed to decouple these two procedures to improve the performance of Boosting algorithms. Motivated by this, we propose a boosting-like algorithm framework by embedding semi-supervised subspace learning methods. It selects weak classifiers based on class-separability; combination weights of the selected weak classifiers can then be obtained by subspace learning. Three typical algorithms are proposed under this framework and evaluated on public data sets. As shown by our experimental results, the proposed methods obtain superior performance over their supervised counterparts and AdaBoost.
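The decoupled idea can be caricatured in a few lines: score candidate weak classifiers (decision stumps here) by a class-separability criterion, pick the best few, and fit their combination weights jointly in a second step. The least-squares weight fit below merely stands in for the subspace-learning step; the stumps, the Fisher-style score, and the toy data are an assumed illustration, not the paper's three algorithms:

```python
import numpy as np

def stump(X, feat, thr):
    """Weak classifier: sign of (feature value - threshold)."""
    return np.where(X[:, feat] > thr, 1.0, -1.0)

def fisher_score(h, y):
    """Class-separability of a weak classifier's outputs (Fisher-style)."""
    a, b = h[y == 1], h[y == -1]
    return (a.mean() - b.mean()) ** 2 / (a.var() + b.var() + 1e-8)

def fit_ensemble(X, y, n_weak=3):
    """Step 1: rank candidate stumps by class-separability and keep the
    best n_weak. Step 2: fit their combination weights jointly by least
    squares (standing in for the subspace-learning step)."""
    candidates = [(f, t) for f in range(X.shape[1])
                  for t in np.unique(X[:, f])]
    scored = sorted(candidates,
                    key=lambda ft: -fisher_score(stump(X, *ft), y))
    chosen = scored[:n_weak]
    H = np.column_stack([stump(X, *ft) for ft in chosen])
    w, *_ = np.linalg.lstsq(H, y, rcond=None)
    predict = lambda Z: np.sign(
        np.column_stack([stump(Z, *ft) for ft in chosen]) @ w)
    return chosen, w, predict

# tiny separable toy problem
X = np.array([[0.0, 5.0], [1.0, 4.0], [3.0, 1.0], [4.0, 0.0]])
y = np.array([-1.0, -1.0, 1.0, 1.0])
chosen, w, predict = fit_ensemble(X, y)
```

Unlike AdaBoost, which fixes each weak classifier's weight the moment it is added, the two-step scheme above re-solves for all weights at once, which is the decoupling the abstract refers to.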