
Showing papers by "Fumin Shen published in 2015"


Proceedings ArticleDOI
07 Jun 2015
TL;DR: This work proposes a new supervised hashing framework, where the learning objective is to generate the optimal binary hash codes for linear classification, and introduces an auxiliary variable to reformulate the objective so that it can be solved efficiently by a regularization algorithm.
Abstract: Recently, learning-based hashing techniques have attracted broad research interests because they can support efficient storage and retrieval for high-dimensional data such as images, videos, documents, etc. However, a major difficulty of learning to hash lies in handling the discrete constraints imposed on the pursued hash codes, which typically makes hash optimizations very challenging (NP-hard in general). In this work, we propose a new supervised hashing framework, where the learning objective is to generate the optimal binary hash codes for linear classification. By introducing an auxiliary variable, we reformulate the objective such that it can be solved substantially more efficiently by employing a regularization algorithm. One of the key steps in this algorithm is to solve a regularization sub-problem associated with the NP-hard binary optimization. We show that the sub-problem admits an analytical solution via cyclic coordinate descent. As such, a high-quality discrete solution can eventually be obtained in an efficient computing manner, therefore enabling it to tackle massive datasets. We evaluate the proposed approach, dubbed Supervised Discrete Hashing (SDH), on four large image datasets and demonstrate its superiority to the state-of-the-art hashing methods in large-scale image retrieval.

923 citations
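The cyclic coordinate descent step that the SDH abstract describes can be sketched as follows. This is a hedged illustration of the idea, not the authors' released implementation: it assumes an SDH-style objective min_B ||Y − BW||² + ν||B − F(X)||² with B ∈ {−1, +1}^{n×L}, and updates one bit column of B at a time in closed form (all variable names are illustrative).

```python
import numpy as np

def sdh_bitwise_update(Y, W, Fx, nu=1.0, n_sweeps=3):
    """Cyclic coordinate descent over bits for
    min_B ||Y - B W||^2 + nu * ||B - F(X)||^2,  B in {-1, +1}^{n x L}.
    Each bit column has a closed-form minimizer, so the objective
    never increases."""
    B = np.sign(Fx)
    B[B == 0] = 1
    Q = Y @ W.T + nu * Fx                  # n x L constant part of the gradient
    L = B.shape[1]
    for _ in range(n_sweeps):
        for l in range(L):
            Wp = np.delete(W, l, axis=0)   # W without row l
            Bp = np.delete(B, l, axis=1)   # B without column l
            b = np.sign(Q[:, l] - Bp @ (Wp @ W[l]))
            b[b == 0] = 1                  # break ties arbitrarily
            B[:, l] = b
    return B
```

Because every column update is an exact minimizer over that bit, the objective is non-increasing, which is what lets a discrete solver of this kind scale to large datasets.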


Posted Content
TL;DR: Supervised Discrete Hashing (SDH), as described in this paper, is a supervised hashing framework whose learning objective is to generate the optimal binary hash codes for linear classification, supporting efficient storage and retrieval of high-dimensional data such as images, videos, and documents.

807 citations


Proceedings ArticleDOI
07 Dec 2015
TL;DR: This paper investigates learning binary codes to exclusively handle the MIPS problem, and proposes an asymmetric binary code learning framework based on inner product fitting, dubbed Asymmetric Inner-product Binary Coding (AIBC), which is evaluated on several large-scale image datasets.
Abstract: Binary coding or hashing techniques are recognized to accomplish efficient near neighbor search, and have thus attracted broad interests in the recent vision and learning studies. However, such studies have rarely been dedicated to Maximum Inner Product Search (MIPS), which plays a critical role in various vision applications. In this paper, we investigate learning binary codes to exclusively handle the MIPS problem. Inspired by the latest advance in asymmetric hashing schemes, we propose an asymmetric binary code learning framework based on inner product fitting. Specifically, two sets of coding functions are learned such that the inner products between their generated binary codes can reveal the inner products between original data vectors. We also propose an alternative simpler objective which maximizes the correlations between the inner products of the produced binary codes and raw data vectors. In both objectives, the binary codes and coding functions are simultaneously learned without continuous relaxations, which is the key to achieving high-quality binary codes. We evaluate the proposed method, dubbed Asymmetric Inner-product Binary Coding (AIBC), under both objectives on several large-scale image datasets. Both variants are superior to the state-of-the-art binary coding and hashing methods in performing MIPS tasks.

157 citations
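The simpler correlation-style objective mentioned in the abstract admits a particularly clean alternating scheme. The sketch below is an illustrative toy, not the paper's exact formulation: with S = XYᵀ, it alternately maximizes tr(BxᵀSBy) over the two sign matrices, and each side has a closed-form sign update, so no continuous relaxation is involved.

```python
import numpy as np

def aibc_correlation_sketch(X, Y, L=16, n_iters=10, seed=0):
    """Toy sketch of a correlation-style objective: with S = X Y^T,
    alternately maximize tr(Bx^T S By) over sign matrices.  Fixing one
    side, the other has the closed-form maximizer Bx = sign(S By)
    (resp. By = sign(S^T Bx)), so the codes stay discrete throughout."""
    rng = np.random.default_rng(seed)
    S = X @ Y.T
    # random ±1 initialization for the second code matrix
    By = np.where(rng.standard_normal((Y.shape[0], L)) >= 0, 1.0, -1.0)
    Bx = np.ones((X.shape[0], L))
    for _ in range(n_iters):
        Bx = np.sign(S @ By);   Bx[Bx == 0] = 1
        By = np.sign(S.T @ Bx); By[By == 0] = 1
    return Bx, By
```

Each update is an exact maximizer given the other side, so the correlation objective is non-decreasing across iterations.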


Journal ArticleDOI
TL;DR: It is shown that hashing on the basis of t-distributed stochastic neighbor embedding outperforms state-of-the-art hashing methods on large-scale benchmark data sets and is very effective for image classification with very short code lengths; the proposed framework can be further improved.
Abstract: Learning-based hashing methods have attracted considerable attention due to their ability to greatly increase the scale at which existing algorithms may operate. Most of these methods are designed to generate binary codes preserving the Euclidean similarity in the original space. Manifold learning techniques, in contrast, are better able to model the intrinsic structure embedded in the original high-dimensional data. The complexities of these models, and the problems with out-of-sample data, have previously rendered them unsuitable for application to large-scale embedding, however. In this paper, how to learn compact binary embeddings on their intrinsic manifolds is considered. In order to address the above-mentioned difficulties, an efficient, inductive solution to the out-of-sample data problem, and a process by which nonparametric manifold learning may be used as the basis of a hashing method are proposed. The proposed approach thus allows the development of a range of new hashing techniques exploiting the flexibility of the wide variety of manifold learning approaches available. It is particularly shown that hashing on the basis of t-distributed stochastic neighbor embedding outperforms state-of-the-art hashing methods on large-scale benchmark data sets, and is very effective for image classification with very short code lengths. It is shown that the proposed framework can be further improved, for example, by minimizing the quantization error with learned orthogonal rotations without much computation overhead. In addition, a supervised inductive manifold hashing framework is developed by incorporating the label information, which is shown to greatly advance the semantic retrieval performance.

131 citations
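The inductive out-of-sample step that makes manifold embeddings usable for hashing can be sketched as follows. This is a hedged illustration: the Gaussian-kernel weighting, the anchor-set formulation, and all names are assumptions for the sketch, not the paper's exact estimator.

```python
import numpy as np

def inductive_codes(X_new, anchors, anchor_embed, sigma=1.0):
    """Out-of-sample sketch: a new point's low-dimensional embedding is
    a similarity-weighted average of the embeddings of a small base
    ("anchor") set, which makes any nonparametric manifold embedding
    inductive.  Codes are the sign of the embedding centered on the
    base mean."""
    # squared Euclidean distances from each new point to each anchor
    d2 = ((X_new[:, None, :] - anchors[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    W /= W.sum(axis=1, keepdims=True)          # normalized similarities
    E = W @ anchor_embed                       # inductive embedding
    B = np.sign(E - anchor_embed.mean(axis=0)) # binarize around base mean
    B[B == 0] = 1
    return B
```

Only the small anchor set and its precomputed embedding are needed at query time, which is what removes the out-of-sample obstacle for large-scale use.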


Journal ArticleDOI
TL;DR: This work proposes a novel unsupervised hashing approach, namely robust discrete hashing (RDSH), to facilitate large-scale semantic indexing of image data and integrates a flexible $\ell_{2,p}$ loss with nonlinear kernel embedding to adapt to different noise levels.
Abstract: In big data era, the ever-increasing image data has posed significant challenge on modern image retrieval. It is of great importance to index images with semantic keywords efficiently and effectively, especially confronted with fast-evolving property of the web. Learning-based hashing has shown its power in handling large-scale high-dimensional applications, such as image retrieval. Existing solutions normally separate the process of learning binary codes and hash functions into two independent stages to bypass challenge of the discrete constraints on binary codes. In this work, we propose a novel unsupervised hashing approach, namely robust discrete hashing (RDSH), to facilitate large-scale semantic indexing of image data. Specifically, RDSH simultaneously learns discrete binary codes as well as robust hash functions within a unified model. In order to suppress the influence of unreliable binary codes and learn robust hash functions, we also integrate a flexible $\ell _{2,p}$ loss with nonlinear kernel embedding to adapt to different noise levels. Finally, we devise an alternating algorithm to efficiently optimize RDSH model. Given a test image, we first conduct $r$ -nearest-neighbor search based on Hamming distance of binary codes, and then propagate semantic keywords of neighbors to the test image. Extensive experiments have been conducted on various real-world image datasets to show its superiority to the state-of-the-arts in large-scale semantic indexing.

85 citations
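The test-time indexing step described at the end of the abstract (Hamming $r$-nearest-neighbor search followed by keyword propagation) can be sketched as below; the helper names and the simple voting rule are illustrative.

```python
import numpy as np
from collections import Counter

def propagate_keywords(query_code, db_codes, db_keywords, r=3):
    """Sketch of the semantic-indexing step: retrieve the r nearest
    database codes by Hamming distance, then pool their keywords by
    vote count."""
    L = db_codes.shape[1]
    # for ±1-valued codes, Hamming distance = (L - inner product) / 2
    hamming = (L - db_codes @ query_code) // 2
    nearest = np.argsort(hamming, kind="stable")[:r]
    votes = Counter()
    for i in nearest:
        votes.update(db_keywords[i])
    return [kw for kw, _ in votes.most_common()]
```

The identity in the comment is why binary codes make this step cheap: Hamming ranking reduces to an integer inner product (or a popcount on packed bits).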


Proceedings ArticleDOI
13 Oct 2015
TL;DR: This paper proposes a novel unsupervised hashing approach, dubbed multi-view latent hashing (MVLH), to effectively incorporate multi-view data into hash code learning, and provides a novel scheme to directly learn the codes without resorting to continuous relaxations.
Abstract: Hashing techniques have attracted broad research interests in recent multimedia studies. However, most existing hashing methods focus on learning binary codes from data with only one single view, and thus cannot fully utilize the rich information from multiple views of data. In this paper, we propose a novel unsupervised hashing approach, dubbed multi-view latent hashing (MVLH), to effectively incorporate multi-view data into hash code learning. Specifically, the binary codes are learned from the latent factors shared by multiple views in a unified kernel feature space, where the weights of different views are adaptively learned according to the reconstruction error of each view. We then propose to solve the associated optimization problem with an efficient alternating algorithm. To obtain high-quality binary codes, we provide a novel scheme to directly learn the codes without resorting to continuous relaxations, where each bit is efficiently computed in a closed form. We evaluate the proposed method on several large-scale datasets and the results demonstrate the superiority of our method over several other state-of-the-art methods.

62 citations
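A minimal sketch of the multi-view idea, under stated assumptions rather than the paper's exact updates: each view X_v is approximated from a shared latent factor U, view weights are set inversely proportional to per-view reconstruction error, and the codes are the sign of U. All names and the specific re-weighting rule are illustrative.

```python
import numpy as np

def mvlh_sketch(views, L=8, n_iters=5, seed=0):
    """Illustrative multi-view factorization: each view X_v ~ U @ V_v
    with a shared latent factor U; views with smaller reconstruction
    error get larger weight; binary codes are sign(U)."""
    rng = np.random.default_rng(seed)
    n = views[0].shape[0]
    U = rng.standard_normal((n, L))
    w = np.full(len(views), 1.0 / len(views))
    for _ in range(n_iters):
        # per-view loadings by least squares, given the shared factor U
        Vs = [np.linalg.lstsq(U, Xv, rcond=None)[0] for Xv in views]
        errs = np.array([np.linalg.norm(Xv - U @ V)
                         for Xv, V in zip(views, Vs)])
        w = 1.0 / (errs + 1e-12)
        w /= w.sum()                       # adaptive view weights
        # refit U against the weight-stacked system
        A = np.hstack([np.sqrt(wv) * V for wv, V in zip(w, Vs)])
        Xw = np.hstack([np.sqrt(wv) * Xv for wv, Xv in zip(w, views)])
        U = np.linalg.lstsq(A.T, Xw.T, rcond=None)[0].T
    B = np.sign(U)
    B[B == 0] = 1
    return B, w
```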


Journal ArticleDOI
TL;DR: This work proposes a novel image classification method for robust face recognition, named Low-Rank Representation-based Classification (LRRC), based on seeking the lowest-rank representation of a set of test samples with respect to a set of training samples, which naturally provides discrimination for classification.

43 citations
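The low-rank representation idea behind LRRC can be illustrated with a simplified unconstrained surrogate: proximal gradient descent on 0.5·||X − AZ||² + λ||Z||_*, whose proximal step is singular value thresholding. This is a sketch under that substitution, not the paper's solver (which would typically target the constrained problem min ||Z||_* s.t. X = AZ); all names are illustrative.

```python
import numpy as np

def svt(M, t):
    """Singular value thresholding: the prox of the nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - t, 0.0)) @ Vt

def lrr_coefficients(A, X, lam=0.05, n_iters=200):
    """Proximal gradient on 0.5 * ||X - A Z||^2 + lam * ||Z||_*,
    a simplified stand-in for the low-rank representation problem.
    A holds training samples as columns, X the test samples."""
    step = 1.0 / np.linalg.norm(A.T @ A, 2)   # 1 / Lipschitz constant
    Z = np.zeros((A.shape[1], X.shape[1]))
    for _ in range(n_iters):
        Z = svt(Z - step * (A.T @ (A @ Z - X)), step * lam)
    return Z
```

Classification would then assign each test column to the class whose training columns reconstruct it with the smallest residual, which is where the representation's discrimination comes in.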


Proceedings ArticleDOI
13 Oct 2015
TL;DR: An on-line semantic coding model, which simultaneously exploits the rich hierarchical semantic prior knowledge in the learned dictionary, reflects semantic sparse property of visual codes, and explores semantic relationships among concepts in the semantic hierarchy is devised.
Abstract: In recent years, tremendous research endeavours have been dedicated to seeking effective visual representations for facilitating various multimedia applications, such as visual annotation and retrieval. Nonetheless, existing approaches can hardly achieve satisfactory performance because they rarely fully explore the semantic properties of visual codes. In this paper, we present a novel visual coding approach, termed hierarchical semantic visual coding (HSVC), to effectively encode visual objects (e.g., images and videos) in a semantic hierarchy. Specifically, we first construct a semantic-enriched dictionary hierarchy, which is comprised of dictionaries corresponding to all concepts in a semantic hierarchy as well as their hierarchical semantic correlation. Moreover, we devise an on-line semantic coding model, which simultaneously 1) exploits the rich hierarchical semantic prior knowledge in the learned dictionary, 2) reflects the semantic sparsity of visual codes, and 3) explores semantic relationships among concepts in the semantic hierarchy. To this end, we propose to integrate a concept-level group sparsity constraint and a semantic correlation matrix into a unified regularization term. We design an effective algorithm to optimize the proposed model, and a rigorous mathematical analysis has been provided to guarantee that the algorithm converges to a global optimum. Extensive experiments on various multimedia datasets have been conducted to illustrate the superiority of our proposed approach as compared to state-of-the-art methods.

40 citations
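The concept-level group-sparsity part of the regularizer can be made concrete with a standard proximal-gradient (ISTA) sketch. The semantic correlation term is omitted here for brevity, and all names are illustrative, so this shows only the group-sparse half of the paper's unified regularizer.

```python
import numpy as np

def group_soft_threshold(alpha, groups, t):
    """Proximal step for a concept-level group-sparsity penalty:
    each concept's coefficient block is shrunk toward zero jointly,
    so whole concepts switch on or off."""
    out = alpha.copy()
    for g in groups:
        nrm = np.linalg.norm(alpha[g])
        out[g] = 0.0 if nrm <= t else (1.0 - t / nrm) * alpha[g]
    return out

def group_sparse_code(x, D, groups, lam=0.1, n_iters=200):
    """ISTA for 0.5 * ||x - D a||^2 + lam * sum_g ||a_g||, where each
    group g indexes the dictionary atoms of one concept."""
    step = 1.0 / np.linalg.norm(D.T @ D, 2)   # 1 / Lipschitz constant
    a = np.zeros(D.shape[1])
    for _ in range(n_iters):
        a = group_soft_threshold(a - step * (D.T @ (D @ a - x)),
                                 groups, step * lam)
    return a
```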


Proceedings ArticleDOI
13 Oct 2015
TL;DR: This work proposes a novel model, termed as multi-view semi-supervised learning (MVSSL), for robust image annotation task, which exploits both labeled images and unlabeled images to uncover the intrinsic data structural information.
Abstract: With the explosive growth of web image data, image annotation has become a critical research issue for image semantic indexing and search. In this work, we propose a novel model, termed multi-view semi-supervised learning (MVSSL), for the robust image annotation task. Specifically, we exploit both labeled and unlabeled images to uncover the intrinsic structure of the data. Meanwhile, to comprehensively describe an individual datum, we take advantage of the correlated and complementary information derived from multiple facets of image data (i.e., multiple views or features). We devise a robust pair-wise constraint on outcomes of different views to achieve annotation consistency. Furthermore, we integrate a robust classifier learning component via an $\ell_{2,1}$ loss, which provides effective noise identification power during the learning process. Finally, we devise an efficient iterative algorithm to solve the optimization problem in MVSSL. We conduct extensive experiments on the NUS-WIDE dataset, and the results illustrate that our proposed approach is promising for large-scale web image annotation.

6 citations
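The robustness of the $\ell_{2,1}$ classifier component comes from the loss summing unsquared per-sample residual norms, so outliers are penalized linearly rather than quadratically. A common way to solve such objectives, sketched here with illustrative names, is iteratively reweighted least squares, where each sample's weight is inversely proportional to its residual norm.

```python
import numpy as np

def l21_robust_fit(X, Y, n_iters=20, eps=1e-8):
    """IRLS for min_W ||X W - Y||_{2,1}: sample i gets weight
    1 / (2 ||residual_i||), so heavily corrupted samples are
    down-weighted at every refit (a standard scheme for this loss)."""
    d = X.shape[1]
    W = np.linalg.lstsq(X, Y, rcond=None)[0]   # ordinary LS warm start
    for _ in range(n_iters):
        R = X @ W - Y
        dvec = 1.0 / (2.0 * np.linalg.norm(R, axis=1) + eps)
        Xw = X * dvec[:, None]                 # row-weighted design
        W = np.linalg.solve(X.T @ Xw + eps * np.eye(d), Xw.T @ Y)
    return W
```

On data with a few grossly corrupted labels, the reweighting drives the outliers' influence toward zero, which is the "noise identification power" the abstract refers to.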


Book ChapterDOI
16 Sep 2015
TL;DR: A new supervised hashing method to generate class-specific hash codes, which uses an inductive process based on the Inductive Manifold Hashing (IMH) model and leverages supervised information during hash code generation to address these difficulties and boost hashing quality.
Abstract: Recent years have witnessed the effectiveness and efficiency of learning-based hashing methods, which generate short binary codes preserving the Euclidean similarity of the original high-dimensional space. However, because of their complexity and out-of-sample problems, most of these methods are not appropriate for embedding large-scale datasets. In this paper, we propose a new supervised hashing method to generate class-specific hash codes, which uses an inductive process based on the Inductive Manifold Hashing (IMH) model and leverages supervised information in hash code generation to address these difficulties and boost hashing quality. It is experimentally shown that this method achieves excellent image classification and retrieval performance on large-scale multimedia datasets with very short binary codes.

4 citations