Topic

Locality-sensitive hashing

About: Locality-sensitive hashing is a research topic. Over its lifetime, 1,894 publications have been published on this topic, receiving 69,362 citations.


Papers
Journal Article
TL;DR: A novel multimodal hashing method, termed semantic neighbor graph hashing (SNGH), is proposed. It aims to preserve a fine-grained similarity metric based on a semantic graph that is constructed by jointly pursuing semantic supervision and the local neighborhood structure.
Abstract: Hashing methods have been widely used for approximate nearest neighbor search in recent years due to their computational and storage efficiency. Most existing multimodal hashing methods try to preserve the similarity relationship based on either metric distances or semantic labels in a procrustean way, while ignoring the intra-class and inter-class variations inherent in the metric space. In this paper, we propose a novel multimodal hashing method, termed semantic neighbor graph hashing (SNGH), which aims to preserve the fine-grained similarity metric based on the semantic graph that is constructed by jointly pursuing the semantic supervision and the local neighborhood structure. Specifically, a semantic graph is constructed to capture the local similarity structure for the image modality and the text modality, respectively. Furthermore, we define a function based on the local similarity to adaptively calculate multi-level similarities by encoding the intra-class and inter-class variations. After the unified hash codes are obtained, logistic regression with the kernel trick is employed to learn view-specific hash functions independently for each modality. Extensive experiments are conducted on four widely used multimodal data sets. The experimental results demonstrate the superiority of the proposed SNGH method compared with state-of-the-art multimodal hashing methods.

35 citations

Proceedings Article
25 Jul 2015
TL;DR: A novel hashing approach to deal with Partial Multi-Modal data is presented, in which the hashing codes are learned by simultaneously ensuring the data consistency among different modalities via latent subspace learning, and preserving data similarity within the same modality through graph Laplacian.
Abstract: Hashing approaches have become popular for fast similarity search in many large-scale applications. Real-world data usually have multiple modalities or different representations from multiple sources. Various hashing methods have been proposed to generate compact binary codes from multi-modal data. However, most existing multimodal hashing techniques assume that each data example appears in all modalities, or at least that there is one modality containing all data examples. In real applications, it is often the case that every modality is missing some data, which results in many partial examples, i.e., examples with some modalities missing. In this paper, we present a novel hashing approach to deal with Partial Multi-Modal data. In particular, the hashing codes are learned by simultaneously ensuring the data consistency among different modalities via latent subspace learning, and preserving data similarity within the same modality through the graph Laplacian. We then further improve the codes via orthogonal rotation based on the orthogonal invariance property of our formulation. Experiments on two multi-modal datasets demonstrate the superior performance of the proposed approach over several state-of-the-art multi-modal hashing methods.

34 citations

Journal Article
TL;DR: A big data-driven and nonparametric model aided by 6G is proposed in this article to extract similar traffic patterns over time for accurate and efficient short-term traffic flow prediction in massive IoT, which is mainly based on time-aware locality-sensitive hashing (LSH).
Abstract: With the advent of the Internet of Things (IoT) and the increasing popularity of intelligent transportation systems, a large number of sensing devices are installed on roads to monitor traffic dynamics in real time. These sensors collect streaming traffic data distributed across different traffic sites, which constitute the main source of big traffic data. Analyzing and mining such big traffic data in massive IoT can help traffic administrations make scientific and reasonable traffic scheduling decisions, so as to avoid future traffic congestion. However, such traffic decision making often requires frequent and massive data transmissions between distributed sensors and centralized cloud computing centers, which calls for lightweight data integration and accurate data analysis based on large-scale traffic data. In view of this challenge, a big data-driven and nonparametric model aided by 6G is proposed in this article to extract similar traffic patterns over time for accurate and efficient short-term traffic flow prediction in massive IoT, which is mainly based on time-aware locality-sensitive hashing (LSH). We design a wide range of experiments based on a real-world big traffic data set to validate the feasibility of our proposal. Experimental reports demonstrate that the prediction accuracy and efficiency of our proposal are increased by 32.6% and 97.3%, respectively, compared with the other two competitive approaches.

34 citations
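
As a rough illustration of how LSH can group similar patterns such as the traffic readings in the entry above, here is a minimal random-hyperplane sketch in Python. It is a generic LSH scheme for cosine similarity, not the time-aware variant the article proposes, and the names `make_hyperplanes`, `signature`, `build_index`, and `hamming` are hypothetical helpers introduced only for this example.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplanes in dim dimensions (assumed setup)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def signature(vec, planes):
    """One bit per hyperplane: 1 if vec lies on its non-negative side, else 0."""
    return tuple(
        1 if sum(p * x for p, x in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )


def build_index(vectors, planes):
    """Group item keys into buckets keyed by their bit signature."""
    buckets = {}
    for key, vec in vectors.items():
        buckets.setdefault(signature(vec, planes), []).append(key)
    return buckets


def hamming(a, b):
    """Number of positions where two signatures differ."""
    return sum(x != y for x, y in zip(a, b))
```

Vectors that point in similar directions tend to share most signature bits, so candidate neighbors can be looked up in the same bucket (or in buckets at small Hamming distance) instead of scanning every item. In practice several independent hash tables are usually combined to trade recall against lookup cost.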

Journal Article
TL;DR: This paper proposes a semi-supervised technique for spam detection on Twitter that employs an ensemble-based framework of four classifiers arranged in stages, providing fast results with less computational effort.

34 citations

Journal Article
TL;DR: Cross-Modal Self-Taught Hashing (CMSTH) is proposed for large-scale cross-modal and unimodal image retrieval and can effectively capture the semantic correlation from unlabeled training data.

34 citations


Network Information
Related Topics (5)
Deep learning: 79.8K papers, 2.1M citations, 84% related
Feature extraction: 111.8K papers, 2.1M citations, 83% related
Convolutional neural network: 74.7K papers, 2M citations, 83% related
Feature (computer vision): 128.2K papers, 1.7M citations, 82% related
Support vector machine: 73.6K papers, 1.7M citations, 82% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    43
2022    108
2021    88
2020    110
2019    104
2018    139