Topic

Locality-sensitive hashing

About: Locality-sensitive hashing is a research topic. To date, 1894 publications have appeared on this topic, receiving 69362 citations.


Papers
Patent
24 Apr 2001
TL;DR: This patent describes an implementation of a technology for recognizing the perceptual similarity of the content of digital goods; the technique produces hash values that are proximally near each other when the digital goods contain similar content.
Abstract: An implementation of a technology is described herein for recognizing the perceptual similarity of the content of digital goods. At least one implementation, described herein, introduces a new hashing technique. More particularly, this hashing technique produces hash values for digital goods that are proximally near each other, when the digital goods contain perceptually similar content. In other words, if the content of digital goods is perceptually similar, then their hash values are, likewise, similar. The hash values are proximally near each other. This is unlike conventional hashing techniques where the hash values of goods with perceptually similar content are far apart with high probability in some distance sense (e.g., Hamming). This abstract itself is not intended to limit the scope of this patent. The scope of the present invention is pointed out in the appended claims.

82 citations
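
The contrast drawn in the abstract above, between hashes that stay close for similar content and conventional hashes that scatter, can be illustrated with a minimal sketch. It does not reproduce the patent's technique: it swaps in random-hyperplane sign hashing, a standard locality-sensitive family for cosine similarity, and compares it with SHA-256. The feature vectors, the 64-bit code length, the noise level, and all function names are assumptions chosen for illustration.

```python
import hashlib
import random


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two integer codes differ."""
    return bin(a ^ b).count("1")


def make_hyperplanes(dim: int, bits: int, seed: int = 0) -> list:
    """Draw `bits` random Gaussian hyperplane normals of dimension `dim`."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]


def lsh_hash(vec, planes) -> int:
    """One bit per hyperplane: 1 if the vector lies on its positive side."""
    code = 0
    for plane in planes:
        dot = sum(p * v for p, v in zip(plane, vec))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def sha_hash(vec, bits: int) -> int:
    """Conventional hash: the top `bits` bits of SHA-256 over the vector's repr."""
    digest = hashlib.sha256(repr(vec).encode()).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)


# Hypothetical feature vectors standing in for "digital goods".
rng = random.Random(1)
dim, bits = 32, 64
planes = make_hyperplanes(dim, bits)
original = [rng.gauss(0.0, 1.0) for _ in range(dim)]
similar = [x + rng.gauss(0.0, 0.01) for x in original]  # slightly perturbed copy
unrelated = [rng.gauss(0.0, 1.0) for _ in range(dim)]

d_similar = hamming(lsh_hash(original, planes), lsh_hash(similar, planes))
d_unrelated = hamming(lsh_hash(original, planes), lsh_hash(unrelated, planes))
d_sha = hamming(sha_hash(original, bits), sha_hash(similar, bits))

print("LSH, similar content:  ", d_similar)
print("LSH, unrelated content:", d_unrelated)
print("SHA-256, similar content:", d_sha)
```

Under these assumptions the locality-sensitive codes of the two similar vectors differ in only a few bits, while the SHA-256 codes of the same pair differ in roughly half of their bits, as they would for any two distinct inputs.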

Journal ArticleDOI
TL;DR: Experimental results show that the proposed multiple feature kernel hashing framework achieves superior accuracy and efficiency over state-of-the-art methods, and that alternating optimization efficiently learns the hashing functions and the kernel space.

82 citations

Proceedings ArticleDOI
24 Feb 2008
TL;DR: This paper presents a general pattern-based behavior synthesis framework which can efficiently extract similar structures in programs and applies it to FPGA resource optimization with the observation that multiplexors are particularly expensive on FPGAs.
Abstract: Pattern-based synthesis has drawn wide interest from researchers who tried to utilize the regularity in applications for design optimizations. In this paper we present a general pattern-based behavior synthesis framework which can efficiently extract similar structures in programs. Our approach is very scalable thanks to advanced pruning techniques that include locality sensitive hashing and characteristic vectors. The similarity of structures is captured by a mismatch-tolerant metric: graph edit distance. The edit distance between two graphs is the minimum number of vertex/edge insertion, deletion, and substitution operations needed to transform one graph into the other. Graph edit distance can naturally handle various program variations such as bit-width variations, structure variations and port variations. In addition, we apply our pattern-based synthesis system to FPGA resource optimization with the observation that multiplexors are particularly expensive on FPGA platforms. Considering knowledge of discovered patterns, the resource binding step can intelligently generate the data-path to reduce interconnect costs. Experiments show our approach can, on average, reduce the total area by about 20% with 7% latency overhead on the Xilinx Virtex-4 FPGAs, compared to the traditional behavior synthesis flow.

82 citations

Journal ArticleDOI
TL;DR: This work proposes an adaptive binary quantization (ABQ) method that learns a discriminative hash function with prototypes associated with small unique binary codes and devise a distributed framework for the large-scale learning, which can significantly speed up the training of ABQ in the distributed environment.
Abstract: Hashing has proved to be an attractive technique for fast nearest neighbor search over big data. Compared with projection-based hashing methods, prototype-based ones have greater power to generate discriminative binary codes for data with complex intrinsic structure. However, existing prototype-based methods, such as spherical hashing and K-means hashing, still suffer from ineffective coding that utilizes the complete binary codes in a hypercube. To address this problem, we propose an adaptive binary quantization (ABQ) method that learns a discriminative hash function with prototypes associated with small unique binary codes. Our alternating optimization adaptively discovers the prototype set and the code set of a varying size in an efficient way, which together robustly approximate the data relations. Our method can be naturally generalized to the product space for long hash codes, and enjoys fast training that is linear in the number of training data. We further devise a distributed framework for the large-scale learning, which can significantly speed up the training of ABQ in the distributed environment that has been widely deployed in many areas nowadays. Extensive experiments on four large-scale (up to 80 million) data sets demonstrate that our method significantly outperforms state-of-the-art hashing methods, with relative performance gains of up to 58.84%.

82 citations
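
The encoding step of a prototype-based hash, as described in the abstract above, can be sketched as a nearest-prototype lookup. This is not the ABQ method itself: ABQ learns the prototypes and their codes by alternating optimization, whereas the prototypes, the codes, and the function names below are hand-picked assumptions. The sketch only shows how a point maps to the code of its closest prototype, using three of the four possible 2-bit codes rather than the complete hypercube.

```python
def squared_distance(a, b) -> float:
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def encode(point, prototypes, codes) -> int:
    """Return the binary code attached to the prototype nearest to `point`."""
    nearest = min(
        range(len(prototypes)),
        key=lambda i: squared_distance(point, prototypes[i]),
    )
    return codes[nearest]


# Hypothetical prototypes, each paired with a unique 2-bit code.
# The code 0b11 is deliberately left unused.
prototypes = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
codes = [0b00, 0b01, 0b10]

for point in [(0.1, 0.2), (0.9, 0.1), (0.2, 0.8)]:
    print(point, "->", format(encode(point, prototypes, codes), "02b"))
```

Points that fall near the same prototype receive the same code, so how well the codes discriminate depends entirely on where the prototypes sit, which is the part a learning method would have to supply.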

Journal ArticleDOI
Di Wang, Xinbo Gao, Xiumei Wang, Lihuo He, Bo Yuan
TL;DR: The proposed MDBE preserves both discriminability and similarity for hash codes, which enhances retrieval accuracy compared with state-of-the-art methods for large-scale cross-modal retrieval tasks.
Abstract: Multimodal hashing, which conducts effective and efficient nearest neighbor search across heterogeneous data on large-scale multimedia databases, has been attracting increasing interest, given the explosive growth of multimedia content on the Internet. Recent multimodal hashing research mainly aims at learning compact binary codes that preserve the semantic information given by labels. The overwhelming majority of these methods are similarity-preserving approaches which approximate the pairwise similarity matrix with Hamming distances between the to-be-learnt binary hash codes. However, these methods ignore the discriminative property in the hash learning process, which leaves hash codes from different classes indistinguishable and therefore reduces the accuracy and robustness of the nearest neighbor search. To this end, we present a novel multimodal hashing method, named multimodal discriminative binary embedding (MDBE), which focuses on learning discriminative hash codes. First, the proposed method formulates the hash function learning in terms of classification, where the binary codes generated by the learned hash functions are expected to be discriminative. Then, it exploits the label information to discover the shared structures inside heterogeneous data. Finally, the learned structures are preserved for hash codes to produce similar binary codes in the same class. Hence, the proposed MDBE can preserve both discriminability and similarity for hash codes, and will enhance retrieval accuracy. Thorough experiments on benchmark data sets demonstrate that the proposed method achieves excellent accuracy and competitive computational efficiency compared with the state-of-the-art methods for large-scale cross-modal retrieval tasks.

81 citations
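
The nearest neighbor search over binary codes that the abstract above relies on can be sketched as a ranking by Hamming distance. This is a generic retrieval step, not the MDBE learning procedure: the 8-bit codes, the item names, and the idea that a text query and the images already share one code space are assumptions made for illustration.

```python
def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two integer codes differ."""
    return bin(a ^ b).count("1")


def search(query_code: int, database: dict, k: int) -> list:
    """Return the k item names whose codes are closest to the query code."""
    ranked = sorted(database, key=lambda name: hamming(query_code, database[name]))
    return ranked[:k]


# Hypothetical image codes, as if produced by an already learned hash function.
image_codes = {
    "img_cat": 0b10110010,
    "img_dog": 0b10100111,
    "img_car": 0b01001101,
    "img_bus": 0b01001001,
}

# Hypothetical code of a text query mapped into the same Hamming space.
text_query = 0b10110011

print(search(text_query, image_codes, k=2))
```

Because the comparison is an XOR followed by a bit count, each candidate costs only a few machine operations, which is what makes compact binary codes attractive for large databases.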


Network Information
Related Topics (5)
Deep learning: 79.8K papers, 2.1M citations, 84% related
Feature extraction: 111.8K papers, 2.1M citations, 83% related
Convolutional neural network: 74.7K papers, 2M citations, 83% related
Feature (computer vision): 128.2K papers, 1.7M citations, 82% related
Support vector machine: 73.6K papers, 1.7M citations, 82% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    43
2022    108
2021    88
2020    110
2019    104
2018    139