Locality-sensitive hashing

About: Locality-sensitive hashing is a research topic. Over its lifetime, 1,894 publications have appeared within this topic, receiving 69,362 citations.


Papers
Journal Article
TL;DR: This work presents and evaluates two indexing strategies for robust image hashes created by the ForBild tool, based on generic indexing approaches for Hamming spaces, i.e. spaces of bit vectors equipped with the Hamming distance.

22 citations

Journal Article
TL;DR: The classification accuracy of the proposed Bitwise Locality Sensitive method for two applications could approach the state-of-the-art method, while BitHash only requires a significantly smaller storage space.
Abstract: Locality Sensitive Hashing has been applied to detecting near-duplicate images, videos and web documents. In this paper we present a Bitwise Locality Sensitive method (BitHash) that uses only one bit per hash value, so that the storage space for hash values is significantly reduced and the estimator can be computed much faster. The method provides an unbiased estimate of pairwise Jaccard similarity, and the estimator is a simple linear function of Hamming distance. We rigorously analyze the variance of One-Bit Min-Hash (BitHash), showing that for high Jaccard similarity BitHash may provide accurate estimation, and that as the pairwise Jaccard similarity increases, the variance ratio of BitHash over the original min-hash decreases. Furthermore, BitHash compresses each data sample into a compact binary hash code while preserving the pairwise similarity of the original data. The binary code can be used as a compressed and informative representation in place of the original data for subsequent processing. For example, it can be naturally integrated with a classifier like SVM. We apply BitHash to two typical applications, near-duplicate image detection and sentiment analysis. Experiments on a real user’s photo collection and a popular sentiment analysis data set show that the classification accuracy of our proposed method for the two applications could approach that of the state-of-the-art method, while BitHash requires significantly less storage space.

22 citations
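
The estimator described in the abstract above (one bit per min-hash value, Jaccard similarity as a linear function of Hamming distance) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: it assumes the single bit is the lowest bit of a salted BLAKE2b min-hash value, so that two differing min-hash values agree on their bit with probability 1/2, which gives the estimate J ≈ 1 - 2d/k for Hamming distance d over k bits. The function names `one_bit_signature`, `hamming` and `estimate_jaccard` are hypothetical.

```python
import hashlib


def one_bit_signature(items, k):
    """Return k bits for a set of strings.

    Bit i is the lowest bit of the i-th min-hash value, where the i-th
    hash function is BLAKE2b salted with i (an assumption of this sketch,
    not the construction from the paper).
    """
    bits = []
    for i in range(k):
        salt = i.to_bytes(8, "big")
        min_val = min(
            int.from_bytes(
                hashlib.blake2b(x.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for x in items
        )
        bits.append(min_val & 1)
    return bits


def hamming(sig_a, sig_b):
    """Number of positions where two equal-length bit lists differ."""
    return sum(a != b for a, b in zip(sig_a, sig_b))


def estimate_jaccard(sig_a, sig_b):
    """Linear estimator: bits match with probability (1 + J) / 2,
    so J is estimated as 1 - 2 * d / k."""
    k = len(sig_a)
    return 1.0 - 2.0 * hamming(sig_a, sig_b) / k


if __name__ == "__main__":
    a = {str(i) for i in range(0, 150)}
    b = {str(i) for i in range(50, 200)}  # true Jaccard similarity is 0.5
    k = 1024
    print(estimate_jaccard(one_bit_signature(a, k), one_bit_signature(b, k)))
```

Each signature here costs k bits instead of k full hash values, which is the storage saving the abstract refers to; the price is a higher variance per hash, so more hash functions are needed for the same accuracy at low similarity.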

Patent
10 Jan 2014
TL;DR: In this article, a system, method, and tangible computing apparatus is disclosed for the detection of anomalies in an integrated data network, which comprises the creation and construction of a mathematical model that utilizes multi-dimensional mutual information to detect interactions and interrelationships between pairs of data streams and among pluralities of data streams.
Abstract: A system, method, and tangible computing apparatus is disclosed for the detection of anomalies in an integrated data network. Said system, method and apparatus comprises the creation and construction of a mathematical model that utilizes multi-dimensional mutual information to detect interactions and interrelationships between pairs of data streams and among pluralities of data streams. Real-time analysis of the operations of an integrated data network is enhanced and expedited via use of locality sensitive hashing that relies on density determinations of clusters of data.

22 citations

Journal Article
TL;DR: Experiments show that the proposed multiple histograms based image hashing is robust against content-preserving manipulations such as JPEG compression, watermark embedding, scaling, rotation, brightness and contrast adjustment, gamma correction and Gaussian low-pass filtering.
Abstract: Image hashing is a novel multimedia technology and finds many applications such as image retrieval, image copy detection, digital watermarking and image indexing. This paper proposes a multiple histograms based image hashing, which can reach an acceptable trade-off between rotation robustness and discrimination. The proposed hashing is done by converting the input image into a normalized image, dividing it into different rings, extracting ring-based histograms and compressing them by discrete wavelet transform. Hash similarity is evaluated by the L2 norm. Experiments show that our hashing is robust against content-preserving manipulations such as JPEG compression, watermark embedding, scaling, rotation, brightness and contrast adjustment, gamma correction and Gaussian low-pass filtering. Receiver operating characteristics (ROC) curve comparisons indicate that our hashing outperforms two existing algorithms in classification with respect to perceptual robustness and discriminative capability.

22 citations

Proceedings Article
Debing Zhang1, Genmao Yang1, Yao Hu1, Zhongming Jin1, Deng Cai1, Xiaofei He1 
03 Aug 2013
TL;DR: This paper proposes a novel unified approximate nearest neighbor search scheme to combine the advantages of both the effective data structure and the fast Hamming distance computation in hashing methods so that the searching procedure can be further accelerated.
Abstract: Nearest Neighbor Search is becoming more and more important in the face of the challenge of big data. Traditionally, to solve this problem, researchers have mainly focused on building effective data structures such as the hierarchical k-means tree, or on using hashing methods to accelerate the query process. In this paper, we propose a novel unified approximate nearest neighbor search scheme to combine the advantages of both the effective data structure and the fast Hamming distance computation in hashing methods. In this way, the searching procedure can be further accelerated. Computational complexity analysis and extensive experiments have demonstrated the effectiveness of our proposed scheme.

22 citations
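
The "fast Hamming distance computation" that the abstract above relies on is commonly done with an XOR followed by a population count. The sketch below illustrates only that primitive plus a brute-force scan over binary codes; it is not the unified scheme proposed in the paper, and it assumes hash codes are stored as Python integers. The names `hamming_distance` and `nearest_code` are hypothetical.

```python
def hamming_distance(code_a, code_b):
    """Hamming distance between two binary codes stored as integers:
    XOR marks the differing bit positions, then the set bits are counted."""
    return bin(code_a ^ code_b).count("1")


def nearest_code(query, codes):
    """Index of the stored code closest to the query in Hamming distance
    (linear scan; ties resolve to the earliest index)."""
    return min(range(len(codes)), key=lambda i: hamming_distance(query, codes[i]))


if __name__ == "__main__":
    print(hamming_distance(0b1011, 0b0010))  # prints 2
    print(nearest_code(0b1011, [0b0000, 0b1100, 0b1010]))  # prints 2
```

A linear scan like this is the baseline that tree-based or multi-index structures aim to beat by pruning most of the candidate codes before any distance is computed.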


Network Information
Related Topics (5)
Deep learning: 79.8K papers, 2.1M citations, 84% related
Feature extraction: 111.8K papers, 2.1M citations, 83% related
Convolutional neural network: 74.7K papers, 2M citations, 83% related
Feature (computer vision): 128.2K papers, 1.7M citations, 82% related
Support vector machine: 73.6K papers, 1.7M citations, 82% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    43
2022    108
2021    88
2020    110
2019    104
2018    139