Topic
Locality-sensitive hashing
About: Locality-sensitive hashing is a research topic. Over its lifetime, 1894 publications have been published within this topic, receiving 69362 citations.
Papers
TL;DR: This work presents and evaluates two indexing strategies for robust image hashes created by the ForBild tool, based on generic indexing approaches for Hamming spaces, i.e. spaces of bit vectors equipped with the Hamming distance.
22 citations
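As a rough illustration of what a generic index for a Hamming space can look like, the sketch below uses the pigeonhole idea: if a code is split into m blocks and two codes differ in fewer than m bits, at least one block must match exactly. This is a minimal sketch of one well-known approach, not necessarily either of the two strategies evaluated in the work above; the function names, the integer encoding of the bit vectors, and the parameters `n_bits` and `m` are assumptions made for illustration.

```python
from collections import defaultdict


def blocks(code, n_bits, m):
    """Split an n_bits-wide integer code into m equal-width blocks.

    Assumes n_bits is divisible by m (hypothetical simplification).
    """
    width = n_bits // m
    mask = (1 << width) - 1
    return [(code >> (i * width)) & mask for i in range(m)]


def build_index(codes, n_bits, m):
    """Build one exact-match table per block position."""
    tables = [defaultdict(list) for _ in range(m)]
    for idx, code in enumerate(codes):
        for i, blk in enumerate(blocks(code, n_bits, m)):
            tables[i][blk].append(idx)
    return tables


def range_query(query, radius, codes, tables, n_bits, m):
    """Return indices of all codes within Hamming distance `radius` of query.

    Correct only when radius < m: then any true match shares at least
    one identical block with the query (pigeonhole principle).
    """
    if radius >= m:
        raise ValueError("radius must be smaller than the number of blocks")
    candidates = set()
    for i, blk in enumerate(blocks(query, n_bits, m)):
        candidates.update(tables[i].get(blk, []))
    # Verify each candidate with an exact XOR/popcount Hamming distance.
    return sorted(i for i in candidates
                  if bin(codes[i] ^ query).count("1") <= radius)
```

Only codes that share a block with the query are verified, so the exact distance is computed for a subset of the collection rather than for every stored hash.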
TL;DR: On two applications, the classification accuracy of the proposed Bitwise Locality Sensitive method (BitHash) can approach that of the state-of-the-art method, while BitHash requires significantly less storage space.
Abstract: Locality Sensitive Hashing has been applied to detecting near-duplicate images, videos and web documents. In this paper we present a Bitwise Locality Sensitive method that uses only one bit per hash value (BitHash), so that the storage space for hash values is significantly reduced and the estimator can be computed much faster. The method provides an unbiased estimate of pairwise Jaccard similarity, and the estimator is a simple linear function of Hamming distance. We rigorously analyze the variance of One-Bit Min-Hash (BitHash), showing that for high Jaccard similarity BitHash may provide accurate estimation, and that as the pairwise Jaccard similarity increases, the variance ratio of BitHash over the original min-hash decreases. Furthermore, BitHash compresses each data sample into a compact binary hash code while preserving the pairwise similarity of the original data. The binary code can be used as a compressed and informative representation in place of the original data for subsequent processing. For example, it can be naturally integrated with a classifier such as an SVM. We apply BitHash to two typical applications, near-duplicate image detection and sentiment analysis. Experiments on a real user's photo collection and a popular sentiment analysis data set show that the classification accuracy of our proposed method on the two applications can approach that of the state-of-the-art method, while BitHash requires significantly less storage space.
22 citations
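The BitHash abstract above describes an estimator that is a linear function of Hamming distance. A minimal sketch of how such an estimator can arise, assuming idealized hash functions: two min-hash values coincide with probability J (the Jaccard similarity), and when they differ their lowest bits agree about half the time, so the one-bit signatures agree with probability (1 + J)/2 and J can be estimated as 1 - 2·(Hamming distance)/k. The helper names, the use of salted BLAKE2b as a stand-in hash family, and the integer item encoding are assumptions for illustration, not the paper's implementation.

```python
import hashlib


def _h(item: int, seed: int) -> int:
    """Deterministic 64-bit hash of a non-negative integer item, salted by seed."""
    digest = hashlib.blake2b(
        item.to_bytes(8, "little"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def one_bit_minhash(items, k):
    """Keep only the lowest bit of each of k min-hash values."""
    return [min(_h(x, seed) for x in items) & 1 for seed in range(k)]


def estimate_jaccard(sig_a, sig_b):
    """Estimate Jaccard similarity as a linear function of Hamming distance."""
    k = len(sig_a)
    hamming = sum(a != b for a, b in zip(sig_a, sig_b))
    return 1.0 - 2.0 * hamming / k


if __name__ == "__main__":
    a = set(range(0, 90))
    b = set(range(10, 100))  # true Jaccard similarity is 80 / 100 = 0.8
    sa, sb = one_bit_minhash(a, 512), one_bit_minhash(b, 512)
    print(estimate_jaccard(sa, sb))
```

Each signature here costs k bits, and the estimate tightens as k grows, since the Hamming distance is a sum of k roughly independent bit comparisons.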
10 Jan 2014
TL;DR: In this article, a system, method, and tangible computing apparatus are disclosed for the detection of anomalies in an integrated data network; they comprise the construction of a mathematical model that uses multi-dimensional mutual information to detect interactions and interrelationships between pairs of data streams and among pluralities of data streams.
Abstract: A system, method, and tangible computing apparatus are disclosed for the detection of anomalies in an integrated data network. Said system, method and apparatus comprise the creation and construction of a mathematical model that utilizes multi-dimensional mutual information to detect interactions and interrelationships between pairs of data streams and among pluralities of data streams. Real-time analysis of the operations of an integrated data network is enhanced and expedited via use of locality sensitive hashing that relies on density determinations of clusters of data.
22 citations
TL;DR: Experiments show that the proposed multiple histograms based image hashing is robust against content-preserving manipulations such as JPEG compression, watermark embedding, scaling, rotation, brightness and contrast adjustment, gamma correction and Gaussian low-pass filtering.
Abstract: Image hashing is a novel multimedia technology with many applications, such as image retrieval, image copy detection, digital watermarking and image indexing. This paper proposes a multiple histograms based image hashing, which can reach an acceptable trade-off between rotation robustness and discrimination. The proposed hashing converts the input image into a normalized image, divides it into different rings, extracts ring-based histograms and compresses them by discrete wavelet transform. Hash similarity is evaluated by the L2 norm. Experiments show that our hashing is robust against content-preserving manipulations such as JPEG compression, watermark embedding, scaling, rotation, brightness and contrast adjustment, gamma correction and Gaussian low-pass filtering. Receiver operating characteristic (ROC) curve comparisons indicate that our hashing achieves better classification performance than two existing algorithms with respect to perceptual robustness and discriminative capability.
22 citations
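The ring-based step in the abstract above can be sketched as follows. This is a simplified illustration, assuming a square grayscale image stored as a list of rows with values in 0..255; it omits the image normalization and the discrete wavelet transform compression described in the abstract, and the ring count, bin count, and function names are hypothetical choices. Because a pixel's distance from the centre does not change under rotation, the per-ring histograms are unaffected by a 90° turn.

```python
import math


def ring_histograms(img, n_rings=4, n_bins=8):
    """Concatenate normalized intensity histograms of concentric rings."""
    h, w = len(img), len(img[0])
    r_max2 = min(h, w) ** 2  # squared diameter of the inscribed circle, doubled coords
    hists = [[0] * n_bins for _ in range(n_rings)]
    for y in range(h):
        for x in range(w):
            # Doubled coordinates keep the squared distance an exact integer.
            d2 = (2 * y - (h - 1)) ** 2 + (2 * x - (w - 1)) ** 2
            if d2 >= r_max2:
                continue  # skip corners outside the inscribed circle
            ring = int(math.sqrt(d2 / r_max2) * n_rings)
            bin_idx = min(img[y][x] * n_bins // 256, n_bins - 1)
            hists[ring][bin_idx] += 1
    feature = []
    for hist in hists:
        total = sum(hist) or 1  # avoid division by zero for empty rings
        feature.extend(count / total for count in hist)
    return feature


def l2_distance(u, v):
    """L2 norm of the difference between two hash vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def rotate90(img):
    """Rotate a list-of-rows image by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]
```

A small L2 distance between two such vectors suggests visually similar images; the threshold separating "similar" from "different" would have to be chosen empirically.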
03 Aug 2013
TL;DR: This paper proposes a novel unified approximate nearest neighbor search scheme to combine the advantages of both the effective data structure and the fast Hamming distance computation in hashing methods so that the searching procedure can be further accelerated.
Abstract: Nearest neighbor search is becoming increasingly important in the face of big data. Traditionally, researchers have mainly focused on building effective data structures, such as hierarchical k-means trees, or on using hashing methods to accelerate the query process. In this paper, we propose a novel unified approximate nearest neighbor search scheme to combine the advantages of both the effective data structure and the fast Hamming distance computation in hashing methods. In this way, the searching procedure can be further accelerated. Computational complexity analysis and extensive experiments have demonstrated the effectiveness of our proposed scheme.
22 citations
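The "fast Hamming distance computation" that the abstract above refers to usually comes down to an XOR followed by a bit count. The sketch below shows only that part, ranking stored binary codes against a query; it does not reproduce the paper's unified scheme or its tree structure, and the function names and the integer encoding of codes are assumptions for illustration.

```python
import heapq


def hamming(a: int, b: int) -> int:
    """Hamming distance between two binary codes stored as integers."""
    return bin(a ^ b).count("1")


def knn_hamming(query, codes, k):
    """Return indices of the k codes closest to query, ties broken by index."""
    return heapq.nsmallest(
        k,
        range(len(codes)),
        key=lambda i: (hamming(query, codes[i]), i),
    )


if __name__ == "__main__":
    codes = [0b1111, 0b1010, 0b0000, 0b1011]
    print(knn_hamming(0b1011, codes, 2))
```

In a combined scheme, a data structure such as a tree would first narrow the candidate set, and a ranking like this one would then run only over the surviving candidates.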