
Locality-sensitive hashing

About: Locality-sensitive hashing (LSH) is an approximate nearest-neighbor search technique that hashes similar items into the same buckets with high probability. Over the lifetime, 1894 publications have been published within this topic, receiving 69362 citations.
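As a minimal illustration of the idea (a sketch of the classic random-hyperplane scheme for cosine similarity, not taken from any paper listed below), each vector gets a short binary code whose bits are the signs of random projections; similar vectors then agree on most bits:

```python
import numpy as np

def lsh_signature(vectors, planes):
    """One bit per random hyperplane: the sign of the projection."""
    return (vectors @ planes.T >= 0).astype(np.uint8)

rng = np.random.default_rng(0)
dim, n_bits = 64, 16
planes = rng.standard_normal((n_bits, dim))   # random hyperplanes

x = rng.standard_normal(dim)
y = x + 0.01 * rng.standard_normal(dim)       # near-duplicate of x
z = rng.standard_normal(dim)                  # unrelated vector

sig = lsh_signature(np.stack([x, y, z]), planes)
# Near-duplicates agree on almost all bits; unrelated vectors on about half.
```

Collisions on these codes serve as a cheap pre-filter: only items sharing a code (or most of its bits) need an exact comparison.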


Papers
Journal ArticleDOI
TL;DR: A multi-layer neural network learns compact and discriminative binary codes by exploiting both the structural information between frames within a video and the nonlinear relationship between video samples; a subspace clustering method groups frames into scenes.
Abstract: In this paper, we propose a nonlinear structural hashing approach to learn compact binary codes for scalable video search. Unlike most existing video hashing methods which consider image frames within a video separately for binary code learning, we develop a multi-layer neural network to learn compact and discriminative binary codes by exploiting both the structural information between different frames within a video and the nonlinear relationship between video samples. To be specific, we learn these binary codes under two different constraints at the output of our network: 1) the distance between the learned binary codes for frames within the same scene is minimized and 2) the distance between the learned binary matrices for a video pair with the same label is less than a threshold and that for a video pair with different labels is larger than a threshold. To better measure the structural information of the scenes from videos, we employ a subspace clustering method to cluster frames into different scenes. Moreover, we design multiple hierarchical nonlinear transformations to preserve the nonlinear relationship between videos. Experimental results on three video data sets show that our method outperforms state-of-the-art hashing approaches on the scalable video search task.

46 citations
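The two output constraints described in the abstract above can be sketched as a toy loss (the notation, the squared-distance choice, and the threshold tau are assumptions for illustration, not the paper's exact formulation):

```python
import numpy as np

def structural_hash_loss(codes_scene, b1, b2, same_label, tau=2.0):
    """Toy version of the two constraints: (1) codes of frames in the same
    scene are pulled toward their mean; (2) a video pair's code distance is
    pushed below tau for same-label pairs and above tau otherwise."""
    intra = np.sum((codes_scene - codes_scene.mean(axis=0)) ** 2)
    d = np.sum((b1 - b2) ** 2)
    if same_label:
        inter = max(0.0, d - tau)    # same label: distance should be < tau
    else:
        inter = max(0.0, tau - d)    # different labels: distance should be > tau
    return intra + inter
```

Identical codes for a same-label pair incur zero penalty, while identical codes for a different-label pair are penalized by the full margin tau.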

Book ChapterDOI
05 Sep 2010
TL;DR: Two new algorithms extend Spectral Hashing to non-Euclidean spaces and retrieve similar objects in as little as O(K) time, where K is the number of clusters in the data.
Abstract: Approximate Nearest Neighbor (ANN) methods such as Locality Sensitive Hashing, Semantic Hashing, and Spectral Hashing, provide computationally efficient procedures for finding objects similar to a query object in large datasets. These methods have been successfully applied to search web-scale datasets that can contain millions of images. Unfortunately, the key assumption in these procedures is that objects in the dataset lie in a Euclidean space. This assumption is not always valid and poses a challenge for several computer vision applications where data commonly lies in complex non-Euclidean manifolds. In particular, dynamic data such as human activities are commonly represented as distributions over bags of video words or as dynamical systems. In this paper, we propose two new algorithms that extend Spectral Hashing to non-Euclidean spaces. The first method considers the Riemannian geometry of the manifold and performs Spectral Hashing in the tangent space of the manifold at several points. The second method divides the data into subsets and takes advantage of the kernel trick to perform non-Euclidean Spectral Hashing. For a data set of N samples the proposed methods are able to retrieve similar objects in as low as O(K) time complexity, where K is the number of clusters in the data. Since K ≪ N, our methods are extremely efficient. We test and evaluate our methods on synthetic data generated from the Unit Hypersphere and the Grassmann manifold. Finally, we show promising results on a human action database.

46 citations
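The first method's key step — lifting manifold data into a tangent space before hashing — can be sketched for the unit hypersphere via the Riemannian log map; the hash applied afterwards here is a plain hyperplane code rather than Spectral Hashing, purely to keep the sketch short:

```python
import numpy as np

def sphere_log_map(p, X):
    """Log map on the unit hypersphere: lift points X (rows, unit norm)
    into the tangent space at base point p (unit norm)."""
    cos_t = np.clip(X @ p, -1.0, 1.0)
    theta = np.arccos(cos_t)
    sin_t = np.sin(theta)
    safe_sin = np.where(sin_t < 1e-8, 1.0, sin_t)
    scale = np.where(sin_t < 1e-8, 1.0, theta / safe_sin)
    return scale[:, None] * (X - cos_t[:, None] * p)

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 8))
X /= np.linalg.norm(X, axis=1, keepdims=True)   # points on the unit sphere
p = X.mean(axis=0)
p /= np.linalg.norm(p)                          # base point for the chart
T = sphere_log_map(p, X)                        # tangent-space coordinates
planes = rng.standard_normal((12, 8))
codes = (T @ planes.T >= 0).astype(np.uint8)    # 12-bit code per point
```

A useful sanity check on the log map: the Euclidean norm of each lifted point equals its geodesic distance to the base point.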

Journal ArticleDOI
TL;DR: OSH, an Online Supervised Hashing technique based on Error Correcting Output Codes, is proposed; it considers a stochastic setting where data arrives sequentially, learns and adapts its hash functions in a discriminative manner, and yields state-of-the-art retrieval performance.

46 citations

Proceedings ArticleDOI
12 Mar 2013
TL;DR: This work presents mvHash-B, a new similarity preserving hashing algorithm that combines majority voting with run-length encoding to compress the input data and uses Bloom filters to represent the fingerprint; it is almost as fast as SHA-1 and thus faster than any other SPH algorithm.
Abstract: The handling of hundreds of thousands of files is a major challenge in today's IT forensic investigations. In order to cope with this information overload, investigators use fingerprints (hash values) to identify known files automatically using blacklists or whitelists. Besides detecting exact duplicates it is helpful to locate similar files by using similarity preserving hashing (SPH), too. We present a new algorithm for similarity preserving hashing. It is based on the idea of majority voting in conjunction with run length encoding to compress the input data and uses Bloom filters to represent the fingerprint. It is therefore called mvHash-B. Our assessment shows that mvHash-B is superior to other SPHs with respect to run time efficiency: It is almost as fast as SHA-1 and thus faster than any other SPH algorithm. Additionally the hash value length is approximately 0.5% of the input length and hence outperforms most existing algorithms. Finally, we show that the robustness of mvHash-B against active manipulation is sufficient for practical purposes.

46 citations
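A much-simplified sketch of the first two stages described above — majority voting to binarize the input, then run-length encoding to compress it. The windowing here operates per byte popcount and is an assumption for illustration; the real mvHash-B works on bit neighborhoods and adds a Bloom-filter fingerprint stage:

```python
def majority_bits(data: bytes, window: int = 5) -> list[int]:
    """Majority vote: a position becomes 1 if at least half of the bits
    in its byte neighborhood are set (simplified stand-in for stage one)."""
    if not data:
        return []
    popcounts = [bin(b).count("1") for b in data]
    out, half = [], window // 2
    for i in range(len(popcounts)):
        lo, hi = max(0, i - half), min(len(popcounts), i + half + 1)
        total = sum(popcounts[lo:hi])
        out.append(1 if total >= 4 * (hi - lo) else 0)  # >= half of 8*(hi-lo) bits
    return out

def run_length_encode(bits: list[int]) -> list[int]:
    """Lengths of the alternating runs of identical bits (stage two)."""
    if not bits:
        return []
    runs, count = [], 1
    for prev, cur in zip(bits, bits[1:]):
        if cur == prev:
            count += 1
        else:
            runs.append(count)
            count = 1
    runs.append(count)
    return runs
```

Majority voting makes the bit stream robust to small local edits, which in turn keeps the run lengths — and hence the fingerprint — stable under minor file modifications.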

Proceedings ArticleDOI
01 Oct 2010
TL;DR: A novel k-nearest neighbor search (KNNS) algorithm for proximity computation in motion planning that exploits the computational capabilities of many-core GPUs, using multiple cores and data parallelism effectively.
Abstract: We present a novel k-nearest neighbor search algorithm (KNNS) for proximity computation in motion planning that exploits the computational capabilities of many-core GPUs. Our approach uses locality sensitive hashing and cuckoo hashing to construct an efficient KNNS algorithm that has linear space and time complexity and exploits the multiple cores and data parallelism effectively. In practice, we see an order-of-magnitude improvement in speed and scalability over a prior GPU-based KNNS algorithm. On some benchmarks, our KNNS algorithm improves the performance of the overall planner by 20–40 times for a CPU-based planner and up to 2 times for a GPU-based planner.

46 citations
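The bucketing idea behind this approach — hash points once, then rank only the candidates that share the query's bucket — can be sketched on the CPU (the paper does the table lookups with cuckoo hashing on a GPU; the plain dict and the fallback to a full scan on an empty bucket are simplifications):

```python
import numpy as np

def build_lsh_index(points, planes):
    """Group point indices by their random-hyperplane bucket code."""
    codes = points @ planes.T >= 0
    buckets = {}
    for idx, code in enumerate(codes):
        buckets.setdefault(code.tobytes(), []).append(idx)
    return buckets

def knn_query(q, points, planes, buckets, k):
    """Approximate k-NN: exact ranking restricted to the query's bucket."""
    key = (q @ planes.T >= 0).tobytes()
    cand = list(buckets.get(key, range(len(points))))  # fallback: full scan
    dists = np.linalg.norm(points[cand] - q, axis=1)
    return [cand[i] for i in np.argsort(dists)[:k]]

rng = np.random.default_rng(2)
points = rng.standard_normal((500, 16))
planes = rng.standard_normal((8, 16))      # 8 bits -> up to 256 buckets
buckets = build_lsh_index(points, planes)
```

Both the index build and each query are linear in the number of points touched, which is the property the paper's GPU variant parallelizes across cores.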


Network Information
Related Topics (5)
Deep learning
79.8K papers, 2.1M citations
84% related
Feature extraction
111.8K papers, 2.1M citations
83% related
Convolutional neural network
74.7K papers, 2M citations
83% related
Feature (computer vision)
128.2K papers, 1.7M citations
82% related
Support vector machine
73.6K papers, 1.7M citations
82% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    43
2022    108
2021    88
2020    110
2019    104
2018    139