Topic
Locality-sensitive hashing
About: Locality-sensitive hashing is a research topic. Over its lifetime, 1894 publications have been published on this topic, receiving 69362 citations.
Papers
••
25 Jul 2005
TL;DR: This paper explores applying current R-tree based indexes to approximate and on-line nearest neighbor queries with bounds, and provides guidelines on how this can be useful in practice.
Abstract: We explore using index structures for effective approximate and on-line nearest neighbor queries. While many index structures have been shown to suffer from the curse of dimensionality, we believe that indexes can still be useful in providing quick approximate solutions to nearest neighbor queries. Moreover, the information provided by the indexes can provide certain bounds that can be invaluable for on-line nearest neighbor queries. This paper explores the idea of applying current R-tree based indexes to approximate and on-line nearest neighbors with bounds. We experiment with various heuristics and compare the trade-off between accuracy and efficiency. Our results are compared to locality sensitive hashing (LSH) and they show the effectiveness of the proposed scheme. We also provide guidelines on how this can be useful in a practical sense.
12 citations
••
03 Sep 2012
TL;DR: This paper addresses the problem of speeding up the prediction phase of linear Support Vector Machines via Locality Sensitive Hashing by building efficient hash-based classifiers that are applied in a first stage in order to approximate the exact results and filter the hypothesis space.
Abstract: How to train effective classifiers on huge amounts of multimedia data is clearly a major challenge that is attracting more and more research across several communities. Less effort, however, is spent on the counterpart scalability issue: how to apply big trained models efficiently on huge non-annotated media collections? In this paper, we address the problem of speeding up the prediction phase of linear Support Vector Machines via Locality Sensitive Hashing. We propose building efficient hash-based classifiers that are applied in a first stage in order to approximate the exact results and filter the hypothesis space. Experiments performed with millions of one-against-one classifiers show that the proposed hash-based classifier can be more than two orders of magnitude faster than the exact classifier with minor losses in quality.
12 citations
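The abstract above does not spell out how its hash-based classifiers are built. As a rough illustration of the general idea only, the following sketch uses sign random projections (a standard LSH family for angular similarity) to approximate the sign of a bias-free linear decision function `w · x` from short bit sketches. All names (`make_hyperplanes`, `sketch`, `approx_predict`) are hypothetical and are not taken from the paper.

```python
import random


def make_hyperplanes(dim, bits, seed=0):
    """Draw `bits` random Gaussian hyperplanes in `dim` dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]


def sketch(vec, hyperplanes):
    """One bit per hyperplane: 1 if the projection is non-negative, else 0."""
    return [
        1 if sum(h_i * v_i for h_i, v_i in zip(h, vec)) >= 0 else 0
        for h in hyperplanes
    ]


def approx_predict(w_sketch, x_sketch):
    """Approximate sign(w . x) from two bit sketches.

    For sign random projections, the expected fraction of differing bits
    equals angle(w, x) / pi, so fewer than half differing bits suggests an
    angle below 90 degrees, i.e. a positive dot product.
    """
    disagreements = sum(a != b for a, b in zip(w_sketch, x_sketch))
    return 1 if disagreements < len(w_sketch) / 2 else -1


# Usage: sketch the weight vector once, then compare cheap bit sketches.
planes = make_hyperplanes(dim=3, bits=64, seed=42)
w = [1.0, -2.0, 0.5]            # assumed, already-trained weight vector
w_bits = sketch(w, planes)
print(approx_predict(w_bits, sketch([2.0, -4.0, 1.0], planes)))
```

In a two-stage setup such a cheap estimate could filter candidates before the exact classifier is evaluated; the estimate is unreliable for inputs close to the decision boundary, where the angle is near 90 degrees.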
•
TL;DR: The Indyk-Motwani locality-sensitive hashing (LSH) framework is a general technique for constructing a data structure to answer approximate near neighbor queries by using a distribution over locality-sensitive hash functions that partition space; this paper shows how to reduce the number of hash functions it needs.
Abstract: The Indyk-Motwani Locality-Sensitive Hashing (LSH) framework (STOC 1998) is a general technique for constructing a data structure to answer approximate near neighbor queries by using a distribution $\mathcal{H}$ over locality-sensitive hash functions that partition space. For a collection of $n$ points, after preprocessing, the query time is dominated by $O(n^{\rho} \log n)$ evaluations of hash functions from $\mathcal{H}$ and $O(n^{\rho})$ hash table lookups and distance computations, where $\rho \in (0,1)$ is determined by the locality-sensitivity properties of $\mathcal{H}$. It follows from a recent result by Dahlgaard et al. (FOCS 2017) that the number of locality-sensitive hash functions can be reduced to $O(\log^2 n)$, leaving the query time to be dominated by $O(n^{\rho})$ distance computations and $O(n^{\rho} \log n)$ additional word-RAM operations. We state this result as a general framework and provide a simpler analysis showing that the number of lookups and distance computations closely matches the Indyk-Motwani framework, making it a viable replacement in practice. Using ideas from another locality-sensitive hashing framework by Andoni and Indyk (SODA 2006), we are able to reduce the number of additional word-RAM operations to $O(n^\rho)$.
12 citations
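To make the data structure in the abstract above concrete, here is a minimal sketch of a classic Indyk-Motwani-style index for Hamming space. It uses bit sampling as the hash family, with `L` tables that each key on `k` concatenated sampled coordinates. It illustrates the baseline framework only, not the reduced-hash-function construction the paper proposes; the class name and parameters are assumptions chosen for illustration.

```python
import random
from collections import defaultdict


def hamming(a, b):
    """Number of positions at which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


class BitSamplingLSH:
    """Toy LSH index over fixed-length bit tuples under Hamming distance."""

    def __init__(self, dim, k, L, seed=0):
        rng = random.Random(seed)
        # Each table hashes a point to the values of k randomly sampled
        # coordinates; concatenating k samples makes buckets more selective.
        self.projections = [
            [rng.randrange(dim) for _ in range(k)] for _ in range(L)
        ]
        self.tables = [defaultdict(list) for _ in range(L)]
        self.points = []

    def _key(self, point, coords):
        return tuple(point[i] for i in coords)

    def insert(self, point):
        """Store the point in one bucket per table and return its index."""
        idx = len(self.points)
        self.points.append(point)
        for coords, table in zip(self.projections, self.tables):
            table[self._key(point, coords)].append(idx)
        return idx

    def query(self, q):
        """Return the index of the closest candidate found, or None."""
        candidates = set()
        for coords, table in zip(self.projections, self.tables):
            candidates.update(table.get(self._key(q, coords), ()))
        if not candidates:
            return None
        # Verify candidates with true distance computations.
        return min(candidates, key=lambda i: hamming(self.points[i], q))


# Usage: index a few 16-bit points and look one of them up again.
index = BitSamplingLSH(dim=16, k=4, L=8, seed=1)
for n in (0, 1, 255, 4096, 65535):
    index.insert(tuple((n >> b) & 1 for b in range(16)))
print(index.query(tuple((255 >> b) & 1 for b in range(16))))
```

A query evaluates `k * L` hash functions and performs `L` table lookups; in the standard analysis `L` is chosen on the order of $n^{\rho}$, which is where the costs quoted in the abstract come from.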
••
01 Apr 2017
TL;DR: This paper proposes LSH-DDP, an approximate algorithm that exploits Locality Sensitive Hashing for partitioning data, performs local computation, and aggregates local results to approximate the final results, and presents formal analysis of this algorithm.
Abstract: Density Peaks (DP) is a recently proposed clustering algorithm that has distinctive advantages over existing clustering algorithms. It has already been used in a wide range of applications. However, DP requires computing the distance between every pair of input points, therefore incurring quadratic computation overhead, which is prohibitive for large data sets. In this paper, we propose an efficient distributed algorithm, LSH-DDP, which is an approximate algorithm that exploits Locality Sensitive Hashing. We present formal analysis of LSH-DDP, and show that the approximation quality and the runtime can be controlled by tuning the parameters of LSH-DDP. Experimental results on both a local cluster and EC2 show that LSH-DDP achieves a factor of 17-70x speedup over the naïve distributed DP implementation and 2x speedup over the state-of-the-art EDDPC approach, while returning comparable cluster results.
12 citations
•
04 May 2015
TL;DR: To remove the drawbacks of simple hashing techniques, LSBF is used to store data in the Bloom filter, which helps to find the closest approximate result using the Locality Sensitive Hashing approach.
Abstract: Bloom filters play an important part in search techniques, both for faster data access and in networks. They process data in a short amount of time, often with probabilistic analysis. Bloom filters also decrease the cost of analyzing data. Various applications use this technology for accessing and processing data. Thus, implementing a Bloom filter over big data will result in efficient query access. In this paper, an approach to implementing the Locality Sensitive Bloom Filter (LSBF) technique in big data is proposed. To remove the drawbacks of simple hashing techniques, LSBF is used to store data in the Bloom filter, which helps to find the closest approximate result using the Locality Sensitive Hashing approach.
12 citations
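The abstract above describes LSBF only at a high level. One way to picture the general idea is a Bloom filter whose ordinary hash functions are replaced by locality-sensitive ones, so that nearby vectors tend to set and test the same bits. The sketch below assumes Euclidean vectors and random-projection bucket hashes; the class name, parameters, and bit-mapping rule are hypothetical and are not taken from the paper.

```python
import math
import random


class LocalitySensitiveBloomFilter:
    """Toy Bloom filter that hashes vectors with random-projection buckets."""

    def __init__(self, dim, num_bits=1024, num_hashes=4,
                 bucket_width=4.0, seed=0):
        rng = random.Random(seed)
        self.num_bits = num_bits
        self.bucket_width = bucket_width
        self.bits = [False] * num_bits
        # Each hash is floor((a . v + b) / w) with Gaussian a and uniform b.
        self.funcs = [
            ([rng.gauss(0.0, 1.0) for _ in range(dim)],
             rng.uniform(0.0, bucket_width))
            for _ in range(num_hashes)
        ]

    def _positions(self, vec):
        for j, (a, b) in enumerate(self.funcs):
            proj = sum(a_i * v_i for a_i, v_i in zip(a, vec))
            bucket = math.floor((proj + b) / self.bucket_width)
            # Map (hash index, bucket id) to a bit position (illustrative).
            yield (bucket * 31 + j) % self.num_bits

    def add(self, vec):
        for pos in self._positions(vec):
            self.bits[pos] = True

    def might_contain_near(self, vec):
        """True if every probed bit is set: a near item may be stored."""
        return all(self.bits[pos] for pos in self._positions(vec))


# Usage: store one vector, then probe with a slightly perturbed copy.
lsbf = LocalitySensitiveBloomFilter(dim=3, seed=5)
lsbf.add([1.0, 2.0, 3.0])
print(lsbf.might_contain_near([1.0, 2.0, 3.0]))   # True: same buckets
print(lsbf.might_contain_near([1.1, 2.0, 2.9]))   # likely True, not guaranteed
```

As with an ordinary Bloom filter, a positive answer can be a false positive; in addition, a nearby vector can be missed when a projection falls just across a bucket boundary.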