Topic

Locality-sensitive hashing

About: Locality-sensitive hashing is a research topic. Over its lifetime, 1,894 publications have appeared within this topic, receiving 69,362 citations.

Papers

Proceedings Article

Utilizing indexes for approximate and on-line nearest neighbor queries

King-Ip Lin¹, M. Nolen¹, K. Kommeneni¹

University of Memphis¹

25 Jul 2005

TL;DR: This paper explores the idea of applying current R-tree based indexes to approximate and on-line nearest neighbors with bounds and provides guidelines on how this can be useful in a practical sense.

Abstract: We explore using index structures for effective approximate and on-line nearest neighbor queries. While many index structures have been shown to suffer from the curse of dimensionality, we believe that indexes can still be useful in providing quick approximate solutions to nearest neighbor queries. Moreover, the information provided by the indexes can yield bounds that are invaluable for on-line nearest neighbor queries. This paper explores the idea of applying current R-tree based indexes to approximate and on-line nearest neighbors with bounds. We experiment with various heuristics and compare the trade-off between accuracy and efficiency. Our results are compared to locality sensitive hashing (LSH), and they show the effectiveness of the proposed scheme. We also provide guidelines on how this can be useful in a practical sense.

12 citations

Proceedings Article

Hash-Based Support Vector Machines Approximation for Large Scale Prediction

Saloua Litayem¹, Alexis Joly¹, Nozha Boujemaa¹

French Institute for Research in Computer Science and Automation¹

03 Sep 2012

TL;DR: This paper addresses the problem of speeding-up the prediction phase of linear Support Vector Machines via Locality Sensitive Hashing by building efficient hash based classifiers that are applied in a first stage in order to approximate the exact results and filter the hypothesis space.

Abstract: How to train effective classifiers on huge amounts of multimedia data is clearly a major challenge that is attracting more and more research across several communities. Less effort, however, is spent on the counterpart scalability issue: how to apply big trained models efficiently to huge non-annotated media collections? In this paper, we address the problem of speeding up the prediction phase of linear Support Vector Machines via Locality Sensitive Hashing. We propose building efficient hash-based classifiers that are applied in a first stage in order to approximate the exact results and filter the hypothesis space. Experiments performed with millions of one-against-one classifiers show that the proposed hash-based classifier can be more than two orders of magnitude faster than the exact classifier with minor losses in quality.
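As a rough illustration of how hashing can stand in for a linear classifier's sign, the sketch below uses random-hyperplane (sign random projection) signatures: the fraction of differing bits between the signatures of a weight vector `w` and an input `x` estimates the angle between them, so fewer than half the bits differing suggests `w·x > 0`. This is an assumption-laden sketch, not the authors' method: the function names are hypothetical, the bias term is ignored, and the paper's actual hash family and filtering stage may differ.

```python
import random

def make_signature_fn(dim, num_bits, seed=0):
    """Build a hypothetical signature function from random Gaussian hyperplanes."""
    rng = random.Random(seed)
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]

    def signature(v):
        # One bit per hyperplane: which side of the hyperplane v falls on.
        return [sum(p_i * v_i for p_i, v_i in zip(p, v)) >= 0.0 for p in planes]

    return signature

def hash_predict(sig_w, sig_x):
    """Approximate sign(w . x) from two bit signatures (no bias term).

    Each bit differs with probability angle(w, x) / pi, so a Hamming
    distance below half the signature length suggests an angle below
    pi / 2, i.e. a positive dot product.
    """
    hamming = sum(a != b for a, b in zip(sig_w, sig_x))
    return 1 if hamming < len(sig_w) / 2 else -1

signature = make_signature_fn(dim=3, num_bits=64, seed=42)
w = [1.0, -2.0, 0.5]
sig_w = signature(w)                      # computed once per classifier
same_side = hash_predict(sig_w, signature(w))
opposite = hash_predict(sig_w, signature([-c for c in w]))
```

Inputs close to the decision boundary are the ones most likely to be misclassified by such a signature comparison, which is why a scheme of this kind would typically serve as a first-stage filter rather than a replacement for the exact classifier.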

12 citations

Posted Content

Fast Locality-Sensitive Hashing Frameworks for Approximate Near Neighbor Search

Tobias Christiani¹

Maersk¹

25 Aug 2017 - arXiv: Data Structures and Algorithms

TL;DR: The Indyk-Motwani locality-sensitive hashing (LSH) framework is a general technique for constructing a data structure that answers approximate near neighbor queries using a distribution over locality-sensitive hash functions that partition space. This paper states a framework that needs only $O(\log^2 n)$ such hash functions while closely matching the number of lookups and distance computations of the Indyk-Motwani framework.

Abstract: The Indyk-Motwani Locality-Sensitive Hashing (LSH) framework (STOC 1998) is a general technique for constructing a data structure to answer approximate near neighbor queries by using a distribution $\mathcal{H}$ over locality-sensitive hash functions that partition space. For a collection of $n$ points, after preprocessing, the query time is dominated by $O(n^{\rho} \log n)$ evaluations of hash functions from $\mathcal{H}$ and $O(n^{\rho})$ hash table lookups and distance computations where $\rho \in (0,1)$ is determined by the locality-sensitivity properties of $\mathcal{H}$. It follows from a recent result by Dahlgaard et al. (FOCS 2017) that the number of locality-sensitive hash functions can be reduced to $O(\log^2 n)$, leaving the query time to be dominated by $O(n^{\rho})$ distance computations and $O(n^{\rho} \log n)$ additional word-RAM operations. We state this result as a general framework and provide a simpler analysis showing that the number of lookups and distance computations closely match the Indyk-Motwani framework, making it a viable replacement in practice. Using ideas from another locality-sensitive hashing framework by Andoni and Indyk (SODA 2006) we are able to reduce the number of additional word-RAM operations to $O(n^\rho)$.
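The classic multi-table scheme that the abstract takes as its baseline can be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's improved framework: it uses random-hyperplane hashes for angular similarity as the family $\mathcal{H}$, and the class name `LSHIndex` and the parameters `k` (hash functions concatenated per table) and `num_tables` are hypothetical choices for illustration.

```python
import math
import random
from collections import defaultdict

def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def make_hyperplane_hash(dim, k, rng):
    """Concatenate k random-hyperplane bits into one bucket key."""
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(k)]

    def h(v):
        return tuple(sum(p_i * v_i for p_i, v_i in zip(p, v)) >= 0.0 for p in planes)

    return h

class LSHIndex:
    """Hypothetical multi-table LSH index (illustrative parameters)."""

    def __init__(self, dim, k=8, num_tables=10, seed=0):
        rng = random.Random(seed)
        self.hashes = [make_hyperplane_hash(dim, k, rng) for _ in range(num_tables)]
        self.tables = [defaultdict(list) for _ in range(num_tables)]
        self.points = []

    def add(self, v):
        idx = len(self.points)
        self.points.append(v)
        for h, table in zip(self.hashes, self.tables):
            table[h(v)].append(idx)   # one bucket insertion per table
        return idx

    def query(self, q):
        """Return the index of the most similar colliding point, or None."""
        candidates = set()
        for h, table in zip(self.hashes, self.tables):
            candidates.update(table.get(h(q), ()))   # one lookup per table
        if not candidates:
            return None
        # Exact similarity is computed only for the colliding candidates.
        return max(candidates, key=lambda i: cosine(self.points[i], q))

rng = random.Random(1)
index = LSHIndex(dim=5, k=4, num_tables=6, seed=0)
pts = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(50)]
for p in pts:
    index.add(p)
nearest = index.query(pts[7])
```

In this sketch every query evaluates `k * num_tables` hash functions; that hash-evaluation cost is the term the abstract describes reducing, while the table lookups and distance computations stay comparable.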

12 citations

Proceedings Article

Efficient Distributed Density Peaks for Clustering Large Data Sets in MapReduce

Yanfeng Zhang¹, Shimin Chen, Ge Yu¹

Northeastern University¹

01 Apr 2017

TL;DR: This paper proposes LSH-DDP, an approximate algorithm that exploits Locality Sensitive Hashing for partitioning data, performs local computation, and aggregates local results to approximate the final results, and presents formal analysis of this algorithm.

Abstract: Density Peaks (DP) is a recently proposed clustering algorithm that has distinctive advantages over existing clustering algorithms. It has already been used in a wide range of applications. However, DP requires computing the distance between every pair of input points, therefore incurring quadratic computation overhead, which is prohibitive for large data sets. In this paper, we propose an efficient distributed algorithm, LSH-DDP, which is an approximate algorithm that exploits Locality Sensitive Hashing. We present formal analysis of LSH-DDP and show that the approximation quality and the runtime can be controlled by tuning the parameters of LSH-DDP. Experimental results on both a local cluster and EC2 show that LSH-DDP achieves a 17-70x speedup over the naïve distributed DP implementation and a 2x speedup over the state-of-the-art EDDPC approach, while returning comparable cluster results.
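To make the quadratic cost concrete, here is a minimal sketch of the exact, single-machine Density Peaks quantities as they are usually defined: the local density `rho` (number of neighbors within a cutoff `d_c`) and `delta` (distance to the nearest point of higher density). It is the baseline computation, not LSH-DDP; the function name and the cutoff value are illustrative assumptions, and ties in density are handled naively.

```python
import math

def density_peaks(points, d_c):
    """Compute (rho, delta) for every point using all pairwise distances."""
    n = len(points)
    # The full n x n distance matrix is the quadratic step.
    dist = [[math.dist(p, q) for q in points] for p in points]

    # rho[i]: number of other points closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c) for i in range(n)]

    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # The densest point has no denser neighbor; by convention it takes
        # its largest distance to any point.
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta

points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0)]
rho, delta = density_peaks(points, d_c=1.2)
```

Points with both large `rho` and large `delta` are the candidate cluster centers; an LSH-based variant would restrict these comparisons to points that share a hash bucket instead of scanning all pairs.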

12 citations

Proceedings Article

Big data query optimization by using Locality Sensitive Bloom Filter

Mayank Bhushan, Monica Singh, Sumit Yadav

04 May 2015

TL;DR: To overcome the drawbacks of simple hashing, this paper proposes using a Locality Sensitive Bloom Filter (LSBF) to store data, which helps to find the closest approximate result using the Locality Sensitive Hashing approach.

Abstract: Bloom filters play an important part in search techniques, enabling faster access to data, including in networks. They process data in a short amount of time using probabilistic analysis, and they decrease the cost of analyzing data. Various applications use this technology for accessing and processing data. Applying a Bloom filter to big data should therefore result in efficient query access. In this paper, an approach to implementing the Locality Sensitive Bloom Filter (LSBF) technique in big data is proposed. To overcome the drawbacks of simple hashing, the LSBF is used to store data in the Bloom filter, which helps to find the closest approximate result using the Locality Sensitive Hashing approach.

12 citations

Performance

Metrics

Papers: 2,048

Citations: 77,891

No. of papers in the topic in previous years
Year	Papers
2023	43
2022	108
2021	88
2020	110
2019	104
2018	139
