Topic

Locality-sensitive hashing

About: Locality-sensitive hashing is a research topic. Over its lifetime, 1,894 publications have been published on this topic, receiving 69,362 citations.


Papers
Proceedings ArticleDOI
10 Jan 2015
TL;DR: Two methods are proposed for reducing the number of item-pair comparisons through simple clustering, where similar items tend to fall in the same cluster: one uses Locality Sensitive Hashing (LSH) and the other uses item consumption cardinality.
Abstract: Item-based Collaborative Filtering (CF) models offer good recommendations with low latency. Still, constructing such models is often slow, requiring the comparison of all item pairs and then caching, for each item, the list of its most similar items. In this paper we suggest methods for reducing the number of item-pair comparisons through simple clustering, where similar items tend to be in the same cluster. We propose two methods, one that uses Locality Sensitive Hashing (LSH) and another that uses item consumption cardinality. We evaluate the two methods, demonstrating that the cardinality-based method reduces the computation time dramatically without harming accuracy.
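The clustering idea above can be illustrated with a small sketch. The following is a minimal, self-contained example (not the paper's implementation) of MinHash-based LSH banding: items are bucketed by the sets of users who consumed them, and only item pairs that share at least one bucket are compared. The item names, signature length, and banding parameters are all invented for illustration.

```python
# Minimal sketch of MinHash LSH banding to limit item-pair comparisons.
# Hypothetical data and parameters; not the paper's method.
import random
from collections import defaultdict
from itertools import combinations

random.seed(0)
NUM_HASHES = 16          # MinHash signature length (assumed)
BANDS, ROWS = 8, 2       # items agreeing on all rows of a band share a bucket

# toy data: item -> set of users who consumed it
items = {
    "item_a": {1, 2, 3, 4},
    "item_b": {2, 3, 4, 5},
    "item_c": {7, 8, 9},
    "item_d": {7, 8, 10},
}

# random hash functions h(x) = (a*x + b) mod p used by MinHash
P = 2_147_483_647
hash_params = [(random.randrange(1, P), random.randrange(0, P)) for _ in range(NUM_HASHES)]

def minhash(users):
    """MinHash signature: minimum hash value of the user set under each hash function."""
    return [min(((a * u + b) % P) for u in users) for a, b in hash_params]

# bucket items by their band signatures
buckets = defaultdict(set)
signatures = {item: minhash(users) for item, users in items.items()}
for item, sig in signatures.items():
    for band in range(BANDS):
        key = (band, tuple(sig[band * ROWS:(band + 1) * ROWS]))
        buckets[key].add(item)

# only compare item pairs that share at least one bucket
candidate_pairs = set()
for members in buckets.values():
    candidate_pairs.update(combinations(sorted(members), 2))

print("candidate pairs:", sorted(candidate_pairs))
for a, b in sorted(candidate_pairs):
    jaccard = len(items[a] & items[b]) / len(items[a] | items[b])
    print(a, b, "Jaccard =", jaccard)
```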

9 citations

Proceedings ArticleDOI
01 Nov 2015
TL;DR: Experiments on the challenging Digital Database for Screening Mammography (DDSM) dataset demonstrated the performance of the proposed CBIR method for retrieving the most relevant mammograms in a large-scale dataset.
Abstract: Content-based image retrieval (CBIR) is an essential task for providing the most similar images, especially in the context of medical imaging for diagnosis aid. In this paper, we propose a CBIR method for large-scale mammogram datasets. To extract region of interest (ROI) signatures, four moment descriptors were defined after computing the curvelet coefficients for each level of the ROI. Then, an unsupervised technique based on locality sensitive hashing was adopted for indexing the extracted signatures. The main contribution of the suggested method resides in the variance-based filtering within the retrieval phase, which extracts the suitable buckets in the shortest time while optimizing the memory requirement. After that, an accurate search in Hamming space is performed to identify the ROIs similar to the query case. Experiments on the challenging Digital Database for Screening Mammography (DDSM) dataset demonstrated the performance of the proposed method for retrieving the most relevant mammograms in a large-scale dataset. It achieves a mean retrieval precision of 97.1% over a total of 11,218 mammogram ROIs.
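As a rough illustration of the index-then-search pipeline described above, the sketch below indexes synthetic descriptor vectors with random-hyperplane (sign-based) LSH codes and ranks candidates by Hamming distance to the query code. It is a generic stand-in, not the paper's curvelet/moment pipeline or its variance-based bucket filtering; the descriptor dimension, code length, and data are assumed.

```python
# Minimal sketch: random-hyperplane LSH codes + Hamming-distance retrieval.
# Synthetic descriptors and invented parameters; not the paper's pipeline.
import numpy as np

rng = np.random.default_rng(0)
DIM, NUM_BITS = 16, 32                      # descriptor size and code length (assumed)
planes = rng.standard_normal((NUM_BITS, DIM))

def binary_code(x):
    """Project the descriptor onto random hyperplanes and keep the sign bits."""
    return (planes @ x > 0).astype(np.uint8)

def hamming(a, b):
    return int(np.count_nonzero(a != b))

# index: signature id -> binary code (toy ROI descriptors)
database = {i: rng.standard_normal(DIM) for i in range(100)}
index = {i: binary_code(v) for i, v in database.items()}

# query: rank database entries by Hamming distance between codes
query = rng.standard_normal(DIM)
q_code = binary_code(query)
ranked = sorted(index, key=lambda i: hamming(index[i], q_code))
print("top-5 candidates:", ranked[:5])
```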

9 citations

Book ChapterDOI
30 Sep 2020
TL;DR: This work presents a novel index structure called radius-optimized Locality Sensitive Hashing (roLSH), and extensive experimental analysis on real datasets shows the performance benefit of roLSH over existing state-of-the-art LSH techniques.
Abstract: Similarity search in high-dimensional spaces is an important task for many multimedia applications. Due to the notorious curse of dimensionality, approximate nearest neighbor techniques are preferred over exact searching techniques since they can return good-enough results at much better speed. Locality Sensitive Hashing (LSH) is a very popular random hashing technique for finding approximate nearest neighbors. Existing state-of-the-art LSH techniques that aim to improve overall performance mainly focus on minimizing the total number of I/Os while sacrificing overall processing time. The main time-consuming step in LSH techniques is finding neighboring points in the projected spaces. We present a novel index structure called radius-optimized Locality Sensitive Hashing (roLSH). With the help of sampling techniques and neural networks, we present two techniques to find neighboring points in projected spaces efficiently, without sacrificing the accuracy of the results. Our extensive experimental analysis on real datasets shows the performance benefit of roLSH over existing state-of-the-art LSH techniques.
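To make the notion of "finding neighboring points in projected spaces" concrete, here is a minimal sketch of Euclidean LSH with bucket probing: points are hashed with h(x) = floor((a·x + b)/w), and a query probes its own bucket plus nearby buckets within a fixed offset radius before exact re-ranking. This is the generic search step that roLSH optimizes, not roLSH itself; the dimension, number of hash functions, bucket width, and data are assumed.

```python
# Minimal sketch of Euclidean LSH with neighboring-bucket probing.
# Invented parameters and synthetic data; not the roLSH algorithm.
import numpy as np
from collections import defaultdict
from itertools import product

rng = np.random.default_rng(1)
DIM, K, W = 8, 3, 4.0                       # dimension, hashes per table, bucket width (assumed)
A = rng.standard_normal((K, DIM))
B = rng.uniform(0, W, size=K)

def bucket(x):
    """Projected-space bucket id: floor((a.x + b) / w) for each of the K hashes."""
    return tuple(np.floor((A @ x + B) / W).astype(int))

# build one hash table over toy data
points = rng.standard_normal((500, DIM))
table = defaultdict(list)
for idx, p in enumerate(points):
    table[bucket(p)].append(idx)

def query(q, radius=1):
    """Collect candidates from the query bucket and all buckets within `radius` offsets,
    then re-rank them by exact Euclidean distance."""
    base = bucket(q)
    candidates = set()
    for offset in product(range(-radius, radius + 1), repeat=K):
        candidates.update(table.get(tuple(b + o for b, o in zip(base, offset)), []))
    return sorted(candidates, key=lambda i: np.linalg.norm(points[i] - q))[:5]

print("approximate 5-NN indices:", query(rng.standard_normal(DIM)))
```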

9 citations

Proceedings ArticleDOI
09 Jun 2021
TL;DR: This paper proposes BiDens, a novel densification method that fills a sketch's empty bins with values from its non-empty bins, in either the forward or backward direction, more efficiently than existing densification methods.
Abstract: As an efficient tool for approximate similarity computation and search, Locality Sensitive Hashing (LSH) has been widely used in many research areas, including databases, data mining, information retrieval, and machine learning. Classical LSH methods typically need to perform hundreds or even thousands of hashing operations when computing the LSH sketch for each input item (e.g., a set or a vector); this complexity is too expensive and even impractical for applications that must process data in real time. To address this issue, several fast methods such as OPH and BCWS have been proposed to compute LSH sketches efficiently; however, these methods may generate many sketches with empty bins, which can introduce large errors in similarity estimation and also limit their usage for fast similarity search. To solve this issue, we propose a novel densification method, BiDens. Compared with existing densification methods, BiDens fills a sketch's empty bins with values from its non-empty bins, in either the forward or backward direction, more efficiently. Furthermore, it densifies empty bins so as to satisfy the densification principle (i.e., the LSH property). Theoretical analysis and experimental results on similarity estimation, fast similarity search, and kernel linearization using real-world datasets demonstrate that BiDens is up to 106 times faster than state-of-the-art methods while achieving the same or even better accuracy.
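The densification problem described above can be illustrated with a short sketch: one-permutation hashing (OPH) leaves some bins of the sketch empty, and a densification pass copies values from non-empty bins into the empty ones (here, a plain forward scan with wrap-around). This is a simplified rotation-style densification for illustration only, not the BiDens algorithm; the bin count, universe size, and sample sets are invented.

```python
# Minimal sketch of one-permutation hashing plus a simple forward-scan
# densification of empty bins. Illustrative only; not the BiDens method.
import random

random.seed(0)
NUM_BINS = 8
UNIVERSE = 1000

# one fixed random permutation of the universe, shared by all sets
perm = list(range(UNIVERSE))
random.shuffle(perm)
rank = {v: i for i, v in enumerate(perm)}

def oph_sketch(s):
    """One-permutation hashing: split the permuted universe into NUM_BINS ranges
    and keep the minimum permuted rank falling in each range (None if empty)."""
    bins = [None] * NUM_BINS
    width = UNIVERSE // NUM_BINS
    for x in s:
        r = rank[x]
        b = min(r // width, NUM_BINS - 1)
        if bins[b] is None or r < bins[b]:
            bins[b] = r
    return bins

def densify(bins):
    """Fill each empty bin with the value of the nearest non-empty bin found by
    scanning forward with wrap-around (rotation-style densification)."""
    if all(v is None for v in bins):
        return list(bins)
    out = list(bins)
    for i in range(NUM_BINS):
        j = i
        while out[i] is None:
            j = (j + 1) % NUM_BINS
            out[i] = bins[j]
    return out

a = densify(oph_sketch(set(random.sample(range(UNIVERSE), 30))))
b = densify(oph_sketch(set(random.sample(range(UNIVERSE), 30))))
print("matching bins:", sum(x == y for x, y in zip(a, b)), "of", NUM_BINS)
```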

9 citations

Patent
28 Feb 2013
TL;DR: An approximate nearest neighbor search device is described comprising: a database storage unit that, given a plurality of points represented as vector data, computes a hash index by applying a hash function to each point and stores the points in a multi-dimensional hash table whose bins partition the space into regions; and a search range establishment unit that applies the hash function to a query, establishes the query's location within the space, estimates the distance from the query to each region, and selects the regions to be searched on the basis of those estimates.
Abstract: An objective of the present invention is to implement an approximate nearest neighbor search rapidly and with high precision by appropriately reducing the number of nearest neighbor candidates. An approximate nearest neighbor search device is provided which comprises: a database storage unit which, when a plurality of points represented as vector data is inputted, computes a hash index by applying a hash function to each point, and stores each point in a multi-dimensional hash table by projecting it into a multi-dimensional space which is segmented into a plurality of regions by the multi-dimensional hash table bins; a search range establishment unit which, when a query is inputted, applies the hash function to the query, establishes the location of the query within the space, establishes estimated values of the distance from the query to each region within the space, and establishes the regions to be searched on the basis of those estimates; and a nearest neighbor establishment unit which calculates the distance from each point within the search regions to the query, and takes the point nearest to the query as the query's nearest neighbor. The search range establishment unit refers to the index of each region to derive a representative point of the region, establishes the estimated value on the basis of the distance between the query and each representative point, and applies a branch-and-bound technique, excluding regions which cannot be regions to be searched, to establish the regions to be searched.
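The search-range idea in this patent abstract can be sketched on a toy 2-D grid: points are bucketed into grid regions, each region provides a cheap lower bound on its distance to the query (here the distance to the cell's bounding box rather than to a representative point, which is a simplification), and regions whose bound exceeds the best exact distance found so far are pruned in branch-and-bound fashion. Grid size and data are invented for illustration.

```python
# Minimal sketch of grid-region branch-and-bound nearest neighbor search.
# Toy 2-D data and invented cell size; a simplification of the patent's scheme.
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(2)
CELL = 0.25                                   # grid cell width (assumed)
points = rng.random((300, 2))                 # toy points in the unit square

# hash each point to its grid cell (the "multi-dimensional hash table")
regions = defaultdict(list)
for idx, p in enumerate(points):
    regions[tuple((p // CELL).astype(int))].append(idx)

def region_lower_bound(q, cell):
    """Lower bound on the distance from q to any point inside the cell."""
    lo = np.array(cell) * CELL
    hi = lo + CELL
    nearest = np.clip(q, lo, hi)              # closest point of the cell's box to q
    return float(np.linalg.norm(q - nearest))

def nearest_neighbor(q):
    """Visit regions in order of their lower bound; stop once the bound already
    exceeds the best exact distance found so far (branch and bound)."""
    best_idx, best_dist = None, float("inf")
    for cell in sorted(regions, key=lambda c: region_lower_bound(q, c)):
        if region_lower_bound(q, cell) > best_dist:
            break                              # remaining regions cannot contain the NN
        for idx in regions[cell]:
            d = float(np.linalg.norm(points[idx] - q))
            if d < best_dist:
                best_idx, best_dist = idx, d
    return best_idx, best_dist

print(nearest_neighbor(np.array([0.5, 0.5])))
```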

9 citations


Network Information
Related Topics (5)
Deep learning: 79.8K papers, 2.1M citations, 84% related
Feature extraction: 111.8K papers, 2.1M citations, 83% related
Convolutional neural network: 74.7K papers, 2M citations, 83% related
Feature (computer vision): 128.2K papers, 1.7M citations, 82% related
Support vector machine: 73.6K papers, 1.7M citations, 82% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    43
2022    108
2021    88
2020    110
2019    104
2018    139