Topic

Locality-sensitive hashing

About: Locality-sensitive hashing is a research topic. Over its lifetime, 1,894 publications have been published within this topic, receiving 69,362 citations.

Papers

Journal Article

Fast recommendation on latent collaborative relations

Chien-Liang Liu¹, Xuan-Wei Wu¹

National Chiao Tung University¹

01 Oct 2016 - Knowledge-Based Systems

TL;DR: A novel latent factor model called latent collaborative relations (LCR) is proposed. Its scoring function transforms the recommendation problem into a nearest neighbor search problem, so the model can be combined with locality-sensitive hashing to provide fast recommendations while retaining recommendation accuracy and coverage.

Abstract: The paper devises a recommendation algorithm based on a latent factor model. It combines latent factors and the ℓ2 norm to formulate the recommendation problem as a k-nearest-neighbor problem, then uses locality-sensitive hashing (LSH) to reduce search time complexity. Retrieval is sped up by 5X to 313X on the three data sets used in the experiments.

One important property of collaborative filtering recommender systems is that popular items are recommended disproportionately often because they provide extensive usage data and, thus, can be recommended to more users. Yet niche products can be as economically attractive as mainstream fare for online retailers: retailers can stock virtually everything, and the number of available niche products exceeds the hits by several orders of magnitude. This work addresses accuracy, coverage, and prediction time by proposing a novel latent factor model called latent collaborative relations (LCR), which transforms the recommendation problem into a nearest neighbor search problem by using the proposed scoring function. Users and items are projected into the latent space, and their similarities are calculated with the Euclidean metric. The proposed model also combines naturally with locality-sensitive hashing (LSH) to provide fast recommendations while retaining recommendation accuracy and coverage. The experimental results indicate that the speedup is significant, especially on large-scale data sets. In recommendation accuracy and coverage, the proposed method is competitive on three data sets.
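
To make the idea in this abstract concrete, here is a minimal sketch of nearest neighbor search under the Euclidean metric with LSH. It is not the LCR model itself: the abstract does not give the scoring function or the hash family, so the sketch assumes the standard Gaussian random-projection hash h(v) = floor((a·v + b) / w), and the class name `EuclideanLSH` and all parameter values are hypothetical.

```python
import math
import random
from collections import defaultdict


def make_hash(dim, width, rng):
    """Return one hash h(v) = floor((a.v + b) / width) with Gaussian a."""
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    b = rng.uniform(0.0, width)
    return lambda v: math.floor((sum(x * y for x, y in zip(a, v)) + b) / width)


class EuclideanLSH:
    """Hypothetical multi-table LSH index over latent item vectors."""

    def __init__(self, dim, n_tables=8, n_hashes=4, width=1.0, seed=0):
        rng = random.Random(seed)
        # Each table owns n_hashes hash functions and a bucket map.
        self.tables = [
            ([make_hash(dim, width, rng) for _ in range(n_hashes)], defaultdict(list))
            for _ in range(n_tables)
        ]
        self.vectors = {}

    def add(self, item_id, vec):
        self.vectors[item_id] = vec
        for hashes, buckets in self.tables:
            buckets[tuple(h(vec) for h in hashes)].append(item_id)

    def query(self, vec, k):
        # Collect items sharing a bucket with the query in any table,
        # then rank only those candidates by exact Euclidean distance.
        candidates = set()
        for hashes, buckets in self.tables:
            candidates.update(buckets.get(tuple(h(vec) for h in hashes), []))
        return sorted(candidates, key=lambda i: math.dist(vec, self.vectors[i]))[:k]
```

In a recommender setting the indexed vectors would be item latent factors and the query a user latent factor. The search is approximate: a true nearest neighbor that shares no bucket with the query is missed, which is the trade-off that buys the speedup.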

18 citations

Journal Article

A scalable approach for content based image retrieval in cloud datacenter

Jianxin Liao¹, Di Yang¹, Tonghong Li², Jingyu Wang¹, Qi Qi¹, Xiaomin Zhu¹

Beijing University of Posts and Telecommunications¹, Technical University of Madrid²

01 Mar 2014 - Information Systems Frontiers

TL;DR: A scalable image retrieval framework is proposed that efficiently supports content similarity search and semantic search in a distributed environment. The approach is shown to yield a high recall rate with good load balance while requiring only a small number of hops.

Abstract: The emergence of cloud datacenters enhances the capability of online data storage. Since massive amounts of data are stored in datacenters, it is necessary to effectively locate and access data of interest in such a distributed system. However, traditional search techniques only allow users to search images by exact-match keywords through a centralized index. These techniques cannot satisfy the requirements of content-based image retrieval (CBIR). In this paper, we propose a scalable image retrieval framework that efficiently supports content similarity search and semantic search in a distributed environment. Its key idea is to integrate image feature vectors into distributed hash tables (DHTs) by exploiting the properties of locality-sensitive hashing (LSH). Thus, images with similar content are likely to be gathered on the same node without knowledge of any global information. To search for semantically close images, our system adopts relevance feedback to bridge the gap between low-level features and high-level features. We show that our approach yields a high recall rate with good load balance and requires only a small number of hops.
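
The key idea above, using an LSH value as the DHT key so that similar feature vectors land on the same node, can be sketched as follows. This is an illustration under stated assumptions, not the paper's scheme: it assumes a random-hyperplane (sign) hash and a plain modulo mapping from key to node, and the function names and parameters are hypothetical. A real DHT would route the key through its own key space instead.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes; every node must share the same seed."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def lsh_key(vec, hyperplanes):
    """Pack the side of each hyperplane the vector falls on into an integer key."""
    key = 0
    for plane in hyperplanes:
        dot = sum(p * x for p, x in zip(plane, vec))
        key = (key << 1) | (1 if dot >= 0 else 0)
    return key


def node_for(vec, hyperplanes, n_nodes):
    """Map a feature vector to a node index using only its own LSH key."""
    return lsh_key(vec, hyperplanes) % n_nodes
```

Because the key depends only on the vector and the shared hyperplanes, any node can compute where a query belongs without global information. Vectors separated by a small angle usually agree on every bit and so reach the same node. Vectors whose keys differ in even one bit may map to unrelated nodes under the modulo step, which is one reason practical systems probe several keys per query.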

18 citations

Proceedings Article

Fast content identification in high-dimensional feature spaces using Sparse Ternary Codes

Sohrab Ferdowsi¹, Slava Voloshynovskiy¹, Dimche Kostadinov¹, Taras Holotyak¹

University of Geneva¹

01 Dec 2016

TL;DR: A framework of Sparse Ternary Codes (STC) is proposed that yields a sparse but robust representation with sub-linear search complexity. It is compared with Locality Sensitive Hashing and memory vectors on several large-scale synthetic and public image databases, where it shows superior performance.

Abstract: We consider the problem of fast content identification in high-dimensional feature spaces where sub-linear search complexity is required. By formulating the problem as a sparse approximation of projected coefficients, a closed-form solution can be found, which we approximate as a ternary representation. Hence, as opposed to dense binary codes, a framework of Sparse Ternary Codes (STC) is proposed, resulting in a sparse but robust representation and sub-linear search complexity. The proposed method is compared with Locality Sensitive Hashing (LSH) and memory vectors on several large-scale synthetic and public image databases, showing its superiority.
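
As a rough illustration of what a ternary representation of projected coefficients can look like, the sketch below projects a vector and keeps only the sign of coefficients whose magnitude exceeds a threshold. The abstract does not specify the projection, the closed-form solution, or the threshold rule, so the Gaussian random projection, the fixed threshold, and every function name here are assumptions.

```python
import random


def random_projection(dim_in, dim_out, seed=0):
    """Hypothetical Gaussian projection matrix with dim_out rows."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim_in)] for _ in range(dim_out)]


def project(matrix, vec):
    """Multiply the projection matrix by the vector."""
    return [sum(m * x for m, x in zip(row, vec)) for row in matrix]


def ternarize(coeffs, threshold):
    """Map each coefficient to +1, -1, or 0 depending on the threshold."""
    return [1 if c > threshold else -1 if c < -threshold else 0 for c in coeffs]


def ternary_similarity(code_a, code_b):
    """Inner product of two ternary codes; zero entries contribute nothing."""
    return sum(a * b for a, b in zip(code_a, code_b))
```

For example, `ternarize([0.5, -2.0, 1.5, -0.1], 1.0)` returns `[0, -1, 1, 0]`. A higher threshold gives sparser codes. Because zero entries never contribute to the similarity, a search structure only has to visit the positions where the query code is nonzero.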

18 citations

Proceedings Article

Efficient k-Nearest Neighbors Search in High Dimensions Using MapReduce

Pingfei Zhu, Xiangwen Zhan, Wenming Qiu¹

Dalian University of Technology¹

26 Aug 2015

TL;DR: This work proposes a novel LSH-based inverted index scheme and designs an efficient search algorithm, called H-c2kNN, which enables fast high-dimensional kNN search with excellent quality and low space cost. The approach is implemented using MapReduce.

Abstract: Finding the k-Nearest Neighbors (kNN) of a query object in a given dataset S is a primitive operation in many application domains. kNN search is very costly, especially as many applications see a rapid increase in the amount and dimensionality of the data to be processed. Locality-sensitive hashing (LSH) has become a very popular method for this problem. However, most such methods cannot achieve good search quality, search efficiency, and space cost at the same time. RankReduce, for example, gains search efficiency at the expense of search quality. Motivated by this, we propose a novel LSH-based inverted index scheme and design an efficient search algorithm, called H-c2kNN, which enables fast high-dimensional kNN search with excellent quality and low space cost. For efficiency and scalability, we implemented the proposed approach using MapReduce, a well-known framework for data-intensive applications, and conducted extensive experiments on both synthetic and real datasets. The results show that the proposed approach outperforms baseline methods in high-dimensional space.
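
The abstract does not describe how the inverted index is laid out, so the sketch below shows only one generic way to pair LSH with inverted lists: each hash function keeps a postings list per hash value, and a query counts how often each object collides with it. This collision-counting design is an assumption, not the H-c2kNN algorithm. It runs on a single machine with no MapReduce, and the class name and parameters are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


class CollisionCountingIndex:
    """Hypothetical LSH inverted index: one postings map per hash function."""

    def __init__(self, dim, n_hashes=16, width=4.0, seed=0):
        rng = random.Random(seed)
        self.width = width
        # Each hash is (Gaussian direction, random offset in [0, width)).
        self.projections = [
            ([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, width))
            for _ in range(n_hashes)
        ]
        # postings[i][value] lists the ids whose i-th hash equals value.
        self.postings = [defaultdict(list) for _ in range(n_hashes)]
        self.vectors = {}

    def _hashes(self, vec):
        return [
            math.floor((sum(a * x for a, x in zip(direction, vec)) + offset) / self.width)
            for direction, offset in self.projections
        ]

    def add(self, obj_id, vec):
        self.vectors[obj_id] = vec
        for i, value in enumerate(self._hashes(vec)):
            self.postings[i][value].append(obj_id)

    def query(self, vec, k, min_collisions=1):
        # Count, per object, how many hash functions agree with the query.
        counts = Counter()
        for i, value in enumerate(self._hashes(vec)):
            counts.update(self.postings[i].get(value, []))
        candidates = [o for o, c in counts.items() if c >= min_collisions]
        # Verify the surviving candidates with exact Euclidean distance.
        return sorted(candidates, key=lambda o: math.dist(vec, self.vectors[o]))[:k]
```

Raising `min_collisions` shrinks the candidate set and speeds up verification at some cost in recall. Each object is stored once per hash function rather than once per multi-hash table, which keeps space cost modest.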

18 citations

Journal Article

Osman Durmaz¹, Hasan Sakir Bilge¹

Gazi University¹

01 Dec 2019 - Pattern Recognition Letters

TL;DR: This study proposes an approach called Randomized Distributed Hashing (RDH), which uses Locality Sensitive Hashing (LSH) in a distributed scheme and is promising for searching images in large datasets across multiple nodes.

18 citations

Performance metrics

Papers: 2,048
Citations: 77,891

No. of papers in the topic in previous years
Year	Papers
2023	43
2022	108
2021	88
2020	110
2019	104
2018	139
