Topic

Locality-sensitive hashing

About: Locality-sensitive hashing is a research topic. Over its lifetime, 1,894 publications have been published within this topic, receiving 69,362 citations.

Papers

Journal Article

Spline Regression Hashing for Fast Image Search

Yang Liu¹, Fei Wu¹, Yi Yang², Yueting Zhuang¹, Alexander G. Hauptmann²

Zhejiang University¹, Carnegie Mellon University²

01 Oct 2012, IEEE Transactions on Image Processing

TL;DR: This paper proposes a spline regression hashing method that exploits both the local and global similarity structures of the data and is reported to outperform state-of-the-art hashing techniques.

Abstract: Techniques for fast image retrieval over large databases have attracted considerable attention due to the rapid growth of web images. One promising way to accelerate image search is to use hashing technologies, which represent images by compact binary codewords. In this way, the similarity between images can be efficiently measured in terms of the Hamming distance between their corresponding binary codes. Although many methods for generating hash codes have been proposed in recent years, two key points still need improvement: 1) how to precisely preserve the similarity structure of the original data and 2) how to obtain the hash codes of previously unseen data. In this paper, we propose a spline regression hashing method in which both the local and global data similarity structures are exploited. To better capture the local manifold structure, we introduce splines developed in Sobolev space to find the local data mapping function. Furthermore, our framework simultaneously learns the hash codes of the training data and the hash function for unseen data, which solves the out-of-sample problem. Extensive experiments conducted on real image datasets consisting of over one million images show that our proposed method outperforms state-of-the-art techniques.

29 citations
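
The abstract above notes that image similarity can be measured by the Hamming distance between binary codes. The following is a minimal sketch of that comparison step only, not the authors' spline regression method. The function names, the 8-bit codes, and the image labels are illustrative assumptions; how the codes are learned is outside this sketch.

```python
from typing import Dict, List


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two binary codes differ."""
    # XOR leaves a 1 exactly where the codes disagree.
    return bin(a ^ b).count("1")


def nearest_codes(query: int, database: Dict[str, int], k: int = 2) -> List[str]:
    """Return the k item names whose codes are closest to the query code."""
    ranked = sorted(database, key=lambda name: hamming_distance(query, database[name]))
    return ranked[:k]


# Hypothetical 8-bit codes; real systems typically use 32 to 512 bits.
codes = {"img_a": 0b10110010, "img_b": 0b10110011, "img_c": 0b01001101}
print(nearest_codes(0b10110000, codes))
```

Storing each code as an integer keeps the comparison to one XOR and a bit count, which is why Hamming-space search is cheap compared with distance computations on the original features.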

Journal Article

A Generic Method for Accelerating LSH-Based Similarity Join Processing

Chenyun Yu¹, Sarana Nutanong¹, Hangyu Li¹, Cong Wang¹, Xingliang Yuan¹

City University of Hong Kong¹

01 Apr 2017, IEEE Transactions on Knowledge and Data Engineering

TL;DR: This paper proposes a generic method to speed up the process of joining two large datasets using LSH by identifying a set of representative points to reduce the number of LSH lookups and demonstrates the generality of the method by showing that the same principle can be applied to LSH algorithms for three different metrics.

Abstract: Locality sensitive hashing (LSH) is an efficient method for solving the problem of approximate similarity search in high-dimensional spaces. Through LSH, a high-dimensional similarity join can be processed in the same way as a hash join, making the cost of joining two large datasets linear. By judiciously analyzing the properties of multiple LSH algorithms, we propose a generic method to speed up the process of joining two large datasets using LSH. The crux of our method lies in the way in which we identify a set of representative points to reduce the number of LSH lookups. Theoretical analyses show that our proposed method can greatly reduce the number of lookup operations and retain the same result accuracy compared to executing LSH lookups for every query point. Furthermore, we demonstrate the generality of our method by showing that the same principle can be applied to LSH algorithms for three different metrics: the Euclidean distance (QALSH), the Jaccard similarity measure (MinHash), and the Hamming distance (sequence hashing). Results from experimental studies using real datasets confirm our error analyses and show significant improvements of our method over the state-of-the-art LSH method: to achieve over 0.95 recall, we only need to perform LSH lookups for at most 15 percent of the query points.

29 citations
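
The abstract above describes processing a similarity join "in the same way as a hash join". The sketch below illustrates that baseline idea for Jaccard similarity with MinHash and banding: one dataset is bucketed, the other probes the buckets, and candidates are verified exactly. It is not the paper's representative-point acceleration. The signature length, band count, hash construction, and all names are assumptions chosen for illustration, and token sets are assumed to be non-empty.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List, Set, Tuple

NUM_HASHES = 32  # signature length (assumed)
BANDS = 16       # number of bands; each band covers NUM_HASHES // BANDS rows (assumed)


def _hash(seed: int, token: str) -> int:
    """Deterministic 64-bit hash of a token, salted with a seed."""
    digest = hashlib.blake2b(f"{seed}:{token}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minhash(tokens: Set[str]) -> Tuple[int, ...]:
    """MinHash signature: the minimum hash value under each seeded hash function."""
    return tuple(min(_hash(seed, t) for t in tokens) for seed in range(NUM_HASHES))


def band_keys(sig: Tuple[int, ...]) -> List[Tuple[int, Tuple[int, ...]]]:
    """Split a signature into bands; each (band index, band values) pair is a bucket key."""
    rows = NUM_HASHES // BANDS
    return [(b, sig[b * rows:(b + 1) * rows]) for b in range(BANDS)]


def jaccard(a: Set[str], b: Set[str]) -> float:
    return len(a & b) / len(a | b)


def lsh_join(left: Dict[str, Set[str]], right: Dict[str, Set[str]],
             threshold: float) -> List[Tuple[str, str]]:
    """Approximate similarity join: build buckets on `left`, probe with `right`."""
    buckets = defaultdict(set)
    for name, tokens in left.items():          # build phase
        for key in band_keys(minhash(tokens)):
            buckets[key].add(name)
    result = []
    for rname, rtokens in right.items():       # probe phase
        candidates = set()
        for key in band_keys(minhash(rtokens)):
            candidates |= buckets.get(key, set())
        for lname in sorted(candidates):       # verify candidates exactly
            if jaccard(left[lname], rtokens) >= threshold:
                result.append((lname, rname))
    return result
```

In this sketch every record on the probe side triggers one lookup per band; the abstract's contribution is to reduce the number of such lookups, which this baseline does not attempt.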

Proceedings Article

Discrete Multi-view Hashing for Effective Image Retrieval

Rui Yang¹, Yuliang Shi¹, Xin-Shun Xu¹

Shandong University¹

06 Jun 2017

TL;DR: This paper proposes a novel hashing method, Discrete Multi-view Hashing (DMVH), which works on multi-view data directly and makes full use of the rich information it contains, together with a novel approach to constructing a similarity matrix that preserves local similarity structure while keeping semantic similarity between data points.

Abstract: Recently, hashing techniques have witnessed an increase in popularity due to their low storage cost and high query speed for large-scale data retrieval tasks, e.g., image retrieval. Many methods have been proposed; however, most existing hashing techniques focus on single-view data. In many scenarios, there are multiple views in data samples. Thus, methods that work on a single view cannot make full use of the rich information contained in multi-view data. Although some methods have been proposed for multi-view data, they usually relax the binary constraints or separate the learning of hash functions and binary codes into two independent stages to bypass the obstacle of handling the discrete constraints on binary codes during optimization, which may generate large quantization errors. To address these problems, in this paper we propose a novel hashing method, i.e., Discrete Multi-view Hashing (DMVH), which can work on multi-view data directly and make full use of the rich information in multi-view data. Moreover, in DMVH, we optimize the discrete codes directly instead of relaxing the binary constraints, so that we can obtain high-quality hash codes. Simultaneously, we present a novel approach to constructing the similarity matrix, which not only preserves local similarity structure but also keeps semantic similarity between data points. To solve the optimization problem in DMVH, we further propose an alternating algorithm. We test the proposed model on three large-scale data sets. Experimental results show that it outperforms or is comparable to several state-of-the-art methods.

29 citations

Journal Article

Robust Speech Hashing for Content Authentication

Yuhua Jiao¹, Liping Ji, Xiamu Niu¹

Harbin Institute of Technology¹

19 Jun 2009, IEEE Signal Processing Letters

TL;DR: A novel key-dependent robust speech hashing based on speech production model is proposed in this letter, which is highly robust to content preserving operations as well as having high accuracy of tampering localization.

Abstract: Robust hashing for multimedia authentication is an emerging research area. A novel key-dependent robust speech hashing scheme based on a speech production model is proposed in this letter. The robust hash is calculated from linear spectrum frequencies (LSFs), which model the vocal tract. The correlation between LSFs is decoupled by the discrete cosine transformation (DCT). A randomization scheme controlled by a secret key is applied in hash generation for random feature selection. The hash function is key-dependent and collision resistant. Meanwhile, it is highly robust to content-preserving operations and localizes tampering with high accuracy.

28 citations

Proceedings Article

Scaling object recognition: Benchmark of current state of the art techniques

Mohamed Aly¹, Peter Welinder¹, Mario E. Munich², Pietro Perona¹

California Institute of Technology¹, Evolution Robotics²

01 Sep 2009

TL;DR: This work investigates and benchmarks the scalability properties of state-of-the-art object recognition techniques: the forest of k-d trees, the locality sensitive hashing (LSH) method, and the approximate clustering procedure with the tf-idf inverted index.

Abstract: Scaling from hundreds to millions of objects is the next challenge in visual recognition. We investigate and benchmark the scalability properties (memory requirements, runtime, recognition performance) of state-of-the-art object recognition techniques: the forest of k-d trees, the locality sensitive hashing (LSH) method, and the approximate clustering procedure with the tf-idf inverted index. The images were characterized with SIFT features. We conduct experiments on two new datasets of more than 100,000 images each, and quantify the performance using artificial and natural deformations. We analyze the results and point out the pitfalls of each of the compared methodologies, suggesting potential new research avenues for the field.

28 citations

Performance Metrics

Papers: 2,048
Citations: 77,891

No. of papers in the topic in previous years
Year	Papers
2023	43
2022	108
2021	88
2020	110
2019	104
2018	139
