Topic

Locality-sensitive hashing

About: Locality-sensitive hashing (LSH) is a technique that hashes similar items into the same buckets with high probability, enabling sublinear-time approximate nearest neighbor search. Over the lifetime, 1894 publications have been published within this topic receiving 69362 citations.
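To make the topic concrete, here is a minimal sketch of the classic random-hyperplane (SimHash) flavor of locality-sensitive hashing, in which vectors that are close in cosine distance collide in the same bucket with high probability. All names, sizes, and parameters below are illustrative, not taken from any paper on this page.

```python
# Minimal random-hyperplane LSH (SimHash) sketch: similar vectors
# (small cosine distance) land in the same bucket with high probability.
import numpy as np

rng = np.random.default_rng(0)

def simhash_signature(x, hyperplanes):
    """One bit per hyperplane: which side of the plane x falls on."""
    return tuple((hyperplanes @ x > 0).astype(int))

d, n_bits = 64, 16
hyperplanes = rng.standard_normal((n_bits, d))

# Index a toy dataset into hash buckets keyed by signature.
data = rng.standard_normal((1000, d))
buckets = {}
for i, x in enumerate(data):
    buckets.setdefault(simhash_signature(x, hyperplanes), []).append(i)

# Query: only candidates in the colliding bucket need an exact check.
q = data[0] + 0.01 * rng.standard_normal(d)   # a near-duplicate of point 0
candidates = buckets.get(simhash_signature(q, hyperplanes), [])
print(0 in candidates)  # True with high probability
```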


Papers
Journal ArticleDOI
TL;DR: Zhang et al. propose a supervised learning framework that generates compact and bit-scalable hashing codes directly from raw images, posing hashing learning as a problem of regularized similarity learning.
Abstract: Extracting informative image features and learning effective approximate hashing functions are two crucial steps in image retrieval. Conventional methods often study these two steps separately, e.g., learning hash functions from a predefined hand-crafted feature space. Meanwhile, the bit lengths of output hashing codes are preset in most previous methods, neglecting the significance level of different bits and restricting their practical flexibility. To address these issues, we propose a supervised learning framework to generate compact and bit-scalable hashing codes directly from raw images. We pose hashing learning as a problem of regularized similarity learning. In particular, we organize the training images into a batch of triplet samples, each sample containing two images with the same label and one with a different label. With these triplet samples, we maximize the margin between the matched pairs and the mismatched pairs in the Hamming space. In addition, a regularization term is introduced to enforce adjacency consistency, i.e., images of similar appearance should have similar codes. A deep convolutional neural network is utilized to train the model in an end-to-end fashion, where discriminative image features and hash functions are simultaneously optimized. Furthermore, each bit of our hashing codes is unequally weighted, so that we can manipulate the code lengths by truncating the insignificant bits. Our framework outperforms state-of-the-art methods on public benchmarks of similar image search and also achieves promising results in the application of person re-identification in surveillance. It is also shown that the generated bit-scalable hashing codes preserve discriminative power well at shorter code lengths.
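As a rough illustration of the triplet objective described above, the sketch below computes a hinge loss on the Hamming-distance margin between a matched and a mismatched pair, using tanh outputs as a differentiable surrogate for binary codes. The stand-in vectors and the margin value are assumptions, not the paper's actual network or settings.

```python
# Hedged sketch of a triplet hashing objective: push the matched pair at
# least `margin` closer than the mismatched pair in (relaxed) Hamming space.
import numpy as np

def relaxed_hamming(a, b):
    # For codes in {-1, +1}, Hamming distance = (bits - a.b) / 2;
    # with tanh outputs in (-1, 1) this becomes a differentiable surrogate.
    return 0.5 * (a.shape[-1] - np.dot(a, b))

def triplet_hash_loss(anchor, positive, negative, margin=4.0):
    d_pos = relaxed_hamming(anchor, positive)
    d_neg = relaxed_hamming(anchor, negative)
    return max(0.0, margin + d_pos - d_neg)   # hinge on the Hamming margin

bits = 48
rng = np.random.default_rng(1)
a = np.tanh(rng.standard_normal(bits))              # stand-in for a CNN output
p = np.tanh(a + 0.1 * rng.standard_normal(bits))    # same-label image
n = np.tanh(rng.standard_normal(bits))              # different-label image
print(triplet_hash_loss(a, p, n))
```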

457 citations

Proceedings Article
Jae-Pil Heo1, Youngwoon Lee1, Junfeng He2, Shih-Fu Chang2, Sung-Eui Yoon1 
16 Jun 2012
TL;DR: Extensive experiments show that the spherical hashing technique significantly outperforms six state-of-the-art hyperplane-based hashing techniques across image benchmarks ranging from one million to 75 million GIST descriptors, confirming the unique merits of using hyperspheres to encode proximity regions in high-dimensional spaces.
Abstract: Many binary code encoding schemes based on hashing have been actively studied recently, since they can provide efficient similarity search, especially nearest neighbor search, and compact data representations suitable for handling large-scale image databases in many computer vision problems. Existing hashing techniques encode high-dimensional data points by using hyperplane-based hashing functions. In this paper we propose a novel hypersphere-based hashing function, spherical hashing, to map more spatially coherent data points into a binary code compared to hyperplane-based hashing functions. Furthermore, we propose a new binary code distance function, spherical Hamming distance, that is tailored to our hypersphere-based binary coding scheme, and design an efficient iterative optimization process to achieve balanced partitioning of data points for each hash function and independence between hashing functions. Our extensive experiments show that our spherical hashing technique significantly outperforms six state-of-the-art hashing techniques based on hyperplanes across various image benchmarks ranging in size from one million to 75 million GIST descriptors. The performance gains are consistent and large, up to 100% improvements. The excellent results confirm the unique merits of the proposed idea in using hyperspheres to encode proximity regions in high-dimensional spaces. Finally, our method is intuitive and easy to implement.
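A hedged sketch of the two ideas in this abstract: a hypersphere-based code (bit i is 1 iff the point lies inside sphere i) and a spherical Hamming distance that normalizes the XOR count by the number of spheres containing both points. The random centers and crude radius guess below stand in for the paper's iteratively optimized ones.

```python
# Hedged sketch of hypersphere-based encoding and its distance function.
import numpy as np

rng = np.random.default_rng(2)
d, n_bits = 32, 8
centers = rng.standard_normal((n_bits, d))
# Radius near the median inter-point distance of N(0, I) data: a crude guess,
# not the paper's optimized, balance-enforcing radii.
radii = np.full(n_bits, np.sqrt(2 * d))

def spherical_code(x):
    # Bit i = 1 iff x falls inside sphere (centers[i], radii[i]).
    return (np.linalg.norm(x - centers, axis=1) <= radii).astype(int)

def spherical_hamming(a, b):
    # XOR count normalized by the number of spheres containing both points.
    both = np.sum(a & b)
    return np.sum(a ^ b) / both if both else np.inf

x, y = rng.standard_normal(d), rng.standard_normal(d)
print(spherical_code(x), spherical_hamming(spherical_code(x), spherical_code(y)))
```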

455 citations

01 Jan 1998
TL;DR: ANN is a library of C++ objects and procedures that supports approximate nearest neighbor searching, written as a testbed for a class of nearest neighbor searching algorithms, particularly those based on orthogonal decompositions of space.
Abstract: ANN is a library of C++ objects and procedures that supports approximate nearest neighbor searching. In nearest neighbor searching, we are given a set of data points S in real d-dimensional space, R^d, and are to build a data structure such that, given any query point q ∈ R^d, the nearest data point to q can be found efficiently. In general, we are given k ≥ 1, and are asked to return the k nearest neighbors to q in S. In approximate nearest neighbor searching, an error bound ε ≥ 0 is also given. The search algorithm returns k distinct points of S, such that the ratio between the distance to the ith point reported and the true ith nearest neighbor is at most 1 + ε. Among the features of ANN are the following. It supports k-nearest neighbor searching, by specifying k with the query. It supports both exact and approximate nearest neighbor searching, by specifying an approximation factor ε ≥ 0 with the query. It supports all Minkowski distance metrics, including the L1 (Manhattan), L2 (Euclidean), and L∞ (Max) metrics. There are no exponential factors in space, implying that the data structure is practical even for very large data sets in high-dimensional spaces, irrespective of ε. ANN is written as a testbed for a class of nearest neighbor searching algorithms, particularly those based on orthogonal decompositions of space. These include k-d trees [3, 4], balanced box-decomposition trees [2] and other related spatial data structures (see Samet [5]). The library supports a number of different methods for building search structures. It also supports two methods for searching these structures: standard tree-ordered search [1] and priority search [2]. In priority search, the cells of the data structure are visited in increasing order of distance from the query point. In addition to the library there are two programs provided for testing and evaluating the performance of various search methods. The first, called ann_test, provides a primitive script language that allows the user to generate data sets and query sets, either by reading from a file or randomly through the use of a number of built-in point distributions. Any of a …
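ANN itself is a C++ library, but the same (1 + ε)-approximate k-nearest-neighbor contract can be demonstrated with SciPy's k-d tree, shown below as a rough Python analogue rather than a binding to ANN.

```python
# Not the ANN C++ library itself: SciPy's k-d tree offers the same
# (1 + eps)-approximate k-NN guarantee and Minkowski metrics, as an analogue.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(3)
data = rng.standard_normal((10_000, 16))
tree = cKDTree(data)   # build phase, as with ANN's k-d tree structures

q = rng.standard_normal(16)
# eps=0.5: each reported distance is within (1 + 0.5) of the true
# ith-nearest-neighbor distance; eps=0 gives exact search.
dist, idx = tree.query(q, k=5, eps=0.5, p=2)   # p=1, 2, or np.inf (Minkowski)
print(idx, dist)
```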

438 citations

Proceedings ArticleDOI
Kaiming He1, Fang Wen1, Jian Sun1
23 Jun 2013
TL;DR: A novel Affinity-Preserving K-means algorithm simultaneously performs k-means clustering and learns the binary indices of the quantized cells, outperforming various state-of-the-art hashing encoding methods.
Abstract: In computer vision there has been increasing interest in learning hashing codes whose Hamming distance approximates the data similarity. The hashing functions play roles in both quantizing the vector space and generating similarity-preserving codes. Most existing hashing methods use hyperplanes (or kernelized hyperplanes) to quantize and encode. In this paper, we present a hashing method adopting k-means quantization. We propose a novel Affinity-Preserving K-means algorithm which simultaneously performs k-means clustering and learns the binary indices of the quantized cells. The distance between the cells is approximated by the Hamming distance of the cell indices. We further generalize our algorithm to a product space for learning longer codes. Experiments show our method, named K-means Hashing (KMH), outperforms various state-of-the-art hashing encoding methods.
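A hedged sketch of the quantization idea above: cluster with k-means, give each centroid a b-bit index, and encode a point as the index of its nearest centroid, so distances between cells are approximated by Hamming distances between indices. The naive index assignment below is a stand-in for the paper's affinity-preserving optimization, which chooses indices so that Hamming distance tracks inter-centroid distance.

```python
# Hedged sketch of k-means quantization for hashing (naive index assignment).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(4)
data = rng.standard_normal((2000, 32))

bits = 4
km = KMeans(n_clusters=2 ** bits, n_init=10, random_state=0).fit(data)

def encode(x):
    cell = km.predict(x.reshape(1, -1))[0]            # nearest centroid id
    return np.array([(cell >> i) & 1 for i in range(bits)])

a, b = encode(data[0]), encode(data[1])
print(a, b, np.sum(a != b))   # Hamming distance between cell indices
```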

437 citations

Proceedings ArticleDOI
23 Jun 2014
TL;DR: Experiments demonstrate that the proposed method significantly outperforms most state-of-the-art methods in retrieval precision, and that for high-dimensional data it is orders of magnitude faster to train than many competing methods.
Abstract: Supervised hashing aims to map the original features to compact binary codes that are able to preserve label-based similarity in the Hamming space. Non-linear hash functions have demonstrated their advantage over linear ones due to their powerful generalization capability. In the literature, kernel functions are typically used to achieve non-linearity in hashing, which achieve encouraging retrieval performance at the price of slow evaluation and training time. Here we propose to use boosted decision trees for achieving non-linearity in hashing, which are fast to train and evaluate, hence more suitable for hashing with high-dimensional data. In our approach, we first propose sub-modular formulations for the hashing binary code inference problem and an efficient GraphCut-based block search method for solving large-scale inference. Then we learn hash functions by training boosted decision trees to fit the binary codes. Experiments demonstrate that our proposed method significantly outperforms most state-of-the-art methods in retrieval precision and training time. Especially for high-dimensional data, our method is orders of magnitude faster than many methods in terms of training time.
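A hedged sketch of the second stage described above: train one boosted-tree classifier per bit to fit target binary codes. Here the targets are signs of random projections, a stand-in for the codes the paper infers with its GraphCut-based block search.

```python
# Hedged sketch: one boosted-tree classifier per hash bit, fit to target codes.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(5)
X = rng.standard_normal((1000, 100))    # stand-in high-dimensional features
bits = 8
# Stand-in targets: signs of random projections, NOT the paper's inferred codes.
targets = (X @ rng.standard_normal((100, bits)) > 0).astype(int)

hash_functions = [
    GradientBoostingClassifier(n_estimators=50, max_depth=2).fit(X, targets[:, i])
    for i in range(bits)
]

def hash_code(x):
    x = x.reshape(1, -1)
    return np.array([h.predict(x)[0] for h in hash_functions])

print(hash_code(X[0]))
```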

418 citations


Network Information
Related Topics (5)
Deep learning: 79.8K papers, 2.1M citations, 84% related
Feature extraction: 111.8K papers, 2.1M citations, 83% related
Convolutional neural network: 74.7K papers, 2M citations, 83% related
Feature (computer vision): 128.2K papers, 1.7M citations, 82% related
Support vector machine: 73.6K papers, 1.7M citations, 82% related
Performance
Metrics
No. of papers in the topic in previous years

Year    Papers
2023    43
2022    108
2021    88
2020    110
2019    104
2018    139