Topic
Locality-sensitive hashing
About: Locality-sensitive hashing is a research topic. Over its lifetime, 1,894 publications have been published on this topic, receiving 69,362 citations.
Papers published on a yearly basis
Papers
21 Oct 2013
TL;DR: This work proposes Support Vector Machine (SVM) based Data Redundancy Elimination for Data Aggregation in WSN (SDRE), which minimizes redundancy and eliminates false data to improve the performance of the WSN.
Abstract: Data aggregation is critical in Wireless Sensor Networks (WSN) because of resource constraints. Dense deployment causes considerable data redundancy in a WSN, so redundancy must be minimized through suitable aggregation techniques. To address this problem, Support Vector Machine (SVM) based Data Redundancy Elimination for Data Aggregation in WSN (SDRE) is proposed in this work. First, an aggregation tree is built for the given size of the sensor network. Then, the SVM method is applied to the tree to eliminate redundant data. Locality Sensitive Hashing (LSH) is used to minimize data redundancy and to eliminate false data based on similarity. The LSH codes are sent to the aggregation supervisor node, which finds the sensor nodes that hold the same data and selects only one of them to send the actual data. The benefit of this approach is that it minimizes redundancy and eliminates false data, improving the performance of the WSN. The performance of the proposed approach is measured using network parameters such as delay, energy, packet drops, and overheads. SDRE performs better in all scenarios, across different network sizes and varying data rates.
26 citations
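The supervisor-side step described above can be sketched with a SimHash-style LSH: nodes whose readings produce identical codes are treated as holding the same data, and only one node per code is asked to transmit. The random hyperplanes, node names, and reading format below are illustrative assumptions, not the paper's actual protocol.

```python
import random

random.seed(0)

DIM, BITS = 4, 16
# Random hyperplanes shared by all nodes (an assumed LSH family; the
# paper does not specify its exact hash construction here).
PLANES = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(BITS)]

def lsh_code(reading):
    """SimHash-style code: one bit per hyperplane side."""
    bits = 0
    for plane in PLANES:
        dot = sum(p * x for p, x in zip(plane, reading))
        bits = (bits << 1) | (1 if dot >= 0 else 0)
    return bits

def select_representatives(node_readings):
    """Supervisor keeps one node per LSH code, dropping redundant senders."""
    chosen = {}
    for node, reading in node_readings.items():
        chosen.setdefault(lsh_code(reading), node)
    return sorted(chosen.values())

readings = {
    "n1": [20.1, 40.2, 1.0, 0.5],
    "n2": [20.1, 40.2, 1.0, 0.5],   # duplicate of n1 -> same code, suppressed
    "n3": [-5.0, 3.0, -2.0, 9.0],
}
reps = select_representatives(readings)
```

Only the representatives send full readings upstream; the other nodes transmit nothing, which is where the energy and overhead savings come from.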
01 Jun 2014
TL;DR: A two-stage unsupervised hashing framework is proposed that harmoniously integrates two state-of-the-art hashing algorithms, Locality Sensitive Hashing and Iterative Quantization; it capitalizes on both term and topic similarity among documents, leading to precise document retrieval.
Abstract: This work achieves sublinear-time Nearest Neighbor Search (NNS) in massive-scale document collections. The primary contribution is a two-stage unsupervised hashing framework that harmoniously integrates two state-of-the-art hashing algorithms, Locality Sensitive Hashing (LSH) and Iterative Quantization (ITQ). LSH handles neighbor-candidate pruning, while ITQ provides an efficient and effective reranking over the neighbor pool captured by LSH. Furthermore, the proposed hashing framework capitalizes on both term and topic similarity among documents, leading to precise document retrieval. The experimental results convincingly show that this hashing-based document retrieval approach closely approximates the conventional Information Retrieval (IR) method in retrieving semantically similar documents, while achieving a speedup of over one order of magnitude in query time.
26 citations
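The two-stage idea (coarse LSH pruning, then a finer rerank over the surviving candidates) can be sketched as follows. For brevity this toy uses an exact cosine rerank in place of ITQ, and all dimensions, names, and data are made-up assumptions.

```python
import math
import random

random.seed(1)
DIM, BITS = 8, 6

planes = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(BITS)]

def code(v):
    # Stage-1 hash: one bit per random hyperplane (hyperplane LSH).
    return tuple(1 if sum(p * x for p, x in zip(pl, v)) >= 0 else 0
                 for pl in planes)

def cosine(a, b):
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

def build_index(docs):
    buckets = {}
    for doc_id, vec in docs.items():
        buckets.setdefault(code(vec), []).append(doc_id)
    return buckets

def search(query, docs, buckets, k=2):
    # Stage 1: prune to the query's bucket; Stage 2: rerank the survivors
    # (exact cosine here stands in for the paper's ITQ reranking).
    candidates = buckets.get(code(query), [])
    return sorted(candidates, key=lambda d: -cosine(query, docs[d]))[:k]

docs = {f"d{i}": [random.gauss(0, 1) for _ in range(DIM)] for i in range(50)}
buckets = build_index(docs)
top = search(docs["d7"], docs, buckets)
```

The sublinear behavior comes from Stage 1: reranking only touches one bucket rather than the whole collection.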
24 Jul 2016
TL;DR: This paper proposes a supervised hashing method based on a well-designed deep convolutional neural network that learns hash codes and compact representations of the data simultaneously.
Abstract: Hashing-based methods seek compact and efficient binary codes that preserve the similarity between data. In most existing hashing methods, an input (e.g. an image) is first encoded as a vector of hand-crafted visual features, followed by a hash projection and a quantization step to obtain the compact binary vector. Because most hand-crafted features encode only low-level information about the input, they may not preserve the semantic similarity of input pairs. Moreover, the hash-function learning process is independent of the feature representation, so the features may not be optimal for the hash projection. In this paper, we propose a supervised hashing method based on a well-designed deep convolutional neural network that learns hash codes and compact representations of the data simultaneously. In particular, the proposed model learns binary codes by adding a compact sigmoid layer before the classifier layer. Experiments on several image datasets show that the proposed model outperforms other state-of-the-art hashing approaches.
26 citations
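The binarization step mentioned above (a compact sigmoid layer whose activations are thresholded to bits) can be sketched without a deep-learning framework. The random weights below are a hypothetical stand-in for weights that the paper learns end-to-end with the CNN.

```python
import math
import random

random.seed(2)
FEAT, CODE_BITS = 16, 8

# Hypothetical weights of the compact sigmoid layer inserted before the
# classifier; in the paper these are trained jointly with the network.
W = [[random.gauss(0, 0.5) for _ in range(FEAT)] for _ in range(CODE_BITS)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def hash_code(features):
    # Forward pass through the sigmoid layer, then threshold at 0.5 to
    # obtain the binary code used at retrieval time.
    acts = [sigmoid(sum(w * x for w, x in zip(row, features))) for row in W]
    return [1 if a >= 0.5 else 0 for a in acts]

x = [random.gauss(0, 1) for _ in range(FEAT)]
binary_code = hash_code(x)
```

During training the sigmoid outputs stay differentiable; the hard threshold is applied only at inference, which is why a sigmoid (rather than a step function) is used before the classifier.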
11 Jul 2021
TL;DR: Xia et al. propose a method based on Locality Sensitive Hashing (LSH) that can detect near-duplicates in sublinear time for a given query.
Abstract: Recently, research on explainable recommender systems has drawn much attention from both academia and industry, resulting in a variety of explainable models. As a consequence, their evaluation approaches vary from model to model, which makes it quite difficult to compare the explainability of different models. To achieve a standard way of evaluating recommendation explanations, we provide three benchmark datasets for EXplanaTion RAnking (denoted EXTRA), on which explainability can be measured by ranking-oriented metrics. Constructing such datasets, however, poses great challenges. First, user-item-explanation triplet interactions are rare in existing recommender systems, so finding alternatives becomes a challenge. Our solution is to identify nearly identical sentences from user reviews. This idea leads to the second challenge: how to efficiently categorize the sentences in a dataset into groups, since estimating the similarity between every pair of sentences has quadratic runtime complexity. To mitigate this issue, we provide a more efficient method based on Locality Sensitive Hashing (LSH) that can detect near-duplicates in sublinear time for a given query. Moreover, we make our code publicly available so that researchers in the community can create their own datasets.
26 citations
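The near-duplicate step above can be sketched with standard MinHash plus LSH banding: sentences whose signatures agree in at least one band land in the same bucket, so a query compares only against its bucket members instead of all sentence pairs. Shingle size, band/row counts, and the hash function are illustrative choices, not the paper's exact configuration.

```python
import hashlib

def shingles(sentence, k=3):
    # Word k-shingles; an illustrative choice of text representation.
    words = sentence.lower().split()
    return {" ".join(words[i:i + k]) for i in range(max(1, len(words) - k + 1))}

def minhash(shingle_set, num_hashes=32):
    # MinHash signature: per seeded hash function, keep the minimum value.
    sig = []
    for seed in range(num_hashes):
        sig.append(min(
            int(hashlib.md5(f"{seed}:{s}".encode()).hexdigest(), 16)
            for s in shingle_set))
    return sig

def lsh_buckets(sentences, bands=8, rows=4):
    # LSH banding: sentences sharing any full band of the signature
    # share a bucket and become near-duplicate candidates.
    buckets = {}
    for sid, sent in enumerate(sentences):
        sig = minhash(shingles(sent))
        for b in range(bands):
            key = (b, tuple(sig[b * rows:(b + 1) * rows]))
            buckets.setdefault(key, set()).add(sid)
    return buckets

def near_duplicates(query, buckets, bands=8, rows=4):
    # Sub-linear lookup: probe only the query's buckets.
    sig = minhash(shingles(query))
    cands = set()
    for b in range(bands):
        cands |= buckets.get((b, tuple(sig[b * rows:(b + 1) * rows])), set())
    return cands

sentences = [
    "the battery life is really great",
    "the battery life is really great indeed",
    "shipping was slow and the box arrived damaged",
]
buckets = lsh_buckets(sentences)
cands = near_duplicates(sentences[0], buckets)
```

Unrelated sentences almost never share a full band, so they stay out of the candidate set without any pairwise comparison.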
12 Dec 2011
TL;DR: This paper considers a new framework that applies supervised learning to directly optimize a data structure supporting efficient large-scale search, and it significantly outperforms state-of-the-art learning-to-hash methods as well as state-of-the-art high-dimensional search algorithms.
Abstract: High-dimensional similarity search in large-scale databases has become an important challenge with the advent of the Internet. Such applications require specialized data structures to achieve computational efficiency. Traditional approaches relied on algorithmic constructions that are often data-independent (such as Locality Sensitive Hashing) or weakly data-dependent (such as kd-trees and k-means trees). While supervised learning algorithms have been applied to related problems, those proposed in the literature mainly focused on learning hash codes optimized for compact embedding of the data rather than for search efficiency. Consequently, such an embedding has to be used with a linear scan or another search algorithm, so learning to hash does not directly address search efficiency. This paper considers a new framework that applies supervised learning to directly optimize a data structure that supports efficient large-scale search. The approach takes both search quality and computational cost into consideration. Specifically, a boosted search forest is learned and optimized using pairwise-similarity-labeled examples. The output of this search forest can be efficiently converted into an inverted-index data structure, which can leverage modern text-search infrastructure to achieve both scalability and efficiency. Experimental results show that the approach significantly outperforms state-of-the-art learning-to-hash methods (such as spectral hashing) as well as state-of-the-art high-dimensional search algorithms (such as LSH and k-means trees).
26 citations
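The conversion from a search forest into an inverted index described above can be sketched as follows: each tree maps an item to a leaf id, posting lists map (tree, leaf) pairs back to item ids, and a query retrieves the union of its posting lists. The threshold-based "trees" below are a hypothetical stand-in for the learned boosted forest.

```python
from collections import defaultdict

def axis_tree_leaf(vec, thresholds):
    # Hypothetical stand-in for a learned tree: route each coordinate
    # by a per-axis threshold and use the bit pattern as the leaf id.
    return tuple(1 if x >= t else 0 for x, t in zip(vec, thresholds))

def build_inverted_index(items, forest):
    # Posting lists: (tree id, leaf id) -> set of item ids.
    index = defaultdict(set)
    for item_id, vec in items.items():
        for tree_id, thresholds in enumerate(forest):
            index[(tree_id, axis_tree_leaf(vec, thresholds))].add(item_id)
    return index

def query(vec, forest, index):
    # Candidate set: union of the posting lists the query falls into.
    out = set()
    for tree_id, thresholds in enumerate(forest):
        out |= index.get((tree_id, axis_tree_leaf(vec, thresholds)), set())
    return out

items = {"a": (0.2, 0.9), "b": (0.1, 0.8), "c": (0.9, 0.1)}
forest = [(0.5, 0.5), (0.3, 0.7)]          # two toy "trees"
idx = build_inverted_index(items, forest)
candidates = query((0.2, 0.85), forest, idx)
```

Because the index is just posting lists keyed by discrete ids, it plugs directly into standard text-search infrastructure, which is the scalability point the abstract makes.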