
Showing papers on "Locality-sensitive hashing published in 2003"


Proceedings ArticleDOI
13 Oct 2003
TL;DR: A new algorithm is introduced that learns a set of hashing functions that efficiently index examples relevant to a particular estimation task, and can rapidly and accurately estimate the articulated pose of human figures from a large database of example images.
Abstract: Example-based methods are effective for parameter estimation problems when the underlying system is simple or the dimensionality of the input is low. For complex and high-dimensional problems such as pose estimation, the number of required examples and the computational complexity rapidly become prohibitively high. We introduce a new algorithm that learns a set of hashing functions that efficiently index examples relevant to a particular estimation task. Our algorithm extends locality-sensitive hashing, a recently developed method to find approximate neighbors in time sublinear in the number of examples. This method depends critically on the choice of hash functions that are optimally relevant to a particular estimation problem. Experiments demonstrate that the resulting algorithm, which we call parameter-sensitive hashing, can rapidly and accurately estimate the articulated pose of human figures from a large database of example images.

929 citations
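
As a rough illustration of the indexing idea behind this entry, the sketch below implements a generic locality-sensitive hash with random hyperplanes in Python. The class name and parameters are invented for the example, and the paper's learned, parameter-sensitive hash functions are not reproduced; this only shows the baseline LSH mechanism the paper extends.

```python
# Minimal sketch of locality-sensitive hashing with random hyperplanes
# (a generic LSH baseline; the paper instead learns hash functions that
# are sensitive to the pose parameters, which is not reproduced here).
import numpy as np
from collections import defaultdict

class RandomHyperplaneLSH:
    def __init__(self, dim, n_bits=16, n_tables=8, seed=0):
        rng = np.random.default_rng(seed)
        # One set of random hyperplanes per hash table.
        self.planes = [rng.standard_normal((n_bits, dim)) for _ in range(n_tables)]
        self.tables = [defaultdict(list) for _ in range(n_tables)]

    def _key(self, planes, x):
        # Each bit records which side of a hyperplane the vector falls on.
        return tuple((planes @ x > 0).astype(int))

    def index(self, examples):
        for i, x in enumerate(examples):
            for planes, table in zip(self.planes, self.tables):
                table[self._key(planes, x)].append(i)

    def query(self, x):
        # Union of all buckets the query falls into across the tables.
        candidates = set()
        for planes, table in zip(self.planes, self.tables):
            candidates.update(table[self._key(planes, x)])
        return candidates

# Usage: index image features, then retrieve candidate examples whose
# stored pose parameters could be used to estimate the query's pose.
feats = np.random.rand(10000, 128)
lsh = RandomHyperplaneLSH(dim=128)
lsh.index(feats)
candidates = lsh.query(feats[0])
```
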


01 Jan 2003
TL;DR: This work employs an adaptive pattern discovery technique to learn the distribution patterns of relevant records in the data space, drastically reducing the number of irrelevant data records, and returns the top-k nearest neighbors of the query to the user.
Abstract: To query high-dimensional databases, similarity search (or k nearest neighbor search) is the most extensively used method. However, since each attribute of a high-dimensional data record contains only a very small amount of information, the distance between two high-dimensional records may not always correctly reflect their similarity. So, a multi-dimensional query may have a k-nearest-neighbor set that contains only a few relevant records. To address this issue, we present an adaptive pattern discovery method to search high-dimensional data spaces both effectively and efficiently. With our method, the user is allowed to participate in the database search by labeling the returned records as relevant or irrelevant. Using the user-labeled records as training samples, our method employs an adaptive pattern discovery technique to learn the distribution patterns of relevant records in the data space and drastically reduces the number of irrelevant data records. From the reduced data set, our approach returns the top-k nearest neighbors of the query to the user; this interaction between the user and the DBMS can be repeated multiple times. To achieve adaptive pattern discovery, we employ a pattern classification algorithm called random forests, a machine learning algorithm with proven performance on many traditional classification problems. By using a novel two-level resampling method, we adapt the original random forests to an interactive algorithm, which achieves noticeable gains in efficiency over the original algorithm. We empirically compare our method with previously well-known related approaches on large-scale, high-dimensional, real-world data sets, and report promising results.

3 citations
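
One round of the interactive loop described in this abstract can be sketched as follows. This sketch uses scikit-learn's standard RandomForestClassifier rather than the paper's two-level resampling variant, and the function name, relevance threshold, and scoring rule are illustrative assumptions.

```python
# One feedback round of relevance-driven k-NN search, sketched with a
# standard random forest (the paper's two-level resampling adaptation
# and exact pruning rule are not reproduced here).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def feedback_round(data, query, labeled_idx, labels, k=10, relevance_threshold=0.5):
    # Train on the records the user has labeled: 1 = relevant, 0 = irrelevant.
    # Assumes at least one example of each class has been labeled.
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(data[labeled_idx], labels)

    # Prune records the forest considers unlikely to be relevant.
    prob_relevant = clf.predict_proba(data)[:, 1]
    keep = np.where(prob_relevant >= relevance_threshold)[0]

    # Return the top-k nearest neighbors of the query within the reduced set.
    dists = np.linalg.norm(data[keep] - query, axis=1)
    return keep[np.argsort(dists)[:k]]

# Usage: the returned record ids would be shown to the user, whose new
# relevance labels seed the next round of the interaction.
```
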


Patent
30 May 2003
TL;DR: A method of hashing biometric data for generation of a public template is disclosed in this patent, in which a hash function is provided for hashing feature data; the hash function is a function of data within the biometric information and is determinable therefrom, but yields a hashing result from which the original feature data is indiscernible.
Abstract: A method of hashing biometric data for generation of a public template is disclosed. According to the method, a hash function is provided for hashing of feature data, the hash function being a function of data within the biometric information and determinable therefrom, but resulting in a hashing result from which the original feature data is indiscernible.
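
The abstract does not specify the patent's construction, so the following is only a generic illustration of deriving a non-invertible public template by hashing quantized biometric feature data; the function name, salt, and quantization step are all assumptions made for the example.

```python
# Illustrative sketch only: the patent's exact construction is not given
# in the abstract. This shows the general idea of a public template that
# is derived from the feature data but does not reveal it.
import hashlib
import numpy as np

def public_template(features, salt, step=0.25):
    # Quantize so that small measurement noise tends to map to the same
    # bucket, then hash so the original features cannot be recovered.
    quantized = np.round(np.asarray(features) / step).astype(np.int64)
    return hashlib.sha256(salt + quantized.tobytes()).hexdigest()

# Usage: store the digest (and salt) instead of the raw feature vector.
template = public_template([0.12, 3.4, -1.7], salt=b"enrollment-salt")
```
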