Topic

Feature hashing

About: Feature hashing is a research topic. Over its lifetime, 993 publications have appeared on this topic, receiving 51,462 citations.


Papers
Proceedings Article
Huan Zhao, Shaofang He
01 Aug 2016
TL;DR: Experimental results indicate that the perceptual hashing generated from multifractal characteristics shows better distinctiveness and robustness than those derived from time and frequency domain features in existing methods.
Abstract: To further improve the robustness and discrimination of perceptual hashing, and the retrieval speed on large-scale data, a novel retrieval algorithm over encrypted speech is proposed. Before encrypted speech is uploaded, perceptual hashing sequences must be embedded as a digital watermark. When generating the perceptual hash, the multifractal characteristics of speech, which offer good distinctiveness and robustness, are introduced, and piecewise aggregate approximation is used to compress the data. The retrieval process does not need decryption but requires generating the perceptual hashing sequence of the query speech segment. Each perceptual hashing set in the system hash table is then matched in turn. Experimental results indicate that the perceptual hashing generated from multifractal characteristics shows better distinctiveness and robustness than those derived from time and frequency domain features in existing methods. Furthermore, because piecewise aggregate approximation is employed, the generated perceptual hash is small, which greatly improves retrieval speed. Finally, the proposed retrieval algorithm achieves high recall and precision under a variety of content-preserving operations.

19 citations
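
The piecewise aggregate approximation (PAA) step mentioned in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the multifractal feature extraction is not shown, and the median-threshold binarisation (`binary_hash`) and the bit-error-rate matcher (`bit_error_rate`) are hypothetical helpers added only to show how a compressed feature sequence might be turned into a comparable hash.

```python
from statistics import mean, median


def paa(signal, n_segments):
    """Piecewise aggregate approximation: replace the sequence by the
    mean of each of n_segments contiguous, roughly equal-length chunks."""
    n = len(signal)
    if not 1 <= n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(signal)")
    reduced = []
    for i in range(n_segments):
        start = i * n // n_segments
        end = (i + 1) * n // n_segments
        reduced.append(mean(signal[start:end]))
    return reduced


def binary_hash(features):
    """Hypothetical binarisation (an assumption, not from the paper):
    emit 1 where a feature exceeds the median of the sequence, else 0."""
    threshold = median(features)
    return [1 if f > threshold else 0 for f in features]


def bit_error_rate(hash_a, hash_b):
    """Fraction of differing bits between two equal-length hashes."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b)) / len(hash_a)


if __name__ == "__main__":
    features = [1, 2, 3, 4, 5, 6]      # stand-in for per-frame features
    compressed = paa(features, 3)       # three segment means
    print(compressed, binary_hash(compressed))
```

A query would be hashed the same way and compared against each stored hash with `bit_error_rate`, accepting matches below a chosen threshold; the threshold itself would have to be tuned experimentally.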

Book Chapter
27 Aug 2007
TL;DR: A general biometric hash generation scheme based on vector quantization of multiple feature subsets selected with genetic optimization; it overcomes the dimensionality problem of other hash generation algorithms and makes it possible to exploit all the discriminative information found in large feature sets.
Abstract: We present a general biometric hash generation scheme based on vector quantization of multiple feature subsets selected with genetic optimization. The quantization of subsets overcomes the dimensionality problem of other hash generation algorithms, while the feature selection step, which uses an integer-coding genetic algorithm, makes it possible to exploit all the discriminative information found in large feature sets. We provide experimental results of the proposed hashing for verification of on-line signatures. Development and evaluation experiments are reported on the MCYT signature database, comprising 16,500 signatures from 330 subjects.

19 citations

Proceedings Article
25 Oct 2010
TL;DR: Data-Oriented LSH is proposed to reduce memory consumption when data are non-uniformly distributed. It focuses on hash table construction, so query-directed methods can be applied to the index for further improvement.
Abstract: Locality Sensitive Hashing (LSH) has been proposed as a scalable, high-dimensional index for approximate similarity search. Euclidean LSH is a variation of LSH and has been successfully used in many multimedia applications. However, the hash functions of basic Euclidean LSH project data points onto randomly selected directions, which reduces accuracy when data are non-uniformly distributed. More hash tables are therefore needed to guarantee accuracy, and thus more memory is consumed. Since heavy memory cost is a significant drawback of Euclidean LSH, we propose Data-Oriented LSH to reduce memory consumption when data are non-uniformly distributed. Most existing methods are query-directed, such as multi-probe and query expansion methods. We focus on hash table construction, so query-directed methods can be applied to our index for further improvement. The experiment shows that, to achieve the same accuracy, our method uses less time and less memory than the original Euclidean LSH.

19 citations
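
For orientation, the basic Euclidean LSH that the abstract above takes as its baseline can be sketched as follows. This shows only the standard random-projection scheme, h(v) = floor((a·v + b) / w) with Gaussian a and uniform offset b, and not the Data-Oriented variant the paper proposes. The class name `EuclideanLSH` and its parameters (`n_hashes`, `bucket_width`, `seed`) are illustrative assumptions.

```python
import math
import random


class EuclideanLSH:
    """One hash table of basic Euclidean (p-stable) LSH.

    Each of the n_hashes functions projects a vector onto a random
    Gaussian direction, adds a random offset, and quantises the result
    into buckets of width bucket_width. The tuple of bucket indices is
    the key into the hash table.
    """

    def __init__(self, dim, n_hashes, bucket_width, seed=0):
        rng = random.Random(seed)
        self.w = bucket_width
        self.projections = [
            [rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_hashes)
        ]
        self.offsets = [rng.uniform(0.0, bucket_width) for _ in range(n_hashes)]
        self.table = {}

    def key(self, v):
        """Return the tuple of quantised projections for vector v."""
        return tuple(
            math.floor((sum(a_i * v_i for a_i, v_i in zip(a, v)) + b) / self.w)
            for a, b in zip(self.projections, self.offsets)
        )

    def insert(self, label, v):
        """Store label in the bucket that v hashes to."""
        self.table.setdefault(self.key(v), []).append(label)

    def query(self, v):
        """Return labels stored in the same bucket as v (candidates only)."""
        return list(self.table.get(self.key(v), []))


if __name__ == "__main__":
    index = EuclideanLSH(dim=3, n_hashes=4, bucket_width=4.0)
    index.insert("p1", [1.0, 2.0, 3.0])
    print(index.query([1.0, 2.0, 3.0]))
```

Because the directions are drawn at random rather than from the data, several such tables are usually needed for good recall, which is the memory cost the abstract refers to.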

Patent
Simon Tong, Noam Shazeer
09 Aug 2011
TL;DR: In this article, a system may track statistics for a number of features using an approximate counting technique by subjecting each feature to multiple, different hash functions to generate multiple different hash values, where each of the hash values may identify a particular location in a memory, and storing statistics for each feature at the particular locations identified by the hash value.
Abstract: A system may track statistics for a number of features using an approximate counting technique by: subjecting each feature to multiple, different hash functions to generate multiple, different hash values, where each of the hash values may identify a particular location in a memory, and storing statistics for each feature at the particular locations identified by the hash values. The system may generate rules for a model based on the tracked statistics.

19 citations
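
The idea of hashing each feature with several different hash functions and storing statistics at the resulting memory locations resembles a count-min sketch. The following Python sketch shows that well-known structure as an illustration only; it is not the patented system, and the class name `CountMinSketch`, the salted BLAKE2b hashing, and the parameters `n_hashes` and `width` are all assumptions.

```python
import hashlib


class CountMinSketch:
    """Approximate per-feature counters in fixed memory.

    Each feature is hashed by n_hashes differently salted hash functions;
    each hash value picks one cell in its own row of counters. Collisions
    can only inflate a cell, so the minimum over the rows is an estimate
    that never falls below the true count.
    """

    def __init__(self, n_hashes=4, width=1024):
        self.n_hashes = n_hashes
        self.width = width
        self.rows = [[0] * width for _ in range(n_hashes)]

    def _locations(self, feature):
        """Yield (row, column) for each of the salted hash functions."""
        data = feature.encode("utf-8")
        for row in range(self.n_hashes):
            digest = hashlib.blake2b(
                data, digest_size=8, salt=row.to_bytes(8, "little")
            ).digest()
            yield row, int.from_bytes(digest, "little") % self.width

    def add(self, feature, count=1):
        """Add count to every cell the feature hashes to."""
        for row, col in self._locations(feature):
            self.rows[row][col] += count

    def estimate(self, feature):
        """Return the smallest counter among the feature's cells."""
        return min(self.rows[row][col] for row, col in self._locations(feature))


if __name__ == "__main__":
    sketch = CountMinSketch()
    for token in ["click", "click", "view", "click"]:
        sketch.add(token)
    print(sketch.estimate("click"), sketch.estimate("view"))
```

Memory use is fixed at `n_hashes * width` counters regardless of how many distinct features are seen, at the cost of possible overestimates when features collide in every row.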

Journal Article
TL;DR: The perfect hashing function described in this article has been used to create minimal perfect hashing functions for unsegmented word sets of up to 5000 words and is a significant improvement in terms of both time and space efficiency.

19 citations


Network Information
Related Topics (5)
Feature extraction
111.8K papers, 2.1M citations
84% related
Convolutional neural network
74.7K papers, 2M citations
84% related
Feature (computer vision)
128.2K papers, 1.7M citations
84% related
Deep learning
79.8K papers, 2.1M citations
83% related
Support vector machine
73.6K papers, 1.7M citations
83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    33
2022    89
2021    11
2020    16
2019    16
2018    38