Feature hashing

About: Feature hashing (also known as the hashing trick) is a technique for vectorizing features by applying a hash function to each feature and using the hash value as its index in a fixed-length feature vector. Over the lifetime, 993 publications have been published within this topic, receiving 51,462 citations.
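As a minimal, paper-agnostic illustration of the core technique: the hashing trick maps each feature name to an index via a hash function, so no explicit vocabulary needs to be stored, and a sign derived from the same hash lets colliding features partially cancel. A sketch in Python (the function name and the choice of MD5 are illustrative):

```python
import hashlib

def hashed_features(tokens, n_features=16):
    """Map a list of string features to a fixed-length vector via the hashing trick.

    Each token is hashed to an index in [0, n_features); one extra hash bit
    chooses a +/-1 sign so that colliding features tend to cancel rather than
    always add up.
    """
    vec = [0.0] * n_features
    for tok in tokens:
        h = int.from_bytes(hashlib.md5(tok.encode("utf-8")).digest()[:8], "little")
        index = h % n_features
        sign = 1.0 if (h >> 63) & 1 else -1.0
        vec[index] += sign
    return vec

# Two documents land in the same 16-dimensional space without a shared vocabulary.
print(hashed_features("the quick brown fox".split()))
print(hashed_features("the lazy dog jumps".split()))
```

Production implementations of the same idea ship in libraries such as scikit-learn (FeatureHasher) and Vowpal Wabbit.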


Papers
Proceedings Article (DOI)
30 Jun 2004
TL;DR: This paper proposes a geometry-invariant image hashing scheme, which can be employed for content copy detection and tracing; exhaustive experimental results obtained from benchmark attacks confirm the performance of the proposed method.
Abstract: Owing to its desired non-invasive property, non-data hiding (called media hashing here) is considered an alternative for achieving many applications previously accomplished with watermarking. Recently, media hashing techniques for content identification have been gradually emerging. However, none of them are really resistant against geometric attacks. In this paper, our aim is to propose a geometry-invariant image hashing scheme, which can be employed for content copy detection and tracing. Our system is mainly composed of three components: (i) robust mesh extraction; (ii) mesh-based robust hash extraction; and (iii) hash matching for similarity measurement. Exhaustive experimental results obtained from benchmark attacks have confirmed the performance of the proposed method.
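The mesh extraction and hash generation stages above are specific to the paper, but the final hash-matching stage in such systems typically reduces to a distance between binary hash strings. A generic sketch, not the authors' implementation (the threshold value is illustrative):

```python
def hamming_distance(hash_a, hash_b):
    """Normalized Hamming distance between two equal-length binary hashes (0/1 lists)."""
    assert len(hash_a) == len(hash_b)
    return sum(a != b for a, b in zip(hash_a, hash_b)) / len(hash_a)

def looks_like_copy(hash_a, hash_b, threshold=0.2):
    """Flag two images as near-duplicates when their hashes are close enough;
    the threshold would be tuned on benchmark attacks rather than fixed at 0.2."""
    return hamming_distance(hash_a, hash_b) <= threshold

print(looks_like_copy([0, 1, 1, 0, 1, 0, 0, 1],
                      [0, 1, 1, 0, 1, 1, 0, 1]))  # True: only 1 of 8 bits differs
```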

62 citations

Proceedings Article
Wei Liu, Jun Wang, Yadong Mu, Sanjiv Kumar, Shih-Fu Chang
26 Jun 2012
TL;DR: The key idea is the bilinear form of the proposed hash functions, which leads to a higher collision probability than existing hyperplane hash functions under random projections; learning the bilinear functions directly from data further boosts search performance over the random-projection-based solutions.
Abstract: Hyperplane hashing aims at rapidly searching nearest points to a hyperplane, and has shown practical impact in scaling up active learning with SVMs. Unfortunately, the existing randomized methods need long hash codes to achieve reasonable search accuracy and thus suffer from reduced search speed and large memory overhead. To this end, this paper proposes a novel hyperplane hashing technique which yields compact hash codes. The key idea is the bilinear form of the proposed hash functions, which leads to higher collision probability than the existing hyperplane hash functions when using random projections. To further increase the performance, we propose a learning based framework in which the bilinear functions are directly learned from the data. This results in short yet discriminative codes, and also boosts the search performance over the random projection based solutions. Large-scale active learning experiments carried out on two datasets with up to one million samples demonstrate the overall superiority of the proposed approach.
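A minimal sketch of the random-projection variant of such a bilinear hash, where bit k is the sign of the product of two projections (u_k·z)(v_k·z); the learned variant described above would replace the random U and V with matrices optimized from data (all sizes here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_bits = 64, 16

# One pair of random projection directions (u_k, v_k) per hash bit.
U = rng.standard_normal((n_bits, dim))
V = rng.standard_normal((n_bits, dim))

def bilinear_hash(z):
    """Bit k is sign((u_k . z) * (v_k . z)), i.e. the product of two projections."""
    return ((U @ z) * (V @ z)) >= 0

x = rng.standard_normal(dim)   # a database point
w = rng.standard_normal(dim)   # a hyperplane normal acting as the query
print(bilinear_hash(x).astype(int))
print(bilinear_hash(w).astype(int))
```

In the paper, database codes are then compared in Hamming space against a code derived from the hyperplane's normal, using a sign convention that makes points near the hyperplane collide with the query.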

61 citations

Journal Article (DOI)
01 Sep 2014 - Optik
TL;DR: A robust image hashing scheme based on dominant discrete cosine transform (DCT) coefficients is proposed: it converts the input image into a normalized image, divides it into non-overlapping blocks, extracts the dominant DCT coefficients from the first row/column of each block to construct feature matrices, and finally compresses the matrices by calculating and quantizing column distances.
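A rough sketch of the block-DCT feature extraction summarized above, assuming the input is already a normalized grayscale image stored as a 2D NumPy array whose side lengths are multiples of the block size; the paper's normalization, distance-based compression, and quantization details are omitted:

```python
import numpy as np
from scipy.fft import dct

def block_dct_features(img, block=16, k=4):
    """Dominant DCT coefficients from the first row/column of each non-overlapping block.

    img   : 2D float array; height and width assumed divisible by `block`
    block : side length of the non-overlapping blocks
    k     : low-frequency coefficients kept from the first row and first column
    Returns a feature matrix with one column of 2*k coefficients per block.
    """
    h, w = img.shape
    cols = []
    for i in range(0, h, block):
        for j in range(0, w, block):
            b = img[i:i + block, j:j + block]
            # Separable 2D DCT-II: transform rows, then columns.
            c = dct(dct(b, axis=0, norm="ortho"), axis=1, norm="ortho")
            cols.append(np.concatenate([c[0, :k], c[:k, 0]]))
    return np.array(cols).T

features = block_dct_features(np.random.rand(64, 64))
print(features.shape)  # (8, 16): 2*k coefficients for each of the 16 blocks
```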

61 citations

Proceedings Article (DOI)
13 Jun 2010
TL;DR: It is shown that, with hashing, the sparse representation can be recovered with high probability because hashing preserves the restricted isometry property, and a theoretical analysis of the recognition rate is presented.
Abstract: We propose a face recognition approach based on hashing. The approach yields recognition rates comparable to the random ℓ1 approach [18], which is considered the state of the art, but our method is much faster: it is up to 150 times faster than [18] on the YaleB dataset. We show that with hashing, the sparse representation can be recovered with a high probability because hashing preserves the restricted isometry property. Moreover, we present a theoretical analysis of the recognition rate of the proposed hashing approach. Experiments show a very competitive recognition rate and a significant speedup compared with the state of the art.
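A simplified sketch of this general pipeline: compress the training dictionary and the test sample with the same random sign projection (the hashing/dimension-reduction step argued to preserve the restricted isometry property), recover a sparse coefficient vector by ℓ1-regularized regression, and classify by the smallest class-wise reconstruction residual. scikit-learn's Lasso stands in for the paper's ℓ1 solver, and all sizes and data below are synthetic placeholders:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n_features, n_train, m = 1024, 40, 64      # original dim, dictionary size, hashed dim

A = rng.standard_normal((n_features, n_train))         # training faces, one per column
labels = np.repeat(np.arange(8), 5)                     # 8 subjects x 5 images
y = A[:, 3] + 0.01 * rng.standard_normal(n_features)    # noisy test face of subject 0

# Hashing step: a random +/-1 projection reduces the dimension; projections of this
# kind are the ones argued to preserve the restricted isometry property.
Phi = rng.choice([-1.0, 1.0], size=(m, n_features)) / np.sqrt(m)
A_h, y_h = Phi @ A, Phi @ y

# Sparse recovery in the hashed space (Lasso as a stand-in l1 solver).
coef = Lasso(alpha=0.01, fit_intercept=False, max_iter=10000).fit(A_h, y_h).coef_

# Classify by the class whose training columns best reconstruct the test sample.
residuals = [np.linalg.norm(y_h - A_h[:, labels == c] @ coef[labels == c])
             for c in np.unique(labels)]
print("predicted subject:", int(np.argmin(residuals)))
```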

61 citations

Proceedings Article (DOI)
28 Jul 2013
TL;DR: Experimental results indicate that modeling tag information and utilizing topic modeling each improve the effectiveness of hashing on their own, while combining the two techniques in the unified framework obtains even better results.
Abstract: It is an important research problem to design efficient and effective solutions for large-scale similarity search. One popular strategy is to represent data examples as compact binary codes through semantic hashing, which has produced promising results with fast search speed and low storage cost. Many existing semantic hashing methods generate binary codes for documents by modeling document relationships based on similarity in a keyword feature space. Two major limitations of existing methods are: (1) tag information is often associated with documents in many real-world applications but has not been fully exploited yet; (2) similarity in the keyword feature space does not fully reflect semantic relationships that go beyond keyword matching. This paper proposes a novel hashing approach, Semantic Hashing using Tags and Topic Modeling (SHTTM), to incorporate both the tag information and the similarity information from probabilistic topic modeling. In particular, a unified framework is designed to ensure that hashing codes are consistent with tag information via a formal latent factor model while preserving the document topic/semantic similarity that goes beyond keyword matching. An iterative coordinate descent procedure is proposed for learning the optimal hashing codes. An extensive set of empirical studies on four different datasets demonstrates the advantages of the proposed SHTTM approach over several other state-of-the-art semantic hashing techniques. Furthermore, experimental results indicate that modeling tag information and utilizing topic modeling each improve hashing effectiveness on their own, while combining the two techniques in the unified framework obtains even better results.
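SHTTM itself learns codes jointly from tags and topics with a latent factor model and coordinate descent; as a much simpler illustration of the topic-modeling half only, one can binarize topic proportions by thresholding each dimension at its corpus-wide median. This is not the authors' algorithm, and all names and sizes below are illustrative:

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "hashing methods for fast image search",
    "binary codes enable efficient similarity search",
    "topic models discover latent semantic structure",
    "tags provide supervision for document hashing",
]

# Keyword counts -> topic proportions -> binary codes.
counts = CountVectorizer().fit_transform(docs)
topics = LatentDirichletAllocation(n_components=8, random_state=0).fit_transform(counts)

# Threshold each topic dimension at its corpus-wide median to obtain one bit.
codes = (topics > np.median(topics, axis=0)).astype(int)
print(codes)  # one 8-bit code per document
```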

60 citations


Network Information
Related Topics (5)
Feature extraction: 111.8K papers, 2.1M citations (84% related)
Convolutional neural network: 74.7K papers, 2M citations (84% related)
Feature (computer vision): 128.2K papers, 1.7M citations (84% related)
Deep learning: 79.8K papers, 2.1M citations (83% related)
Support vector machine: 73.6K papers, 1.7M citations (83% related)
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2023    33
2022    89
2021    11
2020    16
2019    16
2018    38