Topic

Feature hashing

About: Feature hashing is a research topic. Over its lifetime, 993 publications have been published within this topic, receiving 51,462 citations.


Papers
01 Jan 2012
TL;DR: A new association rule mining algorithm, Hash Based Frequent Item sets-Double Hashing (HBFI-DH), uses hashing to store the database in vertical data format and applies double hashing to avoid hash collisions and the secondary clustering problem.
Abstract: Data mining is the use of automated data analysis techniques to uncover previously undetected relationships among data items. In data mining, association rule learning is a popular and well-researched method for discovering interesting relations between variables in large databases. Mining frequent patterns is probably one of the most important concepts in data mining. It plays an essential role in many data mining tasks that try to find interesting patterns in databases, such as association rules, correlations, sequences, classifiers and clusters. In this paper, we propose a new association rule mining algorithm called Hash Based Frequent Item sets-Double Hashing (HBFI-DH), in which hashing is used to store the database in vertical data format. To avoid hash collisions and the secondary clustering problem, the double hashing technique is used here. The advantages of this hashing technique are a hash function that is easy to compute, fast data access, and efficiency. The algorithm also avoids unnecessary scans of the database.

9 citations
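
The abstract above combines two ideas: a vertical data format (each item maps to the set of transaction ids that contain it) and double hashing to resolve collisions. The following is a minimal sketch of that combination, not the HBFI-DH algorithm itself. The class name `DoubleHashTable`, the two hash functions, the table size, and the sample transactions are all assumptions chosen for illustration.

```python
class DoubleHashTable:
    """Open-addressing table that resolves collisions with double hashing.

    Illustrative only: integer item ids are assumed, and the table never grows.
    """

    def __init__(self, size=11):
        # A prime size means every step h2 in 1..size-1 visits all slots.
        self.size = size
        self.slots = [None] * size  # each slot holds (item, set of transaction ids)

    def _probe(self, key):
        h1 = key % self.size
        h2 = 1 + (key % (self.size - 1))  # second hash, never zero
        for i in range(self.size):
            yield (h1 + i * h2) % self.size

    def add(self, item, tid):
        """Record that transaction `tid` contains `item` (vertical format)."""
        for idx in self._probe(item):
            slot = self.slots[idx]
            if slot is None:
                self.slots[idx] = (item, {tid})
                return
            if slot[0] == item:
                slot[1].add(tid)
                return
        raise RuntimeError("table full")

    def tids(self, item):
        """Return the transaction-id set for `item`, or an empty set."""
        for idx in self._probe(item):
            slot = self.slots[idx]
            if slot is None:
                return set()
            if slot[0] == item:
                return slot[1]
        return set()


def support(table, itemset):
    """Support count of an itemset = size of the intersection of its tid sets."""
    tid_sets = [table.tids(i) for i in itemset]
    return len(set.intersection(*tid_sets)) if tid_sets else 0


# Hypothetical data: items 3, 14 and 25 all share the same first hash (3 mod 11),
# so they only end up in different slots because their second hashes differ.
transactions = {1: [3, 14, 25], 2: [3, 25], 3: [14, 25], 4: [3, 14, 25]}
table = DoubleHashTable(size=11)
for tid, items in transactions.items():
    for item in items:
        table.add(item, tid)
```

Once the table is built from a single pass over the transactions, `support(table, [3, 25])` is computed by intersecting transaction-id sets, without rescanning the transaction list.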

Journal Article
TL;DR: This paper presents a vector data hashing method based on the polyline curvature for design drawings and digital maps that had a very low false detection probability during geometrical modifications, rearrangements, and similar transformations of objects and layers.
Abstract: The growth in applications for vector data such as CAD design drawings and GIS digital maps has increased the requirements for authentication, copy detection, and retrieval of vector data. Vector data hashing is one of the main techniques for meeting these requirements. Its design must be robust, secure, and unique, which is similar to image or video hashing. This paper presents a vector data hashing method based on the polyline curvature for design drawings and digital maps. Our hashing method extracts the feature values by projecting the polyline curvatures, which are obtained from groups of vector data using GMM clustering, onto random values, before generating the final binary hash by binarization. A robustness evaluation showed that our hashing method had a very low false detection probability during geometrical modifications, rearrangements, and similar transformations of objects and layers. A security evaluation based on differential entropy showed that the level of uncertainty was very high with our hashing method. Furthermore, a uniqueness evaluation showed that the Hamming distances between hashes were very low.

9 citations
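
The pipeline described in the abstract above (curvature features, projection onto random values, binarization) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the paper's method: the GMM clustering step is omitted, curvature is approximated by the turning angle at each polyline vertex, and the function names, histogram size, key, and sample drawing are hypothetical.

```python
import math
import random


def turning_angles(polyline):
    """Curvature proxy (assumption): absolute turning angle at each interior vertex."""
    angles = []
    for (x0, y0), (x1, y1), (x2, y2) in zip(polyline, polyline[1:], polyline[2:]):
        a = math.atan2(y1 - y0, x1 - x0)
        b = math.atan2(y2 - y1, x2 - x1)
        d = (b - a + math.pi) % (2 * math.pi) - math.pi  # wrap into [-pi, pi)
        angles.append(abs(d))
    return angles


def feature_vector(polylines, bins=8):
    """Normalized histogram of turning angles over all polylines."""
    hist = [0.0] * bins
    for pl in polylines:
        for ang in turning_angles(pl):
            k = min(int(ang / math.pi * bins), bins - 1)
            hist[k] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]


def binary_hash(features, key, n_bits=32):
    """Project features onto key-seeded random vectors, then binarize by sign."""
    rng = random.Random(key)  # the key makes the projection reproducible
    bits = []
    for _ in range(n_bits):
        w = [rng.gauss(0.0, 1.0) for _ in features]
        proj = sum(wi * fi for wi, fi in zip(w, features))
        bits.append(1 if proj >= 0.0 else 0)
    return bits


def hamming(a, b):
    """Number of positions where two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical drawing: two polylines. The modified copy is translated
# and has its polylines (layers) listed in the opposite order.
drawing = [[(0, 0), (4, 0), (4, 3), (7, 5)], [(1, 1), (2, 3), (5, 3)]]
moved = [[(x + 10, y - 2) for x, y in pl] for pl in reversed(drawing)]

h_original = binary_hash(feature_vector(drawing), key=42)
h_moved = binary_hash(feature_vector(moved), key=42)
```

Because turning angles do not change under translation and the histogram ignores polyline order, `h_original` and `h_moved` are identical in this sketch, which mirrors the kind of robustness to rearrangement the abstract describes.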

Proceedings Article
03 Nov 2014
TL;DR: A transfer deep learning algorithm is employed to learn a robust image representation, and a neighborhood-structure-preserving method is used to map images into discriminative hash codes in Hamming space, ensuring a good feature representation and a fast query speed without depending on large amounts of labeled data.
Abstract: With the explosive increase of online images, fast similarity search is increasingly critical for large-scale image retrieval. Several hashing methods have been proposed to accelerate image retrieval. A promising approach is semantic hashing, which designs compact binary codes for a large number of images so that semantically similar images are mapped to similar codes. Supervised methods can handle such semantic similarity, but they are prone to overfitting when the labeled data is scarce or noisy. In this paper, we concentrate on this issue and propose a novel Inductive Transfer Deep Hashing (ITDH) approach for semantic-hashing-based image retrieval. A transfer deep learning algorithm is employed to learn a robust image representation, and a neighborhood-structure-preserving method is used to map images into discriminative hash codes in Hamming space. The combination of the two techniques ensures that we obtain a good feature representation and a fast query speed without depending on large amounts of labeled data. Experimental results demonstrate that the proposed approach is superior to some state-of-the-art methods.

9 citations
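
The fast query speed mentioned in the abstract above comes from comparing compact binary codes with XOR and a bit count. The sketch below shows only that retrieval step. The learned ITDH network is not reproduced, and the 8-bit codes, image names, and the `search` helper are made up for illustration.

```python
def hamming(a, b):
    """Hamming distance between two binary codes stored as Python ints."""
    return bin(a ^ b).count("1")


def search(query, database, k=2):
    """Return the names of the k codes closest to `query` in Hamming space."""
    ranked = sorted(
        database.items(),
        key=lambda kv: (hamming(query, kv[1]), kv[0]),  # ties broken by name
    )
    return [name for name, _ in ranked[:k]]


# Hypothetical codes: semantically similar images are assumed to have
# been mapped to similar bit patterns by some hashing model.
database = {
    "cat_1": 0b10110010,
    "cat_2": 0b10110110,
    "car_1": 0b01001101,
    "car_2": 0b01001001,
}
query = 0b10110011
nearest = search(query, database, k=2)  # ['cat_1', 'cat_2']
```

A linear scan like this is enough to show the idea. Large collections typically bucket codes or use multi-index structures so that not every code has to be compared.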

Journal Article
TL;DR: A non-expansive hashing scheme wherein any set of size from a large universe may be stored in a memory of size (any, and ), and where retrieval takes operations.
Abstract: In a non-expansive hashing scheme, similar inputs are stored in memory locations which are close. We develop a non-expansive hashing scheme wherein any set of size from a large universe may be stored in a memory of size (any , and ), and where retrieval takes operations. We explain how to use non-expansive hashing schemes for efficient storage and retrieval of noisy data. A dynamic version of this hashing scheme is presented as well.

9 citations

Proceedings Article
10 Jun 2016
TL;DR: The proposed method is shown to work well in the authentication scenario, with a shorter hash length and more discrimination power.
Abstract: This paper presents a novel image hashing method for authentication and tampering detection. Digitization has made the creation, modification and transfer of multimedia data easy. Integrity is very important for crucial and sensitive matters such as medical records, legal matters, scientific research, forensic investigations and government documents. Image hashing is one of the popular methods for maintaining the integrity of an image. We propose AQ-CSLBP (Average Compressed Center Symmetric Local Binary Pattern) as a feature descriptor for image hashing. First, the image is divided into sub-blocks, and AQ-CSLBP is applied to each sub-block to generate an 8-bin histogram as a feature. Then, we use double-bit quantization to generate the hash code for the image. The proposed method is compared with existing methods by quantitative analysis, which shows that it works well in the authentication scenario with a shorter hash length and more discrimination power.

9 citations


Network Information
Related Topics (5)
Feature extraction
111.8K papers, 2.1M citations
84% related
Convolutional neural network
74.7K papers, 2M citations
84% related
Feature (computer vision)
128.2K papers, 1.7M citations
84% related
Deep learning
79.8K papers, 2.1M citations
83% related
Support vector machine
73.6K papers, 1.7M citations
83% related
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2023  33
2022  89
2021  11
2020  16
2019  16
2018  38