Topic
Feature hashing
About: Feature hashing is a research topic. Over its lifetime, 993 publications have appeared within this topic, receiving 51,462 citations.
Papers
TL;DR: This paper proposes a supervised hashing method based on a well-designed deep convolutional neural network, which tries to learn hash codes and compact representations of data simultaneously.
Abstract: Hashing-based methods seek compact and efficient binary codes that preserve the neighborhood structure in the original data space. For most existing hashing methods, an image is first encoded as a vector of hand-crafted visual features, followed by a hash projection and quantization step to get the compact binary vector. Most hand-crafted features encode only the low-level information of the input, so they may not preserve the semantic similarities of image pairs. Meanwhile, the hash function learning process is independent of the feature representation, so the features may not be optimal for the hash projection. In this paper, we propose a supervised hashing method based on a well-designed deep convolutional neural network, which tries to learn hash codes and compact representations of data simultaneously. The proposed model learns the binary codes by adding a compact sigmoid layer before the loss layer. Experiments on several image data sets show that the proposed model outperforms other state-of-the-art methods.
6 citations
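The entry above describes learning binary codes through a compact sigmoid layer. A minimal sketch of how such layer outputs could be turned into codes and compared is shown here; the 0.5 threshold, the function names, and the example activations are assumptions for illustration and are not taken from the paper, whose network is not reproduced.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic function, squashing a real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def binarize(pre_activations, threshold=0.5):
    """Turn real-valued layer inputs into a binary code.

    Assumption: bits are obtained by thresholding the sigmoid output at 0.5.
    """
    return [1 if sigmoid(v) >= threshold else 0 for v in pre_activations]

def hamming(a, b):
    """Number of positions where two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))

# Hypothetical pre-activations of a 6-unit sigmoid layer for three images.
query = binarize([2.1, -0.7, 1.3, -3.0, 0.2, -1.1])
near = binarize([1.8, -0.2, 0.9, -2.5, -0.4, -0.9])
far = binarize([-1.5, 2.2, -0.8, 1.9, -0.6, 1.4])

print(query, hamming(query, near), hamming(query, far))
```

Retrieval with such codes would rank database items by Hamming distance to the query code, which is why preserving neighborhood structure in the codes matters.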
08 Sep 2015
TL;DR: TF-IDF-CF is chosen as the feature selection method, and an accuracy of 98.2612 with an F-measure of 0.9841 is obtained, which demonstrates the effectiveness of the proposed scheme.
Abstract: Building an efficient email spam filtering system that selects relevant features to reduce dimensionality has become a pivotal aspect of machine-learning-based spam filtering. To deal with noisy features, TF-IDF-CF is chosen as the feature selection method in this study. The selected relevant feature sets are submitted to LibSVM and MNB classifiers to construct ham and spam models. An accuracy of 98.2612 with an F-measure of 0.9841 is obtained, which demonstrates the effectiveness of the proposed scheme.
6 citations
09 Jan 2014
TL;DR: This study focuses on the second group of hashing algorithms and criticizes the hashing algorithms using Feistel networks, which are widely utilized in text mining studies. It proposes a new approach that is mainly built on substitution boxes (sboxes) and processes text faster than the other implementations.
Abstract: This study focuses on the second group of hashing algorithms and criticizes the hashing algorithms using Feistel networks, which are widely utilized in text mining studies. We propose a new approach that is mainly built on substitution boxes (sboxes), which are at the core of all Feistel networks, and that processes text faster than the other implementations.
6 citations
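The entry above builds its hash on substitution boxes. As a rough illustration of the general idea only, the sketch below shows a generic table-lookup hash: each input byte indexes a 256-entry table of 32-bit words that is mixed into the running state. This is not the paper's algorithm; the table contents (seeded pseudo-random values), the multiply-by-3 mixing step, and the names `make_sbox`, `sbox_hash`, and `bucket` are all assumptions.

```python
import random

def make_sbox(seed: int = 0):
    """Build a hypothetical 256-entry substitution table of 32-bit words."""
    rng = random.Random(seed)
    return [rng.getrandbits(32) for _ in range(256)]

SBOX = make_sbox()

def sbox_hash(text: str, sbox=SBOX) -> int:
    """Hash a string with one table lookup, one XOR and one multiply per byte."""
    h = 0
    for byte in text.encode("utf-8"):
        h ^= sbox[byte]            # substitute the byte through the table
        h = (h * 3) & 0xFFFFFFFF   # cheap mixing, kept to 32 bits
    return h

def bucket(token: str, n_buckets: int = 1024) -> int:
    """Map a text token to a feature index, as in feature hashing."""
    return sbox_hash(token) % n_buckets

print(bucket("spam"), bucket("ham"))
```

The per-byte cost is a lookup plus two cheap integer operations, which is the kind of property that makes table-driven hashes attractive for high-volume text processing.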
14 Jul 2014
TL;DR: This paper proposes a cross-media hashing approach based on kernel regression (abbreviated as KRCMH) to obtain the hash codes for the data objects across different modalities and achieves superior cross-media retrieval performance compared with the state-of-the-art methods.
Abstract: Cross-media retrieval is a challenging problem in the multimedia retrieval area. In the real world, many applications involve multi-modal data, e.g., web pages containing both images and texts. The core of cross-media retrieval is how to utilize the intrinsic intra-modality and inter-modality similarity to learn the appropriate relationships among the data objects and provide efficient search across different modalities. Because hashing methods address the fast retrieval problem well in large-scale data settings, a cross-media hashing approach that can perform efficient retrieval over heterogeneous high-dimensional feature spaces is highly desirable. In this paper, we propose a cross-media hashing approach based on kernel regression (abbreviated as KRCMH) to obtain the hash codes for the data objects across different modalities. The experiments on two real-world data sets show that KRCMH achieves superior cross-media retrieval performance compared with the state-of-the-art methods.
6 citations
18 Sep 2005
TL;DR: This work investigates the class of hash functions based on checksums to encode the type signatures of MPI datatypes and finds that hash functions based on Galois fields enable good hashing, computation of the signature of unidatatype in $\mathcal{O}(1)$, and computation of the concatenation of two datatypes in $\mathcal{O}(1)$.
Abstract: Detecting misuse of datatypes in application code is a desirable feature for an MPI library. To support this goal, we investigate the class of hash functions based on checksums to encode the type signatures of MPI datatypes. The quality of these hash functions is assessed in terms of hashing, timing, and comparison with other functions published for this particular problem (Gropp, 7th European PVM/MPI Users’ Group Meeting, 2000) or for other applications (CRCs). In particular, hash functions based on Galois fields enable good hashing, computation of the signature of unidatatype in $\mathcal{O}(1)$, and computation of the concatenation of two datatypes in $\mathcal{O}(1)$.
6 citations
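The entry above reports that the signature of two concatenated datatypes can be computed in constant time. The sketch below illustrates that property with a swapped-in technique, a polynomial hash over the prime field GF(p), rather than the paper's own construction. The modulus, the evaluation point `X`, the integer type codes, and the function names are hypothetical.

```python
P = (1 << 61) - 1   # a Mersenne prime, so integers mod P form the field GF(P)
X = 1_000_003       # arbitrary evaluation point (assumption)

# Hypothetical integer codes for a few basic types.
INT, DOUBLE, CHAR = 1, 2, 3

def signature(type_codes):
    """Return (hash, X**length mod P) for a sequence of basic type codes."""
    h, pw = 0, 1
    for c in type_codes:
        h = (h * X + c) % P
        pw = (pw * X) % P
    return h, pw

def concat(sig_a, sig_b):
    """Signature of A followed by B, using O(1) field operations."""
    ha, pa = sig_a
    hb, pb = sig_b
    return (ha * pb + hb) % P, (pa * pb) % P

sender = signature([INT, INT, DOUBLE, CHAR])
receiver = concat(signature([INT, INT]), signature([DOUBLE, CHAR]))
swapped = concat(signature([DOUBLE, CHAR]), signature([INT, INT]))

print(sender == receiver, sender == swapped)
```

Storing the power of `X` alongside the hash is what makes concatenation constant-time: the left signature is shifted by the length of the right one without revisiting any individual type. A mismatch between sender and receiver signatures would then flag a likely datatype misuse.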