SciSpace
Topic

Feature hashing

About: Feature hashing (the "hashing trick") vectorizes features by applying a hash function to each feature and using the hash value as a direct index into a fixed-size vector, avoiding the need to store an explicit vocabulary. Over the lifetime, 993 publications have been published within this topic, receiving 51,462 citations.
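The core idea behind the topic can be sketched in a few lines. The snippet below is a generic illustration of the signed hashing trick, not taken from any particular paper listed here; the function name and parameters are our own.

```python
import hashlib

# Minimal sketch of the hashing trick: token features are mapped into a
# fixed-size vector by hashing, so no vocabulary has to be stored. The
# signed variant shown here makes collisions cancel in expectation.
# All names are illustrative.

def hash_features(tokens, dim=16):
    vec = [0.0] * dim
    for tok in tokens:
        # md5 is used only as a cheap deterministic hash, not for security
        h = int(hashlib.md5(tok.encode()).hexdigest(), 16)
        idx = h % dim                                  # bucket the feature lands in
        sign = 1.0 if (h >> 100) & 1 == 0 else -1.0    # roughly independent sign bit
        vec[idx] += sign
    return vec

v = hash_features(["the", "cat", "sat", "the"])
```

Because the vector size is fixed in advance, memory cost is independent of how many distinct features appear in the data; the price is occasional collisions.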


Papers
Patent
Yadong Mu, Zhu Liu
05 Jun 2015
TL;DR: This patent describes a method for training a neural network by iteratively adjusting its parameters based on the concurrent application of multiple loss functions, a classification loss and a hashing loss, to a subset of training images.
Abstract: A method includes receiving, at a neural network, a subset of images of a plurality of images of a training image set. The method includes training the neural network by iteratively adjusting parameters of the neural network based on concurrent application of multiple loss functions to the subset of images. The multiple loss functions include a classification loss function and a hashing loss function. The classification loss function is associated with an image classification function that extracts image features from an image. The hashing loss function is associated with a hashing function that generates a hash code for the image.

24 citations
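A hedged sketch of the idea in the patent's abstract: two losses applied concurrently to the same network outputs. The specific loss forms (softmax cross-entropy, a quantization penalty pushing codes toward ±1) and the weighting `lam` are our assumptions; the abstract does not specify them.

```python
import math

# Illustrative combination of a classification loss and a hashing loss,
# as the patent's abstract describes. All function names are ours.

def cls_loss(logits, label):
    """Softmax cross-entropy for one example."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    return -math.log(exps[label] / sum(exps))

def hash_loss(codes):
    """Quantization penalty pushing real-valued codes toward +/-1."""
    return sum((abs(c) - 1.0) ** 2 for c in codes) / len(codes)

def combined_loss(logits, label, codes, lam=0.5):
    # Both losses are applied concurrently; lam balances them (assumed).
    return cls_loss(logits, label) + lam * hash_loss(codes)

total = combined_loss([2.0, 0.5, -1.0], 0, [0.9, -1.1, 0.2])
```

During training, gradients of this combined objective would update the shared network parameters, so the learned features serve both classification and hashing.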

Proceedings ArticleDOI
TL;DR: In this article, the authors propose an end-to-end approach to reduce the size of the embedding tables by exploiting complementary partitions of the category set to produce a unique embedding vector for each category without explicit definition.
Abstract: Modern deep learning-based recommendation systems exploit hundreds to thousands of different categorical features, each with millions of different categories ranging from clicks to posts. To respect the natural diversity within the categorical data, embeddings map each category to a unique dense representation within an embedded space. Since each categorical feature could take on as many as tens of millions of different possible categories, the embedding tables form the primary memory bottleneck during both training and inference. We propose a novel approach for reducing the embedding size in an end-to-end fashion by exploiting complementary partitions of the category set to produce a unique embedding vector for each category without explicit definition. By storing multiple smaller embedding tables based on each complementary partition and combining embeddings from each table, we define a unique embedding for each category at smaller memory cost. This approach may be interpreted as using a specific fixed codebook to ensure uniqueness of each category's representation. Our experimental results demonstrate the effectiveness of our approach over the hashing trick for reducing the size of the embedding tables in terms of model loss and accuracy, while retaining a similar reduction in the number of parameters.

24 citations
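The complementary-partition idea in the abstract above can be illustrated with its simplest instance, a quotient-remainder decomposition: two small tables jointly give each of N categories a distinct embedding. The table sizes, the elementwise-product combiner, and all names are illustrative assumptions, not the paper's exact configuration.

```python
import random

random.seed(0)
N, m, d = 1000, 32, 4    # categories, remainder-table size, embedding dim

# Two small tables replace one N-row table: ceil(N/m) + m rows instead of N.
q_table = [[random.random() for _ in range(d)] for _ in range((N + m - 1) // m)]
r_table = [[random.random() for _ in range(d)] for _ in range(m)]

def embed(cat):
    """Combine the quotient row and remainder row (elementwise product here;
    concatenation or addition are other possible combiners)."""
    q, r = cat // m, cat % m
    return [a * b for a, b in zip(q_table[q], r_table[r])]
```

Every category maps to a distinct (quotient, remainder) pair, so each gets a unique embedding while the parameter count drops from N·d to roughly (N/m + m)·d.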

01 Jan 2004
TL;DR: The finding is that ELFhash, a well-known function for hashing sequences of symbols, performs poorly on these metrics, and the other two functions are better and thus recommended.
Abstract: Hashing large collections of URLs is an inevitable problem in many Web research activities. Through a large-scale experiment, three hash functions are compared in this paper. Two metrics were developed for the comparison, related to Web structure analysis and Web crawling, respectively. The finding is that ELFhash, a well-known function for hashing sequences of symbols, performs poorly on these metrics, while the other two functions fare better and are therefore recommended.

24 citations
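Of the three hash functions compared, only ELFhash is named in the abstract; the other two are not identified, so only ELFhash is sketched here. Below is a Python transcription of the classic 32-bit ELF object-file hash:

```python
def elf_hash(s: str) -> int:
    """The classic ELF hash for symbol strings (32-bit arithmetic)."""
    h = 0
    for ch in s.encode():
        h = ((h << 4) + ch) & 0xFFFFFFFF
        high = h & 0xF0000000
        if high:
            h ^= high >> 24
        h &= (~high) & 0xFFFFFFFF   # clear the top nibble each round
    return h
```

Note that the top nibble is cleared on every iteration, so ELFhash only ever produces 28-bit values, one structural reason it can distribute long, similar strings such as URLs unevenly.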

Proceedings Article
19 Jun 2011
TL;DR: This paper proposes a novel (semi-)supervised hashing method named Semi-Supervised SimHash (S3H) for high-dimensional data similarity search that learns the optimal feature weights from prior knowledge to relocate the data such that similar data have similar hash codes.
Abstract: Searching for documents similar to a query document is an important component of modern information retrieval. Some existing hashing methods can be used for efficient document similarity search. However, unsupervised hashing methods cannot incorporate prior knowledge for better hashing, and although some supervised hashing methods can derive effective hash functions from prior knowledge, they are either computationally expensive or poorly discriminative. This paper proposes a novel (semi-)supervised hashing method named Semi-Supervised SimHash (S3H) for high-dimensional data similarity search. The basic idea of S3H is to learn optimal feature weights from prior knowledge to relocate the data such that similar data have similar hash codes. We evaluate our method against several state-of-the-art methods on two large datasets. All results show that our method achieves the best performance.

24 citations
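S3H builds on SimHash, whose unsupervised core can be sketched briefly: each bit is the sign of the feature vector's projection onto a random hyperplane. The learned feature weights that distinguish S3H are omitted here; the `weights` argument merely marks where they would enter. All names are illustrative.

```python
import random

random.seed(0)

def simhash(features, weights, planes):
    """features: {feature_index: value}; weights: learned per-feature weights
    (empty dict = unweighted SimHash); planes: random hyperplane normals."""
    bits = []
    for plane in planes:
        proj = sum(val * weights.get(i, 1.0) * plane[i]
                   for i, val in features.items())
        bits.append(1 if proj >= 0 else 0)
    return bits

dim, nbits = 8, 4
planes = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(nbits)]
doc = {0: 1.0, 3: 2.0, 5: 0.5}   # sparse document vector (illustrative)
code = simhash(doc, {}, planes)
```

Similar documents tend to fall on the same side of most hyperplanes, so their codes differ in few bits; weighting the features, as S3H does, reshapes which documents count as "similar".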

Proceedings ArticleDOI
28 Feb 2017
TL;DR: Experimental results on three benchmark datasets show that the binary hash codes generated by the proposed method achieve superior performance over other state-of-the-art hashing methods.
Abstract: With the increasing amount of image data, existing image retrieval methods suffer from several drawbacks, such as weakly expressive visual features, high feature dimensionality, and low retrieval precision. To solve these problems, a method for learning binary hash codes based on deep convolutional neural networks is proposed. The basic idea is to add a hash layer to the deep learning framework and simultaneously learn image features and hash functions, where the hash functions should satisfy independence and minimize quantization error. First, a convolutional neural network is employed to learn the intrinsic implications of the training images so as to improve the discriminative and expressive ability of the visual features. Second, the visual features are fed into the hash layer, in which the hash functions are learned; the learned hash functions should minimize classification and quantization error while satisfying the independence constraint. Finally, given an input image, hash codes are generated from the output of the hash layer, and large-scale image retrieval can be accomplished in a low-dimensional Hamming space. Experimental results on three benchmark datasets show that the binary hash codes generated by the proposed method achieve superior performance over other state-of-the-art hashing methods.

24 citations
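The final retrieval step the abstract describes, thresholding the hash layer's output and comparing codes in Hamming space, can be sketched as follows. The sigmoid activation and 0.5 threshold are assumptions; the paper's exact hash layer may differ.

```python
import math

# Binarizing a hash layer's real-valued output for Hamming-space retrieval.
# All function names and the example activations are illustrative.

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def binarize(activations, threshold=0.5):
    """Squash activations into (0, 1), then threshold to binary bits."""
    return [1 if sigmoid(a) >= threshold else 0 for a in activations]

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

code_q = binarize([2.1, -0.7, 0.3, -1.9])   # query image's hash-layer output
code_db = binarize([1.8, -0.2, 0.4, 1.1])   # a database image's output
print(hamming(code_q, code_db))             # prints 1
```

Ranking database images by Hamming distance to the query code is what makes large-scale retrieval cheap: the codes are short and the distance is a popcount.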


Network Information

Related Topics (5)
- Feature extraction: 111.8K papers, 2.1M citations (84% related)
- Convolutional neural network: 74.7K papers, 2M citations (84% related)
- Feature (computer vision): 128.2K papers, 1.7M citations (84% related)
- Deep learning: 79.8K papers, 2.1M citations (83% related)
- Support vector machine: 73.6K papers, 1.7M citations (83% related)
Performance Metrics

No. of papers in the topic in previous years:

Year    Papers
2023    33
2022    89
2021    11
2020    16
2019    16
2018    38