Open Access · Journal Article · DOI

The Bitwise Hashing Trick for Personalized Search

TLDR
In this article, the use of feature bit vectors built with the hashing trick is introduced for improving relevance in personalized search and other personalization applications. Using a single bit per dimension instead of a floating-point value results in an order of magnitude decrease in data structure size while preserving or even improving quality.
Abstract
Many real-world problems require fast and efficient lexical comparison of large numbers of short text strings. Search personalization is one such domain. We introduce the use of feature bit vectors built with the hashing trick for improving relevance in personalized search and other personalization applications. We present results of several lexical hashing and comparison methods. These methods are applied to a user's historical behavior and are used to predict future behavior. Using a single bit per dimension instead of a floating-point value results in an order of magnitude decrease in data structure size, while preserving or even improving quality. We use real data to simulate a search personalization task. A simple method for combining bit vectors demonstrates an order of magnitude improvement in compute time on the task with only a small decrease in accuracy.
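
As a rough illustration of the approach described in the abstract, the sketch below hashes short text strings into fixed-width feature bit vectors (one bit per hashed dimension rather than a float), OR-combines a user's historical strings into a single profile vector, and scores candidate strings against it with a bitwise similarity. The vector width, tokenization, hash function, and Jaccard-style score are illustrative assumptions, not the paper's exact method.

import hashlib

NUM_BITS = 1024  # assumed vector width; the paper's dimensionality may differ


def hash_to_bits(text, num_bits=NUM_BITS):
    """Map a short text string to a feature bit vector via the hashing trick.

    Each token is hashed to a dimension index and that bit is set to 1,
    instead of accumulating a floating-point weight per dimension.
    """
    bits = 0
    for token in text.lower().split():
        idx = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % num_bits
        bits |= 1 << idx
    return bits


def combine(bit_vectors):
    """Combine a user's historical bit vectors into one profile with bitwise OR."""
    profile = 0
    for bv in bit_vectors:
        profile |= bv
    return profile


def jaccard(a, b):
    """Bitwise Jaccard similarity: |a AND b| / |a OR b|."""
    union = bin(a | b).count("1")
    return bin(a & b).count("1") / union if union else 0.0


# Example: score candidate strings against a profile built from past queries.
history = ["cheap flights to boston", "boston hotel deals"]
profile = combine(hash_to_bits(q) for q in history)
for candidate in ["boston flight schedule", "gardening tips"]:
    print(candidate, round(jaccard(profile, hash_to_bits(candidate)), 3))

Using arbitrary-precision Python integers as bit vectors keeps the example compact; a production system would use fixed-width bit arrays and hardware popcount, which is what makes bitwise comparison an order of magnitude cheaper than floating-point vector comparison.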


Citations
Proceedings Article · DOI

An Efficient and Accurate Detection of Fake News Using Capsule Transient Auto Encoder

TL;DR: Adaptive Capsule Transient Auto Encoder (ACTAE), as discussed by the authors, combines a capsule auto encoder classifier with an adaptive transient search optimization algorithm.
References
Book

Mining of Massive Datasets

Din J. Wasem
TL;DR: Determining relevant data is key to delivering value from massive amounts of data, and big data is defined less by volume, which is a constantly moving target, than by its ever-increasing variety, velocity, variability and complexity.
Posted Content

Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations

TL;DR: A binary matrix multiplication GPU kernel is presented with which the MNIST QNN runs 7 times faster than with an unoptimized GPU kernel, without any loss in classification accuracy.
Proceedings Article · DOI

Feature hashing for large scale multitask learning

TL;DR: In this article, the authors provide exponential tail bounds for feature hashing, show that the interaction between random subspaces is negligible with high probability, and demonstrate the feasibility of the approach with experimental results for a new use case.
Proceedings Article · DOI

Personalizing search via automated analysis of interests and activities

TL;DR: This research suggests that rich representations of the user and the corpus are important for personalization, but that it is possible to approximate these representations and provide efficient client-side algorithms for personalizing search.
Journal Article

Quantized neural networks: training neural networks with low precision weights and activations

TL;DR: In this paper, a method is introduced to train quantized neural networks (QNNs) with extremely low precision (e.g., 1-bit) weights and activations at run-time.