Open Access Proceedings ArticleDOI

Cuckoo feature hashing: Dynamic weight sharing for sparse analytics

TLDR
Experimental results on prediction tasks with hundreds of millions of features demonstrate that CCFH achieves the same level of performance using only 15%-25% of the parameters required by conventional feature hashing.
Abstract
Feature hashing is widely used to process large-scale sparse features for learning predictive models. Collisions inherently happen in the hashing process and hurt model performance. In this paper, we develop a feature hashing scheme called Cuckoo Feature Hashing (CCFH), based on the principle behind cuckoo hashing, a hashing scheme designed to resolve collisions. By providing multiple possible hash locations for each feature, CCFH prevents collisions between predictive features by dynamically hashing them into alternative locations during model training. Experimental results on prediction tasks with hundreds of millions of features demonstrate that CCFH achieves the same level of performance using only 15%-25% of the parameters required by conventional feature hashing.
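The core idea, as described above, is that each raw feature has more than one admissible hash location, and a predictive feature can move to an alternative location when its current slot is contested. The sketch below is only an illustration of that idea, not the authors' algorithm: the bin count, the two hash seeds, and the greedy resolve() policy are assumptions made for the example.

import hashlib

def bucket(feature, seed, num_bins):
    # Deterministic hash of a feature string into [0, num_bins).
    digest = hashlib.md5(f"{seed}:{feature}".encode()).hexdigest()
    return int(digest, 16) % num_bins

def candidate_slots(feature, num_bins):
    # Two alternative parameter locations for the same feature.
    return bucket(feature, 0, num_bins), bucket(feature, 1, num_bins)

def resolve(feature, num_bins, assignment, occupied):
    # Greedily place the feature in whichever of its two candidate slots
    # is still free; CCFH instead decides during training which predictive
    # features migrate to their alternative slot.
    if feature in assignment:
        return assignment[feature]
    first, second = candidate_slots(feature, num_bins)
    slot = first if first not in occupied else second
    assignment[feature] = slot
    occupied.add(slot)
    return slot

# Example: map a few raw features into a table of 1,000 weights.
assignment, occupied = {}, set()
slots = [resolve(f, 1000, assignment, occupied) for f in ["user:42", "item:7", "user:42"]]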



Citations
Posted Content

Mixed Dimension Embeddings with Application to Memory-Efficient Recommendation Systems

TL;DR: This work proposes mixed-dimension embedding layers, in which the dimension of a particular embedding vector can depend on the frequency of the item; this drastically reduces the memory requirement of the embedding while maintaining, and sometimes improving, ML performance.
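As a rough illustration of the frequency-dependent dimension idea, the hedged sketch below gives frequent items a wide embedding and rare items a narrow embedding plus a shared projection up to the common width. The threshold, dimensions, and function names are assumptions for the example, not the paper's configuration.

import numpy as np

rng = np.random.default_rng(0)
base_dim, small_dim, freq_threshold = 64, 16, 100  # illustrative values

def build_tables(item_counts):
    # item_counts: dict mapping item_id -> observed frequency.
    frequent = {i for i, c in item_counts.items() if c >= freq_threshold}
    wide = {i: rng.normal(scale=0.01, size=base_dim) for i in frequent}
    narrow = {i: rng.normal(scale=0.01, size=small_dim)
              for i in item_counts if i not in frequent}
    projection = rng.normal(scale=0.01, size=(small_dim, base_dim))
    return wide, narrow, projection

def embed(item, wide, narrow, projection):
    # Frequent items use the wide table directly; rare items are
    # projected from the narrow table to the common width.
    if item in wide:
        return wide[item]
    return narrow[item] @ projection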
Journal ArticleDOI

Towards Reliable Learning for High Stakes Applications.

TL;DR: This paper proposes an exploratory solution called GALVE (Generative Adversarial Learning with Variance Expansion), which adopts generative adversarial learning to implicitly measure the region where the model achieves good generalization performance, and achieves an error rate less than half of that obtained by straightforwardly measuring confidence on the CIFAR-10 and SVHN computer vision tasks.
Posted Content

PANDA: Facilitating Usable AI Development

TL;DR: A new perspective on developing AI solutions is taken, and a solution for making AI usable is presented that will enable all subject-matter experts (e.g., clinicians) to exploit AI in the way data scientists do.
Proceedings ArticleDOI

A New Feature Hashing Approach Based on Term Weight for Dimensional Reduction

TL;DR: This paper proposes a new feature hashing approach that hashes similar features to the same bin based on their weight, known as the "term weight", while minimizing collisions between dissimilar features, thus improving model performance.
References
Proceedings Article

Adam: A Method for Stochastic Optimization

TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
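For reference, a single Adam update on a parameter vector looks like the sketch below; the hyperparameter defaults shown are the commonly quoted values for the method, and the function name is illustrative.

import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * grad        # update biased first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # update biased second-moment estimate
    m_hat = m / (1 - beta1 ** t)              # bias-correct (t is the step count, starting at 1)
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v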
Journal ArticleDOI

R: A Language for Data Analysis and Graphics

TL;DR: In this article, the authors discuss their experience designing and implementing a statistical computing language, which combines what they felt were useful features from two existing computer languages, and they feel that the new language provides advantages in the areas of portability, computational efficiency, memory management, and scope.
Proceedings Article

Spark: cluster computing with working sets

TL;DR: Spark can outperform Hadoop by 10x in iterative machine learning jobs, and can be used to interactively query a 39 GB dataset with sub-second response time.
Proceedings Article

Locality Preserving Projections

TL;DR: Locality Preserving Projections are linear projective maps that arise from solving a variational problem that optimally preserves the neighborhood structure of the data set, by finding the optimal linear approximations to the eigenfunctions of the Laplace-Beltrami operator on the manifold.
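A minimal sketch of that construction, assuming a k-nearest-neighbor graph with heat-kernel weights: build the graph Laplacian and solve the generalized eigenproblem X^T L X a = lambda X^T D X a, keeping the eigenvectors with the smallest eigenvalues as the projection. The parameter names, neighborhood size, and the small regularizer are illustrative choices.

import numpy as np
from scipy.linalg import eigh

def lpp(X, n_components=2, k=5, t=1.0):
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
    idx = np.argsort(d2, axis=1)[:, 1:k + 1]               # k nearest neighbors (excluding self)
    W = np.zeros((n, n))
    for i in range(n):
        W[i, idx[i]] = np.exp(-d2[i, idx[i]] / t)          # heat-kernel edge weights
    W = np.maximum(W, W.T)                                  # symmetrize the graph
    D = np.diag(W.sum(axis=1))
    L = D - W                                               # graph Laplacian
    A = X.T @ L @ X
    B = X.T @ D @ X + 1e-8 * np.eye(X.shape[1])             # regularize for numerical stability
    vals, vecs = eigh(A, B)                                  # eigenvalues in ascending order
    return vecs[:, :n_components]                            # columns are the projection directions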
Proceedings ArticleDOI

Fisher discriminant analysis with kernels

TL;DR: In this article, a non-linear classification technique based on Fisher's discriminant is proposed; the main ingredient is the kernel trick, which allows the efficient computation of the Fisher discriminant in feature space.