Proceedings Article

Cuckoo Linear Algebra

Abstract
In this paper we present a novel data structure for sparse vectors based on Cuckoo hashing. It is highly memory efficient and allows random access at rates close to those of dense vectors. This allows us to solve sparse ℓ1 programming problems exactly and without preprocessing, at a cost identical to that of dense linear algebra in terms of both memory and speed. Our approach provides a feasible alternative to the hash kernel, and it excels whenever exact solutions are required, such as for feature selection.
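The abstract does not spell out the memory layout, so the following is only a minimal sketch of the general idea: a sparse vector whose nonzero entries live in a textbook two-table cuckoo hash, so that every read touches at most two slots. The class name `CuckooSparseVector`, the `dot` helper, the salted use of Python's built-in `hash`, and the grow-and-rehash policy are all assumptions made for illustration, not the authors' implementation.

```python
import random


class CuckooSparseVector:
    """Hypothetical sparse vector (int index -> float) backed by two cuckoo tables."""

    def __init__(self, capacity=16, max_kicks=32, seed=0):
        self._rng = random.Random(seed)
        self._max_kicks = max_kicks
        self._init_tables(capacity)

    def _init_tables(self, capacity):
        self._cap = capacity
        # Each slot holds None or an (index, value) pair.
        self._tables = [[None] * capacity, [None] * capacity]
        # Two fresh random salts stand in for two independent hash functions.
        self._salts = [self._rng.getrandbits(64), self._rng.getrandbits(64)]

    def _slot(self, which, index):
        return hash((self._salts[which], index)) % self._cap

    def __getitem__(self, index):
        # At most two probes; indices that are not stored read as zero.
        for t in (0, 1):
            entry = self._tables[t][self._slot(t, index)]
            if entry is not None and entry[0] == index:
                return entry[1]
        return 0.0

    def __setitem__(self, index, value):
        # Overwrite in place when the index is already stored.
        for t in (0, 1):
            s = self._slot(t, index)
            entry = self._tables[t][s]
            if entry is not None and entry[0] == index:
                self._tables[t][s] = (index, value)
                return
        self._insert((index, value))

    def _insert(self, item):
        t = 0
        for _ in range(self._max_kicks):
            s = self._slot(t, item[0])
            evicted = self._tables[t][s]
            self._tables[t][s] = item
            if evicted is None:
                return
            # Kick the previous occupant over to its slot in the other table.
            item = evicted
            t = 1 - t
        # Too many evictions: double the tables and rehash everything.
        self._rehash(item)

    def _rehash(self, pending):
        entries = [e for table in self._tables for e in table if e is not None]
        entries.append(pending)
        self._init_tables(2 * self._cap)
        for e in entries:
            self._insert(e)

    def items(self):
        for table in self._tables:
            for e in table:
                if e is not None:
                    yield e


def dot(sparse, dense):
    """Inner product of a CuckooSparseVector with an indexable dense vector."""
    return sum(v * dense[i] for i, v in sparse.items())
```

Unlike the hash kernel, which lets colliding features share a weight, this sketch stores the index alongside each value, so lookups are exact. It omits deletion and makes no attempt at the memory efficiency the paper claims.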

