
Dan Alistarh

Researcher at Institute of Science and Technology Austria

Publications: 213
Citations: 4,887

Dan Alistarh is an academic researcher at the Institute of Science and Technology Austria. His research focuses on computer science and stochastic gradient descent. He has an h-index of 27 and has co-authored 175 publications receiving 3,761 citations. Previous affiliations of Dan Alistarh include ETH Zurich and Microsoft.

Papers
Proceedings Article

QSGD: Communication-Efficient SGD via Gradient Quantization and Encoding

TL;DR: Quantized SGD (QSGD) is a family of compression schemes for gradient updates that provides convergence guarantees for both convex and nonconvex objectives, even under asynchrony, and can be extended to stochastic variance-reduced techniques.
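The core of QSGD is unbiased stochastic quantization of each gradient coordinate to a small number of levels. The sketch below is a minimal illustration of that idea, not the paper's full scheme (which also covers the encoding of the quantized values); the function name and default of `s = 4` levels are assumptions for the example.

```python
import numpy as np

def qsgd_quantize(v, s=4, rng=None):
    """Stochastically quantize v to s levels per coordinate (illustrative sketch).

    Each coordinate is rounded up or down to an adjacent level with a
    probability chosen so that the quantized vector is unbiased: E[q(v)] = v.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    norm = np.linalg.norm(v)
    if norm == 0.0:
        return np.zeros_like(v)
    ratio = np.abs(v) / norm * s          # position of |v_i| in [0, s]
    lower = np.floor(ratio)               # nearest level below
    round_up = rng.random(v.shape) < (ratio - lower)
    level = lower + round_up
    return np.sign(v) * norm * level / s
```

Because only the sign, the shared norm, and a small integer level per coordinate need to be transmitted, the communicated gradient is much cheaper than 32-bit floats while remaining an unbiased estimate.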
Posted Content

Model compression via distillation and quantization

TL;DR: This paper proposes quantized distillation and differentiable quantization, which optimizes the locations of quantization points through stochastic gradient descent to better fit the behavior of the teacher model, and shows that quantized shallow students can reach accuracy levels similar to those of full-precision teacher models.
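Distillation here means training the (quantized) student against the teacher's softened output distribution. The following is a minimal numpy sketch of a temperature-scaled distillation loss, assuming the usual KL formulation; the function names are hypothetical and this is not the paper's exact training objective.

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Mean KL(teacher || student) on temperature-softened outputs.

    The T^2 factor is the standard rescaling so gradient magnitudes stay
    comparable across temperatures.
    """
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    kl = np.sum(p * (np.log(p) - np.log(q)))
    return float(kl * T * T / student_logits.shape[0])
```

A higher temperature T spreads the teacher's probability mass over more classes, exposing the "dark knowledge" in its near-miss predictions that a hard-label loss would discard.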
Proceedings Article

The Convergence of Sparsified Gradient Methods

TL;DR: The authors showed that sparsifying gradients by magnitude with local error correction provides convergence guarantees, for both convex and non-convex smooth objectives, for data-parallel SGD.
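The "local error correction" in the summary refers to each worker accumulating the coordinates it did not transmit and adding them back before the next selection. A minimal sketch of magnitude-based top-k selection with this error feedback, under assumed names, might look like:

```python
import numpy as np

def topk_with_error_feedback(grad, error, k):
    """Transmit only the k largest-magnitude entries of grad + error.

    The untransmitted remainder is returned as the new local error and
    folded into the next step, so no gradient mass is permanently lost.
    """
    acc = grad + error                              # corrected gradient
    idx = np.argpartition(np.abs(acc), -k)[-k:]     # k largest magnitudes
    sparse = np.zeros_like(acc)
    sparse[idx] = acc[idx]                          # transmitted part
    new_error = acc - sparse                        # kept locally
    return sparse, new_error
```

The invariant `sparse + new_error == grad + error` is what makes the convergence analysis go through: the compression error is delayed, not discarded.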
Proceedings Article

ZipML: Training Linear Models with End-to-End Low Precision, and a Little Bit of Deep Learning

TL;DR: The ZipML framework executes training at low precision with no bias, guaranteeing convergence, whereas naive quantization would introduce significant bias; it also enables an FPGA prototype that is up to 6.5× faster than an implementation using full 32-bit precision.
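The reason naive quantization biases training is that deterministic rounding systematically shifts values toward grid points. A standard remedy, and the building block behind unbiased low-precision schemes of this kind, is stochastic rounding; the sketch below (hypothetical function name, not ZipML's full double-sampling construction) rounds to a fixed grid so the rounded value equals the input in expectation.

```python
import numpy as np

def stochastic_round(x, step=0.25, rng=None):
    """Round x to a grid of spacing `step`, up or down at random.

    The round-up probability is the fractional distance to the upper grid
    point, which makes the rounding unbiased: E[stochastic_round(x)] = x.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    scaled = np.asarray(x, dtype=float) / step
    lower = np.floor(scaled)
    round_up = rng.random(np.shape(scaled)) < (scaled - lower)
    return (lower + round_up) * step
```

Deterministic rounding of 0.1 to a 0.25 grid always yields 0.0; stochastic rounding yields 0.25 with probability 0.4, so the average stays at 0.1 and the bias vanishes.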