Dan Alistarh
Researcher at Institute of Science and Technology Austria
Publications - 213
Citations - 4887
Dan Alistarh is an academic researcher at the Institute of Science and Technology Austria. He has contributed to research topics including computer science and stochastic gradient descent, has an h-index of 27, and has co-authored 175 publications receiving 3,761 citations. His previous affiliations include ETH Zurich and Microsoft.
Papers
Proceedings Article
QSGD: Communication-Efficient SGD via Gradient Quantization and Encoding
TL;DR: Quantized SGD (QSGD) is a family of compression schemes for gradient updates that provides convergence guarantees for convex and non-convex objectives, including under asynchrony, and can be extended to stochastic variance-reduced techniques.
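The core of QSGD is stochastic quantization of each gradient onto a small number of uniform levels, rounded up or down at random so the compressed gradient remains unbiased. A minimal NumPy sketch of this idea (the function name and defaults are our own, and QSGD's compact lossless encoding of the result is omitted):

```python
import numpy as np

def qsgd_quantize(v, s=4, rng=None):
    """Stochastically quantize v onto s uniform levels of |v_i| / ||v||.

    Each coordinate is rounded up with probability equal to its
    fractional part, so the quantizer is unbiased: E[Q(v)] = v.
    """
    rng = rng or np.random.default_rng(0)
    norm = np.linalg.norm(v)
    if norm == 0.0:
        return np.zeros_like(v)
    scaled = np.abs(v) / norm * s             # each entry lies in [0, s]
    lower = np.floor(scaled)
    # round up with probability equal to the fractional part
    level = lower + (rng.random(v.shape) < scaled - lower)
    return norm * np.sign(v) * level / s
```

Because the quantizer is unbiased, averaging many quantized copies of a gradient recovers the original, which is what makes plain SGD convergence analyses carry over.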
Posted Content
QSGD: Communication-Efficient SGD via Gradient Quantization and Encoding
TL;DR: Proposes Quantized SGD (QSGD), a family of compression schemes for gradient updates that provides convergence guarantees, leads to significant reductions in end-to-end training time, and can be extended to stochastic variance-reduced techniques.
Posted Content
Model compression via distillation and quantization
TL;DR: Proposes quantized distillation and differentiable quantization, which optimizes the locations of quantization points through stochastic gradient descent to better fit the behavior of the teacher model, and shows that quantized, shallow students can reach accuracy levels similar to those of full-precision teacher models.
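The "learn where the quantization points sit via SGD" idea can be illustrated in one dimension. The sketch below is a deliberate simplification: it moves points to minimize squared reconstruction error of the weights, whereas the paper fits points to mimic a teacher model's behavior; the function name and hyperparameters are our own.

```python
import numpy as np

def fit_quant_points(weights, num_points=4, lr=0.05, steps=2000, seed=0):
    """Learn 1-D quantization-point locations by SGD.

    At each step, sample a weight, find its nearest quantization point,
    and nudge that point toward the weight (gradient of (p_j - w)^2).
    """
    rng = np.random.default_rng(seed)
    points = rng.uniform(weights.min(), weights.max(), num_points)
    for _ in range(steps):
        w = rng.choice(weights)                 # sample a weight
        j = np.argmin(np.abs(points - w))       # nearest quantization point
        points[j] -= lr * 2.0 * (points[j] - w) # SGD step on (p_j - w)^2
    return np.sort(points)
```

After training, each weight is replaced by its nearest learned point, so the points drift toward the clusters where the weights actually live rather than sitting on a fixed uniform grid.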
Proceedings Article
The Convergence of Sparsified Gradient Methods
Dan Alistarh, Torsten Hoefler, Mikael Johansson, Nikola Konstantinov, Sarit Khirirat, Cedric Renggli +5 more
TL;DR: The authors show that sparsifying gradients by magnitude, with local error correction, provides convergence guarantees for data-parallel SGD on both convex and non-convex smooth objectives.
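"Sparsifying by magnitude with local error correction" means each worker transmits only its top-k coordinates and keeps the discarded remainder in a local memory that is folded into the next gradient, so nothing is permanently lost. A minimal sketch of one worker's step (the function name and interface are our own):

```python
import numpy as np

def topk_with_error_feedback(grad, memory, k):
    """Sparsify a corrected gradient by magnitude with error feedback.

    grad:   current stochastic gradient
    memory: locally accumulated residual from earlier steps
    k:      number of coordinates to keep
    Returns (sparse update to transmit, new local residual).
    """
    acc = grad + memory                  # fold in the past residual
    keep = np.argsort(np.abs(acc))[-k:]  # indices of the top-k entries
    sparse = np.zeros_like(acc)
    sparse[keep] = acc[keep]
    return sparse, acc - sparse          # remainder carried forward
```

The residual term is exactly what the convergence analysis tracks: because dropped mass re-enters later steps, the scheme behaves like delayed (rather than lossy) SGD.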
Proceedings Article
ZipML: Training Linear Models with End-to-End Low Precision, and a Little Bit of Deep Learning
TL;DR: The ZipML framework executes training at low precision with no bias, guaranteeing convergence, whereas naive quantization would introduce significant bias; it enables an FPGA prototype that is up to 6.5× faster than an implementation using full 32-bit precision.