
Dan Alistarh

Researcher at Institute of Science and Technology Austria

Publications - 213
Citations - 4887

Dan Alistarh is an academic researcher at the Institute of Science and Technology Austria. He has contributed to research topics including computer science and stochastic gradient descent, has an h-index of 27, and has co-authored 175 publications receiving 3,761 citations. His previous affiliations include ETH Zurich and Microsoft.

Papers
Posted Content

Compressive Sensing with Low Precision Data Representation: Theory and Applications

TL;DR: Provides a theoretical analysis of the Iterative Hard Thresholding (IHT) algorithm when all input data, i.e., the measurement matrix and the observations, are quantized aggressively to as little as 2 bits per value, and shows that a low-precision variant of IHT can still provide recovery guarantees.
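
To illustrate the setting, here is a minimal NumPy sketch of iterative hard thresholding run on aggressively quantized inputs; the uniform quantizer, step size, and function names are illustrative assumptions, not the paper's exact construction or analysis.

```python
import numpy as np

def quantize(x, bits=2):
    # Uniform quantizer onto 2**bits levels over the data range
    # (illustrative; the paper's quantization scheme is more refined).
    levels = 2 ** bits - 1
    lo, hi = x.min(), x.max()
    scale = (hi - lo) / levels if hi > lo else 1.0
    return lo + np.round((x - lo) / scale) * scale

def hard_threshold(x, k):
    # Keep the k largest-magnitude entries of a vector, zero out the rest.
    out = np.zeros_like(x)
    idx = np.argsort(np.abs(x))[-k:]
    out[idx] = x[idx]
    return out

def low_precision_iht(A, y, k, iters=100, step=1.0, bits=2):
    # IHT run on quantized inputs: both the measurement matrix and the
    # observations are stored at low precision before iterating.
    Aq, yq = quantize(A, bits), quantize(y, bits)
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = Aq.T @ (yq - Aq @ x)          # gradient of 0.5 * ||yq - Aq x||^2
        x = hard_threshold(x + step * grad, k)
    return x
```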
Journal ArticleDOI

Error Feedback Can Accurately Compress Preconditioners

TL;DR: EFCP compresses the gradient information via sparsification or low-rank compression and feeds the compression error back into future iterations, which allows full-matrix preconditioners to be compressed by up to two orders of magnitude in practice.
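
As a rough illustration of the error-feedback idea referenced above, the sketch below applies top-k sparsification with an error buffer that is carried into future steps; the class and function names are hypothetical, and this does not reproduce the paper's full-matrix preconditioner pipeline.

```python
import numpy as np

def top_k(x, k):
    # Sparsify an array: keep the k largest-magnitude entries, zero the rest.
    flat = x.ravel()
    out = np.zeros_like(flat)
    idx = np.argsort(np.abs(flat))[-k:]
    out[idx] = flat[idx]
    return out.reshape(x.shape)

class ErrorFeedbackCompressor:
    # Generic error-feedback loop: compress the incoming statistic plus the
    # accumulated compression error, then carry the residual forward.
    def __init__(self, shape, k):
        self.error = np.zeros(shape)
        self.k = k

    def compress(self, g):
        corrected = g + self.error
        compressed = top_k(corrected, self.k)
        self.error = corrected - compressed   # residual fed into the next step
        return compressed
```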
Journal ArticleDOI

Hybrid Decentralized Optimization: First- and Zeroth-Order Optimizers Can Be Jointly Leveraged For Faster Convergence

TL;DR: This work shows that, under reasonable parameter settings, a hybrid decentralized optimization system can not only withstand noisier zeroth-order agents but can even benefit from integrating such agents into the optimization process, rather than ignoring their information.
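
A minimal sketch of how one such hybrid round might look, assuming some agents hold gradient oracles while others can only evaluate the objective (via a two-point zeroth-order estimator), followed by gossip averaging over a mixing matrix W; all names and parameter choices here are illustrative, not the paper's protocol.

```python
import numpy as np

def zo_gradient(f, x, mu=1e-4, samples=10):
    # Two-point zeroth-order estimator: probes the objective along random
    # directions, for agents that can only evaluate f, not its gradient.
    g = np.zeros_like(x)
    for _ in range(samples):
        u = np.random.randn(*x.shape)
        g += (f(x + mu * u) - f(x - mu * u)) / (2.0 * mu) * u
    return g / samples

def hybrid_round(params, oracles, first_order, W, lr=0.05):
    # One synchronous round: each agent takes a local step with whichever
    # oracle it has, then gossip-averages its iterate with its neighbours
    # via the (doubly stochastic) mixing matrix W.
    stepped = []
    for i, x in enumerate(params):
        g = oracles[i](x) if first_order[i] else zo_gradient(oracles[i], x)
        stepped.append(x - lr * g)
    return [sum(W[i, j] * stepped[j] for j in range(len(params)))
            for i in range(len(params))]
```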
Posted Content

A Formally-Verified Framework for Fair Synchronization in Kotlin Coroutines.

TL;DR: The CancellableQueueSynchronizer (CQS) is a framework that enables efficient, fair, and abortable implementations of fundamental synchronization primitives such as mutexes, semaphores, barriers, count-down latches, and blocking pools.
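
For intuition about the kind of primitive CQS targets, here is a small asyncio sketch of a fair (FIFO) and abortable semaphore; this is purely illustrative Python and is not the CQS algorithm, which is designed for Kotlin coroutines and formally verified.

```python
import asyncio
from collections import deque

class FairSemaphore:
    # Illustrative fair, abortable semaphore: waiters queue in FIFO order and
    # a cancelled waiter withdraws its request instead of leaking a permit.
    def __init__(self, permits):
        self._permits = permits
        self._waiters = deque()

    async def acquire(self):
        if self._permits > 0 and not self._waiters:
            self._permits -= 1
            return
        fut = asyncio.get_running_loop().create_future()
        self._waiters.append(fut)
        try:
            await fut                           # suspend until release() resumes us
        except asyncio.CancelledError:
            if fut.done() and not fut.cancelled():
                self.release()                  # permit already granted: pass it on
            else:
                try:
                    self._waiters.remove(fut)   # abort: withdraw from the queue
                except ValueError:
                    pass
            raise

    def release(self):
        while self._waiters:
            fut = self._waiters.popleft()       # oldest waiter first: FIFO fairness
            if not fut.done():                  # skip waiters that already aborted
                fut.set_result(None)
                return
        self._permits += 1
```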
Posted Content

NUQSGD: Provably Communication-efficient Data-parallel SGD via Nonuniform Quantization

TL;DR: The authors propose NUQSGD, a new nonuniform gradient quantization scheme for data-parallel stochastic gradient descent that has stronger theoretical guarantees than QSGD and matches or exceeds the empirical performance of the QSGDinf heuristic and of other compression methods.
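
The sketch below shows nonuniform (exponentially spaced) stochastic gradient quantization in NumPy, to illustrate the idea behind such a scheme; the exact level structure, encoding, and variance bounds in the paper differ, and all names here are illustrative.

```python
import numpy as np

def nonuniform_quantize(v, s=3):
    # Quantize the normalized magnitudes of a gradient vector onto
    # exponentially spaced levels {0, 2**-(s-1), ..., 1/2, 1}.
    norm = np.linalg.norm(v)
    if norm == 0:
        return v
    levels = np.array([0.0] + [2.0 ** (-i) for i in range(s - 1, -1, -1)])
    r = np.abs(v) / norm
    # Bracket each magnitude between two adjacent levels and round
    # stochastically, so the quantizer is unbiased in expectation.
    hi_idx = np.searchsorted(levels, r, side="left").clip(1, len(levels) - 1)
    lo, hi = levels[hi_idx - 1], levels[hi_idx]
    p = (r - lo) / (hi - lo)
    q = np.where(np.random.rand(*v.shape) < p, hi, lo)
    return norm * np.sign(v) * q
```

Because each magnitude is rounded up with probability proportional to its distance from the lower bracketing level, the sketch's quantizer is unbiased; the exponential spacing concentrates levels near zero, where most gradient coordinates lie.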