Dan Alistarh
Researcher at Institute of Science and Technology Austria
Publications - 213
Citations - 4887
Dan Alistarh is an academic researcher at the Institute of Science and Technology Austria. He has contributed to research on topics including computer science and stochastic gradient descent. He has an h-index of 27 and has co-authored 175 publications receiving 3,761 citations. His previous affiliations include ETH Zurich and Microsoft.
Papers
Posted Content
Compressive Sensing with Low Precision Data Representation: Theory and Applications
Nezihe Merve Gürel, Kaan Kara, Alen Stojanov, Tyler M. Smith, Dan Alistarh, Markus Püschel, Ce Zhang, and 6 more
TL;DR: A theoretical analysis of the Iterative Hard Thresholding (IHT) algorithm when all input data, that is, the measurement matrix and the observations, are quantized aggressively to as little as 2 bits per value. The analysis shows that a variant of low-precision IHT can still provide recovery guarantees.
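The setting can be sketched as follows: standard IHT run on aggressively quantized inputs. The uniform 2-bit quantizer, step size, and problem dimensions below are illustrative assumptions, not the paper's exact scheme:

```python
import numpy as np

def quantize(x, bits=2):
    """Uniformly quantize values to 2**bits levels over x's range (illustrative)."""
    levels = 2 ** bits
    lo, hi = x.min(), x.max()
    step = (hi - lo) / (levels - 1)
    return lo + np.round((x - lo) / step) * step

def iht(A, y, k, iters=200, step=None):
    """Iterative Hard Thresholding: gradient step, then keep the k largest entries."""
    m, n = A.shape
    if step is None:
        step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(n)
    for _ in range(iters):
        x = x + step * A.T @ (y - A @ x)         # gradient step on 0.5*||y - Ax||^2
        idx = np.argsort(np.abs(x))[:-k]         # all but the k largest magnitudes
        x[idx] = 0.0                             # hard threshold to k-sparse
    return x

# Sparse recovery where both the measurement matrix and observations are 2-bit
rng = np.random.default_rng(0)
m, n, k = 80, 200, 5
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
y = A @ x_true
x_hat = iht(quantize(A), quantize(y), k)
```

The output `x_hat` is k-sparse by construction; the paper's result concerns how close such an estimate can get to `x_true` despite the quantization.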
Journal ArticleDOI
Error Feedback Can Accurately Compress Preconditioners
TL;DR: EFCP, as discussed by the authors, compresses gradient information via sparsification or low-rank compression and feeds the compression error back into future iterations; in practice, it can compress full-matrix preconditioners by up to two orders of magnitude.
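The error-feedback mechanism described in the summary can be sketched generically, with top-k sparsification standing in for the compressor; the class name, hyperparameters, and quadratic demo are illustrative, not the paper's implementation:

```python
import numpy as np

def topk_compress(g, k):
    """Keep the k largest-magnitude entries of g; zero out the rest."""
    out = np.zeros_like(g)
    idx = np.argpartition(np.abs(g), -k)[-k:]
    out[idx] = g[idx]
    return out

class ErrorFeedbackSGD:
    """SGD where updates are sparsified and the compression error is
    accumulated and fed back into subsequent steps."""
    def __init__(self, dim, k, lr=0.1):
        self.err = np.zeros(dim)
        self.k, self.lr = k, lr

    def step(self, params, grad):
        corrected = grad + self.err              # add back past compression error
        compressed = topk_compress(corrected, self.k)
        self.err = corrected - compressed        # remember what was dropped
        return params - self.lr * compressed

# Minimize f(x) = 0.5*||x||^2 (grad f(x) = x) with only 1 of 4 coordinates per step
params = np.full(4, 1.0)
opt = ErrorFeedbackSGD(dim=4, k=1, lr=0.1)
for _ in range(500):
    params = opt.step(params, params)
```

Because dropped coordinates accumulate in `self.err`, every coordinate is eventually updated, so the iterates still converge despite the aggressive compression.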
Journal ArticleDOI
Hybrid Decentralized Optimization: First- and Zeroth-Order Optimizers Can Be Jointly Leveraged For Faster Convergence
TL;DR: This work shows that, under reasonable parameter settings, a hybrid decentralized optimization system can not only withstand noisier zeroth-order agents but can even benefit from integrating such agents into the optimization process, rather than ignoring their information.
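A minimal sketch of what a zeroth-order agent computes: a two-point gradient estimator that uses only function evaluations, never an analytic gradient. The test function and smoothing parameter below are illustrative assumptions:

```python
import numpy as np

def zo_gradient(f, x, mu=1e-4, rng=None):
    """Two-point zeroth-order estimate of grad f(x), using only function values."""
    rng = rng or np.random.default_rng()
    u = rng.standard_normal(x.shape)                     # random probe direction
    return (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u

# Averaged over many probes, the estimate approximates the true gradient
f = lambda x: 0.5 * np.dot(x, x)                         # grad f(x) = x
x = np.array([1.0, -2.0, 0.5])
rng = np.random.default_rng(0)
est = np.mean([zo_gradient(f, x, rng=rng) for _ in range(20000)], axis=0)
```

A single such estimate is far noisier than a first-order gradient, which is why combining zeroth- and first-order agents, rather than discarding the noisy ones, is the nontrivial claim of the paper.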
Posted Content
A Formally-Verified Framework for Fair Synchronization in Kotlin Coroutines.
TL;DR: The CancellableQueueSynchronizer (CQS), as presented in this paper, is a framework that enables efficient, fair, and abortable implementations of fundamental synchronization primitives such as mutexes, semaphores, barriers, count-down latches, and blocking pools.
Posted Content
NUQSGD: Provably Communication-efficient Data-parallel SGD via Nonuniform Quantization
TL;DR: In this article, the authors propose a new nonuniform gradient quantization scheme for data-parallel stochastic gradient descent that has stronger theoretical guarantees than QSGD, and that matches or exceeds the empirical performance of the QSGDinf heuristic and of other compression methods.
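The idea of nonuniform quantization can be sketched as unbiased stochastic rounding of normalized magnitudes to exponentially spaced levels; the specific level set and function interface below are illustrative, not the paper's exact scheme:

```python
import numpy as np

def nonuniform_quantize(v, levels=None, rng=None):
    """Stochastically round each |v_i|/||v|| to nonuniform (geometric) levels.
    Rounding up with the right probability makes the output unbiased: E[out] = v."""
    rng = rng or np.random.default_rng()
    if levels is None:
        levels = np.array([0.0, 1/8, 1/4, 1/2, 1.0])  # exponentially spaced levels
    norm = np.linalg.norm(v)
    if norm == 0:
        return v.copy()
    r = np.abs(v) / norm                              # normalized magnitudes in [0, 1]
    out = np.empty_like(v)
    for i, ri in enumerate(r):
        j = np.searchsorted(levels, ri) - 1           # interval [levels[j], levels[j+1]]
        j = min(max(j, 0), len(levels) - 2)
        lo, hi = levels[j], levels[j + 1]
        p = (ri - lo) / (hi - lo)                     # round up with probability p
        q = hi if rng.random() < p else lo
        out[i] = np.sign(v[i]) * norm * q
    return out

# Unbiasedness: averaging many quantizations of a fixed vector recovers it
v = np.array([0.3, -0.6, 0.2])
rng = np.random.default_rng(0)
mean = np.mean([nonuniform_quantize(v, rng=rng) for _ in range(5000)], axis=0)
```

Exponentially spaced levels place more resolution near zero, where most normalized gradient entries concentrate, which is the intuition behind the scheme's improved guarantees over uniform quantization.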