Dan Alistarh
Researcher at Institute of Science and Technology Austria
Publications - 213
Citations - 4887
Dan Alistarh is an academic researcher at the Institute of Science and Technology Austria. He has contributed to research on topics including computer science and stochastic gradient descent, has an h-index of 27, and has co-authored 175 publications receiving 3761 citations. Previous affiliations of Dan Alistarh include ETH Zurich and Microsoft.
Papers
Min-Max Hypergraph Partitioning
TL;DR: This paper introduces an approximation algorithm for the offline version of the problem of partitioning a set of items into a given number of partitions, and shows that a simple greedy online assignment of items can recover a hidden co-clustering of vertices under a natural set of recovery conditions.
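To make the greedy online assignment concrete, here is a hedged sketch of one generic greedy rule (not necessarily the paper's exact rule): each arriving item carries a set of topics, a partition pays one unit of cost per distinct topic it hosts, and the item goes to the partition whose cost grows least. The class and method names are hypothetical.

```java
import java.util.*;

// Sketch of greedy online min-max assignment: place each item on the
// partition that gains the fewest new topics, breaking ties by the
// partition's current topic count.
final class GreedyPartitioner {
    private final List<Set<String>> topicsPerPart;

    GreedyPartitioner(int k) {
        topicsPerPart = new ArrayList<>();
        for (int i = 0; i < k; i++) topicsPerPart.add(new HashSet<>());
    }

    // Returns the index of the chosen partition.
    int assign(Set<String> itemTopics) {
        int best = 0;
        long bestKey = Long.MAX_VALUE;
        for (int i = 0; i < topicsPerPart.size(); i++) {
            Set<String> held = topicsPerPart.get(i);
            int added = 0;
            for (String t : itemTopics) if (!held.contains(t)) added++;
            // Lexicographic key: primarily fewest new topics, then smallest cost.
            long key = (long) added * 1_000_000 + held.size();
            if (key < bestKey) { bestKey = key; best = i; }
        }
        topicsPerPart.get(best).addAll(itemTopics);
        return best;
    }
}
```

The min-max objective is the maximum per-partition cost; the greedy rule keeps that maximum low by charging each item its marginal cost on every partition.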
Proceedings ArticleDOI
Fast and Scalable Channels in Kotlin Coroutines
TL;DR: This paper presents a fast and scalable algorithm for both rendezvous and buffered channels, based on an infinite array with two positional counters for send(e) and receive() operations, leveraging the unconditional Fetch-and-Add instruction to update them.
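The cell-claiming idea behind the two counters can be sketched as follows. This is a deliberately simplified, single-purpose sketch (class and field names hypothetical): the real algorithm builds the infinite array from linked segments and suspends waiting coroutines on empty cells, neither of which is shown here. The sketch assumes the array is large enough and that every receive() follows a matching send().

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReferenceArray;

// Each sender and each receiver claims a unique cell index with a single
// unconditional Fetch-and-Add (getAndIncrement) on its own counter, so
// claiming never retries under contention.
final class ToyChannel<E> {
    private final AtomicReferenceArray<E> cells;
    private final AtomicLong senders = new AtomicLong();   // next cell to send into
    private final AtomicLong receivers = new AtomicLong(); // next cell to receive from

    ToyChannel(int capacity) { cells = new AtomicReferenceArray<>(capacity); }

    void send(E e) {
        int idx = (int) senders.getAndIncrement(); // Fetch-and-Add
        cells.set(idx, e);                         // publish into the claimed cell
    }

    E receive() {
        int idx = (int) receivers.getAndIncrement(); // Fetch-and-Add
        return cells.getAndSet(idx, null);           // take the element
    }
}
```

The appeal of Fetch-and-Add over a compare-and-swap loop is that it always succeeds in one instruction, which is what makes the scheme scale under contention.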
Journal ArticleDOI
SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression
Tim Dettmers, Vage Egiazarian, D. D. Kuznedelev, Elias Frantar, Saleh Ashkboos, Alexander Borzunov, Torsten Hoefler, Dan Alistarh +7 more
TL;DR: This paper introduces Sparse-Quantized Representation (SpQR), a new compressed format and quantization technique that enables near-lossless compression of LLMs across model scales while reaching compression levels similar to previous methods.
Proceedings Article
Dynamic averaging load balancing on cycles
TL;DR: This paper provides an upper bound of O(√n log n) on the expected gap of the load-balancing process for cycles of length n.
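The "gap" here is not defined in the summary; assuming the standard imbalance measure for load-balancing processes (the difference between the most and least loaded node), the stated bound reads:

$$\mathrm{Gap}(t) = \max_{i} x_i(t) - \min_{i} x_i(t), \qquad \mathbb{E}\left[\mathrm{Gap}(t)\right] = O\!\left(\sqrt{n}\,\log n\right),$$

where $x_i(t)$ denotes the load of node $i$ on a cycle of $n$ nodes at time $t$.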
Posted Content
Adaptive Gradient Quantization for Data-Parallel SGD
TL;DR: This paper proposes two adaptive quantization schemes, ALQ and AMQ, which compute sufficient statistics of a parametric distribution to improve the performance of gradient quantization.
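For context, here is a hedged sketch of the baseline the adaptive schemes improve on: plain stochastic uniform gradient quantization (QSGD-style, not ALQ/AMQ themselves). Each coordinate is stochastically rounded to one of a fixed number of uniform levels in [0, max|g|], which makes the quantizer unbiased in expectation; ALQ and AMQ instead adapt the level positions to statistics of the gradient's estimated distribution. The class and method names are hypothetical.

```java
import java.util.Random;

// Stochastic uniform quantization of a gradient vector: round each
// coordinate up or down to an adjacent level with probability proportional
// to its distance, so E[quantize(g)] = g.
final class Quantizer {
    static double[] quantize(double[] g, int levels, Random rng) {
        double norm = 0.0;
        for (double v : g) norm = Math.max(norm, Math.abs(v)); // scale = max |g_i|
        double[] q = new double[g.length];
        if (norm == 0.0) return q;
        for (int i = 0; i < g.length; i++) {
            double x = Math.abs(g[i]) / norm * levels; // position in [0, levels]
            double low = Math.floor(x);
            double p = x - low;                        // probability of rounding up
            double level = rng.nextDouble() < p ? low + 1 : low;
            q[i] = Math.signum(g[i]) * level / levels * norm;
        }
        return q;
    }
}
```

Only the level index and the scalar norm need to be communicated per coordinate, which is where the bandwidth savings in data-parallel SGD come from.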