Manzil Zaheer
Researcher at Google
Publications - 138
Citations - 9300
Manzil Zaheer is an academic researcher at Google. The author has contributed to research on the following topics: computer science and question answering. The author has an h-index of 30 and has co-authored 113 publications receiving 5215 citations. Previous affiliations of Manzil Zaheer include Microsoft and Carnegie Mellon University.
Papers
Federated Optimization in Heterogeneous Networks
TL;DR: This work introduces FedProx, a framework for tackling heterogeneity in federated networks. It provides convergence guarantees when learning over data from non-identical distributions (statistical heterogeneity) while adhering to device-level systems constraints by allowing each participating device to perform a variable amount of work.
Posted Content
Deep Sets
Manzil Zaheer, Satwik Kottur, Siamak Ravanbakhsh, Barnabás Póczos, Ruslan Salakhutdinov, Alexander J. Smola
TL;DR: The main theorem characterizes permutation invariant objective functions and provides a family of functions to which any permutation invariant objective function must belong. This enables the design of a deep network architecture that operates on sets and can be deployed in a variety of scenarios, including both unsupervised and supervised learning tasks.
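The family of functions mentioned in this summary is commonly written as a sum decomposition, f(X) = rho(sum of phi(x) over x in X). The following is a minimal sketch of that form, not the paper's architecture: `phi` and `rho` are hypothetical fixed functions chosen for illustration (in a deep network they would be learned), and together they compute the variance of a set.

```python
from itertools import permutations


def phi(x):
    # Hypothetical per-element feature map: (count, value, squared value).
    return (1.0, x, x * x)


def rho(pooled):
    # Hypothetical readout: population variance from the pooled moments.
    n, s1, s2 = pooled
    mean = s1 / n
    return s2 / n - mean * mean


def deep_set(xs):
    # Sum-pool the per-element features, then apply the readout.
    # Summation ignores element order, so the result is permutation invariant.
    pooled = [0.0, 0.0, 0.0]
    for x in xs:
        for i, value in enumerate(phi(x)):
            pooled[i] += value
    return rho(pooled)


items = [1.0, 2.0, 3.0, 6.0]
outputs = {deep_set(list(p)) for p in permutations(items)}
print(outputs)  # a single value: every ordering gives the same output
```

Because the only interaction between elements happens inside the sum, reordering the input cannot change the result.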
Posted Content
Big Bird: Transformers for Longer Sequences
Manzil Zaheer, Guru Guruganesh, Avinava Dubey, Joshua Ainslie, Chris Alberti, Santiago Ontañón, Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed
TL;DR: It is shown that BigBird is a universal approximator of sequence functions and is Turing complete, thereby preserving these properties of the quadratic, full attention model.
Posted Content
Adaptive Federated Optimization
Sashank J. Reddi, Zachary Charles, Manzil Zaheer, Zachary Garrett, Keith Rush, Jakub Konečný, Sanjiv Kumar, H. Brendan McMahan
TL;DR: This work proposes federated versions of adaptive optimizers, including Adagrad, Adam, and Yogi, and analyzes their convergence in the presence of heterogeneous data for general nonconvex settings to highlight the interplay between client heterogeneity and communication efficiency.
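One way to picture a federated adaptive optimizer is to treat the average client update as a pseudo-gradient and feed it to an Adam-style rule on the server. The sketch below illustrates that idea under stated assumptions: the function name `server_adam_step`, the hyperparameter names and values, and the toy client models are all hypothetical, and the summary above does not specify these details.

```python
import math


def server_adam_step(x, client_models, m, v,
                     lr=0.1, beta1=0.9, beta2=0.99, tau=1e-3):
    """One server round: Adam-style update driven by the mean client delta."""
    n = len(client_models)
    new_x, new_m, new_v = [], [], []
    for j in range(len(x)):
        # Pseudo-gradient: average movement of the clients away from x.
        delta = sum(c[j] - x[j] for c in client_models) / n
        m_j = beta1 * m[j] + (1 - beta1) * delta          # first moment
        v_j = beta2 * v[j] + (1 - beta2) * delta * delta  # second moment
        # tau (assumed name) keeps the denominator away from zero.
        new_x.append(x[j] + lr * m_j / (math.sqrt(v_j) + tau))
        new_m.append(m_j)
        new_v.append(v_j)
    return new_x, new_m, new_v


# Toy round: two clients return locally trained one-parameter models.
x1, m1, v1 = server_adam_step([0.0], [[1.0], [3.0]], [0.0], [0.0])
print(x1, m1, v1)
```

Swapping the second-moment line for a running sum of squared deltas would give an Adagrad-style variant of the same server step.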
Posted Content
Federated Optimization in Heterogeneous Networks
TL;DR: FedProx is a generalization and re-parametrization of FedAvg, which the authors describe as the state-of-the-art method for federated learning.
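The sense in which FedProx generalizes FedAvg can be sketched with a local update that adds a proximal term, (mu / 2) * (w - w_global)^2, to each device's objective. Setting mu to 0 leaves plain local gradient descent, as in FedAvg. This is an illustrative sketch only: the scalar model, the toy gradient `toy_grad`, and the values of `mu`, `lr`, and `steps` are assumptions.

```python
def local_update(w_global, grad_fn, mu, lr=0.1, steps=50):
    """Run `steps` gradient steps on F_k(w) + (mu / 2) * (w - w_global)^2."""
    w = w_global
    for _ in range(steps):
        # The proximal term contributes mu * (w - w_global) to the gradient,
        # pulling the local model back toward the global one.
        w -= lr * (grad_fn(w) + mu * (w - w_global))
    return w


def toy_grad(w):
    # Gradient of a hypothetical local loss 0.5 * (w - 10)^2.
    return w - 10.0


w_fedavg_like = local_update(0.0, toy_grad, mu=0.0)  # no proximal term
w_fedprox = local_update(0.0, toy_grad, mu=1.0)      # proximal term active
print(w_fedavg_like, w_fedprox)
```

With the proximal term active, the local model settles between the device's own optimum (10) and the global model (0), which limits how far a heterogeneous device can drift. The `steps` argument can differ per device, which is one way to model devices performing variable amounts of work.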