Zhize Li
Researcher at King Abdullah University of Science and Technology
Publications - 42
Citations - 704
Zhize Li is an academic researcher at King Abdullah University of Science and Technology. His research focuses on computer science and gradient descent. He has an h-index of 13 and has co-authored 35 publications receiving 483 citations. Previous affiliations of Zhize Li include Tsinghua University and Carnegie Mellon University.
Papers
Proceedings Article
A simple proximal stochastic gradient method for nonsmooth nonconvex optimization
TL;DR: ProxSVRG+ is a proximal stochastic gradient algorithm based on variance reduction that automatically switches to faster linear convergence in regions where the objective function locally satisfies the Polyak-Łojasiewicz condition.
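The core idea described above — a proximal step applied to a variance-reduced stochastic gradient — can be sketched as follows. This is a minimal illustration on a toy L1-regularized least-squares problem, not the paper's exact algorithm; the problem data, step size, and batch sizes are all assumptions for the demo.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy composite problem: (1/n) * sum_i ||a_i^T x - b_i||^2 / 2 + lam * ||x||_1
n, d, lam = 64, 10, 0.01
A = rng.normal(size=(n, d))
b = rng.normal(size=n)

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (soft thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def grad(x, idx):
    """Stochastic gradient of the smooth part over component indices idx."""
    Ai, bi = A[idx], b[idx]
    return Ai.T @ (Ai @ x - bi) / len(idx)

def objective(x):
    return 0.5 * np.mean((A @ x - b) ** 2) + lam * np.sum(np.abs(x))

def prox_svrg_plus(x0, eta=0.05, epochs=20, inner=32, batch=8):
    """SVRG-style variance reduction combined with a proximal step."""
    x = x0.copy()
    for _ in range(epochs):
        snapshot = x.copy()
        full_g = grad(snapshot, np.arange(n))   # full gradient at the snapshot
        for _ in range(inner):
            idx = rng.integers(0, n, size=batch)
            # variance-reduced stochastic gradient estimate
            v = grad(x, idx) - grad(snapshot, idx) + full_g
            x = soft_threshold(x - eta * v, eta * lam)
    return x

x_final = prox_svrg_plus(np.zeros(d))
```

On this toy instance the objective decreases from its value at the zero initialization, illustrating the variance-reduced proximal updates.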
Posted Content
Acceleration for Compressed Gradient Descent in Distributed and Federated Optimization
TL;DR: This paper proposes the first accelerated compressed gradient descent (ACGD) methods, which improve upon the existing non-accelerated rates and recover the optimal rates of accelerated gradient descent as a special case when no compression is applied.
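As background for the compression side of this result, here is a sketch of plain (non-accelerated) gradient descent with an unbiased rand-k compressor — keep k random coordinates of the gradient and rescale by d/k so the compressed gradient is unbiased. The paper's ACGD adds Nesterov-style acceleration on top of this kind of scheme; the quadratic objective and parameters below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 20
# Simple quadratic f(x) = 0.5 * x^T diag(D) x with a well-conditioned spectrum
D = np.linspace(1.0, 2.0, d)

def grad(x):
    return D * x

def rand_k(g, k):
    """Unbiased rand-k compressor: keep k random coordinates, rescale by d/k."""
    idx = rng.choice(d, size=k, replace=False)
    out = np.zeros_like(g)
    out[idx] = g[idx] * (d / k)
    return out

def compressed_gd(x0, eta=0.1, k=5, steps=400):
    """Gradient descent where each step uses only a compressed gradient."""
    x = x0.copy()
    for _ in range(steps):
        x = x - eta * rand_k(grad(x), k)
    return x

x_final = compressed_gd(np.ones(d))
```

Because the compressor is unbiased, the iterates still contract toward the minimizer at the origin, just with extra variance from the sparsification — the gap that acceleration and better rates aim to close in distributed settings.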
Posted Content
PAGE: A Simple and Optimal Probabilistic Gradient Estimator for Nonconvex Optimization
TL;DR: The results demonstrate that PAGE not only converges much faster than SGD in training but also achieves higher test accuracy, validating the theoretical results and confirming the practical superiority of PAGE.
Proceedings Article
On Top-k selection in multi-armed bandits and hidden bipartite graphs
TL;DR: This paper studies how to efficiently select, from n unknown distributions, the k whose means are the greatest up to a small relative error, and proposes optimal algorithms whose sample complexities match the corresponding lower bounds.
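To make the problem concrete, here is a naive uniform-sampling baseline for top-k selection: pull every arm the same number of times and return the k arms with the largest empirical means. This is only a sketch of the problem setting — the paper's contribution is adaptive algorithms with optimal sample complexity, not this baseline — and the arm means below are made up for the demo.

```python
import numpy as np

rng = np.random.default_rng(3)

def naive_top_k(pull, n, k, samples_per_arm):
    """Uniform-sampling baseline: estimate every arm's mean with the same
    number of pulls, then return the k arms with the largest empirical means."""
    means = np.array([np.mean([pull(i) for _ in range(samples_per_arm)])
                      for i in range(n)])
    return set(np.argsort(means)[-k:])

# Bernoulli arms with clearly separated means (hypothetical instance)
true_means = np.array([0.9, 0.8, 0.5, 0.4, 0.3, 0.2])
pull = lambda i: float(rng.random() < true_means[i])

best = naive_top_k(pull, n=6, k=2, samples_per_arm=500)
```

With well-separated means this recovers the true top-2 arms; adaptive algorithms improve on it by spending fewer pulls on arms that are clearly not in the top k.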
Proceedings Article
Learning Two-Layer Neural Networks with Symmetric Inputs
TL;DR: A new algorithm for learning a two-layer neural network under a general class of input distributions, based on the method-of-moments framework; it extends several results in tensor decomposition to avoid the complicated non-convex optimization involved in learning neural networks.