Zhize Li

Researcher at King Abdullah University of Science and Technology

Publications - 42
Citations - 704

Zhize Li is an academic researcher at King Abdullah University of Science and Technology. His research covers topics including computer science and gradient descent. He has an h-index of 13 and has co-authored 35 publications receiving 483 citations. His previous affiliations include Tsinghua University and Carnegie Mellon University.

Papers
Proceedings Article

A simple proximal stochastic gradient method for nonsmooth nonconvex optimization

TL;DR: ProxSVRG+ is a proximal stochastic gradient algorithm based on variance reduction; it automatically switches to faster linear convergence in regions where the objective function locally satisfies the Polyak-Łojasiewicz condition.
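
As a rough illustration of the variance-reduction idea behind ProxSVRG+, the sketch below applies proximal stochastic variance-reduced steps to an l1-regularized problem. It is not the paper's exact algorithm; the gradient oracle grad_i, step size eta, regularization weight lam, and all batch sizes are assumptions made for the example.

```python
import numpy as np

def prox_l1(x, t):
    # Soft-thresholding: the proximal operator of t * ||x||_1.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def prox_svrg_plus_sketch(grad_i, n, x0, eta, lam,
                          epochs=10, inner=100, batch=8, rng=None):
    """Sketch of a ProxSVRG+-style loop for min_x (1/n) sum_i f_i(x) + lam*||x||_1.
    grad_i(x, i) is assumed to return the gradient of f_i at x."""
    rng = rng if rng is not None else np.random.default_rng(0)
    x = x0.copy()
    for _ in range(epochs):
        snap = x.copy()
        # Snapshot gradient estimated on a (possibly large) minibatch,
        # rather than the full gradient classical SVRG would use.
        idx = rng.choice(n, size=min(n, 10 * batch), replace=False)
        mu = np.mean([grad_i(snap, i) for i in idx], axis=0)
        for _ in range(inner):
            j = rng.choice(n, size=batch, replace=False)
            # Variance-reduced gradient estimate.
            v = (np.mean([grad_i(x, i) for i in j], axis=0)
                 - np.mean([grad_i(snap, i) for i in j], axis=0) + mu)
            x = prox_l1(x - eta * v, eta * lam)  # proximal gradient step
    return x
```
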
Posted Content

Acceleration for Compressed Gradient Descent in Distributed and Federated Optimization

TL;DR: This paper proposes the first accelerated compressed gradient descent (ACGD) methods, which improve upon existing non-accelerated rates and recover the optimal rates of accelerated gradient descent as a special case when no compression is applied.
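
A toy sketch of the compress-then-update idea is given below: an unbiased random-k sparsifier composed with a Nesterov-style momentum loop. The actual ACGD methods couple several carefully designed sequences; the grad oracle, the choice of compressor, and all hyperparameters here are illustrative assumptions only.

```python
import numpy as np

def rand_k_compress(g, k, rng):
    # Unbiased random-k sparsifier: keep k coordinates, rescale by d/k.
    d = g.size
    out = np.zeros_like(g)
    idx = rng.choice(d, size=k, replace=False)
    out[idx] = g[idx] * (d / k)
    return out

def compressed_momentum_gd(grad, x0, eta, k, steps=200, momentum=0.9, seed=0):
    """Toy sketch only: momentum on top of compressed gradients.
    grad(x) is an assumed oracle returning the (to-be-compressed) gradient."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    v = np.zeros_like(x0)
    for _ in range(steps):
        y = x + momentum * v                      # look-ahead point
        g = rand_k_compress(grad(y), k, rng)      # compress before updating
        v = momentum * v - eta * g
        x = x + v
    return x
```
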
Posted Content

PAGE: A Simple and Optimal Probabilistic Gradient Estimator for Nonconvex Optimization

TL;DR: The results demonstrate that PAGE not only converges much faster than SGD in training but also achieves higher test accuracy, validating the theoretical results and confirming the practical superiority of PAGE.
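
The PAGE estimator itself is simple enough to sketch: with probability p take a large-batch gradient, otherwise reuse the previous estimate with a cheap small-batch correction evaluated at both the new and old iterates (on the same minibatch). The helpers grad_on, sample_batch, and full_batch below are hypothetical stand-ins for a real data pipeline.

```python
import numpy as np

def page_sketch(grad_on, sample_batch, full_batch, x0, eta, p,
                steps=1000, seed=0):
    """Sketch of PAGE. grad_on(x, batch) is assumed to return the gradient
    of the loss at x over the given batch of indices; sample_batch() draws
    a small minibatch; full_batch is the full index set."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    g = grad_on(x, full_batch)                   # initial full-batch gradient
    for _ in range(steps):
        x_new = x - eta * g                      # plain gradient step
        if rng.random() < p:
            g = grad_on(x_new, full_batch)       # occasional full gradient
        else:
            b = sample_batch()                   # same batch at both points
            g = g + grad_on(x_new, b) - grad_on(x, b)
        x = x_new
    return x
```
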
Proceedings Article

On Top-k selection in multi-armed bandits and hidden bipartite graphs

TL;DR: This paper studies how to efficiently choose, from n unknown distributions, the k whose means are greatest under a certain metric, up to a small relative error; it establishes sample-complexity lower bounds and proposes optimal algorithms whose sample complexities match them.
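
For intuition, a naive PAC baseline for this problem (not the paper's optimal algorithm) samples every arm equally and keeps the k largest empirical means; the arm oracle pull and the Hoeffding-style sample size below are assumptions of the sketch.

```python
import numpy as np

def naive_top_k(pull, n, k, eps, delta):
    """Naive uniform-sampling baseline for PAC top-k arm selection.
    pull(i) is assumed to return one stochastic reward in [0, 1] from arm i."""
    # Conservative Hoeffding-style sample size so that every empirical mean
    # is eps/2-accurate simultaneously with probability at least 1 - delta.
    m = int(np.ceil(8.0 / eps**2 * np.log(2 * n / delta)))
    means = np.array([np.mean([pull(i) for _ in range(m)]) for i in range(n)])
    return np.argsort(means)[-k:][::-1]   # indices of the k largest means
```

The optimal algorithms in the paper adapt the number of pulls per arm to the gaps between means instead of sampling uniformly, which is where the improved sample complexity comes from.
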
Proceedings Article

Learning Two-Layer Neural Networks with Symmetric Inputs

TL;DR: This paper gives a new algorithm for learning a two-layer neural network under a general class of input distributions. The algorithm is based on the method-of-moments framework and extends several results on tensor decomposition, avoiding the complicated non-convex optimization usually involved in learning neural networks.
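
As a toy illustration of the method-of-moments idea, specialized here to standard Gaussian inputs (the paper handles a much more general class of symmetric distributions), Stein's identity makes the moment E[y(xx^T - I)] a weighted sum of w_j w_j^T over hidden units, so its top eigenvectors recover the span of the hidden weights. Everything below is an assumption-laden sketch, not the paper's algorithm.

```python
import numpy as np

def moment_subspace(X, y, r):
    """Toy moment step: under x ~ N(0, I), Stein's identity gives
    E[y (x x^T - I)] = sum_j c_j * w_j w_j^T for a two-layer network,
    so the top-r eigenvectors of the empirical moment span {w_j}.
    X: (m, d) inputs, y: (m,) network outputs, r: number of hidden units."""
    m, d = X.shape
    M = (X.T * y) @ X / m - np.mean(y) * np.eye(d)  # empirical E[y(xx^T - I)]
    eigvals, eigvecs = np.linalg.eigh(M)            # M is symmetric
    order = np.argsort(-np.abs(eigvals))            # largest magnitude first
    return eigvecs[:, order[:r]]                    # basis for span{w_j}
```
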