Haochuan Li
Researcher at Massachusetts Institute of Technology
Publications - 16
Citations - 794
Haochuan Li is an academic researcher from Massachusetts Institute of Technology. The author has contributed to research in topics: Computer science & Artificial neural network. The author has an h-index of 4 and has co-authored 7 publications receiving 742 citations. Previous affiliations of Haochuan Li include Peking University.
Papers
Proceedings Article
Gradient descent finds global minima of deep neural networks
TL;DR: This paper shows that gradient descent achieves zero training loss in polynomial time for a deep over-parameterized neural network with residual connections (ResNet). The analysis is further extended to deep residual convolutional neural networks, yielding a similar convergence result.
Posted Content
Convergence of Adversarial Training in Overparametrized Neural Networks
TL;DR: This paper provides a partial explanation for the success of adversarial training by showing that it converges to a network where the surrogate loss with respect to the attack algorithm is within $\epsilon$ of the optimal robust loss.
Posted Content
Gradient Descent Finds Global Minima of Deep Neural Networks
TL;DR: This article shows that gradient descent achieves zero training loss in polynomial time for a deep over-parameterized neural network with residual connections (ResNet). The analysis is further extended to deep residual convolutional neural networks, yielding a similar convergence result.
Proceedings Article
Convergence of Adversarial Training in Overparametrized Neural Networks
TL;DR: In this paper, the authors show that adversarial training converges to a network where the surrogate loss with respect to the attack algorithm is within $\epsilon$ of the optimal robust loss.
Journal Article
Variance-reduced Clipping for Non-convex Optimization
TL;DR: In this article, the authors employ a variance reduction technique, namely SPIDER, and demonstrate that for a carefully designed learning rate, the complexity is improved to $O(\epsilon^{-3})$, which is order-optimal.