Zhao Song
Researcher at Princeton University
Publications - 144
Citations - 4661
Zhao Song is an academic researcher at Princeton University whose work spans computer science and matrix theory. He has an h-index of 27 and has co-authored 124 publications receiving 3,106 citations. His previous affiliations include the University of Texas at Austin and Harvard University.
Papers
Posted Content
A Convergence Theory for Deep Learning via Over-Parameterization
TL;DR: This work proves that stochastic gradient descent can find global minima of the training objective of DNNs in polynomial time, and implies an equivalence between over-parameterized neural networks and the neural tangent kernel (NTK) in the finite (and polynomial) width setting.
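The phenomenon the TL;DR describes is easy to observe numerically. Below is a minimal NumPy sketch (not the paper's proof technique): a heavily over-parameterized two-layer ReLU network whose training loss is driven toward zero by plain gradient descent, as the convergence theory predicts for sufficiently wide networks on non-degenerate inputs. All sizes, the fixed-output-layer setup, and the 1/√m scaling are illustrative choices.

```python
import numpy as np

# Over-parameterized two-layer ReLU network trained by gradient descent.
rng = np.random.default_rng(0)
n, d, m = 10, 5, 2000                    # n samples, input dim d, width m >> n
X = rng.standard_normal((n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)   # unit-norm (non-degenerate) inputs
y = rng.standard_normal(n)

W = rng.standard_normal((m, d))          # hidden weights (trained)
a = rng.choice([-1.0, 1.0], size=m)      # output weights (fixed, NTK-style)

def loss(W):
    pred = np.maximum(X @ W.T, 0.0) @ a / np.sqrt(m)
    return 0.5 * np.mean((pred - y) ** 2)

init_loss, lr = loss(W), 1.0
for _ in range(1000):
    H = X @ W.T
    err = (np.maximum(H, 0.0) @ a / np.sqrt(m) - y) / n   # dL/dpred
    G = (H > 0) * np.outer(err, a / np.sqrt(m))           # dL/d(pre-activation)
    W -= lr * (G.T @ X)                                   # chain rule into W

final_loss = loss(W)
print(f"training loss: {init_loss:.4f} -> {final_loss:.6f}")
```

At this width the hidden weights barely move during training, which is exactly the regime in which the network's dynamics are governed by the (finite-width) neural tangent kernel.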
Proceedings Article
Towards Fast Computation of Certified Robustness for ReLU Networks
Tsui-Wei Weng, Huan Zhang, Hongge Chen, Zhao Song, Cho-Jui Hsieh, Duane S. Boning, Inderjit S. Dhillon, Luca Daniel
TL;DR: In this paper, the authors exploit the special structure of ReLU networks to provide two computationally efficient algorithms, Fast-Lin and Fast-Lip, that certify non-trivial lower bounds on the minimum adversarial distortion by bounding the ReLU units with appropriate linear functions.
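The core primitive is replacing each ReLU with linear upper and lower bounds that are valid on its pre-activation interval, so bounds can be propagated through the network in closed form. A minimal sketch of that relaxation (it mirrors the Fast-Lin style of bound; the paper's exact parameter choices may differ):

```python
def relu_linear_bounds(l, u):
    """Linear bounds on ReLU(z) over a pre-activation interval [l, u].

    Returns (slope_lo, intercept_lo, slope_hi, intercept_hi) such that
    slope_lo*z + intercept_lo <= max(z, 0) <= slope_hi*z + intercept_hi
    for all z in [l, u].
    """
    if u <= 0:                       # ReLU is identically zero on [l, u]
        return 0.0, 0.0, 0.0, 0.0
    if l >= 0:                       # ReLU is the identity on [l, u]
        return 1.0, 0.0, 1.0, 0.0
    s = u / (u - l)                  # chord slope through (l, 0) and (u, u)
    # Upper bound: the chord s*(z - l).  Lower bound: the parallel line
    # through the origin (slope s, intercept 0), as in Fast-Lin.
    return s, 0.0, s, -s * l
```

Only the "unstable" case l < 0 < u incurs any looseness; the other two cases are exact, which is what makes the certificate cheap to compute layer by layer.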
Proceedings Article
A Convergence Theory for Deep Learning via Over-Parameterization
TL;DR: In this paper, the authors present a new theory to understand the convergence of training DNNs, where they make two assumptions: the inputs do not degenerate and the network is over-parameterized.
Posted Content
Solving Linear Programs in the Current Matrix Multiplication Time
TL;DR: This paper shows how to solve linear programs of the form min_{Ax=b, x≥0} c⊤x with n variables in time O*((n^ω + n^{2.5−α/2} + n^{2+1/6}) log(n/δ)), where ω is the exponent of matrix multiplication, α is the dual exponent of matrix multiplication, and δ is the relative accuracy.
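The standard form min_{Ax=b, x≥0} c⊤x maps directly onto off-the-shelf LP solvers. A small hedged illustration of that problem family using SciPy's `linprog` (a generic solver, not the paper's interior-point method):

```python
import numpy as np
from scipy.optimize import linprog

# A tiny standard-form instance:  min c^T x  s.t.  Ax = b, x >= 0.
# linprog's default variable bounds (0, None) already enforce x >= 0.
c = np.array([1.0, 2.0, 0.0])
A = np.array([[1.0, 1.0, 1.0]])   # single equality constraint: x1+x2+x3 = 1
b = np.array([1.0])

res = linprog(c, A_eq=A, b_eq=b, method="highs")
print(res.x, res.fun)             # all mass goes to the zero-cost coordinate
```

The optimum here is x = (0, 0, 1) with objective 0, since the third coordinate is cost-free and the simplex constraint must be met.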
Posted Content
Recovery Guarantees for One-hidden-layer Neural Networks
TL;DR: This work distills properties of activation functions that lead to local strong convexity in the neighborhood of the ground-truth parameters for the one-hidden-layer-network squared-loss objective, and provides recovery guarantees for such networks with both sample complexity and computational complexity linear in the input dimension and logarithmic in the precision.
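Local strong convexity around the planted parameters implies that gradient descent warm-started near the ground truth recovers it. A toy NumPy illustration of that behavior (not the paper's algorithm; sizes, step size, and the noiseless planted model are arbitrary choices):

```python
import numpy as np

# Planted one-hidden-layer ReLU network:  y = sum_j relu(w_j* . x).
rng = np.random.default_rng(1)
d, k, n = 4, 2, 500
W_star = rng.standard_normal((k, d))               # ground-truth parameters
X = rng.standard_normal((n, d))
y = np.maximum(X @ W_star.T, 0.0).sum(axis=1)      # noiseless labels

W = W_star + 0.1 * rng.standard_normal((k, d))     # warm start near W*
err0 = np.linalg.norm(W - W_star)
lr = 0.1
for _ in range(500):
    H = X @ W.T
    resid = np.maximum(H, 0.0).sum(axis=1) - y
    grad = ((H > 0) * resid[:, None]).T @ X / n    # squared-loss gradient
    W -= lr * grad

err = np.linalg.norm(W - W_star)
print(f"parameter error: {err0:.3f} -> {err:.6f}")
```

Because the objective is locally strongly convex around W*, the iterates contract toward the planted weights instead of drifting to a spurious stationary point.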