Zhao Song

Researcher at Princeton University

Publications: 144
Citations: 4661

Zhao Song is an academic researcher at Princeton University who has contributed to research in computer science and matrix (mathematics) topics. He has an h-index of 27 and has co-authored 124 publications receiving 3106 citations. His previous affiliations include the University of Texas at Austin and Harvard University.

Papers
Posted Content

A Convergence Theory for Deep Learning via Over-Parameterization

TL;DR: This work proves why stochastic gradient descent can find global minima on the training objective of DNNs in polynomial time, and implies an equivalence between over-parameterized neural networks and the neural tangent kernel (NTK) in the finite (and polynomial) width setting.
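As a numerical illustration of the claim (a minimal sketch only, not the paper's construction: the data, width, seed, and step size below are arbitrary choices), a very wide two-layer ReLU network with a fixed output layer, trained by plain gradient descent on random data, drives the training loss toward zero:

```python
import numpy as np

# Minimal sketch of the over-parameterization effect (illustration only,
# not the paper's construction): a very wide two-layer ReLU network with a
# fixed output layer, trained by plain gradient descent on random data.
rng = np.random.default_rng(0)
n, d, m = 10, 20, 5000                       # samples, input dim, width m >> n
X = rng.normal(size=(n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)    # unit-norm inputs
y = rng.normal(size=n)

W = rng.normal(size=(m, d)) / np.sqrt(d)         # trained hidden layer
a = rng.choice([-1.0, 1.0], size=m) / np.sqrt(m) # fixed output weights

lr = 1.0
for _ in range(2000):
    Z = X @ W.T                                  # pre-activations, (n, m)
    err = np.maximum(Z, 0.0) @ a - y             # residuals, (n,)
    loss = 0.5 * np.mean(err ** 2)
    grad = ((err[:, None] * (Z > 0) * a[None, :]).T @ X) / n
    W -= lr * grad
print(f"final training loss: {loss:.2e}")        # approaches zero for large m
```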
Proceedings Article

Towards Fast Computation of Certified Robustness for ReLU Networks

TL;DR: In this paper, the authors exploit the special structure of ReLU networks to give two computationally efficient algorithms, Fast-Lin and Fast-Lip, that certify non-trivial lower bounds on the minimum adversarial distortion by bounding each ReLU unit with appropriate linear functions.
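The certification primitive is bounding what each layer can do over an input region. The sketch below is a deliberately coarse relative of Fast-Lin (it propagates plain intervals, whereas Fast-Lin tightens them with per-neuron linear ReLU envelopes; the function name, network, and radius are invented for the example):

```python
import numpy as np

def interval_bounds(layers, x, eps):
    """Propagate interval bounds through a ReLU network for all inputs in
    the box [x - eps, x + eps].  A coarse relative of Fast-Lin, which
    tightens such bounds using per-neuron linear ReLU envelopes."""
    lo, hi = x - eps, x + eps
    for i, (W, b) in enumerate(layers):
        Wp, Wn = np.maximum(W, 0.0), np.minimum(W, 0.0)
        lo, hi = Wp @ lo + Wn @ hi + b, Wp @ hi + Wn @ lo + b
        if i < len(layers) - 1:                  # ReLU on hidden layers only
            lo, hi = np.maximum(lo, 0.0), np.maximum(hi, 0.0)
    return lo, hi

# Toy usage: the margin logit[0] - logit[1] is certified positive for every
# perturbation of size <= eps if its worst-case lower bound stays above zero.
rng = np.random.default_rng(1)
layers = [(rng.normal(size=(8, 4)) / 2, np.zeros(8)),
          (rng.normal(size=(2, 8)) / 2, np.zeros(2))]
x = rng.normal(size=4)
lo, hi = interval_bounds(layers, x, eps=0.01)
print("certified margin lower bound:", lo[0] - hi[1])
```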
Proceedings Article

A Convergence Theory for Deep Learning via Over-Parameterization

TL;DR: In this paper, the authors present a new theory to understand the convergence of training DNNs, where they make two assumptions: the inputs do not degenerate and the network is over-parameterized.
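In schematic form (a sketch of the statement's shape only; the exact exponents and constants are in the paper), non-degenerate inputs plus polynomial over-parameterization yield polynomial-time convergence:

```latex
% Schematic shape of the result only; exact exponents and constants are in
% the paper: delta-separated unit-norm inputs plus polynomial width m imply
% polynomial-time convergence of SGD.
\|x_i\| = 1, \quad \|x_i - x_j\| \ge \delta \ (i \ne j), \qquad
m \ge \mathrm{poly}(n, L, 1/\delta)
\;\Longrightarrow\;
\text{SGD reaches training loss } \varepsilon \text{ in }
\mathrm{poly}(n, L, 1/\delta) \cdot \log(1/\varepsilon) \text{ iterations.}
```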
Posted Content

Solving Linear Programs in the Current Matrix Multiplication Time

TL;DR: This paper shows how to solve linear programs of the form $\min_{Ax=b,\; x \ge 0} c^{\top}x$ with $n$ variables in time $O^{*}\bigl((n^{\omega} + n^{2.5-\alpha/2} + n^{2+1/6}) \log(n/\delta)\bigr)$, where $\omega$ is the exponent of matrix multiplication, $\alpha$ is the dual exponent of matrix multiplication, and $\delta$ is the relative accuracy.
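To see which term dominates, one can plug in the approximate exponents known around the time of the paper, $\omega \approx 2.37$ and $\alpha \approx 0.31$ (both subject to improvement):

```latex
% Approximate exponents only; both values may improve over time.
n^{\omega} + n^{2.5 - \alpha/2} + n^{2 + 1/6}
\;\approx\; n^{2.37} + n^{2.34} + n^{2.17}
\;=\; O^{*}\!\bigl(n^{\omega}\bigr),
```

so up to lower-order terms the linear program costs no more than a single $n \times n$ matrix multiplication, which is what the title's "current matrix multiplication time" refers to.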
Posted Content

Recovery Guarantees for One-hidden-layer Neural Networks

TL;DR: This work distills properties of activation functions that lead to local strong convexity in the neighborhood of the ground-truth parameters for the squared-loss objective of one-hidden-layer neural networks (1NNs), and provides recovery guarantees for 1NNs with both sample complexity and computational complexity linear in the input dimension and logarithmic in the precision.
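An illustrative experiment in the spirit of local strong convexity (not the paper's algorithm, which initializes via a tensor method; the dimensions, activation, and step size below are arbitrary choices): plant a one-hidden-layer teacher, start gradient descent near the ground truth, and the parameter error shrinks:

```python
import numpy as np

# Illustration in the spirit of the result (not the paper's algorithm, which
# initializes via a tensor method): gradient descent on the squared loss of a
# planted one-hidden-layer network, started near the ground truth.
rng = np.random.default_rng(2)
d, k, n = 8, 3, 4000                     # input dim, hidden units, samples
W_star = rng.normal(size=(k, d))         # ground-truth hidden weights
X = rng.normal(size=(n, d))
y = np.tanh(X @ W_star.T).sum(axis=1)    # teacher with unit output weights

W = W_star + 0.1 * rng.normal(size=(k, d))   # start near the ground truth
lr = 0.2
for _ in range(1000):
    Z = X @ W.T                                        # (n, k)
    err = np.tanh(Z).sum(axis=1) - y                   # residuals, (n,)
    grad = ((err[:, None] * (1.0 - np.tanh(Z) ** 2)).T @ X) / n
    W -= lr * grad
print("parameter error:", np.linalg.norm(W - W_star))  # small on success
```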