Huishuai Zhang

Researcher at Microsoft

Publications -  79
Citations -  1463

Huishuai Zhang is an academic researcher at Microsoft. The author has contributed to research on topics including computer science and gradient descent. The author has an h-index of 16 and has co-authored 62 publications receiving 941 citations. Previous affiliations of Huishuai Zhang include Syracuse University and Salesforce.com.

Papers
Posted Content

On Layer Normalization in the Transformer Architecture

TL;DR: In this paper, the authors show that the placement of layer normalization is crucial to training Transformers, and that the warm-up stage can be removed when training Pre-LN Transformers.
Proceedings Article

Reshaped Wirtinger Flow for Solving Quadratic System of Equations

TL;DR: It is shown that, for random Gaussian measurements, reshaped-WF converges geometrically to a globally optimal point as long as the number of measurements is on the order of $\mathcal{O}(n)$, where $n$ is the dimension of the unknown $\mathbf{x}$.
Journal Article

A Nonconvex Approach for Phase Retrieval: Reshaped Wirtinger Flow and Incremental Algorithms

TL;DR: An incremental (stochastic) version of reshaped Wirtinger flow (RWF), called IRWF, is developed and connected with the randomized Kaczmarz method for phase retrieval; experiments demonstrate that IRWF outperforms existing incremental as well as batch algorithms.
Posted Content

Provable Non-convex Phase Retrieval with Outliers: Median Truncated Wirtinger Flow

TL;DR: In this article, a non-convex gradient descent Wirtinger flow (WF) algorithm is proposed to recover the signal from a near-optimal number of measurements composed of i.i.d. Gaussian entries, up to a logarithmic factor, even when a constant portion of the measurements are corrupted by arbitrary outliers.
Proceedings Article

Provable Non-convex Phase Retrieval with Outliers: Median Truncated Wirtinger Flow

TL;DR: This paper develops a novel median-TWF algorithm that exploits the robustness of the sample median to resist arbitrary outliers in both the initialization and the gradient update of each iteration, and shows that this non-convex algorithm provably recovers the signal from a near-optimal number of measurements, even when a constant portion of the measurements are corrupted by arbitrary outliers.