Kenji Kawaguchi

Researcher at Harvard University

Publications - 109
Citations - 3914

Kenji Kawaguchi is an academic researcher from Harvard University. The author has contributed to research in topics including computer science and artificial neural networks. The author has an h-index of 20 and has co-authored 73 publications receiving 2522 citations. Previous affiliations of Kenji Kawaguchi include the Japan Atomic Energy Agency and the Massachusetts Institute of Technology.

Papers
Posted Content

Deep Learning without Poor Local Minima

TL;DR: In this article, the squared loss function of deep linear neural networks of any depth and any layer widths is shown to be non-convex and non-concave; every local minimum is a global minimum; every critical point that is not a global minimum is a saddle point; and there exist "bad" saddle points (where the Hessian has no negative eigenvalue) for networks with more than three layers.
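
To make the object of this result concrete, here is a minimal NumPy sketch (illustrative only, not code from the paper) of the squared loss of a deep linear network; the dimensions, variable names, and random data are assumptions.

# Squared loss of a 3-layer deep linear network: L(W) = 0.5 * ||W3 W2 W1 X - Y||^2
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 100))   # inputs, d_x x n
Y = rng.standard_normal((3, 100))   # targets, d_y x n
Ws = [rng.standard_normal((5, 4)),  # W1
      rng.standard_normal((5, 5)),  # W2
      rng.standard_normal((3, 5))]  # W3

def linear_net_loss(Ws, X, Y):
    # Compose the linear layers; there are no activations, so the model is linear
    # in X, but the loss is a non-convex function of the weights (W1, W2, W3).
    pred = X
    for W in Ws:
        pred = W @ pred
    return 0.5 * np.sum((pred - Y) ** 2)

print(linear_net_loss(Ws, X, Y))

The paper's claim concerns this weight-space landscape: although the function is non-convex, every local minimum attains the globally optimal loss value.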
Proceedings Article

Deep Learning without Poor Local Minima

TL;DR: This paper proves a conjecture published in 1989, partially addresses an open problem announced at the Conference on Learning Theory (COLT) 2015, and presents an instance that answers the question of how difficult it is, in theory, to directly train a deep model.
Journal ArticleDOI

Adaptive activation functions accelerate convergence in deep and physics-informed neural networks

TL;DR: It is theoretically proved that with the proposed method, gradient descent algorithms are not attracted to suboptimal critical points or local minima, and the proposed adaptive activation functions are shown to accelerate minimization of the loss on standard deep-learning benchmarks with and without data augmentation.
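
As a rough illustration of the idea (not the authors' code), the sketch below inserts a trainable scale a inside the activation, e.g. tanh(n * a * x), so that the slope of the nonlinearity is learned jointly with the weights; the class name, the fixed factor n, and the layer sizes are assumptions.

import torch
import torch.nn as nn

class AdaptiveTanhLayer(nn.Module):
    def __init__(self, d_in, d_out, n=10.0):
        super().__init__()
        self.linear = nn.Linear(d_in, d_out)
        self.n = n                                    # fixed scaling factor
        self.a = nn.Parameter(torch.tensor(1.0 / n))  # trainable activation slope

    def forward(self, x):
        # Adaptive activation: the effective slope n * a is optimized by gradient descent
        return torch.tanh(self.n * self.a * self.linear(x))

model = nn.Sequential(AdaptiveTanhLayer(1, 32), AdaptiveTanhLayer(32, 32), nn.Linear(32, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # updates each a along with the weights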
Proceedings ArticleDOI

Interpolation consistency training for semi-supervised learning.

TL;DR: Interpolation Consistency Training (ICT), as described in this paper, encourages the prediction at an interpolation of unlabeled points to be consistent with the interpolation of the predictions at those points.
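
A minimal PyTorch sketch of that consistency term, under stated assumptions: model is the student network, ema_model is an exponential-moving-average teacher, u1 and u2 are two batches of unlabeled inputs, and the Beta(alpha, alpha) mixing coefficient follows the usual mixup convention; the full method additionally uses a supervised loss and a ramped consistency weight.

import torch
import torch.nn.functional as F

def ict_consistency_loss(model, ema_model, u1, u2, alpha=1.0):
    # Sample a mixup coefficient lambda ~ Beta(alpha, alpha)
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    with torch.no_grad():
        # Target: interpolation of the teacher's predictions at the two unlabeled points
        target = lam * ema_model(u1).softmax(-1) + (1 - lam) * ema_model(u2).softmax(-1)
    # The prediction at the interpolated input should match the interpolated prediction
    pred = model(lam * u1 + (1 - lam) * u2).softmax(-1)
    return F.mse_loss(pred, target)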
Posted Content

Generalization in Deep Learning

TL;DR: Non-vacuous and numerically tight generalization guarantees for deep learning are provided, along with theoretical insights into why and how deep learning can generalize well despite its large capacity, complexity, possible algorithmic instability, non-robustness, and sharp minima.