Open Access Proceedings Article

Deep Learning without Poor Local Minima

TL;DR
This paper proves a conjecture published in 1989, partially addresses an open problem announced at the Conference on Learning Theory (COLT) 2015, and presents an instance for which it can answer the following question: how difficult is it to directly train a deep model in theory?
Abstract
In this paper, we prove a conjecture published in 1989 and also partially address an open problem announced at the Conference on Learning Theory (COLT) 2015. For an expected loss function of a deep nonlinear neural network, we prove the following statements under the independence assumption adopted from recent work: 1) the function is non-convex and non-concave, 2) every local minimum is a global minimum, 3) every critical point that is not a global minimum is a saddle point, and 4) the property of saddle points differs between shallow networks (with three layers) and deeper networks (with more than three layers). Moreover, we prove that the same four statements hold for deep linear neural networks with any depth, any width, and no unrealistic assumptions. As a result, we present an instance for which we can answer the following question: how difficult is it to directly train a deep model in theory? It is more difficult than training classical machine learning models (because of the non-convexity), but not too difficult (because of the nonexistence of poor local minima and the property of the saddle points). We note that even though we have advanced the theoretical foundations of deep learning, there is still a gap between theory and practice.
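As a concrete illustration of statements 1)-3), consider the smallest deep linear network: a scalar two-layer model with weights a and b fitting the target 1 under squared loss. This is a standard sketch consistent with the theorem, not an example taken from the paper:

$$
f(a,b) = (ab - 1)^2, \qquad
\nabla f(a,b) = \begin{pmatrix} 2b(ab-1) \\ 2a(ab-1) \end{pmatrix}, \qquad
\nabla^2 f(0,0) = \begin{pmatrix} 0 & -2 \\ -2 & 0 \end{pmatrix}.
$$

The Hessian at the origin has eigenvalues ±2, so f is neither convex nor concave (statement 1). The critical points are the hyperbola ab = 1, where f = 0, so every local minimum is global (statement 2), and the origin, where f = 1 and the negative Hessian eigenvalue makes it a saddle point (statement 3). Adding a layer, f(a,b,c) = (abc - 1)^2 has a critical point at the origin whose Hessian is the zero matrix: a saddle point with no negative eigenvalue, which is the qualitative difference between shallow and deeper networks in statement 4).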


Citations
Journal Article

Geometric Deep Learning: Going beyond Euclidean data

TL;DR: In many applications, such geometric data are large and complex (in the case of social networks, on the scale of billions), making them natural targets for machine-learning techniques.
Posted Content

Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI.

TL;DR: Previous efforts to define explainability in Machine Learning are summarized, a novel definition is established that covers prior conceptual propositions with a major focus on the audience for which explainability is sought, and a taxonomy of recent contributions related to the explainability of different Machine Learning models is proposed.
Journal Article

Survey on deep learning with class imbalance

TL;DR: An examination of existing deep learning techniques for addressing class-imbalanced data finds that research in this area is still limited, that most existing work focuses on computer vision tasks with convolutional neural networks, and that the effects of big data are rarely considered.
Posted Content

Evolution Strategies as a Scalable Alternative to Reinforcement Learning.

TL;DR: This work explores the use of Evolution Strategies (ES), a class of black-box optimization algorithms, as an alternative to popular MDP-based RL techniques such as Q-learning and Policy Gradients, and highlights several advantages of ES as a black-box optimization technique.
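To make the ES update concrete, here is a minimal sketch of the search-gradient estimator this line of work builds on: perturb the parameters with Gaussian noise, evaluate returns, and step along the return-weighted perturbations. The toy objective and all hyperparameters below are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

# Minimal sketch of an Evolution Strategies (ES) update: estimate a search
# gradient from the returns of randomly perturbed parameter vectors.
# The toy objective and hyperparameters are illustrative assumptions.

def fitness(theta):
    # Stand-in for an episode return: negative squared distance to a target.
    target = np.array([1.0, -2.0, 0.5])
    return -np.sum((theta - target) ** 2)

rng = np.random.default_rng(0)
theta = np.zeros(3)                  # current policy parameters
sigma, alpha, n_pop = 0.1, 0.01, 50  # noise scale, step size, population size

for step in range(300):
    eps = rng.standard_normal((n_pop, theta.size))       # perturbations
    returns = np.array([fitness(theta + sigma * e) for e in eps])
    # Normalize returns so the update is invariant to reward scaling.
    advantages = (returns - returns.mean()) / (returns.std() + 1e-8)
    # Search-gradient estimate: return-weighted average of perturbations.
    theta += alpha / (n_pop * sigma) * eps.T @ advantages

print(theta)  # approaches [1.0, -2.0, 0.5]
```

In the paper's setting, this loop is distributed across many workers that exchange only random seeds and scalar returns, which is what makes the approach scalable.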