Open AccessProceedings Article
Deep Learning without Poor Local Minima
Kenji Kawaguchi
- Vol. 29, pp 586-594
TLDR
This paper proves a conjecture published in 1989 and partially addresses an open problem announced at the Conference on Learning Theory (COLT) 2015, and presents an instance for which it can answer the following question: how difficult is it to directly train a deep model in theory?Abstract:
In this paper, we prove a conjecture published in 1989 and also partially address an open problem announced at the Conference on Learning Theory (COLT) 2015. For an expected loss function of a deep nonlinear neural network, we prove the following statements under the independence assumption adopted from recent work: 1) the function is non-convex and non-concave, 2) every local minimum is a global minimum, 3) every critical point that is not a global minimum is a saddle point, and 4) the property of saddle points differs for shallow networks (with three layers) and deeper networks (with more than three layers). Moreover, we prove that the same four statements hold for deep linear neural networks with any depth, any widths and no unrealistic assumptions. As a result, we present an instance, for which we can answer to the following question: how difficult to directly train a deep model in theory? It is more difficult than the classical machine learning models (because of the non-convexity), but not too difficult (because of the nonexistence of poor local minima and the property of the saddle points). We note that even though we have advanced the theoretical foundations of deep learning, there is still a gap between theory and practice.read more
Citations
More filters
Journal ArticleDOI
Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI
Alejandro Barredo Arrieta,Natalia Díaz-Rodríguez,Javier Del Ser,Javier Del Ser,Adrien Bennetot,Adrien Bennetot,Siham Tabik,Alberto Barbado,Salvador García,Sergio Gil-Lopez,Daniel Molina,Richard Benjamins,Raja Chatila,Francisco Herrera +13 more
TL;DR: In this paper, a taxonomy of recent contributions related to explainability of different machine learning models, including those aimed at explaining Deep Learning methods, is presented, and a second dedicated taxonomy is built and examined in detail.
Journal ArticleDOI
Geometric Deep Learning: Going beyond Euclidean data
TL;DR: In many applications, such geometric data are large and complex (in the case of social networks, on the scale of billions) and are natural targets for machine-learning techniques as mentioned in this paper.
Posted Content
Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI.
Alejandro Barredo Arrieta,Natalia Díaz-Rodríguez,Javier Del Ser,Javier Del Ser,Adrien Bennetot,Adrien Bennetot,Siham Tabik,Alberto Barbado,Salvador García,Sergio Gil-Lopez,Daniel Molina,Richard Benjamins,Raja Chatila,Francisco Herrera +13 more
TL;DR: Previous efforts to define explainability in Machine Learning are summarized, establishing a novel definition that covers prior conceptual propositions with a major focus on the audience for which explainability is sought, and a taxonomy of recent contributions related to the explainability of different Machine Learning models are proposed.
Journal ArticleDOI
Survey on deep learning with class imbalance
TL;DR: Examination of existing deep learning techniques for addressing class imbalanced data finds that research in this area is very limited, that most existing work focuses on computer vision tasks with convolutional neural networks, and that the effects of big data are rarely considered.
Posted Content
Evolution Strategies as a Scalable Alternative to Reinforcement Learning.
TL;DR: This work explores the use of Evolution Strategies (ES), a class of black box optimization algorithms, as an alternative to popular MDP-based RL techniques such as Q-learning and Policy Gradients, and highlights several advantages of ES as a blackbox optimization technique.