Optimization Methods for Large-Scale Machine Learning
TLDR
The authors provide a review and commentary on the past, present, and future of numerical optimization algorithms in the context of machine learning applications and discuss how optimization problems arise in machine learning and what makes them challenging.
Abstract:
This paper provides a review and commentary on the past, present, and future of numerical optimization algorithms in the context of machine learning applications. Through case studies on text classification and the training of deep neural networks, we discuss how optimization problems arise in machine learning and what makes them challenging. A major theme of our study is that large-scale machine learning represents a distinctive setting in which the stochastic gradient (SG) method has traditionally played a central role while conventional gradient-based nonlinear optimization techniques typically falter. Based on this viewpoint, we present a comprehensive theory of a straightforward, yet versatile SG algorithm, discuss its practical behavior, and highlight opportunities for designing algorithms with improved performance. This leads to a discussion about the next generation of optimization methods for large-scale machine learning, including an investigation of two main streams of research on techniques th…
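For context, the stochastic gradient (SG) method that the abstract identifies as central to large-scale learning can be sketched in a few lines. This is a generic illustration, not code from the paper; the least-squares loss, the synthetic data, and all names here are assumptions made for the example.

```python
import numpy as np

def sgd(grad, w0, data, lr=0.01, epochs=5, seed=0):
    """Minimal stochastic gradient method: one sampled gradient per step."""
    rng = np.random.default_rng(seed)
    w = np.asarray(w0, dtype=float)
    for _ in range(epochs):
        for i in rng.permutation(len(data)):
            # Step along the gradient of a single sample's loss,
            # rather than the full-sum gradient used by batch methods.
            w -= lr * grad(w, data[i])
    return w

# Example: scalar least-squares regression on noiseless pairs (x, 2x).
rng = np.random.default_rng(1)
samples = [(x, 2.0 * x) for x in rng.normal(size=200)]
# Gradient of the per-sample loss (w*x - y)^2 with respect to w.
grad = lambda w, s: 2.0 * (w * s[0] - s[1]) * s[0]
w_hat = sgd(grad, 0.0, samples, lr=0.05, epochs=10)
```

On this toy problem the iterate `w_hat` approaches the true slope 2.0; the per-sample updates are cheap but noisy, which is the trade-off the paper's theory analyzes.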
Citations
Journal ArticleDOI
Generalizing from a Few Examples: A Survey on Few-shot Learning
TL;DR: A thorough survey to fully understand Few-shot Learning (FSL), which categorizes FSL methods from three perspectives: data, which uses prior knowledge to augment the supervised experience; model, which uses prior knowledge to reduce the size of the hypothesis space; and algorithm, which uses prior knowledge to alter the search for the best hypothesis in the given hypothesis space.
Posted Content
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
Nitish Shirish Keskar,Dheevatsa Mudigere,Jorge Nocedal,Mikhail Smelyanskiy,Ping Tak Peter Tang +4 more
TL;DR: The authors investigate the cause of the generalization drop in the large-batch regime and present numerical evidence supporting the view that large-batch methods tend to converge to sharp minima of the training and testing functions.
Journal ArticleDOI
Model reduction of dynamical systems on nonlinear manifolds using deep convolutional autoencoders
Kookjin Lee,Kevin Carlberg +1 more
TL;DR: The method is shown to significantly outperform even the optimal linear-subspace ROM on benchmark advection-dominated problems, demonstrating its ability to overcome the intrinsic $n$-width limitations of linear subspaces.
Journal ArticleDOI
Solving inverse problems using data-driven models
TL;DR: This survey paper aims to give an account of some of the main contributions in data-driven inverse problems.