scispace - formally typeset
Open AccessPosted Content

Practical Bayesian Optimization of Machine Learning Algorithms

Reads0
Chats0
TLDR
In this paper, a learning algorithm's generalization performance is modeled as a sample from a Gaussian process and the tractable posterior distribution induced by the GP leads to efficient use of the information gathered by previous experiments, enabling optimal choices about what parameters to try next.
Abstract
Machine learning algorithms frequently require careful tuning of model hyperparameters, regularization terms, and optimization parameters. Unfortunately, this tuning is often a "black art" that requires expert experience, unwritten rules of thumb, or sometimes brute-force search. Much more appealing is the idea of developing automatic approaches which can optimize the performance of a given learning algorithm to the task at hand. In this work, we consider the automatic tuning problem within the framework of Bayesian optimization, in which a learning algorithm's generalization performance is modeled as a sample from a Gaussian process (GP). The tractable posterior distribution induced by the GP leads to efficient use of the information gathered by previous experiments, enabling optimal choices about what parameters to try next. Here we show how the effects of the Gaussian process prior and the associated inference procedure can have a large impact on the success or failure of Bayesian optimization. We show that thoughtful choices can lead to results that exceed expert-level performance in tuning machine learning algorithms. We also describe new algorithms that take into account the variable cost (duration) of learning experiments and that can leverage the presence of multiple cores for parallel experimentation. We show that these proposed algorithms improve on previous automatic procedures and can reach or surpass human expert-level optimization on a diverse set of contemporary algorithms including latent Dirichlet allocation, structured SVMs and convolutional neural networks.

read more

Citations
More filters
Journal ArticleDOI

A survey on deep learning in medical image analysis

TL;DR: This paper reviews the major deep learning concepts pertinent to medical image analysis and summarizes over 300 contributions to the field, most of which appeared in the last year, to survey the use of deep learning for image classification, object detection, segmentation, registration, and other tasks.
Posted Content

SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size

TL;DR: This work proposes a small DNN architecture called SqueezeNet, which achieves AlexNet-level accuracy on ImageNet with 50x fewer parameters and is able to compress to less than 0.5MB (510x smaller than AlexNet).
Posted Content

Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

TL;DR: This paper proposed an attention-based model that automatically learns to describe the content of images by focusing on salient objects while generating corresponding words in the output sequence, which achieved state-of-the-art performance on three benchmark datasets: Flickr8k, Flickr30k and MS COCO.
Journal ArticleDOI

Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations

TL;DR: In this article, the authors introduce physics-informed neural networks, which are trained to solve supervised learning tasks while respecting any given laws of physics described by general nonlinear partial differential equations.
Posted Content

Neural Architecture Search with Reinforcement Learning

Barret Zoph, +1 more
- 05 Nov 2016 - 
TL;DR: This paper uses a recurrent network to generate the model descriptions of neural networks and trains this RNN with reinforcement learning to maximize the expected accuracy of the generated architectures on a validation set.
Related Papers (5)