Making a Science of Model Search: Hyperparameter Optimization in Hundreds of Dimensions for Vision Architectures

Open Access, Proceedings Article

pp. 115-123

TLDR

This work proposes a meta-modeling approach to support automated hyperparameter optimization, with the goal of providing practical tools that replace hand-tuning with a reproducible and unbiased optimization process.

Abstract:

Many computer vision algorithms depend on configuration settings that are typically hand-tuned in the course of evaluating the algorithm for a particular data set. While such parameter tuning is often presented as being incidental to the algorithm, correctly setting these parameters is frequently critical to realizing a method's full potential. Compounding matters, these parameters often must be re-tuned when the algorithm is applied to a new problem domain, and the tuning process itself often depends on personal experience and intuition in ways that are hard to quantify or describe. Since the performance of a given technique depends on both the fundamental quality of the algorithm and the details of its tuning, it is sometimes difficult to know whether a given technique is genuinely better, or simply better tuned. In this work, we propose a meta-modeling approach to support automated hyperparameter optimization, with the goal of providing practical tools that replace hand-tuning with a reproducible and unbiased optimization process. Our approach is to expose the underlying expression graph of how a performance metric (e.g., classification accuracy on validation examples) is computed from hyperparameters that govern not only how individual processing steps are applied, but even which processing steps are included. A hyperparameter optimization algorithm transforms this graph into a program for optimizing that performance metric. Our approach yields state-of-the-art results on three disparate computer vision problems: a face-matching verification task (LFW), a face identification task (PubFig83), and an object recognition task (CIFAR-10), using a single broad class of feed-forward vision architectures.
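The idea of a search space in which hyperparameters govern "even which processing steps are included" can be illustrated with a small sketch. Everything here is an assumption made for illustration: the space description format, the names `SPACE`, `sample`, `toy_validation_loss`, and `random_search`, and the toy objective are hypothetical, and plain random search stands in for whatever optimization algorithm is actually applied to the graph. The point is only that a choice node selects a branch, and hyperparameters belonging to unselected steps are never drawn.

```python
import math
import random

# Hypothetical search-space description (not the authors' format):
#   ("uniform", lo, hi)     -> real value drawn uniformly
#   ("loguniform", lo, hi)  -> real value drawn uniformly in log space
#   ("choice", [options])   -> pick one option, then sample inside it
#   dict                    -> sample every entry
#   anything else           -> constant
SPACE = {
    "learning_rate": ("loguniform", 1e-4, 1e-1),
    "normalization": ("choice", [
        {"kind": "none"},
        {"kind": "local_contrast", "radius": ("choice", [3, 5, 7])},
    ]),
}


def sample(space, rng):
    """Recursively draw one configuration from a nested space description."""
    if isinstance(space, dict):
        return {key: sample(value, rng) for key, value in space.items()}
    if isinstance(space, tuple):
        kind = space[0]
        if kind == "uniform":
            return rng.uniform(space[1], space[2])
        if kind == "loguniform":
            return math.exp(rng.uniform(math.log(space[1]), math.log(space[2])))
        if kind == "choice":
            # Only the chosen branch is sampled, so hyperparameters of
            # excluded processing steps never appear in the configuration.
            return sample(rng.choice(space[1]), rng)
        raise ValueError(f"unknown node type: {kind!r}")
    return space  # constant leaf


def toy_validation_loss(config):
    """Stand-in for training a model and measuring validation error."""
    loss = (math.log10(config["learning_rate"]) + 2.0) ** 2
    norm = config["normalization"]
    if norm["kind"] == "local_contrast":
        loss += 0.1 * abs(norm["radius"] - 5)
    else:
        loss += 0.5
    return loss


def random_search(space, objective, n_trials, seed=0):
    """Baseline optimizer: keep the best of n_trials independent samples."""
    rng = random.Random(seed)
    best_config, best_loss = None, float("inf")
    for _ in range(n_trials):
        config = sample(space, rng)
        loss = objective(config)
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss


if __name__ == "__main__":
    config, loss = random_search(SPACE, toy_validation_loss, n_trials=50)
    print(config, loss)
```

Because the space is data rather than code scattered through a training script, a more sophisticated optimizer could be swapped in for `random_search` without touching the objective, which is the kind of separation the abstract argues for.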
