Open Access | Posted Content
Peephole: Predicting Network Performance Before Training
Boyang Deng, Junjie Yan, Dahua Lin
TL;DR: A unified way to encode individual layers into vectors and bring them together to form an integrated description via LSTM, taking advantage of the recurrent network's strong expressive power, can reliably predict the performances of various network architectures.

Abstract: The quest for performant networks has been a significant force that drives the advancements of deep learning in recent years. While rewarding, improving network design has never been an easy journey. The large design space combined with the tremendous cost required for network training poses a major obstacle to this endeavor. In this work, we propose a new approach to this problem, namely, predicting the performance of a network before training, based on its architecture. Specifically, we develop a unified way to encode individual layers into vectors and bring them together to form an integrated description via LSTM. Taking advantage of the recurrent network's strong expressive power, this method can reliably predict the performances of various network architectures. Our empirical studies showed that it not only achieved accurate predictions but also produced consistent rankings across datasets -- a key desideratum in performance prediction.
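To make the idea concrete, here is a minimal sketch of an LSTM-based performance predictor in this spirit, written in PyTorch. The layer features used here (a type code plus two scalar attributes) and all module names are illustrative assumptions, not the authors' actual encoding scheme.

```python
import torch
import torch.nn as nn

class PeepholeStylePredictor(nn.Module):
    """Sketch of an LSTM-based accuracy predictor in the spirit of Peephole:
    each layer of a candidate network is embedded into a vector, the sequence
    of layer vectors is summarized by an LSTM, and a regression head maps the
    final state to a predicted accuracy. Feature choices are illustrative."""

    def __init__(self, num_layer_types: int, embed_dim: int = 32, hidden_dim: int = 64):
        super().__init__()
        self.type_embed = nn.Embedding(num_layer_types, embed_dim)
        # Concatenate the type embedding with scalar layer attributes
        # (e.g. kernel size, width ratio) before feeding the LSTM.
        self.lstm = nn.LSTM(embed_dim + 2, hidden_dim, batch_first=True)
        self.head = nn.Sequential(nn.Linear(hidden_dim, 1), nn.Sigmoid())

    def forward(self, layer_types, layer_attrs):
        # layer_types: (batch, seq_len) integer codes, e.g. conv/pool/fc
        # layer_attrs: (batch, seq_len, 2) scalar attributes per layer
        x = torch.cat([self.type_embed(layer_types), layer_attrs], dim=-1)
        _, (h_n, _) = self.lstm(x)
        return self.head(h_n[-1]).squeeze(-1)  # predicted accuracy in [0, 1]

# Toy usage: two 4-layer architectures, 5 possible layer types.
model = PeepholeStylePredictor(num_layer_types=5)
types = torch.randint(0, 5, (2, 4))
attrs = torch.rand(2, 4, 2)
print(model(types, attrs).shape)  # torch.Size([2])
```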
Citations
Journal Article
AutoML: A survey of the state-of-the-art
Xin He, Kaiyong Zhao, Xiaowen Chu
TL;DR: A comprehensive and up-to-date review of the state-of-the-art (SOTA) in AutoML methods, organized along the AutoML pipeline: data preparation, feature engineering, hyperparameter optimization, and neural architecture search (NAS).
Posted Content
Efficient Neural Architecture Search via Parameter Sharing
TL;DR: Efficient Neural Architecture Search (ENAS) is a fast and inexpensive approach to automatic model design that establishes a new state-of-the-art among methods without post-training processing and delivers strong empirical performance using far fewer GPU-hours.
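A minimal sketch of the parameter-sharing idea, assuming a PyTorch setting: every candidate operation at a given position keeps one persistent set of weights, which each sampled child model reuses instead of training from scratch. The op set here is illustrative, not ENAS's actual search space.

```python
import torch
import torch.nn as nn

class SharedOps(nn.Module):
    """Weight-sharing in the spirit of ENAS: candidate operations own one
    persistent set of weights, and each sampled child architecture indexes
    into this shared bank rather than training fresh parameters."""

    def __init__(self, channels: int):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.Conv2d(channels, channels, 5, padding=2),
            nn.MaxPool2d(3, stride=1, padding=1),
        ])

    def forward(self, x, op_index: int):
        # A controller would sample op_index; the chosen op's weights are
        # shared across all child models that pick it.
        return self.ops[op_index](x)

layer = SharedOps(channels=16)
x = torch.randn(1, 16, 8, 8)
print(layer(x, op_index=0).shape)  # shape preserved: (1, 16, 8, 8)
```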
Proceedings Article
Neural Architecture Optimization
TL;DR: Neural architecture optimization (NAO) as discussed by the authors is a method for automatic neural architecture design based on continuous optimization, where an encoder embeds neural network architectures into a continuous space and a decoder maps a continuous representation of a network back to its architecture.
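A toy-scale sketch of that continuous-optimization loop: encode a discrete architecture into an embedding, take a gradient-ascent step on a performance predictor's score in embedding space, then decode back to discrete operations. All module sizes and the op vocabulary are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Sketch of a NAO-style loop: encoder -> predictor gradient step -> decoder.
VOCAB, DIM = 5, 16
embed = nn.Embedding(VOCAB, DIM)
encoder = nn.LSTM(DIM, DIM, batch_first=True)
predictor = nn.Linear(DIM, 1)
decoder_head = nn.Linear(DIM, VOCAB)  # maps states back to op logits

arch = torch.tensor([[0, 2, 1, 3]])          # one toy architecture
states, _ = encoder(embed(arch))             # (1, 4, DIM) continuous code
z = states.detach().requires_grad_(True)

score = predictor(z).mean()                  # predicted performance
score.backward()
z_new = z + 0.1 * z.grad                     # one gradient-ascent step

new_arch = decoder_head(z_new).argmax(-1)    # decode to discrete ops
print(new_arch)
```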
Posted Content
Taking Human out of Learning Applications: A Survey on Automated Machine Learning
Quanming Yao, Mengshuo Wang, Hugo Jair Escalante, Isabelle Guyon, Yi-Qi Hu, Yu-Feng Li, Wei-Wei Tu, Qiang Yang, Yang Yu
TL;DR: An up-to-date survey on AutoML that proposes a general AutoML framework which not only covers most existing approaches to date but can also guide the design of new methods.
Posted Content
SNAS: Stochastic Neural Architecture Search
TL;DR: It is proved that this search gradient optimizes the same objective as reinforcement-learning-based NAS but assigns credit to structural decisions more efficiently; the search is further augmented with a locally decomposable reward to enforce a resource-efficiency constraint.
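A minimal sketch of the stochastic, differentiable op selection SNAS builds on: architecture logits are relaxed with Gumbel-softmax, so the choice among candidate operations is sampled yet gradients still flow to the structural parameters. Candidate ops and the temperature are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StochasticMixedOp(nn.Module):
    """SNAS-style stochastic op selection: a Gumbel-softmax relaxation over
    architecture logits makes the sampled choice differentiable, so one
    backward pass updates both weights and structural decisions."""

    def __init__(self, channels: int):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.Conv2d(channels, channels, 1),
            nn.Identity(),
        ])
        self.arch_logits = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x):
        # Soft one-hot sample; gradients reach arch_logits via the relaxation.
        weights = F.gumbel_softmax(self.arch_logits, tau=1.0, hard=False)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

op = StochasticMixedOp(channels=8)
y = op(torch.randn(1, 8, 4, 4))
y.sum().backward()
print(op.arch_logits.grad)  # structural decisions receive credit directly
```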
References
Proceedings Article
Deep Residual Learning for Image Recognition
TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously; this framework won 1st place in the ILSVRC 2015 classification task.
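The core construct is small enough to show directly. Below is a minimal residual block, a sketch of the idea rather than the exact paper configuration: the stacked convolutions learn a residual F(x) that is added back to an identity shortcut.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Minimal residual block: output = ReLU(F(x) + x), where the identity
    shortcut eases optimization of very deep networks."""

    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.body(x) + x)  # F(x) + x

block = ResidualBlock(channels=16)
print(block(torch.randn(1, 16, 8, 8)).shape)  # (1, 16, 8, 8)
```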
Proceedings Article
ImageNet Classification with Deep Convolutional Neural Networks
TL;DR: A deep convolutional neural network consisting of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax, which achieved state-of-the-art performance on ImageNet classification.
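For reference, a sketch of the architecture that summary describes: five convolutions, interleaved max-pooling, and three fully-connected layers ending in 1000 classes. Channel counts follow the original paper; details such as local response normalization, dropout, and the two-GPU split are omitted.

```python
import torch
import torch.nn as nn

# AlexNet-style stack (illustrative simplification of the original).
alexnet_style = nn.Sequential(
    nn.Conv2d(3, 96, 11, stride=4), nn.ReLU(), nn.MaxPool2d(3, 2),
    nn.Conv2d(96, 256, 5, padding=2), nn.ReLU(), nn.MaxPool2d(3, 2),
    nn.Conv2d(256, 384, 3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 384, 3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 256, 3, padding=1), nn.ReLU(), nn.MaxPool2d(3, 2),
    nn.Flatten(),
    nn.Linear(256 * 6 * 6, 4096), nn.ReLU(),
    nn.Linear(4096, 4096), nn.ReLU(),
    nn.Linear(4096, 1000),  # softmax is applied by the loss at training time
)
print(alexnet_style(torch.randn(1, 3, 227, 227)).shape)  # (1, 1000)
```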
Journal ArticleDOI
Long short-term memory
TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
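The "constant error carousel" is the additively updated cell state, through which gradients flow without attenuation. For reference, the gate equations in the now-standard formulation (the forget gate was introduced in a later refinement of the 1997 design):

```latex
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)}\\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)}\\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)}\\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c)\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(constant error carousel)}\\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```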
Proceedings Article
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan, Andrew Zisserman
TL;DR: This work investigates the effect of convolutional network depth on accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, and shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
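The key observation in miniature, shown as a sketch below: two stacked 3x3 convolutions cover the same 5x5 receptive field as a single 5x5 convolution, with fewer parameters and an extra nonlinearity in between. The channel count is illustrative.

```python
import torch
import torch.nn as nn

# VGG-style small-filter stacking versus one larger filter.
c = 64
stacked_3x3 = nn.Sequential(
    nn.Conv2d(c, c, 3, padding=1), nn.ReLU(),
    nn.Conv2d(c, c, 3, padding=1), nn.ReLU(),
)
single_5x5 = nn.Sequential(nn.Conv2d(c, c, 5, padding=2), nn.ReLU())

def count(m):
    return sum(p.numel() for p in m.parameters())

# Same 5x5 receptive field, fewer parameters: 2*(9c^2 + c) < 25c^2 + c.
print(count(stacked_3x3), count(single_5x5))  # 73856 102464
```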