Open Access | Posted Content
Peephole: Predicting Network Performance Before Training
Boyang Deng, Junjie Yan, Dahua Lin
TL;DR: A unified way to encode individual layers into vectors and bring them together to form an integrated description via LSTM, taking advantage of the recurrent network's strong expressive power, can reliably predict the performances of various network architectures.

Abstract: The quest for performant networks has been a significant force that drives the advancements of deep learning in recent years. While rewarding, improving network design has never been an easy journey. The large design space combined with the tremendous cost required for network training poses a major obstacle to this endeavor. In this work, we propose a new approach to this problem, namely, predicting the performance of a network before training, based on its architecture. Specifically, we develop a unified way to encode individual layers into vectors and bring them together to form an integrated description via LSTM. Taking advantage of the recurrent network's strong expressive power, this method can reliably predict the performances of various network architectures. Our empirical studies showed that it not only achieved accurate predictions but also produced consistent rankings across datasets -- a key desideratum in performance prediction.
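To make the idea concrete, here is a minimal sketch of an LSTM-based performance predictor in this spirit, written in PyTorch. The layer features used here (a type code plus two scalar attributes) and all module names are illustrative assumptions, not the authors' actual encoding scheme.

```python
import torch
import torch.nn as nn

class PeepholeStylePredictor(nn.Module):
    """Sketch of an LSTM-based accuracy predictor in the spirit of Peephole:
    each layer of a candidate network is embedded into a vector, the sequence
    of layer vectors is summarized by an LSTM, and a regression head maps the
    final state to a predicted accuracy. Feature choices are illustrative."""

    def __init__(self, num_layer_types: int, embed_dim: int = 32, hidden_dim: int = 64):
        super().__init__()
        self.type_embed = nn.Embedding(num_layer_types, embed_dim)
        # Concatenate the type embedding with scalar layer attributes
        # (e.g. kernel size, width ratio) before feeding the LSTM.
        self.lstm = nn.LSTM(embed_dim + 2, hidden_dim, batch_first=True)
        self.head = nn.Sequential(nn.Linear(hidden_dim, 1), nn.Sigmoid())

    def forward(self, layer_types, layer_attrs):
        # layer_types: (batch, seq_len) integer codes, e.g. conv/pool/fc
        # layer_attrs: (batch, seq_len, 2) scalar attributes per layer
        x = torch.cat([self.type_embed(layer_types), layer_attrs], dim=-1)
        _, (h_n, _) = self.lstm(x)
        return self.head(h_n[-1]).squeeze(-1)  # predicted accuracy in [0, 1]

# Toy usage: two 4-layer architectures, 5 possible layer types.
model = PeepholeStylePredictor(num_layer_types=5)
types = torch.randint(0, 5, (2, 4))
attrs = torch.rand(2, 4, 2)
print(model(types, attrs).shape)  # torch.Size([2])
```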
Citations
Journal Article
AutoML: A survey of the state-of-the-art
Xin He, Kaiyong Zhao, Xiaowen Chu
TL;DR: A comprehensive and up-to-date review of the state-of-the-art (SOTA) in AutoML methods, organized along the AutoML pipeline: data preparation, feature engineering, hyperparameter optimization, and neural architecture search (NAS).
Posted Content
Efficient Neural Architecture Search via Parameter Sharing
TL;DR: Efficient Neural Architecture Search (ENAS) is a fast and inexpensive approach to automatic model design that establishes a new state-of-the-art among methods without post-training processing and delivers strong empirical performance using far fewer GPU-hours.
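A minimal sketch of the parameter-sharing idea, assuming a PyTorch setting: every candidate operation at a given position keeps one persistent set of weights, which each sampled child model reuses instead of training from scratch. The op set here is illustrative, not ENAS's actual search space.

```python
import torch
import torch.nn as nn

class SharedOps(nn.Module):
    """Weight-sharing in the spirit of ENAS: candidate operations own one
    persistent set of weights, and each sampled child architecture indexes
    into this shared bank rather than training fresh parameters."""

    def __init__(self, channels: int):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.Conv2d(channels, channels, 5, padding=2),
            nn.MaxPool2d(3, stride=1, padding=1),
        ])

    def forward(self, x, op_index: int):
        # A controller would sample op_index; the chosen op's weights are
        # shared across all child models that pick it.
        return self.ops[op_index](x)

layer = SharedOps(channels=16)
x = torch.randn(1, 16, 8, 8)
print(layer(x, op_index=0).shape)  # shape preserved: (1, 16, 8, 8)
```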
Proceedings Article
Neural Architecture Optimization
TL;DR: Neural architecture optimization (NAO) as discussed by the authors is a method for automatic neural architecture design based on continuous optimization, where an encoder embeds neural network architectures into a continuous space and a decoder maps a continuous representation of a network back to its architecture.
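A toy-scale sketch of that continuous-optimization loop: encode a discrete architecture into an embedding, take a gradient-ascent step on a performance predictor's score in embedding space, then decode back to discrete operations. All module sizes and the op vocabulary are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Sketch of a NAO-style loop: encoder -> predictor gradient step -> decoder.
VOCAB, DIM = 5, 16
embed = nn.Embedding(VOCAB, DIM)
encoder = nn.LSTM(DIM, DIM, batch_first=True)
predictor = nn.Linear(DIM, 1)
decoder_head = nn.Linear(DIM, VOCAB)  # maps states back to op logits

arch = torch.tensor([[0, 2, 1, 3]])          # one toy architecture
states, _ = encoder(embed(arch))             # (1, 4, DIM) continuous code
z = states.detach().requires_grad_(True)

score = predictor(z).mean()                  # predicted performance
score.backward()
z_new = z + 0.1 * z.grad                     # one gradient-ascent step

new_arch = decoder_head(z_new).argmax(-1)    # decode to discrete ops
print(new_arch)
```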
Posted Content
Taking Human out of Learning Applications: A Survey on Automated Machine Learning
Quanming Yao, Mengshuo Wang, Hugo Jair Escalante, Isabelle Guyon, Yi-Qi Hu, Yu-Feng Li, Wei-Wei Tu, Qiang Yang, Yang Yu
TL;DR: An up-to-date survey on AutoML that proposes a general AutoML framework which not only covers most existing approaches to date but can also guide the design of new methods.
Posted Content
SNAS: Stochastic Neural Architecture Search
TL;DR: It is proved that this search gradient optimizes the same objective as reinforcement-learning-based NAS but assigns credit to structural decisions more efficiently; the search is further augmented with a locally decomposable reward to enforce a resource-efficiency constraint.
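A minimal sketch of the stochastic, differentiable op selection SNAS builds on: architecture logits are relaxed with Gumbel-softmax, so the choice among candidate operations is sampled yet gradients still flow to the structural parameters. Candidate ops and the temperature are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StochasticMixedOp(nn.Module):
    """SNAS-style stochastic op selection: a Gumbel-softmax relaxation over
    architecture logits makes the sampled choice differentiable, so one
    backward pass updates both weights and structural decisions."""

    def __init__(self, channels: int):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.Conv2d(channels, channels, 1),
            nn.Identity(),
        ])
        self.arch_logits = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x):
        # Soft one-hot sample; gradients reach arch_logits via the relaxation.
        weights = F.gumbel_softmax(self.arch_logits, tau=1.0, hard=False)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

op = StochasticMixedOp(channels=8)
y = op(torch.randn(1, 8, 4, 4))
y.sum().backward()
print(op.arch_logits.grad)  # structural decisions receive credit directly
```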
References
Proceedings Article
Deep Residual Learning for Image Recognition
TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously; this framework won 1st place in the ILSVRC 2015 classification task.
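The core construct is small enough to show directly. Below is a minimal residual block, a sketch of the idea rather than the exact paper configuration: the stacked convolutions learn a residual F(x) that is added back to an identity shortcut.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Minimal residual block: output = ReLU(F(x) + x), where the identity
    shortcut eases optimization of very deep networks."""

    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.body(x) + x)  # F(x) + x

block = ResidualBlock(channels=16)
print(block(torch.randn(1, 16, 8, 8)).shape)  # (1, 16, 8, 8)
```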
Proceedings Article
ImageNet Classification with Deep Convolutional Neural Networks
TL;DR: A deep convolutional neural network consisting of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax, which achieved state-of-the-art performance on ImageNet classification.
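For reference, a sketch of the architecture that summary describes: five convolutions, interleaved max-pooling, and three fully-connected layers ending in 1000 classes. Channel counts follow the original paper; details such as local response normalization, dropout, and the two-GPU split are omitted.

```python
import torch
import torch.nn as nn

# AlexNet-style stack (illustrative simplification of the original).
alexnet_style = nn.Sequential(
    nn.Conv2d(3, 96, 11, stride=4), nn.ReLU(), nn.MaxPool2d(3, 2),
    nn.Conv2d(96, 256, 5, padding=2), nn.ReLU(), nn.MaxPool2d(3, 2),
    nn.Conv2d(256, 384, 3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 384, 3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 256, 3, padding=1), nn.ReLU(), nn.MaxPool2d(3, 2),
    nn.Flatten(),
    nn.Linear(256 * 6 * 6, 4096), nn.ReLU(),
    nn.Linear(4096, 4096), nn.ReLU(),
    nn.Linear(4096, 1000),  # softmax is applied by the loss at training time
)
print(alexnet_style(torch.randn(1, 3, 227, 227)).shape)  # (1, 1000)
```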
Journal ArticleDOI
Long short-term memory
TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
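The "constant error carousel" is the additively updated cell state, through which gradients flow without attenuation. For reference, the gate equations in the now-standard formulation (the forget gate was introduced in a later refinement of the 1997 design):

```latex
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)}\\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)}\\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)}\\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c)\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(constant error carousel)}\\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```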
Proceedings Article
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan, Andrew Zisserman
TL;DR: This work investigates the effect of convolutional network depth on accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, and shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
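The key observation in miniature, shown as a sketch below: two stacked 3x3 convolutions cover the same 5x5 receptive field as a single 5x5 convolution, with fewer parameters and an extra nonlinearity in between. The channel count is illustrative.

```python
import torch
import torch.nn as nn

# VGG-style small-filter stacking versus one larger filter.
c = 64
stacked_3x3 = nn.Sequential(
    nn.Conv2d(c, c, 3, padding=1), nn.ReLU(),
    nn.Conv2d(c, c, 3, padding=1), nn.ReLU(),
)
single_5x5 = nn.Sequential(nn.Conv2d(c, c, 5, padding=2), nn.ReLU())

def count(m):
    return sum(p.numel() for p in m.parameters())

# Same 5x5 receptive field, fewer parameters: 2*(9c^2 + c) < 25c^2 + c.
print(count(stacked_3x3), count(single_5x5))  # 73856 102464
```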