Open Access · Journal Article
Dropout: a simple way to prevent neural networks from overfitting
TL;DR: It is shown that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.
Abstract:
Deep neural nets with a large number of parameters are very powerful machine learning systems. However, overfitting is a serious problem in such networks. Large networks are also slow to use, making it difficult to deal with overfitting by combining the predictions of many different large neural nets at test time. Dropout is a technique for addressing this problem. The key idea is to randomly drop units (along with their connections) from the neural network during training. This prevents units from co-adapting too much. During training, dropout samples from an exponential number of different "thinned" networks. At test time, it is easy to approximate the effect of averaging the predictions of all these thinned networks by simply using a single unthinned network that has smaller weights. This significantly reduces overfitting and gives major improvements over other regularization methods. We show that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.
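The train/test scheme the abstract describes can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's reference implementation; the helper names `dropout_train` and `dropout_test` and the parameter `p_keep` (the probability of retaining a unit) are assumptions for the example.

```python
import numpy as np

def dropout_train(h, p_keep, rng):
    # Training: zero each unit independently with probability 1 - p_keep,
    # sampling a fresh binary mask (a "thinned" network) per forward pass.
    mask = (rng.random(h.shape) < p_keep).astype(h.dtype)
    return h * mask

def dropout_test(h, p_keep):
    # Test time: no sampling; scale activations by p_keep so their
    # expected value matches the average over all thinned networks.
    return h * p_keep

rng = np.random.default_rng(0)
h = np.ones((4, 8))            # a batch of hidden activations
dropped = dropout_train(h, p_keep=0.5, rng=rng)
scaled = dropout_test(h, p_keep=0.5)
```

Scaling at test time is what lets a single unthinned network approximate the geometric-mean prediction of the exponentially many thinned networks sampled during training.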
Citations
Posted Content
Anticipating Visual Representations from Unlabeled Video
TL;DR: In this article, a framework that capitalizes on temporal structure in unlabeled video to learn to anticipate human actions and objects is presented. The task is challenging partly because it requires leveraging extensive knowledge of the world that is difficult to write down.
Journal Article (DOI)
Deep Neural Network Based Demand Side Short Term Load Forecasting
TL;DR: This paper proposes deep neural network (DNN)-based load forecasting models and applies them to a demand side empirical load database and shows that DNNs exhibit accurate and robust predictions compared to other forecasting models.
Posted Content
How Powerful are Graph Neural Networks?
TL;DR: This work characterizes the discriminative power of popular GNN variants, such as Graph Convolutional Networks and GraphSAGE, shows that they cannot learn to distinguish certain simple graph structures, and develops a simple architecture that is provably the most expressive among the class of GNNs.
Proceedings Article (DOI)
CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies
Daniel Zeman, Martin Popel, Milan Straka, Jan Hajič, Joakim Nivre, Filip Ginter, Juhani Luotolahti, Sampo Pyysalo, Slav Petrov, Martin Potthast, Francis M. Tyers, Elena Badmaeva, Memduh Gökırmak, Anna Nedoluzhko, Silvie Cinková, Jaroslava Hlaváčová, Václava Kettnerová, Zdenka Uresova, Jenna Kanerva, Stina Ojala, Anna Missilä, Christopher D. Manning, Sebastian Schuster, Siva Reddy, Dima Taji, Nizar Habash, Herman Leung, Marie-Catherine de Marneffe, Manuela Sanguinetti, Maria Simi, Hiroshi Kanayama, Valeria de Paiva, Kira Droganova, Héctor Martínez Alonso, Çağrı Çöltekin, Umut Sulubacak, Hans Uszkoreit, Vivien Macketanz, Aljoscha Burchardt, Kim Harris, Katrin Marheinecke, Georg Rehm, Tolga Kayadelen, Mohammed Attia, Ali Elkahky, Zhuoran Yu, Emily Pitler, Saran Lertpradit, Michael Mandl, Jesse Kirchner, Hector Fernandez Alcalde, Jana Strnadová, Esha Banerjee, Ruli Manurung, Antonio Stella, Atsuko Shimada, Sookyoung Kwak, Gustavo Mendonça, Tatiana Lando, Rattima Nitisaroj, Josie Li +60 more
TL;DR: This paper defines the task and evaluation methodology, describes how the data sets were prepared, reports and analyzes the main results, and provides a brief categorization of the different approaches of the participating systems.
Proceedings Article (DOI)
Semantic Visual Localization
TL;DR: In this paper, a joint 3D geometric and semantic understanding of the world is used for robust visual localization under a wide range of viewing conditions, enabling it to succeed under conditions where previous approaches failed.
References
Proceedings Article
ImageNet Classification with Deep Convolutional Neural Networks
TL;DR: A deep convolutional neural network, consisting of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax, achieved state-of-the-art classification performance.
Journal Article (DOI)
Regression Shrinkage and Selection via the Lasso
TL;DR: A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.
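For intuition about the lasso's constraint on the sum of absolute coefficient values: in the special case of an orthonormal design, the lasso estimate reduces to soft-thresholding the least-squares coefficients, which is what drives some coefficients exactly to zero. This is a sketch of that special case only, not the general solver; the names and the shrinkage level `lam` are illustrative assumptions.

```python
import numpy as np

def soft_threshold(b, lam):
    # Shrink each least-squares coefficient toward zero by lam;
    # coefficients with magnitude below lam become exactly zero.
    return np.sign(b) * np.maximum(np.abs(b) - lam, 0.0)

ols = np.array([3.0, -0.5, 0.2])   # hypothetical least-squares coefficients
shrunk = soft_threshold(ols, lam=1.0)
```

The exact zeros produced by this shrinkage are why the lasso performs variable selection as well as regularization.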
Journal Article (DOI)
Reducing the Dimensionality of Data with Neural Networks
TL;DR: In this article, an effective way of initializing the weights that allows deep autoencoder networks to learn low-dimensional codes that work much better than principal components analysis as a tool to reduce the dimensionality of data is described.
Journal Article (DOI)
A fast learning algorithm for deep belief nets
TL;DR: A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.
Dissertation
Learning Multiple Layers of Features from Tiny Images
TL;DR: In this paper, the authors describe how to train a multi-layer generative model of natural images, using a dataset of millions of tiny colour images.