Dropout: a simple way to prevent neural networks from overfitting
Citations
From a citing paper (3,472 citations) that cites background or methods from "Dropout: a simple way to prevent neural networks from overfitting":
...Dropout is used in many models in deep learning as a way to avoid over-fitting (Srivastava et al., 2014), and our interpretation suggests that dropout approximately integrates over the models’ weights....
[...]
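The dropout operation the excerpt above refers to fits in a few lines. A minimal NumPy sketch of the common "inverted" formulation, which is an assumption here: Srivastava et al. (2014) instead scale the weights by the retention probability at test time, and the two schemes give the same expected activations.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(x, p_drop=0.5, train=True):
    """Inverted dropout: zero each unit with probability p_drop during
    training and rescale the survivors by 1/(1 - p_drop), so the network
    can be used unchanged at test time."""
    if not train:
        return x
    mask = rng.random(x.shape) >= p_drop   # Bernoulli keep-mask
    return x * mask / (1.0 - p_drop)

h = rng.standard_normal((4, 8))            # a batch of hidden activations
h_train = dropout(h, p_drop=0.5)           # ~half the units zeroed, rest scaled by 2
h_test = dropout(h, train=False)           # identity at test time
```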
...Furthermore, our results carry to other variants of dropout as well (such as drop-connect (Wan et al., 2013), multiplicative Gaussian noise (Srivastava et al., 2014), etc.)....
[...]
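The two variants named in the excerpt differ only in the noise they apply: multiplicative Gaussian noise perturbs whole activations, while DropConnect masks individual weights. A hedged NumPy sketch of both; the function names and the inverted-style rescaling in drop_connect are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_dropout(x, p_drop=0.5):
    """Multiplicative Gaussian noise (Srivastava et al., 2014): multiply
    each activation by noise from N(1, sigma^2) with
    sigma^2 = p_drop / (1 - p_drop), matching the mean and variance of
    rescaled Bernoulli dropout."""
    sigma = np.sqrt(p_drop / (1.0 - p_drop))
    return x * rng.normal(1.0, sigma, size=x.shape)

def drop_connect(W, p_drop=0.5):
    """DropConnect (Wan et al., 2013): drop individual weights rather
    than whole units."""
    mask = rng.random(W.shape) >= p_drop
    return W * mask / (1.0 - p_drop)

x = rng.standard_normal((4, 8))
W = rng.standard_normal((8, 3))
y = gaussian_dropout(x) @ drop_connect(W)  # one stochastic forward pass
```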
...In this paper we give a complete theoretical treatment of the link between Gaussian processes and dropout, and develop the tools necessary to represent uncertainty in deep learning....
[...]
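The reading that dropout "approximately integrates over the models' weights" corresponds to Monte Carlo dropout: keep dropout active at test time, average several stochastic forward passes, and use their spread as an uncertainty estimate. A self-contained sketch of that procedure; the tiny two-layer network and its random weights are hypothetical stand-ins, not the cited paper's model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters; in practice these would come from training
# the network with dropout.
W1 = 0.5 * rng.standard_normal((8, 16)); b1 = np.zeros(16)
W2 = 0.5 * rng.standard_normal((16, 1)); b2 = np.zeros(1)

def stochastic_forward(x, p_drop=0.5):
    """One forward pass with dropout left ON at test time."""
    h = np.maximum(x @ W1 + b1, 0.0)          # ReLU hidden layer
    mask = rng.random(h.shape) >= p_drop      # fresh dropout mask per pass
    h = h * mask / (1.0 - p_drop)
    return h @ W2 + b2

x = rng.standard_normal((1, 8))
samples = np.stack([stochastic_forward(x) for _ in range(100)])
print("predictive mean:", float(samples.mean()))
print("predictive std :", float(samples.std()))   # spread serves as uncertainty
```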
Additional excerpts from a citing paper (3,148 citations):
...For the CBOW and BiLSTM models, we tune Dropout on the SNLI development set and find that a drop rate of 0.1 works well....
[...]
...We use Dropout (Srivastava et al., 2014) for regularization....
[...]
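The tuning protocol these excerpts describe amounts to a small grid search over the drop rate, scored on the development set. A sketch under stated assumptions: train_and_eval is a hypothetical placeholder for training the CBOW/BiLSTM model and evaluating it on the SNLI dev set.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_and_eval(p_drop):
    """Hypothetical stand-in: train the model with drop rate p_drop and
    return dev-set accuracy. The random score only keeps this runnable."""
    return rng.random()

candidates = [0.0, 0.1, 0.2, 0.3, 0.5]
best_p = max(candidates, key=train_and_eval)  # keep the best dev score
print("selected drop rate:", best_p)
```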
References
"Dropout: a simple way to prevent neural networks from overfitting" refers to methods from this reference (40,785 citations):
...These include L2 weight decay (more generally Tikhonov regularization (Tikhonov, 1943)), lasso (Tibshirani, 1996), KL-sparsity and max-norm regularization....
[...]
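For concreteness, each regularizer listed in the excerpt can be sketched in a few lines of NumPy. The function names are illustrative; the max-norm projection follows the per-hidden-unit (column-wise) constraint that Srivastava et al. (2014) use alongside dropout.

```python
import numpy as np

def l2_penalty(W, lam):
    """L2 weight decay (Tikhonov regularization): lam * ||W||^2 added to the loss."""
    return lam * np.sum(W ** 2)

def l1_penalty(W, lam):
    """Lasso-style L1 penalty (Tibshirani, 1996), encouraging sparse weights."""
    return lam * np.sum(np.abs(W))

def max_norm_project(W, c):
    """Max-norm constraint: after each update, rescale any column (the
    incoming weights of one hidden unit) whose L2 norm exceeds c back
    onto the ball of radius c."""
    norms = np.linalg.norm(W, axis=0, keepdims=True)
    return W * np.minimum(1.0, c / np.maximum(norms, 1e-12))
```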
"Dropout: a simple way to prevent neural networks from overfitting" refers to methods from this reference (15,055 citations):
...Learning algorithms developed for RBMs such as Contrastive Divergence (Hinton et al., 2006) can be directly applied for learning Dropout RBMs....
[...]
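A hedged sketch of what one CD-1 update for a Dropout RBM might look like: sample a binary mask over the hidden units, clamp dropped units to zero in both phases, and apply the usual Contrastive Divergence statistics. The function name and the one-mask-per-minibatch granularity are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cd1_dropout_rbm_step(v0, W, b, c, p_drop=0.5, lr=0.01):
    """One CD-1 update for a binary RBM whose hidden units are each
    dropped with probability p_drop for this minibatch (an assumed
    granularity). v0: (batch, n_vis) binary data; W: (n_vis, n_hid);
    b: visible biases; c: hidden biases."""
    keep = (rng.random(W.shape[1]) >= p_drop).astype(float)  # hidden mask
    # Positive phase: hidden probabilities, masked by the dropout pattern.
    ph0 = sigmoid(v0 @ W + c) * keep
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: one Gibbs step down to the visibles and back up.
    pv1 = sigmoid(h0 @ W.T + b)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W + c) * keep
    # Contrastive Divergence gradient estimate.
    W += lr * (v0.T @ ph0 - v1.T @ ph1) / v0.shape[0]
    b += lr * (v0 - v1).mean(axis=0)
    c += lr * (ph0 - ph1).mean(axis=0)
    return W, b, c

v = (rng.random((16, 6)) < 0.5).astype(float)     # toy binary minibatch
W = 0.01 * rng.standard_normal((6, 4))
b, c = np.zeros(6), np.zeros(4)
W, b, c = cd1_dropout_rbm_step(v, W, b, c)
```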