Dropout: a simple way to prevent neural networks from overfitting
Citations
[...]
Cites background from "Dropout: a simple way to prevent neural networks from overfitting":
...This assumption, however, might restrict modeling capacity, as graph edges need not necessarily encode node similarity, but could contain additional information....
[...]
References
"Dropout: a simple way to prevent ne..." refers background in this paper
...These include stopping the training as soon as performance on a validation set starts to get worse, introducing weight penalties of various kinds such as L1 and L2 regularization and soft weight sharing (Nowlan and Hinton, 1992)....
[...]
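As a concrete illustration of two of the regularizers mentioned in that excerpt, here is a minimal NumPy sketch of early stopping and an L2 weight penalty in a plain gradient-descent loop. The data, model, and hyperparameters are illustrative placeholders, not taken from the paper.

```python
import numpy as np

# Illustrative data and linear model; none of these names come from the paper.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(200, 10)), rng.normal(size=200)
X_val, y_val = rng.normal(size=(50, 10)), rng.normal(size=50)

w = np.zeros(10)
lr, l2 = 0.01, 1e-2                      # step size and L2 (weight decay) strength
best_val, best_w, patience, bad = np.inf, w.copy(), 10, 0

for step in range(1000):
    # L2 regularization: the penalty (l2/2)*||w||^2 adds l2 * w to the gradient.
    grad = X.T @ (X @ w - y) / len(y) + l2 * w
    w -= lr * grad

    # Early stopping: track validation loss and halt once it stops improving.
    val = np.mean((X_val @ w - y_val) ** 2)
    if val < best_val:
        best_val, best_w, bad = val, w.copy(), 0
    else:
        bad += 1
        if bad >= patience:
            break
w = best_w                               # keep the best weights seen on validation
```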
"Dropout: a simple way to prevent ne..." refers background in this paper
...Wager et al. (2013) describe how dropout can be seen as an adaptive regularizer....
[...]
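A small numerical check of that view, under one simplifying assumption: for linear regression with Bernoulli input dropout, averaging the squared loss over dropout masks yields the plain loss plus an L2-style penalty whose per-weight scale adapts to each feature's second moment. All sizes and names below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
X, y, w = rng.normal(size=(100, 5)), rng.normal(size=100), rng.normal(size=5)
q = 0.8                                   # keep probability; drop rate p = 1 - q

# Monte Carlo estimate of the dropout objective, with inverted-dropout
# scaling (mask / q) so the masked inputs have mean x.
losses = [np.sum((y - (X * (rng.binomial(1, q, size=X.shape) / q)) @ w) ** 2)
          for _ in range(20000)]

# Closed form: squared loss plus an adaptive L2 penalty, where each w_j is
# scaled by its feature's second moment, sum_i x_ij^2.
closed = np.sum((y - X @ w) ** 2) + (1 - q) / q * np.sum((X ** 2).sum(0) * w ** 2)
print(np.mean(losses), closed)            # agree up to Monte Carlo error
```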
"Dropout: a simple way to prevent ne..." refers methods in this paper
...These include L2 weight decay (more generally Tikhonov regularization (Tikhonov, 1943)), lasso (Tibshirani, 1996), KL-sparsity and max-norm regularization....
[...]
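Of these, max-norm is the constraint the dropout paper pairs with dropout training. Below is a minimal sketch of it as a projection applied after each weight update, assuming each column of W holds one unit's incoming weights; the function name and layout are illustrative.

```python
import numpy as np

def max_norm_project(W, c=3.0):
    # Rescale any column whose L2 norm exceeds c back onto the ball ||w|| <= c;
    # columns already inside the ball are left unchanged. c is a tunable
    # hyperparameter, not a value prescribed here.
    norms = np.linalg.norm(W, axis=0, keepdims=True)
    return W * np.minimum(1.0, c / np.maximum(norms, 1e-12))

# Typical use: project right after each gradient step, e.g.
#   W = max_norm_project(W - lr * grad_W, c=3.0)
```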
"Dropout: a simple way to prevent ne..." refers methods in this paper
...Wang and Manning (2013) proposed a method for speeding up dropout by marginalizing dropout noise....
[...]
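As a rough sketch of the marginalization idea (not Wang and Manning's full fast-dropout algorithm): under Bernoulli input dropout, the mean and variance of a linear layer's pre-activation have closed forms, so moments can be propagated instead of sampling masks. Function and variable names here are hypothetical.

```python
import numpy as np

def dropout_preact_moments(x, W, q=0.8):
    # For a_k = sum_j m_j * x_j * W[j, k] with m_j ~ Bernoulli(q):
    #   E[a_k]   = q * sum_j x_j * W[j, k]
    #   Var[a_k] = q * (1 - q) * sum_j x_j^2 * W[j, k]^2
    return q * (x @ W), q * (1 - q) * ((x ** 2) @ (W ** 2))

# Sanity check against sampled dropout masks.
rng = np.random.default_rng(0)
x, W = rng.normal(size=8), rng.normal(size=(8, 4))
samples = np.stack([(x * rng.binomial(1, 0.8, size=8)) @ W for _ in range(50000)])
mean, var = dropout_preact_moments(x, W)
print(mean, samples.mean(axis=0))         # should roughly match
print(var, samples.var(axis=0))
```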