Dropout: a simple way to prevent neural networks from overfitting
Citations
5,782 citations
Cites background from "Dropout: a simple way to prevent neural networks from overfitting"
...Dropout [7] is a regularization technique that zeros out the activation values of randomly chosen neurons during training....
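The snippet above describes the core mechanism. As a minimal numpy sketch (illustrative, not the paper's original implementation), the standard "inverted dropout" formulation looks like this:

```python
import numpy as np

def dropout(activations, p=0.5, rng=None, training=True):
    """Inverted dropout: during training, zero each unit independently
    with probability p and scale survivors by 1/(1-p), so the expected
    activation matches test time and inference needs no rescaling."""
    if not training or p == 0.0:
        return activations
    rng = rng or np.random.default_rng()
    mask = rng.random(activations.shape) >= p  # True with probability 1-p
    return activations * mask / (1.0 - p)
```

At test time (`training=False`) activations pass through unchanged; the paper itself states the equivalent convention of leaving training activations unscaled and multiplying the weights by the retention probability at test time.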
5,709 citations
Cites methods from "Dropout: a simple way to prevent neural networks from overfitting"
...For improving the object detection training, a CNN usually uses the following:
• Activations: ReLU, leaky-ReLU, parametric-ReLU, ReLU6, SELU, Swish, or Mish
• Bounding box regression loss: MSE, IoU, GIoU, CIoU, DIoU
• Data augmentation: CutOut, MixUp, CutMix
• Regularization method: DropOut, DropPath [36], Spatial DropOut [79], or DropBlock
• Normalization of the network activations by their mean and variance: Batch Normalization (BN) [32], Cross-GPU Batch Normalization (CGBN or SyncBN) [93], Filter Response Normalization (FRN) [70], or Cross-Iteration Batch Normalization (CBN) [89]
• Skip-connections: Residual connections, Weighted residual connections, Multi-input weighted residual connections, or Cross stage partial connections (CSP)
As for the training activation function, since PReLU and SELU are more difficult to train, and ReLU6 is specifically designed for quantized networks, we therefore remove the above activation functions from the candidate list....
...If similar concepts are applied to feature maps, there are DropOut [71], DropConnect [80], and DropBlock [16] methods....
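The variants named in this excerpt differ in which structure gets dropped: standard dropout masks individual units, DropConnect masks individual weights, Spatial DropOut zeroes whole feature maps, and DropBlock zeroes contiguous regions. A minimal numpy sketch of the channel-wise case (illustrative, not the cited implementation):

```python
import numpy as np

def spatial_dropout(feature_maps, p=0.5, rng=None):
    """Channel-wise (spatial) dropout for an (N, C, H, W) tensor: zero
    entire feature maps rather than individual pixels, since nearby
    activations within one map are strongly correlated and per-pixel
    masking would barely reduce co-adaptation."""
    rng = rng or np.random.default_rng()
    n, c = feature_maps.shape[:2]
    mask = (rng.random((n, c, 1, 1)) >= p).astype(feature_maps.dtype)
    return feature_maps * mask / (1.0 - p)  # inverted-dropout scaling
```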
References
40,785 citations
"Dropout: a simple way to prevent neural networks from overfitting" refers to methods in this paper
...These include L2 weight decay (more generally Tikhonov regularization (Tikhonov, 1943)), lasso (Tibshirani, 1996), KL-sparsity and max-norm regularization....
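These penalties constrain the weights in different ways: L2 decay shrinks them continuously during every update, while max-norm (the constraint the Dropout paper combines with dropout) projects each unit's incoming weight vector back onto a fixed-radius ball after the update. A sketch of one SGD step combining the two, in numpy (function name and hyperparameter values are illustrative, not from the paper):

```python
import numpy as np

def sgd_step(w, grad, lr=0.1, weight_decay=1e-4, max_norm=3.0):
    """One SGD step with an L2 weight-decay penalty, followed by a
    max-norm projection: any row (the incoming weights of one unit)
    whose L2 norm exceeds max_norm is rescaled onto the constraint ball."""
    w = w - lr * (grad + weight_decay * w)  # gradient of loss + (wd/2)*||w||^2
    norms = np.linalg.norm(w, axis=1, keepdims=True)
    scale = np.minimum(1.0, max_norm / np.maximum(norms, 1e-12))
    return w * scale
```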
15,055 citations
"Dropout: a simple way to prevent neural networks from overfitting" refers to methods in this paper
...Learning algorithms developed for RBMs, such as Contrastive Divergence (Hinton et al., 2006), can be directly applied for learning Dropout RBMs....