Distributionally Robust Neural Networks for Group Shifts: On the Importance of Regularization for Worst-Case Generalization.
Citations
Cites background or methods from "Distributionally Robust Neural Netw..."
...We note that reweighting methods like Group DRO are effective only when the training loss is non-vanishing, which we achieve through early stopping (Byrd & Lipton, 2019; Sagawa et al., 2020a;b)....
[...]
...…al., 2018a), ImageNet-C (Hendrycks & Dietterich, 2019), and similar ImageNet variants (Geirhos et al., 2018b); and datasets that crop out objects and replace their backgrounds, as in the Backgrounds Challenge (Xiao et al., 2020) and other similar datasets (Sagawa et al., 2020a; Koh et al., 2020)....
[...]
...We adapted the implementations of CORAL from Gulrajani & Lopez-Paz (2020); IRM from Arjovsky et al. (2019); and Group DRO from Sagawa et al. (2020a)....
[...]
...We fixed the step size hyperparameter for Group DRO to its default value of 0.01 (Sagawa et al., 2020a)....
[...]
...These types of spurious correlations can significantly degrade model performance on particular subpopulations (Sagawa et al., 2020a)....
[...]
Cites methods from "Distributionally Robust Neural Netw..."
...For example, to implement group DRO [Sagawa et al., 2019, Algorithm 1], we simply write the following in algorithms.py: class DRO(ERM): def __init__(self, input_shape, num_classes, num_domains, hparams): super()....
[...]
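The Group DRO update that Algorithm 1 of Sagawa et al. (2020a) describes maintains one weight per group and upweights the groups with the highest current loss via exponentiated gradient ascent, then minimizes the resulting weighted loss. A minimal NumPy sketch of that weight update, assuming precomputed per-group losses (the function name and toy numbers are illustrative, not taken from the DomainBed code; the default eta=0.01 matches the step size mentioned in the snippets above):

```python
import numpy as np

def group_dro_step(group_losses, q, eta=0.01):
    """One exponentiated-gradient update of the group weights q:
    groups with higher loss are upweighted, then q is renormalized
    onto the probability simplex. Returns the robust (weighted)
    loss and the updated weights."""
    group_losses = np.asarray(group_losses, dtype=float)
    q = q * np.exp(eta * group_losses)  # upweight high-loss groups
    q = q / q.sum()                     # project back onto the simplex
    robust_loss = float(q @ group_losses)
    return robust_loss, q

# Toy example with two groups; the second group is harder.
q = np.ones(2) / 2
losses = np.array([0.2, 1.0])
for _ in range(200):
    robust_loss, q = group_dro_step(losses, q, eta=0.1)
# Weight mass concentrates on the worst group, so the robust loss
# approaches the worst-group loss.
```

In a full training loop, the model parameters would then be updated by gradient descent on this weighted loss; here only the group-weight side of the algorithm is shown.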
References
"Distributionally Robust Neural Netw..." refers to methods in this paper
...As in the original paper, we used batch normalization (Ioffe & Szegedy, 2015) and no dropout (Srivastava et al., 2014)....
[...]