How transferable are features in deep neural networks?
Citations
14,807 citations
Cites result from "How transferable are features in deep neural networks?"
...The two observations are consistent with findings in previous work [23, 50], namely that lower layer features are typically more general (i....
6,953 citations
Cites background from "How transferable are features in deep neural networks?"
...computer vision [Oquab et al., 2014; Jia et al., 2014; Huh et al., 2016; Yosinski et al., 2014]) use a supervised dataset for pre-training, we were interested to see whether omitting the unsupervised task from the multi-task pre-training mixture still produced good results....
...In applications of transfer learning to computer vision [Oquab et al., 2014; Jia et al., 2014; Huh et al., 2016; Yosinski et al., 2014], pre-training is typically done via supervised learning on a large labeled dataset like ImageNet [Russakovsky et al., 2015; Deng et al., 2009]....
5,782 citations
Cites background from "How transferable are features in deep neural networks?"
...[14] find that transferability is negatively affected primarily by the specialization of higher layer neurons and difficulties with splitting co-adapted neurons....
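The setup behind this finding can be sketched in a few lines. The following is an illustration only (a stand-in network of plain weight matrices, not the cited authors' code): copy the first n layers from a network trained on a base task into a new network, and either freeze them or leave them trainable for the target task.

```python
# Illustrative sketch of layer transfer: copy the first n layers of a "base"
# network into a "target" network, optionally marking them frozen. The
# list-of-matrices "network" is a hypothetical stand-in for real layers.
import copy
import random

random.seed(0)

def make_net(n_layers=8, width=4):
    """A stand-in 'network': one random weight matrix per layer."""
    return [[[random.gauss(0, 1) for _ in range(width)] for _ in range(width)]
            for _ in range(n_layers)]

def transfer(base_net, target_net, n_transfer, freeze=True):
    """Copy the first n_transfer layers of base_net into a copy of target_net.

    Returns (new_net, frozen_idx). With freeze=True, frozen_idx lists the
    transferred layers, which would receive no gradient updates; with
    freeze=False, every layer stays trainable (i.e. fine-tuning).
    """
    new_net = copy.deepcopy(target_net)
    for i in range(n_transfer):
        new_net[i] = copy.deepcopy(base_net[i])
    frozen_idx = list(range(n_transfer)) if freeze else []
    return new_net, frozen_idx

base = make_net()
target = make_net()
net, frozen = transfer(base, target, n_transfer=3, freeze=True)

assert net[:3] == base[:3]    # lower layers come from the base task
assert net[3:] == target[3:]  # higher layers keep the target initialization
assert frozen == [0, 1, 2]    # these layers would get no gradient updates
```

Splitting at a higher n places the cut among the more task-specialized, co-adapted layers, which is where the transferability drop described above occurs.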
References
49,639 citations
"How transferable are features in de..." refers background or methods in this paper
...The ImageNet dataset, as released in the Large Scale Visual Recognition Challenge 2012 (ILSVRC2012) (Deng et al., 2009) contains 1,281,167 labeled training images and 50,000 test images, with each image labeled with one of 1000 classes....
...The largest dataset contains the entire ILSVRC2012 (Deng et al., 2009) release with a maximum of 1300 examples per class, and the smallest dataset contains only 1 example per class (1000 data points in total)....
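Building such nested subsets (at most k labeled examples per class) is straightforward; the helper below is a hypothetical sketch, not code from the cited work:

```python
# Sketch: subsample a labeled dataset so each class keeps at most
# per_class examples, as in the nested ImageNet subsets described above.
from collections import defaultdict

def subsample(examples, per_class):
    """examples: list of (item, label); keeps <= per_class items per label."""
    kept, counts = [], defaultdict(int)
    for item, label in examples:
        if counts[label] < per_class:
            kept.append((item, label))
            counts[label] += 1
    return kept

data = [(i, i % 3) for i in range(12)]  # 4 examples for each of 3 classes
small = subsample(data, per_class=1)
assert len(small) == 3                           # one example per class
assert sorted(lbl for _, lbl in small) == [0, 1, 2]
```

With per_class=1300 this recovers the full ILSVRC2012 training set described above; with per_class=1 it yields the 1000-point subset.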
13,081 citations
"How transferable are features in de..." refers background in this paper
...Although examples of successful feature transfer have been reported elsewhere in the literature (Girshick et al., 2013; Donahue et al., 2013b), to our knowledge these results have been limited to noticing that transfer from a given layer is much better than the alternative of training strictly on the target task, i....
12,783 citations
"How transferable are features in de..." refers background or methods in this paper
...For example, Zeiler and Fergus (2013) found that it is better to decrease the first-layer filter sizes from 11 × 11 to 7 × 7 and to use a smaller stride of 2 instead of 4....
[...]
...A final surprising result is that initializing a network with transferred features from almost any number of layers can produce a boost to generalization that lingers even after fine-tuning to the target dataset....
...…Recent studies have taken advantage of this fact to obtain state-of-the-art results when transferring from higher layers (Donahue et al., 2013a; Zeiler and Fergus, 2013; Sermanet et al., 2014), collectively suggesting that these layers of neural networks do indeed compute features that are…...
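The recipe those studies share can be sketched compactly. The example below is a pure-Python toy with synthetic data (none of it from the cited works): a "pretrained" higher layer is kept frozen as a feature extractor, and only a new linear head is trained on the target task.

```python
# Sketch: frozen pretrained features + a freshly trained linear head.
import random

random.seed(0)

def frozen_features(x):
    """Stand-in for a pretrained higher layer: fixed weights + ReLU, never updated."""
    W = [[0.9, -0.4], [-0.3, 0.8], [0.5, 0.5]]  # pretend these came from pre-training
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in W]

# Toy target task: label is 1 when x0 > x1 (points near the boundary excluded).
data = []
while len(data) < 200:
    x = [random.uniform(0, 1), random.uniform(0, 1)]
    if abs(x[0] - x[1]) > 0.1:
        data.append((frozen_features(x), 1 if x[0] > x[1] else 0))

# Train only the new head with the perceptron rule; the extractor stays fixed.
w, b = [0.0, 0.0, 0.0], 0.0
for _ in range(50):
    for f, y in data:
        pred = 1 if sum(wi * fi for wi, fi in zip(w, f)) + b > 0 else 0
        w = [wi + 0.1 * (y - pred) * fi for wi, fi in zip(w, f)]
        b += 0.1 * (y - pred)

accuracy = sum((1 if sum(wi * fi for wi, fi in zip(w, f)) + b > 0 else 0) == y
               for f, y in data) / len(data)
```

The point of the sketch is the division of labor: all representation learning happened "upstream", and the target task only fits a small classifier on top, which is why it works even with limited target data.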
12,531 citations