More is Less: A More Complicated Network with Less Inference Complexity
Citations
Cites background from "More is Less: A More Complicated Network with Less Inference Complexity"
...Extensive work [20, 19, 34, 12, 18, 17] has been done on accelerating neural networks through compression....
Cites background from "More is Less: A More Complicated Network with Less Inference Complexity"
...ng the gating modules) parameterized by $\theta$ and $g \in \{0,1\}^N$. The overall objective is defined as

$$\min J(\theta) = \min \mathbb{E}_x \, \mathbb{E}_g \, \mathcal{L}_\theta(g, x) = \min \mathbb{E}_x \, \mathbb{E}_g \left[ \mathcal{L}\big(\hat{y}(x, F_\theta, g), y\big) - \frac{\alpha}{N} \sum_{i=1}^{N} R_i \right], \qquad (5)$$

where $R_i = (1 - g_i) C_i$ is the reward of each gating module. The constant $C_i$ is the cost of executing $F_i$, and the term $(1 - g_i) C_i$ reflects the reward associated with skipping $F_i$. In our experiments, ...
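For concreteness, a minimal Python sketch of how this objective could be evaluated for a single sample; the function name, the uniform unit costs, and the example values below are hypothetical illustrations, not taken from the cited paper:

```python
import numpy as np

def gated_objective(task_loss, gates, costs, alpha):
    """Sketch of Eq. (5): task loss minus the scaled average skip reward.

    gates: binary g_i in {0, 1}; g_i = 0 means block F_i is skipped.
    costs: C_i, the execution cost of each block F_i.
    alpha: trade-off weight between accuracy and computation saved.
    """
    gates = np.asarray(gates, dtype=float)
    costs = np.asarray(costs, dtype=float)
    rewards = (1.0 - gates) * costs          # R_i = (1 - g_i) * C_i
    return task_loss - (alpha / len(gates)) * rewards.sum()

# Four blocks with unit cost; the second and fourth are skipped (g_i = 0).
loss = gated_objective(task_loss=0.85, gates=[1, 0, 1, 0],
                       costs=[1.0, 1.0, 1.0, 1.0], alpha=0.1)
print(loss)  # 0.85 - (0.1 / 4) * 2.0 = 0.80
```

Skipping more blocks increases the summed reward and lowers the objective, so the gates are pushed toward cheaper execution paths unless the task loss rises to compensate.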
Cites methods from "More is Less: A More Complicated Network with Less Inference Complexity"
...CNN models [3, 4, 9, 11, 30, 31] are usually used as the function φ....
References
"More is Less: A More Complicated Ne..." refers background in this paper
...Recently, Batch Normalization (BN) [17] was proposed to improve network performance and speed up convergence during training by stabilizing the distribution of layer inputs and reducing internal covariate shift....
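As a rough illustration of the BN transform referenced here, a minimal NumPy sketch using training-time batch statistics (inference would instead use running averages; all names below are illustrative):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch, then scale and shift.

    Stabilizing the input distribution of the next layer is the
    'internal covariate shift' reduction that BN is credited with.
    """
    mu = x.mean(axis=0)                      # per-feature batch mean
    var = x.var(axis=0)                      # per-feature batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)    # normalized activations
    return gamma * x_hat + beta              # learnable scale and shift

x = np.random.randn(32, 8)                   # batch of 32, 8 features
y = batch_norm(x, gamma=np.ones(8), beta=np.zeros(8))
```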