Evaluating Fast Algorithms for Convolutional Neural Networks on FPGAs
Citations
Cites methods from "Evaluating Fast Algorithms for Convolutional Neural Networks on FPGAs"
...In particular, existing work [17, 28, 29] has demonstrated that applying the Winograd [27] and fast Fourier transforms to convolutional computation can significantly improve resource efficiency....
Cites background or methods from "Evaluating Fast Algorithms for Convolutional Neural Networks on FPGAs"
...[189] implemented a novel FPGA architecture with a two-dimensional Winograd algorithm [185] to accelerate convolutional computation of CNNs....
...Winograd-based CNN Accelerator [189], where m is the size of the input FM tile, n is the size of the output FM tile, M is the number of input channels, N is the number of output channels, W is the maximal width of all input FMs, and C is the width of the output FMs....
...[184], [188], and [189] utilized Winograd transformation in performing CONV operations as this reduces the computational complexity by around 50%....
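The roughly 50% complexity reduction cited above comes from Winograd's minimal filtering algorithm. As an illustrative sketch (not the accelerators' actual code), the 1-D F(2,3) transform below produces two outputs of a 3-tap convolution with 4 elementwise multiplications instead of the 6 needed directly, using the standard B^T, G, and A^T matrices from the Lavin–Gray formulation:

```python
import numpy as np

# Standard Winograd F(2,3) transform matrices (Lavin-Gray formulation).
Bt = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=float)   # input (data) transform
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]])                # filter transform
At = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=float)    # output (inverse) transform

def winograd_f23(d, g):
    """Two outputs of a 3-tap convolution over a 4-element input tile,
    computed with 4 elementwise multiplications instead of 6."""
    return At @ ((G @ g) * (Bt @ d))

# Check against the direct sliding-window convolution.
d = np.array([1.0, 2.0, 3.0, 4.0])   # input tile
g = np.array([1.0, 0.0, -1.0])       # filter taps
direct = np.array([d[0]*g[0] + d[1]*g[1] + d[2]*g[2],
                   d[1]*g[0] + d[2]*g[1] + d[3]*g[2]])
assert np.allclose(winograd_f23(d, g), direct)
```

Nesting this transform over both spatial dimensions gives the 2-D F(2×2, 3×3) variant used by the accelerators above: 16 multiplications per 4×4 input tile versus 36 for direct convolution.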
...Since the computational requirement of FC layers is significantly less than that of CONV layers, to improve performance, and maximize resource utilization, a number of techniques such as [153], [162], [188], and [189] create...
Cites background, methods, or results from "Evaluating Fast Algorithms for Convolutional Neural Networks on FPGAs"
...For a fair comparison, both small-scale networks in [19] and [30] and large-scale deep networks in [18], [20],...
...With the same filtering algorithm, the design in [18] achieves a throughput of 2....
...and [18] to speed up the convolutional computations....
...To reduce the size of block RAM, studies in [4], [12], [18], [20], and [25] save the intermediate data for each layer in off-chip memory....
...It is noteworthy that the streaming style completely eliminates the off-chip access for intermediate results, which was a limitation of the previous works in [4], [12], [18], [20], and [25]....
References
"Evaluating Fast Algorithms for Convolutional Neural Networks on FPGAs" refers to methods in this paper
...2) VGGNet: In VGG16 [22], all convolutional layers use 3 × 3 filters, which fit the Winograd algorithm well....
...VGG16 consists of 5 convolution groups with different input sizes (224, 112, 56, 28, 14)....
...Our implementation also improves the energy efficiency from 3.79 GOP/s/W to 36.2 GOP/s/W....
...• We perform rigorous validation of our techniques using state-of-the-art CNNs, including AlexNet and VGG16....
...For example, all convolutional layers of AlexNet employ 3 × 3 and 5 × 5 filters except the first layer [3]; VGG16 only uses 3 × 3 filters [22]....
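As a back-of-envelope check on why uniform 3 × 3 layers suit Winograd so well, the arithmetic saving of the F(2×2, 3×3) tiling per output tile can be counted directly (an illustrative calculation, not taken from the accelerator implementations):

```python
# Multiplications needed for one 2x2 output tile of a 3x3 convolution:
direct_mults = 2 * 2 * 3 * 3    # 36: each of the 4 outputs needs 9 products
winograd_mults = 4 * 4          # 16: one product per element of the 4x4 transformed tile

reduction = direct_mults / winograd_mults
print(reduction)                # 2.25x fewer multiplications per tile
```

Since every convolutional layer of VGG16 uses the same 3 × 3 filter shape, this single tiling applies network-wide, which is why the works above report large end-to-end gains.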
"Evaluating Fast Algorithms for Convolutional Neural Networks on FPGAs" refers to background in this paper
...The significant accuracy improvement of CNNs comes at the cost of huge computational complexity as it requires a comprehensive assessment of all the regions across the feature maps [3, 4]....