scispace - formally typeset
Open AccessJournal ArticleDOI

Structured Pruning of Deep Convolutional Neural Networks

Reads0
Chats0
TLDR
The proposed work shows that when pruning granularities are applied in combination, the CIFAR-10 network can be pruned by more than 70% with less than a 1% loss in accuracy.
Abstract
Real-time application of deep learning algorithms is often hindered by high computational complexity and frequent memory accesses. Network pruning is a promising technique to solve this problem. However, pruning usually results in irregular network connections that not only demand extra representation efforts but also do not fit well on parallel computation. We introduce structured sparsity at various scales for convolutional neural networks: feature map-wise, kernel-wise, and intra-kernel strided sparsity. This structured sparsity is very advantageous for direct computational resource savings on embedded computers, in parallel computing environments, and in hardware-based systems. To decide the importance of network connections and paths, the proposed method uses a particle filtering approach. The importance weight of each particle is assigned by assessing the misclassification rate with a corresponding connectivity pattern. The pruned network is retrained to compensate for the losses due to pruning. While implementing convolutions as matrix products, we particularly show that intra-kernel strided sparsity with a simple constraint can significantly reduce the size of the kernel and feature map tensors. The proposed work shows that when pruning granularities are applied in combination, we can prune the CIFAR-10 network by more than 70% with less than a 1% loss in accuracy.

read more

Citations
More filters
Journal ArticleDOI

Efficient Processing of Deep Neural Networks: A Tutorial and Survey

TL;DR: In this paper, the authors provide a comprehensive tutorial and survey about the recent advances toward the goal of enabling efficient processing of DNNs, and discuss various hardware platforms and architectures that support DNN, and highlight key trends in reducing the computation cost of deep neural networks either solely via hardware design changes or via joint hardware and DNN algorithm changes.
Proceedings ArticleDOI

Channel Pruning for Accelerating Very Deep Neural Networks

TL;DR: In this paper, a LASSO regression based channel selection and least square reconstruction is proposed to accelerate very deep convolutional neural networks, which achieves 5× speedup along with only 0.3% increase of error.
Journal ArticleDOI

A survey on deep learning techniques for image and video semantic segmentation

TL;DR: A review on deep learning methods for semantic segmentation applied to various application areas and points out a set of promising future works to help researchers decide which are the ones that best suit their needs and goals.
Posted Content

Efficient Processing of Deep Neural Networks: A Tutorial and Survey

TL;DR: In this article, the authors provide a comprehensive tutorial and survey about the recent advances towards the goal of enabling efficient processing of DNNs, and discuss various hardware platforms and architectures that support deep neural networks.
Journal ArticleDOI

Model Compression and Hardware Acceleration for Neural Networks: A Comprehensive Survey

TL;DR: This article reviews the mainstream compression approaches such as compact model, tensor decomposition, data quantization, and network sparsification, and answers the question of how to leverage these methods in the design of neural network accelerators and present the state-of-the-art hardware architectures.
References
More filters
Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
Journal ArticleDOI

Gradient-based learning applied to document recognition

TL;DR: In this article, a graph transformer network (GTN) is proposed for handwritten character recognition, which can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters.
Journal Article

Dropout: a simple way to prevent neural networks from overfitting

TL;DR: It is shown that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.
Journal ArticleDOI

ImageNet classification with deep convolutional neural networks

TL;DR: A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes and employed a recently developed regularization method called "dropout" that proved to be very effective.
Posted Content

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

TL;DR: Batch Normalization as mentioned in this paper normalizes layer inputs for each training mini-batch to reduce the internal covariate shift in deep neural networks, and achieves state-of-the-art performance on ImageNet.