Gate Decorator: Global Filter Pruning Method for Accelerating Deep Convolutional Neural Networks

TL;DR: This work proposes a global filter pruning algorithm called Gate Decorator, which transforms a vanilla CNN module by multiplying its output by channel-wise scaling factors (i.e., gates), and an iterative pruning framework called Tick-Tock to improve pruning accuracy.
Filter pruning is one of the most effective ways to accelerate and compress convolutional neural networks (CNNs). In this work, we propose a global filter pruning algorithm called Gate Decorator, which transforms a vanilla CNN module by multiplying its output by channel-wise scaling factors (i.e., gates). Setting a scaling factor to zero is equivalent to removing the corresponding filter. We use a Taylor expansion to estimate the change in the loss function caused by setting each scaling factor to zero, and use this estimate to rank filter importance globally. We then prune the network by removing the unimportant filters. After pruning, we merge all the scaling factors into their original modules, so no special operations or structures are introduced. Moreover, we propose an iterative pruning framework called Tick-Tock to improve pruning accuracy. Extensive experiments demonstrate the effectiveness of our approaches. For example, we achieve the state-of-the-art pruning ratio on ResNet-56, reducing FLOPs by 70% without noticeable loss in accuracy. For ResNet-50 on ImageNet, our pruned model with a 40% FLOPs reduction outperforms the baseline model by 0.31% in top-1 accuracy. The experiments cover various datasets, including CIFAR-10, CIFAR-100, CUB-200, ImageNet ILSVRC-12 and PASCAL VOC 2011.
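The ranking step described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the first-order Taylor estimate, under which zeroing a gate phi changes the loss by roughly -phi * dL/dphi, so the importance score is |phi * dL/dphi|. The function names, layer names, and numeric values are all hypothetical, and the merge step assumes the gate sits directly after a batch-normalization affine transform.

```python
from typing import Dict, List, Tuple


def taylor_importance(gates: List[float], grads: List[float]) -> List[float]:
    """First-order estimate of the loss change when each gate is set to zero.

    L(phi=0) - L(phi) ~= -phi * dL/dphi, so the score is |phi * dL/dphi|.
    """
    return [abs(phi * g) for phi, g in zip(gates, grads)]


def global_prune(scores: Dict[str, List[float]], num_to_prune: int) -> Dict[str, List[bool]]:
    """Rank the filters of all layers together and mark the lowest-scoring ones."""
    ranked: List[Tuple[float, str, int]] = sorted(
        (score, layer, idx)
        for layer, layer_scores in scores.items()
        for idx, score in enumerate(layer_scores)
    )
    keep = {layer: [True] * len(layer_scores) for layer, layer_scores in scores.items()}
    for _, layer, idx in ranked[:num_to_prune]:
        keep[layer][idx] = False  # False means "remove this filter"
    return keep


def merge_gates_into_bn(
    gamma: List[float], beta: List[float], gates: List[float]
) -> Tuple[List[float], List[float]]:
    """Fold gates into BN affine parameters (assumed placement of the gate).

    phi * (gamma * x_hat + beta) == (phi * gamma) * x_hat + (phi * beta)
    """
    return (
        [phi * g for phi, g in zip(gates, gamma)],
        [phi * b for phi, b in zip(gates, beta)],
    )


# Toy gate values and gradients, made up for illustration only.
gates = {"conv1": [1.0, 0.5, 1.2], "conv2": [0.8, 1.0]}
grads = {"conv1": [0.30, -0.02, 0.10], "conv2": [-0.01, 0.40]}

scores = {name: taylor_importance(gates[name], grads[name]) for name in gates}
keep = global_prune(scores, num_to_prune=2)
print(keep)
```

With these toy numbers the two smallest scores belong to filter 0 of `conv2` and filter 1 of `conv1`, so those are the filters marked for removal. Because the ranking is global, layers can lose different numbers of filters. In practice the gradients would be accumulated over many training batches rather than taken from a single value.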

Proceedings Article

Learning Filter Pruning Criteria for Deep Convolutional Neural Networks Acceleration

TL;DR: Learning Filter Pruning Criteria (LFPC) is proposed, which develops a differentiable pruning criteria sampler that is learnable and optimized by the validation loss of the pruned network obtained from the sampled criteria.
Proceedings Article

Learning Dynamic Routing for Semantic Segmentation

TL;DR: A conceptually new method, named dynamic routing, is proposed to alleviate scale variance in semantic representation. It generates data-dependent routes that adapt to the scale distribution of each image, and is compared with several static architectures, which can be modeled as special cases in the routing space.
Book Chapter

EagleEye: Fast Sub-net Evaluation for Efficient Neural Network Pruning

TL;DR: A pruning method called EagleEye is presented, in which a simple yet efficient evaluation component based on adaptive batch normalization is applied to unveil a strong correlation between different pruned DNN structures and their final settled accuracy.
Proceedings Article

Network Pruning via Performance Maximization

TL;DR: In this paper, an episodic memory is introduced to update and collect sub-networks during the pruning process, which can achieve state-of-the-art performance with ResNet, MobileNet, and ShuffleNetV2+ on ImageNet.
Posted Content

Dynamic Model Pruning with Feedback

TL;DR: A novel model compression method is proposed that generates a sparse trained model without additional overhead by allowing dynamic allocation of the sparsity pattern and incorporating a feedback signal to reactivate prematurely pruned weights, yielding a performant sparse model in a single training pass.
