AMC: AutoML for Model Compression and Acceleration on Mobile Devices

doi:10.1007/978-3-030-01234-2_48

Open Access Book Chapter

Yihui He, +5 more

- pp 784-800

TLDR

This paper proposes AutoML for Model Compression (AMC), which leverages reinforcement learning to efficiently sample the design space and improve model compression quality, achieving state-of-the-art compression results in a fully automated way, without any human effort.

Abstract:

Model compression is an effective technique for deploying neural network models efficiently on mobile devices, which have limited computation resources and tight power budgets. Conventional model compression techniques rely on hand-crafted features and require domain experts to explore a large design space that trades off model size, speed, and accuracy; this process is usually sub-optimal and time-consuming. In this paper, we propose AutoML for Model Compression (AMC), which leverages reinforcement learning to efficiently sample the design space and improve model compression quality. We achieved state-of-the-art model compression results in a fully automated way, without any human effort. Under 4$\times$ FLOPs reduction, we achieved 2.7% better accuracy than the hand-crafted model compression method for VGG-16 on ImageNet. We applied this automated, push-the-button compression pipeline to MobileNet-V1 and achieved a speedup of 1.53$\times$ on the GPU (Titan Xp) and 1.95$\times$ on an Android phone (Google Pixel 1), with negligible loss of accuracy.
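To make the "FLOPs reduction" figure in the abstract concrete, the following is a minimal sketch of how channel pruning changes the multiply-accumulate (MAC) count of a stack of convolution layers. It is not the AMC algorithm: the per-layer keep ratios are assumed to be given (in AMC they would come from the reinforcement learning agent), and the three-layer network, the function names, and the ratio values are all hypothetical and chosen only for illustration.

```python
def conv_macs(h_out, w_out, c_in, c_out, k):
    """Multiply-accumulate count of a standard k x k convolution layer."""
    return h_out * w_out * c_in * c_out * k * k


def pruned_channels(channels, keep_ratio):
    """Number of channels kept at a given keep ratio (at least one)."""
    return max(1, round(channels * keep_ratio))


def network_macs(in_channels, layers, keep_ratios):
    """Total MACs when layer i keeps keep_ratios[i] of its output channels.

    Pruning a layer's output channels also shrinks the next layer's input,
    so each ratio affects two consecutive layers.
    """
    total = 0
    c_in = in_channels  # the network input (e.g. RGB) is never pruned
    for layer, ratio in zip(layers, keep_ratios):
        c_out = pruned_channels(layer["c_out"], ratio)
        total += conv_macs(layer["h_out"], layer["w_out"], c_in, c_out, layer["k"])
        c_in = c_out
    return total


# Hypothetical toy network: three 3x3 convolutions (not VGG-16 or MobileNet).
toy_layers = [
    {"c_out": 64, "h_out": 32, "w_out": 32, "k": 3},
    {"c_out": 128, "h_out": 16, "w_out": 16, "k": 3},
    {"c_out": 256, "h_out": 8, "w_out": 8, "k": 3},
]

dense_macs = network_macs(3, toy_layers, [1.0, 1.0, 1.0])
pruned_macs = network_macs(3, toy_layers, [0.5, 0.5, 0.5])
reduction = dense_macs / pruned_macs

print(f"dense MACs:  {dense_macs}")
print(f"pruned MACs: {pruned_macs}")
print(f"reduction:   {reduction:.2f}x")
```

In this sketch, a layer that loses half of both its input and its output channels costs a quarter of its original MACs, while the first layer only shrinks by half because its input is untouched. The overall reduction for the toy network therefore lands somewhat below 4x, which illustrates why a fixed FLOPs target leaves a large space of per-layer ratios to search.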

Citations

Posted Content

Optimal Lottery Tickets via SubsetSum: Logarithmic Over-Parameterization is Sufficient

Ankit Pensia, +4 more

- 14 Jun 2020 -

arXiv: Learning

TL;DR: The strong lottery ticket hypothesis posits that any target neural network can be approximated by only pruning the weights of a sufficiently over-parameterized random network. Earlier polynomial over-parameterization bounds are at odds with recent experimental research that achieves good approximation with networks only a small factor wider than the target; this paper shows that logarithmic over-parameterization is sufficient.

Posted Content

Non-structured DNN Weight Pruning Considered Harmful

Yanzhi Wang, +12 more

TL;DR: This paper builds ADMM-NN-S by extending ADMM-NN, a recently proposed joint weight pruning and quantization framework, and develops a methodology for a fair and fundamental comparison of non-structured and structured pruning in terms of both storage and computation efficiency, concluding that non-structured pruning should be considered harmful.

Posted Content

OICSR: Out-In-Channel Sparsity Regularization for Compact Deep Neural Networks

Jiashi Li, +6 more

- 28 May 2019 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: This paper proposes Out-In-Channel Sparsity Regularization (OICSR), which considers correlations between successive layers to better retain the predictive power of the compact network.

Posted Content

ResNet Can Be Pruned 60x: Introducing Network Purification and Unused Path Removal (P-RM) after Weight Pruning.

Xiaolong Ma, +5 more

- 30 Apr 2019 -

arXiv: Learning

TL;DR: In this article, the authors propose a DNN framework that combines two different types of structured weight pruning (filter and column pruning) by incorporating the alternating direction method of multipliers (ADMM) algorithm for better pruning performance.

Journal Article

Modeling Data Reuse in Deep Neural Networks by Taking Data-Types into Cognizance

Nandan Kumar Jha, +1 more

- 01 Sep 2021 -

IEEE Transactions on Computers

TL;DR: In this article, the authors propose a data-type-aware weighted arithmetic intensity (DI) model, which accounts for the unequal importance of different data types in DNNs.

References

Proceedings Article

Deep Residual Learning for Image Recognition

Kaiming He, +3 more

TL;DR: In this article, the authors propose a residual learning framework to ease the training of networks that are substantially deeper than those used previously; the resulting networks won 1st place in the ILSVRC 2015 classification task.

Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

Karen Simonyan, +1 more

TL;DR: This work investigates the effect of convolutional network depth on accuracy in the large-scale image recognition setting, using an architecture with very small convolution filters, and shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.

Proceedings Article

Going deeper with convolutions

Christian Szegedy, +8 more

TL;DR: Inception, the architecture proposed in this paper, is a deep convolutional neural network that set a new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).

Posted Content

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Sergey Ioffe, +1 more

- 11 Feb 2015 -

arXiv: Learning

TL;DR: Batch Normalization, as proposed in this paper, normalizes layer inputs for each training mini-batch to reduce internal covariate shift in deep neural networks, and achieves state-of-the-art performance on ImageNet.

Dissertation

Learning Multiple Layers of Features from Tiny Images

Alex Krizhevsky

TL;DR: In this paper, the authors describe how to train a multi-layer generative model of natural images, using a dataset of millions of tiny colour images.

Related Papers (5)

Deep Residual Learning for Image Recognition

Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding

MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications

Learning Multiple Layers of Features from Tiny Images

MobileNetV2: Inverted Residuals and Linear Bottlenecks