AMC: AutoML for Model Compression and Acceleration on Mobile Devices
Yihui He, Ji Lin, Zhijian Liu, Hanrui Wang, Li-Jia Li, Song Han
- pp 784-800
TLDR
This paper proposes AutoML for Model Compression (AMC), which leverages reinforcement learning to efficiently sample the design space, improves model compression quality, and achieves state-of-the-art compression results in a fully automated way, without human effort.
Abstract:
Model compression is an effective technique for efficiently deploying neural network models on mobile devices, which have limited computation resources and tight power budgets. Conventional model compression techniques rely on hand-crafted features and require domain experts to explore the large design space, trading off among model size, speed, and accuracy; this process is usually sub-optimal and time-consuming. In this paper, we propose AutoML for Model Compression (AMC), which leverages reinforcement learning to efficiently sample the design space and can improve model compression quality. We achieved state-of-the-art model compression results in a fully automated way, without any human effort. Under 4\(\times\) FLOPs reduction, we achieved 2.7% better accuracy than the hand-crafted model compression method for VGG-16 on ImageNet. We applied this automated, push-the-button compression pipeline to MobileNet-V1 and achieved a speedup of 1.53\(\times\) on the GPU (Titan Xp) and 1.95\(\times\) on an Android phone (Google Pixel 1), with negligible loss of accuracy.
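To make the "FLOPs reduction" figure in the abstract concrete, the following is a minimal sketch of how a reduction factor could be computed for a channel-pruned convolutional stack. It is not the authors' code: the `ConvLayer` type, the toy layer shapes, and the hand-picked keep ratios are all hypothetical, and in AMC the per-layer ratios would be chosen by the reinforcement learning agent rather than fixed by hand.

```python
from dataclasses import dataclass


@dataclass
class ConvLayer:
    # Hypothetical description of one convolution layer (illustration only).
    c_in: int    # input channels
    c_out: int   # output channels
    kernel: int  # square kernel side length
    out_hw: int  # output feature-map side length


def conv_flops(layer, keep_in=1.0, keep_out=1.0):
    """Multiply-accumulate count of one layer after keeping a fraction of channels."""
    c_in = max(1, round(layer.c_in * keep_in))
    c_out = max(1, round(layer.c_out * keep_out))
    return c_in * c_out * layer.kernel ** 2 * layer.out_hw ** 2


def network_flops(layers, keep_ratios):
    """keep_ratios[i] is the fraction of output channels kept in layer i."""
    total = 0
    keep_in = 1.0  # the input image channels are never pruned
    for layer, keep_out in zip(layers, keep_ratios):
        total += conv_flops(layer, keep_in, keep_out)
        keep_in = keep_out  # pruned outputs shrink the next layer's inputs
    return total


# Toy three-layer network; the shapes are assumptions, not taken from the paper.
layers = [ConvLayer(3, 64, 3, 32), ConvLayer(64, 128, 3, 16), ConvLayer(128, 256, 3, 8)]
dense = network_flops(layers, [1.0, 1.0, 1.0])
pruned = network_flops(layers, [0.5, 0.5, 1.0])  # hand-picked ratios
reduction = dense / pruned
print(f"FLOPs reduction: {reduction:.2f}x")
```

Because pruning a layer's output channels also shrinks the next layer's input, the saving compounds across layers, which is part of why the per-layer design space is large enough to be worth searching automatically.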
Citations
Proceedings Article
FPA-DNN: A Forward Propagation Acceleration based Deep Neural Network for Ship Detection
TL;DR: Experimental results on an optical remote sensing image dataset show that the LSDN model outperforms several state-of-the-art deep learning models in both detection accuracy and detection speed, and that the FPA-DNN model further improves detection accuracy and significantly speeds up the detection process.
Journal Article
SPATL: Salient Parameter Aggregation and Transfer Learning for Heterogeneous Federated Learning
TL;DR: SPATL uses a salient parameter selection agent and communicates only the selected parameters; it splits a model into a shared encoder and a local predictor, transferring its knowledge to heterogeneous clients via the locally customized predictor.
Proceedings Article
N3H-Core
TL;DR: N3H-Core is an FPGA-based heterogeneous computing system for neural network acceleration. It consists of DSP- and LUT-based GEneral Matrix-Multiplication (GEMM) computing cores, which together form the computing system in a heterogeneous fashion.
Proceedings Article
BLCR: Towards Real-time DNN Execution with Block-based Reweighted Pruning
Xiaolong Ma, Geng Yuan, Zhengang Li, Yifan Gong, Tianyun Zhang, Wei Niu, Zheng Zhan, Pu Zhao, Ning Liu, Jian Tang, Xue Lin, Bin Ren, Yanzhi Wang
TL;DR: BLCR is proposed, a novel block-based pruning framework that comprises a general and flexible structured pruning scheme that enjoys higher flexibility while exploiting full on-device parallelism, as well as a powerful and efficient reweighted regularization method to achieve the proposed sparsity scheme.
References
Proceedings Article
Deep Residual Learning for Image Recognition
TL;DR: This work proposes a residual learning framework to ease the training of networks that are substantially deeper than those used previously; the approach won 1st place on the ILSVRC 2015 classification task.
Proceedings Article
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan, Andrew Zisserman
TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
Proceedings Article
Going deeper with convolutions
Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich
TL;DR: Inception is a deep convolutional neural network architecture that set a new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).
Posted Content
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Sergey Ioffe, Christian Szegedy
TL;DR: Batch Normalization normalizes layer inputs for each training mini-batch to reduce internal covariate shift in deep neural networks, and achieves state-of-the-art performance on ImageNet.
Dissertation
Learning Multiple Layers of Features from Tiny Images
TL;DR: This work describes how to train a multi-layer generative model of natural images using a dataset of millions of tiny colour images.