Data-Free Quantization Through Weight Equalization and Bias Correction
Markus Nagel, Mart van Baalen, Tijmen Blankevoort, Max Welling
pp. 1325-1334
TLDR
This work introduces a data-free quantization method for deep neural networks that does not require fine-tuning or hyperparameter selection, and achieves near-original model performance on common computer vision architectures and tasks.
Abstract
We introduce a data-free quantization method for deep neural networks that does not require fine-tuning or hyperparameter selection. It achieves near-original model performance on common computer vision architectures and tasks. 8-bit fixed-point quantization is essential for efficient inference on modern deep learning hardware. However, quantizing models to run in 8-bit is a non-trivial task, frequently leading to either significant performance reduction or engineering time spent on training a network to be amenable to quantization. Our approach relies on equalizing the weight ranges in the network by making use of a scale-equivariance property of activation functions. In addition, the method corrects biases in the error that are introduced during quantization. This improves quantization accuracy, and can be applied to many common computer vision architectures with a straightforward API call. For common architectures, such as the MobileNet family, we achieve state-of-the-art quantized model performance. We further show that the method also extends to other computer vision architectures and tasks such as semantic segmentation and object detection.
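The equalization idea described in the abstract can be sketched in a few lines: for a positively homogeneous activation such as ReLU, relu(x/s) = relu(x)/s for s > 0, so a per-channel scale can be moved from one layer into the next without changing the network function. Below is a minimal NumPy sketch for two consecutive fully connected layers, assuming per-channel weight ranges are measured as maximum absolute values; `equalize_pair` is an illustrative helper, not the paper's API:

```python
import numpy as np

def equalize_pair(W1, b1, W2):
    """Equalize per-channel weight ranges of two layers joined by ReLU.

    W1: (out, in) weights of layer 1; b1: (out,) bias of layer 1;
    W2: (out2, out) weights of layer 2. Because relu(x/s) = relu(x)/s
    for s > 0, scaling output channel i of layer 1 by 1/s_i and the
    matching input channel of layer 2 by s_i preserves the function.
    """
    r1 = np.abs(W1).max(axis=1)   # range of each output channel of layer 1
    r2 = np.abs(W2).max(axis=0)   # range of each input channel of layer 2
    s = np.sqrt(r1 / r2)          # both layers end up with range sqrt(r1*r2)
    W1_eq = W1 / s[:, None]       # scale layer-1 rows (output channels) down
    b1_eq = b1 / s                # bias scales with its output channel
    W2_eq = W2 * s[None, :]       # scale layer-2 columns (input channels) up
    return W1_eq, b1_eq, W2_eq
```

Choosing s_i = sqrt(r1_i / r2_i) gives both layers the same per-channel range sqrt(r1_i * r2_i), so a single per-tensor quantization grid fits both layers far better than before equalization.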
Citations
Proceedings ArticleDOI
CrypTFlow: Secure TensorFlow Inference
TL;DR: In this article, the authors present CrypTFlow, a system that converts TensorFlow inference code into Secure Multi-Party Computation (MPC) protocols at the push of a button.
Proceedings ArticleDOI
Group Sparsity: The Hinge Between Filter Pruning and Decomposition for Network Compression
TL;DR: This paper analyzes two popular network compression techniques, i.e. filter pruning and low-rank decomposition, in a unified sense and proposes to compress the whole network jointly instead of in a layer-wise manner.
Posted Content
Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation.
TL;DR: This paper presents a workflow for 8-bit quantization that is able to maintain accuracy within 1% of the floating-point baseline on all networks studied, including models that are more difficult to quantize, such as MobileNets and BERT-large.
Posted Content
Up or Down? Adaptive Rounding for Post-Training Quantization
TL;DR: AdaRound is proposed, a better weight-rounding mechanism for post-training quantization that adapts to the data and the task loss; it outperforms rounding-to-nearest by a significant margin and establishes a new state of the art for post-training quantization on several networks and tasks.
Proceedings ArticleDOI
CrypTFlow2: Practical 2-Party Secure Inference.
Deevashwer Rathee, Mayank Rathee, Nishant Kumar, Nishanth Chandran, Divya Gupta, Aseem Rastogi, Rahul Sharma
TL;DR: In this article, the authors present CrypTFlow2, a cryptographic framework for secure inference over realistic deep neural networks (DNNs) using secure 2-party computation.
References
Proceedings ArticleDOI
Deep Residual Learning for Image Recognition
TL;DR: In this article, the authors propose a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won 1st place in the ILSVRC 2015 classification task.
Journal ArticleDOI
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael S. Bernstein, Alexander C. Berg, Li Fei-Fei
TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) as mentioned in this paper is a benchmark in object category classification and detection on hundreds of object categories and millions of images, which has been run annually from 2010 to present, attracting participation from more than fifty institutions.
Book ChapterDOI
SSD: Single Shot MultiBox Detector
Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, Alexander C. Berg
TL;DR: The approach, named SSD, discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location, which makes SSD easy to train and straightforward to integrate into systems that require a detection component.
Proceedings Article
Rectified Linear Units Improve Restricted Boltzmann Machines
Vinod Nair, Geoffrey E. Hinton
TL;DR: This paper replaces the binary stochastic hidden units of restricted Boltzmann machines with rectified linear units, which learn features that are better for object recognition on the NORB dataset and face verification on the Labeled Faces in the Wild dataset.
Posted Content
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
Andrew Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, M. Andreetto, Hartwig Adam
TL;DR: This work introduces two simple global hyper-parameters that efficiently trade off between latency and accuracy and demonstrates the effectiveness of MobileNets across a wide range of applications and use cases including object detection, fine-grained classification, face attributes and large-scale geo-localization.