Open Access Proceedings Article (DOI)

Data-Free Quantization Through Weight Equalization and Bias Correction

TLDR
This work introduces a data-free quantization method for deep neural networks that does not require fine-tuning or hyperparameter selection, and achieves near-original model performance on common computer vision architectures and tasks.
Abstract
We introduce a data-free quantization method for deep neural networks that does not require fine-tuning or hyperparameter selection. It achieves near-original model performance on common computer vision architectures and tasks. 8-bit fixed-point quantization is essential for efficient inference on modern deep learning hardware. However, quantizing models to run in 8-bit is a non-trivial task, frequently leading to either significant performance reduction or engineering time spent on training a network to be amenable to quantization. Our approach relies on equalizing the weight ranges in the network by making use of a scale-equivariance property of activation functions. In addition, the method corrects the biased error that quantization introduces in the layer outputs. This improves quantized-model accuracy and can be applied to many common computer vision architectures with a straightforward API call. For common architectures, such as the MobileNet family, we achieve state-of-the-art quantized model performance. We further show that the method also extends to other computer vision architectures and tasks such as semantic segmentation and object detection.
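To make the two steps concrete, the following is a minimal NumPy sketch of cross-layer weight equalization and bias correction for a pair of fully connected layers separated by a ReLU. The function names, the fully connected setting, and the assumption that the input mean `mean_in` is available (in the paper it is derived data-free from batch-normalization statistics) are our own, not the authors' API.

```python
import numpy as np

def equalize_pair(w1, b1, w2, eps=1e-8):
    """Cross-layer weight equalization (illustrative sketch).

    For y = w2 @ relu(w1 @ x + b1), ReLU satisfies relu(s * x) = s * relu(x)
    for any s > 0, so dividing output channel i of layer 1 by s_i while
    multiplying the matching input channel of layer 2 by s_i leaves the
    network function unchanged. Choosing s_i = sqrt(r1_i * r2_i) / r2_i
    equalizes the per-channel weight ranges, which reduces the error of
    per-tensor quantization.
    """
    r1 = np.abs(w1).max(axis=1)           # range of each output channel of layer 1
    r2 = np.abs(w2).max(axis=0)           # range of each input channel of layer 2
    s = np.sqrt(r1 * r2) / (r2 + eps)     # equalizing scale per channel
    return w1 / s[:, None], b1 / s, w2 * s[None, :]

def correct_bias(b, w_fp, w_q, mean_in):
    """Bias correction (illustrative sketch).

    Quantizing weights adds an error eps = w_q - w_fp whose expected effect
    on the layer output is eps @ E[x]. Subtracting that term from the bias
    removes the shift; E[x] (`mean_in` here) can be obtained without data
    from the batch-normalization statistics of the preceding layer.
    """
    return b - (w_q - w_fp) @ mean_in
```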



Citations
Proceedings ArticleDOI

CrypTFlow: Secure TensorFlow Inference

TL;DR: In this article, the authors present CrypTFlow, a system that converts TensorFlow inference code into Secure Multi-Party Computation (MPC) protocols at the push of a button.
Proceedings ArticleDOI

Group Sparsity: The Hinge Between Filter Pruning and Decomposition for Network Compression

TL;DR: This paper analyzes two popular network compression techniques, i.e. filter pruning and low-rank decomposition, within a unified framework and proposes to compress the whole network jointly instead of in a layer-wise manner; a generic sketch of the underlying group-sparsity idea follows.
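As a generic illustration of group sparsity (our sketch, not the paper's exact formulation), a group-lasso penalty sums the L2 norms of filter groups, driving whole groups to zero so they can be pruned or factored out:

```python
import numpy as np

def group_lasso(w):
    """Group-lasso penalty: sum over groups of the L2 norm of each group.

    With w of shape (out_channels, fan_in), each row is one filter; adding
    this penalty to the training loss pushes entire filters toward zero,
    which is what links structured pruning and low-rank decomposition.
    """
    return np.sqrt((w ** 2).sum(axis=1)).sum()
```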
Posted Content

Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation

TL;DR: This paper presents a workflow for 8-bit quantization that is able to maintain accuracy within 1% of the floating-point baseline on all networks studied, including models that are more difficult to quantize, such as MobileNets and BERT-large.
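For context on what such a workflow computes, here is a minimal sketch of the standard uniform affine (asymmetric) 8-bit scheme; this is the generic textbook formulation, not code from the paper:

```python
import numpy as np

def quantize_uint8(x):
    """Uniform affine quantization to unsigned 8-bit (generic sketch)."""
    scale = max((x.max() - x.min()) / 255.0, 1e-8)    # float range -> 256 levels
    zero_point = int(np.round(-x.min() / scale))      # integer code for real 0.0
    q = np.clip(np.round(x / scale) + zero_point, 0, 255).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float values from the 8-bit codes."""
    return scale * (q.astype(np.float32) - zero_point)
```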
Posted Content

Up or Down? Adaptive Rounding for Post-Training Quantization

TL;DR: AdaRound is proposed, a better weight-rounding mechanism for post-training quantization that adapts to the data and the task loss; it outperforms rounding-to-nearest by a significant margin and establishes a new state-of-the-art for post-training quantization on several networks and tasks.
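The core idea can be sketched as follows: each weight is floored, and a learnable offset decides whether it rounds up or down. This shows only the parametrization with the paper's rectified sigmoid; the data-driven optimization of `v` against a per-layer reconstruction loss is omitted:

```python
import numpy as np

def adaround_quantize(w, v, scale):
    """AdaRound-style rounding (parametrization sketch only).

    h(v) is a rectified sigmoid squashed into [0, 1]; during optimization
    it relaxes the binary up/down choice, and after training it saturates
    so each weight is floored (h = 0) or ceiled (h = 1).
    """
    h = np.clip(1.2 / (1.0 + np.exp(-v)) - 0.1, 0.0, 1.0)
    return scale * (np.floor(w / scale) + h)
```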
Proceedings ArticleDOI

CrypTFlow2: Practical 2-Party Secure Inference

TL;DR: In this article, the authors present CrypTFlow2, a cryptographic framework for secure inference over realistic deep neural networks (DNNs) using secure 2-party computation.
References
Proceedings ArticleDOI

Deep Residual Learning for Image Recognition

TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won the 1st place on the ILSVRC 2015 classification task.
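As a small illustration of the idea (our sketch, not the paper's code): a residual block adds an identity shortcut so the layers only have to learn a correction F(x) rather than the full mapping.

```python
import numpy as np

rng = np.random.default_rng(0)
w1, w2 = rng.normal(size=(16, 16)) * 0.1, rng.normal(size=(16, 16)) * 0.1
relu = lambda z: np.maximum(z, 0.0)

def residual_block(x):
    """y = F(x) + x: the weights learn the residual F, while the identity
    shortcut lets the signal and gradients pass through unchanged."""
    return relu(x @ w1) @ w2 + x

y = residual_block(rng.normal(size=(4, 16)))   # same shape in and out
```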
Journal ArticleDOI

ImageNet Large Scale Visual Recognition Challenge

TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) as mentioned in this paper is a benchmark in object category classification and detection on hundreds of object categories and millions of images, which has been run annually from 2010 to present, attracting participation from more than fifty institutions.
Book ChapterDOI

SSD: Single Shot MultiBox Detector

TL;DR: The approach, named SSD, discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location, which makes SSD easy to train and straightforward to integrate into systems that require a detection component.
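A minimal sketch of how such default boxes are tiled (simplified from the paper: per-layer scales and the extra box for aspect ratio 1 are omitted):

```python
import numpy as np

def default_boxes(fmap_size, scale, aspect_ratios=(1.0, 2.0, 0.5)):
    """Tile one default box per aspect ratio at every feature-map cell.

    The detector then predicts class scores and coordinate offsets relative
    to these fixed boxes rather than free-form box coordinates.
    """
    boxes = []
    for i in range(fmap_size):
        for j in range(fmap_size):
            cx, cy = (j + 0.5) / fmap_size, (i + 0.5) / fmap_size
            for ar in aspect_ratios:
                w, h = scale * np.sqrt(ar), scale / np.sqrt(ar)
                boxes.append((cx, cy, w, h))
    return np.array(boxes)       # shape: (fmap_size**2 * len(aspect_ratios), 4)
```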
Proceedings Article

Rectified Linear Units Improve Restricted Boltzmann Machines

TL;DR: Restricted Boltzmann machines, originally developed with binary stochastic hidden units, can instead use rectified linear units, which learn features that are better for object recognition on the NORB dataset and face verification on the Labeled Faces in the Wild dataset.
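The unit itself is simple to sketch; per the paper, an infinite stack of tied binary units can be approximated during training by a rectified linear unit with input-dependent Gaussian noise (our NumPy rendering):

```python
import numpy as np

def noisy_relu(x, rng=None):
    """Noisy rectified linear unit: max(0, x + N(0, sigmoid(x))).

    The noise variance sigmoid(x) comes from the binary-unit approximation;
    at test time the unit is just the deterministic max(0, x).
    """
    rng = rng or np.random.default_rng()
    var = 1.0 / (1.0 + np.exp(-x))                 # sigmoid(x) as variance
    return np.maximum(0.0, x + rng.normal(scale=np.sqrt(var)))
```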
Posted Content

MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications

TL;DR: This work introduces two simple global hyper-parameters that efficiently trade off between latency and accuracy and demonstrates the effectiveness of MobileNets across a wide range of applications and use cases including object detection, fine-grained classification, face attributes, and large-scale geo-localization; a cost sketch follows.
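The two hyper-parameters are the width multiplier alpha (thins the channels of every layer) and the resolution multiplier rho (shrinks the feature maps); their effect on the mult-add cost of a depthwise separable layer can be sketched as:

```python
def separable_layer_cost(dk, m, n, df, alpha=1.0, rho=1.0):
    """Mult-adds of one depthwise separable conv layer (sketch).

    dk: kernel size, m/n: input/output channels, df: square feature-map side.
    Both multipliers scale the cost roughly by alpha**2 * rho**2.
    """
    m, n, df = alpha * m, alpha * n, rho * df
    depthwise = dk * dk * m * df * df      # one dk x dk filter per channel
    pointwise = m * n * df * df            # 1x1 conv mixing channels
    return depthwise + pointwise

# e.g. a 3x3 layer with 512 -> 512 channels on a 14 x 14 map, at alpha = 0.75
print(separable_layer_cost(3, 512, 512, 14, alpha=0.75))
```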