Posted Content

PACT: Parameterized Clipping Activation for Quantized Neural Networks

TL;DR: It is shown, for the first time, that both weights and activations can be quantized to 4-bits of precision while still achieving accuracy comparable to full precision networks across a range of popular models and datasets.
Abstract: Deep learning algorithms achieve high classification accuracy at the expense of significant computation cost. To address this cost, a number of quantization schemes have been proposed - but most of these techniques focused on quantizing weights, which are relatively smaller in size compared to activations. This paper proposes a novel quantization scheme for activations during training that enables neural networks to work well with ultra-low-precision weights and activations without any significant accuracy degradation. This technique, PArameterized Clipping acTivation (PACT), uses an activation clipping parameter $\alpha$ that is optimized during training to find the right quantization scale. PACT allows quantizing activations to arbitrary bit precisions, while achieving much better accuracy relative to published state-of-the-art quantization schemes. We show, for the first time, that both weights and activations can be quantized to 4 bits of precision while still achieving accuracy comparable to full-precision networks across a range of popular models and datasets. We also show that exploiting these reduced-precision computational units in hardware can enable a super-linear improvement in inferencing performance, due to a significant reduction in the area of accelerator compute engines coupled with the ability to retain the quantized model and activation data in on-chip memories.
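
As a concrete illustration of the technique the abstract describes, here is a minimal PyTorch sketch of a PACT activation layer, assuming the straight-through estimator for the rounding step and one learnable clipping scalar per layer; the class and argument names are ours, not from the paper's code:

```python
import torch
import torch.nn as nn

class PACT(nn.Module):
    """y = clip(x, 0, alpha), followed by k-bit uniform quantization.

    alpha is a learnable scalar trained jointly with the weights; the
    non-differentiable round() is bypassed with a straight-through estimator.
    """
    def __init__(self, bits=4, alpha_init=10.0):
        super().__init__()
        self.bits = bits
        self.alpha = nn.Parameter(torch.tensor(alpha_init))

    def forward(self, x):
        # Clip to [0, alpha]; written so that gradients reach alpha
        # whenever x exceeds the clipping level.
        y = torch.clamp(x, min=0.0) - torch.clamp(x - self.alpha, min=0.0)
        # Uniform quantization of the clipped range to 2^bits - 1 levels.
        scale = (2 ** self.bits - 1) / self.alpha
        y_q = torch.round(y * scale) / scale
        # Straight-through estimator: forward uses y_q, backward sees y.
        return y + (y_q - y).detach()
```

Because alpha is a trained parameter, each layer finds its own clipping level, which is what lets the quantization grid track the actual activation range.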


Citations
Book ChapterDOI
31 Mar 2019
TL;DR: A Single Path One-Shot model is proposed to construct a simplified supernet, where all architectures are single paths, so that the weight co-adaptation problem is alleviated.
Abstract: We revisit the one-shot Neural Architecture Search (NAS) paradigm and analyze its advantages over existing NAS approaches. The existing one-shot method, however, is hard to train and not yet effective on large-scale datasets like ImageNet. This work proposes a Single Path One-Shot model to address the challenges in training. Our central idea is to construct a simplified supernet, where all architectures are single paths, so that the weight co-adaptation problem is alleviated. Training is performed by uniform path sampling. All architectures (and their weights) are trained fully and equally.
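
A small PyTorch sketch of the single-path training idea from this abstract, assuming uniform sampling over per-layer candidate blocks; all class and function names are illustrative:

```python
import random
import torch.nn as nn

class ChoiceLayer(nn.Module):
    """One supernet layer holding several candidate blocks (single-path choices)."""
    def __init__(self, candidates):
        super().__init__()
        self.candidates = nn.ModuleList(candidates)

    def forward(self, x, choice):
        return self.candidates[choice](x)

class SinglePathSupernet(nn.Module):
    def __init__(self, layers):
        super().__init__()
        self.layers = nn.ModuleList(layers)

    def sample_path(self):
        # Uniform path sampling: pick one candidate per layer.
        return [random.randrange(len(l.candidates)) for l in self.layers]

    def forward(self, x, path):
        for layer, choice in zip(self.layers, path):
            x = layer(x, choice)
        return x

# Each training step activates exactly one sampled path, e.g.:
#   path = supernet.sample_path(); loss = criterion(supernet(x, path), y)
# so candidate weights are trained fully and equally rather than co-adapting.
```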

610 citations


Cites methods from "PACT: Parameterized Clipping Activa..."

  • ...We use PACT [5] as the quantization algorithm....


  • ...Following [36, 5, 27], we only search and quantize the res-blocks, excluding the first convolutional layer and the last fully-connected layer....


Proceedings ArticleDOI
Kuan Wang, Zhijian Liu, Yujun Lin, Ji Lin, Song Han
15 Jun 2019
TL;DR: Wang et al. introduce the Hardware-Aware Automated Quantization (HAQ) framework, which leverages reinforcement learning to automatically determine the quantization policy and takes the hardware accelerator's feedback into account in the design loop.
Abstract: Model quantization is a widely used technique to compress and accelerate deep neural network (DNN) inference. Emergent DNN hardware accelerators begin to support mixed precision (1-8 bits) to further improve the computation efficiency, which raises a great challenge: finding the optimal bitwidth for each layer requires domain experts to explore the vast design space, trading off among accuracy, latency, energy, and model size, which is both time-consuming and sub-optimal. There is plenty of specialized hardware for neural networks, but little research has been done on specializing neural network optimization for a particular hardware architecture. Conventional quantization algorithms ignore the different hardware architectures and quantize all the layers in a uniform way. In this paper, we introduce the Hardware-Aware Automated Quantization (HAQ) framework, which leverages reinforcement learning to automatically determine the quantization policy, and we take the hardware accelerator's feedback into the design loop. Rather than relying on proxy signals such as FLOPs and model size, we employ a hardware simulator to generate direct feedback signals (latency and energy) to the RL agent. Compared with conventional methods, our framework is fully automated and can specialize the quantization policy for different neural network architectures and hardware architectures. Our framework effectively reduced the latency by 1.4-1.95x and the energy consumption by 1.9x with negligible loss of accuracy compared with fixed-bitwidth (8-bit) quantization. Our framework reveals that the optimal policies on different hardware architectures (i.e., edge and cloud architectures) under different resource constraints (i.e., latency, energy, and model size) are drastically different. We interpret the implications of different quantization policies, which offer insights for both neural network architecture design and hardware architecture design.
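
To make the search loop concrete, here is a toy, self-contained sketch of hardware-aware mixed-precision search in the spirit of HAQ; `toy_latency` and `toy_accuracy` are invented stand-ins for the paper's hardware simulator and fine-tuned accuracy, and random sampling replaces the RL (DDPG) agent the paper actually uses:

```python
import random

LAYER_SIZES = [1.0, 2.0, 4.0, 2.0]   # relative compute per layer (made up)
LATENCY_BUDGET = 20.0                # hardware constraint (made up)

def toy_latency(bits):
    # Stand-in for direct simulator feedback: latency grows with bitwidth.
    return sum(s * b for s, b in zip(LAYER_SIZES, bits))

def toy_accuracy(bits):
    # Stand-in for fine-tuned accuracy: diminishing returns above 6 bits.
    return sum(min(b, 6) for b in bits) / (6 * len(LAYER_SIZES))

best = None
for _ in range(1000):
    # Propose a bitwidth (1-8) per layer; the real method learns this policy.
    bits = [random.randint(1, 8) for _ in LAYER_SIZES]
    # Enforce the hardware constraint by lowering the widest layer.
    while toy_latency(bits) > LATENCY_BUDGET:
        bits[bits.index(max(bits))] -= 1
    acc = toy_accuracy(bits)
    if best is None or acc > best[0]:
        best = (acc, bits)

print("best mixed-precision policy:", best)
```

The key design point the abstract emphasizes survives even in this toy: the constraint is checked against (simulated) hardware feedback, not proxies like FLOPs or model size.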

467 citations



Cites methods from "PACT: Parameterized Clipping Activa..."

  • Quoted comparison table (Top-1/Top-5 accuracy in % at matched latency): PACT [3] (4 bits): 62.44/84.19 @ 45.45 ms; 61.39/83.72 @ 52.15 ms; 62.44/84.19 @ 57.49 ms; 61.39/83.72 @ 74.46 ms. Ours (flexible bitwidth): 67.40/87.90 @ 45.51 ms; 66.99/87.33 @ 52.12 ms; 65.33/86.60 @ 57.40 ms; 67.01/87.46 @ 73.97 ms. "...our edge device and Xilinx VU9P [29] as our cloud device...."


  • ...In order to demonstrate the effectiveness of our framework on different hardware architectures, we further compare our framework with PACT [3] under the latency constraints on the BitFusion [26] architecture (Table 4)....


  • ...Conventional quantization methods use the same number of bits for all layers [3, 15], but as different layers have different redundancy and behave differently on the hardware (computation bounded or memory bounded), it is necessary to use mixed precision for different layers (as shown in Figure 1)....


  • ...As for comparison, we adopt PACT [3] as our baseline, which uses the same number of bits for all layers except the first; since the first layer extracts low-level features, has fewer parameters, and is very sensitive to errors, they use 8 bits for both its weights and activations....


  • ...Similar to the latency-constrained experiments, we compare our framework with PACT [3] that uses fixed number of bits without hardware feedback....


Proceedings ArticleDOI
01 Oct 2019
TL;DR: Differentiable Soft Quantization (DSQ) is proposed to bridge the gap between full-precision and low-bit networks; it evolves automatically during training to gradually approximate standard quantization.
Abstract: Hardware-friendly network quantization (e.g., binary/uniform quantization) can efficiently accelerate the inference and meanwhile reduce memory consumption of the deep neural networks, which is crucial for model deployment on resource-limited devices like mobile phones. However, due to the discreteness of low-bit quantization, existing quantization methods often face the unstable training process and severe performance degradation. To address this problem, in this paper we propose Differentiable Soft Quantization (DSQ) to bridge the gap between the full-precision and low-bit networks. DSQ can automatically evolve during training to gradually approximate the standard quantization. Owing to its differentiable property, DSQ can help pursue the accurate gradients in backward propagation, and reduce the quantization loss in forward process with an appropriate clipping range. Extensive experiments over several popular network structures show that training low-bit neural networks with DSQ can consistently outperform state-of-the-art quantization methods. Besides, our first efficient implementation for deploying 2 to 4-bit DSQ on devices with ARM architecture achieves up to 1.7× speed up, compared with the open-source 8-bit high-performance inference framework NCNN [31].
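
A hedged sketch of the soft quantization function this abstract describes: inside each quantization cell a scaled tanh approximates the hard rounding step, and as the temperature parameter shrinks the curve approaches standard uniform quantization. This is simplified from the paper, and the parameterization below (`alpha` as a temperature) is ours:

```python
import math
import torch

def dsq(x, l, u, bits=2, alpha=0.2):
    """Differentiable soft quantization of x over the clipping range [l, u]."""
    levels = 2 ** bits - 1
    delta = (u - l) / levels                      # width of one quantization cell
    x = torch.clamp(x, l, u)
    i = torch.floor((x - l) / delta).clamp(max=levels - 1)
    m = l + (i + 0.5) * delta                     # cell midpoint
    k = 1.0 / alpha                               # tanh sharpness
    s = 1.0 / math.tanh(0.5 * k * delta)          # rescale so cell edges map to +/-1
    phi = s * torch.tanh(k * (x - m))             # soft "round" within the cell
    return m + 0.5 * delta * phi                  # smooth, fully differentiable
```

As alpha shrinks toward 0 this converges to ordinary rounding, which is why training can start smooth (accurate gradients) and gradually approach hard quantization.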

363 citations

Journal ArticleDOI
TL;DR: A comprehensive survey of algorithms proposed for binary neural networks, mainly categorized into native solutions that directly conduct binarization and optimized ones using techniques such as minimizing the quantization error, improving the network loss function, and reducing the gradient error.

346 citations


Cites methods from "PACT: Parameterized Clipping Activa..."

  • ...can also adopt a more flexible binarization function and learn its parameters while minimizing the quantization error. To achieve this goal, Choi et al. proposed PArameterized Clipping acTivation (PACT) [74] with a learnable upper bound for the activation function. The optimized upper bound of each layer is able to ensure that the quantization range of each layer is aligned with the original distribution...


  • [Fragment of the survey's method-overview table, listing PArameterized Clipping acTivation [74] among binarization techniques such as High-Order Residual Quantization [70], ABC-Net [71], Two-Step Quantization [72], Binary Weight Networks via Hashing [73], LQ-Nets [61], Wide Reduced-Precision Networks [75], XNOR-Net++ [76], Learning Symmetric Quantization [77], BBG [78], and Real-to-Bin [79].]


  • [Fragment of the survey's ImageNet results table (bit-width W/A, topology, Top-1/Top-5 accuracy in %). PACT [74] rows: 1/32 ResNet-18 65.8/86.7; 1/2 ResNet-18 62.9/84.7; 1/2 ResNet-50 67.8/87.9. Neighboring rows cover ABC-Net [71], TSQ [72], BWNH [73], and LQ-Nets [61].]


  • ...ions are binarized. Thus eliminating the influence of activation binarization is usually much more important when designing a binary network, which becomes the main motivation for studies like [85] and [74]....


References
Proceedings ArticleDOI
27 Jun 2016
TL;DR: The authors propose a residual learning framework that eases the training of networks substantially deeper than those used previously; an ensemble of these residual nets won 1st place on the ILSVRC 2015 classification task.
Abstract: Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers—8× deeper than VGG nets [40] but still having lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers. The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions1, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.
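
A minimal sketch of the identity-shortcut block the abstract describes (the "basic" two-convolution variant with matching channel counts; real ResNets also use strided and projection shortcuts):

```python
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Basic residual block: the layers learn F(x) and the output is F(x) + x,
    so the block only has to model a residual relative to its input."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)   # identity shortcut
```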

123,388 citations

Proceedings Article
01 Jan 2015
TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
Abstract: We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. The method is straightforward to implement, is computationally efficient, has little memory requirements, is invariant to diagonal rescaling of the gradients, and is well suited for problems that are large in terms of data and/or parameters. The method is also appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients. The hyper-parameters have intuitive interpretations and typically require little tuning. Some connections to related algorithms, on which Adam was inspired, are discussed. We also analyze the theoretical convergence properties of the algorithm and provide a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework. Empirical results demonstrate that Adam works well in practice and compares favorably to other stochastic optimization methods. Finally, we discuss AdaMax, a variant of Adam based on the infinity norm.
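
The update the abstract summarizes, written out for a single scalar parameter with the paper's default hyper-parameters:

```python
import math

def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter (t is the 1-based step count)."""
    m = beta1 * m + (1 - beta1) * grad          # 1st-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # 2nd-moment (uncentered variance)
    m_hat = m / (1 - beta1 ** t)                # bias correction for zero init
    v_hat = v / (1 - beta2 ** t)
    param -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v
```

The bias-correction terms counteract the zero initialization of m and v, which matters most in the first few steps.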

111,197 citations


"PACT: Parameterized Clipping Activa..." refers methods in this paper

  • ...We used ADAM with epsilon 10^-5 and learning rate starting from 10^-4 and scaled by 0.2 at epochs 56 and 64....


  • ...• Quantization using Wide Reduced-Precision Networks (WRPN, Mishra et al. (2017)): A scheme to increase the number of filter maps to increase robustness for activation quantization....


  • ...• Fine-grained Quantization (FGQ, Mellempudi et al. (2017)): A direct quantization scheme (i.e., little re-training needed) based on fine-grained grouping (i.e., within a small subset of filter maps)....


  • ...(2017)), FGQ (Mellempudi et al. (2017)), WEP (Park et al. (2017)), LPBN (Graham (2017)), and HWGQ (Cai et al....


  • ...For comparisons, we include accuracy results reported in the following prior work: DoReFa (Zhou et al. (2016)), BalancedQ (Zhou et al. (2017)), WRPN (Mishra et al. (2017)), FGQ (Mellempudi et al. (2017)), WEP (Park et al. (2017)), LPBN (Graham (2017)), and HWGQ (Cai et al. (2017))....


Proceedings Article
03 Dec 2012
TL;DR: A large, deep convolutional neural network consisting of five convolutional layers, some followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax achieved state-of-the-art performance on ImageNet classification.
Abstract: We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0%, which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers we employed a recently-developed regularization method called "dropout" that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.
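
A compact PyTorch sketch of the architecture the abstract describes, for a 227×227 input: five convolutional layers (some followed by max-pooling), three fully-connected layers with dropout, and a final 1000-way classifier. Local response normalization and the original two-GPU channel grouping are omitted for brevity:

```python
import torch.nn as nn

alexnet = nn.Sequential(
    nn.Conv2d(3, 96, 11, stride=4), nn.ReLU(), nn.MaxPool2d(3, 2),
    nn.Conv2d(96, 256, 5, padding=2), nn.ReLU(), nn.MaxPool2d(3, 2),
    nn.Conv2d(256, 384, 3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 384, 3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 256, 3, padding=1), nn.ReLU(), nn.MaxPool2d(3, 2),
    nn.Flatten(),
    nn.Dropout(0.5), nn.Linear(256 * 6 * 6, 4096), nn.ReLU(),
    nn.Dropout(0.5), nn.Linear(4096, 4096), nn.ReLU(),
    nn.Linear(4096, 1000),   # logits; softmax is applied in the loss
)
```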

73,978 citations

Proceedings Article
Sergey Ioffe, Christian Szegedy
06 Jul 2015
TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
Abstract: Training Deep Neural Networks is complicated by the fact that the distribution of each layer's inputs changes during training, as the parameters of the previous layers change. This slows down the training by requiring lower learning rates and careful parameter initialization, and makes it notoriously hard to train models with saturating nonlinearities. We refer to this phenomenon as internal covariate shift, and address the problem by normalizing layer inputs. Our method draws its strength from making normalization a part of the model architecture and performing the normalization for each training mini-batch. Batch Normalization allows us to use much higher learning rates and be less careful about initialization, and in some cases eliminates the need for Dropout. Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin. Using an ensemble of batch-normalized networks, we improve upon the best published result on ImageNet classification: reaching 4.82% top-5 test error, exceeding the accuracy of human raters.
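
The per-mini-batch normalization the abstract describes, as a short training-mode sketch for a batch of feature vectors of shape (N, D); inference uses running statistics instead, which are omitted here:

```python
import torch

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the mini-batch to zero mean / unit variance,
    then apply a learned affine transform (gamma, beta of shape (D,)) so the
    layer can still represent the identity."""
    mean = x.mean(dim=0)
    var = x.var(dim=0, unbiased=False)
    x_hat = (x - mean) / torch.sqrt(var + eps)
    return gamma * x_hat + beta
```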

30,843 citations


"PACT: Parameterized Clipping Activa..." refers background in this paper

  • ...Graham (2017) recommends that normalized activation, in the process of batch normalization (Ioffe & Szegedy (2015), BatchNorm), is a good candidate for quantization....


Journal ArticleDOI
TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is a benchmark in object category classification and detection on hundreds of object categories and millions of images; it has been run annually from 2010 to the present, attracting participation from more than fifty institutions.
Abstract: The ImageNet Large Scale Visual Recognition Challenge is a benchmark in object category classification and detection on hundreds of object categories and millions of images. The challenge has been run annually from 2010 to present, attracting participation from more than fifty institutions. This paper describes the creation of this benchmark dataset and the advances in object recognition that have been possible as a result. We discuss the challenges of collecting large-scale ground truth annotation, highlight key breakthroughs in categorical object recognition, provide a detailed analysis of the current state of the field of large-scale image classification and object detection, and compare the state-of-the-art computer vision accuracy with human accuracy. We conclude with lessons learned in the 5 years of the challenge, and propose future directions and improvements.

30,811 citations