PACT: Parameterized Clipping Activation for Quantized Neural Networks
Citations
610 citations
Cites methods from "PACT: Parameterized Clipping Activation for Quantized Neural Networks"
...We use PACT [5] as the quantization algorithm....
[...]
...Following [36, 5, 27], we only search and quantize the res-blocks, excluding the first convolutional layer and the last fully-connected layer....
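As a concrete illustration of that layer-selection policy, here is a minimal sketch assuming a torchvision-style ResNet (where the stem convolution is named "conv1" and the classifier "fc"); it only gathers the layers that would be quantized and is not the cited papers' actual search code.

```python
import torch.nn as nn
from torchvision.models import resnet18

def layers_to_quantize(model: nn.Module):
    # Assumed torchvision naming: the first convolution is "conv1" and the
    # final fully-connected layer is "fc"; both stay in full precision.
    skip = {"conv1", "fc"}
    selected = []
    for name, module in model.named_modules():
        if isinstance(module, (nn.Conv2d, nn.Linear)) and name not in skip:
            selected.append(name)  # res-block convolutions (and downsamplers)
    return selected

# Usage example: list the layer names that would be searched over and quantized.
print(layers_to_quantize(resnet18()))
```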
[...]
401 citations
Cites methods from "PACT: Parameterized Clipping Activation for Quantized Neural Networks"
...Top-1 / Top-5 accuracy and latency (flattened table excerpt):
PACT [3], 4 bits: 62.44 / 84.19 (45.45 ms); 61.39 / 83.72 (52.15 ms); 62.44 / 84.19 (57.49 ms); 61.39 / 83.72 (74.46 ms)
Ours, flexible bits: 67.40 / 87.90 (45.51 ms); 66.99 / 87.33 (52.12 ms); 65.33 / 86.60 (57.40 ms); 67.01 / 87.46 (73.97 ms)
...our edge device and Xilinx VU9P [29] as our cloud device....
[...]
...In order to demonstrate the effectiveness of our framework on different hardware architectures, we further compare our framework with PACT [3] under the latency constraints on the BitFusion [26] architecture (Table 4)....
[...]
...Conventional quantization methods use the same number of bits for all layers [3, 15], but since different layers have different redundancy and behave differently on the hardware (computation-bound or memory-bound), it is necessary to use mixed precision for different layers (as shown in Figure 1)....
[...]
...For comparison, we adopt PACT [3] as our baseline, which uses the same number of bits for all layers except the first layer, which extracts low-level features; for that layer they use 8 bits for both weights and activations, as it has few parameters and is very sensitive to errors....
[...]
...Similar to the latency-constrained experiments, we compare our framework with PACT [3], which uses a fixed number of bits without hardware feedback....
[...]
346 citations
Cites methods from "PACT: Parameterized Clipping Activation for Quantized Neural Networks"
...can also adopt a more flexible binarization function and learn its parameters while minimizing the quantization error. To achieve this goal, Choi et al. proposed PArameterized Clipping Activation (PACT) [74] with a learnable upper bound for the activation function. The optimized upper bound of each layer is able to ensure that the quantization range of each layer is aligned with the original distribution...
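To make the mechanism in the excerpt above concrete, below is a minimal PyTorch-style sketch of a PACT-like activation: a learnable upper bound alpha clips the activation, and the clipped range is then uniformly quantized. The module name, the initial alpha, the bit-width, and the straight-through estimator for gradients are illustrative assumptions, not the authors' reference implementation.

```python
import torch
import torch.nn as nn

class PACTActivation(nn.Module):
    # Sketch of a clipped, quantized activation with a learnable upper bound.
    def __init__(self, bits: int = 4, alpha_init: float = 10.0):
        super().__init__()
        self.bits = bits
        self.alpha = nn.Parameter(torch.tensor(alpha_init))  # learnable clipping level

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Clip activations to [0, alpha]; gradients reach alpha where x exceeds it.
        y = torch.minimum(torch.relu(x), self.alpha)
        # Uniform quantization of the clipped range into 2^bits - 1 levels.
        scale = (2 ** self.bits - 1) / self.alpha
        y_q = torch.round(y * scale) / scale
        # Straight-through estimator: quantized forward pass, full-precision gradient.
        return y + (y_q - y).detach()
```

Because alpha is a regular parameter, it can be trained jointly with the network weights, which is how the quantization range of each layer can track the layer's activation distribution.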
[...]
...[flattened table excerpt listing quantization/binarization methods, including High-Order Residual Quantization [70], ABC-Net [71], Two-Step Quantization [72], Binary Weight Networks via Hashing [73], PArameterized Clipping acTivation [74], LQ-Nets [61], Wide Reduced-Precision Networks [75], XNOR-Net++ [76], Learning Symmetric Quantization [77], BBG [78], Real-to-Bin [79], and Improve Netw...]...
[...]
...results per method (bit-width W/A, topology, accuracy %):
ABC-Net [71]: 1/32 ResNet-18 62.8 / 84.4; 2/32 ResNet-18 63.7 / 85.2; 1/1 ResNet-18 42.7 / 67.6; 1/1 ResNet-34 52.4 / 76.5
TSQ [72]: 1/1 AlexNet 58.0 / 80.5
BWNH [73]: 1/32 AlexNet 58.5 / 80.9; 1/32 ResNet-18 64.3 / 85.9
PACT [74]: 1/32 ResNet-18 65.8 / 86.7; 1/2 ResNet-18 62.9 / 84.7; 1/2 ResNet-50 67.8 / 87.9
LQ-Nets [61]: 1/2 ResNet-18 62.6 / 84.3; 1/2 ResNet-34 66.6 / 86.9; 1/2 ResNet-50 68.7 / 88.4; 1/2 AlexNet 55.7 / 78.8; 1/2 VGG-Variant 67....
[...]
...ions are binarized. Thus, eliminating the influence of activation binarization is usually much more important when designing a binary network, which becomes the main motivation for studies like [85] and [74]. After adding reasonable regularization to the dis... [Table 3: Image Classification Performance of Binary Neural Networks on CIFAR-10 Dataset; columns: Type, Method, Bit-Width (W/A), Topology, Acc. (%)]...
[...]
References
111,197 citations
"PACT: Parameterized Clipping Activa..." refers methods in this paper
...We used ADAM with epsilon 10^-5 and a learning rate starting from 10^-4, scaled by 0.2 at epochs 56 and 64....
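Read as a recipe, that setup maps onto a standard optimizer configuration. Below is a minimal sketch assuming PyTorch, with a placeholder model and an illustrative epoch count; only the epsilon, the initial learning rate, and the 0.2 decay at epochs 56 and 64 come from the excerpt.

```python
import torch

model = torch.nn.Linear(10, 10)  # placeholder model for illustration
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, eps=1e-5)
# Multiply the learning rate by 0.2 after epochs 56 and 64.
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[56, 64], gamma=0.2)

for epoch in range(70):  # illustrative total number of epochs
    # ... forward/backward passes and optimizer.step() for one epoch would go here ...
    scheduler.step()     # advance the learning-rate schedule once per epoch
```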
[...]
...• Quantization using Wide Reduced-Precision Networks (WRPN, Mishra et al. (2017)): A scheme that increases the number of filter maps to improve robustness to activation quantization....
[...]
...• Fine-grained Quantization (FGQ, Mellempudi et al. (2017)): A direct quantization scheme (i.e., little re-training needed) based on fine-grained grouping (i.e., within a small subset of filter maps)....
[...]
...For comparisons, we include accuracy results reported in the following prior work: DoReFa (Zhou et al. (2016)), BalancedQ (Zhou et al. (2017)), WRPN (Mishra et al. (2017)), FGQ (Mellempudi et al. (2017)), WEP (Park et al. (2017)), LPBN (Graham (2017)), and HWGQ (Cai et al. (2017))....
[...]
30,843 citations
"PACT: Parameterized Clipping Activa..." refers background in this paper
...Graham (2017) recommends that the normalized activation produced during batch normalization (Ioffe & Szegedy (2015), BatchNorm) is a good candidate for quantization....
[...]