# A butterfly structured design of the hybrid transform coding scheme

TL;DR: This work devises a novel ADST-like transform whose kernel is consistent with that of the DCT, thereby enabling a butterfly structured computation flow while largely retaining the compression-efficiency advantages of the hybrid transform coding scheme.

Abstract: The hybrid transform coding scheme, which alternates between the asymmetric discrete sine transform (ADST) and the discrete cosine transform (DCT) depending on the boundary prediction conditions, is an efficient tool for video and image compression. It optimally exploits the statistical characteristics of the prediction residual, thereby achieving significant coding performance gains over the conventional DCT-based approach. A practical concern lies in the intrinsic conflict between the transform kernels of the ADST and the DCT, which prevents a butterfly structured implementation for parallel computing. Hence the hybrid transform coding scheme has to rely on matrix multiplication, which presents a speed-up barrier due to under-utilization of the hardware, especially for larger block sizes. In this work, we devise a novel ADST-like transform whose kernel is consistent with that of the DCT, thereby enabling a butterfly structured computation flow, while largely retaining the performance advantages of the hybrid transform coding scheme in terms of compression efficiency. A prototype implementation of the proposed butterfly structured hybrid transform coding scheme is available in the VP9 codec repository.

## Summary

### Introduction

- Transform coding is a central component in video and image compression.
- In fact, methods along this line are typically limited to smaller transform dimensions.
- It is noteworthy that larger block size transforms provide higher transform coding gains for stationary signals and have been shown experimentally to improve compression efficiency in various video codecs.
- The authors hence use this btf-ADST to replace the original ADST in the hybrid transform coding scheme.

### II. SPATIAL PREDICTION AND TRANSFORM CODING

- The authors revisit the mathematical theory that derived the original ADST, in the context of a 1-D first-order Gauss-Markov model given a partial prediction boundary [1], which leads to the btf-ADST proposed in this work.
- Footnote 1: In practice, all the computations are performed in the integer format for speed reasons.
- This irregularity complicates an analytic derivation of the eigenvalues and eigenvectors of $P_1$.
- The approximation clearly holds for ρ → 1, which is indeed a common approximation that describes the spatial correlation of video/image signals.
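
The role of the approximation can be checked numerically. The sketch below is not taken from the paper: it assumes the standard first-order Gauss-Markov recursion x_k = ρ·x_{k-1} + e_k with a known boundary sample, takes P1 to be QᵀQ for the corresponding bidiagonal matrix Q, and assumes the approximation replaces the bottom-right entry of P1 by 1 + ρ² − ρ. The function names and the choices N = 8, ρ = 0.95 are illustrative only.

```python
import numpy as np

def adst_matrix(n):
    """Original ADST kernel (sine argument with denominator 2N+1); row j is a basis function."""
    j = np.arange(1, n + 1)[:, None]  # basis (frequency) index
    i = np.arange(1, n + 1)[None, :]  # sample index, i = 1 is next to the known boundary
    return 2.0 / np.sqrt(2 * n + 1) * np.sin((2 * j - 1) * i * np.pi / (2 * n + 1))

def p1_matrices(n, rho):
    """Return (P1, P1_hat) under the assumed Gauss-Markov model."""
    # Q encodes x_k - rho * x_{k-1} = e_k, so it is lower bidiagonal.
    q = np.eye(n) - rho * np.eye(n, k=-1)
    p1 = q.T @ q                       # diagonal 1 + rho^2, except the last entry, which is 1
    p1_hat = p1.copy()
    p1_hat[-1, -1] = 1 + rho**2 - rho  # assumed approximation; exact only as rho -> 1
    return p1, p1_hat

def max_offdiag(m):
    """Largest off-diagonal magnitude, i.e. how far m is from being diagonal."""
    return np.max(np.abs(m - np.diag(np.diag(m))))

n, rho = 8, 0.95
t_s = adst_matrix(n)
p1, p1_hat = p1_matrices(n, rho)

offdiag_exact = max_offdiag(t_s @ p1 @ t_s.T)
offdiag_approx = max_offdiag(t_s @ p1_hat @ t_s.T)
print(f"off-diagonal residue with exact P1:        {offdiag_exact:.3e}")
print(f"off-diagonal residue with approximated P1: {offdiag_approx:.3e}")
```

Under these assumptions the ADST diagonalizes the approximated matrix to within floating-point error, while a small residue remains for the exact P1; that residue shrinks as ρ approaches 1.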

### III. BUTTERFLY STRUCTURED VARIANT OF ADST

- A key observation about the ADST derived above is that the rows of $T_S$ (i.e., the basis functions of the transform) take smaller values at the beginning (closer to the known boundary) and larger values towards the other end.
- This effectively exploits the fact that pixels closer to the known boundary are better predicted and hence have statistically smaller variance than those at the far end.
- This motivates the search for a unitary sinusoidal transform whose compression performance resembles that of the ADST, in order to overcome the difficulty of designing a butterfly structure for the ADST, and hence for hybrid transform coding, in parallel computing.
- Clearly, the resulting transform also has asymmetric basis functions, but the denominator of its kernel argument, 4N, is consistent with that of the DCT, thereby allowing a butterfly structured implementation.
- In practice, all these computations are performed in the integer format, which inevitably incurs rounding effects accumulated through every stage.
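
To make the kernel comparison concrete, the following sketch builds the three transforms as matrices and checks the properties described above. The summary only states that the btf-ADST kernel has denominator 4N; the exact form used here, a Type-4 DST with argument (2j−1)(2i−1)π/(4N), is an assumption, chosen because a citing work quoted further down this page refers to a Type-4 DST. All function names are hypothetical, and the code uses floating point rather than the integer arithmetic of a real codec.

```python
import numpy as np

def _indices(n):
    j = np.arange(1, n + 1)[:, None]  # basis (frequency) index, one per row
    i = np.arange(1, n + 1)[None, :]  # sample index, i = 1 is nearest the known boundary
    return j, i

def adst_matrix(n):
    """Original ADST: sine kernel with denominator 2N+1."""
    j, i = _indices(n)
    return 2.0 / np.sqrt(2 * n + 1) * np.sin((2 * j - 1) * i * np.pi / (2 * n + 1))

def btf_adst_matrix(n):
    """Assumed btf-ADST kernel (Type-4 DST): sine kernel with denominator 4N."""
    j, i = _indices(n)
    return np.sqrt(2.0 / n) * np.sin((2 * j - 1) * (2 * i - 1) * np.pi / (4 * n))

def dct_matrix(n):
    """Orthonormal Type-2 DCT."""
    j, i = _indices(n)
    t = np.sqrt(2.0 / n) * np.cos((j - 1) * (2 * i - 1) * np.pi / (2 * n))
    t[0, :] /= np.sqrt(2.0)  # scale the constant basis function to unit norm
    return t

n = 8
transforms = {
    "ADST": adst_matrix(n),
    "btf-ADST": btf_adst_matrix(n),
    "DCT": dct_matrix(n),
}
for name, t in transforms.items():
    is_unitary = np.allclose(t @ t.T, np.eye(n))
    # Compare the first and last samples of the lowest-frequency basis function.
    print(f"{name:9s} unitary={is_unitary} first basis: start={t[0, 0]:+.3f} end={t[0, -1]:+.3f}")
```

Only the lowest-frequency basis function is inspected here. For both sine transforms it starts near zero at the boundary and grows towards the far end, whereas the first DCT basis function is constant.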

### IV. QUANTITATIVE ANALYSIS

- The authors quantitatively evaluate the performance of the btf-ADST, original ADST, and DCT, against the KLT (of y in Sec. II) in terms of coding gains [7] under the assumed signal model, at different correlation coefficient values.
- This bit-allocation problem is addressed by the water-filling algorithm of [7].
- The coding gain $G_A$ thus provides a comparison of the average distortion incurred with and without the transformation $A$. Note that for any given $A$ (including the btf-ADST, ADST, DCT, and KLT of y), computing $R_{zz}$, and hence $\sigma^2_{z_i}$, does not require making any approximations for $P_1$.
- The original ADST closely approximates the KLT across the tested values of the correlation coefficient ρ.
- The maximum gap between ADST and KLT, or the maximum loss of optimality, is less than 0.05 dB.
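
A comparison of this kind can be sketched as follows. This is an illustration under stated assumptions, not the paper's evaluation code: the residual covariance comes from the assumed first-order Gauss-Markov model with unit innovation variance, the btf-ADST is assumed to be a Type-4 DST, and the coding gain uses the textbook ratio of arithmetic to geometric mean of the coefficient variances. If the paper's definition of the gain differs only in a transform-independent numerator, the gaps between transforms are unchanged. No expected values are quoted, since they depend on these choices.

```python
import numpy as np

def kernels(n):
    """ADST, assumed btf-ADST (Type-4 DST) and Type-2 DCT as orthonormal matrices."""
    j = np.arange(1, n + 1)[:, None]
    i = np.arange(1, n + 1)[None, :]
    adst = 2.0 / np.sqrt(2 * n + 1) * np.sin((2 * j - 1) * i * np.pi / (2 * n + 1))
    btf = np.sqrt(2.0 / n) * np.sin((2 * j - 1) * (2 * i - 1) * np.pi / (4 * n))
    dct = np.sqrt(2.0 / n) * np.cos((j - 1) * (2 * i - 1) * np.pi / (2 * n))
    dct[0, :] /= np.sqrt(2.0)
    return {"ADST": adst, "btf-ADST": btf, "DCT": dct}

def residual_covariance(n, rho):
    """R_yy of the prediction residual y = Q^{-1} e, with unit-variance innovations e."""
    q = np.eye(n) - rho * np.eye(n, k=-1)
    q_inv = np.linalg.inv(q)
    return q_inv @ q_inv.T

def coding_gain_db(a, r_yy):
    """Textbook coding gain: arithmetic over geometric mean of coefficient variances."""
    var_z = np.diag(a @ r_yy @ a.T)  # diagonal of R_zz, computed without approximating P1
    return 10.0 * np.log10(np.mean(var_z) / np.exp(np.mean(np.log(var_z))))

def gains_for(n, rho):
    r_yy = residual_covariance(n, rho)
    transforms = kernels(n)
    # The KLT rows are the eigenvectors of the exact covariance.
    _, eigvecs = np.linalg.eigh(r_yy)
    transforms["KLT"] = eigvecs.T
    return {name: coding_gain_db(a, r_yy) for name, a in transforms.items()}

for rho in (0.5, 0.9, 0.99):
    gains = gains_for(8, rho)
    row = ", ".join(f"{name}={g:.3f} dB" for name, g in gains.items())
    print(f"rho={rho}: {row}")
```

The KLT gain is an upper bound for every unitary transform, so the quantity of interest is each transform's gap to it at a given ρ.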

### V. EXPERIMENTAL RESULTS

- The proposed btf-ADST was employed to replace the original ADST in the hybrid transform coding scheme.
- This btf-ADST/DCT hybrid transform coding scheme was implemented in the VP9 codec [8].
- Fig. 2 shows the rate-distortion performance comparison for the harbour sequence at CIF resolution.
- Similar results were observed over a wide variety of sequences and resolutions.
- The authors compare the runtime of the btf-ADST/DCT and the original ADST/DCT hybrid transform schemes, in terms of the average CPU cycles, as shown in Fig.

### VI. CONCLUSIONS

- This work devised a novel variant of the ADST whose kernel approximates the original ADST basis-wise and is consistent with the DCT kernel, thereby enabling a butterfly structured implementation.
- The proposed scheme allows efficient hardware utilization for significant codec speed-up, while largely retaining the advantageous compression performance of the hybrid transform coding scheme.

##### Citations

26 citations

17 citations

13 citations

### Cites background or methods from "A butterfly structured design of th..."

...2 as Haar unit, as opposed to general Givens rotations, which are often referred to as “butterflies” [21], [25], [33]....

[...]

...An n dimensional Givens rotation [30], commonly referred to as a butterfly [20], [21], [25], is a linear transformation that applies a rotation of angle θ to two coordinates, denoted as p and q....

[...]

...This means that those sub-GFTs can also be implemented using fast DCT and ADST algorithms [23]–[25]....

[...]

...Because of the availability of fast algorithms, DCT and Type-4 DST have been adopted in ....

[...]

...We also note that, for any steerable DFT with a length n that is a multiple of 4, the GFTs of G++c and G−+c are Type-2 DCT and Type-4 DST, respectively....

[...]

11 citations

### Cites methods from "A butterfly structured design of th..."

...Note that since the original ADST derived in [33] cannot be decomposed for the butterfly structure, a variant of it, as introduced in [36] and also as shown in Figure 27, is adopted by AV1 for transform block sizes of 8×8 and above....

[...]

##### References

11,485 citations

### "A butterfly structured design of th..." refers background in this paper

...On the hardware design side, the transform module typically contributes a large portion of codec computational complexity, and hence a butterfly structured implementation that allows parallel computing via single instruction multiple data (SIMD) operations [3] is highly desirable....

[...]

1,272 citations

1,003 citations

### "A butterfly structured design of th..." refers background in this paper

...II) in terms of coding gains [7] under the assumed signal model, at different correlation coefficient values....

[...]

...This bit-allocation problem is addressed by water filling algorithm of [7]....

[...]

720 citations

### "A butterfly structured design of th..." refers methods in this paper

...) A recent development on fast transform using integer transform was proposed in [5], where it approximates the DCT transform element-wisely using a matrix whose entries are all small integers....

[...]