MITSUBISHI ELECTRIC RESEARCH LABORATORIES
http://www.merl.com
A Fast Hybrid Jacket-Hadamard Matrix Based Diagonal
Block-wise Transform
Lee, M.H.; Khan, M.H.A.; Kim, K.J.; Park, D.
TR2014-002 January 2014
Abstract
In this paper, based on the block (element)-wise inverse Jacket matrix, a unified fast hybrid
diagonal block-wise transform (FHDBT) algorithm is proposed. A new fast diagonal block
matrix decomposition is obtained as the matrix product of successively lower-order diagonal
Jacket matrices and Hadamard matrices. Using a common lower-order matrix of the form
$\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$, a fast recursive structure can be developed
in the FHDBT, which can be converted to a newly developed discrete cosine transform (DCT)-II,
discrete sine transform (DST)-II, discrete Fourier transform (DFT), and Haar-based wavelet
transform (HWT). Since the DCT-II, DST-II, DFT, and HWT are widely used in different
application areas, the proposed FHDBT can be applied to heterogeneous systems requiring
several transforms simultaneously. Compared with the pre-existing DCT-II, DST-II, DFT, and
HWT, the proposed FHDBT is shown to exhibit lower complexity as its matrix size gets larger.
The proposed algorithm is also well matched to a circulant channel matrix. Numerical
experiments show that better performance can be achieved with the DCT/DST-II compression
scheme than with the DCT-II-only compression method.
Signal Processing: Image Communication
This work may not be copied or reproduced in whole or in part for any commercial purpose. Permission to copy in
whole or in part without payment of fee is granted for nonprofit educational and research purposes provided that all
such whole or partial copies include the following: a notice that such copying is by permission of Mitsubishi Electric
Research Laboratories, Inc.; an acknowledgment of the authors and individual contributions to the work; and all
applicable portions of the copyright notice. Copying, reproduction, or republishing for any other purpose shall require
a license with payment of fee to Mitsubishi Electric Research Laboratories, Inc. All rights reserved.
Copyright © Mitsubishi Electric Research Laboratories, Inc., 2014
201 Broadway, Cambridge, Massachusetts 02139
Author's personal copy
A fast hybrid Jacket–Hadamard matrix based diagonal block-wise transform ☆
Moon Ho Lee a, Md. Hashem Ali Khan a,*, Kyeong Jin Kim b,1, Daechul Park c
a Division of Electronics and Information Engineering, Chonbuk National University, Jeonju 561-756, South Korea
b Mitsubishi Electric Research Laboratories (MERL), 201 Broadway, Cambridge, MA 02139, USA
c Department of Information and Communication Engineering, Hannam University, Daejeon 306-791, South Korea
article info
Article history:
Received 23 April 2013
Received in revised form 12 November 2013
Accepted 12 November 2013
Available online 4 December 2013
Keywords:
Diagonal block (element)-wise inverse Jacket matrix (BIJM)
Sparse matrix decomposition
Successive lower order diagonal sparse matrix
Hadamard matrix
© 2013 Elsevier B.V. All rights reserved.
☆ This work was supported by the MEST 2012-002521, NRF, Korea.
* Corresponding author. Tel.: +82 632702463; fax: +82 632704166.
E-mail addresses: moonho@jbnu.ac.kr (M. H. Lee), hashem05ali@jbnu.ac.kr (M. H. A. Khan), kyeong.j.kim@hotmail.com (K. J. Kim),
fia4joy@yahoo.co.kr (D. Park).
1 This work was done when he was with Inha University, Incheon, Korea.
0923-5965/$ - see front matter © 2013 Elsevier B.V. All rights reserved.
http://dx.doi.org/10.1016/j.image.2013.11.002
Signal Processing: Image Communication 29 (2014) 49–65
1. Introduction
The last decade has seen a quiet revolution in digital video technology based on orthogonal transforms, such as
Moving Picture Experts Group (MPEG)-4, H.264, and high efficiency video coding (HEVC) [1–7]. Digital video is everywhere:
in DVD players, gaming consoles, computers, and mobile handsets. Nowadays, with many coexisting heterogeneous systems
[7,8], one is as likely to catch the latest news on the web as on a smart TV or an iPhone. Video compression is essential to all
these applications. The discrete cosine transform (DCT)-II is a popular compression structure for MPEG-4, H.264, and HEVC,
and is accepted as the best suboptimal transformation since its performance is very close to that of the statistically optimal
Karhunen–Loeve transform (KLT) [1–5]. For practical consideration, the underlying H.264-advanced video coding (AVC)
intra mode dictates the transform coding implementation within a block coder with a typical block size of up to 16 × 16.
However, since a DCT-based block coder suffers from the blocking effect, i.e., a disturbing discontinuity at the block boundaries,
much research effort has been devoted to reducing it. In [4,7], a first-order Gauss–Markov model was
assumed for the images, and it was shown that the image can be decomposed into a boundary response and a residual
process given the closed boundary information. The boundary response is an interpolation of the block content from
its boundary data, whereas the residual process is the interpolation error. The approach in [4] showed that the KLT of the
residual process becomes the discrete sine transform (DST) and the DCT when the boundary conditions are available in the
vertical and horizontal directions [4,6,7].
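The statement that the DCT-II performs close to the KLT can be illustrated numerically for a first-order Gauss–Markov source. The following is a minimal sketch, not taken from the paper: it assumes a unit-variance source with covariance $\rho^{|i-j|}$, an orthonormal DCT-II, and the commonly used decorrelation efficiency measure (one minus the ratio of off-diagonal magnitudes after and before the transform). The block size, the value of $\rho$, and all function names are illustrative assumptions.

```python
import math

def dct2_matrix(n):
    """Orthonormal DCT-II matrix; row k is the k-th basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * m + 1) * k / (2 * n))
                     for m in range(n)])
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def gauss_markov_covariance(n, rho):
    """Covariance of a unit-variance first-order Gauss-Markov source (assumed model)."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]

def off_diagonal_sum(a):
    """Sum of the magnitudes of all off-diagonal entries."""
    n = len(a)
    return sum(abs(a[i][j]) for i in range(n) for j in range(n) if i != j)

def decorrelation_efficiency(n=8, rho=0.95):
    """1 - (off-diagonal magnitude after the DCT-II) / (before the DCT-II)."""
    c = dct2_matrix(n)
    r = gauss_markov_covariance(n, rho)
    t = matmul(matmul(c, r), transpose(c))  # covariance in the transform domain
    return 1.0 - off_diagonal_sum(t) / off_diagonal_sum(r)

if __name__ == "__main__":
    print(decorrelation_efficiency(8, 0.95))
```

The KLT would diagonalize the covariance exactly (efficiency 1), so a value close to 1 for the DCT-II indicates how little is lost by using the fixed transform in place of the source-dependent one.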
Discrete signal processing based on the discrete Fourier transform (DFT) is popular in orthogonal frequency division
multiplexing (OFDM) wireless mobile communication systems [3] such as 3rd generation partnership project long-term
evolution (3GPP-LTE), mobile worldwide interoperability for microwave access (WiMAX), international mobile
telecommunications-advanced (IMT-Advanced), and wireless local area networks (WLAN). In addition, wireless
personal area networks (WPAN) and broadcasting-related applications (digital audio broadcasting (DAB), digital video
broadcasting (DVB), and digital multimedia broadcasting (DMB)) are based on the DFT. Furthermore, the Haar-based wavelet
transform (HWT) is very useful in the Joint Photographic Experts Group 2000 (JPEG-2000) standard [2,9].
Thus, different applications require different types of unitary matrices and their decompositions. For this reason, in this
paper we propose a unified hybrid algorithm that can be used in the several applications mentioned above for their
different purposes.
Compared with the conventional individual matrix decompositions, our main contributions are summarized as follows:
We propose the diagonal sparse matrix factorization for a unified hybrid algorithm based on the properties of the Jacket
matrix [10,11] and the decomposition of the sparse matrix. It has been shown that this matrix decomposition is useful in
developing the fast algorithms and characters [20]. Individual DCT-II [1–3,6,7,12], DST-II [4,6,7,13], DFT [3,5,14], and HWT [9]
Table 1
Comparison of the computational complexity of the conventional independent DCT-II, DST-II, DFT, and Haar transforms with that of the proposed hybrid DCT-II/DST-II/DFT/HWT.

                                    Conventional                                                   Proposed
Reference               Transform   Addition                       Multiplication                  Addition                     Multiplication
W. H. Chen et al. [18]  DCT-II      $(3N/2)(\log_2 N - 1) + 2$     $N\log_2 N - 3N/2 + 4$          $N\log_2 N$                  $(N/2)(\log_2 N + 1)$
Z. Wang [13]            DST-II      $N((7/4)\log_2 N - 2) + 3$     $N((3/4)\log_2 N - 1) + 3$      $N\log_2 N$                  $(N/2)(\log_2 N + 1)$
Cooley & Tukey [21]     DFT         $N\log_2 N$                    $(N/2)\log_2 N$                 $N\log_2 N$                  $(N/2)\log_2 N$
Andrews & Caspari [22]  HWT         $2(N-1)$                       $N$                             $\sum_{i=1}^{h-1} N/2^i$ *   $\sum_{i=1}^{h-2} N/2^i$ *

Here $h = \log_2 N$.
* Addition count = $N/2^{h-1} + N/2^{h-2} + \dots + N/2 = \sum_{i=1}^{h-1} N/2^i$; multiplication count = $N/2^{h-2} + N/2^{h-3} + \dots + N/2 = \sum_{i=1}^{h-2} N/2^i$.
Table 2
Computational complexity: DCT-II/DST-II/DFT/HWT.

                Conventional                    Proposed
Matrix size   DCT-II  DST-II     DFT     HWT  DCT-II  DST-II     DFT     HWT
Addition
N = 4              8       9       8       6       8       8       8       2
N = 8             26      29      24      14      24      24      24       6
N = 16            74      83      64      30      64      64      64      14
N = 32           194     219     160      62     160     160     160      30
N = 64           482     547     384     126     384     384     384      62
N = 128         1154    1315     896     254     896     896     896     126
N = 256         2690    3075    2048     510    2048    2048    2048     254
Multiplication
N = 4              6       5       4       –       6       6       –       –
N = 8             16      13      12       8      16      16      12       4
N = 16            44      35      32      16      40      40      32      12
N = 32           116      91      80      32      96      96      80      28
N = 64           292     227     192      64     224     224     192      60
N = 128          708     547     448     128     512     512     448     124
N = 256         1668    1283    1024     256    1152    1152    1024     252
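The entries of Table 2 follow from the closed-form operation counts of Table 1. As a cross-check, the short sketch below evaluates those formulas for a power-of-two size N; the function names and the dictionary layout are illustrative assumptions, and the formulas are transcribed from Table 1 with $h = \log_2 N$.

```python
def log2_exact(n):
    """Return log2(n) for a power-of-two n >= 4, otherwise raise."""
    if n < 4 or n & (n - 1):
        raise ValueError("n must be a power of two with n >= 4")
    return n.bit_length() - 1

def conventional_counts(n):
    """(additions, multiplications) of the conventional algorithms in Table 1."""
    h = log2_exact(n)
    return {
        "DCT-II": (3 * n // 2 * (h - 1) + 2, n * h - 3 * n // 2 + 4),
        # N(7/4 log2 N - 2) + 3 and N(3/4 log2 N - 1) + 3 in integer arithmetic
        "DST-II": (n * (7 * h - 8) // 4 + 3, n * (3 * h - 4) // 4 + 3),
        "DFT": (n * h, n // 2 * h),
        "HWT": (2 * (n - 1), n),
    }

def proposed_counts(n):
    """(additions, multiplications) of the proposed hybrid transform in Table 1."""
    h = log2_exact(n)
    hwt_add = sum(n // 2 ** i for i in range(1, h))      # i = 1 .. h-1
    hwt_mul = sum(n // 2 ** i for i in range(1, h - 1))  # i = 1 .. h-2
    return {
        "DCT-II": (n * h, n // 2 * (h + 1)),
        "DST-II": (n * h, n // 2 * (h + 1)),
        "DFT": (n * h, n // 2 * h),
        "HWT": (hwt_add, hwt_mul),
    }

if __name__ == "__main__":
    for size in (8, 16, 32, 64, 128, 256):
        print(size, conventional_counts(size), proposed_counts(size))
```

For example, `conventional_counts(16)["DCT-II"]` evaluates to `(74, 44)` and `proposed_counts(16)["DCT-II"]` to `(64, 40)`, matching the N = 16 rows of Table 2.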
matrices can be decomposed into one orthogonal character matrix and a corresponding special sparse matrix. The inverse of
the sparse matrix can be easily obtained from the property of the block (element)-wise inverse Jacket matrix. However, there
has been no previous work on the development of a common matrix decomposition supporting these transforms.
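The element-wise inverse property referred to above can be made concrete with a small example. The sketch below assumes the usual definition of a Jacket matrix of order N, namely that its inverse equals 1/N times the transpose of the matrix of element-wise reciprocals, and checks it on a 4 × 4 center-weighted Hadamard matrix. The weight w = 2, the choice of example matrix, and the helper names are illustrative assumptions rather than constructions taken from this paper.

```python
from fractions import Fraction

def jacket_inverse(j):
    """Candidate inverse: (1/N) times the transpose of the element-wise reciprocal."""
    n = len(j)
    return [[Fraction(1, n) / j[col][row] for col in range(n)]
            for row in range(n)]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def is_jacket(j):
    """True if the element-wise inverse formula really inverts the matrix."""
    return matmul(j, jacket_inverse(j)) == identity(len(j))

def center_weighted_hadamard(w):
    """4 x 4 Hadamard matrix whose four center entries are weighted by w."""
    return [[1, 1, 1, 1],
            [1, -w, w, -1],
            [1, w, -w, -1],
            [1, -1, -1, 1]]

if __name__ == "__main__":
    J = center_weighted_hadamard(Fraction(2))
    print(is_jacket(J))                   # exact rational arithmetic, no rounding
    print(is_jacket([[1, 1], [1, -1]]))   # the 2 x 2 Hadamard matrix
```

Because the inverse needs only element-wise reciprocals, a transpose, and a scaling, no general matrix inversion is required, which is what makes the property attractive for fast transforms.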
We propose a new unified hybrid algorithm which can be used in the multimedia applications, wireless communication
systems, and broadcasting systems at almost the same computational complexity as those of the conventional unitary
Fig. 1. Regular systematic butterfly data flow of DCT-II.
Fig. 2. Regular systematic butterfly data flow of DST-II.