CODING-GAIN-BASED COMPLEXITY CONTROL FOR H.264 VIDEO ENCODER
Ming-Chen Chien¹,², Zong-Yi Chen¹, and Pao-Chi Chang¹
¹ Department of Communication Engineering, National Central University, Taiwan
² Department of Electrical Engineering, Chin Min Institute of Technology, Taiwan
978-1-4244-1764-3/08/$25.00 ©2008 IEEE, ICIP 2008

ABSTRACT
The allowable computational complexity of video encoding is limited in a power-constrained system. Different video frames are associated with different motions and contexts, and so are associated with different computational complexities if no complexity control is utilized. Variation in computational complexity leads to encoding delay jitter. Typically, motion estimation (ME) consumes much more computational complexity than other encoding tools. This work proposes a practical complexity control method based on the complexity analysis of an H.264 video encoder to determine the coding gain of each encoding tool in the video encoder. Experiments performed on a programming-optimized source code show that the computational complexity associated with each frame is well controlled below a given limit, with very little R-D performance degradation under a reasonable constraint compared to the unconstrained case.

Index Terms: Complexity control, complexity allocation, video encoder, H.264

1. INTRODUCTION
Real-time video encoding is an important element of many applications over various wireless networks. To avoid encoding delay jitter, the available encoding time of each video frame, $T_{FC}$, is limited in a real-time video encoding system and can be defined as

$$T_{FC} = \frac{1}{fr} \tag{1}$$

where $fr$ represents the frame rate. The limited encoding time of each frame limits the available computational complexity of each frame, $C_{FC}$, which can be defined as

$$C_{FC} = C_{PRC} \times T_{FC} = \frac{C_{PRC}}{fr} \tag{2}$$

where $C_{PRC}$ represents the clock rate of the processor. However, the $C_{PRC}$ of the processor embedded in wireless handsets is limited and hence $C_{FC}$ is also limited.
Optimal complexity control aims to keep the encoding complexity of each frame under a given limit while achieving optimal R-D performance as follows:

$$\min J = \min (D + \lambda R) \quad \text{s.t.} \quad c_F \le C_{FC} \tag{3}$$

where $D$ denotes distortion; $R$ denotes bit rate; $\lambda$ denotes the Lagrange multiplier; $J$ denotes the R-D cost, and $c_F$ denotes the complexity used for a frame.
Traditionally, the complexity constraint is computed in the frame layer as described above. For typical MPEG-like video encoders, a frame is partitioned into a number of macroblocks (MBs), and the MB is the basic encoding unit. Different MBs have various motions and contexts and hence are associated with different complexities. Therefore, the allocation of $C_{FC}$ among MBs is a critical problem. Typically, MPEG-like video encoders use many encoding tools, such as ME, DCT, quantization (Q), entropy coding and others. Different encoding tools may exhibit substantially different coding efficiency. Accordingly, allocating complexity among encoding tools is another key problem.
A metric of coding gain, which represents the coding efficiency, has been proposed [4] as follows:

$$CG = \Delta J / \Delta C \tag{4}$$

$$\Delta J = \Delta D + \lambda \Delta R \tag{5}$$

where $\Delta C$ represents the increase in complexity when an encoding tool is adopted; $\Delta D$ represents the decrease in distortion; $\Delta R$ represents the decrease in rate, and $\lambda$ is the Lagrange multiplier. However, a proper $\lambda$ is not easily determined. When the rate control is turned on for a target rate, $\Delta R$ becomes nearly zero, and $\Delta J$ equals $\Delta D$:

$$CG = \Delta D / \Delta C \tag{6}$$

A few works on complexity control have been conducted [2],[3],[4],[5]. The optimization formula of the first C-R-D model [2] is too complicated to be solved in closed form. An MHM-based method for allocating complexity for ME among MBs, which was not optimal, was also proposed in that study. A statistically optimal operation mode for a sequence in a complexity-constrained video encoding system has also been proposed [3]. However, an operation mode could be optimal for one frame but inadequate for another. A complexity allocation method for ME based on the cost-complexity curve has been proposed [4]. A C-R-D optimization for H.264 ME has also been proposed [5]. It uses two Lagrange multipliers to terminate the complexity-inefficient ME rounds and thus increase coding efficiency. Typically, ME consumes the most complexity, with a large variation between MBs. In general, optimal complexity control algorithms are difficult to apply to practical real-time video encoders because of their large computational overhead. To the best of our knowledge, no practical complexity control that is efficient enough and operates in real time exists for an H.264 video encoder.
Based on complexity analysis of a programming-optimized H.264 encoder, x264 [10], this work proposes a simple and practical complexity control method which can keep the encoding complexity of each frame under a given limit while achieving very good R-D performance.
This paper is organized as follows. Section 2 proposes a practical complexity control method based on the results of complexity analysis. Section 3 presents experimental results, and Section 4 draws conclusions.
2. PROPOSED COMPLEXITY CONTROL
For a typical MPEG-like video encoder, Figure 1 displays the encoding block diagram of an MB. DCT, Q, $Q^{-1}$, and IDCT have been collectively denoted by PRECODING [2]. This paper follows this notation, and divides the encoder into three major encoding tools: ME, PRECODING, and entropy coding.
Fig. 1 Basic block diagram of a video encoder
Highly efficient complexity control should be performed by allocating complexity to the encoding tools with higher coding gain. This work conducts experiments with the options presented in Table I to analyze the coding gains of various encoding tools in the modern H.264 encoder. The metric of coding gain is given by (6), where $\Delta D$ is represented by $\Delta PSNR$, which represents the increase in PSNR, and $\Delta C$ is measured by the number of CPU clocks spent on a piece of code. Table II presents the results, which will be discussed in the following subsection.

Table I. Options for complexity analysis

| Option | Setting |
|---|---|
| Video source | Foreman QCIF, Carphone QCIF |
| Fast ME | Diamond |
| Target rate | 103 kbps |
| Frame rate | 20 |
| Number of reference frames | 1 |
| GOP type | IPPPP |
| CPU | Intel Pentium 4, 2.66 GHz |
| RAM | 512 MB |
| MMX tech. | On for SAD computation |
| Source code of H.264 | x264 |

Table II. Coding gain of each encoding tool

| Encoding tool | Coding gain (dB/kclks) |
|---|---|
| CABAC (compared to CAVLC) | 9.17e-4 |
| Half-pixel ME | 2.88e-3 |
| Deblocking filter | 8.54e-4 |
| Quarter-pixel ME | 4.45e-4 |
| 8x8 partition mode | 1.42e-4 |
| 16x8 & 8x16 partition mode | 4.63e-5 |
| Sub8x8 partition mode | 4.7e-5 |
| 4x4 Intra | 5.22e-6 |
| 5 reference frames | 4.06e-5 |
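As a rough illustration of how the metric in (6) ranks the tools, the sketch below evaluates CG as the PSNR increase divided by the complexity increase and sorts the Table II entries. The values are copied from Table II; the names `TABLE_II_CODING_GAIN`, `coding_gain`, and `rank_by_coding_gain` are illustrative and are not identifiers from x264.

```python
# Coding gain values (dB per kilo-clock) copied from Table II.
TABLE_II_CODING_GAIN = {
    "CABAC (compared to CAVLC)": 9.17e-4,
    "Half-pixel ME": 2.88e-3,
    "Deblocking filter": 8.54e-4,
    "Quarter-pixel ME": 4.45e-4,
    "8x8 partition mode": 1.42e-4,
    "16x8 & 8x16 partition mode": 4.63e-5,
    "Sub8x8 partition mode": 4.7e-5,
    "4x4 Intra": 5.22e-6,
    "5 reference frames": 4.06e-5,
}


def coding_gain(delta_psnr_db, delta_kclks):
    """Equation (6): CG = delta D / delta C, with delta D taken as the PSNR increase."""
    if delta_kclks <= 0:
        raise ValueError("the complexity increase must be positive")
    return delta_psnr_db / delta_kclks


def rank_by_coding_gain(gains):
    """Return tool names sorted from highest to lowest coding gain."""
    return sorted(gains, key=gains.get, reverse=True)


ranking = rank_by_coding_gain(TABLE_II_CODING_GAIN)
```

With the Table II values, half-pixel ME comes first in `ranking` and 4x4 Intra comes last, which matches the discussion of these tools in the subsections that follow.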
2.1. Complexity Allocation
Complexity allocation distributes complexity from the frame layer to the MB layer. It should be performed before the first MB in a frame is encoded. When the video encoder starts to encode a frame, it performs some initialization before encoding slices. Complexity control records the complexity consumed by the initialization, which is denoted by $C_{Finit}$. The complexity budget for encoding all slices in a frame is $C_{SLs}$. After the slices are encoded, deblocking filtering can be performed; it is followed by updating references and other necessary tasks. The complexity of these tasks after the encoding of slices, $C_{Fother}$, should be reserved. Adopting the deblocking filter is suggested because it has high coding gain, as shown in Table II and proposed elsewhere [6]. $C_{Fother}$ is smaller than $C_{SLs}$, as displayed in Fig. 2, and it does not vary greatly. It can be regarded as a constant and can be estimated from the previous frame. Accordingly, before the slices are encoded, by measuring $C_{Finit}$ and reserving $C_{Fother}$, $C_{SLs}$ can be allocated by

$$C_{SLs} = C_{FC} - C_{Finit} - C_{Fother} \tag{7}$$
Fig. 2 Complexity consumption in the frame layer (panels: frame init, slices encode, others; x-axis: frame number; y-axis: CPU clocks (k))
The operation of the slice layer is very simple. Only a short slice header is added. The complexity of encoding all slice headers in a frame is small and can be treated as a constant. It is denoted by $C_{SLhs}$. Therefore, the complexity of encoding all MBs in a frame, $C_{MBs}$, can be allocated according to

$$C_{MBs} = C_{SLs} - C_{SLhs} \tag{8}$$

Each MB can adopt ME, PRECODING and entropy coding. Typically, ME consumes most of the complexity, as shown in Fig. 3. It is the main object on which complexity control is performed. The modern entropy coding tool CABAC has a high coding gain, as shown in Table II and elsewhere [6]. Its adoption is recommended. The modern video encoding standard H.264 significantly simplifies the DCT operation [6]. Hence, PRECODING has high coding gain and should always be adopted. Some early termination algorithms for PRECODING have been proposed to skip the PRECODING for MBs with small residual signals [11]. All such algorithms with high efficiency can be utilized. As described above, the complexity for PRECODING and entropy coding should be reserved. The complexity budget $C_{MEs}$ can be allocated using

$$C_{MEs} = C_{MBs} - M \times C_{MBother} \tag{9}$$

where $C_{MBother}$ denotes the complexity reserved for PRECODING and entropy coding of an MB and $M$ is the number of MBs in a frame. Figure 3 shows that $C_{MBother}$ is relatively small and that its variation is much smaller than that of $C_{ME}$, the complexity for ME of an MB. Therefore, $C_{MBother}$ can be treated as a constant and can be estimated statistically by running test video sequences in advance. The complexity compensation described below will eliminate the estimation error.
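The chain of budgets in (1), (2), (7), (8) and (9) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, all quantities are in CPU clocks, and clamping the result at zero is an added safeguard that the paper does not discuss.

```python
def frame_budget(clock_rate_hz, frame_rate):
    """Equations (1)-(2): C_FC = C_PRC * T_FC = C_PRC / fr, in CPU clocks."""
    return clock_rate_hz / frame_rate


def me_budget(c_fc, c_finit, c_fother, c_slhs, c_mbother, num_mbs):
    """Equations (7)-(9): peel the reserved terms off the frame budget."""
    c_sls = c_fc - c_finit - c_fother      # (7) budget for all slices
    c_mbs = c_sls - c_slhs                 # (8) budget for all MBs
    c_mes = c_mbs - num_mbs * c_mbother    # (9) budget for ME of all MBs
    return max(c_mes, 0)                   # assumption: never return a negative budget


# Upper bound for the Table I setup: 2.66 GHz clock at 20 frames per second.
c_fc_max = frame_budget(2.66e9, 20)

# Hypothetical reserved costs; only the 23 M clock limit (Table III) and the
# 99 MBs of a QCIF frame (176x144 pixels, 16x16 MBs) come from the setup.
c_mes = me_budget(
    c_fc=23_000_000,
    c_finit=500_000,
    c_fother=2_000_000,
    c_slhs=10_000,
    c_mbother=40_000,
    num_mbs=99,
)
```

In practice $C_{Finit}$ would be measured for the current frame and $C_{Fother}$ taken from the previous frame, as described above, so the reserved terms change from frame to frame while the structure of the computation stays the same.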
Fig. 3 Complexity consumption in the MB layer (panels: MB init, ME, precoding, entropy coding; x-axis: frame number; y-axis: CPU clocks (k/8))
The complexity allocation for ME among MBs is suggested to be weighted by $COST0$ as

$$C_{ME}(i) = C_{MEs} \times \frac{COST0_i}{\sum_{j=1}^{M} COST0_j}, \quad i = 1, 2, \ldots, M \tag{10}$$

where $COST0$ represents the cost of ME with zero MV in 16x16 partition mode. This equation is simple but meaningful because $COST0$ contains information about context and motion. Since an MB with larger motion or more complex context has a larger $COST0$, it deserves a larger complexity budget. Otherwise, a larger bit rate and larger distortion will be generated.
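A minimal sketch of the weighting in (10) is given below. The function name `allocate_me_budget` is hypothetical, and the even split used when every $COST0$ is zero is an assumption added to avoid a division by zero; the paper does not specify that case.

```python
def allocate_me_budget(c_mes, cost0):
    """Equation (10): C_ME(i) = C_MEs * COST0_i / sum_j COST0_j."""
    total = sum(cost0)
    if total == 0:
        # Assumption: with no cost information, fall back to an even split.
        return [c_mes / len(cost0)] * len(cost0)
    return [c_mes * c / total for c in cost0]


# Three MBs with hypothetical zero-MV 16x16 costs.
budgets = allocate_me_budget(1000, [100, 300, 600])
```

Here the third MB, whose zero-MV cost is six times that of the first, receives six times the ME budget, and the per-MB budgets sum to $C_{MEs}$.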
2.2. ME Flow in Decreasing Order of CG
According to the coding gain in Table II, the ME flow in Fig. 4
is suggested. The resulting operation order is similar to that
suggested elsewhere [5] but the adoption of 4x4 Intra prediction
is different. Table II reveals that the coding gain of 4x4 Intra
for inter frames is very low, because most MBs in the inter
frame choose inter mode as the best mode. However, 4x4 Intra
prediction is beneficial to MBs that choose the Intra mode. The
tendency to Intra mode is examined by comparing 16x16 ME
and 16x16 Intra prediction. If the 16x16 Intra prediction yields
a better performance, 4x4 Intra prediction can be utilized to
reduce the residual signal. Otherwise, 4x4 Intra prediction is
not used.
Fig. 4 ME flow in decreasing CG of encoding tools
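Fig. 4 is not reproduced in this text, so the following only sketches the 4x4 Intra decision described in this subsection. It assumes that "better performance" means a lower cost for the 16x16 candidate; the function names and the mode labels are hypothetical.

```python
def use_intra_4x4(cost_inter_16x16, cost_intra_16x16):
    """Try 4x4 Intra only when 16x16 Intra beats 16x16 ME (lower cost wins)."""
    return cost_intra_16x16 < cost_inter_16x16


def intra_candidates(cost_inter_16x16, cost_intra_16x16):
    """List the Intra modes worth evaluating for one MB."""
    modes = ["intra_16x16"]  # always evaluated, since it drives the comparison
    if use_intra_4x4(cost_inter_16x16, cost_intra_16x16):
        modes.append("intra_4x4")
    return modes
```

For an MB that is well predicted by motion compensation, the comparison fails and the low-gain 4x4 Intra search is skipped, which is the complexity saving this subsection argues for.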
2.3. Complexity Check and Compensation
After each computation of SAD and the R-D cost, the used complexity $C_{MEused}$ is examined. If $C_{MEused}$ exceeds $C_{ME}$, the ME process terminates. Otherwise, the ME process continues.
Any efficient early termination algorithm for PRECODING can be employed. The complexity compensation described below will distribute the saved complexity.
After the whole process of the MB encoding is complete, the balance $C_{MBbalance}$ between the used complexity $C_{MBused}$ and the budget $C_{MB}$ is given by

$$C_{MBbalance} = C_{MB} - C_{MBused} \tag{11}$$

where $C_{MB}$ is obtained by

$$C_{MB} = C_{ME} + C_{MBother} \tag{12}$$

Then $C_{MBbalance}$ is distributed uniformly to the remaining MBs in that frame.
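The check and the compensation can be sketched as below. This is an illustration under assumptions rather than the encoder's actual code: each ME step is reduced to a hypothetical cost in clocks, `budgets[i]` stands for $C_{MB}$ of MB $i$ as in (12), and a negative balance (an overspent MB) is assumed to be spread over the remaining MBs in the same way as a positive one.

```python
def run_me(step_costs, c_me):
    """Run ME steps in order and stop once the used complexity exceeds C_ME.

    step_costs holds the clocks spent by each SAD / R-D cost computation.
    Returns C_MEused, the complexity actually consumed.
    """
    used = 0
    for cost in step_costs:
        used += cost       # one SAD / R-D cost computation
        if used > c_me:    # the check happens after each computation
            break
    return used


def compensate(budgets, mb_index, c_mb_used):
    """Equation (11), then uniform redistribution to the remaining MBs."""
    balance = budgets[mb_index] - c_mb_used    # (11)
    remaining = len(budgets) - mb_index - 1
    if remaining > 0:
        share = balance / remaining
        for k in range(mb_index + 1, len(budgets)):
            budgets[k] += share
    return balance


mb_budgets = [100.0, 100.0, 100.0]
saved = compensate(mb_budgets, 0, 60.0)
```

Because the check runs only after a computation has finished, an MB can overshoot its ME budget by at most one step; in this sketch the resulting negative balance then reduces the budgets of the MBs that follow.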
3. EXPERIMENTAL RESULTS
The experimental options for the proposed practical complexity control are shown in Table III. The complexity metric is the number of CPU clocks used by an encoding tool, as measured by the ‘rdtsc’ instruction of an Intel CPU [7].
Figure 5 indicates that the complexity is well controlled under the given limit. The complexity of each frame rarely exceeds the bound. Figures 6 and 7 show that the rate and PSNR under complexity control are both very close to those in the unconstrained case. Figure 8 plots the R-D performance with the Foreman video sequence under various complexity constraints, where Cfm denotes the maximum complexity of a frame without complexity constraint. When $C_{FC}$ is reduced to 72% of Cfm, the PSNR obtained by this algorithm degrades by less than 0.5 dB at the same rate. When $C_{FC}$ is reduced to 58% of Cfm, the PSNR obtained by this algorithm degrades by no more than 1 dB at the same rate. Experiments with another video source, ‘Carphone’, yield similar results.
Table III. Options for complexity control

| $C_{Fmax}$ (clk) | Source | QP | Rate control | Fast ME | Complexity metric |
|---|---|---|---|---|---|
| 23 M | Foreman | 29 | off | Diamond | CPU clock |

Fig. 5 Comparisons of computational complexity with and without complexity control (plot title: complexity control for a fixed frame rate; x-axis: frame number; y-axis: CPU clocks (k); curves: adopt complexity control, without complexity control, complexity bound)
Fig. 6 Comparisons of rate with and without complexity control (x-axis: frame number; y-axis: rate (frame size); curves: adopt complexity control, without complexity control)
Fig. 7 Comparisons of YPSNR with and without complexity control (x-axis: frame number; y-axis: YPSNR (dB); curves: adopt complexity control, without complexity control)
Fig. 8 R-D performance under various complexity constraints (x-axis: Rate (kbits); y-axis: YPSNR (dB); curves: no constraint, Cfc = 72% Cfm, Cfc = 66% Cfm, Cfc = 58% Cfm, Cfc = 48% Cfm)
4. CONCLUSION AND FUTURE WORK
This work proposes an efficient complexity control method with very little degradation of R-D performance. The proposed method, which has very low overhead, is also very practical.
5. REFERENCES
[1] “Draft ITU-T recommendation and final draft international standard of joint video specification (ITU-T Rec. H.264/ISO/IEC 14496-10 AVC),” in JVT of ISO/IEC MPEG and ITU-T VCEG, JVT-G050, 2003.
[2] Z. He and Y. F. Liang, “Power-Rate-Distortion analysis for wireless video communication under energy constraints,” IEEE Trans. Circuits Syst. Video Technol., vol. 15, no. 5, pp. 645-658, May 2005.
[3] D. N. Kwon and P. F. Driessen, “Performance and computational optimization in configurable hybrid video system,” IEEE Trans. Circuits Syst. Video Technol., vol. 16, no. 1, pp. 31-42, Jan. 2006.
[4] C. Kim and J. Xin, “Hierarchical complexity control of motion estimation for H.264/AVC,” Mitsubishi Electric Research Laboratories, TR2006-004, Dec. 2006. Available: http://www.merl.com
[5] Y. Hu, Q. Li, S. Ma, and C. C. J. Kuo, “Joint rate-distortion-complexity optimization for H.264 motion search,” in Proc. ICME 2006, pp. 1949-1952.
[6] E. G. Richardson, H.264 and MPEG-4 Video Compression. John Wiley & Sons, 2003.
[7] http://www.intel.com
[8] J. Ostermann, J. Bormans, P. List, D. Marpe, M. Narroschke, F. Pereira, T. Stockhammer, and T. Wedi, “Video coding with H.264/AVC: Tools, performance, and complexity,” IEEE Circuits Syst. Mag., vol. 4, no. 1, pp. 7-28, Apr. 2004.
[9] Joint Model reference software version 10. Available: http://iphome.hhi.de/suehring/tml/index.htm
[10] x264. Available: http://developers.videolan.org/x264.html
[11] Z. Chen, P. Zhou, and Y. He, “Fast integer pel and fractional pel motion estimation for JVT,” JVT-F017r1.doc, Joint Video Team (JVT) of ISO/IEC MPEG & ITU-T VCEG, 6th meeting, Awaji Island, JP, 5-13 Dec. 2002.