Journal ArticleDOI

Analysis of video transmission over lossy channels

TL;DR: The main focus of this paper is to show the accuracy of the derived analytical model and its applicability to the analysis and optimization of an entire video transmission system.
Abstract: A theoretical analysis of the overall mean squared error (MSE) in hybrid video coding is presented for the case of error prone transmission. Our model covers the complete transmission system including the rate-distortion performance of the video encoder, forward error correction, interleaving, and the effect of error concealment and interframe error propagation at the video decoder. The channel model used is a 2-state Markov model describing burst errors on the symbol level. Reed-Solomon codes are used for forward error correction. Extensive simulation results using an H.263 video codec are provided for verification. Using the model, the optimal tradeoff between INTRA and INTER coding as well as the optimal channel code rate can be determined for given channel parameters by minimizing the expected MSE at the decoder. The main focus of this paper is to show the accuracy of the derived analytical model and its applicability to the analysis and optimization of an entire video transmission system.

Summary (2 min read)

Introduction

  • Interestingly, even this less ambitious problem is not well investigated in the literature.
  • Because of this interaction of system components, the influence of individual parameters is difficult to understand, and the design of the overall system might become a formidable task.
  • In this paper the authors consider only a single layer codec but include the effects of transmission errors and INTRA coding as well as the distortion-rate behavior of the video encoder.
  • Finally, joint optimization of source and channel coding parameters is investigated in Section V-C.

A. Overview

  • In this section the authors give an overview of the video transmission system under consideration and introduce the most important model parameters.
  • Often, this involves packetization and some form of error control.
  • Fast resynchronization of the bitstream and error concealment are two important issues that can help to mitigate the effect of residual errors.
  • First consider a variation of the code rate r (see Section V-B).
  • First, a reduction of the code rate r reduces the bit rate available to the video encoder and thus increases the distortion at the encoder regardless of transmission errors.

B. Simulation Environment

  • The simulation environment the authors use in this paper to verify the derived model is described as follows.
  • As source signals, the authors use the QCIF test sequences Mother&Daughter and Foreman, which are encoded at 12.5 fps using 150 and 125 frames, respectively.
  • This rate control reduces buffer variations to an acceptable amount, and hence allows the transmission over a constant bit rate channel with limited delay.
  • In either case, error concealment is done for any GOB that overlaps with the lost packet.

C. Distortion Measure

  • For the evaluation of the video transmission system, it is necessary to average the distortion over the whole sequence in order to provide a single figure of merit.
  • In the following section the authors model the distortion-rate performance of the video encoder.
  • The authors have found that the relationship with the INTRA rate is approximately linear, as expressed by (5), such that the total number of model parameters is six.
  • Errors that are introduced at a given point in time propagate due to the recursive structure of the decoder.

A. Optimal INTRA Rate

  • The influence of the INTRA rate on the decoded picture distortion is studied for a fixed channel code rate.
  • On the one hand, an increased percentage of INTRA coded macroblocks helps to reduce interframe error propagation, and therefore reduces the error-induced distortion as described by (8) and (9).
  • It can be seen that the model gives a very good approximation of the PSNR at the decoder.
  • Therefore, the INTRA mode can be used more generously, and higher optimal INTRA rates result.
  • On the other hand, the exact selection of the INTRA rate is less critical, since the optimum is rather flat.

B. Optimal FEC Code Rate

  • Analogous to the previous subsection, the authors now study the influence of the channel code rate on the decoded video quality (PSNR) for a fixed INTRA rate.
  • Fig. 8 shows that their model approximates the PSNR at the video decoder for different channel code rates very well.
  • As explained in Section III-B, this is due to the fact that the introduced errors are not independent any more.
  • Note that the variation of PSNR as a function of the code rate is more severe for the Foreman sequence than for the Mother&Daughter sequence.
  • More importantly, the same reduction in code rate is more effective for the Foreman sequence because of the increased block size.

C. Optimal Parameter Selection for the Transmission System

  • In this subsection the authors optimize the rate of INTRA coded macroblocks and the channel code rate jointly.
  • Fig. 11 shows the optimal INTRA rate and the optimal channel code rate for a transmission over burst channels with different average burst lengths and symbol error rates.
  • The authors have derived a theoretical framework for the decoded picture quality after video transmission over lossy channels.
  • In contrast, for bursty channels the use of FEC is limited and the INTRA update is essential.
  • The authors are mainly interested in the variance of the propagated error signal and in its average over time.

A. Derivation of Block Error Density

  • The probability of a given number of errors within a block of symbols is then derived as a function of the average error probability.




1012 IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 18, NO. 6, JUNE 2000
Analysis of Video Transmission over Lossy Channels
Klaus Stuhlmüller, Niko Färber, Member, IEEE, Michael Link, and Bernd Girod, Fellow, IEEE
Abstract—A theoretical analysis of the overall mean squared
error (MSE) in hybrid video coding is presented for the case of
error prone transmission. Our model covers the complete trans-
mission system including the rate-distortion performance of the
video encoder, forward error correction, interleaving, and the ef-
fect of error concealment and interframe error propagation at the
video decoder. The channel model used is a 2-state Markov model
describing burst errors on the symbol level. Reed–Solomon codes
are used for forward error correction. Extensive simulation results
using an H.263 video codec are provided for verification. Using the
model, the optimal tradeoff between INTRA and INTER coding as
well as the optimal channel code rate can be determined for given
channel parameters by minimizing the expected MSE at the de-
coder. The main focus of this paper is to show the accuracy of the
derived analytical model and its applicability to the analysis and
optimization of an entire video transmission system.
Index Terms—Error resilience, intra-update, joint source-
channel coding, robust video transmission, tradeoff source-
channel coding, video transmission system model.
I. INTRODUCTION
TO TRANSMIT video over noisy channels, one uses both
source and channel coding. According to Shannon’s
Separation Principle, these components can be designed
independently without loss in performance [1]. However, this
important information-theoretic result is based on several
assumptions that might break down in practice. In particular,
it is based on 1) the assumption of an infinite block length for
both source and channel coding, and 2) an exact and complete
knowledge of the statistics of the (ergodic) transmission
channel. As a result of the first assumption, the Separation
Principle cannot be applied without performance loss to
applications with real-time constraints. This holds especially
for bursty channels which are characteristic for mobile radio
transmission or the Internet. As a consequence of the second
assumption, it applies only to point-to-point communications.
Therefore, Joint Source-Channel Coding and Error Resilient
Coding can be advantageous, in practice, and have become an
important research topic. Recent reviews and special issues in
the context of video coding include [2]–[5].
Despite increased research activity, joint source-channel
coding schemes for video are still in their infancy today. A
Manuscript received May 5, 1999; revised November 11, 1999. This work
was supported in part by the German DFN-Verein.
K. Stuhlmüller and N. Färber are with the Telecommunications Laboratory,
University of Erlangen-Nuremberg, Cauerstrasse 7/NT, 91058 Erlangen, Ger-
many (e-mail: stuhl@LNT.de; faerber@LNT.de).
M. Link is with Lucent Technologies, Nuremberg, Germany (e-mail:
mlink@lucent.com).
B. Girod is with the Information Systems Laboratory, Stanford University,
Stanford, CA USA (e-mail: girod@ee.stanford.edu).
Publisher Item Identifier S 0733-8716(00)04338-9.
pragmatic approach for today’s state of the art is to keep the
source coder and the channel coder separate, but to optimize
their parameters jointly. A key problem of this optimization
is the bit allocation between source and channel coding that
is also discussed in this paper. Interestingly, even this less
ambitious problem is not well investigated in the literature.
Often, the underlying transmission system is regarded as a
“black box,” and the video codec has to cope with whatever
bit error rate or packet error rate is offered. This approach
is indeed justified if video is added as another application
on top of a fixed transmission system. However, current and
future transmission systems provide increasing flexibility at
the interface to the transport level. For example, the enhanced
air interface of the GSM system (EDGE [6]) will include a
flexible link adaptation where either 1/1, 3/4, 2/3, or 1/2 of the
total bit rate can be allocated to the source while the rest is used
for channel coding. In fact, the advantage of this flexibility for
speech transmission is already exploited in the next generation
speech codec of the GSM system, called Adaptive Multi Rate
(AMR, [7]). In the future, software radios may even allow
configuration of the modulation scheme [8]. This trend toward
increased flexibility allows inclusion of channel coding (and
modulation) into the optimization.
More flexibility, on the other hand, also increases the com-
plexity of the system and makes parameter optimization more
difficult. The overall performance depends on many interre-
lated issues, such as the distortion-rate performance and error
resilience of the source codec, the error correction capability of
the channel codec, and the characteristic of the channel. Because
of this interaction of system components, the influence of indi-
vidual parameters is difficult to understand, and the design of
the overall system might become a formidable task. Often, sim-
ulations are used to study overall system performance (e.g., [9]).
However, measurements can rarely be generalized, and provide
only limited insight in the underlying problem. Furthermore,
simulations can become very complex for a large parameter
space. It is therefore desirable to develop appropriate models
to study and understand the interaction and tradeoffs between
system parameters.
The scope of this paper is to provide such a model for a
complete video transmission system. We use this model to
analyze the overall performance as a function of the most
important system parameters. In particular, the optimum
bit allocation between source and channel coding is found
analytically while also considering the optimal tradeoff be-
tween INTER and INTRA coding. Similar investigations have
been performed for vector quantization [10] and Lempel–Ziv
compression [11]. However, no analysis has been presented for
motion-compensated video coding that forms the basis of all
common video coding standards, including H.261, H.263, MPEG-1, MPEG-2, and MPEG-4 [12]–[17]. In previous work, we addressed the related problem of optimal transmission of a given scalable video bit stream over a packet network by optimizing the unequal error protection for the layers of the video stream [18], [19]. In this paper we consider only a single layer codec but include the effects of transmission errors and INTRA coding as well as the distortion-rate behavior of the video encoder.

Fig. 1. Video transmission scheme. The video encoder is described by its distortion-rate function D_e(β, R), depending on the INTRA rate β. The influence of the transmission using FEC is described by the residual word error rate P_w(r, P_s, L_B), depending on the channel code rate r and the channel characteristics P_s (error probability) and L_B (average burst length). At the video decoder, the effect of error propagation is given by D_v(β, P_w). The overall decoded video quality is denoted D_d.
This paper is organized as follows. We first outline the trans-
mission system in Section II. In Section III we model a hybrid
motion compensated video codec. The distortion-rate perfor-
mance of the video encoder is analyzed in Section III-A, while
a theoretical framework for interframe error propagation is pre-
sented in Section III-B. The influence of channel coding and
channel parameters are discussed in Section IV. Then, we com-
bine the models to describe the overall system performance, and
show in Section V that our model can approximate the decoded
picture quality very accurately. The impact of INTRA coding
and FEC is studied in Sections V-A and B, respectively. Finally,
joint optimization of source and channel coding parameters is
investigated in Section V-C.
II. VIDEO TRANSMISSION SYSTEM
A. Overview
In this section we provide an overview of the video transmis-
sion system under consideration, and introduce the most impor-
tant model parameters. As can be seen from Fig. 1, the system
consists of three parts: the video encoder, the video decoder, and
the error control channel, which is definedas the combination of
the channel codec and the channel [20]. These components are
described briefly in the following paragraphs and are discussed
in more detail in Sections III and IV. All model parameters are
summarized in Table I for quick reference.
We assume that a space-time discrete video signal is used as input to the video encoder, which is characterized by its operational distortion-rate (DR) function D_e(β, R); i.e., the average distortion D_e is expressed as a function of the average bit rate R and the INTRA rate β. The common DR relationship is extended by the INTRA rate β because of its significant influence on error resilience. In fact, β is used as the first important parameter for system optimization in this paper.
TABLE I: SUMMARY OF MODEL PARAMETERS

After source coding, the compressed video bitstream is prepared for transmission by the channel codec. Often, this involves packetization and some form of error control. In this paper we focus on forward error correction (FEC) that can be combined with interleaving to reduce the effect of burst errors. More specifically, we assume an (n, k) Reed–Solomon (RS) block code with a block size of n symbols including k information symbols. The second important parameter that is used for system optimization is the code rate r = k/n. By reducing the code rate, more channel coding redundancy is added to each codeword, which improves the error correction capability of the code while reducing the throughput at the same time.
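To make the effect of the code rate concrete, the following sketch (not taken from the paper) computes the probability that an (n, k) RS codeword cannot be corrected, under the simplifying assumption of independent symbol errors; the paper's own derivation instead uses the two-state Markov burst model introduced below, and the block length, information lengths, and error rate used here are illustrative values only.

```python
from math import comb

def rs_word_error_prob(n: int, k: int, p_s: float) -> float:
    """Probability that an (n, k) Reed-Solomon codeword is not decodable,
    assuming independent symbol errors with probability p_s.
    The code corrects up to t = (n - k) // 2 symbol errors per codeword."""
    t = (n - k) // 2
    # P(more than t of the n symbols are in error)
    return sum(comb(n, m) * p_s**m * (1.0 - p_s)**(n - m) for m in range(t + 1, n + 1))

# Illustrative: an 89-byte block (roughly one GOB) at a 1% symbol error rate
for k in (89, 81, 73, 65):
    print(f"r = {k/89:.2f}:  P_w ≈ {rs_word_error_prob(89, k, 0.01):.2e}")
```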
After channel encoding, the RS codewords are transmitted over the channel. We use a two-state Markov model to describe errors on the symbol level. As intuitive channel parameters, we use the average symbol error rate P_s and the average burst length L_B. Together with the total bit rate R_c, these two parameters completely describe the channel and can be used to, e.g., study the influence of burst errors versus independent symbol errors. Furthermore, the selected channel model allows calculation of the residual word error rate P_w(r, P_s, L_B) after channel decoding from the parameters of the Markov model and the code rate. Thus, the overall performance of the error control channel,
including a burst channel and an RS channel codec, can be de-
scribed analytically.
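Such a two-state Markov (Gilbert-type) symbol error process can be simulated in a few lines. The mapping below from the intuitive parameters (average symbol error rate, average burst length) to the state transition probabilities is the standard one for this kind of model and is an assumption of the sketch, not a formula quoted from the paper.

```python
import random

def simulate_symbol_errors(num_symbols: int, p_s: float, l_b: float, seed: int = 0):
    """Two-state Markov (good/bad) symbol error pattern.
    Assumptions of this sketch: errors occur only in the bad state, the bad
    state is left with probability 1/l_b (so bursts last l_b symbols on
    average), and the stationary probability of the bad state equals p_s."""
    p_bg = 1.0 / l_b                      # bad -> good
    p_gb = p_bg * p_s / (1.0 - p_s)       # good -> bad, so that P(bad) = p_s
    rng = random.Random(seed)
    bad = False
    pattern = []
    for _ in range(num_symbols):
        if bad:
            bad = rng.random() >= p_bg    # stay bad with probability 1 - p_bg
        else:
            bad = rng.random() < p_gb
        pattern.append(bad)
    return pattern

errors = simulate_symbol_errors(100_000, p_s=0.01, l_b=8.0)
print(sum(errors) / len(errors))          # close to the target symbol error rate
```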
Finally, the influence of residual errors on the decoded video
quality has to be considered. Depending on the error resilience
capabilities of the video decoder, a single lost codeword may
cause severe image distortion. Fast resynchronization of the bit-
stream and error concealment are two important issues that can
help to mitigate the effect of residual errors. Another important
issue is interframe error propagation because errors may be vis-
ible over many consecutive frames. Therefore, a model for in-
terframe error propagation is derived in this paper that describes
the additional distortion at the decoder, D_v(β, P_w), as a function of the INTRA rate β and the residual word error rate P_w.
After this brief description of each system component, it is interesting to discuss the interactions and tradeoffs that influence the overall distortion D_d. First consider a variation of the code rate r (see Section V-B). Note that for a given channel bit rate R_c, the code rate controls the bit allocation between source and channel coding. This has two effects on the picture quality of the video signal at the decoder output. First, a reduction of r reduces the bit rate available to the video encoder and thus increases the distortion at the encoder regardless of transmission errors. The actual increase in distortion is determined by the operational DR function D_e(β, R) of the video encoder. On the other hand, the residual word error rate is reduced when reducing r, as determined by the properties of the error control channel according to P_w(r, P_s, L_B). Finally, a reduction in P_w leads to a reduction in D_v depending on several implementation issues as discussed above. Considering the total distortion D_d at the video decoder output, these interactions of the various components make it difficult to select the optimum code rate. Basically, the characteristic of each component may have significant influence.
Now consider a variation of the INTRA rate β, which is used as the second important optimization parameter in this paper (see Section V-A). Since INTRA coded macroblocks do not depend on the previous frame, error propagation can be reduced by increasing the number of INTRA coded macroblocks, thus reducing D_v. However, INTRA coding also reduces the coding efficiency compared to motion compensated prediction. Hence, the distortion at the encoder D_e is increased for a fixed bit rate R. Whether or not an increase in β is advantageous for the overall distortion D_d depends on the actual amount of increase/decrease in each component. This illustrates that each component needs to be modeled accurately before system optimization can be attempted. This is particularly true for a joint optimization of β and r (see Section V-C).
B. Simulation Environment
The simulation environment we use in this paper to verify the
derived model is described as follows. As source signals, we use
the QCIF test sequences Mother&Daughter and Foreman which
are encoded at 12.5 fps using 150 and 125 frames, respectively.
The sequences are selected because of their different character-
istic in motion and spatial detail. Although the model can also
be applied to other test sequences (see [21]), we do not provide
additional results because the selected sequences are sufficient
to discuss the effect of different source statistics.
For source coding, we use an H.263 compliant video en-
coder. No H.263 options are used, however, each Group Of
Blocks (GOB) is encoded with a header to improve resynchro-
nization. The encoder operates at a constant bit rate
which
is enforced by a simple rate control that is described as fol-
lows. Each frame is encoded with a fixed quantizer step size,
which is adapted frame by frame to obtain a given target bit
budget. The adaptation of the quantizer step size is performed
as follows. First, the mode decision is performed according to
TMN5 [22] for the whole frame, and then the resulting predic-
tion error is transformed and quantized with different quantizer
step sizes. Finally, the value that minimizes the difference be-
tween the accumulated number of transmitted bits and target
bits is selected. This rate control reduces buffer variations to an
acceptable amount, and hence allows the transmission over a
constant bit rate channel with limited delay. In practice, other
rate control algorithms should be used that can further reduce
buffer variations at improved performance. However, since rate
control is not the focus of the paper, the above approach is suf-
ficient.
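As a rough illustration of the frame-level quantizer adaptation described above, the sketch below selects, for each frame, the quantizer step size that keeps the accumulated bit count closest to the accumulated target. The bit-production model used for the toy frames is a stand-in invented for this example; a real H.263 encoder would supply the actual bit counts per candidate step size.

```python
def select_quantizer(frame_bits, bits_so_far, target_so_far, q_values):
    """Pick the quantizer step size whose resulting frame size brings the
    accumulated number of transmitted bits closest to the accumulated target."""
    return min(q_values, key=lambda q: abs(bits_so_far + frame_bits(q) - target_so_far))

# Toy stand-in: assume a frame produces roughly c / q bits at step size q.
def toy_frame_bits(c):
    return lambda q: c / q

target_per_frame = 6400                      # e.g., 80 kbit/s at 12.5 fps
q_values = range(2, 32)
bits_so_far = target_so_far = 0
for c in (160_000, 220_000, 90_000):         # "complexity" of three toy frames
    target_so_far += target_per_frame
    q = select_quantizer(toy_frame_bits(c), bits_so_far, target_so_far, q_values)
    bits_so_far += toy_frame_bits(c)(q)
    print(f"q = {q:2d}, bits = {bits_so_far:8.0f}, target = {target_so_far}")
```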
Another issue that is related to the coding control of the
video encoder is the INTRA update scheme employed. Several
schemes have been proposed in the literature that either con-
sider the activity of image regions [23], [24], vary the shape
of INTRA update patterns [9], or include the INTRA mode
decision in a rate-distortion optimized encoding framework
[25]–[27]. In a very common scheme, which is also recom-
mended in H.263, each macroblock is assigned a counter that
is incremented if the macroblock is encoded in interframe
mode. If the counter reaches a threshold
( update interval),
the macroblock is encoded in INTRA mode and the counter
is reset to zero. By assigning a different initial offset to each
macroblock, the updates of individual macroblocks can be
spread out in time. In our simulations, we use a very similar
update scheme, however, with a variable threshold
instead
of the fixed value of
that is recommended in H.263.
The only difference is that we also increment the counter for
skipped (i.e., UNCODED) macroblocks to guarantee a regular
update of all image regions.
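The counter-based INTRA update scheme can be sketched as follows. The macroblock count, update interval, and offset pattern below are illustrative; the only deliberate match to the paper's variant is that counters are also incremented for skipped macroblocks.

```python
def intra_refresh_schedule(num_mbs: int, update_interval: int, num_frames: int):
    """Counter-based INTRA update: every macroblock carries a counter that is
    incremented in each frame (including skipped/UNCODED macroblocks); once it
    reaches the update interval N the macroblock is forced to INTRA mode and
    the counter is reset. Staggered initial offsets spread the updates in time."""
    counters = [mb % update_interval for mb in range(num_mbs)]
    schedule = []
    for _ in range(num_frames):
        intra_mbs = []
        for mb in range(num_mbs):
            counters[mb] += 1
            if counters[mb] >= update_interval:
                intra_mbs.append(mb)
                counters[mb] = 0
        schedule.append(intra_mbs)
    return schedule

# 99 macroblocks per QCIF frame and N = 11 -> 9 INTRA macroblocks per frame (about 9%)
for frame, mbs in enumerate(intra_refresh_schedule(99, 11, 3)):
    print(frame, len(mbs), mbs[:3], "...")
```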
The channel parameters are selected as follows. Considering the different complexity of the sequences, we chose a total channel bit rate of 80 kbit/s for Mother&Daughter and 200 kbit/s for Foreman. This allows variation of the INTRA rate and code rate over a wide range without suffering too high distortions or buffer overflows. Unless otherwise noted, the average burst length is set to a fixed value, while the symbol error rate is selected from a fixed set of values.
The parameters of the RS code are considered next. We use the very common choice of 8 bit per symbol, i.e., one symbol corresponds to one byte. The block size is set to the average GOB size, which results in n = 89 bytes (80 000/12.5/9 ≈ 712 bit) for the Mother&Daughter and n = 222 bytes (200 000/12.5/9 ≈ 1778 bit) for the Foreman sequence. Note that this limits the delay introduced by channel coding to one GOB, and therefore also allows for conversational services
with their strict delay constraints. The amount of information
symbols k is varied in increments of 8 bytes to achieve different
code rates.
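The block-size arithmetic above can be reproduced directly; the helper below merely restates that calculation, and the list of selectable code rates follows from varying k in steps of 8 bytes.

```python
def gob_block_size_bytes(bit_rate_bps: float, fps: float, gobs_per_frame: int) -> int:
    """Average GOB size in bytes, used as the RS block length n (one codeword per GOB)."""
    return round(bit_rate_bps / fps / gobs_per_frame / 8)

n_md = gob_block_size_bytes(80_000, 12.5, 9)     # Mother&Daughter -> 89 bytes
n_fm = gob_block_size_bytes(200_000, 12.5, 9)    # Foreman         -> 222 bytes

# k is varied in steps of 8 bytes, giving the selectable code rates r = k / n.
rates_md = [round(k / n_md, 2) for k in range(n_md, 0, -8)]
print(n_md, n_fm, rates_md[:5])
```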
Finally, we need to consider the operation of the video de-
coder in the case of errors. If the RS decoder fails to correct
the transmission errors in a block, the video decoder receives
an error indication or detects that there has been an error due to
bit stream syntax violations. In either case, error concealment
is done for any GOB that overlaps with the lost packet. No spe-
cial packetization is used, i.e., new GOB’s are not necessarily
aligned with the beginning of a packet. For error concealment,
the previous-frame GOB is simply copied to the current frame
buffer.
C. Distortion Measure
For the evaluation of the video transmission system, it is nec-
essary to average the distortion over the whole sequence in order
to provide a single figure of merit. Even though the time aver-
aged squared error is somewhat questionable as a measure of
subjective quality, this approach is still very useful, e.g., to pro-
vide an overview for a large set of simulations. Therefore, the
video quality is measured as the Mean-Squared-Error (MSE)
averaged over all frames of the video sequence throughout this
paper. Since PSNR is a measure more common in the video
coding community, we use PSNR = 10 log10(255^2 / MSE) to illustrate simulation results. Note that the average PSNR is often
computed by first computing the PSNR for each frame and av-
eraging in time afterwards. The definition used in this paper al-
lows a better theoretical analysis (see Section III-B) and is more
consistent with subjective quality for strong quality variations.
In practice, however, there is no significant difference between
the two definitions.
Note that we need to distinguish between the picture quality
at the encoder and the picture quality at the decoder. Using D_e to describe the overall MSE for a whole sequence after encoding, we obtain

PSNR_e = 10 log10(255^2 / D_e)    (1)

for the corresponding PSNR value. At the decoder side we need
to recall that the result depends on the probabilistic nature of
the channel. Hence, the averaged distortion over many channel
realizations has to be considered. For the simulation results in
this paper, we use 30 random channel realizations for each par-
ticular setting of the video transmission system and average the
MSE over all frames and realizations. The resulting MSE and
PSNR are denoted D_d and

PSNR_d = 10 log10(255^2 / D_d)    (2)

respectively. In order to ensure that the distortion at the decoder is measured in a steady state, only the last 50 encoded frames are used to calculate D_d and PSNR_d.
As mentioned above, the overall MSE D_d is actually a superposition of two distortion types: the distortion D_e caused by signal compression and the distortion D_v caused by residual errors and interframe error propagation. Assuming that D_e and D_v are uncorrelated, we can calculate the overall MSE as

D_d = D_e + D_v.    (3)
Our experiments indicate that this assumption is valid. Even
though transmission errors may be clustered around active
regions, and thus their magnitude may be correlated with the
coding errors, usually their sign is not correlated to the sign of
the coding errors.
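A minimal sketch of the distortion bookkeeping in (1)-(3), assuming 8-bit video (peak value 255) and the symbol names used in the reconstruction above (D_e, D_v, D_d):

```python
import math

def psnr_from_mse(mse: float, peak: float = 255.0) -> float:
    """PSNR in dB corresponding to an MSE value, as in (1) and (2)."""
    return 10.0 * math.log10(peak * peak / mse)

def overall_mse(d_e: float, d_v: float) -> float:
    """Overall decoder MSE as the sum of encoder distortion and error-induced
    distortion, assuming the two are uncorrelated, as in (3)."""
    return d_e + d_v

d_e, d_v = 20.0, 15.0                     # illustrative MSE values
print(psnr_from_mse(d_e), psnr_from_mse(overall_mse(d_e, d_v)))
```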
However, it should be noted that (3) combines two distortion
types that are likely to be perceived differently. The distortion D_e is caused by signal compression and consists of blocking artifacts, mosquito noise, ringing, blurring, etc. The distortion D_v introduced by transmission errors consists of severe destruction of image content and may be large and infrequent. Subjective tests are needed to determine how D_e and D_v shall be combined to give the best possible approximation of subjective
quality.
If subjective tests show that, e.g., the distortion D_v caused by transmission errors is more annoying than the distortion D_e caused by the video encoder, (3) and (2) can be changed to a weighted sum or some other function of D_e and D_v. The
determination of such a subjective quality function is beyond
the scope of this paper and is left to future research.
III. ANALYSIS OF THE VIDEO CODEC
In this section, we analyze the performance of the video en-
coder and decoder. Although we use the ITU-T H.263 [13] video
compression standard throughout this paper, the model derived
can be used for other codecs that are based on hybrid motion
compensation.
In the following section we model the distortion-rate perfor-
mance of the video encoder. Then we introduce an analytical
model for the error propagation at the video decoder which can
explain the cumulative effect of transmission errors. We focus
on the main results and refer to the Appendixes for most deriva-
tions.
A. Video Encoder
In this section we model the Distortion-Rate (DR) perfor-
mance of a hybrid motion compensated video encoder. The
proposed model is an empirical model that is not derived
analytically. Instead, we focus on the input–output behavior
of the video encoder and emphasize simplicity and usability
over a complete theoretical description. On the one hand,
this approach is taken because we want to describe a com-
plete transmission system, which requires the complexity of
individual components to be kept at a reasonable level. On
the other hand, we found that theoretically founded models
often cannot describe experimental results very accurately due
to simplistic assumptions. For example, such a theoretically
founded model for the performance of motion compensated
prediction is described in [28] and [29], where the DR per-
formance is analyzed by deriving the power spectral density
of the prediction error with respect to the probability density
function of the displacement error. Although this model pro-
vides very interesting insights, it cannot describe the measured

DR performance of an H.263 encoder with sufficient accuracy.
Similar problems can be observed for the description of the
DR performance in transform coding [30] and DCT coding in
particular. Although several empirical distortion-rate models
have been published (e.g., [31]–[34]), they are usually used for
rate control and cannot be used to model the distortion of an
entire video encoder for a given rate.
To avoid these limitations without an increase in model com-
plexity, we use a simple equation that relates the distortion at the
encoder D_e to the relevant parameters. In the simulation scenario that we consider, there are two parameters with a significant impact on D_e, namely the source rate R that is allocated to the video encoder, and second, the percentage of INTRA coded macroblocks (INTRA rate) β that is enforced by the coding control to improve error robustness. The general idea to use empirical models to describe DR performance has also been used for rate control as, for example, in [32]; however, our focus is on the description of the overall performance, i.e., the average distortion for a whole sequence given R and β.
One drawback of this approach is that the necessary model
parameters cannot be derived from commonly used signal sta-
tistics, like variance, correlation, or the power spectral density.
Instead, the parameters need to be estimated by fitting the model
to a subset of measured data points from the DR curve. Since the
proposed model uses only six parameters (see below), the nec-
essary subset is relatively small and can be obtained with rea-
sonable complexity. However, the obtained parameters are spe-
cific for a given video sequence and video codec. Furthermore,
the interpretation of these parameters is not always obvious.
This makes it difficult to, e.g., extend results from a sequence
with “complex motion” to a sequence with “moderate motion.”
However, we found that the model can describe the DR perfor-
mance of a wide range of test sequences with very good accu-
racy, once the parameters are selected correctly. Furthermore,
the simplicity of the model significantly increases its usability
and thus, in practice, outweighs the described drawbacks. Nev-
ertheless, it should be noted that a model of similar simplicity
that is founded on theoretical analysis would be highly desir-
able.
We use the DR model

D_e(R) = D_0 + θ / (R − R_0)    (4)

where D_e is the distortion of the encoded sequence, measured as the MSE, and R is the output rate of the video encoder. The remaining variables (D_0, θ, and R_0) are the parameters of the DR model which depend on the encoded sequence as well as on the percentage of INTRA coded macroblocks β. We have found that the relationship with β is approximately linear, i.e., each of D_0, θ, and R_0 is an affine function of β,

D_0(β) = D_0' + D_0''·β,    θ(β) = θ' + θ''·β,    R_0(β) = R_0' + R_0''·β    (5)

such that the total number of model parameters is six. According to (5), it is sufficient to measure the DR curves for only two different INTRA rates. Intermediate values can then be obtained by linear interpolation. This is also the approach used in the following to obtain the model parameters.
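The following sketch evaluates the DR model (4) with its parameters interpolated linearly in the INTRA rate as in (5). The parameter values and the two reference INTRA rates are made up for illustration, and the symbol names follow the reconstruction used above rather than the paper's original typesetting.

```python
def encoder_distortion(rate, beta, params_lo, params_hi):
    """Encoder MSE from the DR model (4), D_e(R) = D0 + theta / (R - R0),
    with D0, theta and R0 interpolated linearly in the INTRA rate beta (5).
    params_lo / params_hi hold the values fitted at two reference INTRA rates."""
    def lerp(key):
        w = (beta - params_lo["beta"]) / (params_hi["beta"] - params_lo["beta"])
        return params_lo[key] + w * (params_hi[key] - params_lo[key])
    d0, theta, r0 = lerp("D0"), lerp("theta"), lerp("R0")
    return d0 + theta / (rate - r0)

# Illustrative parameter sets (rate in kbit/s, distortion as MSE); values are made up.
p_lo = {"beta": 0.00, "D0": 1.0, "theta":  900.0, "R0": 10.0}
p_hi = {"beta": 0.33, "D0": 2.0, "theta": 1400.0, "R0": 15.0}
print(encoder_distortion(80.0, beta=0.11, params_lo=p_lo, params_hi=p_hi))
```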
Fig. 2 shows that (4) and (5) approximate the DR performance
of the video encoder very accurately. Although the experimental
results are obtained with an H.263 encoder, the DR curves for
other hybrid motion compensated video encoders, e.g., H.261
[12], MPEG-1 [15], or MPEG-2 [16], exhibit very similar be-
havior.
The model (4) was fitted to the measured DR points for two different INTRA rates (with a larger upper value for the Foreman sequence). The fitting was done by minimizing the sum of squared MSE differences between the model and the measured points. This resulted in two sets of parameters (D_0, θ, R_0) for each sequence. These two parameter sets together consist of six values, thus allowing us to determine the linear coefficients in (5).
The model parameters D_0, θ, and R_0 obtained in this way are used to interpolate the DR curves for other INTRA rates β. The intermediate curves in Fig. 2 for 3%, 6%, 11%, and
22% (and 33%, 44% for Foreman) were generated by using (4)
and interpolating the parameters according to (5). The maximal
PSNR deviation between the model fitted that way and the
measured DR points is 0.22 dB for the Mother&Daughter
sequence and 0.3 dB for the Foreman sequence (Fig. 2).
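The least-squares fit described above can be reproduced with a standard routine such as scipy.optimize.curve_fit; the measurement points below are made up and only show the mechanics of estimating D_0, θ, and R_0 for one INTRA rate.

```python
import numpy as np
from scipy.optimize import curve_fit

def dr_model(rate, d0, theta, r0):
    """DR model (4): encoder MSE as a function of the encoder output rate."""
    return d0 + theta / (rate - r0)

# Measured (rate in kbit/s, MSE) points for one INTRA rate -- illustrative numbers only.
rates = np.array([30.0, 50.0, 80.0, 120.0, 200.0])
mses  = np.array([45.0, 25.0, 14.0,  9.0,   5.5])

params, _ = curve_fit(dr_model, rates, mses, p0=(1.0, 1000.0, 5.0))
print(dict(zip(("D0", "theta", "R0"), np.round(params, 2))))
```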
Note that the parameters D_0, θ, and R_0 characterize the coding of the input video sequence with the given hybrid motion compensated encoder, in this example Mother&Daughter or Foreman coded with H.263 in baseline mode. The parameters depend very much on the spatial detail and the amount of motion in the sequence; e.g., a given parameter may be low for a sequence with high motion and little spatial detail, and high for a sequence with moderate motion and high spatial detail.
B. Video Decoder
While motion compensated prediction yields significant
gains in coding efficiency, it also introduces interframe error
propagation in the case of transmission errors. Since these
errors decay slowly, they are very annoying. To optimize the
overall performance of video transmission systems in noisy
environments, it is therefore important to consider the effect
of error propagation. While several heuristic approaches have
been investigated in the literature to reduce the influence of
error propagation (e.g., [23], [24], and [35]), up until now
no theoretical framework has been proposed to model the
influence of transmission errors on the decoded picture quality.
The model proposed in the following includes the effects of
INTRA coding and spatial loop filtering and corresponds to
simulation results very accurately.
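Since the derivation of the propagation model itself follows below, only its qualitative behaviour is sketched here: an error introduced in one frame is attenuated in later frames by the spatial loop filter (modelled as a constant leakage factor, an assumption of this sketch) and by INTRA updates. This toy recursion is not the paper's equations (8) and (9).

```python
def propagated_error_variance(sigma_u0: float, beta: float, leakage: float, num_frames: int):
    """Toy model of interframe error propagation: an error of variance sigma_u0
    introduced in frame 0 is reduced in every later frame by a leakage factor
    (spatial loop filtering) and by the fraction beta of INTRA-updated
    macroblocks. Illustrative only; not the paper's derivation."""
    variances = [sigma_u0]
    for _ in range(1, num_frames):
        variances.append(variances[-1] * (1.0 - beta) * leakage)
    return variances

print([round(v, 1) for v in propagated_error_variance(100.0, beta=0.11, leakage=0.93, num_frames=8)])
```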
Note that two different types of errors contribute to the overall
distortion at the decoder. First, the errors that are caused by
signal compression at the encoder (D_e) and, second, errors that are caused by residual errors which cannot be corrected by the channel decoder. Since the first type of error is sufficiently described by (4), we now focus on the second type of error and use the variable D_v to refer to it.
A simplified block diagram of a hybrid motion compensated
video codec is illustrated in Fig. 3, together with the relevant pa-
rameters that are introduced in the following. We describe errors

Citations
More filters
Journal ArticleDOI
TL;DR: This article emphasizes the processing that is done on the luminance components of the video, and provides an overview of the techniques used for bit-rate reduction and the corresponding architectures that have been proposed.
Abstract: Throughout this article, we concentrate on the transcoding of block-based video coding schemes that use hybrid discrete cosine transform (DCT) and motion compensation (MC). In such schemes, the frames of the video sequence are divided into macroblocks (MBs), where each MB typically consists of a luminance block (e.g., of size 16 × 16, or alternatively, four 8 × 8 blocks) along with corresponding chrominance blocks (e.g., 8 × 8 Cb and 8 × 8 Cr). This article emphasizes the processing that is done on the luminance components of the video. In general, the chrominance components can be handled similarly and will not be discussed in this article. We first provide an overview of the techniques used for bit-rate reduction and the corresponding architectures that have been proposed. Then, we describe the advances regarding spatial and temporal resolution reduction techniques and architectures. Additionally, an overview of error resilient transcoding is also provided, as well as a discussion of scalable coding techniques and how they relate to video transcoding. Finally, the article ends with concluding remarks, including pointers to other works on video transcoding that have not been covered in this article, as well as some future directions.

736 citations

Journal ArticleDOI
TL;DR: An analytic solution for adaptive intra mode selection and joint source-channel rate control under time-varying wireless channel conditions is derived and significantly improves the end-to-end video quality in wireless video coding and transmission.
Abstract: We first develop a rate-distortion (R-D) model for DCT-based video coding incorporating the macroblock (MB) intra refreshing rate. For any given bit rate and intra refreshing rate, this model is capable of estimating the corresponding coding distortion even before a video frame is coded. We then present a theoretical analysis of the picture distortion caused by channel errors and the subsequent inter-frame propagation. Based on this analysis, we develop a statistical model to estimate such channel errors induced distortion for different channel conditions and encoder settings. The proposed analytic model mathematically describes the complex behavior of channel errors in a video coding and transmission system. Unlike other experimental approaches for distortion estimation reported in the literature, this analytic model has very low computational complexity and implementation cost, which are highly desirable in wireless video applications. Simulation results show that this model is able to accurately estimate the channel errors induced distortion with a minimum delay in processing. Based on the proposed source coding R-D model and the analytic channel-distortion estimation, we derive an analytic solution for adaptive intra mode selection and joint source-channel rate control under time-varying wireless channel conditions. Extensive experimental results demonstrate that this scheme significantly improves the end-to-end video quality in wireless video coding and transmission.

390 citations


Cites background or methods or result from "Analysis of video transmission over..."

  • ...Standard video coding schemes, such as H.263 and MPEG-4, employ a motion-compensation based discrete cosine transform (MC-DCT) coding scheme....

    [...]

  • ...Notice that in standard video coding, such as H.263 and MPEG-4, monotonically increases with ....

    [...]

  • ...The channel-distortion model and the corresponding estimation scheme are described in Section III....

    [...]

  • ...REFERENCES [1] ITU-T, “Video coding for low bit rate communications,” ITU-T Recommendation H.263, version 1, version 2, Jan. 1998....

    [...]

  • ...Due to the limited bandwidth of the wireless channels, video signals have to be highly compressed by efficient coding algorithms, such as H.263 [1] and MPEG-4 [2]....

    [...]

Patent
Petrus J. L. Van Beek1
25 Jun 2003
TL;DR: In this article, a transmission system suitable for video where a sender encodes video for transmission to a receiver at an adjustable data rate is presented, where the data rate may be adjusted using a delay constraint that constrains the expected delay of transmitted packets.
Abstract: A transmission system suitable for video where a sender encodes video for transmission to a receiver at an adjustable data rate. The data rate may be adjusted using a delay constraint that constrains the expected delay of transmitted packets. The expected delay may be measured from a time that a transmitter encodes a packet to a time that a receiver decodes a packet.

258 citations

Journal ArticleDOI
TL;DR: This work studies the problem of video streaming over multi-channel multi-radio multihop wireless networks, and develops fully distributed scheduling schemes with the goals of minimizing the video distortion and achieving certain fairness, and proposes a media-aware distortion-fairness strategy.
Abstract: An important issue of supporting multi-user video streaming over wireless networks is how to optimize the systematic scheduling by intelligently utilizing the available network resources while, at the same time, to meet each video's Quality of Service (QoS) requirement. In this work, we study the problem of video streaming over multi-channel multi-radio multihop wireless networks, and develop fully distributed scheduling schemes with the goals of minimizing the video distortion and achieving certain fairness. We first construct a general distortion model according to the network?s transmission mechanism, as well as the rate distortion characteristics of the video. Then, we formulate the scheduling as a convex optimization problem, and propose a distributed solution by jointly considering channel assignment, rate allocation, and routing. Specifically, each stream strikes a balance between the selfish motivation of minimizing video distortion and the global performance of minimizing network congestions. Furthermore, we extend the proposed scheduling scheme by addressing the fairness problem. Unlike prior works that target at users' bandwidth or demand fairness, we propose a media-aware distortion-fairness strategy which is aware of the characteristics of video frames and ensures max-min distortion-fairness sharing among multiple video streams. We provide extensive simulation results which demonstrate the effectiveness of our proposed schemes.

242 citations


Cites background or methods from "Analysis of video transmission over..."

  • ...According to [21], Dcomp can be approximated by:...

    [...]

  • ...For the distortion of wireless video transmission, we employ an additive model to capture the total video distortion as [10], [21], [22], and the overall distortion Dall can be obtained by:...

    [...]

  • ...where α depends on parameters related to the compressed video sequence [21]....

    [...]

Proceedings ArticleDOI
06 Apr 2003
TL;DR: A model is proposed that accurately estimates the expected distortion by explicitly accounting for the loss pattern, inter-frame error propagation, and the correlation between error frames and the accuracy of the proposed model is validated with JVT/H.
Abstract: Video communication is often afflicted by various forms of losses, such as packet loss over the Internet. The paper examines the question of whether the packet loss pattern, and in particular the burst length, is important for accurately estimating the expected mean-squared error distortion. Specifically, we (1) verify that the loss pattern does have a significant effect on the resulting distortion, (2) explain why a loss pattern, for example a burst loss, generally produces a larger distortion than an equal number of isolated losses, and (3) propose a model that accurately estimates the expected distortion by explicitly accounting for the loss pattern, inter-frame error propagation, and the correlation between error frames. The accuracy of the proposed model is validated with JVT/H.26L coded video and previous frame concealment, where for most sequences the total distortion is predicted to within ±0.3 dB for burst loss of length two packets, as compared to prior models which underestimate the distortion by about 1.5 dB. Furthermore, as the burst length increases, our prediction is within ±0.7 dB, while prior models degrade and underestimate the distortion by over 3 dB.

209 citations


Cites background or methods from "Analysis of video transmission over..."

  • ...The problem of error-resilient video communication has received significant attention in recent years, and a variety of techniques have been proposed, including intra/inter-mode switching [1, 2], dynamic control of prediction dependencies [3], forward error correction [4], and multiple description coding [5]....

    [...]

  • ...Prior work on modeling the effect of losses generally model the distortion as being proportional to the number of losses that occur [2, 7]....

    [...]

  • ...For example [2] carefully analyzes and models the distortion for a single (isolated) loss (accounting for error propagation, intra refresh, and spatial filtering), and model the effect of multiple losses as the superposition of multiple independent losses....

    [...]

  • ...In [2], the loop filter is approximated by a Gaussian low-pass filter....

    [...]

References
More filters
Book
01 Jan 1991
TL;DR: The author examines the role of entropy, inequality, and randomness in the design of codes and the construction of codes in the rapidly changing environment.
Abstract: A comprehensive textbook on information theory, covering entropy and mutual information, the asymptotic equipartition property, data compression, channel capacity, differential entropy, the Gaussian channel, rate distortion theory, universal source coding, Kolmogorov complexity, network information theory, and information-theoretic inequalities.

45,034 citations

Journal ArticleDOI
01 May 1998
TL;DR: In this paper, a review of error control and concealment in video communication is presented, which are described in three categories according to the roles that the encoder and decoder play in the underlying approaches.
Abstract: The problem of error control and concealment in video communication is becoming increasingly important because of the growing interest in video delivery over unreliable channels such as wireless networks and the Internet. This paper reviews the techniques that have been developed for error control and concealment. These techniques are described in three categories according to the roles that the encoder and decoder play in the underlying approaches. Forward error concealment includes methods that add redundancy at the source end to enhance error resilience of the coded bit streams. Error concealment by postprocessing refers to operations at the decoder to recover the damaged areas based on characteristics of image and video signals. Last, interactive error concealment covers techniques that are dependent on a dialogue between the source and destination. Both current research activities and practice in international standards are covered.

1,611 citations

01 Jan 1996

1,354 citations

Journal ArticleDOI
J. Ribas-Corbera, Shaw-Min Lei1
TL;DR: This work presents a simple rate control technique that achieves high quality and low buffer delay by smartly selecting the values of the quantization parameters in typical discrete cosine transform video coders, and implements this technique in H.263 and MPEG-4 coders.
Abstract: An important motivation for the development of the emerging H.263+ and MPEG-4 coding standards is to enhance the quality of highly compressed video for two-way, real-time communications. In these applications, the delay produced by bits accumulated in the encoder buffer must be very small, typically below 100 ms, and the rate control strategy is responsible for encoding the video with high quality and maintaining a low buffer delay. In this work, we present a simple rate control technique that achieves these two objectives by smartly selecting the values of the quantization parameters in typical discrete cosine transform video coders. To do this, we derive models for bit rate and distortion in this type of coders, in terms of the quantization parameters. Using Lagrange optimization, we minimize distortion subject to the target bit constraint, and obtain formulas that indicate how to choose the quantization parameters. We implement our technique in H.263 and MPEG-4 coders, and compare its performance to TMN7 and VM7 rate control when the encoder buffer is small, for a variety of video sequences and bit rates. This new method has been adopted as a rate control tool in the test model TMN8 of H.263+ and (with some modifications) in the verification model VM8 of MPEG-4.

717 citations

Journal ArticleDOI
TL;DR: The rationale behind the development of the EDGE concept is given, the technology will provide significantly higher user bit rates and spectral efficiency, and performance is addressed by means of system simulations.
Abstract: Two of the major second-generation standards, GSM and TDMA/136, have built the foundation to offer a common global radio access for data services. Through use of a common physical layer, EDGE, both standards will have the same evolutionary path toward providing third-generation services. EDGE is currently subject to standardization in TIA TR45.3 and ETSI SMG, a process which will be finalized at the end of 1999. Compared to the existing data services in GSM and TDMA/136, EDGE will provide significantly higher user bit rates and spectral efficiency. EDGE can be introduced in these systems in a smooth way, using existing frequency plans of already deployed networks. This article gives the rationale behind the development of the EDGE concept, presents the EDGE technology, and addresses performance by means of system simulations.

462 citations

Frequently Asked Questions (12)
Q1. What are the two important issues that can help to mitigate the effect of residual errors?

Fast resynchronization of the bitstream and error concealment are two important issues that can help to mitigate the effect of residual errors. 


Other prediction techniques like overlapped block motion compensation (OBMC) or deblocking filters inside the DPCM loop may also contribute to the overall loop filter. 

For a channel characterized by the given symbol error rate and average burst length, only one out of 10 000 blocks will have to be discarded, i.e., less than one GOB within 1000 frames.

As source signals, the authors use the QCIF test sequences Mother&Daughter and Foreman which are encoded at 12.5 fps using 150 and 125 frames, respectively. 

A reduction of the code rate r reduces the bit rate available to the video encoder and thus increases the distortion at the encoder regardless of transmission errors.

Note that motion compensated prediction may cause spatial error propagation, such that errors may actually "survive" one INTRA update period.

The general idea to use empirical models to describe DR performance has also been used for rate control as, for example, in [32]; however, their focus is on the description of the overall performance, i.e., the average distortion for a whole sequence given the rate R and INTRA rate β.

A lot of FEC would be needed to correct this error burst, thus lowering the available rate for the video bitstream in many blocks which are not affected by channel errors at all.

It also depends on several implementation issues, like packetization, resynchronization, and error concealment, as well as on the encoded video sequence.

For a given sequence, fixed packet size, and given decoder implementation, it can be shown that the error variance that is introduced can be expressed as in (6). This linear relation is only valid for low residual error rates.

Together with the total bit rate R_c, these two parameters completely describe the channel and can be used to, e.g., study the influence of burst errors versus independent symbol errors.