Journal ArticleDOI

Analysis of video transmission over lossy channels

TL;DR: The main focus of this paper is to show the accuracy of the derived analytical model and its applicability to the analysis and optimization of an entire video transmission system.
Abstract: A theoretical analysis of the overall mean squared error (MSE) in hybrid video coding is presented for the case of error prone transmission. Our model covers the complete transmission system including the rate-distortion performance of the video encoder, forward error correction, interleaving, and the effect of error concealment and interframe error propagation at the video decoder. The channel model used is a 2-state Markov model describing burst errors on the symbol level. Reed-Solomon codes are used for forward error correction. Extensive simulation results using an H.263 video codec are provided for verification. Using the model, the optimal tradeoff between INTRA and INTER coding as well as the optimal channel code rate can be determined for given channel parameters by minimizing the expected MSE at the decoder. The main focus of this paper is to show the accuracy of the derived analytical model and its applicability to the analysis and optimization of an entire video transmission system.

Summary (2 min read)

Introduction

  • Interestingly, even this less ambitious problem is not well investigated in the literature.
  • Because of this interaction of system components, the influence of individual parameters is difficult to understand, and the design of the overall system might become a formidable task.
  • In this paper the authors consider only a single layer codec but include the effects of transmission errors and INTRA coding as well as the distortion-rate behavior of the video encoder.
  • Finally, joint optimization of source and channel coding parameters is investigated in Section V-C.

A. Overview

  • In this section the authors give an overview of the video transmission system under consideration and introduce the most important model parameters.
  • Often, this involves packetization and some form of error control.
  • Fast resynchronization of the bitstream and error concealment are two important issues that can help to mitigate the effect of residual errors.
  • First consider a variation of the code rate r (see Section V-B).
  • First, a reduction of the code rate r reduces the bit rate available to the video encoder and thus increases the distortion at the encoder regardless of transmission errors.

B. Simulation Environment

  • The simulation environment the authors use in this paper to verify the derived model is described as follows.
  • As source signals, the authors use the QCIF test sequences Mother&Daughter and Foreman, which are encoded at 12.5 fps using 150 and 125 frames, respectively.
  • This rate control reduces buffer variations to an acceptable amount, and hence allows the transmission over a constant bit rate channel with limited delay.
  • In either case, error concealment is done for any GOB that overlaps with the lost packet.

C. Distortion Measure

  • For the evaluation of the video transmission system, it is necessary to average the distortion over the whole sequence in order to provide a single figure of merit.
  • In the following section the authors model the distortion-rate performance of the video encoder.
  • The authors have found that the relationship with the INTRA rate is approximately linear, as expressed by (5), such that the total number of model parameters is six.
  • Errors that are introduced at a given point in time propagate due to the recursive structure of the decoder.

A. Optimal INTRA Rate

  • The influence of the INTRA rate on the decoded picture distortion is studied for a fixed channel code rate.
  • On the one hand, an increased percentage of INTRA coded macroblocks helps to reduce interframe error propagation, and therefore reduces the error-induced distortion as described by (8) and (9).
  • It can be seen that the model gives a very good approximation of the PSNR at the decoder.
  • Therefore, the INTRA mode can be used more generously, and higher optimal INTRA rates result.
  • On the other hand, the exact selection of the INTRA rate is less critical, since the optimum is rather flat.

B. Optimal FEC Code Rate

  • Analogous to the previous subsection, the authors now study the influence of the channel code rate on the decoded video quality (PSNR) for a fixed INTRA rate.
  • Fig. 8 shows that their model approximates the PSNR at the video decoder for different channel code rates very well.
  • As explained in Section III-B, this is due to the fact that the introduced errors are not independent any more.
  • Note that the variation of PSNR as a function of the code rate is more severe for the Foreman sequence than for the Mother&Daughter sequence.
  • More importantly, the same reduction in code rate is more effective for the Foreman sequence because of the increased block size.

C. Optimal Parameter Selection for the Transmission System

  • In this subsection the authors optimize the rate of INTRA coded macroblocks and the channel code rate jointly.
  • Fig. 11 shows the optimal INTRA rate and the optimal channel code rate for a transmission over burst channels with different average burst lengths and symbol error rates.
  • The authors have derived a theoretical framework for the decoded picture quality after video transmission over lossy channels.
  • In contrast, for bursty channels the use of FEC is limited and the INTRA update is essential.
  • The authors are mainly interested in the variance of the propagated error signal and in its average over time.

A. Derivation of Block Error Density

  • The probability of a given number of errors within a block of symbols is then derived as a function of the average error probability.




1012 IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 18, NO. 6, JUNE 2000
Analysis of Video Transmission over Lossy Channels
Klaus Stuhlmüller, Niko Färber, Member, IEEE, Michael Link, and Bernd Girod, Fellow, IEEE
Abstract—A theoretical analysis of the overall mean squared
error (MSE) in hybrid video coding is presented for the case of
error prone transmission. Our model covers the complete trans-
mission system including the rate-distortion performance of the
video encoder, forward error correction, interleaving, and the ef-
fect of error concealment and interframe error propagation at the
video decoder. The channel model used is a 2-state Markov model
describing burst errors on the symbol level. Reed–Solomon codes
are used for forward error correction. Extensive simulation results
using an H.263 video codec are provided for verification. Using the
model, the optimal tradeoff between INTRA and INTER coding as
well as the optimal channel code rate can be determined for given
channel parameters by minimizing the expected MSE at the de-
coder. The main focus of this paper is to show the accuracy of the
derived analytical model and its applicability to the analysis and
optimization of an entire video transmission system.
Index Terms—Error resilience, intra-update, joint source-
channel coding, robust video transmission, tradeoff source-
channel coding, video transmission system model.
I. INTRODUCTION
TO TRANSMIT video over noisy channels, one uses both
source and channel coding. According to Shannon’s
Separation Principle, these components can be designed
independently without loss in performance [1]. However, this
important information-theoretic result is based on several
assumptions that might break down in practice. In particular,
it is based on 1) the assumption of an infinite block length for
both source and channel coding, and 2) an exact and complete
knowledge of the statistics of the (ergodic) transmission
channel. As a result of the first assumption, the Separation
Principle cannot be applied without performance loss to
applications with real-time constraints. This holds especially
for bursty channels which are characteristic for mobile radio
transmission or the Internet. As a consequence of the second
assumption, it applies only to point-to-point communications.
Therefore, Joint Source-Channel Coding and Error Resilient
Coding can be advantageous, in practice, and have become an
important research topic. Recent reviews and special issues in
the context of video coding include [2]–[5].
Despite increased research activity, joint source-channel
coding schemes for video are still in their infancy today. A
Manuscript received May 5, 1999; revised November 11, 1999. This work
was supported in part by the German DFN-Verein.
K. Stuhlmüller and N. Färber are with the Telecommunications Laboratory,
University of Erlangen-Nuremberg, Cauerstrasse 7/NT, 91058 Erlangen, Ger-
many (e-mail: stuhl@LNT.de; faerber@LNT.de).
M. Link is with Lucent Technologies, Nuremberg, Germany (e-mail:
mlink@lucent.com).
B. Girod is with the Information Systems Laboratory, Stanford University,
Stanford, CA USA (e-mail: girod@ee.stanford.edu).
Publisher Item Identifier S 0733-8716(00)04338-9.
pragmatic approach for today’s state of the art is to keep the
source coder and the channel coder separate, but to optimize
their parameters jointly. A key problem of this optimization
is the bit allocation between source and channel coding that
is also discussed in this paper. Interestingly, even this less
ambitious problem is not well investigated in the literature.
Often, the underlying transmission system is regarded as a
“black box,” and the video codec has to cope with whatever
bit error rate or packet error rate is offered. This approach
is indeed justified if video is added as another application
on top of a fixed transmission system. However, current and
future transmission systems provide increasing flexibility at
the interface to the transport level. For example, the enhanced
air interface of the GSM system (EDGE [6]) will include a
flexible link adaptation where either 1/1, 3/4, 2/3, or 1/2 of the
total bit rate can be allocated to the source while the rest is used
for channel coding. In fact, the advantage of this flexibility for
speech transmission is already exploited in the next generation
speech codec of the GSM system, called Adaptive Multi Rate
(AMR, [7]). In the future, software radios may even allow
configuration of the modulation scheme [8]. This trend toward
increased flexibility allows inclusion of channel coding (and
modulation) into the optimization.
More flexibility, on the other hand, also increases the com-
plexity of the system and makes parameter optimization more
difficult. The overall performance depends on many interre-
lated issues, such as the distortion-rate performance and error
resilience of the source codec, the error correction capability of
the channel codec, and the characteristic of the channel. Because
of this interaction of system components, the influence of indi-
vidual parameters is difficult to understand, and the design of
the overall system might become a formidable task. Often, sim-
ulations are used to study overall system performance (e.g., [9]).
However, measurements can rarely be generalized, and provide
only limited insight in the underlying problem. Furthermore,
simulations can become very complex for a large parameter
space. It is therefore desirable to develop appropriate models
to study and understand the interaction and tradeoffs between
system parameters.
The scope of this paper is to provide such a model for a
complete video transmission system. We use this model to
analyze the overall performance as a function of the most
important system parameters. In particular, the optimum
bit allocation between source and channel coding is found
analytically while also considering the optimal tradeoff be-
tween INTER and INTRA coding. Similar investigations have
been performed for vector quantization [10] and Lempel–Ziv
compression [11]. However, no analysis has been presented for
motion-compensated video coding that forms the basis of all
common video coding standards, including H.261, H.263, MPEG-1, MPEG-2, and MPEG-4 [12]–[17]. In previous work, we addressed the related problem of optimal transmission of a given scalable video bit stream over a packet network by optimizing the unequal error protection for the layers of the video stream [18], [19]. In this paper we consider only a single layer codec but include the effects of transmission errors and INTRA coding as well as the distortion-rate behavior of the video encoder.

Fig. 1. Video transmission scheme. The video encoder is described by its distortion-rate function D_e(β, R), depending on the INTRA rate β. The influence of the transmission using FEC is described by the residual word error rate P_w(r, P_s, L_B), depending on the channel code rate r and the channel characteristics P_s (error probability) and L_B (average burst length). At the video decoder, the effect of error propagation is given by D_v(β, P_w). The overall decoded video quality is denoted D_d.
This paper is organized as follows. We first outline the trans-
mission system in Section II. In Section III we model a hybrid
motion compensated video codec. The distortion-rate perfor-
mance of the video encoder is analyzed in Section III-A, while
a theoretical framework for interframe error propagation is pre-
sented in Section III-B. The influence of channel coding and
channel parameters are discussed in Section IV. Then, we com-
bine the models to describe the overall system performance, and
show in Section V that our model can approximate the decoded
picture quality very accurately. The impact of INTRA coding
and FEC is studied in Sections V-A and B, respectively. Finally,
joint optimization of source and channel coding parameters is
investigated in Section V-C.
II. VIDEO TRANSMISSION SYSTEM
A. Overview
In this section we provide an overview of the video transmis-
sion system under consideration, and introduce the most impor-
tant model parameters. As can be seen from Fig. 1, the system
consists of three parts: the video encoder, the video decoder, and
the error control channel, which is definedas the combination of
the channel codec and the channel [20]. These components are
described briefly in the following paragraphs and are discussed
in more detail in Sections III and IV. All model parameters are
summarized in Table I for quick reference.
We assume that a space-time discrete video signal is used as input to the video encoder, which is characterized by its operational distortion-rate (DR) function D_e(β, R); i.e., the average distortion D_e is expressed as a function of the average bit rate R and the INTRA rate β. The common DR relationship is extended by the INTRA rate β because of its significant influence on error resilience. In fact, β is used as the first important parameter for system optimization in this paper.
TABLE I: SUMMARY OF MODEL PARAMETERS

After source coding, the compressed video bitstream is prepared for transmission by the channel codec. Often, this involves packetization and some form of error control. In this paper we focus on forward error correction (FEC) that can be combined with interleaving to reduce the effect of burst errors. More specifically, we assume an (n, k) Reed–Solomon (RS) block code with a block size of n symbols including k information symbols. The second important parameter that is used for system optimization is the code rate r = k/n. By reducing the code rate, more channel coding redundancy is added to each codeword, which improves the error correction capability of the code while reducing the throughput at the same time.
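To make the effect of the code rate concrete, the following sketch (not taken from the paper) computes the probability that an (n, k) RS codeword cannot be corrected, under the simplifying assumption of independent symbol errors; the paper's own derivation instead uses the two-state Markov burst model introduced below, and the block length, information lengths, and error rate used here are illustrative values only.

```python
from math import comb

def rs_word_error_prob(n: int, k: int, p_s: float) -> float:
    """Probability that an (n, k) Reed-Solomon codeword is not decodable,
    assuming independent symbol errors with probability p_s.
    The code corrects up to t = (n - k) // 2 symbol errors per codeword."""
    t = (n - k) // 2
    # P(more than t of the n symbols are in error)
    return sum(comb(n, m) * p_s**m * (1.0 - p_s)**(n - m) for m in range(t + 1, n + 1))

# Illustrative: an 89-byte block (roughly one GOB) at a 1% symbol error rate
for k in (89, 81, 73, 65):
    print(f"r = {k/89:.2f}:  P_w ≈ {rs_word_error_prob(89, k, 0.01):.2e}")
```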
After channel encoding, the RS codewords are transmitted over the channel. We use a two-state Markov model to describe errors on the symbol level. As intuitive channel parameters, we use the average symbol error rate P_s and the average burst length L_B. Together with the total bit rate R_c, these two parameters completely describe the channel and can be used to, e.g., study the influence of burst errors versus independent symbol errors. Furthermore, the selected channel model allows calculation of the residual word error rate P_w(r, P_s, L_B) after channel decoding from the parameters of the Markov model and the code rate. Thus, the overall performance of the error control channel,
including a burst channel and an RS channel codec, can be de-
scribed analytically.
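Such a two-state Markov (Gilbert-type) symbol error process can be simulated in a few lines. The mapping below from the intuitive parameters (average symbol error rate, average burst length) to the state transition probabilities is the standard one for this kind of model and is an assumption of the sketch, not a formula quoted from the paper.

```python
import random

def simulate_symbol_errors(num_symbols: int, p_s: float, l_b: float, seed: int = 0):
    """Two-state Markov (good/bad) symbol error pattern.
    Assumptions of this sketch: errors occur only in the bad state, the bad
    state is left with probability 1/l_b (so bursts last l_b symbols on
    average), and the stationary probability of the bad state equals p_s."""
    p_bg = 1.0 / l_b                      # bad -> good
    p_gb = p_bg * p_s / (1.0 - p_s)       # good -> bad, so that P(bad) = p_s
    rng = random.Random(seed)
    bad = False
    pattern = []
    for _ in range(num_symbols):
        if bad:
            bad = rng.random() >= p_bg    # stay bad with probability 1 - p_bg
        else:
            bad = rng.random() < p_gb
        pattern.append(bad)
    return pattern

errors = simulate_symbol_errors(100_000, p_s=0.01, l_b=8.0)
print(sum(errors) / len(errors))          # close to the target symbol error rate
```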
Finally, the influence of residual errors on the decoded video
quality has to be considered. Depending on the error resilience
capabilities of the video decoder, a single lost codeword may
cause severe image distortion. Fast resynchronization of the bit-
stream and error concealment are two important issues that can
help to mitigate the effect of residual errors. Another important
issue is interframe error propagation because errors may be vis-
ible over many consecutive frames. Therefore, a model for in-
terframe error propagation is derived in this paper that describes
the additional distortion at the decoder, D_v(β, P_w), as a function of the INTRA rate β and the residual word error rate P_w.
After this brief description of each system component, it is interesting to discuss the interactions and tradeoffs that influence the overall distortion D_d. First consider a variation of the code rate r (see Section V-B). Note that for a given channel bit rate R_c, the code rate controls the bit allocation between source and channel coding. This has two effects on the picture quality of the video signal at the decoder output. First, a reduction of r reduces the bit rate available to the video encoder and thus increases the distortion at the encoder regardless of transmission errors. The actual increase in distortion is determined by the operational DR function D_e(β, R) of the video encoder. On the other hand, the residual word error rate is reduced when reducing r, as determined by the properties of the error control channel according to P_w(r, P_s, L_B). Finally, a reduction in P_w leads to a reduction in D_v depending on several implementation issues as discussed above. Considering the total distortion D_d at the video decoder output, these interactions of the various components make it difficult to select the optimum code rate. Basically, the characteristic of each component may have significant influence.
Now consider a variation of the INTRA rate β, which is used as the second important optimization parameter in this paper (see Section V-A). Since INTRA coded macroblocks do not depend on the previous frame, error propagation can be reduced by increasing the number of INTRA coded macroblocks, thus reducing D_v. However, INTRA coding also reduces the coding efficiency compared to motion compensated prediction. Hence, the distortion at the encoder D_e is increased for a fixed bit rate R. Whether or not an increase in β is advantageous for the overall distortion D_d depends on the actual amount of increase/decrease in each component. This illustrates that each component needs to be modeled accurately before system optimization can be attempted. This is particularly true for a joint optimization of β and r (see Section V-C).
B. Simulation Environment
The simulation environment we use in this paper to verify the
derived model is described as follows. As source signals, we use
the QCIF test sequences Mother&Daughter and Foreman which
are encoded at 12.5 fps using 150 and 125 frames, respectively.
The sequences are selected because of their different character-
istic in motion and spatial detail. Although the model can also
be applied to other test sequences (see [21]), we do not provide
additional results because the selected sequences are sufficient
to discuss the effect of different source statistics.
For source coding, we use an H.263 compliant video en-
coder. No H.263 options are used, however, each Group Of
Blocks (GOB) is encoded with a header to improve resynchro-
nization. The encoder operates at a constant bit rate
which
is enforced by a simple rate control that is described as fol-
lows. Each frame is encoded with a fixed quantizer step size,
which is adapted frame by frame to obtain a given target bit
budget. The adaptation of the quantizer step size is performed
as follows. First, the mode decision is performed according to
TMN5 [22] for the whole frame, and then the resulting predic-
tion error is transformed and quantized with different quantizer
step sizes. Finally, the value that minimizes the difference be-
tween the accumulated number of transmitted bits and target
bits is selected. This rate control reduces buffer variations to an
acceptable amount, and hence allows the transmission over a
constant bit rate channel with limited delay. In practice, other
rate control algorithms should be used that can further reduce
buffer variations at improved performance. However, since rate
control is not the focus of the paper, the above approach is suf-
ficient.
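As a rough illustration of the frame-level quantizer adaptation described above, the sketch below selects, for each frame, the quantizer step size that keeps the accumulated bit count closest to the accumulated target. The bit-production model used for the toy frames is a stand-in invented for this example; a real H.263 encoder would supply the actual bit counts per candidate step size.

```python
def select_quantizer(frame_bits, bits_so_far, target_so_far, q_values):
    """Pick the quantizer step size whose resulting frame size brings the
    accumulated number of transmitted bits closest to the accumulated target."""
    return min(q_values, key=lambda q: abs(bits_so_far + frame_bits(q) - target_so_far))

# Toy stand-in: assume a frame produces roughly c / q bits at step size q.
def toy_frame_bits(c):
    return lambda q: c / q

target_per_frame = 6400                      # e.g., 80 kbit/s at 12.5 fps
q_values = range(2, 32)
bits_so_far = target_so_far = 0
for c in (160_000, 220_000, 90_000):         # "complexity" of three toy frames
    target_so_far += target_per_frame
    q = select_quantizer(toy_frame_bits(c), bits_so_far, target_so_far, q_values)
    bits_so_far += toy_frame_bits(c)(q)
    print(f"q = {q:2d}, bits = {bits_so_far:8.0f}, target = {target_so_far}")
```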
Another issue that is related to the coding control of the
video encoder is the INTRA update scheme employed. Several
schemes have been proposed in the literature that either con-
sider the activity of image regions [23], [24], vary the shape
of INTRA update patterns [9], or include the INTRA mode
decision in a rate-distortion optimized encoding framework
[25]–[27]. In a very common scheme, which is also recom-
mended in H.263, each macroblock is assigned a counter that
is incremented if the macroblock is encoded in interframe
mode. If the counter reaches a threshold
( update interval),
the macroblock is encoded in INTRA mode and the counter
is reset to zero. By assigning a different initial offset to each
macroblock, the updates of individual macroblocks can be
spread out in time. In our simulations, we use a very similar
update scheme, however, with a variable threshold
instead
of the fixed value of
that is recommended in H.263.
The only difference is that we also increment the counter for
skipped (i.e., UNCODED) macroblocks to guarantee a regular
update of all image regions.
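The counter-based INTRA update scheme can be sketched as follows. The macroblock count, update interval, and offset pattern below are illustrative; the only deliberate match to the paper's variant is that counters are also incremented for skipped macroblocks.

```python
def intra_refresh_schedule(num_mbs: int, update_interval: int, num_frames: int):
    """Counter-based INTRA update: every macroblock carries a counter that is
    incremented in each frame (including skipped/UNCODED macroblocks); once it
    reaches the update interval N the macroblock is forced to INTRA mode and
    the counter is reset. Staggered initial offsets spread the updates in time."""
    counters = [mb % update_interval for mb in range(num_mbs)]
    schedule = []
    for _ in range(num_frames):
        intra_mbs = []
        for mb in range(num_mbs):
            counters[mb] += 1
            if counters[mb] >= update_interval:
                intra_mbs.append(mb)
                counters[mb] = 0
        schedule.append(intra_mbs)
    return schedule

# 99 macroblocks per QCIF frame and N = 11 -> 9 INTRA macroblocks per frame (about 9%)
for frame, mbs in enumerate(intra_refresh_schedule(99, 11, 3)):
    print(frame, len(mbs), mbs[:3], "...")
```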
The channel parameters are selected as follows. Considering the different complexity of the sequences, we chose a total channel bit rate of 80 kbit/s for Mother&Daughter and 200 kbit/s for Foreman. This allows variation of the INTRA rate and code rate over a wide range without suffering too high distortions or buffer overflows. Unless otherwise noted, the average burst length is set to a fixed value, while the symbol error rate is selected from a fixed set of values.
The parameters of the RS code are considered next. We use the very common choice of 8 bit per symbol, i.e., one symbol corresponds to one byte. The block size is set to the average GOB size, which results in n = 89 bytes (80 000/12.5/9 ≈ 712 bit) for the Mother&Daughter and n = 222 bytes (200 000/12.5/9 ≈ 1778 bit) for the Foreman sequence. Note that this limits the delay introduced by channel coding to one GOB, and therefore also allows for conversational services
with their strict delay constraints. The amount of information
symbols k is varied in increments of 8 bytes to achieve different
code rates.
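The block-size arithmetic above can be reproduced directly; the helper below merely restates that calculation, and the list of selectable code rates follows from varying k in steps of 8 bytes.

```python
def gob_block_size_bytes(bit_rate_bps: float, fps: float, gobs_per_frame: int) -> int:
    """Average GOB size in bytes, used as the RS block length n (one codeword per GOB)."""
    return round(bit_rate_bps / fps / gobs_per_frame / 8)

n_md = gob_block_size_bytes(80_000, 12.5, 9)     # Mother&Daughter -> 89 bytes
n_fm = gob_block_size_bytes(200_000, 12.5, 9)    # Foreman         -> 222 bytes

# k is varied in steps of 8 bytes, giving the selectable code rates r = k / n.
rates_md = [round(k / n_md, 2) for k in range(n_md, 0, -8)]
print(n_md, n_fm, rates_md[:5])
```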
Finally, we need to consider the operation of the video de-
coder in the case of errors. If the RS decoder fails to correct
the transmission errors in a block, the video decoder receives
an error indication or detects that there has been an error due to
bit stream syntax violations. In either case, error concealment
is done for any GOB that overlaps with the lost packet. No spe-
cial packetization is used, i.e., new GOB’s are not necessarily
aligned with the beginning of a packet. For error concealment,
the previous-frame GOB is simply copied to the current frame
buffer.
C. Distortion Measure
For the evaluation of the video transmission system, it is nec-
essary to average the distortion over the whole sequence in order
to provide a single figure of merit. Even though the time aver-
aged squared error is somewhat questionable as a measure of
subjective quality, this approach is still very useful, e.g., to pro-
vide an overview for a large set of simulations. Therefore, the
video quality is measured as the Mean-Squared-Error (MSE)
averaged over all frames of the video sequence throughout this
paper. Since PSNR is a measure more common in the video
coding community, we use PSNR = 10 log10(255^2 / MSE) to illustrate simulation results. Note that the average PSNR is often
computed by first computing the PSNR for each frame and av-
eraging in time afterwards. The definition used in this paper al-
lows a better theoretical analysis (see Section III-B) and is more
consistent with subjective quality for strong quality variations.
In practice, however, there is no significant difference between
the two definitions.
Note that we need to distinguish between the picture quality
at the encoder and the picture quality at the decoder. Using D_e to describe the overall MSE for a whole sequence after encoding, we obtain

PSNR_e = 10 log10(255^2 / D_e)    (1)

for the corresponding PSNR value. At the decoder side we need
to recall that the result depends on the probabilistic nature of
the channel. Hence, the averaged distortion over many channel
realizations has to be considered. For the simulation results in
this paper, we use 30 random channel realizations for each par-
ticular setting of the video transmission system and average the
MSE over all frames and realizations. The resulting MSE and
PSNR are denoted D_d and

PSNR_d = 10 log10(255^2 / D_d)    (2)

respectively. In order to ensure that the distortion at the decoder is measured in a steady state, only the last 50 encoded frames are used to calculate D_d and PSNR_d.
As mentioned above, the overall MSE D_d is actually a superposition of two distortion types: the distortion D_e caused by signal compression and the distortion D_v caused by residual errors and interframe error propagation. Assuming that D_e and D_v are uncorrelated, we can calculate the overall MSE as

D_d = D_e + D_v.    (3)
Our experiments indicate that this assumption is valid. Even
though transmission errors may be clustered around active
regions, and thus their magnitude may be correlated with the
coding errors, usually their sign is not correlated to the sign of
the coding errors.
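A minimal sketch of the distortion bookkeeping in (1)-(3), assuming 8-bit video (peak value 255) and the symbol names used in the reconstruction above (D_e, D_v, D_d):

```python
import math

def psnr_from_mse(mse: float, peak: float = 255.0) -> float:
    """PSNR in dB corresponding to an MSE value, as in (1) and (2)."""
    return 10.0 * math.log10(peak * peak / mse)

def overall_mse(d_e: float, d_v: float) -> float:
    """Overall decoder MSE as the sum of encoder distortion and error-induced
    distortion, assuming the two are uncorrelated, as in (3)."""
    return d_e + d_v

d_e, d_v = 20.0, 15.0                     # illustrative MSE values
print(psnr_from_mse(d_e), psnr_from_mse(overall_mse(d_e, d_v)))
```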
However, it should be noted that (3) combines two distortion
types that are likely to be perceived differently. The distortion D_e is caused by signal compression and consists of blocking artifacts, mosquito noise, ringing, blurring, etc. The distortion D_v introduced by transmission errors consists of severe destruction of image content and may be large and infrequent. Subjective tests are needed to determine how D_e and D_v shall be combined to give the best possible approximation of subjective
quality.
If subjective tests show that, e.g., the distortion D_v caused by transmission errors is more annoying than the distortion D_e caused by the video encoder, (3) and (2) can be changed to a weighted sum or some other function of D_e and D_v. The
determination of such a subjective quality function is beyond
the scope of this paper and is left to future research.
III. ANALYSIS OF THE VIDEO CODEC
In this section, we analyze the performance of the video en-
coder and decoder. Although we use the ITU-T H.263 [13] video
compression standard throughout this paper, the model derived
can be used for other codecs that are based on hybrid motion
compensation.
In the following section we model the distortion-rate perfor-
mance of the video encoder. Then we introduce an analytical
model for the error propagation at the video decoder which can
explain the cumulative effect of transmission errors. We focus
on the main results and refer to the Appendixes for most deriva-
tions.
A. Video Encoder
In this section we model the Distortion-Rate (DR) perfor-
mance of a hybrid motion compensated video encoder. The
proposed model is an empirical model that is not derived
analytically. Instead, we focus on the input–output behavior
of the video encoder and emphasize simplicity and usability
over a complete theoretical description. On the one hand,
this approach is taken because we want to describe a com-
plete transmission system, which requires the complexity of
individual components to be kept at a reasonable level. On
the other hand, we found that theoretically founded models
often cannot describe experimental results very accurately due
to simplistic assumptions. For example, such a theoretically
founded model for the performance of motion compensated
prediction is described in [28] and [29], where the DR per-
formance is analyzed by deriving the power spectral density
of the prediction error with respect to the probability density
function of the displacement error. Although this model pro-
vides very interesting insights, it cannot describe the measured

DR performance of an H.263 encoder with sufficient accuracy.
Similar problems can be observed for the description of the
DR performance in transform coding [30] and DCT coding in
particular. Although several empirical distortion-rate models
have been published (e.g., [31]–[34]), they are usually used for
rate control and cannot be used to model the distortion of an
entire video encoder for a given rate.
To avoid these limitations without an increase in model com-
plexity, we use a simple equation that relates the distortion at the
encoder D_e to the relevant parameters. In the simulation scenario that we consider, there are two parameters with a significant impact on D_e, namely the source rate R that is allocated to the video encoder, and second, the percentage of INTRA coded macroblocks (INTRA rate) β that is enforced by the coding control to improve error robustness. The general idea to use empirical models to describe DR performance has also been used for rate control as, for example, in [32]; however, our focus is on the description of the overall performance, i.e., the average distortion for a whole sequence given R and β.
One drawback of this approach is that the necessary model
parameters cannot be derived from commonly used signal sta-
tistics, like variance, correlation, or the power spectral density.
Instead, the parameters need to be estimated by fitting the model
to a subset of measured data points from the DR curve. Since the
proposed model uses only six parameters (see below), the nec-
essary subset is relatively small and can be obtained with rea-
sonable complexity. However, the obtained parameters are spe-
cific for a given video sequence and video codec. Furthermore,
the interpretation of these parameters is not always obvious.
This makes it difficult to, e.g., extend results from a sequence
with “complex motion” to a sequence with “moderate motion.”
However, we found that the model can describe the DR perfor-
mance of a wide range of test sequences with very good accu-
racy, once the parameters are selected correctly. Furthermore,
the simplicity of the model significantly increases its usability
and thus, in practice, outweighs the described drawbacks. Nev-
ertheless, it should be noted that a model of similar simplicity
that is founded on theoretical analysis would be highly desir-
able.
We use the DR model

D_e(R) = D_0 + θ / (R − R_0)    (4)

where D_e is the distortion of the encoded sequence, measured as the MSE, and R is the output rate of the video encoder. The remaining variables (D_0, θ, and R_0) are the parameters of the DR model which depend on the encoded sequence as well as on the percentage of INTRA coded macroblocks β. We have found that the relationship with β is approximately linear, i.e., each of D_0, θ, and R_0 is an affine function of β,

D_0(β) = D_0' + D_0''·β,    θ(β) = θ' + θ''·β,    R_0(β) = R_0' + R_0''·β    (5)

such that the total number of model parameters is six. According to (5), it is sufficient to measure the DR curves for only two different INTRA rates. Intermediate values can then be obtained by linear interpolation. This is also the approach used in the following to obtain the model parameters.
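The following sketch evaluates the DR model (4) with its parameters interpolated linearly in the INTRA rate as in (5). The parameter values and the two reference INTRA rates are made up for illustration, and the symbol names follow the reconstruction used above rather than the paper's original typesetting.

```python
def encoder_distortion(rate, beta, params_lo, params_hi):
    """Encoder MSE from the DR model (4), D_e(R) = D0 + theta / (R - R0),
    with D0, theta and R0 interpolated linearly in the INTRA rate beta (5).
    params_lo / params_hi hold the values fitted at two reference INTRA rates."""
    def lerp(key):
        w = (beta - params_lo["beta"]) / (params_hi["beta"] - params_lo["beta"])
        return params_lo[key] + w * (params_hi[key] - params_lo[key])
    d0, theta, r0 = lerp("D0"), lerp("theta"), lerp("R0")
    return d0 + theta / (rate - r0)

# Illustrative parameter sets (rate in kbit/s, distortion as MSE); values are made up.
p_lo = {"beta": 0.00, "D0": 1.0, "theta":  900.0, "R0": 10.0}
p_hi = {"beta": 0.33, "D0": 2.0, "theta": 1400.0, "R0": 15.0}
print(encoder_distortion(80.0, beta=0.11, params_lo=p_lo, params_hi=p_hi))
```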
Fig. 2 shows that (4) and (5) approximate the DR performance
of the video encoder very accurately. Although the experimental
results are obtained with an H.263 encoder, the DR curves for
other hybrid motion compensated video encoders, e.g., H.261
[12], MPEG-1 [15], or MPEG-2 [16], exhibit very similar be-
havior.
The model (4) was fitted to the measured DR points for two different INTRA rates (with a larger upper value for the Foreman sequence). The fitting was done by minimizing the sum of squared MSE differences between the model and the measured points. This resulted in two sets of parameters (D_0, θ, R_0) for each sequence. These two parameter sets together consist of six values, thus allowing us to determine the linear coefficients in (5).
The model parameters D_0, θ, and R_0 obtained in this way are used to interpolate the DR curves for other INTRA rates β. The intermediate curves in Fig. 2 for 3%, 6%, 11%, and
22% (and 33%, 44% for Foreman) were generated by using (4)
and interpolating the parameters according to (5). The maximal
PSNR deviation between the model fitted that way and the
measured DR points is 0.22 dB for the Mother&Daughter
sequence and 0.3 dB for the Foreman sequence (Fig. 2).
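The least-squares fit described above can be reproduced with a standard routine such as scipy.optimize.curve_fit; the measurement points below are made up and only show the mechanics of estimating D_0, θ, and R_0 for one INTRA rate.

```python
import numpy as np
from scipy.optimize import curve_fit

def dr_model(rate, d0, theta, r0):
    """DR model (4): encoder MSE as a function of the encoder output rate."""
    return d0 + theta / (rate - r0)

# Measured (rate in kbit/s, MSE) points for one INTRA rate -- illustrative numbers only.
rates = np.array([30.0, 50.0, 80.0, 120.0, 200.0])
mses  = np.array([45.0, 25.0, 14.0,  9.0,   5.5])

params, _ = curve_fit(dr_model, rates, mses, p0=(1.0, 1000.0, 5.0))
print(dict(zip(("D0", "theta", "R0"), np.round(params, 2))))
```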
Note that the parameters D_0, θ, and R_0 characterize the coding of the input video sequence with the given hybrid motion compensated encoder, in this example Mother&Daughter or Foreman coded with H.263 in baseline mode. The parameters depend very much on the spatial detail and the amount of motion in the sequence; e.g., a given parameter may be low for a sequence with high motion and little spatial detail, and high for a sequence with moderate motion and high spatial detail.
B. Video Decoder
While motion compensated prediction yields significant
gains in coding efficiency, it also introduces interframe error
propagation in the case of transmission errors. Since these
errors decay slowly, they are very annoying. To optimize the
overall performance of video transmission systems in noisy
environments, it is therefore important to consider the effect
of error propagation. While several heuristic approaches have
been investigated in the literature to reduce the influence of
error propagation (e.g., [23], [24], and [35]), up until now
no theoretical framework has been proposed to model the
influence of transmission errors on the decoded picture quality.
The model proposed in the following includes the effects of
INTRA coding and spatial loop filtering and corresponds to
simulation results very accurately.
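Since the derivation of the propagation model itself follows below, only its qualitative behaviour is sketched here: an error introduced in one frame is attenuated in later frames by the spatial loop filter (modelled as a constant leakage factor, an assumption of this sketch) and by INTRA updates. This toy recursion is not the paper's equations (8) and (9).

```python
def propagated_error_variance(sigma_u0: float, beta: float, leakage: float, num_frames: int):
    """Toy model of interframe error propagation: an error of variance sigma_u0
    introduced in frame 0 is reduced in every later frame by a leakage factor
    (spatial loop filtering) and by the fraction beta of INTRA-updated
    macroblocks. Illustrative only; not the paper's derivation."""
    variances = [sigma_u0]
    for _ in range(1, num_frames):
        variances.append(variances[-1] * (1.0 - beta) * leakage)
    return variances

print([round(v, 1) for v in propagated_error_variance(100.0, beta=0.11, leakage=0.93, num_frames=8)])
```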
Note that two different types of errors contribute to the overall
distortion at the decoder. First, the errors that are caused by
signal compression at the encoder (D_e) and, second, errors that are caused by residual errors which cannot be corrected by the channel decoder. Since the first type of error is sufficiently described by (4), we now focus on the second type of error and use the variable D_v to refer to it.
A simplified block diagram of a hybrid motion compensated
video codec is illustrated in Fig. 3, together with the relevant pa-
rameters that are introduced in the following. We describe errors

Citations
More filters
Journal ArticleDOI
TL;DR: This article emphasizes the processing that is done on the luminance components of the video, and provides an overview of the techniques used for bit-rate reduction and the corresponding architectures that have been proposed.
Abstract: Throughout this article, we concentrate on the transcoding of block-based video coding schemes that use hybrid discrete cosine transform (DCT) and motion compensation (MC). In such schemes, the frames of the video sequence are divided into macroblocks (MBs), where each MB typically consists of a luminance block (e.g., of size 16 × 16, or alternatively, four 8 × 8 blocks) along with corresponding chrominance blocks (e.g., 8 × 8 Cb and 8 × 8 Cr). This article emphasizes the processing that is done on the luminance components of the video. In general, the chrominance components can be handled similarly and will not be discussed in this article. We first provide an overview of the techniques used for bit-rate reduction and the corresponding architectures that have been proposed. Then, we describe the advances regarding spatial and temporal resolution reduction techniques and architectures. Additionally, an overview of error resilient transcoding is also provided, as well as a discussion of scalable coding techniques and how they relate to video transcoding. Finally, the article ends with concluding remarks, including pointers to other works on video transcoding that have not been covered in this article, as well as some future directions.

736 citations

Journal ArticleDOI
TL;DR: An analytic solution for adaptive intra mode selection and joint source-channel rate control under time-varying wireless channel conditions is derived and significantly improves the end-to-end video quality in wireless video coding and transmission.
Abstract: We first develop a rate-distortion (R-D) model for DCT-based video coding incorporating the macroblock (MB) intra refreshing rate. For any given bit rate and intra refreshing rate, this model is capable of estimating the corresponding coding distortion even before a video frame is coded. We then present a theoretical analysis of the picture distortion caused by channel errors and the subsequent inter-frame propagation. Based on this analysis, we develop a statistical model to estimate such channel errors induced distortion for different channel conditions and encoder settings. The proposed analytic model mathematically describes the complex behavior of channel errors in a video coding and transmission system. Unlike other experimental approaches for distortion estimation reported in the literature, this analytic model has very low computational complexity and implementation cost, which are highly desirable in wireless video applications. Simulation results show that this model is able to accurately estimate the channel errors induced distortion with a minimum delay in processing. Based on the proposed source coding R-D model and the analytic channel-distortion estimation, we derive an analytic solution for adaptive intra mode selection and joint source-channel rate control under time-varying wireless channel conditions. Extensive experimental results demonstrate that this scheme significantly improves the end-to-end video quality in wireless video coding and transmission.

390 citations


Cites background or methods or result from "Analysis of video transmission over..."

  • ...Standard video coding schemes, such as H.263 and MPEG-4, employ a motion-compensation based discrete cosine transform (MC-DCT) coding scheme....

    [...]

  • ...Notice that in standard video coding, such as H.263 and MPEG-4, monotonically increases with ....

    [...]

  • ...The channel-distortion model and the corresponding estimation scheme are described in Section III....

    [...]

  • ...REFERENCES [1] ITU-T, “Video coding for low bit rate communications,” ITU-T Recommendation H.263, version 1, version 2, Jan. 1998....

    [...]

  • ...Due to the limited bandwidth of the wireless channels, video signals have to be highly compressed by efficient coding algorithms, such as H.263 [1] and MPEG-4 [2]....

    [...]

Patent
Petrus J. L. Van Beek1
25 Jun 2003
TL;DR: In this article, a transmission system suitable for video where a sender encodes video for transmission to a receiver at an adjustable data rate is presented, where the data rate may be adjusted using a delay constraint that constrains the expected delay of transmitted packets.
Abstract: A transmission system suitable for video where a sender encodes video for transmission to a receiver at an adjustable data rate. The data rate may be adjusted using a delay constraint that constrains the expected delay of transmitted packets. The expected delay may be measured from a time that a transmitter encodes a packet to a time that a receiver decodes a packet.

258 citations

Journal ArticleDOI
TL;DR: This work studies the problem of video streaming over multi-channel multi-radio multihop wireless networks, and develops fully distributed scheduling schemes with the goals of minimizing the video distortion and achieving certain fairness, and proposes a media-aware distortion-fairness strategy.
Abstract: An important issue of supporting multi-user video streaming over wireless networks is how to optimize the systematic scheduling by intelligently utilizing the available network resources while, at the same time, to meet each video's Quality of Service (QoS) requirement. In this work, we study the problem of video streaming over multi-channel multi-radio multihop wireless networks, and develop fully distributed scheduling schemes with the goals of minimizing the video distortion and achieving certain fairness. We first construct a general distortion model according to the network?s transmission mechanism, as well as the rate distortion characteristics of the video. Then, we formulate the scheduling as a convex optimization problem, and propose a distributed solution by jointly considering channel assignment, rate allocation, and routing. Specifically, each stream strikes a balance between the selfish motivation of minimizing video distortion and the global performance of minimizing network congestions. Furthermore, we extend the proposed scheduling scheme by addressing the fairness problem. Unlike prior works that target at users' bandwidth or demand fairness, we propose a media-aware distortion-fairness strategy which is aware of the characteristics of video frames and ensures max-min distortion-fairness sharing among multiple video streams. We provide extensive simulation results which demonstrate the effectiveness of our proposed schemes.

242 citations


Cites background or methods from "Analysis of video transmission over..."

  • ...According to [21], Dcomp can be approximated by:...

    [...]

  • ...For the distortion of wireless video transmission, we employ an additive model to capture the total video distortion as [10], [21], [22], and the overall distortion Dall can be obtained by:...

    [...]

  • ...where α depends on parameters related to the compressed video sequence [21]....

    [...]

Proceedings ArticleDOI
06 Apr 2003
TL;DR: A model is proposed that accurately estimates the expected distortion by explicitly accounting for the loss pattern, inter-frame error propagation, and the correlation between error frames and the accuracy of the proposed model is validated with JVT/H.
Abstract: Video communication is often afflicted by various forms of losses, such as packet loss over the Internet. The paper examines the question of whether the packet loss pattern, and in particular the burst length, is important for accurately estimating the expected mean-squared error distortion. Specifically, we (1) verify that the loss pattern does have a significant effect on the resulting distortion, (2) explain why a loss pattern, for example a burst loss, generally produces a larger distortion than an equal number of isolated losses, and (3) propose a model that accurately estimates the expected distortion by explicitly accounting for the loss pattern, inter-frame error propagation, and the correlation between error frames. The accuracy of the proposed model is validated with JVT/H.26L coded video and previous frame concealment, where for most sequences the total distortion is predicted to within ±0.3 dB for burst loss of length two packets, as compared to prior models which underestimate the distortion by about 1.5 dB. Furthermore, as the burst length increases, our prediction is within ±0.7 dB, while prior models degrade and underestimate the distortion by over 3 dB.

209 citations


Cites background or methods from "Analysis of video transmission over..."

  • ...The problem of error-resilient video communication has received significant attention in recent years, and a variety of techniques have been proposed, including intra/inter-mode switching [1, 2], dynamic control of prediction dependencies [3], forward error correction [4], and multiple description coding [5]....

    [...]

  • ...Prior work on modeling the effect of losses generally model the distortion as being proportional to the number of losses that occur [2, 7]....

    [...]

  • ...For example [2] carefully analyzes and models the distortion for a single (isolated) loss (accounting for error propagation, intra refresh, and spatial filtering), and model the effect of multiple losses as the superposition of multiple independent losses....

    [...]

  • ...In [2], the loop filter is approximated by a Gaussian low-pass filter....

    [...]

References
More filters
Book
01 Jan 1991
TL;DR: The author examines the role of entropy, inequality, and randomness in the design of codes and the construction of codes in the rapidly changing environment.
Abstract: A comprehensive textbook on information theory, covering entropy and mutual information, the asymptotic equipartition property, data compression, channel capacity, differential entropy, the Gaussian channel, rate distortion theory, universal source coding, Kolmogorov complexity, network information theory, and information-theoretic inequalities.

45,034 citations

Journal ArticleDOI
01 May 1998
TL;DR: In this paper, a review of error control and concealment in video communication is presented, which are described in three categories according to the roles that the encoder and decoder play in the underlying approaches.
Abstract: The problem of error control and concealment in video communication is becoming increasingly important because of the growing interest in video delivery over unreliable channels such as wireless networks and the Internet. This paper reviews the techniques that have been developed for error control and concealment. These techniques are described in three categories according to the roles that the encoder and decoder play in the underlying approaches. Forward error concealment includes methods that add redundancy at the source end to enhance error resilience of the coded bit streams. Error concealment by postprocessing refers to operations at the decoder to recover the damaged areas based on characteristics of image and video signals. Last, interactive error concealment covers techniques that are dependent on a dialogue between the source and destination. Both current research activities and practice in international standards are covered.

1,611 citations

01 Jan 1996

1,354 citations

Journal ArticleDOI
J. Ribas-Corbera, Shaw-Min Lei1
TL;DR: This work presents a simple rate control technique that achieves high quality and low buffer delay by smartly selecting the values of the quantization parameters in typical discrete cosine transform video coders, and implements this technique in H.263 and MPEG-4 coders.
Abstract: An important motivation for the development of the emerging H.263+ and MPEG-4 coding standards is to enhance the quality of highly compressed video for two-way, real-time communications. In these applications, the delay produced by bits accumulated in the encoder buffer must be very small, typically below 100 ms, and the rate control strategy is responsible for encoding the video with high quality and maintaining a low buffer delay. In this work, we present a simple rate control technique that achieves these two objectives by smartly selecting the values of the quantization parameters in typical discrete cosine transform video coders. To do this, we derive models for bit rate and distortion in this type of coders, in terms of the quantization parameters. Using Lagrange optimization, we minimize distortion subject to the target bit constraint, and obtain formulas that indicate how to choose the quantization parameters. We implement our technique in H.263 and MPEG-4 coders, and compare its performance to TMN7 and VM7 rate control when the encoder buffer is small, for a variety of video sequences and bit rates. This new method has been adopted as a rate control tool in the test model TMN8 of H.263+ and (with some modifications) in the verification model VM8 of MPEG-4.

717 citations

Journal ArticleDOI
TL;DR: The rationale behind the development of the EDGE concept is given, the technology will provide significantly higher user bit rates and spectral efficiency, and performance is addressed by means of system simulations.
Abstract: Two of the major second-generation standards, GSM and TDMA/136, have built the foundation to offer a common global radio access for data services. Through use of a common physical layer, EDGE, both standards will have the same evolutionary path toward providing third-generation services. EDGE is currently subject to standardization in TIA TR45.3 and ETSI SMG, a process which will be finalized at the end of 1999. Compared to the existing data services in GSM and TDMA/136, EDGE will provide significantly higher user bit rates and spectral efficiency. EDGE can be introduced in these systems in a smooth way, using existing frequency plans of already deployed networks. This article gives the rationale behind the development of the EDGE concept, presents the EDGE technology, and addresses performance by means of system simulations.

462 citations

Frequently Asked Questions (12)
Q1. What are the two important issues that can help to mitigate the effect of residual errors?

Fast resynchronization of the bitstream and error concealment are two important issues that can help to mitigate the effect of residual errors. 


Other prediction techniques like overlapped block motion compensation (OBMC) or deblocking filters inside the DPCM loop may also contribute to the overall loop filter. 

For a channel characterized by the given symbol error rate and average burst length, only one out of 10 000 blocks will have to be discarded, i.e., less than one GOB within 1000 frames.

As source signals, the authors use the QCIF test sequences Mother&Daughter and Foreman which are encoded at 12.5 fps using 150 and 125 frames, respectively. 

A reduction of the code rate r reduces the bit rate available to the video encoder and thus increases the distortion at the encoder regardless of transmission errors.

Note that motion compensated prediction may cause spatial error propagation, such that errors may actually "survive" one INTRA update period.

The general idea to use empirical models to describe DR performance has also been used for rate control as, for example, in [32]; however, their focus is on the description of the overall performance, i.e., the average distortion for a whole sequence given the rate R and INTRA rate β.

A lot of FEC would be needed to correct this error burst, thus lowering the available rate for the video bitstream in many blocks which are not affected by channel errors at all.

It also depends on several implementation issues, like packetization, resynchronization, and error concealment, as well as on the encoded video sequence.

For a given sequence, fixed packet size, and given decoder implementation, it can be shown that the error variance that is introduced can be expressed as in (6). This linear relation is only valid for low residual error rates.

Together with the total bit rate R_c, these two parameters completely describe the channel and can be used to, e.g., study the influence of burst errors versus independent symbol errors.