Journal Article•DOI•

A framework for efficient progressive fine granularity scalable video coding

Feng Wu, Shipeng Li, Ya-Qin Zhang
01 Mar 2001 - IEEE Transactions on Circuits and Systems for Video Technology (IEEE) - Vol. 11, Iss. 3, pp. 332-344
TL;DR: Experimental results show that the PFGS framework can improve coding efficiency by more than 1 dB over the FGS scheme in terms of average PSNR, while keeping all the original properties, such as fine granularity, bandwidth adaptation, and error recovery.
Abstract: A basic framework for efficient scalable video coding, namely progressive fine granularity scalable (PFGS) video coding, is proposed. Similar to the fine granularity scalable (FGS) video coding in MPEG-4, the PFGS framework has all the features of FGS, such as fine granularity bit-rate scalability, channel adaptation, and error recovery. On the other hand, different from FGS coding, the PFGS framework uses multiple layers of references with increasing quality to make motion prediction more accurate for improved video-coding efficiency. However, using multiple layers of references with different quality also introduces several issues. First, extra frame buffers are needed for storing the multiple reconstructed reference layers, which would increase the memory cost and computational complexity of the PFGS scheme. Based on the basic framework, a simplified and efficient PFGS framework is further proposed. The simplified PFGS framework needs only one extra frame buffer while achieving almost the same coding efficiency as the original framework. Second, there might be an undesirable increase and fluctuation of the coefficients to be coded when switching from a low-quality reference to a high-quality one, which could partially offset the advantage of using a high-quality reference. A further improved PFGS scheme eliminates this fluctuation of enhancement-layer coefficients by always using a single high-quality prediction reference for all enhancement layers. Experimental results show that the PFGS framework can improve coding efficiency by more than 1 dB over the FGS scheme in terms of average PSNR, while keeping all the original properties, such as fine granularity, bandwidth adaptation, and error recovery. A simple simulation of transmitting PFGS video over a wireless channel further confirms the error robustness of the PFGS scheme, although the advantages of PFGS have not been fully exploited.
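The prediction idea at the heart of the abstract (FGS predicts only from the coarse base layer, PFGS may predict from a higher-quality enhancement reconstruction) can be illustrated with a toy numerical sketch. The scalar "frames", quantizer step sizes, and uniform quantizer below are illustrative assumptions, not the paper's actual codec:

```python
# Toy sketch of FGS vs. PFGS prediction on single-value "frames".
# The quantizer stands in for a coded reconstruction at a given layer;
# all numbers here are hypothetical, chosen only to show the effect.

def quantize(x, step):
    """Uniform quantizer standing in for a coded reconstruction."""
    return round(x / step) * step

frames = [10.0, 10.7, 11.3, 12.1]   # hypothetical pixel values over time
base_step, enh_step = 8.0, 1.0       # coarse base layer, finer enhancement

fgs_residuals, pfgs_residuals = [], []
prev_base = prev_enh = frames[0]
for cur in frames[1:]:
    # FGS: enhancement layers are always predicted from the base-layer
    # reconstruction of the previous frame.
    fgs_residuals.append(abs(cur - prev_base))
    # PFGS: enhancement layers may reference a higher-quality
    # (enhancement-layer) reconstruction, shrinking the prediction error.
    pfgs_residuals.append(abs(cur - prev_enh))
    prev_base = quantize(cur, base_step)
    prev_enh = quantize(cur, enh_step)

print(sum(fgs_residuals), sum(pfgs_residuals))
```

Because the higher-quality reference tracks the signal more closely, the PFGS-style residuals are smaller in total, which is the source of the coding-efficiency gain the abstract reports.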
Citations
Journal Article•DOI•
Dapeng Wu, Yiwei Thomas Hou, Wenwu Zhu, Ya-Qin Zhang, Jon M. Peha
TL;DR: Six key areas of streaming video are covered, including video compression, application-layer QoS control, continuous media distribution services, streaming servers, media synchronization mechanisms, and protocols for streaming media.
Abstract: Due to the explosive growth of the Internet and increasing demand for multimedia information on the Web, streaming video over the Internet has received tremendous attention from academia and industry. Transmission of real-time video typically has bandwidth, delay, and loss requirements. However, the current best-effort Internet does not offer any quality of service (QoS) guarantees to streaming video. Furthermore, for video multicast, it is difficult to achieve both efficiency and flexibility. Thus, Internet streaming video poses many challenges. In this article we cover six key areas of streaming video. Specifically, we cover video compression, application-layer QoS control, continuous media distribution services, streaming servers, media synchronization mechanisms, and protocols for streaming media. For each area, we address the particular issues and review major approaches and mechanisms. We also discuss the tradeoffs of the approaches and point out future research directions.

780 citations


Cites background from "A framework for efficient progressi..."

  • ...The essential difference between FGS and PFGS is that FGS only uses the base layer as a reference for motion prediction while PFGS uses multiple layers as references to reduce the prediction error, resulting in higher coding efficiency....

    [...]

  • ...PFGS shares the good features of FGS, such as fine granularity bit-rate scalability and error resilience....

    [...]

  • ...A variation of FGS is progressive fine granularity scalability (PFGS) [72]....

    [...]

  • ...Unlike FGS, which only has two layers, PFGS could have more than two layers....

    [...]

Journal Article•DOI•
TL;DR: The results show that at high SNR, the multiple description encoder does not need to fine-tune the optimization parameters of the system due to the correlated nature of the subcarriers, and FEC-based multiple description coding without temporal coding provides a greater advantage for smaller description sizes.
Abstract: Recently, multiple description source coding has emerged as an attractive framework for robust multimedia transmission over packet erasure channels. In this paper, we mathematically analyze the performance of n-channel symmetric FEC-based multiple description coding for a progressive mode of transmission over orthogonal frequency division multiplexing (OFDM) networks in a frequency-selective slowly-varying Rayleigh faded environment. We derive the expressions for the bounds of the throughput and distortion performance of the system in an explicit closed form, whereas the exact performance is given by an expression in the form of a single integration. Based on this analysis, the performance of the system can be numerically evaluated. Our results show that at high SNR, the multiple description encoder does not need to fine-tune the optimization parameters of the system due to the correlated nature of the subcarriers. It is also shown that, despite the bursty nature of the errors in a slow fading environment, FEC-based multiple description coding without temporal coding provides a greater advantage for smaller description sizes.
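The n-channel symmetric FEC-based MDC structure the abstract analyzes can be sketched in code: a progressive bitstream is split into layers, layer i is protected with an (n, i) MDS erasure code, and each description carries one symbol per layer, so any i received descriptions recover layers 1..i. The layer sizes, symbol values, and the small prime field GF(257) below are illustrative assumptions, not the paper's construction:

```python
# Hedged sketch of n-channel symmetric FEC-based multiple description
# coding for a progressive stream, using a Reed-Solomon-style MDS code
# built from polynomial evaluation/interpolation over GF(257).

P = 257  # toy prime field; real systems use GF(2^8) Reed-Solomon codes

def poly_mul(a, b):
    """Multiply polynomials (coefficient lists, lowest degree first) mod P."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % P
    return out

def encode_layer(symbols, n):
    """(n, k) MDS encode: treat the k symbols as polynomial coefficients
    and evaluate at x = 1..n; any k of the n outputs recover the layer."""
    return [sum(c * pow(x, i, P) for i, c in enumerate(symbols)) % P
            for x in range(1, n + 1)]

def decode_layer(points, k):
    """Lagrange interpolation from any k (x, y) pairs back to coefficients."""
    xs, ys = zip(*points[:k])
    coeffs = [0] * k
    for j in range(k):
        num, denom = [1], 1
        for m in range(k):
            if m != j:
                num = poly_mul(num, [(-xs[m]) % P, 1])
                denom = denom * (xs[j] - xs[m]) % P
        scale = ys[j] * pow(denom, P - 2, P) % P  # modular inverse via Fermat
        for i in range(k):
            coeffs[i] = (coeffs[i] + scale * num[i]) % P
    return coeffs

# Progressive stream split by importance: layer i gets an (n, i) code.
n = 4
layers = [[7], [3, 5], [2, 4, 6], [1, 8, 9, 10]]
descriptions = {x: [encode_layer(layer, n)[x - 1] for layer in layers]
                for x in range(1, n + 1)}

del descriptions[2]  # one description is lost on the channel

recovered = []
for i, layer in enumerate(layers):
    k = i + 1
    if k <= len(descriptions):  # enough descriptions survive for this layer
        points = [(x, d[i]) for x, d in descriptions.items()]
        recovered.append(decode_layer(points, k))

print(recovered)
```

With one of four descriptions lost, the first three layers of the progressive stream are still decoded exactly; only the least important layer is dropped, which is the graceful-degradation property the paper's distortion analysis quantifies.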

526 citations


Cites methods from "A framework for efficient progressi..."

  • ...In this paper, we mathematically analyze the performance of n-channel symmetric FEC-based multiple description coding for a progressive mode of transmission over orthogonal frequency division multiplexing (OFDM) networks in a frequency-selective slowly-varying Rayleigh faded environment....

    [...]

Journal Article•DOI•
Yao Wang, Amy R. Reibman, Shunan Lin
01 Jan 2005
TL;DR: In this article, the authors describe principles in designing multiple description coding (MDC) video coders employing temporal prediction and present several predictor structures that differ in their tradeoffs between mismatch-induced distortion and coding efficiency.
Abstract: Multiple description coding (MDC) is an effective means to combat bursty packet losses in the Internet and wireless networks. MDC is especially promising for video applications where retransmission is unacceptable or infeasible. When combined with multiple path transport (MPT), MDC enables traffic dispersion and hence reduces network congestion. This work describes principles in designing MD video coders employing temporal prediction and presents several predictor structures that differ in their tradeoffs between mismatch-induced distortion and coding efficiency. The paper also discusses example video communication systems integrating MDC and MPT.

494 citations


Cites methods from "A framework for efficient progressi..."

  • ...The MD-FEC algorithm discussed in Section III-G is applied to each GoP of the scalable bitstream from an MPEG-4 PFGS encoder [62] to generate MD video streams....

    [...]

Proceedings Article•DOI•
04 Nov 2003
TL;DR: A simple tree management algorithm is presented that provides the necessary path diversity and an adaptation framework for MDC based on scalable receiver feedback is described, which shows very significant benefits in using multiple distribution trees and MDC, with a 22 dB improvement in PSNR in some cases.
Abstract: We consider the problem of distributing "live" streaming media content to a potentially large and highly dynamic population of hosts. Peer-to-peer content distribution is attractive in this setting because the bandwidth available to serve content scales with demand. A key challenge, however, is making content distribution robust to peer transience. Our approach to providing robustness is to introduce redundancy, both in network paths and in data. We use multiple, diverse distribution trees to provide redundancy in network paths and multiple description coding (MDC) to provide redundancy in data. We present a simple tree management algorithm that provides the necessary path diversity and describe an adaptation framework for MDC based on scalable receiver feedback. We evaluate these using MDC applied to real video data coupled with real usage traces from a major news site that experienced a large flash crowd for live streaming content. Our results show very significant benefits in using multiple distribution trees and MDC, with a 22 dB improvement in PSNR in some cases.

484 citations


Cites methods from "A framework for efficient progressi..."

  • ...The input stream is from a layered codec; in our implementation, we use a variant of the PFGS codec (also known as the SMART codec) reported in [37]....

    [...]

Journal Article•DOI•
TL;DR: A novel framework for fully scalable video coding that performs open-loop motion-compensated temporal filtering (MCTF) in the wavelet domain (in-band) is presented, and inspired by recent work on advanced prediction techniques, an algorithm for optimized multihypothesis temporal filtering is proposed.
Abstract: A novel framework for fully scalable video coding that performs open-loop motion-compensated temporal filtering (MCTF) in the wavelet domain (in-band) is presented in this paper. Unlike the conventional spatial-domain MCTF (SDMCTF) schemes, which apply MCTF on the original image data and then encode the residuals using the critically sampled discrete wavelet transform (DWT), the proposed framework applies the in-band MCTF (IBMCTF) after the DWT is performed in the spatial dimensions. To overcome the inefficiency of MCTF in the critically-sampled DWT, a complete-to-overcomplete DWT (CODWT) is performed. Recent theoretical findings on the CODWT are reviewed from the application perspective of fully-scalable IBMCTF, and constraints on the transform calculation that allow for fast and seamless resolution-scalable coding are established. Furthermore, inspired by recent work on advanced prediction techniques, an algorithm for optimized multihypothesis temporal filtering is proposed in this paper. The application of the proposed algorithm in MCTF-based video coding is demonstrated, and similar improvements as for the multihypothesis prediction algorithms employed in closed-loop video coding are experimentally observed. Experimental instantiations of the proposed IBMCTF and SDMCTF coders with multihypothesis prediction produce single embedded bitstreams, from which subsets are extracted to be compared against the current state-of-the-art in video coding.
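The open-loop temporal filtering the abstract builds on can be sketched with one level of the Haar lifting transform over frame pairs. Motion compensation is omitted here (identity motion), and the tiny two-pixel "frames" are illustrative, so this is a simplification of MCTF, not the paper's IBMCTF scheme:

```python
# Toy sketch of one decomposition level of open-loop temporal filtering
# with the Haar lifting steps; real MCTF warps frames along motion
# trajectories before filtering, which is omitted here for brevity.

def mctf_haar(frames):
    """Split consecutive frame pairs into temporal low/high subbands."""
    lows, highs = [], []
    for a, b in zip(frames[::2], frames[1::2]):
        h = [y - x for x, y in zip(a, b)]           # predict step: residual
        l = [x + hh / 2 for x, hh in zip(a, h)]     # update step: average
        highs.append(h)
        lows.append(l)
    return lows, highs

def inverse_mctf_haar(lows, highs):
    """Invert the lifting steps exactly (open-loop: no drift by design)."""
    frames = []
    for l, h in zip(lows, highs):
        a = [x - hh / 2 for x, hh in zip(l, h)]
        b = [hh + x for hh, x in zip(h, a)]
        frames += [a, b]
    return frames

frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
lows, highs = mctf_haar(frames)
print(lows, highs)
print(inverse_mctf_haar(lows, highs))
```

The lifting structure inverts exactly, which is why open-loop MCTF coders can truncate the embedded bitstream at any rate without the predictor drift that closed-loop schemes must manage.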

193 citations


Cites methods from "A framework for efficient progressi..."

  • ...Then, the proposed framework is introduced as a modification of temporal filtering that allows the independent operation across different video resolutions by operating in the transform domain....

    [...]

References
Journal Article•DOI•
J. M. Shapiro
TL;DR: The embedded zerotree wavelet algorithm (EZW) is a simple, yet remarkably effective, image compression algorithm, having the property that the bits in the bit stream are generated in order of importance, yielding a fully embedded code.
Abstract: The embedded zerotree wavelet algorithm (EZW) is a simple, yet remarkably effective, image compression algorithm, having the property that the bits in the bit stream are generated in order of importance, yielding a fully embedded code. The embedded code represents a sequence of binary decisions that distinguish an image from the "null" image. Using an embedded coding algorithm, an encoder can terminate the encoding at any point, thereby allowing a target rate or target distortion metric to be met exactly. Also, given a bit stream, the decoder can cease decoding at any point in the bit stream and still produce exactly the same image that would have been encoded at the bit rate corresponding to the truncated bit stream. In addition to producing a fully embedded bit stream, the EZW consistently produces compression results that are competitive with virtually all known compression algorithms on standard test images. Yet this performance is achieved with a technique that requires absolutely no training, no pre-stored tables or codebooks, and no prior knowledge of the image source. The EZW algorithm is based on four key concepts: (1) a discrete wavelet transform or hierarchical subband decomposition, (2) prediction of the absence of significant information across scales by exploiting the self-similarity inherent in images, (3) entropy-coded successive-approximation quantization, and (4) universal lossless data compression, which is achieved via adaptive arithmetic coding.
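The "embedded bitstream" property the abstract emphasizes, where any prefix of the stream decodes to a coarser version of the same image, can be shown with a minimal bitplane (successive-approximation) coder. This toy handles non-negative magnitudes only and omits zerotrees, sign bits, and entropy coding, all of which EZW adds on top:

```python
# Minimal sketch of successive-approximation (bitplane) coding, the
# mechanism behind EZW's embedded bitstream. Coefficient values are
# illustrative; real EZW codes wavelet coefficients with zerotrees.

def encode_bitplanes(coeffs, n_planes):
    """Emit bitplanes from most to least significant."""
    return [[(c >> p) & 1 for c in coeffs]
            for p in reversed(range(n_planes))]

def decode_bitplanes(planes, n_planes, n_coeffs):
    """Reconstruct from however many planes were actually received."""
    recon = [0] * n_coeffs
    for i, plane in enumerate(planes):
        p = n_planes - 1 - i
        for j, bit in enumerate(plane):
            recon[j] |= bit << p
    return recon

coeffs = [13, 2, 7, 0, 9]          # toy non-negative coefficient magnitudes
planes = encode_bitplanes(coeffs, 4)

full = decode_bitplanes(planes, 4, len(coeffs))        # whole stream
coarse = decode_bitplanes(planes[:2], 4, len(coeffs))  # truncated prefix
print(full, coarse)
```

Decoding all planes reproduces the coefficients exactly, while truncating after two planes yields the same values an encoder targeting that lower rate would have produced, which is precisely the rate-control behavior the abstract describes.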

5,559 citations

Journal Article•DOI•
Yao Wang, Qin-Fan Zhu
01 May 1998
TL;DR: In this paper, a review of error control and concealment in video communication is presented, which are described in three categories according to the roles that the encoder and decoder play in the underlying approaches.
Abstract: The problem of error control and concealment in video communication is becoming increasingly important because of the growing interest in video delivery over unreliable channels such as wireless networks and the Internet. This paper reviews the techniques that have been developed for error control and concealment. These techniques are described in three categories according to the roles that the encoder and decoder play in the underlying approaches. Forward error concealment includes methods that add redundancy at the source end to enhance error resilience of the coded bit streams. Error concealment by postprocessing refers to operations at the decoder to recover the damaged areas based on characteristics of image and video signals. Last, interactive error concealment covers techniques that are dependent on a dialogue between the source and destination. Both current research activities and practice in international standards are covered.

1,611 citations

Proceedings Article•DOI•
28 Aug 1996
TL;DR: The RLM protocol is described, its performance is evaluated with a preliminary simulation study that characterizes user-perceived quality by assessing loss rates over multiple time scales, and the implementation of a software-based Internet video codec is discussed.
Abstract: State of the art, real-time, rate-adaptive, multimedia applications adjust their transmission rate to match the available network capacity. Unfortunately, this source-based rate-adaptation performs poorly in a heterogeneous multicast environment because there is no single target rate --- the conflicting bandwidth requirements of all receivers cannot be simultaneously satisfied with one transmission rate. If the burden of rate-adaption is moved from the source to the receivers, heterogeneity is accommodated. One approach to receiver-driven adaptation is to combine a layered source coding algorithm with a layered transmission system. By selectively forwarding subsets of layers at constrained network links, each user receives the best quality signal that the network can deliver. We and others have proposed that selective-forwarding be carried out using multiple IP-Multicast groups where each receiver specifies its level of subscription by joining a subset of the groups. In this paper, we extend the multiple group framework with a rate-adaptation protocol called Receiver-driven Layered Multicast, or RLM. Under RLM, multicast receivers adapt to both the static heterogeneity of link bandwidths as well as dynamic variations in network capacity (i.e., congestion). We describe the RLM protocol and evaluate its performance with a preliminary simulation study that characterizes user-perceived quality by assessing loss rates over multiple time scales. For the configurations we simulated, RLM results in good throughput with transient short-term loss rates on the order of a few percent and long-term loss rates on the order of one percent. Finally, we discuss our implementation of a software-based Internet video codec and its integration with RLM.
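The receiver-driven adaptation loop the abstract describes (add a layer via a "join experiment" when the network is stable, drop the top layer on congestion) can be sketched as a toy control loop. The layer rates, the capacity trace, and the oracle test for whether a join experiment succeeds are illustrative assumptions, not the RLM protocol's actual shared-learning machinery:

```python
# Hedged sketch of the RLM-style receiver loop: subscription level moves
# up on successful join experiments and down on congestion. A real RLM
# receiver infers congestion from measured loss and coordinates join
# experiments across receivers; here capacity is given directly.

LAYER_RATES = [128, 256, 512, 1024]   # kb/s at each cumulative level

def rlm_receiver(capacity_trace):
    """Return the subscription level after each measurement interval."""
    level, history = 1, []
    for capacity in capacity_trace:
        congested = LAYER_RATES[level - 1] > capacity
        if congested and level > 1:
            level -= 1                              # leave the top group
        elif not congested and level < len(LAYER_RATES):
            if LAYER_RATES[level] <= capacity:      # join experiment succeeds
                level += 1                          # join the next group
        history.append(level)
    return history

print(rlm_receiver([300, 600, 600, 1100, 400, 400]))  # → [2, 3, 3, 4, 3, 2]
```

Each receiver runs this loop independently against its own bottleneck, which is how the scheme accommodates heterogeneous link bandwidths without any source-side rate decision.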

1,284 citations

Journal Article•DOI•
TL;DR: A layered video compression algorithm which, when combined with RLM, provides a comprehensive solution for scalable multicast video transmission in heterogeneous networks.
Abstract: The "Internet Multicast Backbone," or MBone, has risen from a small, research curiosity to a large-scale and widely used communications infrastructure. A driving force behind this growth was the development of multipoint audio, video, and shared whiteboard conferencing applications. Because these real-time media are transmitted at a uniform rate to all of the receivers in the network, a source must either run at the bottleneck rate or overload portions of its multicast distribution tree. We overcome this limitation by moving the burden of rate adaptation from the source to the receivers with a scheme we call receiver-driven layered multicast, or RLM. In RLM, a source distributes a hierarchical signal by striping the different layers across multiple multicast groups, and receivers adjust their reception rate by simply joining and leaving multicast groups. We describe a layered video compression algorithm which, when combined with RLM, provides a comprehensive solution for scalable multicast video transmission in heterogeneous networks. In addition to a layered representation, our coder has low complexity (admitting an efficient software implementation) and high loss resilience (admitting robust operation in loosely controlled environments like the Internet). Even with these constraints, our hybrid DCT/wavelet-based coder exhibits good compression performance. It outperforms all publicly available Internet video codecs while maintaining comparable run-time performance. We have implemented our coder in a "real" application-the UCB/LBL videoconferencing tool vic. Unlike previous work on layered video compression and transmission, we have built a fully operational system that is currently being deployed on a very large scale over the MBone.

576 citations

Journal Article•DOI•
R. Talluri
TL;DR: In this paper, error resilience aspects of the video coding techniques that are standardized in the ISO MPEG-4 standard are described, including resynchronization strategies, data partitioning, reversible variable length codes (VLCs), and header extension codes.
Abstract: This article describes error resilience aspects of the video coding techniques that are standardized in the ISO MPEG-4 standard. It begins with a description of the general problems in robust wireless video transmission. The specific tools adopted into the ISO MPEG-4 standard to enable the communication of compressed video data over noisy wireless channels are presented in detail. These techniques include resynchronization strategies, data partitioning, reversible variable length codes (VLCs), and header extension codes. An overview of the evolving ISO MPEG-4 standard and its current status is given.

276 citations


"A framework for efficient progressi..." refers methods in this paper

  • ...To deal with this problem, some error resilience and concealment methods are introduced in [18], [19]....

    [...]