
Showing papers on "Video quality published in 2001"


01 Jan 2001
TL;DR: The Digital Video Quality (DVQ) metric as discussed by the authors is based on the Discrete Cosine Transform (DCT) and incorporates aspects of early visual processing, including light adaptation, luminance and chromatic channels, spatial and temporal filtering, spatial frequency channels, contrast masking, and probability summation.
Abstract: The growth of digital video has given rise to a need for computational methods for evaluating the visual quality of digital video. We have developed a new digital video quality metric, which we call DVQ (Digital Video Quality). Here we provide a brief description of the metric, and give a preliminary report on its performance. DVQ accepts a pair of digital video sequences, and computes a measure of the magnitude of the visible difference between them. The metric is based on the Discrete Cosine Transform. It incorporates aspects of early visual processing, including light adaptation, luminance and chromatic channels, spatial and temporal filtering, spatial frequency channels, contrast masking, and probability summation. It also includes primitive dynamics of light adaptation and contrast masking. We have applied the metric to digital video sequences corrupted by various typical compression artifacts, and compared the results to quality ratings made by human observers.
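
To make the pipeline concrete, here is a toy Python sketch of a DCT-based visible-difference measure with Minkowski (probability-summation) pooling, in the spirit of the DVQ abstract above. The 8x8 block size is standard, but the frequency weights and the pooling exponent are illustrative assumptions, not DVQ's calibrated parameters.

```python
import math

N = 8

def dct2(block):
    # 2-D DCT-II of an N x N block (separable, orthonormal form).
    def c(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += block[x][y] * math.cos((2 * x + 1) * u * math.pi / (2 * N)) \
                                     * math.cos((2 * y + 1) * v * math.pi / (2 * N))
            out[u][v] = c(u) * c(v) * s
    return out

def dvq_style_distance(ref, tst, beta=4.0):
    # Visible-difference sketch: DCT the reference and test blocks,
    # weight high frequencies down (a crude stand-in for spatial-
    # frequency sensitivity), then pool the weighted coefficient
    # differences with a Minkowski sum (probability summation).
    dref, dtst = dct2(ref), dct2(tst)
    pooled = 0.0
    for u in range(N):
        for v in range(N):
            w = 1.0 / (1.0 + 0.25 * (u + v))   # toy CSF weight (assumption)
            pooled += abs(w * (dref[u][v] - dtst[u][v])) ** beta
    return pooled ** (1.0 / beta)

ref = [[128.0] * N for _ in range(N)]
tst = [[128.0 + (5.0 if (x + y) % 2 else -5.0) for y in range(N)] for x in range(N)]
d = dvq_style_distance(ref, tst)
```

Identical sequences yield a distance of zero; any visible difference pools into a positive score.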

376 citations


Journal ArticleDOI
TL;DR: A new digital video quality metric based on the discrete cosine transform, which incorporates aspects of early visual processing, including light adaptation, luminance, and chromatic channels; spatial and temporal filtering; spatial frequency channels; contrast masking; and probability summation.
Abstract: The growth of digital video has given rise to a need for computational methods for evaluating the visual quality of digital video. We have developed a new digital video quality metric, which we call DVQ (digital video quality) (A. B. Watson, in Human Vision, Visual Processing, and Digital Display VIII, Proc. SPIE 3299, 139-147 (1998)). Here, we provide a brief description of the metric, and give a preliminary report on its performance. DVQ accepts a pair of digital video sequences, and computes a measure of the magnitude of the visible difference between them. The metric is based on the discrete cosine transform. It incorporates aspects of early visual processing, including light adaptation, luminance, and chromatic channels; spatial and temporal filtering; spatial frequency channels; contrast masking; and probability summation. It also includes primitive dynamics of light adaptation and contrast masking. We have applied the metric to digital video sequences corrupted by various typical compression artifacts, and compared the results to quality ratings made by human observers. © 2001 SPIE and IS&T. (DOI: 10.1117/1.1329896)

308 citations


Journal ArticleDOI
01 Jan 2001
TL;DR: The key advantages of the adaptive framework are: (1) perceptual quality is changed gracefully during periods of QoS fluctuations and hand-offs; and (2) the resources are shared in a fair manner.
Abstract: With the emergence of broadband wireless networks and increasing demand of multimedia information on the Internet, wireless multimedia services are foreseen to become widely deployed in the next decade. Real-time video transmission typically has requirements on quality of service (QoS). However, wireless channels are unreliable and the channel bandwidth varies with time, which may cause severe degradation in video quality. In addition, for video multicast, the heterogeneity of receivers makes it difficult to achieve efficiency and flexibility. To address these issues, three techniques, namely, scalable video coding, network-aware adaptation of end systems, and adaptive QoS support from networks, have been developed. This paper unifies the three techniques and presents an adaptive framework, which specifically addresses video transport over wireless networks. The adaptive framework consists of three basic components: (1) scalable video representations; (2) network-aware end systems; and (3) adaptive services. Under this framework, as wireless channel conditions change, mobile terminals and network elements can scale the video streams and transport the scaled video streams to receivers with a smooth change of perceptual quality. The key advantages of the adaptive framework are: (1) perceptual quality is changed gracefully during periods of QoS fluctuations and hand-offs; and (2) the resources are shared in a fair manner.
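
The layer-scaling idea in this framework can be sketched in a few lines: a network-aware end system subscribes to the base layer plus as many enhancement layers as the currently measured bandwidth can carry. The cumulative-rate test below is an illustrative simplification of the paper's adaptation mechanism.

```python
def select_layers(layer_rates, bandwidth):
    # Network-aware adaptation sketch: walk the layers in order
    # (base first) and keep adding layers while their cumulative
    # rate still fits the measured channel bandwidth.
    total, sent = 0.0, 0
    for rate in layer_rates:
        if total + rate > bandwidth:
            break
        total += rate
        sent += 1
    return max(sent, 1)   # the base layer is always attempted

rates = [64, 128, 256]    # kb/s per layer (illustrative numbers)
```

As the channel degrades, the subscription shrinks layer by layer, which is what yields the smooth change in perceptual quality the abstract describes.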

291 citations


Journal ArticleDOI
TL;DR: It is found that video quality is significantly improved at the same communication rate when layered FEC is used and equation-based rate control achieves more fair bandwidth sharing amongst competing sessions as compared to existing multicast rate control schemes such as RLM and RLC.
Abstract: The use of scalable video with layered multicast has been shown to be an effective method to achieve rate control in heterogeneous networks. We propose the use of layered forward error correction (FEC) as an error-control mechanism in a layered multicast framework. By organizing FEC into multiple layers, receivers can obtain different levels of protection commensurate with their respective channel conditions. Efficient network utilization is achieved as FEC streams are multicast, and only to receivers that need them. Furthermore, FEC is used without overall rate expansion by selectively dropping data layers to make room for FEC layers. Effects of bursty losses are amortized by staggering the FEC streams in time, giving rise to a tradeoff between delay and quality. For rate control at the receivers, we propose an equation-based approach that computes network usage as a function of measured network characteristics. We show that equation-based rate control achieves more fair bandwidth sharing amongst competing sessions as compared to existing multicast rate control schemes such as RLM and RLC. Fairness is achieved since competing sessions sharing a path will measure similar network characteristics. Simulations and actual MBONE experiments are performed using error-resilient, scalable video compression. We find that video quality is significantly improved at the same communication rate when layered FEC is used.
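
The "equation-based" receiver rate control mentioned above is typically the TCP response function: each receiver measures its own loss rate and round-trip time and computes the rate a TCP flow would get. The sketch below uses the standard TFRC-style throughput equation; treat the constants as an assumption about the paper's exact formula.

```python
import math

def tcp_friendly_rate(pkt_size, rtt, loss, rto=None):
    # TCP-friendly throughput bound (bytes/s) from packet size,
    # round-trip time, and loss event rate. Receivers subscribe to
    # layers whose cumulative rate stays below this bound, so flows
    # sharing a path (and thus measuring similar rtt/loss) converge
    # to similar rates -- the fairness argument in the abstract.
    if rto is None:
        rto = 4 * rtt
    denom = rtt * math.sqrt(2 * loss / 3) \
          + rto * min(1.0, 3 * math.sqrt(3 * loss / 8)) * loss * (1 + 32 * loss ** 2)
    return pkt_size / denom

r_low_loss = tcp_friendly_rate(1000, 0.1, 0.01)
r_high_loss = tcp_friendly_rate(1000, 0.1, 0.04)
```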

241 citations


Journal ArticleDOI
TL;DR: A new optimal rate control algorithm for maximizing the FSNR is established using a Lagrange multiplier method defined on a curvilinear coordinate system and a piecewise R-D (rate-distortion)/R-Q ( rate-quantization) model is developed.
Abstract: Previously, foveated video compression algorithms have been proposed which, in certain applications, deliver high-quality video at reduced bit rates by seeking to match the nonuniform sampling of the human retina. We describe such a framework here where foveated video is created by a nonuniform filtering scheme that increases the compressibility of the video stream. We maximize a new foveal visual quality metric, the foveal signal-to-noise ratio (FSNR), to determine the best compression and rate control parameters for a given target bit rate. Specifically, we establish a new optimal rate control algorithm for maximizing the FSNR using a Lagrange multiplier method defined on a curvilinear coordinate system. For optimal rate control, we also develop a piecewise R-D (rate-distortion)/R-Q (rate-quantization) model. A fast algorithm for searching for an optimal Lagrange multiplier λ* is subsequently presented. For the new models, we show how the reconstructed video quality is affected, where the FSNR is maximized, and demonstrate the coding performance for H.263+/H.263++/MPEG-4 video coding. For H.263/MPEG video coding, a suboptimal rate control algorithm is developed for fast, high-performance applications. In the simulations, we compare the reconstructed pictures obtained using optimal rate control methods for foveated and normal video. We show that foveated video coding using the suboptimal rate control algorithm delivers excellent performance under 64 kb/s.
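
The core idea of a foveal quality metric is that the same pixel error matters more near the fixation point than in the periphery. The sketch below weights squared errors by a falloff with distance from fixation; the 1/(1 + alpha*ecc) falloff is an illustrative choice, not the paper's exact retinal model.

```python
import math

def fsnr(orig, recon, fix_x, fix_y, peak=255.0, alpha=0.1):
    # Foveal SNR sketch: weight each pixel's squared error by a factor
    # that decays with distance (an eccentricity proxy) from the
    # fixation point, then form a weighted peak-signal-to-noise ratio.
    num = den = 0.0
    for x, row in enumerate(orig):
        for y, o in enumerate(row):
            w = 1.0 / (1.0 + alpha * math.hypot(x - fix_x, y - fix_y))
            num += w * peak * peak
            den += w * (o - recon[x][y]) ** 2
    return 10 * math.log10(num / den)

size = 16
orig = [[100.0] * size for _ in range(size)]
near = [row[:] for row in orig]; near[8][8] += 20   # error at fixation
far  = [row[:] for row in orig]; far[0][0] += 20    # same error, periphery
```

An error at the fixation point lowers the score more than an identical error in the periphery, which is exactly the property the rate controller exploits.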

155 citations


Journal ArticleDOI
TL;DR: Results show that the proposed QoS mapping mechanism can exploit the relative DiffServ advantage and result in the persistent service differentiation among DiffServ levels and the enhanced end-to-end video quality with the same pricing constraint.
Abstract: This paper presents a futuristic framework for quality-of-service (QoS) mapping between practically categorized packet video and relative differentiated service (DiffServ or DS) network employing unified priority index and adaptive packet forwarding mechanism under a given pricing model (e.g., DiffServ level differentiated price/packet). Video categorization is based on the relative priority index (RPI), which represents the relative preference per each packet in terms of loss and delay. We propose an adaptive packet forwarding mechanism for a DiffServ network to provide persistent service differentiation. Effective QoS mapping is then performed by mapping video packets onto different DiffServ levels based on RPI. To verify the efficiency of proposed strategy, the end-to-end performance is evaluated through an error resilient packet video transmission using ITU-T H.263+ codec over a simulated DiffServ network. Results show that the proposed QoS mapping mechanism can exploit the relative DiffServ advantage and result in the persistent service differentiation among DiffServ levels and the enhanced end-to-end video quality with the same pricing constraint.
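
The RPI-to-DiffServ mapping under a price budget can be illustrated with a simple greedy heuristic: rank packets by relative priority index and give each one the best level whose price fits the budget left per remaining packet. The even budget split and the specific prices are assumptions; the paper optimizes the mapping under its pricing model.

```python
def map_to_diffserv(rpis, level_prices, budget):
    # QoS-mapping sketch: packets with higher relative priority index
    # (RPI) are considered first; lower-RPI packets never receive a
    # better (lower-index, pricier) DiffServ level than earlier ones.
    order = sorted(range(len(rpis)), key=lambda i: -rpis[i])
    assignment, remaining, last_level = {}, budget, 0
    for n_left, i in zip(range(len(order), 0, -1), order):
        per_pkt = remaining / n_left
        level = next((l for l, p in enumerate(level_prices) if p <= per_pkt),
                     len(level_prices) - 1)
        level = max(level, last_level)      # keep priority ordering
        assignment[i] = level
        remaining -= level_prices[level]
        last_level = level
    return assignment

prices = [4.0, 2.0, 1.0]   # price per packet for DS levels 0 (best) to 2
amap = map_to_diffserv([0.9, 0.1, 0.5], prices, budget=7.0)
```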

153 citations


Journal ArticleDOI
TL;DR: An end-to-end optimized video streaming system comprising synergistic interaction between a source packetization strategy and an efficient, responsive, TCP-friendly congestion control protocol [Linear Increase Multiplicative Decrease with History (LIMD/H)].
Abstract: In this work we present an end-to-end optimized video streaming system comprising synergistic interaction between a source packetization strategy and an efficient, responsive, TCP-friendly congestion control protocol [Linear Increase Multiplicative Decrease with History (LIMD/H)]. The proposed source packetization scheme transforms a scalable/layered video bitstream so as to provide graceful resilience to network packet drops. The congestion control mechanism provides low variation in transmission rate in steady state and at the same time is reactive and provably TCP-friendly. While the two constituent algorithms identified above are novel in their own right, a key aspect of this work is the integration of these algorithms in a simple yet effective framework. This "application-transport" layer interaction approach is used to maximize the expected delivered video quality at the receiver. The integrated framework allows our system to gracefully tolerate and quickly react to sudden changes in the available connection capacity due to the onset of congestion, as verified in our simulations.
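
A minimal sketch of the LIMD/H idea, assuming illustrative constants: the rate increases linearly while there is no loss, and on loss it takes a multiplicative cut whose depth is modulated by a smoothed loss history, so an isolated loss causes a shallower cut than a burst. This is a rough reading of the abstract, not the paper's exact update rule.

```python
def limdh_step(rate, lost, history, alpha=0.05, beta=0.5, gamma=0.9):
    # LIMD/H sketch: linear increase per interval without loss; on
    # loss, a history-modulated multiplicative decrease. The smoothed
    # history keeps the steady-state rate variation low.
    history = gamma * history + (1.0 - gamma) * (1.0 if lost else 0.0)
    if lost:
        rate *= 1.0 - beta * min(1.0, history + 0.1)
    else:
        rate += alpha
    return rate, history

rate, hist = 1.0, 0.0
for lost in [False] * 20 + [True] + [False] * 20:
    rate, hist = limdh_step(rate, lost, hist)
```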

139 citations


Proceedings ArticleDOI
15 Oct 2001
TL;DR: Experimental results indicate that embedding the watermark in motion vectors degrades video quality very little, has little influence on MPEG decoding speed, allows a watermark to be embedded in a short video sequence, and can be applied to both uncompressed and compressed video sequences.
Abstract: In this paper, we propose a video watermarking technique to hide copyright information in MPEG motion vectors. In this method, the watermark is embedded in motion vectors of larger magnitude, and specifically in the component whose phase angle changes less. The motion vectors are then modified into a new bitstream from which the watermark information can be easily retrieved. The experimental results indicate that embedding the watermark in motion vectors degrades video quality very little, has little influence on MPEG decoding speed, allows a watermark to be embedded in a short video sequence, and can be applied to both uncompressed and compressed video sequences.
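
As a toy illustration of the "larger value motion vectors" selection rule, the sketch below hides one bit in the least significant bit of the larger-magnitude component of a sufficiently large motion vector. The threshold and the LSB embedding are assumptions; the paper additionally considers which component's phase angle changes less.

```python
def embed_bit(mv, bit, threshold=8):
    # Embed one watermark bit in the LSB of the larger-magnitude
    # component of a motion vector; small vectors are skipped, since
    # perturbing them would be relatively more visible.
    dx, dy = mv
    if max(abs(dx), abs(dy)) < threshold:
        return mv, False          # too small: no embedding
    if abs(dx) >= abs(dy):
        dx = (dx & ~1) | bit if dx >= 0 else -(((-dx) & ~1) | bit)
    else:
        dy = (dy & ~1) | bit if dy >= 0 else -(((-dy) & ~1) | bit)
    return (dx, dy), True

def extract_bit(mv):
    # Read the LSB back from the larger-magnitude component.
    dx, dy = mv
    comp = dx if abs(dx) >= abs(dy) else dy
    return abs(comp) & 1
```

The embedding changes the vector by at most one unit, which is why the quality impact stays small.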

138 citations


Proceedings ArticleDOI
Jaap A. Haitsma1, Ton Kalker
07 Oct 2001
TL;DR: This paper presents a video watermarking scheme designed for the future digital cinema format, as it will be used on large projector screens in theatres, that exploits only the temporal axis.
Abstract: In this paper a video watermarking scheme is presented that is designed for the future digital cinema format, as it will be used on large projector screens in theatres. The watermark is designed such that it has minimal impact on the video quality, but is still detectable after capture with a handheld camera and conversion to, for instance, VHS, CD-Video, or DVD format. In order to meet the severe requirements concerning visibility and robustness, the proposed watermarking system exploits only the temporal axis. A watermark is inserted by changing the mean of the luminance values of a frame according to the samples of the watermark. Watermark detection is performed by correlating the watermark sequence with extracted mean luminance values of a sequence of frames. A demonstrator implementing the proposed algorithm has been built, which shows that the aforementioned requirements can be met with the proposed scheme.
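
The temporal embed/detect loop described above is simple enough to sketch end to end: shift each frame's mean luminance by a small delta according to a +/-1 chip sequence, then detect by correlating observed frame means against the chips. The delta and sequence length are illustrative assumptions.

```python
import random

def embed(frame_means, chips, delta=2.0):
    # Temporal-axis embedding: nudge each frame's mean luminance up or
    # down by delta according to the +/-1 watermark chip for that frame.
    return [m + delta * c for m, c in zip(frame_means, chips)]

def detect(observed_means, chips):
    # Correlate mean-luminance deviations (about their average) with
    # the chip sequence; a marked sequence yields a large positive peak.
    avg = sum(observed_means) / len(observed_means)
    return sum((m - avg) * c for m, c in zip(observed_means, chips))

random.seed(1)
chips = [random.choice((-1, 1)) for _ in range(256)]
means = [118.0 + random.uniform(-1, 1) for _ in range(256)]   # unmarked frames
marked = embed(means, chips)
```

Because only frame means carry the payload, the mark survives geometric distortions from camera capture that would destroy a spatial watermark.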

119 citations


Proceedings ArticleDOI
07 May 2001
TL;DR: A method for DCT-domain blind measurement of blocking artifacts by constituting a new block across any two adjacent blocks, the blocking artifact is modeled as a 2-D step function.
Abstract: A method for DCT-domain blind measurement of blocking artifacts is proposed. By constituting a new block across any two adjacent blocks, the blocking artifact is modeled as a 2-D step function. A fast DCT-domain algorithm has been derived to constitute the new block and extract all parameters needed. Then a human visual system (HVS)-based measurement of blocking artifacts is conducted. Experimental results have shown the effectiveness and stability of our method. The proposed technique can be used for online image/video quality monitoring and control in applications of DCT-domain image/video processing.
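
The step-function model is easiest to see in the pixel domain: across each 8-pixel block boundary, blockiness shows up as a step whose height is the mean jump between the two adjacent pixel columns. The sketch below averages that step amplitude over vertical boundaries; the paper derives the equivalent quantities directly in the DCT domain.

```python
def step_amplitude(left_col, right_col):
    # Height of the modeled step: mean jump between the pixel column
    # just left of a block boundary and the column just right of it.
    n = len(left_col)
    return sum(r - l for l, r in zip(left_col, right_col)) / n

def blockiness(image, block=8):
    # Average absolute step amplitude over all vertical block
    # boundaries (pixel-domain illustration of the 2-D step model).
    w = len(image[0])
    total, count = 0.0, 0
    for b in range(block, w, block):
        left = [row[b - 1] for row in image]
        right = [row[b] for row in image]
        total += abs(step_amplitude(left, right))
        count += 1
    return total / count if count else 0.0

flat = [[50.0] * 16 for _ in range(8)]
banded = [[100.0] * 8 + [110.0] * 8 for _ in range(8)]   # visible 8-px blocks
```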

112 citations


Journal ArticleDOI
TL;DR: Improved rate-control schemes that take into consideration the effects of the video buffer fill-up, an a priori channel model, and the channel feedback information are proposed to improve video quality.
Abstract: We investigate the scenario of using the automatic repeat request (ARQ) retransmission scheme for two-way video communications over wireless Rayleigh fading channels. Video quality is the major concern of these applications. We show that, during the retransmissions of error packets, due to the reduced channel throughput, the video encoder buffer may fill-up quickly and cause the TMN8 rate-control algorithm to significantly reduce the bits allocated to each video frame. This results in PSNR degradation and many skipped frames. To minimize the number of frames skipped, we propose improved rate-control schemes that take into consideration the effects of the video buffer fill-up, an a priori channel model, and the channel feedback information. We also show that a discrete cosine transform (DCT) coefficient soft-thresholding scheme can be applied to further improve video quality. As a result, our proposed rate-control schemes encode the video sequences with less frame skipping and with higher PSNR compared to TMN8.
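
The buffer-aware allocation idea can be sketched as a single scaling rule: shrink the per-frame bit budget as the encoder buffer fills and as the estimated channel throughput (from the a priori channel model or feedback) drops, instead of letting a full buffer force frame skipping. The linear scaling below is an illustrative assumption, not the paper's TMN8 modification.

```python
def allocate_bits(target_bits, buffer_fullness, buffer_size, throughput_factor):
    # Rate-control sketch: per-frame budget scaled down both by buffer
    # occupancy (to avoid overflow and frame skips during ARQ
    # retransmissions) and by estimated channel throughput in [0, 1].
    occupancy = buffer_fullness / buffer_size
    budget = target_bits * throughput_factor * (1.0 - occupancy)
    return max(budget, 0.0)
```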

Journal ArticleDOI
TL;DR: Results show that the video quality can be substantially improved by utilizing the frame error information at UDP and application layer, and several maximal distance separable (MDS) code-based packet level error control coding schemes are proposed.
Abstract: Packet video will become a significant portion of emerging and future wireless/Internet traffic. However, network congestion and wireless channel error yields tremendous packet loss and degraded video quality. In this paper, we propose a new complete user datagram protocol (CUDP), which utilizes channel error information obtained from the physical and link layers to assist error recovery at the packet level. We propose several maximal distance separable (MDS) code-based packet level error control coding schemes and derive analytical formulas to estimate the equivalent video frame loss for different versions of user datagram protocol (UDP). We validate the proposed packet coding and CUDP protocol using MPEG-coded video under various Internet packet loss and wireless channel profiles. Theoretic and simulation results show that the video quality can be substantially improved by utilizing the frame error information at UDP and application layer.
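
Packet-level erasure coding is easy to demonstrate with the simplest MDS code there is: a single XOR parity packet, which corrects one erasure per group when the loss position is known (here, from the channel error information CUDP passes up). The paper uses general Reed-Solomon-style MDS codes that handle multiple erasures; this sketch shows only the one-erasure case.

```python
def xor_parity(packets):
    # One parity packet protecting a group: XOR of all data packets
    # (all packets assumed equal length).
    parity = bytes(len(packets[0]))
    for p in packets:
        parity = bytes(a ^ b for a, b in zip(parity, p))
    return parity

def recover(received, parity):
    # received: the group with exactly one None (the erased packet,
    # whose position is known from headers/link-layer error info).
    # XORing the parity with all surviving packets restores it.
    missing = received.index(None)
    acc = bytearray(parity)
    for i, p in enumerate(received):
        if i != missing:
            acc = bytearray(a ^ b for a, b in zip(acc, p))
    return missing, bytes(acc)

pkts = [b'video', b'frame', b'datax']
par = xor_parity(pkts)
```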

Book
30 Sep 2001
TL;DR: This book covers state-of-the-art video transmission, rate-constrained coder control, long-term memory and affine multi-frame motion-compensated prediction, fast motion estimation for multi-frame prediction, and error-resilient video transmission.
Abstract: Preface. Introduction. 1. State-Of-The-Art Video Transmission. 2. Rate-Constrained Coder Control. 3. Long-Term Memory Motion-Compensated Prediction. 4. Affine Multi-Frame Motion-Compensated Prediction. 5. Fast Motion Estimation for Multi-Frame Prediction. 6. Error Resilient Video Transmission. 7. Conclusions. References. Index.

Book
12 Mar 2001
TL;DR: This unique volume provides an all-encompassing treatment of wireless video communications, compression, channel coding, and wireless transmission as a joint subject, and provides in a comprehensive manner "implementation-ready" overall system design and performance studies.
Abstract: From the Publisher: "Bridging the gap between the video compression and communication communities, this unique volume provides an all-encompassing treatment of wireless video communications, compression, channel coding, and wireless transmission as a joint subject. WIRELESS VIDEO COMMUNICATIONS begins with relatively simple compression and information theoretical principles, continues through state-of-the-art and future concepts, and concludes with implementation-ready system solutions. This book's deductive presentation and broad scope make it essential for anyone interested in wireless communications. It systematically converts the lessons of Shannon's information theory into design principles applicable to practical wireless systems. It provides in a comprehensive manner "implementation-ready" overall system design and performance studies, giving cognizance to the contradictory design requirements of video quality, bit rate, delay, complexity, error resilience, and other related system design aspects. Topics covered include: information theoretical foundations; block-based and convolutional channel coding; very-low-bit-rate video codecs and multimode videophone transceivers; high-resolution video coding using both proprietary and standard schemes; and CDMA/OFDM systems, third-generation and beyond adaptive video systems. WIRELESS VIDEO COMMUNICATIONS is a valuable reference for postgraduate researchers, system engineers, industrialists, managers, and visual communications practitioners. About the Authors: Lajos Hanzo has enjoyed a prolific 24-year career during which he has held various research and academic positions in Hungary, Germany, and the United Kingdom. He has coauthored five books on mobile radio communications and published over 300 research papers on a variety of topics. Dr. Hanzo's research interests cover the entire spectrum of mobile multimedia communications, including voice, audio, video and graphic source compression, channel coding, modulation, networking and the joint optimization of these system components. He holds a chair in communications in the Department of Electronics and Computer Science at the University of Southampton, England, and he is a consultant to Multiple Access Communications Ltd. Peter J. Cherriman graduated in 1994 with an M.Eng. in information engineering from the University of Southampton. Since 1994, he has been with the Department of Electronics and Computer Science at the University of Southampton, where he completed his Ph.D. in mobile video networking. Dr. Cherriman is working on projects for the Mobile Virtual Centre of Excellence, U.K. His current areas of research include robust video coding, microcellular radio systems, power control, dynamic channel allocation, and multiple access protocols. Jurgen Streit received his Dipl.-Ing. in electronic engineering from the Aachen University of Technology, Germany, in 1993. Since 1992 he has been with the Department of Electronics and Computer Science at the University of Southampton, working with the Mobile Multimedia Communications Research Group. Dr. Streit earned a Ph.D. in image coding, and he is currently working as a software consultant."

Proceedings ArticleDOI
07 Oct 2001
TL;DR: Simulation results show that this scheme can guarantee a graceful video quality in adverse channel conditions and is effective for video transmission over the high loss environment found in ad-hoc networks.
Abstract: The increase in the bandwidth of the wireless channels and the computing power of the mobile devices makes it possible to offer video service for wireless networks in the near future. In an ad-hoc network, strong error protection is required because of the lack of a fixed infrastructure. On the other hand, the mesh structure of an ad-hoc network implies that there may be multiple paths existing between a source and destination, which can be used to enhance video transmissions. We propose a simple but robust scheme for reliable transmission of video in bandwidth limited ad-hoc networks. In our scheme, a video stream is layer coded. The base layer (BL) packets and the enhancement layer (EL) packets are transmitted separately on two disjoint paths using multipath transport (MPT). BL packets are protected by automatic repeat request (ARQ), and a lost BL packet is retransmitted through the path where EL packets are transmitted. An EL packet has lower priority than a retransmitted BL packet and may be dropped at the sender when congestion occurs. Simulation results show that this scheme can guarantee a graceful video quality in adverse channel conditions. It is effective for video transmission over the high loss environment found in ad-hoc networks.

Journal ArticleDOI
TL;DR: A sliding-window rate-control scheme that uses statistical information of the future video frames as a guidance to generate better video quality for video streaming involving constant bit rate channels is proposed and results show video quality improvements over the regular H.263 TMN8 encoder.
Abstract: In streaming video applications, video sequences are encoded off-line and stored in a server. Users may access the server over a constant bit rate channel. Examples of the streaming video applications are video-on-demand, archived video news, and noninteractive distance learning. Before the playback, part of the video bitstream is pre-loaded in the decoder buffer to ensure that every frame can be decoded at the scheduled time. For these streaming video applications, since the video is encoded off-line and the future video frames are available to the encoder, a more sophisticated bit-allocation scheme can be used to achieve better video quality. During the encoding process for streaming video, two requirements need to be considered: the pre-loading time that the video viewers have to wait and the physical buffer-size at the receiver (decoder) side. In this paper, we propose a sliding-window rate-control scheme that uses statistical information of the future video frames as a guidance to generate better video quality for video streaming involving constant bit rate channels. A quantized discrete cosine transform coefficient selection scheme based on the rate-distortion measurement is also used to improve the video quality. Simulation results show video quality improvements over the regular H.263 TMN8 encoder.
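
Because streaming video is encoded off-line, the complexities of future frames are known, and a sliding window can spread bits accordingly. A minimal sketch of that allocation rule, assuming complexity values and a proportional split (the paper's scheme additionally accounts for pre-loading time and decoder buffer size):

```python
def frame_budget(complexities, i, window, channel_rate, fps):
    # Sliding-window allocation sketch: frame i receives a share of the
    # window's bit budget proportional to its complexity relative to
    # the next `window` frames -- look-ahead that is only possible
    # because encoding is done off-line.
    win = complexities[i:i + window]
    window_bits = (channel_rate / fps) * len(win)
    return window_bits * complexities[i] / sum(win)

c = [1.0, 1.0, 4.0, 1.0, 1.0, 1.0]   # frame 2 is a high-complexity frame
```

A complex frame draws bits away from its easier neighbors within the window, smoothing quality across the sequence.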

Patent
21 Aug 2001
TL;DR: In this article, an intra-frame complexity based dynamic GOP system and method for the encoding of MPEG-2 video streams is presented, which may be used as a rate control tool to improve video quality.
Abstract: This invention discloses an intra-frame complexity based dynamic GOP system and method for the encoding of MPEG-2 video streams. The present invention may be used as a rate control tool to improve video quality. The selective insertion of low complexity I frames in scene changes such as fades is disclosed. It is disclosed that the selective use of I frames in certain scene change situations can improve encoder performance, particularly when encoding at a low bit rate.

Proceedings ArticleDOI
08 Jun 2001
TL;DR: In this article, the authors proposed an adaptive scale estimation method to measure the scale of subjective visual impairment in digital video, which is based on Bayesian estimation of the sensory scale after each trial.
Abstract: The study of subjective visual quality, and the development of computed quality metrics, require accurate and meaningful measurement of visual impairment. A natural unit for impairment is the JND (just-noticeable-difference). In many cases, what is required is a measure of an impairment scale, that is, the growth of the subjective impairment, in JNDs, as some physical parameter (such as amount of artifact) is increased. Measurement of sensory scales is a classical problem in psychophysics. In the method of pair comparison, each trial consists of a pair of samples and the observer selects the one perceived to be greater on the relevant scale. This may be regarded as an extension of the method of forced-choice: from measurement of threshold (one JND), to measurement of the larger sensory scale (multiple JNDs). While simple for the observer, pair comparison is inefficient because if all samples are compared, many comparisons will be uninformative. In general, samples separated by about 1 JND are most informative. We have developed an efficient adaptive method for selection of sample pairs. As with the QUEST adaptive threshold procedure[1], the method is based on Bayesian estimation of the sensory scale after each trial. We call the method Efficient Adaptive Scale Estimation, or EASE ("to make less painful"). We have used the EASE method to measure impairment scales for digital video. Each video was derived from an original source (SRC) by the addition of a particular artifact, produced by a particular codec at a specific bit rate, called a hypothetical reference circuit (HRC). Different amounts of artifact were produced by linear combination of the source and compressed videos. On each pair-comparison trial the observer selected which of two sequences, containing different amounts of artifact, appeared more impaired. The scale is estimated from the pair comparison data using a maximum likelihood method. 
At the top of the scale, when all of the artifact is present, the scale value is the total number of JNDs corresponding to that SRC/HRC condition. We have measured impairment scales for 25 video sequences, derived from five SRCs combined with each of five HRCs. We find that EASE is a reliable method for measuring impairment scales and JNDs for processed video sequences. We have compared our JND measurements with mean opinion scores for the same sequences obtained at one viewing distance using the DSCQS method by the Video Quality Experts Group (VQEG), and we find that the two measures are highly correlated. The advantages of the JND measurements are that they are in absolute and meaningful units and are unlikely to be subject to context effects. We note that JND measurements offer a means of creating calibrated artifact samples, and of testing and calibrating video quality models.
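
The link between pair-comparison data and a JND scale has a simple closed-form read-out: the probability that the higher-artifact sample is judged more impaired maps, via the inverse normal CDF, to the step size in JNDs between adjacent levels. This Thurstone-style sketch is only the read-out; EASE itself estimates the scale adaptively by Bayesian/maximum-likelihood fitting.

```python
from statistics import NormalDist

def scale_from_pairs(p_higher_chosen):
    # p_higher_chosen[k]: observed probability that level k+1 was
    # judged more impaired than level k in pair comparison. Each
    # step in JNDs is the z-score of that probability; cumulative
    # sums give the impairment scale (0 at the unimpaired end).
    z = NormalDist()
    scale = [0.0]
    for p in p_higher_chosen:
        scale.append(scale[-1] + z.inv_cdf(p))
    return scale

# 84% preference corresponds to roughly a 1-JND step.
jnd_scale = scale_from_pairs([0.84, 0.84, 0.69])
```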

Patent
06 Apr 2001
TL;DR: In this paper, a method and apparatus for image compression using temporal and resolution layering of compressed image frames is presented, which allows a form of modularized decomposition of an image that supports flexible application of a variety of image enhancement techniques.
Abstract: A method and apparatus for image compression using temporal and resolution layering of compressed image frames (Fig. 8). In particular, layered compression (Fig. 8, items 81, 85, and 86) allows a form of modularized decomposition of an image that supports flexible application of a variety of image enhancement techniques (Fig. 2). Further, the invention provides a number of enhancements to handle a variety of video quality and compression problems (Fig. 7). Most of the enhancements are preferably embodied as a set of tools which can be applied to the tasks of enhancing images (Fig. 8, items 82 and 86) and compressing such images. The tools can be combined by a content developer in various ways, as desired, to optimize the visual quality and compression efficiency of a compressed data stream. Such tools include improved image filtering techniques, motion vector representation and determination, de-interlacing and noise reduction enhancements, motion analysis, imaging device characterization and correction, an enhanced 3-2 pulldown system (Fig. 1), frame rate methods for production, a modular bit rate technique, a multi-layer DCT structure (Fig. 20), variable length coding optimization, an augmentation system for MPEG-2 and MPEG-4, and guide vectors for the spatial enhancement layer (Fig. 8, item 87).

Journal ArticleDOI
TL;DR: Experimental results show that AMIS dramatically outperforms existing structuring techniques, thanks to its efficient adaptivity, and the performances of the AMISP scheme in an MPEG-2 over RTP/UDP/IP scenario are evaluated.
Abstract: We address a new error-resilient scheme for broadcast quality MPEG-2 video streams to be transmitted over lossy packet networks. A new scene-complexity adaptive mechanism, namely Adaptive MPEG-2 Information Structuring (AMIS) is introduced. AMIS modulates the number of resynchronization points (i.e., slice headers and intra-coded macroblocks) in order to maximize the perceived video quality, assuming that the encoder is aware of the underlying packetization scheme, the packet loss probability (PLR), and the error-concealment technique implemented at the decoding side. The end-to-end video quality depends both on the encoding quality and the degradation due to data loss. Therefore, AMIS constantly determines the best compromise between the rate allocated to encode pure video information and the rate aiming at reducing the sensitivity to packet loss. Experimental results show that AMIS dramatically outperforms existing structuring techniques, thanks to its efficient adaptivity. We then extend AMIS with a forward-error-correction (FEC)-based protection algorithm to become AMISP. AMISP triggers the insertion of FEC packets in the MPEG-2 video packet stream. Finally, the performances of the AMISP scheme in an MPEG-2 over RTP/UDP/IP scenario are evaluated.
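
The compromise AMIS searches for can be sketched as a cost model: more resynchronization points (slice headers, intra macroblocks) cost rate, but they bound how far an error propagates after a packet loss. The cost function and constants below are illustrative assumptions, not the paper's model.

```python
def expected_cost(resync_interval, plr, overhead_per_resync, propagation_cost):
    # Trade-off sketch: overhead grows with resync density (1/interval),
    # while expected loss damage grows with the packet loss ratio and
    # with the interval over which an error can propagate.
    return overhead_per_resync / resync_interval \
         + plr * propagation_cost * resync_interval

def best_interval(plr, candidates=(1, 2, 4, 8, 16, 32), oh=1.0, prop=4.0):
    # Pick the resync spacing minimizing expected cost -- the adaptive
    # choice AMIS makes as the measured PLR changes.
    return min(candidates, key=lambda r: expected_cost(r, plr, oh, prop))
```

As the loss ratio rises, the optimum shifts toward denser resynchronization, which is the adaptivity the experiments credit for AMIS's gains.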

Patent
27 Dec 2001
TL;DR: In this article, a network device periodically determines available bandwidth, and divides the available bandwidth among multiple incoming bitstreams being multiplexed in order to increase downstream decoder buffer levels.
Abstract: Described herein are systems and methods for multiplexing and transmitting video data. The systems and methods use excess bandwidth in a channel available after meeting minimum transmission requirements for all bitstreams. A network device of the invention flexibly allocates this available bandwidth to minimize further rate reduction. More specifically, the network device periodically determines the available bandwidth, and divides the available bandwidth among multiple incoming bitstreams being multiplexed in order to increase downstream decoder buffer levels. By maintaining increased decoder buffer levels, future rate reduction of the video data may be avoided or applied to a lesser degree. Minimizing rate reduction in this manner improves bandwidth usage efficiency, and thus improves video data transmission and end-user output video quality.

Proceedings ArticleDOI
Yihong Gong1, X. Liu
07 Oct 2001
TL;DR: The visual content redundancy metric that tells how much redundancy a video contains, and how much the video can be curtailed without losing too much visual content is derived.
Abstract: We propose a video summarization method able to produce a motion video summary that minimizes the visual content redundancy for the input video. We derive the visual content redundancy metric that tells how much redundancy a video contains, and how much the video can be curtailed without losing too much visual content. Then we develop a method that produces a summary of the input video with the minimal redundancy measured. The video summary also contains the original audio segments that are partially synchronized with the summarized visual content. The experimental evaluations show the effectiveness of the redundancy metric and the summarization method.

Patent
Joel Jung1, Jorge E. Caviedes1
11 Dec 2001
TL;DR: In this article, a method of measuring video quality is proposed, which comprises a step of determining (21) at least one reference level (JND) above which visual artifacts become noticeable to a group of subjects, with a corresponding predetermined artifact metric (M), from a set of reference sequences (RS) of digital pictures only comprising a corresponding artifact.
Abstract: A method of measuring video quality is disclosed. This method comprises a step of determining (21) at least one reference level (JND) above which visual artifacts become noticeable to a group of subjects, with a corresponding predetermined artifact metric (M), from a set of reference sequences (RS) of digital pictures comprising only the corresponding artifact. The method also comprises a step of measuring (22) at least one artifact level (L) of the input sequence (IS) with the corresponding predetermined artifact metric (M). It further comprises a step of computing (23) a video quality metric (OQM) of the input sequence from at least one ratio of the artifact level (L) to the reference level (JND) corresponding to the same predetermined artifact metric. Such a method of measuring video quality provides an objective quality metric.
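The final step — combining per-artifact ratios L/JND into one objective quality metric (OQM) — might look like the sketch below. The weighted-average pooling and the mapping to a (0, 1] score are assumptions; the patent abstract only specifies that ratios of measured artifact levels to reference levels are combined.

```python
def objective_quality_metric(levels, jnd_refs, weights=None):
    """Combine per-artifact ratios L/JND into a single quality score.
    A ratio above 1 means that artifact exceeds its just-noticeable level.

    levels   -- measured artifact levels L, one per artifact metric
    jnd_refs -- corresponding reference levels (JND) from subjects
    weights  -- optional per-artifact weights (assumed extension)
    """
    ratios = [l / j for l, j in zip(levels, jnd_refs)]
    if weights is None:
        weights = [1.0] * len(ratios)
    impairment = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    # Map pooled impairment to (0, 1]: 1.0 = no visible artifacts.
    return 1.0 / (1.0 + impairment)
```

Because each ratio is normalized by a subjectively determined threshold, artifacts of very different physical units (blockiness, ringing, noise) become comparable before pooling, which is what makes the resulting metric "objective" yet perceptually anchored.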

01 Jan 2001
TL;DR: The results suggest that simply supplying video feedback to a remote observer may be useless at best or possibly damaging at worst; what is needed is not necessarily more bandwidth, but better interfaces and tools to help observers remain oriented so that they can extract what is needed from the video stream.
Abstract: It has been proposed that if we could equip individual personnel with micro-video cameras and wireless communications, so that they could transmit a video stream of what they were seeing to a remote observer, this would be an enormous improvement in reconnaissance and battlefield command and control. Based on current video and wireless communications technologies and trends, we projected the streaming video quality of service (QoS) we can expect to have available, and we used those projections to conduct an experiment to determine whether this assertion of improvement is true. Participants viewed a digital video with a data rate associated with a given transmission technology. They were asked to maintain their orientation by tracking the position of the camera on a paper floor-plan diagram. They were also asked to identify a number of objects and place them in the correct room on the floor plan. The results show that participants found all conditions except the live walkthrough control condition extremely difficult, with poor performance on both the spatial-orientation task and the object-identification task. Bandwidth does affect error, as increased data rate improves performance. Rapid head rotations seem to be the largest contributor to disorientation, especially with low-data-rate video. Our results suggest that simply supplying video feedback to a remote observer may be useless at best or possibly damaging at worst. What is needed is not necessarily more bandwidth, but better interfaces and tools to help observers remain oriented so that they can extract what is needed from the video stream.

Proceedings ArticleDOI
07 Oct 2001
TL;DR: Results indicate that this adaptive JSCC scheme employing scalable video encoding together with a multiresolution modulation/coding approach leads to significant improvements in delivered video quality for specified channel conditions.
Abstract: In this paper we describe a multi-layered video transport scheme for wireless channels capable of adapting to channel conditions in order to maximize end-to-end quality of service (QoS). This scheme combines a scalable H.263+ video source coder with unequal error protection (UEP) across layers. The UEP is achieved by employing different channel codes together with a multiresolution modulation approach to transport the different priority layers. Adaptivity to channel conditions is provided through use of joint source-channel coding (JSCC) which attempts to jointly optimize the source and channel coding rates together with the modulation parameters to obtain the maximum achievable end-to-end QoS for the prevailing channel conditions. Results indicate that this adaptive JSCC scheme employing scalable video encoding together with a multiresolution modulation/coding approach leads to significant improvements in delivered video quality for specified channel conditions. In particular, the approach results in considerably improved graceful degradation properties for decreasing channel SNR.

Proceedings ArticleDOI
01 Jan 2001
TL;DR: This paper presents a purely dynamics-based approach to content-based classification of video sequences, in the form of a new global motion measure of foreground objects, applied to cartoon and non-cartoon sequences.
Abstract: This paper describes a simple high-level classification of multimedia broadcast material into cartoon and non-cartoon. The input video sequences are drawn from a broad range of material representative of entertainment viewing. Classification of this type of high-level video genre is difficult because of its large inter-class variation. The task is made more difficult when classification is performed over a short time window (tens of seconds), introducing a great deal of intra-class variation. This paper presents a purely dynamics-based approach to content-based classification of video sequences in the form of a new global motion measure of foreground objects. Experiments are reported on a diverse database consisting of 8 cartoon and 20 non-cartoon sequences. Results are shown as identification error rates against the length of sequence used for classification. The system produces a best identification error rate of 3% on 66 separate decisions based on 23-second sequences, trained using a total of ~20 minutes of video.

Proceedings ArticleDOI
25 Nov 2001
TL;DR: This paper proposes two FSMs that generate frame sizes for full-length VBR videos, preserving both GOP periodicity and size-based video-segment transitions, and uses Q-Q plots to show the visual similarity of model-generated VBR video datasets with the original dataset.
Abstract: Video traffic is expected to be the major source for broadband integrated services digital networks (B-ISDN). In this paper we propose two FSMs that generate frame sizes for full-length VBR videos, preserving both GOP periodicity and size-based video-segment transitions. First, two-pass algorithms for the analysis of full-length VBR videos are presented. After the two-pass analysis, these algorithms identify and partition (size-based) classes of video segments. Frames in each segment class produce three datasets, one each for I-, B-, and P-type frames. Each of these datasets is modeled with an axis-shifted gamma distribution; Markov renewal processes model the (size-based) video-segment transitions. We have used Q-Q plots to show the visual similarity of model-generated VBR video datasets with the original dataset. A leaky-bucket simulation study is used to show the similarity of data and frame loss rates between model-generated VBR videos and the original video. Our study of frame-based VBR video revealed that even a low data loss rate can affect a large fraction of I frames, causing significant degradation of the quality of the transmitted video.
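Drawing one GOP of frame sizes from per-type axis-shifted gamma distributions, as the model describes for a single segment class, can be sketched as follows. The GOP pattern and the parameter triples are illustrative assumptions, and the Markov-renewal transitions between segment classes are omitted.

```python
import random

def gop_frame_sizes(params, gop="IBBPBBPBBPBB"):
    """Draw one GOP of frame sizes (in bits). Each frame type I/P/B uses
    an axis-shifted gamma distribution with assumed parameters
    (shape k, scale theta, shift c): size = c + Gamma(k, theta)."""
    sizes = []
    for ftype in gop:
        k, theta, shift = params[ftype]
        sizes.append(shift + random.gammavariate(k, theta))
    return sizes
```

Repeating this draw GOP after GOP, and switching `params` whenever the Markov renewal process changes segment class, yields a synthetic full-length VBR trace that preserves both the GOP periodicity and the segment-level size structure the paper targets.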

Proceedings ArticleDOI
07 Oct 2001
TL;DR: This work employs oversampling to add redundancy to the original video data followed by a decomposition of the oversampled video frames into "equal" sub-images which can be coded and transmitted over separate channels.
Abstract: We address the problem of robust video transmission over unreliable networks. Our approach employs the principle of multiple-descriptions, through pre- and post-processing of the video data, without modification to the source or channel codecs. We employ oversampling to add redundancy to the original video data followed by a decomposition of the oversampled video frames into "equal" sub-images which can be coded and transmitted over separate channels. Simulations using two descriptions show that this approach maintains excellent reconstructed video quality when only one description is received.
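A minimal instance of this pre-/post-processing idea is a two-description split of each frame by rows: each description is a subsampled copy of the frame, and if only one description arrives the decoder conceals the loss by row duplication. The row-wise split is an assumed concrete choice for illustration; the paper's decomposition of oversampled frames into "equal" sub-images may differ.

```python
def split_descriptions(frame):
    """Split a 2-D frame (list of rows) into two descriptions by taking
    even-indexed and odd-indexed rows; each alone still covers the
    whole scene at half vertical resolution."""
    return frame[0::2], frame[1::2]

def reconstruct_from_one(desc):
    """Concealment when only one description is received: duplicate
    each row to restore the original frame height."""
    out = []
    for row in desc:
        out.extend([row, row])
    return out
```

Because both descriptions carry a complete, if coarser, view of the frame, losing either channel degrades quality gracefully instead of catastrophically, which is the property the simulations with two descriptions demonstrate.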

Proceedings ArticleDOI
27 Aug 2001
TL;DR: This paper explores how the policing actions and associated rate guarantees provided by Expedited Forwarding (EF) translate into perceived benefits for the applications that are the presumed users of such enhancements.
Abstract: Over the past few years, there have been a number of proposals aimed at introducing different levels of service in the Internet. One of the more recent proposals is the Differentiated Services (Diff-Serv) architecture, and in this paper we explore how the policing actions and associated rate guarantees provided by the Expedited Forwarding (EF) translate into perceived benefits for applications that are the presumed users of such enhancements. Specifically, we focus on video streaming applications that arguably have relatively strong service quality requirements, and which should, therefore, stand to benefit from the availability of some form of enhanced service. Our goal is to gain a better understanding of the relation that exists between application level quality measures and the selection of the network level parameters that govern the delivery of the guarantees that an EF based service would provide. Our investigation, which is experimental in nature, relies on a number of standard streaming video servers and clients that have been modified and instrumented to allow quantification of the perceived quality of the received video stream. Quality assessments are performed using a Video Quality Measurement tool based on the ANSI objective quality standard. Measurements were made over both a local Diff-Serv testbed and across the QBone, a QoS enabled segment of the Internet2 infrastructure. The paper reports and analyzes the results of those measurements.

Book ChapterDOI
01 Jan 2001
TL;DR: While traditional analog systems still form the vast majority of television sets today, production studios, broadcasters and network providers have been installing digital video equipment at an ever-increasing rate.
Abstract: While traditional analog systems still form the vast majority of television sets today, production studios, broadcasters and network providers have been installing digital video equipment at an ever-increasing rate. The dividing line between analog and digital video is moving closer and closer to the consumer. Digital satellite and cable services have been available for a while, and recently terrestrial digital television broadcast has been introduced in a number of locations around the world.