
Showing papers on "Video quality published in 2003"


Journal ArticleDOI
TL;DR: Context-based adaptive binary arithmetic coding (CABAC), a normative part of the new ITU-T/ISO/IEC standard H.264/AVC for video compression, is presented; it significantly outperforms the baseline entropy coding method of H.264/AVC.
Abstract: Context-based adaptive binary arithmetic coding (CABAC) as a normative part of the new ITU-T/ISO/IEC standard H.264/AVC for video compression is presented. By combining an adaptive binary arithmetic coding technique with context modeling, a high degree of adaptation and redundancy reduction is achieved. The CABAC framework also includes a novel low-complexity method for binary arithmetic coding and probability estimation that is well suited for efficient hardware and software implementations. CABAC significantly outperforms the baseline entropy coding method of H.264/AVC for the typical area of envisaged target applications. For a set of test sequences representing typical material used in broadcast applications and for a range of acceptable video quality of about 30 to 38 dB, average bit-rate savings of 9%-14% are achieved.
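The adaptive modelling at the heart of CABAC can be illustrated with a textbook-style adaptive binary arithmetic coder. The sketch below is only a single-context illustration of how the probability estimate adapts as bits are coded; the real CABAC engine is a table-driven, multiplication-free coder with many context models, so this is not the H.264/AVC algorithm itself.

```python
class AdaptiveBinaryArithmeticEncoder:
    """Minimal single-context adaptive binary arithmetic encoder.

    Illustrative only: H.264/AVC CABAC uses a table-driven, multiplication-free
    engine with many context models, not this textbook integer coder."""

    TOP = 1 << 32
    HALF = TOP >> 1
    QUARTER = TOP >> 2

    def __init__(self):
        self.low, self.high = 0, self.TOP - 1
        self.pending = 0            # bits awaiting carry resolution
        self.out = []               # emitted bits
        self.c0, self.c1 = 1, 1     # adaptive symbol counts (the "context")

    def _emit(self, bit):
        self.out.append(bit)
        self.out.extend([1 - bit] * self.pending)   # resolve straddle bits
        self.pending = 0

    def encode_bit(self, bit):
        total = self.c0 + self.c1
        span = self.high - self.low + 1
        split = self.low + (span * self.c0) // total - 1
        if bit == 0:
            self.high = split
            self.c0 += 1            # adapt the probability estimate
        else:
            self.low = split + 1
            self.c1 += 1
        while True:                 # interval renormalisation
            if self.high < self.HALF:
                self._emit(0)
            elif self.low >= self.HALF:
                self._emit(1)
                self.low -= self.HALF
                self.high -= self.HALF
            elif self.low >= self.QUARTER and self.high < 3 * self.QUARTER:
                self.pending += 1
                self.low -= self.QUARTER
                self.high -= self.QUARTER
            else:
                break
            self.low <<= 1
            self.high = (self.high << 1) + 1

    def flush(self):
        # terminate the interval with disambiguating bits
        self.pending += 1
        self._emit(0 if self.low < self.QUARTER else 1)
        return self.out


# A biased bit stream compresses to fewer output bits than input bits.
enc = AdaptiveBinaryArithmeticEncoder()
for b in [0] * 90 + [1] * 10:
    enc.encode_bit(b)
print(len(enc.flush()), "bits emitted for 100 input bits")
```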

1,702 citations


Book ChapterDOI
TL;DR: EvalVid is targeted for researchers who want to evaluate their network designs or setups in terms of user perceived video quality, and has a modular construction, making it possible to exchange both the network and the codec.
Abstract: With EvalVid we present a complete framework and tool-set for evaluation of the quality of video transmitted over a real or simulated communication network. Besides measuring QoS parameters of the underlying network, like loss rates, delays, and jitter, we support also a subjective video quality evaluation of the received video based on the frame-by-frame PSNR calculation. The tool-set has a modular construction, making it possible to exchange both the network and the codec. We present here its application for MPEG-4 as example. EvalVid is targeted for researchers who want to evaluate their network designs or setups in terms of user perceived video quality. The tool-set is publicly available [11].
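The objective part of such an evaluation boils down to a frame-by-frame PSNR between the sent and the received (decoded) video. A minimal sketch, assuming 8-bit luma frames held as NumPy arrays; the function names are illustrative and not EvalVid's actual interface:

```python
import numpy as np

def psnr(reference: np.ndarray, degraded: np.ndarray, peak: float = 255.0) -> float:
    """PSNR in dB between two frames of identical size (8-bit luma assumed)."""
    mse = np.mean((reference.astype(np.float64) - degraded.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")   # identical frames
    return 10.0 * np.log10(peak ** 2 / mse)

def per_frame_psnr(ref_frames, dec_frames):
    """Frame-by-frame PSNR; frames lost in transmission are typically concealed
    (e.g. by repeating the last decoded frame) before the comparison."""
    return [psnr(r, d) for r, d in zip(ref_frames, dec_frames)]
```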

825 citations


01 Jan 2003
TL;DR: It is imperative for a video service system to be able to realize and quantify the video quality degradations that occur in the system, so that it can maintain, control and possibly enhance the quality of the video data.
Abstract: Digital video data, stored in video databases and distributed through communication networks, is subject to various kinds of distortions during acquisition, compression, processing, transmission, and reproduction. For example, lossy video compression techniques, which are almost always used to reduce the bandwidth needed to store or transmit video data, may degrade the quality during the quantization process. As another example, digital video bitstreams delivered over error-prone channels, such as wireless channels, may be received imperfectly due to impairments that occur during transmission. Packet-switched communication networks, such as the Internet, can cause loss or severe delay of received data packets, depending on the network conditions and the quality of service. All these transmission errors may result in distortions in the received video data. It is therefore imperative for a video service system to be able to realize and quantify the video quality degradations that occur in the system, so that it can maintain, control and possibly enhance the quality of the video data. An effective image and video quality metric is crucial for this purpose.

350 citations


Journal ArticleDOI
TL;DR: This paper proposes to combine multistream coding with multipath transport, to show that, in addition to traditional error control techniques, path diversity provides an effective means to combat transmission error in ad hoc networks.
Abstract: Enabling video transport over ad hoc networks is more challenging than over other wireless networks. The wireless links in an ad hoc network are highly error prone and can go down frequently because of node mobility, interference, channel fading, and the lack of infrastructure. However, the mesh topology of ad hoc networks implies that it is possible to establish multiple paths between a source and a destination. Indeed, multipath transport provides an extra degree of freedom in designing error resilient video coding and transport schemes. In this paper, we propose to combine multistream coding with multipath transport, to show that, in addition to traditional error control techniques, path diversity provides an effective means to combat transmission error in ad hoc networks. The schemes that we have examined are: 1) feedback based reference picture selection; 2) layered coding with selective automatic repeat request; and 3) multiple description motion compensation coding. All these techniques are based on the motion compensated prediction technique found in modern video coding standards. We studied the performance of these three schemes via extensive simulations using both Markov channel models and OPNET Modeler. To further validate the viability and performance advantages of these schemes, we implemented an ad hoc multiple path video streaming testbed using notebook computers and IEEE 802.11b cards. The results show that great improvement in video quality can be achieved over the standard schemes with limited additional cost. Each of these three video coding/transport techniques is best suited for a particular environment, depending on the availability of a feedback channel, the end-to-end delay constraint, and the error characteristics of the paths.

293 citations


Proceedings ArticleDOI
16 Jun 2003
TL;DR: A method for post-processing the secondary SSCQE data to produce quality scores that are highly correlated to the original DSCQS and DSCS data is given.
Abstract: International recommendations for subjective video quality assessment (e.g., ITU-R BT.500-11) include specifications for how to perform many different types of subjective tests. Some of these test methods are double stimulus where viewers rate the quality or change in quality between two video streams (reference and impaired). Others are single stimulus where viewers rate the quality of just one video stream (the impaired). Two examples of the former are the double stimulus continuous quality scale (DSCQS) and double stimulus comparison scale (DSCS). An example of the latter is single stimulus continuous quality evaluation (SSCQE). Each subjective test methodology has claimed advantages. For instance, the DSCQS method is claimed to be less sensitive to context (i.e., subjective ratings are less influenced by the severity and ordering of the impairments within the test session). The SSCQE method is claimed to yield more representative quality estimates for quality monitoring applications. This paper considers data from six different subjective video quality experiments, originally performed with SSCQE, DSCQS and DSCS methodologies. A subset of video clips from each of these six experiments was combined and rated in a secondary SSCQE subjective video quality test. We give a method for post-processing the secondary SSCQE data to produce quality scores that are highly correlated to the original DSCQS and DSCS data. We also provide evidence that human memory effects for time-varying quality estimation seem to be limited to about 15 seconds.

217 citations


Proceedings Article
03 Jan 2003
TL;DR: This work considers the problem of enhancing the resolution of video through the addition of perceptually plausible high frequency information, and uses the previously enhanced frame to provide part of the training set for super-resolution enhancement of the current frame.
Abstract: We consider the problem of enhancing the resolution of video through the addition of perceptually plausible high frequency information. Our approach is based on a learned data set of image patches capturing the relationship between the middle and high spatial frequency bands of natural images. By introducing an appropriate prior distribution over such patches we can ensure consistency of static image regions across successive frames of the video, and also take account of object motion. A key concept is the use of the previously enhanced frame to provide part of the training set for super-resolution enhancement of the current frame. Our results show that a marked improvement in video quality can be achieved at reasonable computational cost.

140 citations


Proceedings ArticleDOI
25 May 2003
TL;DR: A new hardware architecture for variable block size motion estimation with full search at integer-pixel accuracy is proposed; it supports real-time operation at an operating frequency of 64.11 MHz for 720×480 frames at 30 Hz.

Abstract: Variable block size motion estimation is adopted in the new video coding standard, MPEG-4 AVC/JVT/ITU-T H.264, due to its superior performance compared to the advanced prediction mode in MPEG-4 and H.263+. In this paper, we modified the reference software in a hardware-friendly way. Our main idea is to convert the sequential processing of each 8×8 sub-partition of a macroblock into parallel processing without sacrificing video quality. Based on our algorithm, we propose a new hardware architecture for variable block size motion estimation with full search at integer-pixel accuracy. The features of our design are a 2-D processing element array with 1-D data broadcasting and 1-D partial result reuse, a parallel adder tree, a memory interleaving scheme, and high utilization. Simulation shows that our chip supports real-time operation at an operating frequency of 64.11 MHz for 720×480 frames at 30 Hz with a search range of [-24, +23] in the horizontal direction and [-16, +15] in the vertical direction, which requires a computation power of more than 50 GOPS.
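The operation being mapped onto the processing-element array is an exhaustive integer-pel search minimising the sum of absolute differences (SAD). A scalar software sketch of that search, using the asymmetric ranges quoted above; this is a reference model of the computation, not the hardware design:

```python
import numpy as np

def full_search(cur: np.ndarray, ref: np.ndarray, bx: int, by: int,
                block: int = 8, sr_x: int = 24, sr_y: int = 16):
    """Exhaustive integer-pel search for the block at (bx, by) in `cur`.

    Returns (best_mv, best_sad). The search range is [-sr_x, sr_x-1] x
    [-sr_y, sr_y-1], mirroring the asymmetric ranges quoted in the paper.
    Assumes the current block itself lies fully inside the frame."""
    h, w = ref.shape
    target = cur[by:by + block, bx:bx + block].astype(np.int32)
    best_sad, best_mv = None, (0, 0)
    for dy in range(-sr_y, sr_y):
        for dx in range(-sr_x, sr_x):
            y, x = by + dy, bx + dx
            if y < 0 or x < 0 or y + block > h or x + block > w:
                continue                      # candidate falls outside the frame
            cand = ref[y:y + block, x:x + block].astype(np.int32)
            sad = int(np.abs(target - cand).sum())
            if best_sad is None or sad < best_sad:
                best_sad, best_mv = sad, (dx, dy)
    return best_mv, best_sad
```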

135 citations


Proceedings ArticleDOI
02 Nov 2003
TL;DR: An integrated power management approach is proposed that unifies low-level architectural optimizations, OS power-saving mechanisms, and adaptive middleware techniques; it is shown that tight coupling of inter-level parameters can substantially enhance the user experience on a handheld.

Abstract: Optimizing user experience for streaming video applications on handheld devices is a significant research challenge. In this paper, we propose an integrated power management approach that unifies low-level architectural optimizations (CPU, memory, register), OS power-saving mechanisms (Dynamic Voltage Scaling) and adaptive middleware techniques (admission control, optimal transcoding, network traffic regulation). Specifically, we identify interaction parameters between the different levels and optimize them to significantly reduce power consumption. With knowledge of device configurations, dynamic device parameters and changing system conditions, the middleware layer selects an appropriate video quality and fine-tunes the architecture for optimized delivery of video. Our performance results indicate that architectural optimizations that are cognizant of user-level parameters (e.g., transcoded video quality) can provide energy gains as high as 57.5% for the CPU and memory. Middleware adaptations to changing network noise levels can save as much as 70% of the energy consumed by the wireless network interface. Furthermore, we demonstrate how such an integrated framework, which supports tight coupling of inter-level parameters, can substantially enhance the user experience on a handheld.
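As one concrete example of the OS-level knob mentioned above, dynamic voltage scaling boils down to running the CPU at the lowest frequency that still meets each frame's decode deadline. A minimal sketch; the frequency levels, cycle estimate and deadline are illustrative values, not the paper's measurements:

```python
def pick_frequency(cycles_needed: float, deadline_s: float,
                   levels_mhz=(200, 400, 600)) -> int:
    """Return the lowest CPU frequency (MHz) that still decodes the next frame
    before its display deadline; a lower frequency permits a lower supply
    voltage and hence lower energy per frame."""
    for f in sorted(levels_mhz):
        if cycles_needed / (f * 1e6) <= deadline_s:
            return f
    return max(levels_mhz)        # deadline cannot be met; run flat out

# 6 Mcycles to decode, frame due in 33 ms -> the 200 MHz level already suffices
print(pick_frequency(6e6, 0.033))
```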

126 citations


Proceedings ArticleDOI
02 Nov 2003
TL;DR: This work presents a novel approach that uses pseudo-relevance feedback from retrieved items that are NOT similar to the query items, without requiring further user feedback, and suggests a score combination scheme via posterior probability estimation.

Abstract: Video information retrieval requires a system to find information relevant to a query which may be represented simultaneously in different ways through a text description, audio, still images and/or video sequences. We present a novel approach that uses pseudo-relevance feedback from retrieved items that are NOT similar to the query items, without requiring further user feedback. We provide insight into this approach using a statistical model and suggest a score combination scheme via posterior probability estimation. An evaluation on the 2002 TREC Video Track queries shows that this technique can improve video retrieval performance on a real collection. We believe that negative pseudo-relevance feedback shows great promise for very difficult multimedia retrieval tasks, especially when combined with other different retrieval algorithms.

122 citations


Journal ArticleDOI
TL;DR: This paper presents a cross-layer mapping architecture for video transmission in wireless networks and describes the design and algorithms for each building block, which either build upon or extend state-of-the-art algorithms that were developed without much consideration of other layers.

Abstract: Providing quality-of-service (QoS) to video delivery in wireless networks has attracted intensive research over the years. A fundamental problem in this area is how to map QoS criteria at different layers and optimize QoS across the layers. In this paper, we investigate this problem and present a cross-layer mapping architecture for video transmission in wireless networks. There are several important building blocks in this architecture, among others: QoS interaction between video coding and transmission modules, a QoS mapping mechanism, video quality adaptation, and source rate constraint derivation. We describe the design and algorithms for each building block, which either build upon or extend state-of-the-art algorithms that were developed without much consideration of other layers. Finally, we use simulation results to demonstrate the performance of the proposed architecture for progressive fine granularity scalability video transmission over a time-varying and nonstationary wireless channel.

117 citations


Proceedings ArticleDOI
TL;DR: This paper discusses color image quality metrics and presents no-reference artifact metrics for blockiness, blurriness, and colorfulness, showing that these metrics are highly correlated with experimental data collected through subjective experiments.
Abstract: Color image quality depends on many factors, such as the initial capture system and its color image processing, compression, transmission, the output device, media and associated viewing conditions. In this paper, we are primarily concerned with color image quality in relation to compression and transmission. We review the typical visual artifacts that occur due to high compression ratios and/or transmission errors. We discuss color image quality metrics and present no-reference artifact metrics for blockiness, blurriness, and colorfulness. We show that these metrics are highly correlated with experimental data collected through subjective experiments. We use them for no-reference video quality assessment in different compression and transmission scenarios and again obtain very good results. We conclude by discussing the important effects viewing conditions can have on image quality.
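A typical ingredient of such no-reference artifact metrics is a blockiness estimate computed from luminance discontinuities along the 8x8 coding grid. The sketch below is one generic variant of that idea, not the specific metric proposed in the paper:

```python
import numpy as np

def blockiness(luma: np.ndarray, block: int = 8) -> float:
    """Average absolute luminance step across vertical block boundaries,
    normalised by the average step at non-boundary positions (values well
    above 1 suggest visible blocking). Generic illustration only."""
    img = luma.astype(np.float64)
    diff_h = np.abs(np.diff(img, axis=1))           # horizontal neighbour differences
    cols = np.arange(diff_h.shape[1])
    on_boundary = (cols % block) == (block - 1)     # differences straddling a block edge
    edge = diff_h[:, on_boundary].mean()
    interior = diff_h[:, ~on_boundary].mean()
    return edge / (interior + 1e-12)
```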

Proceedings ArticleDOI
11 May 2003
TL;DR: A new framework for multimedia streaming that integrates the application and network layer functionalities to meet such stringent application requirements as delay and loss is presented and a multi-path selection method that chooses a set of paths maximizing the overall quality at the client is proposed.
Abstract: This paper presents a new framework for multimedia streaming that integrates the application and network layer functionalities to meet such stringent application requirements as delay and loss. The coordination between these two layers provides more robust media transmission even under severe network conditions. In this framework, a multiple description source coder is used to produce multiple independently-decodable streams that are routed over partially link-disjoint (non-shared) paths to combat bursty packet losses. We model multi-path streaming and propose a multi-path selection method that chooses a set of paths maximizing the overall quality at the client. Overlay infrastructure is then used to achieve multi-path routing over these selected paths. The simulation results show that the average peak signal-to-noise ratio (PSNR) improves by up to 8.1 dB if the same source video is routed over intelligently selected multiple paths instead of the shortest path or maximally link-disjoint paths. In addition to the PSNR improvement in quality, the end-user experiences a more consistent streaming quality.

Journal ArticleDOI
TL;DR: A two-stage framework to generate MPEG-7-compliant hierarchical key frame summaries of video sequences by reducing the number of key frames to match the low-level browsing preferences of a user is proposed.
Abstract: A compact summary of video that conveys visual content at various levels of detail enhances user interaction significantly. In this paper, we propose a two-stage framework to generate MPEG-7-compliant hierarchical key frame summaries of video sequences. At the first stage, which is carried out off-line at the time of content production, fuzzy clustering and data pruning methods are applied to given video segments to obtain a nonredundant set of key frames that comprise the finest level of the hierarchical summary. The number of key frames allocated to each shot or segment is determined dynamically and without user supervision through the use of cluster validation techniques. A coarser summary is generated on-demand in the second stage by reducing the number of key frames to match the low-level browsing preferences of a user. The proposed method has been validated by experimental results on a collection of video programs.
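The first-stage pattern, clustering the frames and keeping one representative per cluster, can be sketched with plain k-means on per-frame colour histograms; note this deliberately substitutes k-means with a fixed k for the paper's fuzzy clustering and cluster-validation step, and the histogram representation is an assumption:

```python
import numpy as np

def key_frames(hists: np.ndarray, k: int, iters: int = 50, seed: int = 0):
    """hists: (num_frames, bins) colour histograms. Returns indices of the
    frames closest to each cluster centre. k-means with fixed k stands in
    here for the paper's fuzzy clustering with automatic cluster validation."""
    rng = np.random.default_rng(seed)
    centres = hists[rng.choice(len(hists), size=k, replace=False)].astype(float)
    for _ in range(iters):
        d = np.linalg.norm(hists[:, None, :] - centres[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centres[c] = hists[labels == c].mean(axis=0)
    d = np.linalg.norm(hists[:, None, :] - centres[None, :, :], axis=2)
    return sorted(int(d[:, c].argmin()) for c in range(k))   # one key frame per cluster
```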

Proceedings ArticleDOI
27 May 2003
TL;DR: This study investigated four variations of one form of video surrogate: a fast forward created by selecting every Nth frame from the full video, and tested the validity of six measures of user performance when interacting with video surrogates.
Abstract: To support effective browsing, interfaces to digital video libraries should include video surrogates (i.e., smaller objects that can stand in for the videos in the collection, analogous to abstracts standing in for documents). The current study investigated four variations (i.e., speeds) of one form of video surrogate: a fast forward created by selecting every Nth frame from the full video. In addition, it tested the validity of six measures of user performance when interacting with video surrogates. Forty-five study participants interacted with all four versions of the fast forward surrogate, and completed all six performance tasks with each. Surrogate speed affected performance on four of the measures: object recognition (graphical), action recognition, linguistic gist comprehension (full text), and visual gist comprehension. Based on these results, we recommend a fast forward default speed of 1:64 of the original video keyframes. In addition, users should control the choice of fast forward speed to adjust for content characteristics and personal preferences.
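Producing such a surrogate is simply a matter of keeping every Nth frame; with the recommended 1:64 default, N = 64. A trivial sketch:

```python
def fast_forward_surrogate(frames, n: int = 64):
    """Build a fast-forward surrogate by keeping every n-th frame
    (n = 64 corresponds to the 1:64 default recommended by the study)."""
    return frames[::n]
```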

Patent
26 Jun 2003
TL;DR: In this article, a system for independently capturing video content from various video content sources and ratings data independently is presented, where the video content and ratings are stored with metadata so that the video contents and ratings can be searchable.
Abstract: A system (20) for independently capturing video content (22) from various video content sources and ratings data (24) independently. The video content and ratings data is stored with metadata so that the video content and ratings data is searchable. A synchronization engine (30) automatically links the video content to the ratings data. As such, selected video content and corresponding ratings data is presented to the user in a contiguous format in a synchronized manner over different platforms including the Internet.

Journal ArticleDOI
01 Aug 2003
TL;DR: A simple and fast method for face detection is proposed to define ROIs dynamically in real-time applications, using the color information Cr and the RGB variance to determine skin-color pixels.

Abstract: The ability to give higher priority to regions-of-interest (ROI) is an emerging functionality for present-day video coding. A simple and fast method for face detection is proposed to define ROIs dynamically in real-time applications. We use the color information Cr and the RGB variance to determine the skin-color pixels. Because the two color spaces are commonly used in most hardware and video codec standards, no extra computation overhead is required for conversion. Low-pass filtering is applied to the background to reduce the number of bits used. For the video coding system, a region-based video codec based on H.263+ with the optional mode of modified quantization is set up. We adjust the distortion weight parameter and variance at the macroblock layer to control the quality of different regions. Experimental results show that the proposed methods can significantly improve quality at the ROI. Our methods are suitable for real-time videoconferencing.
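A minimal sketch of the kind of pixel-wise skin test described, combining a Cr range check with an RGB-variance check that rejects near-grey pixels; the thresholds are illustrative guesses, not the paper's tuned values:

```python
import numpy as np

def skin_mask(rgb: np.ndarray,
              cr_lo: float = 133.0, cr_hi: float = 173.0,
              var_min: float = 20.0) -> np.ndarray:
    """Classify pixels as skin if Cr falls in a typical skin range and the
    R, G, B channels are not near-equal (near-equal channels mean a grey
    pixel). Thresholds here are illustrative, not the paper's."""
    r = rgb[..., 0].astype(float)
    g = rgb[..., 1].astype(float)
    b = rgb[..., 2].astype(float)
    cr = 128 + 0.5 * r - 0.4187 * g - 0.0813 * b     # ITU-R BT.601 Cr approximation
    rgb_var = np.var(rgb.astype(float), axis=-1)     # per-pixel variance across R, G, B
    return (cr >= cr_lo) & (cr <= cr_hi) & (rgb_var >= var_min)
```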

Patent
27 Jan 2003
TL;DR: In this article, a video information transmission apparatus is presented for efficient transmission of MPEG video in real time while controlling congestion on a QoS non-guaranteed IP network and suppressing degradation of video quality.

Abstract: This invention provides a video information transmission apparatus for efficiently transmitting digital video, such as MPEG video, in real time while controlling congestion on a QoS non-guaranteed IP network and suppressing degradation of video quality. A transmission control section 13 on the sender side outputs bit rate feedback information to a real-time encoder 12 in accordance with congestion information on the network, and changes the transmission bit rate accordingly. The bit rate feedback information is obtained based on the congestion information on the network on the sender side, or is obtained on the receiver side and fed back.

Proceedings ArticleDOI
16 Jun 2003
TL;DR: The subjective ratings are used to validate the prediction performance of a real-time non-reference quality metric and compare codec performance as well as the effects of transmission errors on visual quality.
Abstract: This paper presents the results of our quality evaluations of video sequences encoded for and transmitted over a wireless channel. We selected content, codecs, bitrates and bit error patterns representative of mobile applications and we concentrated on the MPEG-4 and Motion JPEG2000 coding standards. We carried out subjective experiments using the Single Stimulus Continuous Quality Evaluation (SSCQE) method on this test material. We analyze the results of the subjective data and use them to compare codec performance and resilience to transmission errors. Finally, we use the subjective data to validate the prediction performance of a real-time non-reference quality metric.

Patent
07 Feb 2003
TL;DR: In this article, a power-scaling method for decoding digital video data in a power scalable manner is presented, which includes changing both a power consumption level associated with the video decoding system and a video presentation quality.
Abstract: A method for decoding digital video data in a power-scalable manner is provided. The method begins by monitoring the power level available to the video decoding system. Then, threshold power levels are identified. In response to the available power level crossing one of the threshold power levels, the method includes changing both a power consumption level associated with the video decoding system and the video presentation quality. A method for determining optimum pairings of power consumption and video quality for a video decoding system is also provided. In addition, a power-scalable video device, an integrated circuit chip for a video decoding system and a graphical user interface are provided.

Proceedings ArticleDOI
16 Jun 2003
TL;DR: The experiments show that PSNR and SAE do not adequately reflect perceived video quality when changes in spatial resolution and frame rate are involved, and are therefore not adequate for assessing quality in a multi-dimensional rate control scheme.
Abstract: Multi-dimensional rate control schemes, which jointly adjust two or three coding parameters, have been recently proposed to achieve a target bit rate while maximizing some objective measures of video quality. The objective measures used in these schemes are the peak signal-to-noise ratio (PSNR) or the sum of absolute errors (SAE) of the decoded video. These objective measures of quality may differ substantially from subjective quality, especially when changes of spatial resolution and frame rate are involved. The proposed schemes are, therefore, not optimal in terms of human visual perception. We have investigated the impact on subjective video quality of the three coding parameters: spatial resolution, frame rate, and quantization parameter (QP). To this end, we have conducted two experiments using the H.263+ codec and five video sequences. In Experiment 1, we evaluated the impact of jointly adjusting QP and frame rate on subjective quality and bit rate. In Experiment 2, we evaluated the impact of jointly adjusting QP and spatial resolution. From these experiments, we suggest several general rules and guidelines that can be useful in the design of an optimal multi-dimensional rate control scheme. The experiments also show that PSNR and SAE do not adequately reflect perceived video quality when changes in spatial resolution and frame rate are involved, and are therefore not adequate for assessing quality in a multi-dimensional rate control scheme. This paper describes the method and results of the investigation.

Proceedings Article
01 Jan 2003
TL;DR: This paper proposes an adaptive middleware-based approach to optimize backlight power consumption for mobile handheld devices when playing streaming MPEG-1 video, without significantly compromising video quality; performance results indicate that up to 60% of the power consumed by the backlight can be saved.
Abstract: Mobile handheld devices have stringent constraints on power consumption because they run on batteries that have a limited lifetime. Conserving power to prolong battery life is of primary importance for these devices. Several factors such as backlight intensity, the hard disk, the CPU, the network interface and the nature of the application contribute significantly towards power consumption for a mobile device. While significant research effort has been made to optimize power consumption at the application, network and processor levels, comparatively little work has been done to reduce or adapt to the power consumed by the backlight. In this paper, we propose an adaptive middleware based approach to optimize backlight power consumption for mobile handheld devices when playing streaming MPEG-1 video, without significantly compromising on video quality. Our performance results indicate that up to 60% of the power consumed by the backlight can be saved by using the proposed approach.
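A common way to realise this idea is to dim the backlight and scale pixel luminance up so that perceived brightness is roughly preserved; the sketch below shows only that compensation step, with a simple reciprocal scaling that stands in for the paper's middleware policy:

```python
import numpy as np

def compensate_for_dimming(frame: np.ndarray, backlight_level: float) -> np.ndarray:
    """Scale pixel values up to offset a dimmed backlight (0 < level <= 1).
    Values that would exceed 255 clip, which is where visible quality loss
    appears; the scaling rule is a placeholder, not the paper's policy."""
    scaled = frame.astype(np.float64) / max(backlight_level, 1e-3)
    return np.clip(scaled, 0, 255).astype(np.uint8)

# dim the backlight to 60% and compensate the frame accordingly:
# frame = compensate_for_dimming(frame, backlight_level=0.6)
```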

Patent
18 Nov 2003
TL;DR: In this article, a measure of quality of compressed video signals is obtained without reference to the original uncompressed version, but generated directly from the coded image parameters, thereby avoiding the need to decode the compressed signal.
Abstract: A measure of quality of compressed video signals is obtained without reference to the original uncompressed version, but generated directly from the coded image parameters, thereby avoiding the need to decode the compressed signal. A first measure is generated from the quantizer step size and a second measure is generated as a function of the number of blocks in the picture that have only one transform coefficient. The two measures are combined. Adjustments may be made to the step-size based measure to compensate for spatial or temporal masking effects.
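A heavily simplified sketch of the idea: one term grows with the average quantiser step size, another with the fraction of coded blocks carrying only a single transform coefficient, and the two are combined. The linear combination and weights below are placeholders, not the patented mapping:

```python
def no_reference_quality(step_sizes, coeff_counts,
                         w_step: float = 1.0, w_sparse: float = 1.0) -> float:
    """Crude bitstream-only degradation estimate (higher = worse) built from
    coded-parameter statistics; the combination here is a placeholder, not
    the patented mapping, and no masking adjustment is applied.

    step_sizes   -- quantiser step size used for each coded block
    coeff_counts -- number of non-zero transform coefficients per block
    """
    avg_step = sum(step_sizes) / len(step_sizes)
    frac_single = sum(1 for c in coeff_counts if c == 1) / len(coeff_counts)
    return w_step * avg_step + w_sparse * frac_single
```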

Proceedings ArticleDOI
20 Jan 2003
TL;DR: The HEterogeneous Receiver-Oriented (HeRO) Broadcasting protocol allows receivers of various communication capabilities to share the same periodic broadcast, and therefore enjoy the same video quality while requiring very little buffer space.
Abstract: Video-on-Demand is undoubtedly a promising technology for many important applications. Several periodic broadcast techniques have been proposed for the cost-effective implementation of such systems. However, the once-and-for-all implementation strategies of these broadcast schemes imply a common bandwidth requirement for all the clients. Multiresolution techniques address this issue by sacrificing video quality. We present an alternative approach which does not have this drawback. Our protocol, the HEterogeneous Receiver-Oriented (HeRO) Broadcasting, allows receivers of various communication capabilities to share the same periodic broadcast, and therefore enjoy the same video quality while requiring very little buffer space. This is achieved using a new data segmentation scheme with a surprising property. We present the broadcast technique, and compare its performance with that of existing methods.

Proceedings ArticleDOI
16 Jun 2003
TL;DR: This paper presents a subjective method and an objective method for combining multiple subjective data sets and demonstrates that the objective method can be used as an effective substitute for the expensive and time consuming subjective meta-test.
Abstract: International recommendations for subjective video quality assessment (e.g., ITU-R BT.500-11) include specifications for how to perform many different types of subjective tests. In addition to displaying the video sequences in different ways, subjective tests also have different rating scales, different words associated with these scales, and many other test variables that change from one laboratory to another (e.g., viewer expertise and criticality, cultural differences, physical test environments). Thus, it is very difficult to directly compare or combine results from two or more subjective experiments. The ability to compare and combine results from multiple subjective experiments would greatly benefit developers and users of video technology since standardized subjective databases could be expanded upon to include new source material and past measurement results could be related to newer measurement results. This paper presents a subjective method and an objective method for combining multiple subjective data sets. The subjective method utilizes a large meta-test with selected video clips from each subjective data set. The objective method utilizes the functional relationships between objective video quality metrics (extracted from the video sequences) and corresponding subjective mean opinion scores (MOSs). The objective mapping algorithm, called the iterated nested least-squares algorithm (INLSA), relates two or more independent data sets that are themselves correlated with some common intermediate variables (i.e., the objective video quality metrics). We demonstrate that the objective method can be used as an effective substitute for the expensive and time-consuming subjective meta-test.
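A heavily simplified version of the objective alignment step can be sketched as follows: fit an affine relation between a shared objective metric and each data set's MOS, then project every MOS onto the metric's common scale. The real INLSA iterates between fitting metric weights and per-data-set gains and offsets over several metrics; this single-metric, single-pass version is only meant to show the shape of the computation:

```python
import numpy as np

def align_datasets(metric: dict, mos: dict):
    """Map each data set's MOS onto a common scale via one shared objective
    metric. Simplified stand-in for INLSA: one metric, one least-squares fit
    per data set, no iteration.

    metric[name], mos[name] -- equal-length arrays of per-clip values.
    """
    common = {}
    for name in mos:
        x = np.asarray(metric[name], dtype=float)
        y = np.asarray(mos[name], dtype=float)
        a, b = np.polyfit(x, y, 1)        # least-squares fit: MOS ~ a*metric + b
        common[name] = (y - b) / a        # project MOS back onto the metric scale
    return common
```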

Proceedings ArticleDOI
09 Jul 2003
TL;DR: In this paper, the authors investigate quality adaptation of the layered VBR video generated by MPEG-4 FGS and develop a quality adaptation scheme that maximizes perceptual video quality through minimizing quality variation while at the same time increasing the usage of available bandwidth.
Abstract: Dynamic behavior of the Internet's transmission resources makes it difficult to provide perceptually good quality of streaming video. MPEG-4 Fine-Grained Scalable coding is proposed to deal with this problem by distributing the data in enhancement layers over a wide range of bit rates. However, encoded video also exhibits significant data rate variability to provide a consistent quality video. We are, therefore, faced with the problem of trying to accommodate the mismatch between the available bandwidth variability and the encoded video variability. In this paper, we investigate quality adaptation of the layered VBR video generated by MPEG-4 FGS. Our goal is to develop a quality adaptation scheme that maximizes perceptual video quality through minimizing quality variation while at the same time increasing the usage of available bandwidth. We develop an optimal adaptation scheme and an online heuristic based on whether the network conditions are known a priori. Experimental results show that the online heuristic as well as the optimal adaptation algorithm provide consistent video quality when used over both TFRC and TCP.

Journal ArticleDOI
TL;DR: It is demonstrated that the network resources consumed by an individual user in a spread-spectrum CDMA network can be taken as the product of the allocated source-coding rate R_s and the energy per bit normalized to the multiple-access interference noise density γ_b.

Abstract: We consider future generation wireless code-division multiple-access (CDMA) cellular networks supporting heterogeneous compressed video traffic and investigate transport schemes for maximizing the number of users that can be supported in a single cell while simultaneously maximizing the reconstructed video quality of individual users. More specifically, we demonstrate that the network resources consumed by an individual user in a spread-spectrum CDMA network can be taken as the product of the allocated source-coding rate R_s and the energy per bit normalized to the multiple-access interference noise density γ_b. We propose a joint source coding and power control (JSCPC) approach for allocating these two quantities to an individual user, subject to a constraint on the total available bandwidth, to simultaneously maximize the per-cell capacity while maximizing the quality of the delivered video to individual users. We demonstrate the efficacy of this approach using the ITU-T H.263+ video source coder, although the approach is generally applicable to other source-coding schemes as well. The results indicate a significant improvement in delivered quality-of-service (QoS), measured in terms of the end-user average peak signal-to-noise ratio, that can be achieved at a given level of network loading. Furthermore, we demonstrate that without an appropriate JSCPC strategy the traditional soft-capacity limit associated with CDMA networks is no longer present. Indeed, a precipitous decrease in performance can be expected with increasing load. We show that this behavior can be avoided with the proposed JSCPC approach, thereby significantly extending the useful capacity of the CDMA network while exhibiting a more graceful degradation pattern under increasing load.
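Schematically, the allocation problem described above can be written as below, where $Q_i$ stands for user $i$'s reconstructed-quality measure and $W$ for the total available bandwidth; this is a restatement of the constraint structure, not the paper's exact formulation.

```latex
\max_{\{R_{s,i},\,\gamma_{b,i}\}} \; \sum_{i=1}^{N} Q_i\!\left(R_{s,i}, \gamma_{b,i}\right)
\quad \text{subject to} \quad
\sum_{i=1}^{N} R_{s,i}\,\gamma_{b,i} \;\le\; W,
\qquad R_{s,i} \ge 0, \;\; \gamma_{b,i} \ge 0 .
```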

Proceedings ArticleDOI
12 Jun 2003
TL;DR: In this article, the authors carried out a number of subjective experiments using typical streaming content, codecs, bitrates and network conditions, and used the subjective data to corroborate the prediction accuracy of a real-time non-reference quality metric.
Abstract: We carried out a number of subjective experiments using typical streaming content, codecs, bitrates and network conditions. In an attempt to review subjective testing procedures for video streaming applications, we used both Single Stimulus Continuous Quality Evaluation (SSCQE) and Double Stimulus Impairment Scale (DSIS) methods on the same test material. We thus compare these testing methods and present an analysis of the experimental results in view of codec performance. Finally, we use the subjective data to corroborate the prediction accuracy of a real-time non-reference quality metric.

Proceedings ArticleDOI
16 Jun 2003
TL;DR: The results of training the NROQM using a large set of video sequences, which include degraded and enhanced video, show high correlation between objective and subjective scores, and the results of the first performance test show good objective-subjective correlations.
Abstract: In this paper we present a no-reference objective quality metric (NROQM) that has resulted from extensive research on impairment metrics, image feature metrics, and subjective image quality in several projects in Philips Research, and participation in the ITU Video Quality Experts Group. The NROQM is aimed at requirements including video algorithm development, embedded monitoring and control of image quality, and evaluation of different types of display systems. NROQM is built from metrics for desirable and non-desirable image features (sharpness, contrast, noise, clipping, ringing, and blocking artifacts), and accounts for their individual and combined contributions to perceived image quality. We describe our heuristic, incremental approach to modeling quality and training the NROQM, and its advantages to deal with imperfect data and imperfect metrics. The results of training the NROQM using a large set of video sequences, which include degraded and enhanced video, show high correlation between objective and subjective scores, and the results of the first performance test show good objective-subjective correlations as well. We also discuss issues that require further research such as fully content-independent metrics, measuring over-enhanced video quality, and the role of temporal impairment metrics.

Journal ArticleDOI
TL;DR: Experimental results show the validity of the proposed ELEP model and that the associated ULP scheme is robust to burst packet loss in the Internet, and graceful degradation of video quality is achieved by the proposed scheme as the packet loss rate of an Internet connection increases.
Abstract: This paper presents an unequal loss protection (ULP) scheme for robust transmission of motion compensated video over the Internet. By exploiting the temporal dependency between frames, forward error correction (FEC) codes across packets are assigned to different frames in a group of pictures in the sense of minimizing the effect of error propagation, thus improving video quality significantly. To achieve optimal allocation of FEC codes, we formulate the effect of packet loss on video quality degradation as an expected length of error propagation (ELEP) model, which makes sense intuitively, as fewer frames corrupted implies better quality of reconstructed video. Experimental results show the validity of the proposed ELEP model and that the associated ULP scheme is robust to burst packet loss in the Internet. More importantly, graceful degradation of video quality is achieved by the proposed scheme as the packet loss rate of an Internet connection increases.
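The underlying intuition is that the earlier a frame sits in the group of pictures, the more successors its loss corrupts, so it deserves more parity packets. A toy allocation sketch; the linear weighting stands in for the paper's ELEP-driven optimisation, and the parity budget is an arbitrary example:

```python
def allocate_fec(gop_size: int, total_parity: int):
    """Split a parity-packet budget across the frames of one GOP, giving more
    protection to earlier frames because their loss propagates further. The
    linear weighting is a placeholder for an ELEP-driven optimisation."""
    weights = [gop_size - i for i in range(gop_size)]    # frame 0 (the I-frame) weighs most
    total_w = sum(weights)
    parity = [total_parity * w // total_w for w in weights]
    for i in range(total_parity - sum(parity)):          # hand rounding leftovers to the earliest frames
        parity[i % gop_size] += 1
    return parity

print(allocate_fec(gop_size=9, total_parity=18))         # front-loaded: earlier frames get more parity
```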

Proceedings ArticleDOI
15 Dec 2003
TL;DR: This paper analyzes the statistical characteristics of motion vectors (MVs) and the cost function (SAD) in motion estimation (ME) for H.264, and proposes four efficient predictors and an early-termination strategy that preserve video quality while greatly reducing computational complexity.

Abstract: This paper analyzes the statistical characteristics of motion vectors (MVs) and the cost function (the sum of absolute differences, SAD) in the motion estimation (ME) of H.264, and proposes four efficient predictors and an early-termination strategy for ME, which preserve video quality while greatly reducing computational complexity. Simulation results from QCIF format to HD (high definition) format show that the proposed method performs well for H.264, and it is also easily applicable to other standard platforms. This method has been adopted as part of the fast motion estimation in the JVT test model.
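The general pattern, check a cheap motion-vector predictor first and stop as soon as its SAD is already good enough, can be sketched as follows; the median predictor, the threshold and the fallback search are illustrative, not the paper's four predictors or its adaptive thresholds:

```python
import numpy as np

def sad(cur, ref, bx, by, dx, dy, block=16):
    """Sum of absolute differences for one candidate motion vector.
    Assumes the displaced block stays inside the reference frame."""
    a = cur[by:by + block, bx:bx + block].astype(np.int32)
    b = ref[by + dy:by + dy + block, bx + dx:bx + dx + block].astype(np.int32)
    return int(np.abs(a - b).sum())

def predict_then_search(cur, ref, bx, by, neighbour_mvs, fallback_search, threshold=512):
    """Try the median of the neighbouring blocks' motion vectors first and
    terminate early if its SAD is below the threshold; otherwise hand over to
    a slower search. Threshold and predictor set are illustrative only."""
    px = int(np.median([mv[0] for mv in neighbour_mvs]))
    py = int(np.median([mv[1] for mv in neighbour_mvs]))
    cost = sad(cur, ref, bx, by, px, py)
    if cost < threshold:
        return (px, py), cost                    # early termination: predictor is good enough
    return fallback_search(cur, ref, bx, by)     # e.g. a full or diamond search
```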