
Showing papers on "Video quality published in 2002"


Journal ArticleDOI
TL;DR: An analytic solution for adaptive intra mode selection and joint source-channel rate control under time-varying wireless channel conditions is derived and significantly improves the end-to-end video quality in wireless video coding and transmission.
Abstract: We first develop a rate-distortion (R-D) model for DCT-based video coding incorporating the macroblock (MB) intra refreshing rate. For any given bit rate and intra refreshing rate, this model is capable of estimating the corresponding coding distortion even before a video frame is coded. We then present a theoretical analysis of the picture distortion caused by channel errors and the subsequent inter-frame propagation. Based on this analysis, we develop a statistical model to estimate the channel-error-induced distortion for different channel conditions and encoder settings. The proposed analytic model mathematically describes the complex behavior of channel errors in a video coding and transmission system. Unlike other experimental approaches for distortion estimation reported in the literature, this analytic model has very low computational complexity and implementation cost, which are highly desirable in wireless video applications. Simulation results show that this model is able to accurately estimate the channel-error-induced distortion with a minimum delay in processing. Based on the proposed source coding R-D model and the analytic channel-distortion estimation, we derive an analytic solution for adaptive intra mode selection and joint source-channel rate control under time-varying wireless channel conditions. Extensive experimental results demonstrate that this scheme significantly improves the end-to-end video quality in wireless video coding and transmission.
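The trade-off that drives the adaptive intra mode selection can be illustrated with a toy recursion (our illustration, not the paper's actual model): lost macroblocks add concealment distortion, while intra refreshing attenuates the error propagated from earlier frames.

```python
def channel_distortion(p, beta, d_ec, leak=0.9, frames=30):
    """Toy recursion for expected channel-induced distortion.

    Each frame, a fraction p of macroblocks is lost, adding concealment
    distortion d_ec; error propagated from earlier frames survives in the
    correctly received fraction, attenuated by the intra refreshing rate
    beta and a spatial-filtering leakage factor. All names are ours.
    """
    d = 0.0
    for _ in range(frames):
        d = p * d_ec + (1.0 - p) * (1.0 - beta) * leak * d
    return d
```

With full intra refresh (beta = 1) nothing propagates and the distortion settles at p * d_ec; raising beta lowers the steady state at the cost of a higher source bit rate, which is exactly the balance a joint source-channel rate controller must strike.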

390 citations


Journal ArticleDOI
TL;DR: A new framework for rate-distortion (R-D) analysis is presented, where the coding rate R and distortion D are considered as functions of ρ, the percentage of zeros among the quantized transform coefficients.
Abstract: We present a new framework for rate-distortion (R-D) analysis, where the coding rate R and distortion D are considered as functions of ρ, the percentage of zeros among the quantized transform coefficients. Previously (see He, Z. et al., Int. Conf. Acoustics, Speech and Sig. Proc., 2001), we observed that, in transform coding of images and videos, the rate function R(ρ) is approximately linear. Based on this linear rate model, a simple and unified rate control algorithm was proposed for all standard video coding systems, such as MPEG-2, H.263, and MPEG-4. We further develop a distortion model and an optimum bit allocation scheme in the ρ domain. This bit allocation scheme is applied to MPEG-4 video coding to allocate the available bits among different video objects. The bit target of each object is then met by our ρ-domain rate control algorithm. When coupled with a macroblock classification scheme, the above bit allocation and rate control scheme can also be applied to other video coding systems, such as H.263, at the macroblock level. Our extensive experimental results show that the proposed algorithm controls the encoder bit rate very accurately and improves the video quality significantly (by up to 1.5 dB).
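The linear rate model at the heart of the ρ-domain framework can be sketched as follows; the helper names and the least-squares fit through the origin are our assumptions, not the authors' implementation:

```python
import numpy as np

def fraction_of_zeros(coeffs, q):
    """Fraction rho of transform coefficients quantized to zero at step q."""
    return float(np.mean(np.round(np.asarray(coeffs) / q) == 0))

def fit_rho_model(rhos, rates):
    """Fit R(rho) = theta * (1 - rho) by least squares through the origin;
    theta is interpreted as bits per nonzero coefficient."""
    x = 1.0 - np.asarray(rhos, dtype=float)
    y = np.asarray(rates, dtype=float)
    return float(np.dot(x, y) / np.dot(x, x))

def predict_rate(theta, rho):
    """Predicted coding rate at a given percentage of zeros rho."""
    return theta * (1.0 - rho)
```

Because R(ρ) is close to linear while the rate-versus-quantizer curve is not, rate control reduces to choosing the quantizer whose resulting ρ hits the bit target.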

279 citations


Journal ArticleDOI
TL;DR: A significant enhancement of the method by means of a new neural approach, the random NN model, and its learning algorithm is reported, both of which offer better performance for the application.
Abstract: An important and unsolved problem today is that of automatic quantification of the quality of video flows transmitted over packet networks. In particular, the ability to perform this task in real time (typically for streams sent themselves in real time) is especially interesting. The problem is still unsolved because there are many parameters affecting video quality, and their combined effect is not well identified and understood. Among these parameters are the source bit rate, the encoded frame type, the frame rate at the source, the packet loss rate in the network, etc. Only subjective evaluations give good results but, by definition, they are not automatic. We have previously explored the possibility of using artificial neural networks (NNs) to automatically quantify the quality of video flows, and we showed that they can give results well correlated with human perception. In this paper, our goal is twofold. First, we report on a significant enhancement of our method by means of a new neural approach, the random NN model, and its learning algorithm, both of which offer better performance for our application. Second, we follow our approach to study and analyze the behavior of video quality for wide-range variations of a set of selected parameters. This may help in developing control mechanisms to deliver the best possible video quality given the current network situation, and in better understanding QoS aspects in multimedia engineering.

265 citations


Journal ArticleDOI
TL;DR: The experimental results show that the proposed method of measuring blocking artifacts is effective and stable across a wide variety of images, and the proposed blocking-artifact reduction method exhibits satisfactory performance as compared to other post-processing techniques.
Abstract: Blocking artifacts continue to be among the most serious defects that occur in images and video streams compressed to low bit rates using block discrete cosine transform (DCT)-based compression standards (e.g., JPEG, MPEG, and H.263). It is of interest to be able to numerically assess the degree of blocking artifacts in a visual signal, for example, in order to objectively determine the efficacy of a compression method, or to discover the quality of video content being delivered by a web server. We propose new methods for efficiently assessing, and subsequently reducing, the severity of blocking artifacts in compressed image bitstreams. The method is blind, and operates only in the DCT domain. Hence, it can be applied to unknown visual signals, and it is efficient since the signal need not be decompressed. In the algorithm, blocking artifacts are modeled as 2-D step functions. A fast DCT-domain algorithm extracts all parameters needed to detect the presence of, and estimate the amplitude of, blocking artifacts, by exploiting several properties of the human visual system. Using the estimate of blockiness, a novel DCT-domain method is then developed which adaptively reduces detected blocking artifacts. Our experimental results show that the proposed method of measuring blocking artifacts is effective and stable across a wide variety of images. Moreover, the proposed blocking-artifact reduction method exhibits satisfactory performance compared to other post-processing techniques. The proposed technique has a low computational cost and hence can be used for real-time image/video quality monitoring and control, especially in applications where it is desired that the image/video data be processed directly in the DCT domain.
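The underlying idea of measuring discontinuities at 8×8 block boundaries can be sketched in the pixel domain (the paper itself works blindly in the DCT domain; this simplified ratio is our illustration):

```python
import numpy as np

def blockiness(img, block=8):
    """Ratio of the mean absolute difference across vertical block
    boundaries to the mean absolute difference inside blocks; values
    well above 1 suggest visible blocking artifacts."""
    img = np.asarray(img, dtype=float)
    dh = np.abs(np.diff(img, axis=1))      # horizontal neighbor differences
    boundary = dh[:, block - 1::block]     # columns straddling block edges
    inside = np.delete(dh, np.s_[block - 1::block], axis=1)
    return float(boundary.mean() / (inside.mean() + 1e-12))
```

A tiled image of constant 8×8 blocks scores very high, while a smooth gradient scores near 1, mirroring the step-function model of blocking.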

250 citations


Journal ArticleDOI
TL;DR: New methods of performing selective encryption and spatial/frequency shuffling of compressed digital content that maintain syntax compliance after content has been secured are introduced.
Abstract: We introduce new methods of performing selective encryption and spatial/frequency shuffling of compressed digital content that maintain syntax compliance after content has been secured. The tools described have been proposed to the MPEG-4 Intellectual Property Management and Protection (IPMP) standardization group and have been adopted into the MPEG-4 IPMP Final Proposed Draft Amendment (FPDAM). We describe the application of the new methods to the protection of MPEG-4 video content in the wireless environment, and illustrate how they leverage established encryption algorithms to protect only the information fields in the bitstream that are critical to the reconstructed video quality, while maintaining compliance with the MPEG-4 video syntax. This reduces the amount of data to be encrypted and preserves many of the desirable properties of the unprotected bitstreams, such as error resiliency and network friendliness. The encrypted content bitstream works with many existing random access, network bandwidth adaptation, and error control techniques that have been developed for standard-compliant compressed video, thus making it especially suitable for wireless multimedia applications. Standard compliance also allows subsequent signal processing techniques to be applied to the encrypted bitstream.

226 citations


01 Jun 2002
TL;DR: The goal of this report is to provide a complete description of the ITS video quality metric (VQM) algorithms and techniques, which provide close approximations to the overall quality impressions, or mean opinion scores, of digital video impairments that have been graded by panels of viewers.
Abstract: Objective metrics for measuring digital video performance are required by Government and industry for specification of system performance requirements, comparison of competing service offerings, service level agreements, network maintenance, and optimization of the use of limited network resources such as transmission bandwidth. To be accurate, digital video quality measurements must be based on the perceived quality of the actual video being received by the users of the digital video system rather than the measured quality of traditional video test signals (e.g., color bar). This is because the performance of digital video systems is variable and depends upon the dynamic characteristics of both the original video (e.g., spatial detail, motion) and the digital transmission system (e.g., bit rate, error rate). The goal of this report is to provide a complete description of the ITS video quality metric (VQM) algorithms and techniques. The ITS automated objective measurement algorithms provide close approximations to the overall quality impressions, or mean opinion scores, of digital video impairments that have been graded by panels of viewers.

196 citations


Journal ArticleDOI
TL;DR: This work develops unique algorithms for assessing the quality of foveated image/video data using a model of human visual response and demonstrates that quality vs. compression is enhanced considerably by the foveation approach.
Abstract: Most image and video compression algorithms that have been proposed to improve picture quality relative to compression efficiency have either been designed based on objective criteria such as signal-to-noise-ratio (SNR) or have been evaluated, post-design, against competing methods using an objective sample measure. However, existing quantitative design criteria and numerical measurements of image and video quality both fail to adequately capture those attributes deemed important by the human visual system, except, perhaps, at very low error rates. We present a framework for assessing the quality and determining the efficiency of foveated and compressed images and video streams. Image foveation is a process of nonuniform sampling that accords with the acquisition of visual information at the human retina. Foveated image/video compression algorithms seek to exploit this reduction of sensed information by nonuniformly reducing the resolution of the visual data. We develop unique algorithms for assessing the quality of foveated image/video data using a model of human visual response. We demonstrate these concepts on foveated, compressed video streams using modified (foveated) versions of H.263 that are standard-compliant. We find that quality vs. compression is enhanced considerably by the foveation approach.

178 citations


Journal ArticleDOI
07 Aug 2002
TL;DR: Investigations are conducted to simplify and refine a vision-model-based video quality metric without compromising its prediction accuracy and the results show a strong correlation between the objective blocking ratings and the mean opinion scores on blocking artifacts.
Abstract: In this paper, investigations are conducted to simplify and refine a vision-model-based video quality metric without compromising its prediction accuracy. Unlike other vision-model-based quality metrics, the proposed metric is parameterized using subjective quality assessment data recently provided by the Video Quality Experts Group. The quality metric is able to generate a perceptual distortion map for each and every video frame. A perceptual blocking distortion metric (PBDM) is introduced which utilizes this simplified quality metric. The PBDM is formulated based on the observation that blocking artifacts are noticeable only in certain regions of a picture. A method to segment blocking-dominant regions is devised, and perceptual distortions in these regions are summed to form an objective measure of blocking artifacts. Subjective and objective tests are conducted, and the performance of the PBDM is assessed by a number of measures such as the Spearman rank-order correlation, the Pearson correlation, and the average absolute error. The results show a strong correlation between the objective blocking ratings and the mean opinion scores on blocking artifacts.
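The two correlation measures used to assess the PBDM are standard; a minimal sketch:

```python
import numpy as np

def pearson(x, y):
    """Pearson linear correlation coefficient."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xm, ym = x - x.mean(), y - y.mean()
    return float((xm * ym).sum() / np.sqrt((xm ** 2).sum() * (ym ** 2).sum()))

def spearman(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks
    (this simple ranking assumes no ties)."""
    rank = lambda v: np.argsort(np.argsort(v)).astype(float)
    return pearson(rank(np.asarray(x)), rank(np.asarray(y)))
```

Spearman rewards any monotone agreement between objective ratings and mean opinion scores, while Pearson additionally demands linearity, which is why both are usually reported together.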

135 citations


Proceedings ArticleDOI
Zhou Wang, Ligang Lu1, Alan C. Bovik
10 Dec 2002
TL;DR: A new philosophy in designing image/video quality metrics is followed, which uses structural distortion as an estimation of perceived visual distortion in order to develop a new approach for video quality assessment.
Abstract: Objective image/video quality measures play important roles in various image/video processing applications, such as compression, communication, printing, analysis, registration, restoration and enhancement. Most proposed quality assessment approaches in the literature are error-sensitivity-based methods. We follow a new philosophy in designing image/video quality metrics, which uses structural distortion as an estimate of perceived visual distortion, and develop a new approach for video quality assessment. Experiments on the Video Quality Experts Group (VQEG) test data set show that the new quality measure has higher correlation with subjective quality measurement than the methods proposed in VQEG's Phase I tests for full-reference video quality assessment.
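The structural-distortion philosophy is easiest to see in the authors' earlier universal image quality index, which combines loss of correlation, luminance distortion, and contrast distortion in a single product; a global (non-windowed) sketch, not the video metric itself:

```python
import numpy as np

def universal_quality_index(x, y):
    """Wang-Bovik universal quality index computed over whole arrays
    (practical versions apply it in a sliding window and average)."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    # Product of correlation, mean-luminance, and contrast comparisons
    return float(4 * cov * mx * my / ((vx + vy) * (mx ** 2 + my ** 2)))
```

The index equals 1 only for an undistorted signal and falls as structure, luminance, or contrast diverges, unlike MSE, which is blind to how the error is arranged.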

135 citations


Journal ArticleDOI
TL;DR: This work investigates the relations among rate, distortion, and power consumption in video coding and proposes a power-minimized bit-allocation scheme that considers the processing power for source coding and channel coding jointly with the transmission power.
Abstract: Video communication over wireless links using handheld devices is a challenging task due to the time-varying characteristics of the wireless channels and limited battery resources. Rate-distortion (RD) analysis plays a key role in video coding and communication systems, and usually the RD relation does not assume any power constraint. We investigate the relations among rate, distortion, and power consumption. Based on those relations, we propose a power-minimized bit-allocation scheme that considers the processing power for source coding and channel coding jointly with the transmission power. The total bits are allocated between source and channel coders, according to wireless channel conditions and video quality requirements, to minimize the total power consumption for a single user and for a group of users in a cell, respectively. Simulation results show that our proposed joint power-control and bit-allocation scheme achieves high power savings compared to the conventional scheme.

118 citations


Journal ArticleDOI
TL;DR: A perceptual video quality system is proposed that uses a linear combination of three indicators: the “edginess” of the luminance, the normalized color error, and the temporal decorrelation; the model showed the highest variance-weighted regression overall correlation of all models.
Abstract: Modern video coding systems such as ISO MPEG-1/2/4 exploit properties of the human visual system to reduce the bit rate at which a video sequence is coded, given a certain required video quality. As a result, to the degree that such exploitation is successful, accurate prediction of the quality of the output video of such systems should also take the human visual system into account. In this paper, we propose a perceptual video quality system that uses a linear combination of three indicators: the “edginess” of the luminance, the normalized color error, and the temporal decorrelation. In the benchmark by the Video Quality Experts Group (VQEG), a combined ITU-T and ITU-R expert group, the model showed the highest variance-weighted regression overall correlation of all models.

Proceedings ArticleDOI
09 Oct 2002
TL;DR: The 3D video recorder is presented, a system capable of recording, processing, and playing three-dimensional video from multiple points of view, and the player builds upon point-based rendering techniques and is thus capable of rendering high-quality images in real-time.
Abstract: We present the 3D video recorder, a system capable of recording, processing, and playing three-dimensional video from multiple points of view. We first record 2D video streams from several synchronized digital video cameras and store pre-processed images to disk. An off-line processing stage converts these images into a time-varying three-dimensional hierarchical point-based data structure and stores this 3D video to disk. We show how we can trade-off 3D video quality with processing performance and devise efficient compression and coding schemes for our novel 3D video representation. A typical sequence is encoded at less than 7 megabits per second at a frame rate of 8.5 frames per second. The 3D video player decodes and renders 3D videos from hard-disk in real-time, providing interaction features known from common video cassette recorders, like variable-speed forward and reverse, and slow motion. 3D video playback can be enhanced with novel 3D video effects such as freeze-and-rotate and arbitrary scaling. The player builds upon point-based rendering techniques and is thus capable of rendering high-quality images in real-time. Finally, we demonstrate the 3D video recorder on multiple real-life video sequences.

Journal ArticleDOI
TL;DR: The proposed framework, which is referred to as adaptive motion-compensation FGS (AMC-FGS), provides improved video quality and the new scalability structures provide the FGS framework with the flexibility to provide tradeoffs between resilience, higher coding efficiency and terminal complexity for more efficient wireless transmission.
Abstract: Transmission of video over wireless and mobile networks requires a scalable solution that is capable of adapting to the varying channel conditions in real-time (bit-rate scalability). Furthermore, video content needs to be coded in a scalable fashion to match the capabilities of a variety of devices (complexity scalability). These two scalability properties provide the flexibility that is necessary to satisfy the "anywhere, anytime and anyone" network paradigm of wireless systems. MPEG-4 fine-granular-scalability (FGS) is a flexible low-complexity solution for video streaming over heterogeneous networks (e.g., the Internet and wireless networks) and is highly resilient to packet losses. However, the flexibility and packet-loss resilience come at the expense of decreased coding efficiency compared with nonscalable coding. A novel scalable video-coding framework and corresponding compression methods for wireless video streaming are introduced. Building on the FGS approach, the proposed framework, which we refer to as adaptive motion-compensation FGS (AMC-FGS), improves video quality by up to 2 dB. Furthermore, the new scalability structures provide the FGS framework with the flexibility to trade off resilience, coding efficiency, and terminal complexity for more efficient wireless transmission.

Proceedings ArticleDOI
07 Aug 2002
TL;DR: Simulation results show that the implemented QoS control schemes, which use feedback from the agent to control transmission rate and robustness, provide better video quality than a simple rate control mechanism.
Abstract: We propose a QoS control architecture for mobile multimedia streaming in which RTP monitoring agents report QoS information to media servers. The RTP monitoring agents lie midway between wired networks and radio links, monitor RTP packets sent from media servers to mobile terminals, and report quality information to media servers so that the servers can realize network-adaptive QoS control. By analyzing the information sent from the agents, the servers can distinguish quality degradation caused by network congestion from that caused by radio link errors, and can improve service quality by controlling the transmission rate and robustness against packet loss. Simulation results show that our implemented QoS control schemes, which use the feedback from the agents to control transmission rate and robustness, provide better video quality than a simple rate control mechanism.

Journal ArticleDOI
TL;DR: This work proposes a low-delay interleaving scheme that uses the video encoder buffer as part of the interleaving memory, together with a conditional retransmission strategy that reduces the number of retransmissions.
Abstract: We consider the scenario of using Automatic Repeat reQuest (ARQ) retransmission for two-way low-bit-rate video communications over wireless Rayleigh fading channels. The low-delay constraint may require that a corrupted retransmitted packet not be retransmitted again; the resulting packet errors at the decoder degrade video quality. We propose a scheme to improve the video quality. First, we propose a low-delay interleaving scheme that uses the video encoder buffer as part of the interleaving memory. Second, we propose a conditional retransmission strategy that reduces the number of retransmissions. Simulation results show that our proposed scheme can effectively reduce the number of packet errors and improve the channel utilization. As a result, we reduce the number of skipped frames and obtain a peak signal-to-noise ratio improvement of up to about 4 dB compared to H.263 TMN-8.
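The PSNR figures quoted throughout these papers follow the usual definition; a minimal sketch:

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference frame and a
    decoded frame; infinite when the frames are identical."""
    err = np.asarray(ref, dtype=float) - np.asarray(test, dtype=float)
    mse = float(np.mean(err ** 2))
    return float('inf') if mse == 0.0 else 10.0 * np.log10(peak ** 2 / mse)
```

Because PSNR is logarithmic in the mean squared error, a 4 dB gain corresponds to cutting the MSE by more than half.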

Patent
Eric Barrau1
10 Jan 2002
TL;DR: In this article, a scalable video transcoding method for an input video signal coded according to the MPEG-2 video standard, resulting in four transcoding architectures, is presented; a cost-effective control strategy for the switches, based on an energy prediction of the coding error, is also proposed.
Abstract: The invention relates to a scalable video transcoding method for transcoding an input video signal (103) coded according to the MPEG-2 video standard, resulting in four transcoding architectures. Scalability is obtained by means of two switches (120) and (130) determining whether or not reconstruction (118) and motion compensation (128) of the coding error (119) are performed. Since each architecture thus defined has a different processing complexity, the overall processing resources available can be optimally used and minimized along a group of frames by transcoding parts of said frames in accordance with one of these four architectures, while ensuring good video quality of the transcoded signal (109). A cost-effective control strategy for said switches, based on an energy prediction of said coding error, is also proposed.

Proceedings ArticleDOI
10 Dec 2002
TL;DR: This paper presents a class of low-complexity video quality metrics based on the standard spatial observer (SSO) model, and suggests that local masking is a key feature to improve the correlation of the basic SSO model.
Abstract: Video quality metrics are intended to replace human evaluation with evaluation by machine. To accurately simulate human judgement, they must include some aspects of the human visual system. In this paper we present a class of low-complexity video quality metrics based on the standard spatial observer (SSO). In these metrics, the basic SSO model is improved with several additional features from the current human vision models. To evaluate the metrics, we make use of the data set recently produced by the Video Quality Experts Group (VQEG), which consists of subjective ratings of 160 samples of digital video covering a wide range of quality. For each metric we examine the correlation between its predictions and the subjective ratings. The results show that SSO-based models with local masking obtain the same degree of accuracy as the best metric considered by VQEG (P5), and significantly better correlations than the other VQEG models. The results suggest that local masking is a key feature to improve the correlation of the basic SSO model.

Patent
Michael Horowitz1, Rick Flott1
23 Aug 2002
TL;DR: A system and method for concealing video errors is presented, which encodes, reorders, and packetizes video information into video data packets for transmission over a communication network such that errors caused by lost video data packets are concealed when the system receives, depacketizes, orders, and decodes the packets.
Abstract: The present invention provides, in one embodiment, a system and method for concealing video errors. The system encodes, reorders, and packetizes video information into video data packets for transmission over a communication network such that the system conceals errors caused by lost video data packets when it receives, depacketizes, orders (915), and decodes the data packets. In one embodiment, the system and method encode and packetize video information such that adjacent macroblocks are not placed in the same video data packets (925). Additionally, the system and method may provide information accompanying the video data packets to facilitate the decoding process. An advantage of such a scheme is that errors due to video data packet loss are spatially distributed over a video frame. Thus, if regions of data surrounding a lost macroblock are successfully decoded, the decoder may predict motion vectors and spatial content with a higher degree of accuracy, which leads to higher video quality.

Journal ArticleDOI
TL;DR: Error control and power allocation for transmitting wireless video over CDMA networks are considered in conjunction with multiuser detection and a combined optimization problem is formulated and given the optimal joint rate and power allocations for each of these three receivers.
Abstract: Error control and power allocation for transmitting wireless video over CDMA networks are considered in conjunction with multiuser detection. We map a layered video bitstream to several CDMA fading channels and inject multiple source/parity layers into each of these channels at the transmitter. At the receiver, we employ a linear minimum mean-square error (MMSE) multiuser detector in the uplink and two types of blind linear MMSE detectors, i.e., the direct-matrix-inversion blind detector and the subspace blind detector, in the downlink, for demodulating the received data. For given constraints on the available bandwidth and transmit power, the transmitter determines the optimal power allocation among different CDMA fading channels and the optimal number of source and parity packets to send that offer the best video quality. We formulate a combined optimization problem and give the optimal joint rate and power allocation for each of these three receivers. Simulation results show a performance gain of up to 3.5 dB with joint optimization over rate optimization only.

Proceedings ArticleDOI
07 Aug 2002
TL;DR: Experimental results show higher than 1 dB PSNR gain and smaller PSNR variation by comparing the proposed CQVRC with current MPEG-4 FGS VM approach, where BL is encoded with MPEG- 4 VM rate control and uniform bit allocation is applied to EL.
Abstract: This paper presents a constant quality video rate control (CQVRC) scheme for MPEG-4 FGS (fine-grain scalability) video. The proposed scheme can mitigate quality variations among consecutive frames during periods of significant change in the video source or the transmission bandwidth. CQVRC utilizes both the non-scalable base layer (BL) and the scalable enhancement layer (EL) of MPEG-4 for smoother video quality. To take advantage of the relaxed delay requirement of streaming video, CQVRC for the BL exploits a larger decoder buffer, future frame information, and temporal scene segmentation. CQVRC for the EL is achieved by embedding one rate-distortion (R-D) pair in each bitplane to construct a piecewise linear R-D model. Experimental results show a PSNR gain of more than 1 dB and smaller PSNR variation when comparing the proposed CQVRC with the current MPEG-4 FGS VM approach, where the BL is encoded with MPEG-4 VM rate control and uniform bit allocation is applied to the EL.
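The piecewise linear R-D model for the enhancement layer amounts to interpolating between the per-bitplane (rate, distortion) anchors; a sketch (the function name is ours):

```python
import numpy as np

def distortion_at_rate(rd_pairs, target_rate):
    """Interpolate distortion at a target rate on the piecewise linear
    curve defined by one (rate, distortion) pair per bitplane."""
    rates, dists = zip(*sorted(rd_pairs))
    return float(np.interp(target_rate, rates, dists))
```

Inverting the same curve lets the controller find the rate needed to hold distortion roughly constant from frame to frame, which is the goal of constant-quality rate control.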

Proceedings ArticleDOI
01 Dec 2002
TL;DR: The most successful and computationally cheapest scheme obtains an accuracy of 82% on the task of picking the "consistent" speaker from a set including three confusers, and a final experiment demonstrates the potential utility of the scheme for speaker localization in video.
Abstract: This paper considers schemes for determining which of a set of faces on screen, if any, is producing speech in a video soundtrack. Whilst motivated by the TREC 2002 (Video Retrieval Track) monologue detection task, the schemes are also applicable to voice and face-based biometrics systems, for assessing lip synchronization quality in movie editing and computer animation, and for speaker localization in video. Several approaches are discussed: two implementations of a generic mutual-information-based measure of the degree of synchrony between signals, which can be used with or without prior speech and face detection, and a stronger model-based scheme which follows speech and face detection with an assessment of face and lip movement plausibility. Schemes are compared on a corpus of 1016 test cases containing multiple faces and multiple speakers, a test set 200 times larger than the nearest comparable test set of which we are aware. The most successful and computationally cheapest scheme obtains an accuracy of 82% on the task of picking the "consistent" speaker from a set including three confusers. A final experiment demonstrates the potential utility of the scheme for speaker localization in video.
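The generic synchrony measure can be sketched as a histogram estimate of mutual information between an audio feature (e.g., frame energy) and a per-face visual feature (e.g., mouth-region motion); the binning choices here are our assumptions, not the paper's implementation:

```python
import numpy as np

def mutual_information(a, b, bins=8):
    """Histogram estimate of mutual information (in bits) between two
    feature streams; higher values indicate stronger audio-visual
    synchrony."""
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    p = joint / joint.sum()
    px = p.sum(axis=1, keepdims=True)   # marginal of the audio feature
    py = p.sum(axis=0, keepdims=True)   # marginal of the visual feature
    nz = p > 0
    return float(np.sum(p[nz] * np.log2(p[nz] / (px @ py)[nz])))
```

The on-screen face whose visual feature maximizes this measure against the soundtrack would then be selected as the speaker.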

Patent
Jack L. Kouloheris1, Ligang Lu1, Zhou Wang1
13 Dec 2002
TL;DR: In this paper, a method to predict visual quality of a DCT (discrete cosine transform) based compressed image or video stream without referring to its source is proposed, based on an estimation of quantization errors using MPEG quantization scales and statistics of the inverse quantized DCT coefficients, a blind estimation of the 8×8 and 16×16 blocking effect, and an adaptive combination of the quantization error estimation and the blocking effect estimation using the MPEG motion vector information.
Abstract: A method to predict visual quality of a DCT (discrete cosine transform) based compressed image or video stream without referring to its source. When applied to an MPEG video stream, the method is based on (1) an estimation of quantization errors using MPEG quantization scales and statistics of the inverse quantized DCT coefficients, (2) a blind estimation of the 8×8 and 16×16 blocking effect, and (3) an adaptive combination of the quantization error estimation and the blocking effect estimation using the MPEG motion vector information. The method may be used in many applications, such as network video servers, switches and multiplexers for automatic quality monitoring and control of video services, video encoders, decoders, transcoders, and statistical multiplexers for picture quality optimization.

Journal ArticleDOI
TL;DR: A methodology using circular backpropagation (CBP) neural networks for the objective quality assessment of motion picture expert group (MPEG) video streams; the neural model provides a satisfactory, continuous-time approximation of actual scoring curves, validated statistically in terms of confidence analysis.
Abstract: The increasing use of compression standards in broadcasting digital TV has raised the need for established criteria to measure perceived quality. Novel methods must take into account the specific artifacts introduced by digital compression techniques. This paper presents a methodology using circular backpropagation (CBP) neural networks for the objective quality assessment of motion picture expert group (MPEG) video streams. Objective features are continuously extracted from compressed video streams on a frame-by-frame basis; they feed the CBP network estimating the corresponding perceived quality. The resulting adaptive modeling of subjective perception supports a real-time system for monitoring displayed video quality. The overall system mimics perception but does not require an analytical model of the underlying physical phenomenon. The ability to process compressed video streams represents a crucial advantage over existing approaches, as avoiding the decoding process greatly enhances the system's real-time performance. Experimental evidence confirmed the approach validity. The system was tested on real test videos; they included different contents ranging from fiction to sport. The neural model provided a satisfactory, continuous-time approximation for actual scoring curves, which was validated statistically in terms of confidence analysis. As expected, videos with slow-varying contents such as fiction featured the best performances.

Proceedings ArticleDOI
07 Nov 2002
TL;DR: A new technique for watermarking of MPEG compressed video streams is proposed, which is fast and reliable, and is suitable for copyright protection and real-time content authentication applications.
Abstract: A new technique for watermarking of MPEG compressed video streams is proposed. The watermarking scheme operates directly in the domain of MPEG program streams. Perceptual models are used during the embedding process in order to preserve the video quality. The watermark is embedded in the compressed domain and is detected without the use of the original video sequence. Experimental evaluation demonstrates that the proposed scheme is able to withstand a variety of attacks. The resulting watermarking system is fast and reliable, and is suitable for copyright protection and real-time content authentication applications.
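The compressed-domain embedding and blind detection described here follow the general spread-spectrum pattern. The sketch below modulates a vector of DCT coefficients with a key-derived ±1 sequence scaled by coefficient magnitude (a crude stand-in for the paper's perceptual models; all names are hypothetical, not the paper's scheme).

```python
import numpy as np

def embed(coeffs, key, alpha=0.1):
    """Embed a watermark by modulating DCT coefficients with a
    key-derived +/-1 sequence, scaled by local magnitude so strong
    coefficients carry stronger (but less visible) modulation."""
    w = np.random.default_rng(key).choice([-1.0, 1.0], size=coeffs.shape)
    return coeffs + alpha * np.abs(coeffs) * w

def detect(coeffs, key):
    """Blind detection: correlate coefficients with the key sequence;
    a markedly positive response indicates the watermark is present."""
    w = np.random.default_rng(key).choice([-1.0, 1.0], size=coeffs.shape)
    return float(np.mean(coeffs * w))

rng = np.random.default_rng(0)
coeffs = rng.normal(scale=10.0, size=16384)   # stand-in for DCT coefficients
marked = embed(coeffs, key=42)
```

Detection needs neither the original sequence nor decoding to pixels, which is what makes this style of scheme attractive for real-time authentication of program streams.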

Proceedings ArticleDOI
07 Nov 2002
TL;DR: This paper introduces two video quality assessment models, one of which requires the original video as a reference and is a structural distortion measurement based approach, which is different from traditional error sensitivity based methods.
Abstract: There has been an increasing need recently to develop objective quality measurement techniques that can predict perceived video quality automatically. This paper introduces two video quality assessment models. The first one requires the original video as a reference and is a structural distortion measurement based approach, which is different from traditional error sensitivity based methods. Experiments on the video quality experts group (VQEG) test data set show that the new quality measure has higher correlation with subjective quality evaluation than the proposed methods in VQEG's Phase I tests for full-reference video quality assessment. The second model is designed for quality estimation of compressed MPEG video stream without referring to the original video sequence. Preliminary experimental results show that it correlates well with our full-reference quality assessment model.
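The structural-distortion philosophy behind the first model can be illustrated with the universal quality index of Wang and Bovik, which rewards preserved correlation, luminance, and contrast rather than penalizing raw pixel error. This global, single-window version is only a sketch; the metric is normally computed over local sliding windows and averaged.

```python
import numpy as np

def quality_index(x, y):
    """Universal quality index in [-1, 1]; equals 1 iff x and y match.
    Product of correlation, luminance, and contrast similarity terms:
    Q = 4*cov*mx*my / ((vx + vy) * (mx^2 + my^2))."""
    x = x.astype(np.float64).ravel()
    y = y.astype(np.float64).ravel()
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return 4 * cov * mx * my / ((vx + vy) * (mx ** 2 + my ** 2))

rng = np.random.default_rng(0)
reference = rng.uniform(1, 255, size=(64, 64))
distorted = reference + rng.normal(0, 10, size=(64, 64))
```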

Proceedings ArticleDOI
10 Dec 2002
TL;DR: An adaptive video transcoding method is obtained, which combines intra-coding and inter-coding with a fast motion vector reestimation method to strike a good balance between computational complexity and transcoded video quality.
Abstract: Fast forward and fast reverse playbacks are two common video browsing functions provided in many analog and digital video players. They help users quickly find and access video segments of interest by scanning through the content of a video at a faster than normal playback speed. We propose a video transcoding approach to realizing fast forward and reverse video playbacks by generating a new compressed video from a pre-coded video. To reduce the computational requirements, we design and compare several fast algorithms for estimating the motion vectors required in the transcoded video. To accommodate changes due to frame skipping for fast video playback, we also alter the group-of-pictures structure of the transcoded video. In addition, subjective tests are conducted to assess the minimum video peak-signal-to-noise-ratio degradation that is perceptible to viewers at different fast playback speeds. To this end, we obtain an adaptive video transcoding method, which combines intra-coding and inter-coding with a fast motion vector reestimation method to strike a good balance between computational complexity and transcoded video quality. Experimental results are reported to show the efficacy of the proposed method.
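A common family of fast motion vector reestimation methods for frame-skipping transcoders chains the pre-coded per-frame vectors instead of running a new motion search. The sketch below illustrates the idea under an assumed data layout (one dict of 16×16-block vectors per skipped frame; all names are hypothetical, not this paper's specific algorithms).

```python
def compose_motion(mv_fields, block):
    """Approximate the motion of `block` across several skipped frames by
    following it through each field and summing the displacements.
    mv_fields: one dict per frame mapping (bx, by) -> (dx, dy) in pixels."""
    bx, by = block
    dx_total = dy_total = 0
    for field in mv_fields:
        dx, dy = field.get((bx, by), (0, 0))
        dx_total += dx
        dy_total += dy
        # continue from the 16x16 block the displaced block mostly covers
        bx += int(round(dx / 16))
        by += int(round(dy / 16))
    return dx_total, dy_total

# Two consecutive motion fields: the block moves one block right, then on.
field1 = {(0, 0): (16, 0)}
field2 = {(1, 0): (4, 2)}
```

A composed vector of this kind is typically used as the centre of a small refinement search, trading a little accuracy against the cost of full reestimation.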

Patent
21 Mar 2002
TL;DR: In this paper, a block-based motion-compensated frame interpolation method was proposed using a blockbased video coder operating in low bit rates, which maps from block-wise motion to pixelwise motion in a motion vector mapping unit.
Abstract: A block-based motion-compensated frame interpolation method and apparatus using a block-based video coder operating in low bit rates. Smooth movement of objects between video frames can be obtained without the complexity of pixelwise motion estimation that is present in standard MCI. An additional motion search for interpolating all of the individual pixel trajectories is not required because the interpolation uses block-based motion vector information from a standard codec such as H.26x/MPEG. Video quality is improved by increasing smoothness and the frame rate is increased without a substantial increase in the computational complexity. The proposed block-based MCI method maps from block-wise motion to pixel-wise motion in a motion vector mapping unit. A morphological closure operation and pattern block refinement segmentation of the blocks are provided to close holes in the moving object block and replace the morphologically closed motion block with the most similar pattern selected from a group of 34 patterns. Experimental results show that the visual quality of coded low-bit-rate video can be significantly improved as compared to the frame repetition scheme at the expense of a small increase in the complexity of the decoder.
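The block-to-pixel stage can be sketched as follows: flag the blocks with nonzero motion, then apply a morphological closing so isolated holes inside the moving-object block map are filled before the pattern-based refinement. The helpers below use a 4-neighbour structuring element and are illustrative only; all names are hypothetical.

```python
import numpy as np

def dilate(mask):
    """Binary dilation with a 4-neighbour (plus-shaped) element."""
    out = mask.copy()
    out[1:, :] |= mask[:-1, :]
    out[:-1, :] |= mask[1:, :]
    out[:, 1:] |= mask[:, :-1]
    out[:, :-1] |= mask[:, 1:]
    return out

def erode(mask):
    """Binary erosion by duality: erode(m) == ~dilate(~m)."""
    return ~dilate(~mask)

def moving_object_mask(block_mvs):
    """Flag blocks with nonzero motion, then morphologically close the
    block map (dilate then erode) to fill holes in the moving object."""
    moving = np.any(block_mvs != 0, axis=-1)
    return erode(dilate(moving))

# A 3x3 moving region with a hole at its centre block.
mvs = np.zeros((5, 5, 2), dtype=int)
mvs[1:4, 1:4] = (3, -2)
mvs[2, 2] = (0, 0)
closed = moving_object_mask(mvs)
```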

Proceedings ArticleDOI
24 Jun 2002
TL;DR: The proposed sub-picture coding technique facilitates both region-of-interest coding and unequal error protection by partitioning images to regions of interest and separating the corresponding coded data units from each other.
Abstract: Region-of-interest coding and unequal error protection are two important tools in video communication systems to improve the received visual quality. One common property of the two techniques is that unequal coding or transmission is applied to improve the quality of the most important parts of images. The proposed sub-picture coding technique facilitates both region-of-interest coding and unequal error protection by partitioning images into regions of interest and separating the corresponding coded data units from each other. Simulation results show that the overall subjective quality is considerably improved compared to conventional coding schemes.

01 Jan 2002
TL;DR: This paper investigates perceptual objective quality assessment technologies and uses them to exploit the relationships between perceptual video quality, output bit rate, and quantization scales of video encoders to investigate an adaptive video quality control mechanism based on an application-level perceptual videoquality scheme.
Abstract: There has been an increased interest in adaptive video quality control that dynamically adjusts the output video bit rate based on the status of the network. However, network-level performance parameters cannot accurately reflect the video quality perceived by the end users. Our goal is to investigate an adaptive perceptual video quality control mechanism based on an application-level perceptual video quality scheme. In this paper we investigate perceptual objective quality assessment technologies and use them to exploit the relationships between perceptual video quality, output bit rate, and the quantization scales of video encoders. We also implemented a real-time Video over IP (VIP) network application that uses a feedback channel to relay measurements taken at the end user back to the source side, enabling the calculation of the perceptual video quality degradation caused by IP packet loss. Using this experimental setup, we were able to investigate appropriate rules for adaptive perceptual video quality control based on an application-level perceptual video quality scheme.
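A feedback rule of the kind the paper investigates can be sketched as a simple controller that trades quantization scale against measured perceptual quality. The thresholds and step sizes here are arbitrary illustrations, not the paper's tuned rules.

```python
def adapt_quantizer(q_scale, measured_quality, target=0.8,
                    step=2, q_min=2, q_max=31):
    """One control step: refine quantization (spend bits) when measured
    perceptual quality falls short of the target, coarsen it (save bits)
    when quality comfortably exceeds the target; otherwise hold steady."""
    if measured_quality < target:
        return max(q_min, q_scale - step)
    if measured_quality > target + 0.05:
        return min(q_max, q_scale + step)
    return q_scale
```

The dead band above the target avoids oscillating between two quantizer settings when the measured quality sits near the threshold.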

01 Jan 2002
TL;DR: A range of new assessment methods that have been developed in research contexts over the past few years is described and demonstrated, along with a discussion of their suitability for use in different evaluation contexts.
Abstract: The aim of the tutorial is to introduce participants to new methods for establishing audiovisual quality requirements for a range of real-time multimedia applications. Until recently, there were few HCI-specific methods for assessing audio and video quality and the usability of videoconferencing systems. The tutorial will survey methods commonly used in the telecommunications industry, and explain why these are usually not suitable for assessing the usability of audio and video. The main part of the tutorial consists of description and demonstration of a range of new assessment methods that have been developed in research contexts over the past few years, and a discussion of their suitability for use in different evaluation contexts. As well as methods for measuring task performance and user satisfaction, we will cover innovative approaches to assessing the impact on user behaviour and user cost, such as eye-tracking and physiological responses, and the way in which users communicate with each other.