
Showing papers on "Video quality published in 2007"


Proceedings ArticleDOI
12 Nov 2007
TL;DR: The impact on image quality of rendered arbitrary intermediate views is investigated and analyzed in a second part, comparing compressed multi-view video plus depth data at different bit rates with the uncompressed original.
Abstract: A study on the video plus depth representation for multi-view video sequences is presented. Such a 3D representation enables functionalities like 3D television and free viewpoint video. Compression is based on algorithms for multi-view video coding, which exploit statistical dependencies from both temporal and inter-view reference pictures for prediction of both color and depth data. Coding efficiency of prediction structures with and without inter-view reference pictures is analyzed for multi-view video plus depth data, reporting gains in luma PSNR of up to 0.5 dB for depth and 0.3 dB for color. The main benefit from using a multi-view video plus depth representation is that intermediate views can be easily rendered. Therefore the impact on image quality of rendered arbitrary intermediate views is investigated and analyzed in a second part, comparing compressed multi-view video plus depth data at different bit rates with the uncompressed original.
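The coding-efficiency comparison above is reported in luma PSNR, computed per frame and averaged over the sequence. A minimal sketch of that measure (the function name and NumPy formulation are illustrative, not from the paper):

```python
import numpy as np

def luma_psnr(reference: np.ndarray, decoded: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio of a luma (Y) plane, in dB."""
    diff = reference.astype(np.float64) - decoded.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical planes
    return 10.0 * np.log10((peak ** 2) / mse)

# A reported gain of "up to 0.5 dB for depth" means the sequence-averaged luma
# PSNR of depth maps coded with inter-view prediction exceeds the PSNR obtained
# without inter-view references by up to 0.5 dB at comparable rate.
```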

485 citations


Journal ArticleDOI
TL;DR: Improved video quality assessment algorithms are obtained by incorporating a recent model of human visual speed perception as spatiotemporal weighting factors, where the weight increases with the information content and decreases with the perceptual uncertainty in video signals.
Abstract: Motion is one of the most important types of information contained in natural video, but direct use of motion information in the design of video quality assessment algorithms has not been deeply investigated. Here we propose to incorporate a recent model of human visual speed perception [Nat. Neurosci. 9, 578 (2006)] and model visual perception in an information communication framework. This allows us to estimate both the motion information content and the perceptual uncertainty in video signals. Improved video quality assessment algorithms are obtained by incorporating the model as spatiotemporal weighting factors, where the weight increases with the information content and decreases with the perceptual uncertainty. Consistent improvement over existing video quality assessment algorithms is observed in our validation with the video quality experts group Phase I test data set.
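The abstract describes distortion pooling with weights that grow with motion information content and shrink with perceptual uncertainty. A minimal sketch of such weighted pooling, assuming per-block quality scores and per-block estimates of both quantities are already available (the ratio-style weight is an illustrative choice, not the paper's exact formulation):

```python
import numpy as np

def weighted_quality(local_quality: np.ndarray,
                     information_content: np.ndarray,
                     perceptual_uncertainty: np.ndarray,
                     eps: float = 1e-8) -> float:
    """Pool per-block quality scores with spatiotemporal weights.

    The weight rises with motion information content and falls with perceptual
    uncertainty, as the abstract describes; the ratio form used here is an
    illustrative choice, not the paper's exact formula.
    """
    weights = information_content / (perceptual_uncertainty + eps)
    return float(np.sum(weights * local_quality) / (np.sum(weights) + eps))
```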

224 citations


Journal ArticleDOI
TL;DR: DGR improves the average video peak signal-to-noise ratio (PSNR) by 3 dB compared to a traditional geographic routing scheme, and offers lower delay, substantially longer network lifetime, and better received video quality.

190 citations


Journal ArticleDOI
TL;DR: This paper poses the cross-layer problem as a distortion minimization under delay constraints and derives analytical solutions by modifying existing joint source-channel coding theory, which was aimed at fulfilling rate rather than delay constraints. It also proposes real-time algorithms that explicitly consider the available information about previously transmitted packets.
Abstract: Existing wireless networks provide dynamically varying resources with only limited support for the quality of service required by the bandwidth-intensive, loss-tolerant and delay-sensitive multimedia applications. This variability of resources does not significantly impact delay-insensitive data transmission (e.g., file transfers), but has considerable consequences for multimedia applications. Recently, the research focus has been to adapt existing algorithms and protocols at the lower layers of the protocol stack to better support multimedia transmission applications and, conversely, to modify application-layer solutions to cope with the varying wireless network resources. In this paper, we show that significant improvements in wireless multimedia performance can be obtained by deploying a joint application-layer adaptive packetization and prioritized scheduling and MAC-layer retransmission strategy. We deploy a state-of-the-art wavelet coder for the compression of the video data that enables on-the-fly adaptation to changing channel conditions and inherent prioritization of the video bitstream. We pose the cross-layer problem as a distortion minimization given delay constraints and derive analytical solutions by modifying existing joint source-channel coding theory aimed at fulfilling rate, rather than delay, constraints. We also propose real-time algorithms that explicitly consider the available information about previously transmitted packets. The obtained results show significant improvements in terms of video quality as opposed to the ad-hoc optimizations currently deployed, while the complexity associated with performing this optimization in real time, i.e., at transmission time, is limited.

185 citations


Journal ArticleDOI
TL;DR: A cross-layer design framework based on a novel two-level superposition coded multicasting (SCM) scheme is introduced and simulation results show that much improved video quality is achievable with this approach.
Abstract: The advances in broadband Internet access and scalable video technologies have made it possible for Internet Protocol television (IPTV) to become the next killer application for modern Internet carriers in metropolitan areas. With the recent release of IEEE 802.16d/e (worldwide interoperability for microwave access or WiMAX), broadband wireless access (BWA) is envisioned to further extend IPTV services to a new application scenario with wireless and mobility dimensions. It is a very strategic but challenging leverage for a carrier to glimpse the potential of IPTV by using WiMAX as the access network. Challenges are posed for IPTV over WiMAX due to multicasting under a diversity of fading conditions. A cross-layer design framework based on a novel two-level superposition coded multicasting (SCM) scheme is introduced. Simulation results show that much improved video quality is achievable with our approach.

183 citations


Proceedings ArticleDOI
Zhengye Liu1, Yanming Shen1, Shivendra S. Panwar1, Keith W. Ross1, Yao Wang1 
27 Aug 2007
TL;DR: The simulation results show that the distributed incentive mechanism designed in this paper can provide differentiated video quality commensurate with a peer's contribution to other peers, and can also discourage free-riders.
Abstract: In this paper, we design a distributed incentive mechanism for mesh-pull P2P live streaming networks. In our system, a video is encoded into layers with lower layers having more importance. The system is heterogeneous with peers having different uplink bandwidths. We design a distributed protocol in which a peer contributing more uplink bandwidth receives more layers and consequently better video quality. Previous approaches consider single-layer video, where each peer receives the same video quality no matter how much bandwidth it contributes to the system. The simulation results show that our approach can provide differentiated video quality commensurate with a peer's contribution to other peers, and can also discourage free-riders. Furthermore, we also compare our layered approach with a multiple description coding (MDC) approach, and conclude that the layered approach is more promising, primarily due to its higher coding efficiency.
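The incentive idea is that the number of layers a peer receives scales with its uplink contribution. A toy sketch of one possible entitlement rule, assuming cumulative layer rates are known (this is an assumed rule for illustration, not the protocol defined in the paper):

```python
def layers_for_peer(uplink_contribution_kbps: float,
                    layer_rates_kbps: list[float]) -> int:
    """Number of cumulative layers a peer is entitled to receive.

    Illustrative incentive rule (not the paper's exact protocol): a peer may
    receive as many layers as its own uplink contribution could sustain, so
    contributing more bandwidth yields more layers and better video quality,
    while a free-rider with zero contribution earns nothing beyond what the
    system chooses to grant.
    """
    entitled = 0
    cumulative_rate = 0.0
    for rate in layer_rates_kbps:          # layer_rates_kbps[0] is the base layer
        cumulative_rate += rate
        if uplink_contribution_kbps >= cumulative_rate:
            entitled += 1
        else:
            break
    return entitled

# Example: with layers of 200, 200 and 400 kbps, a peer uploading 450 kbps
# is entitled to 2 layers, while a peer uploading 900 kbps gets all 3.
```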

131 citations


Journal ArticleDOI
TL;DR: Using the Quality Layers post-processing to evaluate and signal the impact on rate and distortion of the various enhancement information pieces, a significant gain is achieved: quality layers significantly outperform the basic standard extractor that was initially proposed in SVC.
Abstract: The concept of quality layers, introduced in the scalable video coding (SVC) amendment of MPEG-4 AVC, is presented. By using the Quality Layers post-processing to evaluate and signal the impact on rate and distortion of the various enhancement information pieces, a significant gain is achieved: quality layers significantly outperform the basic standard extractor that was initially proposed in SVC. For the standard set of test sequences, in a range of acceptable video quality, an average quality gain of up to 0.5 dB is achieved. Furthermore, the technique can be used for combined (spatial, temporal and quality) scalability. Thanks to the signaling of this information in the header of the network abstraction layer units or in a supplemental enhancement information message, the adaptation can be performed with a simple parser, e.g., at the decoder side or in an intelligent network node designed for rate adaptation.
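The quality-layer mechanism lets an extractor keep the most valuable enhancement data first. A toy sketch of such rate-distortion-guided extraction, assuming each enhancement unit carries a signaled distortion-reduction value (the greedy per-bit ordering is an illustrative simplification of the normative SVC process):

```python
from dataclasses import dataclass

@dataclass
class EnhancementUnit:
    size_bits: int
    distortion_reduction: float  # rate-distortion impact signaled by the encoder

def extract(units: list[EnhancementUnit], budget_bits: int) -> list[EnhancementUnit]:
    """Keep the most useful enhancement units until the bit budget runs out.

    Toy illustration of quality-layer-guided extraction; the actual SVC
    mechanism conveys layer indices in NAL unit headers or SEI messages and
    is evaluated by the encoder-side post-processing described above.
    """
    kept, used = [], 0
    # Greedy ordering: highest distortion reduction per bit first.
    for unit in sorted(units, key=lambda u: u.distortion_reduction / u.size_bits, reverse=True):
        if used + unit.size_bits <= budget_bits:
            kept.append(unit)
            used += unit.size_bits
    return kept
```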

131 citations


Journal ArticleDOI
TL;DR: This paper considers the case of a noisy channel in Wyner-Ziv coding (WZC) and addresses distributed joint source-channel coding (JSCC), targeting video transmission over packet erasure channels with a single Raptor code used for both Slepian-Wolf coding (SWC) and erasure protection.
Abstract: Extending recent work on distributed source coding, this paper considers distributed source-channel coding and targets the important application of scalable video transmission over wireless networks. The idea is to use a single channel code for both video compression (via Slepian-Wolf coding) and packet loss protection. First, we provide a theoretical code design framework for distributed joint source-channel coding over erasure channels and then apply it to the targeted video application. The resulting video coder is based on a cross-layer design where video compression and protection are performed jointly. We choose Raptor codes - the best approximation to a digital fountain - and address in detail both encoder and decoder designs. Using the received packets together with a correlated video available at the decoder as side information, we devise a new iterative soft-decision decoder for joint Raptor decoding. Simulation results show that, compared to one separate design using Slepian-Wolf compression plus erasure protection and another based on FGS coding plus erasure protection, the proposed joint design provides better video quality at the same number of transmitted packets. Our work is the first to capitalize on the latest advances in distributed source coding and near-capacity channel coding for robust video transmission over erasure channels.

122 citations


Proceedings ArticleDOI
21 May 2007
TL;DR: A survey and classification of contemporary image and video quality metrics is presented along with the favorable quality assessment methodologies; emphasis is given to those metrics that can be related to the quality as perceived by the end-user.
Abstract: The accurate prediction of quality from an end-user perspective has received increased attention with the growing demand for compression and communication of digital image and video services over wired and wireless networks. The existing quality assessment methods and metrics have a vast reach from computational and memory efficient numerical methods to highly complex models incorporating aspects of the human visual system. It is hence crucial to classify these methods in order to find the favorable approach for an intended application. In this paper a survey and classification of contemporary image and video quality metrics is therefore presented along with the favorable quality assessment methodologies. Emphasis is given to those metrics that can be related to the quality as perceived by the end-user. As such, these perceptual-based image and video quality metrics may build a bridge between the assessment of quality as experienced by the end-user and the quality of service parameters that are usually deployed to quantify service integrity.

116 citations


Journal ArticleDOI
TL;DR: This paper identifies the key design issues for video broadcast over WiMAX in the multi-BS mode, presents an end-to-end solution that addresses synchronization, energy efficiency and robust video quality, and proposes a methodology to optimize the coverage, the spectrum efficiency and the video quality.
Abstract: Video broadcast and mobile TV have recently received significant interest from both academia and industry. The emerging mobile WiMAX (802.16e) is capable of providing high data rates and flexible quality of service (QoS) mechanisms, making the support of mobile TV very attractive. However, efficiently delivering video broadcast over WiMAX is not straightforward, especially in the multi-BS mode, which requires multiple BSs to be synchronized in the transmission of common multicast/broadcast data. In this paper, we first identify the key design issues for video broadcast over WiMAX in the multi-BS mode. Then, we present an end-to-end solution which fully addresses key issues such as synchronization, energy efficiency and robust video quality. Moreover, we propose a methodology to optimize the coverage, the spectrum efficiency and the video quality. Results show that our proposed scheme can significantly improve the coverage and spectrum efficiency while satisfying video quality requirements.

115 citations


Journal ArticleDOI
TL;DR: The results show that the real-time video quality of the overall system can be greatly improved by cross-layer signaling; this paper presents a cross-layer architecture for adaptive video multicast streaming over multirate wireless LANs in which layer-specific information is passed in both directions.
Abstract: Multicast video streaming over multirate wireless LANs imposes strong demands on video codecs and the underlying network. It is not sufficient that only the video codec or only the underlying protocols adapt to changes in the wireless link quality; research efforts should be applied to both in a synchronized way. Cross-layer design is a new paradigm that addresses this challenge by optimizing communication network architectures across traditional layer boundaries. This paper presents a cross-layer architecture for adaptive video multicast streaming over multirate wireless LANs in which layer-specific information is passed in both directions, top-down and bottom-up. The authors jointly consider three layers of the protocol stack: the application, data link and physical layers. The authors analyze the performance of the proposed architecture and extensively evaluate it via simulations. The results show that the real-time video quality of the overall system can be greatly improved by cross-layer signaling.

Proceedings ArticleDOI
15 Apr 2007
TL;DR: A novel quality metric for video sequences is proposed that utilizes motion information, which is the main difference in moving from images to video.
Abstract: Quality assessment plays a very important role in almost all aspects of multimedia signal processing such as acquisition, coding, display and processing. Several objective quality metrics have been proposed for images, but video quality assessment has received relatively little attention and most video quality metrics have been simple extensions of metrics for images. In this paper, we propose a novel quality metric for video sequences that utilizes motion information, which is the main difference in moving from images to video. This metric is capable of capturing temporal artifacts in video sequences in addition to spatial distortions. Results are presented that demonstrate the efficacy of our quality metric by comparing model performance against subjective scores on the database developed by the video quality experts group.

Journal ArticleDOI
29 Mar 2007
TL;DR: The results indicate that the ACR method is an effective subjective quality assessment method, which shows performance comparable to the DSCQS method and can evaluate a larger number of video sequences.
Abstract: In this paper, we compare two subjective assessment methods, DSCQS (Double Stimulus Continuous Quality Scale) and ACR (Absolute Category Rating), which are widely used to evaluate video quality for multimedia applications. We performed subjective quality tests using the DSCQS and ACR methods. The subjective scores obtained by the two methods show that they are highly correlated in terms of MOS (Mean Opinion Score) and have slightly lower correlation in terms of DMOS (Difference Mean Opinion Score). The results indicate that the ACR method is an effective subjective quality assessment method, which shows performance comparable to the DSCQS method and can evaluate a larger number of video sequences.

Journal ArticleDOI
TL;DR: A quality metric is proposed that accounts for both encoding parameters (quantization and frame rate), and intrinsic video sequence characteristics (motion speed) and shows that for the purpose of video rate control, optimization using the classical PSNR does not match up to that of subjective quality data.
Abstract: The purpose of this study is to propose a quality metric of video encoded with variable frame rates and quantization parameters suitable for mobile video broadcasting applications. As a first step, experiments are conducted to assess the subjective quality of video sequences encoded with variable frame rates and quantization parameters. Resulting experimental data show that for the purpose of video rate control, optimization using the classical PSNR does not match up to that of subjective quality data. The second step bridges this gap between PSNR and subjective quality data by constructing a new quality metric that accounts for both encoding parameters (quantization and frame rate), and intrinsic video sequence characteristics (motion speed). The average correlation coefficient for the five video sequences tested is as high as 0.93 with the proposed metric, in contrast with the PSNR's 0.70.
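The reported 0.93 versus 0.70 figures are correlation coefficients between metric predictions and subjective scores; a minimal sketch of how such a figure is typically computed (illustrative helper, not code from the study):

```python
import numpy as np

def pearson_correlation(metric_scores: np.ndarray, subjective_mos: np.ndarray) -> float:
    """Linear correlation between a metric's predictions and subjective MOS."""
    return float(np.corrcoef(metric_scores, subjective_mos)[0, 1])

# e.g. a value near 0.93 for the proposed metric versus about 0.70 for PSNR,
# as reported in the abstract for the five sequences tested.
```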

Journal ArticleDOI
TL;DR: The main component of the proposed solution is a low-complexity, distributed, and dynamic routing algorithm, which relies on prioritized queuing to select the path and time reservation for the various packets, while explicitly considering instantaneous channel conditions, queuing delays and the resulting interference.
Abstract: Emerging multi-hop wireless networks provide a low-cost and flexible infrastructure that can be simultaneously utilized by multiple users for a variety of applications, including delay-sensitive multimedia transmission. However, this wireless infrastructure is often unreliable and provides dynamically varying resources with only limited quality of service (QoS) support for multimedia applications. To cope with the time-varying QoS, existing algorithms often rely on non-scalable, flow-based optimizations to allocate the available network resources (paths and transmission opportunities) across the various multimedia users. Moreover, previous research seldom optimizes jointly the dynamic routing with the adaptation and protection techniques available at the medium access control (MAC) or physical (PHY) layers. In this paper, we propose a distributed packet-based cross-layer algorithm to maximize the decoded video quality of multiple users engaged in simultaneous real-time streaming sessions over the same multi-hop wireless network. Our algorithm explicitly considers packet-based distortion impact and delay constraints in assigning priorities to the various packets and then relies on priority queuing to drive the optimization of the various users' transmission strategies across the protocol layers as well as across the multi-hop network. The proposed solution is enabled by the scalable coding of the video content (i.e. users can transmit and consume video at different quality levels) and the cross-layer optimization strategies, which allow priority-based adaptation to varying channel conditions and available resources. The cross-layer strategies - application layer packet scheduling, the policy for choosing the relays, the MAC retransmission strategies, the PHY modulation and coding schemes - are optimized per packet, at each node, in a distributed manner. The main component of the proposed solution is a low-complexity, distributed, and dynamic routing algorithm, which relies on prioritized queuing to select the path and time reservation for the various packets, while explicitly considering instantaneous channel conditions, queuing delays and the resulting interference. Our results demonstrate the merits and need for end-to-end cross-layer optimization in order to provide an efficient solution for real-time video transmission using existing protocols and infrastructures. Importantly, our proposed delay-driven, packet-based transmission is superior in terms of both network scalability and video quality performance to previous flow-based solutions that statically allocate resources based on predetermined paths and rate requirements. In addition, the results provide important insights that can guide the design of network infrastructures and streaming protocols for video streaming.

Book Chapter
01 Jan 2007
TL;DR: In this article, an integrated cross-layer optimization algorithm aimed at maximizing the decoded video quality of delay-constrained streaming in a multi-hop wireless mesh network that supports quality-of-service (QoS) is proposed.
Abstract: The proliferation of wireless multi-hop communication infrastructures in office or residential environments depends on their ability to support a variety of emerging applications requiring real-time video transmission between stations located across the network. We propose an integrated cross-layer optimization algorithm aimed at maximizing the decoded video quality of delay-constrained streaming in a multi-hop wireless mesh network that supports quality-of-service (QoS). The key principle of our algorithm lies in the synergistic optimization of different control parameters at each node of the multi-hop network, across the protocol layers (application, network, medium access control (MAC) and physical (PHY) layers), as well as end-to-end, across the various nodes. To drive this optimization, we assume an overlay network infrastructure, which is able to convey information on the conditions of each link. Various scenarios that perform the integrated optimization using different levels ("horizons") of information about the network status are examined. The differences between several optimization scenarios in terms of decoded video quality and required streaming complexity are quantified. Our results demonstrate the merits and the need for cross-layer optimization in order to provide an efficient solution for real-time video transmission using existing protocols and infrastructures. In addition, they provide important insights for future protocol and system design targeted at enhanced video streaming support across wireless mesh networks. Introduction: Wireless mesh networks are built from a mixture of fixed and mobile nodes interconnected via wireless links to form a multi-hop ad-hoc network.

Posted Content
TL;DR: This work proposes video-aware opportunistic network coding schemes that take into account both aspects, namely (i) the decodability of network codes by several receivers and (ii) the distortion values and playout deadlines of video packets.
Abstract: In this paper, we study video streaming over wireless networks with network coding capabilities. We build upon recent work, which demonstrated that network coding can increase throughput over a broadcast medium, by mixing packets from different flows into a single packet, thus increasing the information content per transmission. Our key insight is that, when the transmitted flows are video streams, network codes should be selected so as to maximize not only the network throughput but also the video quality. We propose video-aware opportunistic network coding schemes that take into account both (i) the decodability of network codes by several receivers and (ii) the importance and deadlines of video packets. Simulation results show that our schemes significantly improve both video quality and throughput.

Proceedings ArticleDOI
06 Dec 2007
TL;DR: In this paper, the authors proposed video-aware opportunistic network coding schemes that take into account both aspects, namely (i) the decodability of network codes by several receivers and (ii) the distortion values and playout deadlines of video packets.
Abstract: In this paper, we study video streaming over wireless networks with network coding capabilities. We build upon recent work, which demonstrated that network coding can increase throughput over a broadcast medium, by mixing packets from different flows into a single packet, thus increasing the information content per transmission. Our key insight is that, when the transmitted flows are video streams, network codes should be selected so as to maximize not only the network throughput but also the video quality. We propose video-aware opportunistic network coding schemes that take into account both aspects, namely (i) the decodability of network codes by several receivers and (ii) the distortion values and playout deadlines of video packets. Simulation results show that our schemes significantly improve both video quality and throughput.
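The key point is that the sender should pick the network code that maximizes expected video benefit, not just throughput. A hedged sketch of one possible scoring rule for a candidate code, assuming per-packet distortion values, playout deadlines and per-receiver decoding probabilities are known (the utility function is an assumption, not the paper's exact criterion):

```python
from dataclasses import dataclass

@dataclass
class VideoPacket:
    distortion_value: float   # distortion avoided if this packet is decoded in time
    deadline_ms: float        # playout deadline

def code_score(candidate_code: list[VideoPacket],
               decode_probability: list[float],
               now_ms: float) -> float:
    """Illustrative score for one candidate network code (e.g. an XOR of the listed packets).

    A code is valued as the sum, over its target receivers, of the distortion
    each receiver avoids if it can decode its packet before the playout
    deadline. This mirrors the idea in the abstract; the exact utility
    function is an assumption, not the paper's formula.
    """
    score = 0.0
    for packet, p_decode in zip(candidate_code, decode_probability):
        if packet.deadline_ms > now_ms:          # late packets contribute nothing
            score += p_decode * packet.distortion_value
    return score

# The sender would evaluate code_score() for every candidate combination of
# queued packets and transmit the combination with the highest score.
```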

Journal ArticleDOI
TL;DR: Experimental results indicate the effectiveness and efficiency of the proposed optimization framework, especially when the delay budget imposed by the upper layer applications is small, where more than 10% distortion gain can be achieved.
Abstract: Video summarization has gained increased popularity in emerging multimedia communication applications; however, very limited work has been conducted to address the transmission problem of video summary frames. In this paper, we propose a cross-layer optimization framework for delivering video summaries over wireless networks. Within a rate-distortion theoretical framework, the source coding, allowable retransmission, and adaptive modulation and coding have been jointly optimized, which reflects the joint selection of parameters at the physical, data link and application layers. The goal is to achieve the best video quality and content coverage of the received summary frames and to meet the delay constraint. The problem is solved using Lagrangian relaxation and dynamic programming. Experimental results indicate the effectiveness and efficiency of the proposed optimization framework, especially when the delay budget imposed by the upper layer applications is small, where more than 10% distortion gain can be achieved.

Patent
26 Oct 2007
TL;DR: In this paper, a feedback loop is formed by the means for classifying the digital video image using a previous video image to generate a new ROI and thus allow for tracking of targets as they move through the imager field-of-view.
Abstract: An apparatus for capturing a video image comprising a means for generating a digital video image, a means for classifying the digital video image into one or more regions of interest (ROI) and a background image, and a means for encoding the digital video image, wherein the encoding is selected to provide at least one of: enhancement of the image clarity of the one or more ROI relative to the background image encoding, and decreasing the video quality of the background image relative to the one or more ROI. A feedback loop is formed by the means for classifying the digital video image using a previous video image to generate a new ROI, thus allowing targets to be tracked as they move through the imager field-of-view.

Journal ArticleDOI
TL;DR: An in-depth analysis of the media distortion characteristics allows us to define a low complexity algorithm for an optimal flow rate allocation in multipath network scenarios, and shows that a greedy allocation of rate along paths with increasing error probability leads to an optimal solution.
Abstract: We address the problem of joint path selection and source rate allocation in order to optimize the media specific quality of service in streaming of stored video sequences on multipath networks. An optimization problem is proposed in order to minimize the end-to-end distortion, which depends on video sequence dependent parameters, and network properties. An in-depth analysis of the media distortion characteristics allows us to define a low complexity algorithm for an optimal flow rate allocation in multipath network scenarios. In particular, we show that a greedy allocation of rate along paths with increasing error probability leads to an optimal solution. We argue that a network path shall not be chosen for transmission, unless all other available paths with lower error probability have been chosen. Moreover, the chosen paths should be used at their maximum available end-to-end bandwidth. Simulation results show that the optimal flow rate allocation carefully adapts the total streaming rate and the number of chosen paths, to the end-to-end transmission error probability. In many scenarios, the optimal rate allocation provides more than 20% improvement in received video quality, compared to heuristic-based algorithms. This motivates its use in multipath networks, where it optimizes media specific quality of service, and simultaneously saves network resources at the price of a very low computational complexity.
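The optimal allocation described above is greedy: saturate paths in order of increasing error probability. A minimal sketch under the simplifying assumption of a fixed target streaming rate (the path attributes and the stopping rule are illustrative):

```python
from dataclasses import dataclass

@dataclass
class NetworkPath:
    bandwidth_kbps: float
    error_probability: float

def greedy_allocation(paths: list[NetworkPath], target_rate_kbps: float) -> list[float]:
    """Allocate streaming rate across paths in order of increasing error probability.

    Sketch of the greedy rule stated in the abstract: a path is only used once
    every lower-error path is already saturated, and each selected path is
    used at its full available bandwidth (the last one possibly only partially,
    to meet the target rate). The fixed target-rate stopping criterion is an
    illustrative simplification of the paper's distortion-driven choice.
    """
    allocation = [0.0] * len(paths)
    remaining = target_rate_kbps
    order = sorted(range(len(paths)), key=lambda i: paths[i].error_probability)
    for i in order:
        if remaining <= 0:
            break
        allocation[i] = min(paths[i].bandwidth_kbps, remaining)
        remaining -= allocation[i]
    return allocation
```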

Proceedings ArticleDOI
02 Jul 2007
TL;DR: A novel information hiding algorithm for H.264/AVC, taking advantage of the specific features of this advanced video compression standard, and achieving a comparatively high data hiding rate with a little increase on bit rate.
Abstract: This paper presents a novel information hiding algorithm for H264/AVC, taking advantage of the specific features of this advanced video compression standard The algorithm hides 1 bit in each qualified intra 4x4 luma block by modifying intra 4x4 prediction modes (I4-modes) based on the mapping between I4-modes and hiding bits Information hiding rate is controlled by a parameter, embedding strength, which is also embedded into bitstream Hidden information can be retrieved by decoding the intra prediction modes from bitstream, requiring neither original media nor complete video decoding Experimental results show that the proposed algorithm almost has no impact on video quality and video stream features A comparatively high data hiding rate is obtained with a little increase on bit rate
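The hiding mechanism maps intra 4x4 prediction modes to hidden bits and only perturbs the encoder's mode choice when needed. A toy sketch using a parity mapping, which is an assumed mapping for illustration rather than the one defined in the paper:

```python
def embed_bit(best_mode: int, bit: int, allowed_modes: list[int]) -> int:
    """Embed one hidden bit in the intra 4x4 prediction mode of a luma block.

    Illustrative mapping (not the paper's exact table): the mode's parity
    carries the bit. If the rate-distortion-optimal mode already has the
    right parity it is kept; otherwise the closest allowed mode with the
    right parity is substituted, which is why the impact on quality and
    bit rate stays small.
    """
    if best_mode % 2 == bit:
        return best_mode
    candidates = [m for m in allowed_modes if m % 2 == bit]
    return min(candidates, key=lambda m: abs(m - best_mode))

def extract_bit(decoded_mode: int) -> int:
    """Recover the hidden bit directly from the parsed prediction mode."""
    return decoded_mode % 2

# The 9 intra 4x4 prediction modes (0-8) contain both parities, so a substitute
# mode with the required parity always exists among the allowed modes.
```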

Patent
04 Apr 2007
TL;DR: In this paper, an approach and method to decode video data while maintaining a target video quality using an integrated error control system including error detection, resynchronization and error recovery is described.
Abstract: Apparatus and method to decode video data while maintaining a target video quality using an integrated error control system including error detection, resynchronization and error recovery are described. Robust error control can be provided by a joint encoder-decoder functionality including multiple error resilience designs. In one aspect, error recovery may be an end-to-end integrated multi-layer error detection, resynchronization and recovery mechanism designed to achieve reliable error detection and error localization. The error recovery system may include cross-layer interaction of error detection, resynchronization and error recovery subsystems. In another aspect, error handling of a scalable coded bitstream is coordinated across a base-layer and enhancement layer of scalable compressed video.

Journal ArticleDOI
TL;DR: An integrated system view of admission control and scheduling for both contention and poll-based access of IEEE 802.11e medium access control (MAC) protocol is proposed and a new concept called time fairness is introduced, which is critical in enhancing the video performance when different transmitter-receiver pairs deploy different cross-layer strategies.
Abstract: This paper presents efficient mechanisms for delay- sensitive transmission of video over IEEE 802.11a/e wireless local area networks (WLANs). Transmitting video over WLANs in real time is very challenging due to the time-varying wireless channel and video content characteristics. This paper provides a comprehensive view of how to adapt the quality of service signaling, IEEE 802.11e parameters and cross-layer design to optimize the video quality at the receiver. We propose an integrated system view of admission control and scheduling for both contention and poll-based access of IEEE 802.11e medium access control (MAC) protocol and outline the merits of each approach for video transmission. We also show the benefits of using a cross-layer optimization by sharing the application, MAC, and physical layer parameters of the open systems interconnection stack to enhance the video quality. We will show through analysis and simulation that controlling the contention-based access in IEEE 802.11e is simple to realize in real products and how different cross-layer strategies used in poll-based access lead to a larger number of stations being simultaneously admitted and/or a higher video quality for the admitted stations. Finally, we introduce a new concept called time fairness, which is critical in enhancing the video performance when different transmitter-receiver pairs deploy different cross-layer strategies.

Proceedings ArticleDOI
03 Sep 2007
TL;DR: This paper looks at recent trends and developments in video quality research, in particular the emergence of new generations of quality metrics (compared to those focused on compression artifacts), including comprehensive audiovisual quality measurement.
Abstract: This paper gives a brief overview of the current state of the art of video quality metrics and discusses their achievements as well as shortcomings. It also summarizes the main standardization efforts by the Video Quality Experts Group (VQEG). It then looks at recent trends and developments in video quality research, in particular the emergence of new generations of quality metrics (compared to those focused on compression artifacts), including comprehensive audiovisual quality measurement.

Journal ArticleDOI
TL;DR: A novel framework is presented that can provide online estimates of VVoIP QoE on network paths without end-user involvement and without requiring any video sequences; it features the "GAP-model", which is an offline model of QoE expressed as a function of measurable network factors such as bandwidth, delay, jitter, and loss.
Abstract: Increased access to broadband networks has led to a fast-growing demand for voice and video over IP (VVoIP) applications such as Internet telephony (VoIP), videoconferencing, and IP television (IPTV). For proactive troubleshooting of VVoIP performance bottlenecks that manifest to end-users as performance impairments such as video frame freezing and voice dropouts, network operators cannot rely on actual end-users to report their subjective quality of experience (QoE). Hence, automated and objective techniques that provide real-time or online VVoIP QoE estimates are vital. Objective techniques developed to date estimate VVoIP QoE by performing frame-to-frame peak-signal-to-noise ratio (PSNR) comparisons of the original video sequence and the reconstructed video sequence obtained from the sender side and receiver side, respectively. Since processing such video sequences is time consuming and computationally intensive, existing objective techniques cannot provide online VVoIP QoE estimates. In this paper, we present a novel framework that can provide online estimates of VVoIP QoE on network paths without end-user involvement and without requiring any video sequences. The framework features the "GAP-model", which is an offline model of QoE expressed as a function of measurable network factors such as bandwidth, delay, jitter, and loss. Using the GAP-model, our online framework can produce VVoIP QoE estimates in terms of "Good", "Acceptable", or "Poor" (GAP) grades of perceptual quality solely from online measured network conditions.
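The GAP-model maps passively measured network factors directly to a perceptual grade. A toy sketch of such a mapping; the thresholds below are hypothetical placeholders, since the actual model is fitted offline from subjective testing data:

```python
def gap_grade(bandwidth_kbps: float, delay_ms: float, jitter_ms: float, loss_pct: float) -> str:
    """Map measured network conditions to a "Good"/"Acceptable"/"Poor" QoE grade.

    The GAP-model in the paper is derived offline from subjective data; the
    thresholds here are purely hypothetical placeholders that only illustrate
    the shape of such a mapping.
    """
    if bandwidth_kbps >= 768 and delay_ms <= 150 and jitter_ms <= 30 and loss_pct <= 0.5:
        return "Good"
    if bandwidth_kbps >= 384 and delay_ms <= 300 and jitter_ms <= 60 and loss_pct <= 2.0:
        return "Acceptable"
    return "Poor"
```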

Proceedings ArticleDOI
01 Mar 2007
TL;DR: The results show that the proposed approach provides powerful means of estimating the video quality experienced by users for low resolution video streaming services.
Abstract: The scope of this work is the estimation of video quality for low resolution video sequences typical in (mobile) video streaming applications. Since the video quality experienced by users depends considerably on the spatial (edges, colors, ...) and temporal (movement speed, direction, ...) features of the video sequence, this paper presents a two-step approach to quality estimation. Firstly, shots between two scene changes are analyzed and their content class is found. Secondly, based on the content class, frame rate and bitrate, an estimation of quality is carried out. In this paper, the design of the content classifier as well as an appropriate choice of the content classes and their characteristics is discussed. Moreover, the design of quality metric is presented, based on the mean opinion score obtained by a survey. The performance of the proposed method is evaluated and compared to several common methods. The results show that the proposed approach provides powerful means of estimating the video quality experienced by users for low resolution video streaming services.

Proceedings ArticleDOI
29 Sep 2007
TL;DR: An analytical framework for optimal video rate allocation is developed and evaluated, based on observed available bit rate and round trip time over each access network, as well as the video distortion-rate characteristics, which enables autonomous rate allocation at each device in a media- and network-aware fashion.
Abstract: Contemporary wireless devices integrate multiple networking technologies, such as cellular, WiMax and IEEE 802.11a/b/g, as alternative means of accessing the Internet. Efficient utilization of available bandwidth over heterogeneous access networks is important, especially for media streaming applications with high data rates and stringent delay requirements. In this work we consider the problem of rate allocation among multiple video streaming sessions sharing multiple access networks. We develop and evaluate an analytical framework for optimal video rate allocation, based on observed available bit rate (ABR) and round trip time (RTT) over each access network, as well as the video distortion-rate (DR) characteristics. The rate allocation is formulated as a convex optimization problem that minimizes the sum of expected distortion of all video streams. We then present a distributed approximation of the optimization, which enables autonomous rate allocation at each device in a media- and network-aware fashion. Performance of the proposed allocation scheme is compared against robust rate control based on H∞ optimal control and two heuristic schemes employing TCP-style additive-increase-multiplicative-decrease (AIMD) principles. We simulate in NS-2 [1] simultaneous streaming of multiple high-definition (HD) video streams over multiple access networks, using ABR and RTT traces collected on Ethernet, IEEE 802.11g, and IEEE 802.11b networks deployed in a corporate environment. In comparison with heuristic AIMD-based schemes, rate allocation from both the media-aware convex optimization scheme and H∞ optimal control benefits from proactive avoidance of network congestion, and can reduce the average packet loss ratio from 27% to below 2%, while improving the average received video quality by 3.3 - 4.5 dB in PSNR.
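The allocation in the paper minimizes the sum of expected distortions over the streams as a convex program. A minimal sketch of the same idea for the simpler case of a single shared capacity, using the common parametric distortion-rate model D_i(R_i) = D0_i + theta_i / (R_i - R0_i); both the model and the closed-form solution are standard assumptions here, not taken from the paper, which further handles multiple access networks and a distributed approximation:

```python
import numpy as np

def allocate_rates(theta: np.ndarray, r0: np.ndarray, total_rate: float) -> np.ndarray:
    """Split a shared capacity among streams to minimize total expected distortion.

    Assumes D_i(R_i) = D0_i + theta_i / (R_i - R0_i) and total_rate > sum(r0);
    equalizing the marginal distortions then yields the closed form below.
    """
    weights = np.sqrt(theta)
    surplus = total_rate - np.sum(r0)
    return r0 + surplus * weights / np.sum(weights)

# Example: two streams with theta = [1.0, 4.0], r0 = [100.0, 100.0] kbps and a
# 1000 kbps budget receive roughly 367 and 633 kbps respectively.
```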

Book
01 Jan 2007
TL;DR: This book presents a complete pipeline for High Dynamic Range image and video processing from acquisition, through compression and quality evaluation, to display, and covers successful examples of the HDR technology applications in computer graphics and computer vision.
Abstract: As new displays and cameras offer enhanced color capabilities, there is a need to extend the precision of digital content. High Dynamic Range (HDR) imaging encodes images and video with higher than normal 8 bit-per-color-channel precision, enabling representation of the complete color gamut and the full visible range of luminance. However, to realize the transition from the traditional to HDR imaging, it is necessary to develop imaging algorithms that work with the high-precision data. To make such algorithms effective and feasible in practice, it is necessary to take advantage of the limitations of the human visual system by aligning the data shortcomings to those of the human eye, thus limiting storage and processing precision. Therefore, human visual perception is the key component of the solutions we discuss in this book. This book presents a complete pipeline for HDR image and video processing from acquisition, through compression and quality evaluation, to display. At the HDR image and video acquisition stage specialized HDR sensors or multi-exposure techniques suitable for traditional cameras are discussed. Then, we present a practical solution for pixel values calibration in terms of photometric or radiometric quantities, which are required in some technically oriented applications. Also, we cover the problem of efficient image and video compression and encoding either for storage or transmission purposes, including the aspect of backward compatibility with existing formats. Finally, we review existing HDR display technologies and the associated problems of image contrast and brightness adjustment. For this purpose tone mapping is employed to accommodate HDR content to LDR devices. Conversely, the so-called inverse tone mapping is required to upgrade LDR content for displaying on HDR devices. We overview HDR-enabled image and video quality metrics, which are needed to verify algorithms at all stages of the pipeline. Additionally, we cover successful examples of the HDR technology applications, in particular, in computer graphics and computer vision. The goal of this book is to present all discussed components of the HDR pipeline with the main focus on video. For some pipeline stages HDR video solutions are either not well established or do not exist at all, in which case we describe techniques for single HDR images. In such cases we attempt to select the techniques which can be extended into the temporal domain. Whenever needed, relevant background information on human perception is given, which enables better understanding of the design choices behind the discussed algorithms and HDR equipment. Table of Contents: Introduction / Representation of an HDR Image / HDR Image and Video Acquisition / HDR Image Quality / HDR Image, Video, and Texture Compression / Tone Reproduction / HDR Display Devices / LDR2HDR: Recovering Dynamic Range in Legacy Content / HDRI in Computer Graphics / Software

Journal ArticleDOI
TL;DR: This work focuses on music videos which exhibit a broad range of structural and semantic relationships between the music and the video content, and aims to identify such relationships by achieving a two-level automatic structuring of the musicand the video.
Abstract: The study of the associations between audio and video content has numerous important applications in the fields of information retrieval and multimedia content authoring. In this work, we focus on music videos which exhibit a broad range of structural and semantic relationships between the music and the video content. To identify such relationships, a two-level automatic structuring of the music and the video is achieved separately. Note onsets are detected from the music signal, along with section changes. The latter is achieved by a novel algorithm which makes use of feature selection and statistical novelty detection approaches based on kernel methods. The video stream is independently segmented to detect changes in motion activity, as well as shot boundaries. Based on this two-level segmentation of both streams, four audio-visual correlation measures are computed. The usefulness of these correlation measures is illustrated by a query by video experiment on a 100 music video database, which also exhibits interesting genre dependencies