
Showing papers on "Video quality published in 2014"


Book ChapterDOI
06 Sep 2014
TL;DR: In large video collections with clusters of typical categories, such as “birthday party” or “flash-mob”, category-specific video summarization can produce higher quality video summaries than unsupervised approaches that are blind to the video category.

430 citations


Journal ArticleDOI
TL;DR: It is shown that the proposed NSS and motion coherency models are appropriate for quality assessment of videos, and they are utilized to design a blind VQA algorithm that correlates highly with human judgments of quality.
Abstract: We propose a blind (no reference or NR) video quality evaluation model that is nondistortion specific. The approach relies on a spatio-temporal model of video scenes in the discrete cosine transform domain, and on a model that characterizes the type of motion occurring in the scenes, to predict video quality. We use the models to define video statistics and perceptual features that are the basis of a video quality assessment (VQA) algorithm that does not require the presence of a pristine video to compare against in order to predict a perceptual quality score. The contributions of this paper are threefold. 1) We propose a spatio-temporal natural scene statistics (NSS) model for videos. 2) We propose a motion model that quantifies motion coherency in video scenes. 3) We show that the proposed NSS and motion coherency models are appropriate for quality assessment of videos, and we utilize them to design a blind VQA algorithm that correlates highly with human judgments of quality. The proposed algorithm, called video BLIINDS, is tested on the LIVE VQA database and on the EPFL-PoliMi video database and shown to perform close to the level of top performing reduced and full reference VQA algorithms.
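
The DCT-domain NSS features such models build on are commonly summarized by fitting a generalized Gaussian density to coefficient histograms. Below is a minimal sketch of one standard shape estimator (moment matching); it illustrates the kind of feature involved, not the paper's exact feature set:

```python
import math
import random

def ggd_shape(samples):
    """Moment-matching estimate of the generalized Gaussian shape
    parameter, the kind of DCT-domain NSS feature blind VQA models use."""
    n = len(samples)
    mean_abs = sum(abs(x) for x in samples) / n
    second_moment = sum(x * x for x in samples) / n
    rho = mean_abs ** 2 / second_moment     # statistic to invert

    def r(g):                               # theoretical ratio for shape g
        return math.gamma(2.0 / g) ** 2 / (math.gamma(1.0 / g) * math.gamma(3.0 / g))

    # grid-search inversion of the monotone function r(.)
    grid = [0.2 + 0.01 * i for i in range(481)]  # shapes 0.2 .. 5.0
    return min(grid, key=lambda g: abs(r(g) - rho))

# Laplacian-distributed coefficients should give a shape close to 1.
random.seed(0)
coeffs = [random.expovariate(1.0) * random.choice([-1, 1]) for _ in range(20000)]
print(round(ggd_shape(coeffs), 1))
```

A lower shape parameter indicates heavier tails; distortion tends to change these statistics away from their natural-scene values.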

383 citations


Book
24 Aug 2014
TL;DR: This book provides a detailed explanation of the various parts of the HEVC standard, insight into how it was developed, and in-depth discussion of algorithms and architectures for its implementation.
Abstract: This book provides developers, engineers, researchers and students with detailed knowledge about the High Efficiency Video Coding (HEVC) standard. HEVC is the successor to the widely successful H.264/AVC video compression standard, and it provides around twice as much compression as H.264/AVC for the same level of quality. The applications for HEVC will not only cover the space of the well-known current uses and capabilities of digital video; they will also include the deployment of new services and the delivery of enhanced video quality, such as ultra-high-definition television (UHDTV) and video with higher dynamic range, wider range of representable color, and greater representation precision than what is typically found today. HEVC is the next major generation of video coding design: a flexible, reliable and robust solution that will support the next decade of video applications and ease the burden of video on worldwide network traffic. This book provides a detailed explanation of the various parts of the standard, insight into how it was developed, and in-depth discussion of algorithms and architectures for its implementation.

356 citations


Journal ArticleDOI
TL;DR: This work presents a VQA algorithm that estimates quality via separate estimates of perceived degradation due to spatial distortion and joint spatial and temporal distortion, and demonstrates that this algorithm performs well in predicting video quality and is competitive with current state-of-the-art V QA algorithms.
Abstract: Algorithms for video quality assessment (VQA) aim to estimate the qualities of videos in a manner that agrees with human judgments of quality. Modern VQA algorithms often estimate video quality by comparing localized space-time regions or groups of frames from the reference and distorted videos, using comparisons based on visual features, statistics, and/or perceptual models. We present a VQA algorithm that estimates quality via separate estimates of perceived degradation due to (1) spatial distortion and (2) joint spatial and temporal distortion. The first stage of the algorithm estimates perceived quality degradation due to spatial distortion; this stage operates by adaptively applying to groups of spatial video frames the two strategies from the most apparent distortion algorithm with an extension to account for temporal masking. The second stage of the algorithm estimates perceived quality degradation due to joint spatial and temporal distortion; this stage operates by measuring the dissimilarity between the reference and distorted videos represented in terms of two-dimensional spatiotemporal slices. Finally, the estimates obtained from the two stages are combined to yield an overall estimate of perceived quality degradation. Testing on various video-quality databases demonstrates that our algorithm performs well in predicting video quality and is competitive with current state-of-the-art VQA algorithms.
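
The second stage's spatiotemporal slices can be pictured with a small sketch: fix one spatial row or column and let time run along the other axis, then compare reference and distorted slices. Illustrative code only; the dissimilarity here is plain MSE, a stand-in for the paper's measure:

```python
import random

def sts_slices(video):
    """video: nested lists indexed [t][y][x]. Fix the middle row/column
    and let time vary to obtain 2-D spatiotemporal slices."""
    H, W = len(video[0]), len(video[0][0])
    horiz = [frame[H // 2] for frame in video]                   # T x W
    vert = [[row[W // 2] for row in frame] for frame in video]   # T x H
    return horiz, vert

def slice_mse(a, b):
    diffs = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)

def slice_dissimilarity(ref, dst):
    """Average slice-wise MSE over both orientations (a simple stand-in
    for the paper's dissimilarity measure)."""
    h1, v1 = sts_slices(ref)
    h2, v2 = sts_slices(dst)
    return 0.5 * (slice_mse(h1, h2) + slice_mse(v1, v2))

random.seed(0)
ref = [[[random.random() for _ in range(8)] for _ in range(8)] for _ in range(10)]
dst = [[[p + 0.1 for p in row] for row in frame] for frame in ref]
print(slice_dissimilarity(ref, ref))   # 0.0 for identical videos
```

Temporal distortions such as jerkiness show up as disturbed patterns along the time axis of these slices, which frame-by-frame comparisons can miss.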

188 citations


Journal ArticleDOI
TL;DR: A classification and review of the latest published research in the area of NR image and video quality assessment is presented; the NR methods of visual quality assessment considered for review are structured into categories and subcategories based on the types of methodologies used for the underlying processing employed for quality estimation.
Abstract: The field of perceptual quality assessment has gone through a wide range of developments and it is still growing. In particular, the area of no-reference (NR) image and video quality assessment has progressed rapidly during the last decade. In this article, we present a classification and review of the latest published research in the area of NR image and video quality assessment. The NR methods of visual quality assessment considered for review are structured into categories and subcategories based on the types of methodologies used for the underlying processing employed for quality estimation. Overall, the classification has been done into three categories, namely, pixel-based methods, bitstream-based methods, and hybrid methods combining the aforementioned two categories. We believe that the review presented in this article will be helpful for practitioners as well as for researchers to keep abreast of the recent developments in the area of NR image and video quality assessment. This article can be used for various purposes, such as gaining a structured overview of the field and carrying out performance comparisons of state-of-the-art methods.

162 citations


Proceedings ArticleDOI
16 Jun 2014
TL;DR: This paper presents the first large-scale study characterizing the impact of cellular network performance on mobile video user engagement from the perspective of a network operator, and quantifies the effect that 31 different network factors have on user behavior in mobile video.
Abstract: Mobile network operators have a significant interest in the performance of streaming video on their networks because network dynamics directly influence the Quality of Experience (QoE). However, unlike video service providers, network operators are not privy to the client- or server-side logs typically used to measure key video performance metrics, such as user engagement. To address this limitation, this paper presents the first large-scale study characterizing the impact of cellular network performance on mobile video user engagement from the perspective of a network operator. Our study on a month-long anonymized data set from a major cellular network makes two main contributions. First, we quantify the effect that 31 different network factors have on user behavior in mobile video. Our results provide network operators direct guidance on how to improve user engagement --- for example, improving mean signal-to-interference ratio by 1 dB reduces the likelihood of video abandonment by 2%. Second, we model the complex relationships between these factors and video abandonment, enabling operators to monitor mobile video user engagement in real-time. Our model can predict whether a user completely downloads a video with more than 87% accuracy by observing only the initial 10 seconds of video streaming sessions. Moreover, our model achieves significantly better accuracy than prior models that require client- or server-side logs, yet we only use standard radio network statistics and/or TCP/IP headers available to network operators.
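
The kind of model described, predicting abandonment from early-session network factors, can be sketched as a logistic predictor. The feature set and weights below are hypothetical placeholders, not the paper's 31 fitted factors:

```python
import math

def abandonment_prob(features, weights, bias):
    """Logistic model of video abandonment from early-session network
    features. All coefficients here are illustrative, not fitted."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical features: [mean SIR (dB), handovers, throughput (Mbps)]
weights = [-0.02, 0.15, -0.30]
bias = -0.5
p_good = abandonment_prob([15.0, 0, 4.0], weights, bias)
p_bad = abandonment_prob([5.0, 3, 0.5], weights, bias)
print(p_bad > p_good)   # worse radio conditions -> higher abandonment risk
```

A negative weight on mean SIR mirrors the paper's headline finding that each extra dB of signal quality lowers the likelihood of abandonment.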

139 citations


Journal ArticleDOI
TL;DR: A robust transfer video indexing (RTVI) model is developed, equipped with a novel sample-specific robust loss function, which employs the confidence score of a Web image as prior knowledge to suppress the influence and control the contribution of this image in the learning process.
Abstract: Semantic video indexing, also known as video annotation or video concept detection in literatures, has been attracting significant attention in recent years. Due to deficiency of labeled training videos, most of the existing approaches can hardly achieve satisfactory performance. In this paper, we propose a novel semantic video indexing approach, which exploits the abundant user-tagged Web images to help learn robust semantic video indexing classifiers. The following two major challenges are well studied: 1) noisy Web images with imprecise and/or incomplete tags; and 2) domain difference between images and videos. Specifically, we first apply a non-parametric approach to estimate the probabilities of images being correctly tagged as confidence scores. We then develop a robust transfer video indexing (RTVI) model to learn reliable classifiers from a limited number of training videos together with the abundance of user-tagged images. The RTVI model is equipped with a novel sample-specific robust loss function, which employs the confidence score of a Web image as prior knowledge to suppress the influence and control the contribution of this image in the learning process. Meanwhile, the RTVI model discovers an optimal kernel space, in which the mismatch between images and videos is minimized for tackling the domain difference problem. Besides, we devise an iterative algorithm to effectively optimize the proposed RTVI model and a theoretical analysis on the convergence of the proposed algorithm is provided as well. Extensive experiments on various real-world multimedia collections demonstrate the effectiveness of the proposed robust semantic video indexing approach.
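
The sample-specific weighting idea can be illustrated with a toy confidence-weighted loss. This is a sketch of the principle only, not the RTVI model's actual robust loss or kernel learning:

```python
def weighted_loss(scores, labels, confidences):
    """Confidence-weighted squared loss over web images: samples whose
    tags are likely wrong (low confidence) contribute less."""
    return sum(c * (s - y) ** 2
               for s, y, c in zip(scores, labels, confidences)) / len(scores)

scores = [0.9, 0.2, 0.8]       # classifier outputs
labels = [1.0, 1.0, 1.0]       # all tagged positive; the second looks noisy
conf = [0.95, 0.10, 0.90]      # estimated tag-correctness probabilities
noisy = weighted_loss(scores, labels, [1.0, 1.0, 1.0])
robust = weighted_loss(scores, labels, conf)
print(robust < noisy)   # down-weighting the suspect tag shrinks its pull
```

The effect is that a badly-scoring but low-confidence web image no longer dominates training, which is the point of the sample-specific loss.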

130 citations


Journal ArticleDOI
TL;DR: In this paper, information hiding methods in the H.264/AVC compressed video domain are surveyed and perspectives and recommendations are presented to provide a better understanding of the current trend of information hiding and to identify new opportunities for information hiding in compressed video.
Abstract: Information hiding refers to the process of inserting information into a host to serve specific purpose(s). In this paper, information hiding methods in the H.264/AVC compressed video domain are surveyed. First, the general framework of information hiding is conceptualized by relating the state of an entity to a meaning (i.e., sequences of bits). This concept is illustrated by using various data representation schemes such as bit plane replacement, spread spectrum, histogram manipulation, divisibility, mapping rules, and matrix encoding. Venues at which information hiding takes place are then identified, including prediction process, transformation, quantization, and entropy coding. Related information hiding methods at each venue are briefly reviewed, along with the presentation of the targeted applications, appropriate diagrams, and references. A timeline diagram is constructed to chronologically summarize the invention of information hiding methods in the compressed still image and video domains since 1992. A comparison among the considered information hiding methods is also conducted in terms of venue, payload, bitstream size overhead, video quality, computational complexity, and video criteria. Further perspectives and recommendations are presented to provide a better understanding of the current trend of information hiding and to identify new opportunities for information hiding in compressed video.
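
Of the data representation schemes listed, bit-plane replacement is the simplest to illustrate. A minimal sketch on raw pixel values follows; the methods in the survey operate on H.264/AVC syntax elements (coefficients, motion vectors, etc.) rather than raw pixels:

```python
def embed_lsb(pixels, bits):
    """Bit-plane replacement: overwrite the least significant bit of
    each carrier value with one message bit."""
    stego = list(pixels)
    for i, b in enumerate(bits):
        stego[i] = (stego[i] & ~1) | b
    return stego

def extract_lsb(pixels, n):
    """Read the message back from the first n least significant bits."""
    return [p & 1 for p in pixels[:n]]

cover = [200, 13, 57, 88, 140, 33]
message = [1, 0, 1, 1]
stego = embed_lsb(cover, message)
print(extract_lsb(stego, len(message)))                    # [1, 0, 1, 1]
print(all(abs(a - b) <= 1 for a, b in zip(cover, stego)))  # True
```

Each carrier value changes by at most one level, which is why such schemes trade payload against visual quality and bitstream overhead, the very criteria the survey compares.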

128 citations


Journal ArticleDOI
TL;DR: The results indicate that there is no general superiority of UGVs over AGVs, and videos generated by users are rated more highly than agency-generated videos under both low and high technical qualities, but the advantage is significantly lower under high technical quality.

126 citations


Proceedings ArticleDOI
08 Jul 2014
TL;DR: In this paper, the authors consider the problem of optimizing video delivery for a network supporting video clients streaming stored video, and present a simple asymptotically optimal online algorithm, NOVA, to solve the problem.
Abstract: We consider the problem of optimizing video delivery for a network supporting video clients streaming stored video. Specifically, we consider the joint optimization of network resource allocation and video quality adaptation. Our objective is to fairly maximize video clients' Quality of Experience (QoE), realizing tradeoffs among the mean quality, temporal variability in quality, and fairness, incorporating user preferences on rebuffering and cost of video delivery. We present a simple asymptotically optimal online algorithm, NOVA, to solve the problem. NOVA is asynchronous, and using minimal communication, distributes the tasks of resource allocation to the network controller and quality adaptation to the respective video clients. Video quality adaptation in NOVA is also optimal for standalone video clients, and is well suited for use in the DASH framework. Further, NOVA can be extended for use with more general QoE models, networks shared with other traffic loads, and networks using fixed/legacy resource allocation.
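
The tradeoff NOVA targets between mean quality and temporal variability can be sketched as a mean-minus-variance objective. The penalty coefficient below is illustrative, not NOVA's formulation:

```python
def qoe(qualities, var_penalty=1.0):
    """Toy QoE objective: reward mean quality, penalize temporal
    variability (illustrative coefficients)."""
    n = len(qualities)
    mean = sum(qualities) / n
    var = sum((q - mean) ** 2 for q in qualities) / n
    return mean - var_penalty * var

steady = [3.0, 3.0, 3.0, 3.0]
jumpy = [1.0, 5.0, 1.0, 5.0]        # same mean, high temporal variability
print(qoe(steady) > qoe(jumpy))     # steadier quality scores higher
```

Under such an objective, an adaptation algorithm prefers a stable mid-tier bitrate over oscillating between extremes, even when both yield the same average quality.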

124 citations


Proceedings ArticleDOI
01 Dec 2014
TL;DR: The proposed SDN application is designed to monitor the network conditions of a streaming flow in real time and dynamically change routing paths using multi-protocol label switching (MPLS) traffic engineering (TE) to provide a reliable video-watching experience.
Abstract: Today's over-the-top (OTT) video service providers take advantage of content distribution networks (CDNs) and adaptive bitrate (ABR) streaming, where a video player adjusts resolutions based on end-to-end network conditions. Although these mechanisms are useful for improving user-perceived video quality, they do not resolve the root causes of congestion problems. To pinpoint a bottleneck and improve video quality-of-experience (QoE), we leverage a software-defined networking (SDN) platform from the OTT video service provider's point of view. Our proposed SDN application is designed to monitor the network conditions of a streaming flow in real time and dynamically change routing paths using multi-protocol label switching (MPLS) traffic engineering (TE) to provide a reliable video-watching experience. We use an off-the-shelf SDN platform to show the feasibility of our approach.

Journal ArticleDOI
TL;DR: 4D Video Textures introduce a novel representation for rendering video-realistic interactive character animation from a database of 4D actor performance captured in a multiple-camera studio; the representation achieves a >90% reduction in size and halves the rendering cost.
Abstract: 4D Video Textures (4DVT) introduce a novel representation for rendering video-realistic interactive character animation from a database of 4D actor performance captured in a multiple camera studio. 4D performance capture reconstructs dynamic shape and appearance over time but is limited to free-viewpoint video replay of the same motion. Interactive animation from 4D performance capture has so far been limited to surface shape only. 4DVT is the final piece in the puzzle enabling video-realistic interactive animation through two contributions: a layered view-dependent texture map representation which supports efficient storage, transmission and rendering from multiple view video capture; and a rendering approach that combines multiple 4DVT sequences in a parametric motion space, maintaining video quality rendering of dynamic surface appearance whilst allowing high-level interactive control of character motion and viewpoint. 4DVT is demonstrated for multiple characters and evaluated both quantitatively and through a user-study which confirms that the visual quality of captured video is maintained. The 4DVT representation achieves >90% reduction in size and halves the rendering cost.

Journal ArticleDOI
TL;DR: A Hammerstein-Wiener model is presented for predicting the time-varying subjective quality (TVSQ) of rate-adaptive videos and it is shown that the model is able to reliably predict the TVSQ of rate adaptive videos.
Abstract: Newly developed hypertext transfer protocol (HTTP)-based video streaming technologies enable flexible rate-adaptation under varying channel conditions. Accurately predicting the users' quality of experience (QoE) for rate-adaptive HTTP video streams is thus critical to achieve efficiency. An important aspect of understanding and modeling QoE is predicting the up-to-the-moment subjective quality of a video as it is played, which is difficult due to hysteresis effects and nonlinearities in human behavioral responses. This paper presents a Hammerstein-Wiener model for predicting the time-varying subjective quality (TVSQ) of rate-adaptive videos. To collect data for model parameterization and validation, a database of longer duration videos with time-varying distortions was built and the TVSQs of the videos were measured in a large-scale subjective study. The proposed method is able to reliably predict the TVSQ of rate adaptive videos. Since the Hammerstein-Wiener model has a very simple structure, the proposed method is suitable for online TVSQ prediction in HTTP-based streaming.
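
A Hammerstein-Wiener model is a static input nonlinearity, a linear filter, and a static output nonlinearity in series. The sketch below shows only this structure; the paper fits its blocks to subjective TVSQ data, and every component here is a made-up illustration:

```python
import math

def hammerstein_wiener(inputs, taps, f_in, f_out):
    """Hammerstein-Wiener chain: static nonlinearity -> linear FIR
    filter -> static nonlinearity. Structure only; all parameters
    below are illustrative, not fitted to subjective data."""
    u = [f_in(x) for x in inputs]
    out = []
    for t in range(len(u)):
        acc = sum(taps[k] * u[t - k] for k in range(len(taps)) if t - k >= 0)
        out.append(f_out(acc))
    return out

bitrates = [1.0, 1.0, 0.3, 0.3, 1.0]            # Mbps over time
tvsq = hammerstein_wiener(
    bitrates,
    taps=[0.5, 0.3, 0.2],                       # smoothing mimics hysteresis
    f_in=math.log1p,                            # diminishing returns in rate
    f_out=lambda v: 100.0 * v)                  # map onto a 0-100 scale
print(tvsq[2] < tvsq[1])                        # quality sags after the rate drop
```

The linear filter's memory is what lets such a model capture hysteresis: perceived quality reacts to a rate drop gradually rather than instantaneously.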

Proceedings ArticleDOI
01 Oct 2014
TL;DR: Experimental results show that without training on human opinion scores the proposed method is comparable to state-of-the-art NR-VQA algorithms.
Abstract: In this paper, we propose a novel “Opinion Free” (OF) No-Reference Video Quality Assessment (NR-VQA) algorithm based on frame-level unsupervised feature learning and hysteresis temporal pooling. The system consists of three components: feature extraction with max-min pooling, frame quality prediction and temporal pooling. Frame level features are first extracted by unsupervised feature learning and used to train a linear Support Vector Regressor (SVR) for predicting quality scores frame by frame. Frame-level quality scores are then combined by temporal pooling to obtain a single video quality score. We tested the proposed method on the LIVE video quality database and experimental results show that without training on human opinion scores the proposed method is comparable to state-of-the-art NR-VQA algorithms.
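
Hysteresis temporal pooling can be sketched by letting each frame's pooled score lean toward the worst quality seen in a recent window, mimicking viewers' lingering reaction to quality drops. The window length and weight below are illustrative, not the paper's settings:

```python
def hysteresis_pool(frame_scores, memory=3, alpha=0.8):
    """Toy hysteresis pooling: each frame's pooled score is pulled
    toward the worst score in a short trailing window, then all
    pooled scores are averaged into one video-level score."""
    pooled = []
    for t, s in enumerate(frame_scores):
        worst_recent = min(frame_scores[max(0, t - memory):t + 1])
        pooled.append(alpha * worst_recent + (1 - alpha) * s)
    return sum(pooled) / len(pooled)

good = [80, 80, 80, 80, 80, 80]
dip = [80, 80, 20, 80, 80, 80]     # brief quality drop mid-sequence
print(hysteresis_pool(dip) < hysteresis_pool(good))
```

A simple average of frame scores would barely register the brief dip; hysteresis pooling penalizes it for several frames afterward, matching how viewers actually judge such sequences.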

Journal ArticleDOI
30 Nov 2014
TL;DR: An overview is given of work that proposed and detailed a new transmission paradigm exploiting content reuse and the widespread availability of low-cost storage; the network uses caching in helper stations and/or devices, combined with highly spectrally efficient short-range communications, to deliver video files.
Abstract: Wireless video is the main driver for rapid growth in cellular data traffic. Traditional methods for network capacity increase are very costly and do not exploit the unique features of video, especially asynchronous content reuse. In this paper we give an overview of our work that proposed and detailed a new transmission paradigm exploiting content reuse and the widespread availability of low-cost storage. Our network structure uses caching in helper stations (femtocaching) and/or devices, combined with highly spectrally efficient short-range communications to deliver video files. For femtocaching, we develop optimum storage schemes and dynamic streaming policies that optimize video quality. For caching on devices, combined with device-to-device (D2D) communications, we show that communications within clusters of mobile stations should be used; the cluster size can be adjusted to optimize the tradeoff between frequency reuse and the probability that a device finds a desired file cached by another device in the same cluster. In many situations the network throughput increases linearly with the number of users, and the tradeoff between throughput and outage is better than in traditional base-station centric systems. Simulation results with realistic numbers of users and channel conditions show that network throughput can be increased by two orders of magnitude compared to conventional schemes.
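
The cluster-size tradeoff can be illustrated with a toy cache-hit model under a Zipf popularity law. All parameters here (catalog size, cache capacity, Zipf exponent) are hypothetical:

```python
def hit_probability(cluster_size, catalog=1000, cache_per_device=5, zipf_s=0.8):
    """Probability that a requested file is cached somewhere in a D2D
    cluster, assuming devices collectively cache the most popular
    files and requests follow a Zipf law (illustrative assumptions)."""
    cached = min(catalog, cluster_size * cache_per_device)
    weights = [1.0 / (rank ** zipf_s) for rank in range(1, catalog + 1)]
    return sum(weights[:cached]) / sum(weights)

small, large = hit_probability(2), hit_probability(20)
print(small < large)   # larger clusters cache more distinct files
```

This captures one side of the tradeoff the paper optimizes: larger clusters raise the chance of finding a file nearby, but reduce how often the same spectrum can be reused across the cell.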

Journal ArticleDOI
TL;DR: This work quantifies the effects of network impairment on HEVC video streaming from the perspective of the end user; its outputs may be used in the development of quality of experience (QoE) oriented streaming applications for HEVC in loss-prone networks.
Abstract: Users of modern portable consumer devices (smartphones, tablets etc.) expect ubiquitous delivery of high quality services, which fully utilise the capabilities of their devices. Video streaming is one of the most widely used yet challenging services for operators to deliver with assured service levels. This challenge is more apparent in wireless networks where bandwidth constraints and packet loss are common. The lower bandwidth requirements of High Efficiency Video Coding (HEVC) provide the potential to enable service providers to deliver high quality video streams in low-bandwidth networks; however, packet loss may result in greater damage in perceived quality given the higher compression ratio. This work considers the delivery of HEVC encoded video streams in impaired network environments and quantifies the effects of network impairment on HEVC video streaming from the perspective of the end user. HEVC encoded streams were transmitted over a test network with both wired and wireless segments that had imperfect communication channels subject to packet loss. Two different error concealment methods were employed to mitigate packet loss and overcome reference decoder robustness issues. The perceptual quality of received video was subjectively assessed by a panel of viewers. Existing subjective studies of HEVC quality have not considered the implications of network impairments. Analysis of results has quantified the effects of packet loss in HEVC on perceptual quality and provided valuable insight into the relative importance of the main factors observed to influence user perception in HEVC streaming. The outputs from this study show the relative importance and relationship between those factors that affect human perception of quality in impaired HEVC encoded video streams. The subjective analysis is supported by comparison with commonly used objective quality measurement techniques. 
Outputs from this work may be used in the development of quality of experience (QoE) oriented streaming applications for HEVC in loss prone networks.

Posted Content
TL;DR: In this article, a client rate adaptation algorithm is proposed to yield consistent video quality in HTTP-based adaptive streaming (HAS), where clients have visibility into incoming video within a finite horizon and use the client-side video buffer as a breathing room for not only network bandwidth variability, but also video bitrate variability.
Abstract: In conventional HTTP-based adaptive streaming (HAS), a video source is encoded at multiple levels of constant bitrate representations, and a client makes its representation selections according to the measured network bandwidth. While greatly simplifying adaptation to the varying network conditions, this strategy is not the best for optimizing the video quality experienced by end users. Quality fluctuation can be reduced if the natural variability of video content is taken into consideration. In this work, we study the design of a client rate adaptation algorithm to yield consistent video quality. We assume that clients have visibility into incoming video within a finite horizon. We also take advantage of the client-side video buffer, by using it as a breathing room for not only network bandwidth variability, but also video bitrate variability. The challenge, however, lies in how to balance these two variabilities to yield consistent video quality without risking a buffer underrun. We propose an optimization solution that uses an online algorithm to adapt the video bitrate step-by-step, while applying dynamic programming at each step. We incorporate our solution into PANDA -- a practical rate adaptation algorithm designed for HAS deployment at scale.
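
The balancing act described, using the client buffer to absorb both bandwidth and bitrate variability without an underrun, rests on simple buffer dynamics: each fetched chunk drains size/bandwidth seconds of buffer during download, then adds a fixed playback duration. A toy simulation with illustrative numbers (not PANDA itself):

```python
def simulate_buffer(chunk_bits, bandwidth_bps, chunk_sec=2.0, startup_sec=4.0):
    """Playback-buffer evolution while fetching chunks of varying size.
    Downloading a chunk drains size/bandwidth seconds of buffer, then
    the chunk adds chunk_sec of playable video. Counts rebufferings."""
    buffer, underruns, levels = startup_sec, 0, []
    for size in chunk_bits:
        buffer -= size / bandwidth_bps      # playback drains during download
        if buffer < 0:
            underruns += 1                  # rebuffering event
            buffer = 0.0
        buffer += chunk_sec                 # fetched chunk adds playback time
        levels.append(buffer)
    return levels, underruns

# Steady chunks a 2 Mbps link sustains vs. one oversized VBR chunk
_, ok = simulate_buffer([2e6] * 6, bandwidth_bps=2e6)
_, stall = simulate_buffer([2e6, 2e6, 30e6, 2e6], bandwidth_bps=2e6)
print(ok == 0 and stall > 0)
```

Holding quality constant makes chunk sizes vary with content complexity, so a large chunk can stall playback exactly as in the second run; the paper's dynamic program chooses bitrates so this never happens.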

Journal ArticleDOI
TL;DR: Simulation results show that LinGO delivers live video flows with QoE support and robustness in mobile and dynamic topologies, as needed in future IoT environments.

Journal ArticleDOI
01 Jul 2014
TL;DR: Experimental results on a state-of-the-art benchmark for background subtraction on real-world video data indicate that the pROST method succeeds at a broad variety of background subtraction scenarios, and it outperforms competing approaches when video quality is deteriorated by camera jitter.
Abstract: An increasing number of methods for background subtraction use Robust PCA to identify sparse foreground objects. While many algorithms use the ℓ1-norm as a convex relaxation of the ideal sparsifying function, we approach the problem with a smoothed ℓp-quasi-norm and present pROST, a method for robust online subspace tracking. The algorithm is based on alternating minimization on manifolds. Implemented on a graphics processing unit, it achieves realtime performance at a resolution of 160 × 120. Experimental results on a state-of-the-art benchmark for background subtraction on real-world video data indicate that the method succeeds at a broad variety of background subtraction scenarios, and it outperforms competing approaches when video quality is deteriorated by camera jitter.
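
A smoothed ℓp quasi-norm can be written as Σᵢ (xᵢ² + μ)^(p/2), which stays differentiable at zero while rewarding sparsity more aggressively than the ℓ1-norm. A tiny sketch (the values of p and μ here are illustrative, not the paper's settings):

```python
def smoothed_lp(x, p=0.5, mu=0.01):
    """Smoothed l_p quasi-norm sum((x_i^2 + mu)^(p/2)): a differentiable
    stand-in for the sparsity measure pROST minimizes."""
    return sum((xi * xi + mu) ** (p / 2.0) for xi in x)

sparse = [5.0, 0.0, 0.0, 0.0]       # energy concentrated in one entry
spread = [1.25, 1.25, 1.25, 1.25]   # same l1 mass, spread out
print(smoothed_lp(sparse) < smoothed_lp(spread))   # favors sparse vectors
```

The ℓ1-norm would score both vectors equally (both sum to 5); the quasi-norm prefers the concentrated one, which matches the intuition that foreground pixels form compact sparse regions.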

Journal ArticleDOI
TL;DR: An advanced foveal imaging model is proposed to generate the perceived representation of video by integrating visual attention into the foveation mechanism, and a novel approach to predict video fixations is proposed by mimicking the essential functionality of eye movement.
Abstract: Contrast sensitivity of the human visual system to visual stimuli can be significantly affected by several mechanisms, e.g., vision foveation and attention. Existing studies on foveation based video quality assessment only take into account static foveation mechanism. This paper first proposes an advanced foveal imaging model to generate the perceived representation of video by integrating visual attention into the foveation mechanism. For accurately simulating the dynamic foveation mechanism, a novel approach to predict video fixations is proposed by mimicking the essential functionality of eye movement. Consequently, an advanced contrast sensitivity function, derived from the attention driven foveation mechanism, is modeled and then integrated into a wavelet-based distortion visibility measure to build a full reference attention driven foveated video quality (AFViQ) metric. AFViQ exploits adequately perceptual visual mechanisms in video quality assessment. Extensive evaluation results with respect to several publicly available eye-tracking and video quality databases demonstrate promising performance of the proposed video attention model, fixation prediction approach, and quality metric.

Journal ArticleDOI
TL;DR: The proposed quality model correlates well with the subjective ratings with a Pearson correlation coefficient of 0.985 when the model parameters are predicted from content features and outperforms several well-known quality models.
Abstract: In this paper, we investigate the impact of spatial, temporal, and amplitude resolution on the perceptual quality of a compressed video. Subjective quality tests were carried out on a mobile device and a total of 189 processed video sequences with 10 source sequences included in the test. Subjective data reveal that the impact of spatial resolution (SR), temporal resolution (TR), and quantization stepsize (QS) can each be captured by a function with a single content-dependent parameter, which indicates the decay rate of the quality with each resolution factor. The joint impact of SR, TR, and QS can be accurately modeled by the product of these three functions with only three parameters. The impact of SR and QS on the quality are independent of that of TR, but there are significant interactions between SR and QS. Furthermore, the model parameters can be predicted accurately from a few content features derived from the original video. The proposed model correlates well with the subjective ratings with a Pearson correlation coefficient of 0.985 when the model parameters are predicted from content features. The quality model is further validated on six other subjective rating data sets with very high accuracy and outperforms several well-known quality models.
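
The product-form model can be sketched with three single-parameter factors, one per resolution dimension. The exponential forms and decay rates below are illustrative stand-ins for the content-dependent parameters the paper fits:

```python
import math

def quality(s, t, q, s_max=1.0, t_max=30.0,
            a_s=2.0, a_t=3.0, a_q=0.05):
    """Product-form quality model sketch: one single-parameter factor
    each for spatial resolution s (fraction of full), temporal
    resolution t (fps), and quantization stepsize q. Forms and decay
    rates are illustrative, not the paper's fitted functions."""
    f_s = (1 - math.exp(-a_s * s / s_max)) / (1 - math.exp(-a_s))
    f_t = (1 - math.exp(-a_t * t / t_max)) / (1 - math.exp(-a_t))
    f_q = math.exp(-a_q * (q - 1.0))
    return f_s * f_t * f_q

full = quality(1.0, 30.0, 1.0)           # full SR/TR, finest QS -> 1.0
print(quality(1.0, 15.0, 1.0) < full)    # halving frame rate lowers quality
print(quality(1.0, 30.0, 20.0) < full)   # coarser quantization lowers quality
```

With only one decay parameter per factor, fitting the model to a new content reduces to estimating three numbers, which is what makes prediction from content features tractable.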

Journal ArticleDOI
TL;DR: In this article, a set of acceptability-based QoE models, denoted as A-QoE, is proposed based on the results of comprehensive user studies on subjective quality acceptance assessments.
Abstract: Quality of experience (QoE) measures the overall perceived quality of mobile video delivery from subjective user experience and objective system performance. Current QoE prediction models have two main limitations: (1) insufficient consideration of the factors influencing QoE, and (2) limited studies on QoE models for acceptability prediction. In this paper, a set of novel acceptability-based QoE models, denoted as A-QoE, is proposed based on the results of comprehensive user studies on subjective quality acceptance assessments. The models are able to predict users' acceptability and pleasantness in various mobile video usage scenarios. Statistical nonlinear regression analysis has been used to build the models with a group of influencing factors as independent predictors, which include encoding parameters and bitrate, video content characteristics, and mobile device display resolution. The performance of the proposed A-QoE models has been compared with three well-known objective Video Quality Assessment metrics: PSNR, SSIM and VQM. The proposed A-QoE models have high prediction accuracy and usage flexibility. Future user-centred mobile video delivery systems can benefit from applying the proposed QoE-based management to optimize video coding and quality delivery strategies.

Proceedings ArticleDOI
02 May 2014
TL;DR: A secure video steganography algorithm based on the principle of linear block codes is proposed; it has high embedding efficiency, and the stego video quality is close to the original video quality.
Abstract: Due to the high speed of the internet and advances in technology, people are becoming more worried about information being hacked by attackers. Recently, many algorithms for steganography and data hiding have been proposed. Steganography is the process of embedding secret information inside a host medium (text, audio, image or video). Concurrently, many powerful steganographic analysis software programs have been provided to unauthorized users to retrieve the valuable secret information that was embedded in the carrier files. Some steganography algorithms can be easily detected by steganalytical detectors because of their lack of security and embedding efficiency. In this paper, we propose a secure video steganography algorithm based on the principle of linear block codes. Nine uncompressed video sequences are used as cover data and a binary image logo as the secret message. The pixel positions of both the cover videos and the secret message are randomly reordered using a private key to improve the system's security. The secret message is then encoded with the Hamming code (7, 4) before the embedding process to make the message even more secure. The result of the encoded message is added to randomly generated values using the XOR function. After these steps, which make the message secure enough, it is ready to be embedded into the cover video frames. In addition, the embedding area in each frame is randomly selected and differs from frame to frame to improve the steganography scheme's robustness. Furthermore, the algorithm has high embedding efficiency, as demonstrated by the experimental results we have obtained. Regarding the system's quality, the Peak Signal-to-Noise Ratio (PSNR) of the stego videos is above 51 dB, which is close to the original video quality.
The embedding payload is also acceptable, where in each video frame we can embed 16 Kbits and it can go up to 90 Kbits without noticeable degrading of the stego video's quality.
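The encoding step described above can be illustrated with a minimal sketch: 4 message bits are encoded into a 7-bit Hamming (7, 4) codeword, which is then XORed with a key-seeded pseudo-random stream. The function names, parity-bit layout, and keystream construction are illustrative assumptions, not the paper's implementation.

```python
import random

def hamming74_encode(d):
    """Encode 4 data bits (list of 0/1) into a 7-bit Hamming (7, 4) codeword."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 2, 4
    p2 = d1 ^ d3 ^ d4  # parity over positions 1, 3, 4
    p3 = d2 ^ d3 ^ d4  # parity over positions 2, 3, 4
    return [p1, p2, d1, p3, d2, d3, d4]

def xor_with_keystream(bits, key):
    """XOR bits with a key-seeded pseudo-random stream; applying it twice
    with the same key restores the original bits."""
    rng = random.Random(key)
    return [b ^ rng.randint(0, 1) for b in bits]

codeword = hamming74_encode([1, 0, 1, 1])
secured = xor_with_keystream(codeword, key=42)
```

Because the stream is derived from the private key, the receiver can regenerate it and undo the XOR before Hamming decoding.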

Proceedings ArticleDOI
TL;DR: A broadcast-scenario rate-distortion performance analysis and mutual comparison of one of the latest video coding standards, H.265/HEVC, with the recently released proprietary video coding scheme VP9. The results indicate a general dominance of the HEVC-based encoding algorithm over the alternatives, while VP9 and AVC show similar performance.
Abstract: The increasing effort of broadcast providers to transmit UHD (Ultra High Definition) content is likely to increase demand for ultra high definition televisions (UHDTVs). To compress UHDTV content, several alternative encoding mechanisms exist. In addition to internationally recognized standards, open-access proprietary options, such as the VP9 video encoding scheme, have recently appeared and are gaining popularity. One of the main goals of these encoders is to efficiently compress video sequences beyond HDTV resolution for various scenarios, such as broadcasting or internet streaming. In this paper, a broadcast-scenario rate-distortion performance analysis and mutual comparison of one of the latest video coding standards, H.265/HEVC, with the recently released proprietary video coding scheme VP9 is presented. In addition, one of the most popular and widely deployed encoders, H.264/AVC, has been included in the evaluation to serve as a comparison baseline. The comparison is performed by means of subjective evaluations showing the actual differences between encoding algorithms in terms of perceived quality. The results indicate a dominance of the HEVC-based encoding algorithm over the alternatives when a wide range of bit-rates is considered, from very low bit-rates corresponding to low quality up to high bit-rates corresponding to transparent quality relative to the original uncompressed video. In addition, VP9 shows competitive results for synthetic content at bit-rates corresponding to operating points for transparent or near-transparent video quality.

Proceedings ArticleDOI
19 Mar 2014
TL;DR: An optimization solution that uses an online algorithm to adapt the video bitrate step-by-step, while applying dynamic programming at each step is proposed and incorporated into PANDA -- a practical rate adaptation algorithm designed for HAS deployment at scale.
Abstract: In conventional HTTP-based adaptive streaming (HAS), a video source is encoded at multiple levels of constant bitrate representations, and a client makes its representation selections according to the measured network bandwidth. While greatly simplifying adaptation to the varying network conditions, this strategy is not the best for optimizing the video quality experienced by end users. Quality fluctuation can be reduced if the natural variability of video content is taken into consideration. In this work, we study the design of a client rate adaptation algorithm to yield consistent video quality. We assume that clients have visibility into incoming video within a finite horizon. We also take advantage of the client-side video buffer, by using it as a breathing room for not only network bandwidth variability, but also video bitrate variability. The challenge, however, lies in how to balance these two variabilities to yield consistent video quality without risking a buffer underrun. We propose an optimization solution that uses an online algorithm to adapt the video bitrate step-by-step, while applying dynamic programming at each step. We incorporate our solution into PANDA -- a practical rate adaptation algorithm designed for HAS deployment at scale.
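The idea of using the client-side buffer as breathing room can be sketched with a one-step greedy rule: pick the highest representation whose download is predicted to leave the playout buffer above a safety margin. This is a simplified stand-in for intuition only, not the paper's online dynamic-programming formulation; the function name, parameters, and defaults are assumptions.

```python
def select_quality(seg_sizes_bits, buffer_s, bandwidth_kbps,
                   seg_dur=2.0, min_buf=5.0):
    """Pick the highest representation (index into seg_sizes_bits, sorted by
    ascending bitrate) whose next-segment download keeps the buffer above
    min_buf seconds. Falls back to the lowest representation."""
    best = 0
    for i, size in enumerate(seg_sizes_bits):
        download_s = size / (bandwidth_kbps * 1000.0)  # estimated fetch time
        # buffer drains while downloading, then gains one segment of content
        predicted = buffer_s - download_s + seg_dur
        if predicted >= min_buf:
            best = i
    return best
```

With a 6 s buffer and a 1 Mbps estimate, a 4 Mbit segment (4 s fetch) would push the buffer below the margin, so the rule settles on a cheaper representation.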

Proceedings ArticleDOI
01 Dec 2014
TL;DR: This work studies the visual quality of streaming video and proposes a fusion-based video quality assessment (FVQA) index, whose fusion coefficients are learned from training video samples in the same content group.
Abstract: In this work, we study the visual quality of streaming video and propose a fusion-based video quality assessment (FVQA) index to predict its quality. In the first step, video sequences are grouped according to their content complexity to reduce content diversity within each group. Then, at the second step, several existing video quality assessment methods are fused to provide the final video quality score, where fusion coefficients are learned from training video samples in the same group. We demonstrate the superior performance of FVQA as compared with other video quality assessment methods using the MCL-V video quality database.
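The fusion step can be illustrated with a per-group least-squares fit of coefficients mapping several metrics' scores to subjective scores. This is a generic sketch of score fusion under assumed names, not the authors' exact training procedure.

```python
import numpy as np

def learn_fusion_weights(metric_scores, subjective_scores):
    """Fit fusion coefficients (one per metric, plus a bias) by least squares.
    metric_scores: (N videos x M metrics) array; subjective_scores: length N."""
    X = np.column_stack([metric_scores, np.ones(len(subjective_scores))])
    w, *_ = np.linalg.lstsq(X, subjective_scores, rcond=None)
    return w

def fuse(scores, w):
    """Combine one video's M metric scores into a single fused quality score."""
    return float(np.dot(np.append(scores, 1.0), w))
```

In the paper's scheme, a separate set of coefficients would be learned for each content-complexity group.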

Journal ArticleDOI
TL;DR: Evaluating the performance of seven state-of-the-art video quality metrics with respect to compressed medical ultrasound video sequences indicates that the visual information fidelity, structural similarity index, and universal quality index metrics show good correlation with the subjective scores provided by medical experts.
Abstract: The quality of experience and quality of service provided in the healthcare sector are critical in evaluating the reliable delivery of the healthcare services provided. Medical images and videos play a major role in modern e-health services and have become an integral part of medical data communication systems. The quality evaluation of medical images and videos is an essential process, and one of the ways of addressing it is via the use of quality metrics. In this paper, we evaluate the performance of seven state-of-the-art video quality metrics with respect to compressed medical ultrasound video sequences. We study the performance of each video quality metric in representing the diagnostic quality of the video, by evaluating the correlation of each metric with the subjective opinions of medical experts. The results indicate that the visual information fidelity, structural similarity index, and universal quality index metrics show good correlation with the subjective scores provided by medical experts. The tests also investigate the performance of the emerging video compression standard, High Efficiency Video Coding (HEVC), for medical ultrasound video compression. The results show that, using HEVC with the considered ultrasound video sequences, a diagnostically reliable compressed ultrasound video can be obtained for compression with values of the quantization parameter up to 35.
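Evaluating a metric against expert opinions, as described above, typically reduces to computing the Pearson (linear) and Spearman (rank-order) correlations between the metric's scores and the subjective scores. A minimal version, ignoring tie handling in the rank computation, might look like:

```python
import numpy as np

def pearson(x, y):
    """Pearson linear correlation coefficient between two score vectors."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    return float(np.corrcoef(x, y)[0, 1])

def spearman(x, y):
    """Spearman rank-order correlation: Pearson computed on the ranks
    (no tie correction in this sketch)."""
    rank = lambda v: np.argsort(np.argsort(v)).astype(float)
    return pearson(rank(np.asarray(x)), rank(np.asarray(y)))
```

Spearman rewards any monotone agreement with the subjective scores, while Pearson additionally penalizes nonlinearity, which is why both are usually reported.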

Journal ArticleDOI
TL;DR: A new video quality model (VQM) that accounts for the perceptual impact of variable frame delays (VFD) in videos, with demonstrated top performance on the Laboratory for Image and Video Engineering (LIVE) mobile video quality assessment (VQA) database.
Abstract: We announce a new video quality model (VQM) that accounts for the perceptual impact of variable frame delays (VFD) in videos, with demonstrated top performance on the Laboratory for Image and Video Engineering (LIVE) mobile video quality assessment (VQA) database. This model, called VQM_VFD, uses perceptual features extracted from spatial-temporal blocks spanning fixed angular extents and a long edge detection filter. VQM_VFD predicts video quality by measuring multiple frame delays, using perception-based parameters to track subjective quality over time. In the performance analysis of VQM_VFD, we evaluated its efficacy at predicting human opinions of visual quality. A detailed correlation analysis and statistical hypothesis testing show that VQM_VFD accurately predicts human subjective judgments and substantially outperforms top-performing image quality assessment and VQA models previously tested on the LIVE mobile VQA database. VQM_VFD achieved the best performance on the mobile and tablet studies of the LIVE mobile VQA database for simulated compression, wireless packet-loss, and rate adaptation, but not for temporal dynamics. These results validate the new model and warrant a hard release of the VQM_VFD algorithm. It is freely available for any purpose, commercial or noncommercial, at http://www.its.bldrdoc.gov/vqm/ .

Journal ArticleDOI
01 Feb 2014
TL;DR: The results show that WCCP significantly improves the network performance and the quality of received video in the sink nodes, and outperforms the existing state-of-the-art congestion control protocols.
Abstract: The growing interest in applications of Wireless Multimedia Sensor Networks (WMSNs) imposes new challenges on congestion control protocols in such networks. In this paper, we propose a new content-aware cross-layer WMSN Congestion Control Protocol (WCCP) that considers the characteristics of multimedia content. WCCP employs a Source Congestion Avoidance Protocol (SCAP) in the source nodes and a Receiver Congestion Control Protocol (RCCP) in the intermediate nodes. SCAP uses Group of Pictures (GOP) size prediction to detect congestion in the network, and avoids congestion by adjusting the sending rate of the source nodes and the distribution of the packets departing from them. In addition, RCCP monitors the queue length of the intermediate nodes to detect congestion in both monitoring and event-driven traffic. Moreover, to improve the received video quality at the base stations, WCCP keeps the I-frames and discards the less important frame types of the compressed video in congestion situations. The proposed WCCP protocol is evaluated through simulations based on various performance metrics such as packet loss rate, frame loss rate, Peak Signal-to-Noise Ratio (PSNR), end-to-end delay, throughput, and energy consumption. The results show that WCCP significantly improves the network performance and the quality of the received video at the sink nodes, and outperforms existing state-of-the-art congestion control protocols.
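The I-frame-preserving drop policy at an intermediate node can be sketched as a queue filter that starts discarding non-I-frame packets once occupancy crosses a threshold. The packet representation, threshold value, and function name are illustrative assumptions, not RCCP's actual mechanism.

```python
from collections import deque

def forward_packets(incoming, capacity, congestion_threshold=0.8):
    """Admit packets into a bounded queue; once occupancy reaches the
    congestion threshold, drop P/B-frame packets and keep only I-frames."""
    kept = deque()
    for pkt in incoming:
        congested = len(kept) >= congestion_threshold * capacity
        if congested and pkt["frame_type"] != "I":
            continue  # discard less important frames under congestion
        if len(kept) < capacity:
            kept.append(pkt)
    return kept
```

Since every P- and B-frame in a GOP is decoded relative to its I-frame, keeping I-frames under congestion preserves the most decodable video at the sink.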

Journal ArticleDOI
TL;DR: Several centralized and distributed algorithms and heuristics are proposed that allow nodes inside the network to steer the HAS client's quality selection process and are able to enforce management policies by limiting the set of available qualities for specific clients.
Abstract: HTTP adaptive streaming (HAS) services allow the quality of streaming video to be automatically adapted by the client application in the face of network and device dynamics. Due to their advantages compared to traditional techniques, HAS-based protocols are widely used for over-the-top (OTT) video streaming. However, they are yet to be adopted in managed environments, such as ISP networks. A major obstacle is the purely client-driven design of current HAS approaches, which leads to excessive quality oscillations, suboptimal behavior, and the inability to enforce management policies. Moreover, the provider has no control over the quality that is provided, which is essential when offering a managed service. This article tackles these challenges and facilitates the adoption of HAS in managed networks. Specifically, several centralized and distributed algorithms and heuristics are proposed that allow nodes inside the network to steer the HAS client's quality selection process. The algorithms are able to enforce management policies by limiting the set of available qualities for specific clients. Additionally, simulation results show that by coordinating the quality selection process across multiple clients, the proposed algorithms significantly reduce quality oscillations by a factor of five and increase the average delivered video quality by at least 14%.
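The policy of limiting the set of available qualities per client can be illustrated with a trivial filter over the representation list. In a real deployment the restriction would have to be signaled to the client, e.g., via a rewritten manifest; that detail, along with the function name and parameters, is an assumption for illustration.

```python
def allowed_qualities(all_bitrates_kbps, client_cap_kbps):
    """In-network policy node restricting which representations a HAS client
    may select: drop every bitrate above the client's assigned cap, but always
    leave at least the lowest representation so playback remains possible."""
    allowed = [b for b in all_bitrates_kbps if b <= client_cap_kbps]
    return allowed or [min(all_bitrates_kbps)]
```

Assigning caps per client gives the network a coordination knob: clients sharing a bottleneck can be kept from leapfrogging each other, which is what suppresses the oscillations the article measures.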