Proceedings Article•

A fusion approach to video quality assessment based on temporal decomposition

TL;DR: This work decomposes an input video clip into multiple smaller intervals, measures the quality of each interval separately, and applies a fusion approach to integrate these scores into a final one; the result improves MOVIE and is competitive with other state-of-the-art video quality metrics.
Abstract: In this work, we decompose an input video clip into multiple smaller intervals, measure the quality of each interval separately, and apply a fusion approach to integrate these scores into a final one. In more detail, an input video clip is first decomposed into smaller units along the temporal domain, called temporal decomposition units (TDUs). Next, for each TDU, which consists of a small number of frames, we adopt a proper video quality metric (specifically, the MOVIE index in this work) to compute the quality scores of all frames and, based on sociological findings, choose the worst scores of the TDUs for data fusion. Finally, a regression approach is used to fuse the selected worst scores from all TDUs into the ultimate quality score of the input video as a whole. We conduct extensive experiments on the LIVE video database and show that the proposed approach indeed improves MOVIE and is competitive with other state-of-the-art video quality metrics.
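The described pipeline lends itself to a compact sketch. Below is a minimal, illustrative Python version of the TDU decomposition, worst-score pooling, and regression fusion; the per-frame MOVIE scores are assumed precomputed, videos are assumed equal length, and the SVR regressor is an assumption rather than the paper's exact configuration.

```python
# Minimal sketch of the TDU-based fusion pipeline (illustrative assumptions:
# per-frame MOVIE scores are precomputed, videos are equal length, and the
# regression stage is an SVR; the paper's exact settings may differ).
import numpy as np
from sklearn.svm import SVR

def tdu_worst_scores(frame_scores, tdu_len=8, k_worst=2):
    """Split per-frame quality scores into temporal decomposition units
    (TDUs) and keep the k worst (lowest) scores of each unit."""
    feats = []
    for start in range(0, len(frame_scores) - tdu_len + 1, tdu_len):
        unit = np.sort(frame_scores[start:start + tdu_len])
        feats.extend(unit[:k_worst])            # worst-case pooling per TDU
    return np.asarray(feats)

def train_fusion(per_video_frame_scores, dmos):
    """Fuse the selected worst scores of all TDUs into one quality score
    per video by regressing against subjective scores (DMOS)."""
    X = np.stack([tdu_worst_scores(s) for s in per_video_frame_scores])
    reg = SVR(kernel="rbf")
    reg.fit(X, dmos)
    return reg
```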
Citations
Journal Article•DOI•
01 Jan 2013
TL;DR: This work provides an in-depth review of recent developments in the field of visual quality assessment and puts equal emphasis on video quality databases and metrics as this is a less investigated area.
Abstract: Research on visual quality assessment has been active during the last decade. In this work, we provide an in-depth review of recent developments in the field. As compared with existing survey papers, our current work has several unique contributions. First, besides image quality databases and metrics, we put equal emphasis on video quality databases and metrics as this is a less investigated area. Second, we discuss the application of visual quality evaluation to perceptual coding as an example for applications. Third, we benchmark the performance of state-of-the-art visual quality metrics with experiments. Finally, future trends in visual quality assessment are discussed.

63 citations

Proceedings Article•DOI•
01 Sep 2019
TL;DR: Dense blocks are introduced into the U-Net architecture, which alleviate the vanishing-gradient problem while also reducing the number of parameters.
Abstract: Deep convolutional neural networks (DCNNs) have demonstrated their potential to generate reasonable results in image inpainting. Some existing methods use convolution to generate surrounding features, pass the features through fully connected layers, and finally predict the missing regions. Although the final result is semantically reasonable, blurred regions appear because standard convolution is used, which conditions on both the valid pixels and the substitute values in the masked holes. In this paper, we introduce dense blocks into the U-Net architecture, which alleviate the vanishing-gradient problem while also reducing the number of parameters. Most importantly, they enhance the transfer of features and make more efficient use of them. Partial convolution is used to solve the problem of artifacts such as color differences and blurring. Experiments on the Places365 dataset demonstrate that our approach can generate more detailed and semantically reasonable results in random-area image inpainting.
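To make the dense-block idea concrete, here is a minimal PyTorch sketch (the framework, growth rate, and layer count are assumptions for illustration; the abstract does not specify the block design). Each layer receives the concatenation of all previous feature maps, which shortens gradient paths and reuses features.

```python
# Minimal dense block sketch in PyTorch (framework assumed, not stated in
# the abstract). Each layer sees the concatenation of all earlier outputs.
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    def __init__(self, in_channels, growth_rate=32, num_layers=4):
        super().__init__()
        self.layers = nn.ModuleList()
        for i in range(num_layers):
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(in_channels + i * growth_rate),
                nn.ReLU(inplace=True),
                nn.Conv2d(in_channels + i * growth_rate, growth_rate,
                          kernel_size=3, padding=1, bias=False),
            ))

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            out = layer(torch.cat(features, dim=1))  # dense connectivity
            features.append(out)
        return torch.cat(features, dim=1)
```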

33 citations

Proceedings Article•DOI•
11 Oct 2020
TL;DR: This work proposes a new method to directly locate waterfowl farms, both registered and unregistered, without the need for human labeling, and shows that a simple U-Net combined with residual blocks outperforms other deep models on this task.
Abstract: In the epidemic prevention of avian influenza, there is a large gap between the ideal and the reality, which is why outbreaks often go uncontrolled. One reason is that many illegal waterfowl farms are built without government registration. In this work, we propose a new method to directly locate waterfowl farms, including both registered and unregistered ones, without the need for human labeling. This not only saves human labor but also allows the location and size information of waterfowl farms to be updated regularly, owing to the speed of automated computation. We also propose a new method for satellite image augmentation. The layers of our model are no deeper than those of other deep neural network models; nevertheless, we show that a simple U-Net combined with residual blocks outperforms other deep models on this task.
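For illustration, a residual block of the kind typically paired with U-Net encoder/decoder stages can be sketched as follows in PyTorch (the framework and channel layout are assumptions; the abstract does not specify them):

```python
# Residual block sketch of the kind that could be combined with U-Net stages
# (PyTorch assumed; channel width is illustrative).
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.body(x) + x)   # identity skip connection
```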

10 citations

Proceedings Article•DOI•
23 Jul 2018
TL;DR: Extensive experiments on the LIVE Video Quality Database suggest the proposed video quality assessment model correlates better with human visual perception than other state-of-the-art methods.
Abstract: In this work, we propose a full-reference method to estimate video quality. First, we decompose the video into one spatial image and two spatiotemporal slice images. Then, for each of them, sixteen Laws texture filters are applied to generate nine different Laws feature maps. To compare the similarity of the feature maps obtained from the original and distorted videos, we compute two-dimensional correlation coefficients. Since the correlation coefficients are computed for each frame and spatiotemporal slice, we choose only four statistical values to represent them, in order to reduce complexity. Lastly, a regression approach is used to learn the mapping between the selected features and subjective quality scores. Extensive experiments on the LIVE Video Quality Database suggest that our video quality assessment model correlates better with human visual perception than other state-of-the-art methods.
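The sixteen-filter/nine-map construction matches the classic Laws texture-energy scheme, which can be sketched as follows (NumPy/SciPy is a library assumption; the standard Laws 5-tap kernel set is assumed, since the abstract does not spell it out):

```python
# Sketch of Laws texture feature maps and the 2-D correlation comparison
# described in the abstract (NumPy/SciPy assumed; kernels follow the
# standard Laws 5-tap set).
import numpy as np
from scipy.signal import convolve2d

# Four classic 5-tap Laws kernels: Level, Edge, Spot, Ripple.
L5 = np.array([1, 4, 6, 4, 1], dtype=float)
E5 = np.array([-1, -2, 0, 2, 1], dtype=float)
S5 = np.array([-1, 0, 2, 0, -1], dtype=float)
R5 = np.array([1, -4, 6, -4, 1], dtype=float)
kernels = {"L5": L5, "E5": E5, "S5": S5, "R5": R5}

def laws_maps(img):
    """Apply the 16 separable Laws filters, merge symmetric pairs, and drop
    L5L5, yielding the usual nine texture energy maps."""
    raw = {a + b: convolve2d(img, np.outer(ka, kb), mode="same")
           for a, ka in kernels.items() for b, kb in kernels.items()}
    names = ["E5E5", "S5S5", "R5R5", "L5E5", "L5S5", "L5R5",
             "E5S5", "E5R5", "S5R5"]
    return [np.abs(raw[n]) +
            (np.abs(raw[n[2:] + n[:2]]) if n[:2] != n[2:] else 0)
            for n in names]

def corr2d(a, b):
    """Two-dimensional correlation coefficient between two feature maps."""
    a, b = a - a.mean(), b - b.mean()
    return float((a * b).sum() /
                 np.sqrt((a * a).sum() * (b * b).sum() + 1e-12))
```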

9 citations


Additional excerpts

  • ...[15] developed a fusion approach to VQA....


Proceedings Article•DOI•
11 Oct 2020
TL;DR: A general-purpose no-reference video quality assessment (VQA) metric based on the cascade combination of a 2D convolutional neural network, a multi-layer perceptron, and a support vector regression model; experiments demonstrate that the method is competitive with other full-reference and NR VQA metrics.
Abstract: In this paper, we propose a general-purpose no-reference (NR) video quality assessment (VQA) metric based on the cascade combination of a 2D convolutional neural network (CNN), a multi-layer perceptron (MLP), and a support vector regression (SVR) model. Features are extracted from both the spatial and spatiotemporal domains using a 2D CNN. These features capture different aspects of video frames for predicting quality scores, and we feed them to MLPs to obtain several estimated quality scores from different perspectives. Finally, these estimated scores are combined into a final quality score by an SVR model. The proposed method is evaluated on the well-known LIVE Video Quality Database against other state-of-the-art, well-performing VQA metrics, and the experimental results demonstrate that our method is competitive with both full-reference and NR VQA metrics.
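The cascade's last two stages can be sketched as below, with the 2D-CNN features assumed to be precomputed (model sizes and hyperparameters are illustrative, not the paper's configuration):

```python
# Sketch of the MLP-then-SVR cascade (CNN features assumed precomputed;
# all hyperparameters here are illustrative).
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR

def train_cascade(feature_groups, y):
    """feature_groups: list of (n_videos, d_i) arrays, one per feature type.
    Each MLP maps one feature group to an estimated quality score; the SVR
    fuses the per-group estimates into the final score."""
    mlps, stage1 = [], []
    for X in feature_groups:
        mlp = MLPRegressor(hidden_layer_sizes=(64,), max_iter=2000)
        mlp.fit(X, y)
        mlps.append(mlp)
        stage1.append(mlp.predict(X))
    svr = SVR(kernel="rbf")
    svr.fit(np.column_stack(stage1), y)
    return mlps, svr
```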

3 citations


Cites methods from "A fusion approach to video quality ..."

  • ...The experiment shows the proposed method outperforms other state-of-the-art FR VQA metrics, such as PSNR, VQM [5], MOVIE [4], STMAD [2], ViS3 [3], CA-TD-MOVIE [33], STILFC VQA [34] and STVA-VQA [35], and NR VQA metrics, V-BLIINDS [13], VIIDEO [14], and SACONVA [15]....


References
Journal Article•DOI•
TL;DR: In this article, a structural similarity index is proposed for image quality assessment based on the degradation of structural information, and its promise is demonstrated through comparison to both subjective ratings and objective methods on a database of images compressed with JPEG and JPEG2000.
Abstract: Objective methods for assessing perceptual image quality traditionally attempted to quantify the visibility of errors (differences) between a distorted image and a reference image using a variety of known properties of the human visual system. Under the assumption that human visual perception is highly adapted for extracting structural information from a scene, we introduce an alternative complementary framework for quality assessment based on the degradation of structural information. As a specific example of this concept, we develop a structural similarity index and demonstrate its promise through a set of intuitive examples, as well as comparison to both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000. A MATLAB implementation of the proposed algorithm is available online at http://www.cns.nyu.edu/~lcv/ssim/.
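For reference, the index's standard closed form for two image patches x and y, with local means μ, variances σ², cross-covariance σ_xy, and stabilizing constants C1, C2, is:

```latex
\mathrm{SSIM}(x, y) =
  \frac{(2\mu_x\mu_y + C_1)\,(2\sigma_{xy} + C_2)}
       {(\mu_x^2 + \mu_y^2 + C_1)\,(\sigma_x^2 + \sigma_y^2 + C_2)}
```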

40,609 citations

Book•
Christopher M. Bishop•
17 Aug 2006
TL;DR: Probability Distributions, Linear Models for Regression, Linear Models for Classification, Neural Networks, Graphical Models, Mixture Models and EM, Sampling Methods, Continuous Latent Variables, and Sequential Data are studied.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

22,840 citations

Journal Article•DOI•
TL;DR: A brief review of Pattern Recognition and Machine Learning, as published in Technometrics.
Abstract: (2007). Pattern Recognition and Machine Learning. Technometrics: Vol. 49, No. 3, pp. 366-366.

18,802 citations

Journal Article•DOI•
TL;DR: The independent test results from the VQEG FR-TV Phase II tests are summarized, as well as results from eleven other subjective data sets that were used to develop the NTIA General Model.
Abstract: The National Telecommunications and Information Administration (NTIA) General Model for estimating video quality and its associated calibration techniques were independently evaluated by the Video Quality Experts Group (VQEG) in their Phase II Full Reference Television (FR-TV) test. The NTIA General Model was the only video quality estimator that was in the top performing group for both the 525-line and 625-line video tests. As a result, the American National Standards Institute (ANSI) adopted the NTIA General Model and its associated calibration techniques as a North American Standard in 2003. The International Telecommunication Union (ITU) has also included the NTIA General Model as a normative method in two Draft Recommendations. This paper presents a description of the NTIA General Model and its associated calibration techniques. The independent test results from the VQEG FR-TV Phase II tests are summarized, as well as results from eleven other subjective data sets that were used to develop the method.

1,268 citations

Journal Article•DOI•
TL;DR: A recent large-scale subjective study of video quality on a collection of videos distorted by a variety of application-relevant processes, resulting in a diverse, independent, and freely available public database of distorted videos and subjective scores.
Abstract: We present the results of a recent large-scale subjective study of video quality on a collection of videos distorted by a variety of application-relevant processes. Methods to assess the visual quality of digital videos as perceived by human observers are becoming increasingly important, due to the large number of applications that target humans as the end users of video. Owing to the many approaches to video quality assessment (VQA) that are being developed, there is a need for a diverse independent public database of distorted videos and subjective scores that is freely available. The resulting Laboratory for Image and Video Engineering (LIVE) Video Quality Database contains 150 distorted videos (obtained from ten uncompressed reference videos of natural scenes) that were created using four different commonly encountered distortion types. Each video was assessed by 38 human subjects, and the difference mean opinion scores (DMOS) were recorded. We also evaluated the performance of several state-of-the-art, publicly available full-reference VQA algorithms on the new database. A statistical evaluation of the relative performance of these algorithms is also presented. The database has a dedicated web presence that will be maintained as long as it remains relevant and the data is available online.
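A typical way to use such a database is to score its videos with a candidate metric and compute rank and linear correlations against the recorded DMOS values. A minimal sketch follows (the SROCC/PLCC protocol is a common evaluation convention, not something the database itself prescribes):

```python
# Benchmarking a VQA metric against subjective DMOS values (common protocol,
# illustrated with SciPy's correlation functions).
from scipy.stats import spearmanr, pearsonr

def benchmark(predicted_scores, dmos):
    srocc, _ = spearmanr(predicted_scores, dmos)   # monotonic agreement
    plcc, _ = pearsonr(predicted_scores, dmos)     # linear agreement
    return srocc, plcc
```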

1,172 citations