
Showing papers by "Alan C. Bovik published in 2010"


Journal ArticleDOI
TL;DR: A recent large-scale subjective study of video quality on a collection of videos distorted by a variety of application-relevant processes results in a diverse independent public database of distorted videos and subjective scores that is freely available.
Abstract: We present the results of a recent large-scale subjective study of video quality on a collection of videos distorted by a variety of application-relevant processes. Methods to assess the visual quality of digital videos as perceived by human observers are becoming increasingly important, due to the large number of applications that target humans as the end users of video. Owing to the many approaches to video quality assessment (VQA) that are being developed, there is a need for a diverse independent public database of distorted videos and subjective scores that is freely available. The resulting Laboratory for Image and Video Engineering (LIVE) Video Quality Database contains 150 distorted videos (obtained from ten uncompressed reference videos of natural scenes) that were created using four different commonly encountered distortion types. Each video was assessed by 38 human subjects, and the difference mean opinion scores (DMOS) were recorded. We also evaluated the performance of several state-of-the-art, publicly available full-reference VQA algorithms on the new database. A statistical evaluation of the relative performance of these algorithms is also presented. The database has a dedicated web presence that will be maintained as long as it remains relevant and the data is available online.
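
As a rough illustration of how difference mean opinion scores (DMOS) of the kind mentioned above are commonly derived from raw subjective ratings (per-subject difference scores, Z-scoring, and rescaling), here is a minimal numpy sketch. The exact processing used for the LIVE Video Quality Database may differ, and the array names are hypothetical.

```python
import numpy as np

def compute_dmos(raw_scores, ref_index):
    """Illustrative DMOS computation from raw opinion scores.

    raw_scores : (num_subjects, num_videos) array of quality ratings.
    ref_index  : for each video column j, the column index of its hidden
                 reference version.
    """
    # Difference scores: rating of the reference minus rating of the test video.
    diff = raw_scores[:, ref_index] - raw_scores

    # Z-score per subject to remove individual bias and scale differences.
    z = (diff - diff.mean(axis=1, keepdims=True)) / diff.std(axis=1, keepdims=True)

    # Average across subjects, then rescale to a convenient 0-100 range.
    z_mean = z.mean(axis=0)
    return 100.0 * (z_mean - z_mean.min()) / (z_mean.max() - z_mean.min())
```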

1,172 citations


Journal ArticleDOI
TL;DR: A new two-step framework for no-reference image quality assessment based on natural scene statistics (NSS) is proposed, which does not require any knowledge of the distorting process and is modular in that it can be extended to any number of distortions.
Abstract: Present day no-reference/blind image quality assessment (NR IQA) algorithms usually assume that the distortion affecting the image is known. This is a limiting assumption for practical applications, since in a majority of cases the distortions in the image are unknown. We propose a new two-step framework for no-reference image quality assessment based on natural scene statistics (NSS). Once trained, the framework does not require any knowledge of the distorting process, and the framework is modular in that it can be extended to any number of distortions. We describe the framework for blind image quality assessment, and a version of this framework, the blind image quality index (BIQI), is evaluated on the LIVE image quality assessment database. A software release of BIQI has been made available online: http://live.ece.utexas.edu/research/quality/BIQI_release.zip.
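
The two-step structure described here (first identify the likely distortion, then apply distortion-specific quality measurement) can be sketched as a probability-weighted combination. The classifier, feature extractor, and per-distortion quality functions below are placeholders for illustration, not the released BIQI implementation.

```python
import numpy as np
from sklearn.svm import SVC

class TwoStepBlindIQA:
    """Schematic two-step NR IQA: distortion identification, then
    probability-weighted, distortion-specific quality estimation."""

    def __init__(self, extract_nss_features, quality_estimators):
        # extract_nss_features: callable image -> 1-D NSS feature vector.
        # quality_estimators: dict {distortion_name: callable(features) -> score}.
        self.extract = extract_nss_features
        self.estimators = quality_estimators
        self.classifier = SVC(probability=True)

    def train_classifier(self, images, distortion_labels):
        X = np.vstack([self.extract(img) for img in images])
        self.classifier.fit(X, distortion_labels)

    def score(self, image):
        x = self.extract(image).reshape(1, -1)
        probs = self.classifier.predict_proba(x)[0]
        # Step 2: weight each distortion-specific quality estimate by the
        # probability that the image suffers from that distortion.
        names = self.classifier.classes_
        return float(sum(p * self.estimators[name](x.ravel())
                         for p, name in zip(probs, names)))
```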

1,085 citations


Journal ArticleDOI
TL;DR: A general, spatio-spectrally localized multiscale framework for evaluating dynamic video fidelity that integrates both spatial and temporal aspects of distortion assessment and is found to be quite competitive with, and even outperform, algorithms developed and submitted to the VQEG FRTV Phase 1 study, as well as more recent VQA algorithms tested on this database.
Abstract: There has recently been a great deal of interest in the development of algorithms that objectively measure the integrity of video signals. Since video signals are being delivered to human end users in an increasingly wide array of applications and products, it is important that automatic methods of video quality assessment (VQA) be available that can assist in controlling the quality of video being delivered to this critical audience. Naturally, the quality of motion representation in videos plays an important role in the perception of video quality, yet existing VQA algorithms make little direct use of motion information, thus limiting their effectiveness. We seek to ameliorate this by developing a general, spatio-spectrally localized multiscale framework for evaluating dynamic video fidelity that integrates both spatial and temporal (and spatio-temporal) aspects of distortion assessment. Video quality is evaluated not only in space and time, but also in space-time, by evaluating motion quality along computed motion trajectories. Using this framework, we develop a full reference VQA algorithm for which we coin the term the MOtion-based Video Integrity Evaluation index, or MOVIE index. It is found that the MOVIE index delivers VQA scores that correlate quite closely with human subjective judgment, using the Video Quality Expert Group (VQEG) FRTV Phase 1 database as a test bed. Indeed, the MOVIE index is found to be quite competitive with, and even outperform, algorithms developed and submitted to the VQEG FRTV Phase 1 study, as well as more recent VQA algorithms tested on this database.
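
The idea of evaluating quality along computed motion trajectories can be illustrated schematically: given motion vectors estimated on the reference, compare how each block changes along its trajectory in the reference versus the distorted video. The sketch below uses a simple mean-squared trajectory difference; it is only an illustration of the concept, not the Gabor-based MOVIE computation, and the function and array names are hypothetical.

```python
import numpy as np

def temporal_trajectory_error(ref_prev, ref_cur, dis_prev, dis_cur, mvs, block=8):
    """Mean squared error measured along block motion trajectories.

    ref_prev/ref_cur, dis_prev/dis_cur : consecutive grayscale frames (2-D arrays).
    mvs : (H//block, W//block, 2) integer motion vectors (dy, dx) estimated
          on the reference video.
    """
    H, W = ref_cur.shape
    errs = []
    for by in range(0, H - block + 1, block):
        for bx in range(0, W - block + 1, block):
            dy, dx = mvs[by // block, bx // block]
            py, px = by + dy, bx + dx
            if 0 <= py <= H - block and 0 <= px <= W - block:
                # Frame difference along the motion trajectory, for the
                # reference and the distorted video.
                ref_td = ref_cur[by:by+block, bx:bx+block] - ref_prev[py:py+block, px:px+block]
                dis_td = dis_cur[by:by+block, bx:bx+block] - dis_prev[py:py+block, px:px+block]
                errs.append(np.mean((ref_td - dis_td) ** 2))
    return float(np.mean(errs))
```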

729 citations


Journal ArticleDOI
TL;DR: The BLIINDS index (BLind Image Integrity Notator using DCT Statistics) is introduced which is a no-reference approach to image quality assessment that does not assume a specific type of distortion of the image and it requires only minimal training.
Abstract: The development of general-purpose no-reference approaches to image quality assessment still lags recent advances in full-reference methods. Additionally, most no-reference or blind approaches are distortion-specific, meaning they assess only a specific type of distortion assumed present in the test image (such as blockiness, blur, or ringing). This limits their application domain. Other approaches rely on training a machine learning algorithm. These methods, however, are only as effective as the features used to train their learning machines. Towards ameliorating this, we introduce the BLIINDS index (BLind Image Integrity Notator using DCT Statistics), which is a no-reference approach to image quality assessment that does not assume a specific type of distortion of the image. It predicts image quality by observing the statistics of local discrete cosine transform coefficients, and it requires only minimal training. The method is shown to correlate highly with human perception of quality.
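
A minimal illustration of extracting simple statistics of local DCT coefficients as features for a quality model follows. The particular statistics (kurtosis and a coefficient-of-variation measure of the non-DC coefficients), block size, and pooling rule are assumptions for illustration, not the published BLIINDS feature set.

```python
import numpy as np
from scipy.fft import dctn
from scipy.stats import kurtosis

def local_dct_statistics(image, block=17):
    """Block-wise statistics of non-DC DCT coefficients (illustrative)."""
    H, W = image.shape
    kurts, cvs = [], []
    for y in range(0, H - block + 1, block):
        for x in range(0, W - block + 1, block):
            coeffs = dctn(image[y:y+block, x:x+block], norm='ortho').ravel()[1:]  # drop DC
            kurts.append(kurtosis(coeffs))
            cvs.append(np.std(coeffs) / (np.abs(np.mean(coeffs)) + 1e-12))
    # Pool block statistics into a short feature vector, e.g. means and percentiles.
    kurts, cvs = np.array(kurts), np.array(cvs)
    return np.array([kurts.mean(), np.percentile(kurts, 10),
                     cvs.mean(), np.percentile(cvs, 90)])
```

The resulting feature vectors could then be mapped to quality scores by a simple regressor trained on subjective data.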

383 citations


Proceedings ArticleDOI
TL;DR: The recent MOtion-based Video Integrity Evaluation (MOVIE) index emerges as the leading objective VQA algorithm in this study, while the performance of the Video Quality Metric (VQM) and the Multi-Scale Structural SIMilarity (MS-SSIM) index is noteworthy.
Abstract: Automatic methods to evaluate the perceptual quality of a digital video sequence have widespread applications wherever the end-user is a human. Several objective video quality assessment (VQA) algorithms exist, whose performance is typically evaluated using the results of a subjective study performed by the video quality experts group (VQEG) in 2000. There is a great need for a free, publicly available subjective study of video quality that embodies the state-of-the-art in video processing technology and that is effective in challenging and benchmarking objective VQA algorithms. In this paper, we present a study and a resulting database, known as the LIVE Video Quality Database, where 150 distorted video sequences obtained from 10 different source videos were subjectively evaluated by 38 human observers. Our study includes videos that have been compressed by MPEG-2 and H.264, as well as videos obtained by simulated transmission of H.264 compressed streams through error-prone IP and wireless networks. The subjective evaluation was performed using a single stimulus paradigm with hidden reference removal, where the observers were asked to provide their opinion of video quality on a continuous scale. We also present the performance of several freely available objective, full reference (FR) VQA algorithms on the LIVE Video Quality Database. The recent MOtion-based Video Integrity Evaluation (MOVIE) index emerges as the leading objective VQA algorithm in our study, while the performance of the Video Quality Metric (VQM) and the Multi-Scale Structural SIMilarity (MS-SSIM) index is noteworthy. The LIVE Video Quality Database is freely available for download, and we hope that our study provides researchers with a valuable tool to benchmark and improve the performance of objective VQA algorithms.
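
Benchmarking objective VQA algorithms against DMOS is usually reported as rank and linear correlation after a monotonic nonlinear mapping. The sketch below follows that common procedure under the assumption of a 4-parameter logistic fit; the initialization and parameterization are generic choices, not necessarily those used in the paper.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import spearmanr

def logistic4(x, b1, b2, b3, b4):
    # Common monotonic mapping from objective scores to subjective scores.
    return (b1 - b2) / (1.0 + np.exp(-(x - b3) / np.abs(b4))) + b2

def benchmark(objective_scores, dmos):
    """Report SROCC, LCC (after logistic fitting), and RMSE against DMOS."""
    srocc = spearmanr(objective_scores, dmos).correlation
    p0 = [np.max(dmos), np.min(dmos), np.mean(objective_scores), 1.0]
    params, _ = curve_fit(logistic4, objective_scores, dmos, p0=p0, maxfev=20000)
    fitted = logistic4(np.asarray(objective_scores, float), *params)
    lcc = np.corrcoef(fitted, dmos)[0, 1]
    rmse = float(np.sqrt(np.mean((fitted - np.asarray(dmos, float)) ** 2)))
    return {"SROCC": float(srocc), "LCC": float(lcc), "RMSE": rmse}
```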

215 citations


Journal ArticleDOI
TL;DR: A novel anthropometric three dimensional (Anthroface 3D) face recognition algorithm, which is based on a systematically selected set of discriminatory structural characteristics of the human face derived from the existing scientific literature on facial anthropometry, is presented.
Abstract: We present a novel anthropometric three dimensional (Anthroface 3D) face recognition algorithm, which is based on a systematically selected set of discriminatory structural characteristics of the human face derived from the existing scientific literature on facial anthropometry. We propose a novel technique for automatically detecting 10 anthropometric facial fiducial points that are associated with these discriminatory anthropometric features. We isolate and employ unique textural and/or structural characteristics of these fiducial points, along with the established anthropometric facial proportions of the human face, for detecting them. Lastly, we develop a completely automatic face recognition algorithm that employs facial 3D Euclidean and geodesic distances between these 10 automatically located anthropometric facial fiducial points and a linear discriminant classifier. On a database of 1149 facial images of 118 subjects, we show that the standard deviation of the Euclidean distance of each automatically detected fiducial point from its manually identified position is less than 2.54 mm. We further show that the proposed Anthroface 3D recognition algorithm performs well (equal error rate of 1.98% and a rank 1 recognition rate of 96.8%), outperforms three of the existing benchmark 3D face recognition algorithms, and is robust to the observed fiducial point localization errors.
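
As a rough sketch of the recognition pipeline described (pairwise 3D distances between detected fiducial points feeding a linear discriminant classifier), the snippet below uses Euclidean distances only; geodesic distances and the fiducial detection step are omitted, and the helper names and data layout are hypothetical.

```python
import numpy as np
from itertools import combinations
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def pairwise_euclidean_features(fiducials):
    """fiducials: (10, 3) array of detected 3D fiducial point coordinates.
    Returns the 45 pairwise Euclidean distances as a feature vector."""
    return np.array([np.linalg.norm(fiducials[i] - fiducials[j])
                     for i, j in combinations(range(len(fiducials)), 2)])

def train_recognizer(train_fiducials, train_ids):
    """train_fiducials: list of (10, 3) arrays, one per scan;
    train_ids: corresponding subject labels (hypothetical data layout)."""
    X = np.vstack([pairwise_euclidean_features(f) for f in train_fiducials])
    clf = LinearDiscriminantAnalysis()
    clf.fit(X, train_ids)
    return clf
```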

184 citations


Journal ArticleDOI
TL;DR: This work considers a four-component image model that classifies image local regions according to edge and smoothness properties, and provides results that are highly consistent with human subjective judgment of the quality of blurred and noisy images while delivering better overall performance than (G-)SSIM and MS-SSIM on the LIVE Image Quality Assessment Database.
Abstract: The assessment of image quality is important in numerous image processing applications. Two prominent examples, the Structural SIMilarity (SSIM) index and the Multi-scale Structural SIMilarity (MS-SSIM) index, operate under the assumption that human visual perception is highly adapted for extracting structural information from a scene. Results in large human studies have shown that these quality indices perform very well relative to other methods. However, SSIM and other Image Quality Assessment (IQA) algorithms are less effective when used to rate blurred and noisy images. We address this defect by considering a four-component image model that classifies image local regions according to edge and smoothness properties. In our approach, SSIM scores are weighted by region type, leading to modified versions of (G-)SSIM and MS-(G-)SSIM, called four-component (G-)SSIM (4-(G-)SSIM) and four-component MS-(G-)SSIM (4-MS-(G-)SSIM). Our experimental results show that our new approach provides results that are highly consistent with human subjective judgment of the quality of blurred and noisy images, and also delivers better overall performance than (G-)SSIM and MS-(G-)SSIM on the LIVE Image Quality Assessment Database.
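
A minimal sketch of weighting an SSIM map by region type follows, using a simple gradient-magnitude rule to split pixels into four classes (edge, near-edge, textured, smooth). The thresholds, class definitions, and weights are illustrative assumptions rather than the published 4-SSIM parameters, and `ssim_map` is assumed to come from an existing SSIM implementation.

```python
import numpy as np
from scipy import ndimage

def four_component_pool(ssim_map, reference, weights=(0.5, 0.25, 0.15, 0.10)):
    """Pool an SSIM map with region-dependent weights (illustrative)."""
    gy, gx = np.gradient(reference.astype(float))
    gmag = np.hypot(gx, gy)
    t_hi, t_lo = np.percentile(gmag, 90), np.percentile(gmag, 50)

    edge = gmag >= t_hi
    near_edge = ndimage.binary_dilation(edge, iterations=2) & ~edge
    smooth = gmag <= t_lo
    texture = ~(edge | near_edge | smooth)

    score, wsum = 0.0, 0.0
    for region, w in zip((edge, near_edge, texture, smooth), weights):
        if region.any():
            score += w * ssim_map[region].mean()
            wsum += w
    return score / wsum
```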

151 citations


Proceedings ArticleDOI
23 May 2010
TL;DR: The Texas 3D Face Recognition Database contains 1149 pairs of high resolution, pose normalized, preprocessed, and perfectly aligned color and range images of 118 adult human subjects acquired using a stereo camera.
Abstract: We make the Texas 3D Face Recognition Database available to researchers in three dimensional (3D) face recognition and other related areas. This database contains 1149 pairs of high resolution, pose normalized, preprocessed, and perfectly aligned color and range images of 118 adult human subjects acquired using a stereo camera. The images are accompanied with information about the subjects' gender, ethnicity, facial expression, and the locations of 25 manually located anthropometric facial fiducial points. Specific partitions of the data for developing and evaluating 3D face recognition algorithms are also included.

148 citations


Journal ArticleDOI
TL;DR: To evaluate the performance of VQA algorithms for the specific task of H.264 advanced video coding compressed video transmission over wireless networks, a subjective study involving 160 distorted videos is conducted.
Abstract: Evaluating the perceptual quality of video is of tremendous importance in the design and optimization of wireless video processing and transmission systems. In an endeavor to emulate human perception of quality, various objective video quality assessment (VQA) algorithms have been developed. However, the only subjective video quality database that exists on which these algorithms can be tested is dated and does not accurately reflect distortions introduced by present generation encoders and/or wireless channels. In order to evaluate the performance of VQA algorithms for the specific task of H.264 advanced video coding compressed video transmission over wireless networks, we conducted a subjective study involving 160 distorted videos. Various leading full reference VQA algorithms were tested for their correlation with human perception. The data from the paper has been made available to the research community, so that further research on new VQA algorithms and on the general area of VQA may be carried out.

138 citations


Journal ArticleDOI
TL;DR: A new content-weighted method for full-reference (FR) video quality assessment using a three-component image model that classifies image local regions according to their image gradient properties and applies variable weights to structural similarity image index (SSIM) and peak signal-to-noise ratio (PSNR) scores.
Abstract: Objective image and video quality measures play important roles in numerous image and video processing applications. In this work, we propose a new content-weighted method for full-reference (FR) video quality assessment using a three-component image model. Using the idea that different image regions have different perceptual significance relative to quality, we deploy a model that classifies image local regions according to their image gradient properties, then apply variable weights to structural similarity image index (SSIM) (and peak signal-to-noise ratio (PSNR)) scores according to region. A frame-based video quality assessment algorithm is thereby derived. Experimental results on the Video Quality Experts Group (VQEG) FR-TV Phase 1 test dataset show that the proposed algorithm outperforms existing video quality assessment methods.
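
The content-weighting idea can be sketched for PSNR in a few lines: classify pixels into three gradient-based regions and average the per-region error with different weights before converting to PSNR. The region rule and weights below are placeholders for illustration, not the published method.

```python
import numpy as np

def weighted_psnr(ref_frame, dis_frame, weights=(0.6, 0.3, 0.1), peak=255.0):
    """Frame-level PSNR with gradient-based three-region weighting (illustrative)."""
    gy, gx = np.gradient(ref_frame.astype(float))
    gmag = np.hypot(gx, gy)
    t_hi, t_lo = np.percentile(gmag, 85), np.percentile(gmag, 40)
    regions = [gmag >= t_hi,                   # edge
               (gmag < t_hi) & (gmag > t_lo),  # texture
               gmag <= t_lo]                   # smooth
    err = (ref_frame.astype(float) - dis_frame.astype(float)) ** 2
    mse = sum(w * err[r].mean() for w, r in zip(weights, regions) if r.any())
    return 10.0 * np.log10(peak ** 2 / max(mse, 1e-12))
```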

66 citations


Journal ArticleDOI
TL;DR: An unequal power allocation scheme for transmission of JPEG compressed images over multiple-input multiple-output systems employing spatial multiplexing provides significant image quality improvement as compared to different equal power allocation schemes.
Abstract: With the introduction of multiple transmit and receive antennas in next generation wireless systems, real-time image and video communication are expected to become quite common, since very high data rates will become available along with improved data reliability. New joint transmission and coding schemes that explore advantages of multiple antenna systems matched with source statistics are expected to be developed. Based on this idea, we present an unequal power allocation scheme for transmission of JPEG compressed images over multiple-input multiple-output systems employing spatial multiplexing. The JPEG-compressed image is divided into different quality layers, and different layers are transmitted simultaneously from different transmit antennas using unequal transmit power, with a constraint on the total transmit power during any symbol period. Results show that our unequal power allocation scheme provides significant image quality improvement as compared to different equal power allocation schemes, with the peak-signal-to-noise-ratio gain as high as 14 dB at low signal-to-noise-ratios.
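
A toy sketch of unequal power allocation across quality layers under a total transmit power constraint: each layer receives power proportional to an importance weight. The proportional rule is an illustrative assumption; the paper's optimization is more involved.

```python
import numpy as np

def allocate_power(layer_importance, total_power):
    """Split a fixed transmit power budget across JPEG quality layers in
    proportion to their importance (illustrative, not the paper's optimizer).

    layer_importance : per-layer weights, e.g. distortion reduction per layer.
    total_power      : power budget per symbol period.
    """
    w = np.asarray(layer_importance, dtype=float)
    return total_power * w / w.sum()

# Example: three layers transmitted from three antennas, most important first.
print(allocate_power([0.6, 0.3, 0.1], total_power=1.0))  # -> [0.6 0.3 0.1]
```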

Proceedings ArticleDOI
14 Mar 2010
TL;DR: It is demonstrated that each distortion affects the statistics of natural images in a characteristic way and it is possible to parameterize this characteristic and build a classifier that can classify a given image into a particular distortion category solely on the basis of DIS, with high accuracy.
Abstract: Natural scene statistics (NSS) are an active area of research. Although there exist elegant models for NSS, the statistics of natural image distortions have received little attention. In this paper we study distorted image statistics (DIS) for natural scenes. We demonstrate that each distortion affects the statistics of natural images in a characteristic way and it is possible to parameterize this characteristic. We show that not only are DIS different for different distortions, but by such parametrization it is also possible to build a classifier that can classify a given image into a particular distortion category solely on the basis of DIS, with high accuracy. Applications of such categorization are of considerable scope and include DIS-based quality assessment and blind image distortion correction.
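
One common way to parameterize the statistics of bandpass or transform coefficients is to fit a generalized Gaussian density by moment matching; the sketch below estimates the GGD shape parameter with the standard ratio-of-moments lookup. Whether this matches the parameterization used in the paper is an assumption.

```python
import numpy as np
from scipy.special import gamma as gamma_fn

def fit_ggd(coeffs):
    """Estimate generalized Gaussian (shape, sigma) by moment matching.

    Uses the standard ratio E[x^2] / E[|x|]^2 and a lookup over candidate
    shape values (illustrative; several published variants exist)."""
    coeffs = np.asarray(coeffs, dtype=float)
    ratio_emp = np.mean(coeffs ** 2) / (np.mean(np.abs(coeffs)) ** 2 + 1e-12)
    shapes = np.arange(0.2, 10.0, 0.001)
    ratio_theory = gamma_fn(1.0 / shapes) * gamma_fn(3.0 / shapes) / gamma_fn(2.0 / shapes) ** 2
    shape = shapes[np.argmin(np.abs(ratio_theory - ratio_emp))]
    sigma = float(np.sqrt(np.mean(coeffs ** 2)))
    return float(shape), sigma
```

The fitted (shape, sigma) pairs per subband can then be collected into a feature vector and fed to a classifier such as an SVM to identify the distortion category.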

Journal ArticleDOI
TL;DR: A new video quality assessment (VQA) algorithm is proposed - the motion compensated structural similarity index - that assesses not only spatial quality but also quality along temporal trajectories and is computationally efficient as compared to other VQA algorithms.
Abstract: We propose a new video quality assessment (VQA) algorithm - the motion compensated structural similarity index - that assesses not only spatial quality but also quality along temporal trajectories. Drawing inspiration from the motion-compensated approach followed for video compression, we propose a motion-compensated approach to temporal quality assessment. The proposed algorithm is computationally efficient as compared to other VQA algorithms that utilize motion information from extracted optical flow and correlates well with human perception of quality. In order to exemplify the utility of the algorithm in a practical setting, we evaluate the quality of H.264/AVC compressed videos. Efficiency of computation is enabled by the novel motion-vector re-use concept.
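
One plausible way to sketch the motion-compensated temporal comparison is to evaluate a single-block SSIM between each block and its motion-compensated predecessor, separately for the reference and distorted videos, and score their agreement. This is a schematic reading under stated assumptions, not the published index; the motion vectors are assumed to be reused from the encoder, as the abstract describes.

```python
import numpy as np

def block_ssim(a, b, c1=(0.01 * 255) ** 2, c2=(0.03 * 255) ** 2):
    """SSIM computed over a single pair of equally sized grayscale blocks."""
    a, b = a.astype(float), b.astype(float)
    ma, mb = a.mean(), b.mean()
    cov = ((a - ma) * (b - mb)).mean()
    return ((2 * ma * mb + c1) * (2 * cov + c2)) / \
           ((ma ** 2 + mb ** 2 + c1) * (a.var() + b.var() + c2))

def mc_temporal_quality(ref_prev, ref_cur, dis_prev, dis_cur, mvs, block=16):
    """Compare how blocks evolve along motion trajectories in the reference
    and distorted videos, using (dy, dx) motion vectors reused from the encoder."""
    H, W = ref_cur.shape
    scores = []
    for by in range(0, H - block + 1, block):
        for bx in range(0, W - block + 1, block):
            dy, dx = mvs[by // block, bx // block]
            py, px = by + dy, bx + dx
            if 0 <= py <= H - block and 0 <= px <= W - block:
                s_ref = block_ssim(ref_cur[by:by+block, bx:bx+block],
                                   ref_prev[py:py+block, px:px+block])
                s_dis = block_ssim(dis_cur[by:by+block, bx:bx+block],
                                   dis_prev[py:py+block, px:px+block])
                # Temporal quality: agreement of the trajectory similarities.
                scores.append(1.0 - abs(s_ref - s_dis))
    return float(np.mean(scores))
```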

Journal ArticleDOI
TL;DR: A novel, model-based active contour algorithm, termed “snakules”, for the annotation of spicules on mammography, which deploys snakules that are converging open-ended active contours also known as snakes at each suspect spiculated mass location.
Abstract: We have developed a novel, model-based active contour algorithm, termed “snakules”, for the annotation of spicules on mammography. At each suspect spiculated mass location that has been identified by either a radiologist or a computer-aided detection (CADe) algorithm, we deploy snakules that are converging open-ended active contours also known as snakes. The set of convergent snakules have the ability to deform, grow and adapt to the true spicules in the image, by an attractive process of curve evolution and motion that optimizes the local matching energy. Starting from a natural set of automatically detected candidate points, snakules are deployed in the region around a suspect spiculated mass location. Statistics of prior physical measurements of spiculated masses on mammography are used in the process of detecting the set of candidate points. Observer studies with experienced radiologists to evaluate the performance of snakules demonstrate the potential of the algorithm as an image analysis technique to improve the specificity of CADe algorithms and as a CADe prompting tool.

Proceedings ArticleDOI
14 Mar 2010
TL;DR: This paper presents a design of real-time implementable full-reference image quality algorithms based on the SSIM index and multi-scale SSIM (MS-SSIM) index that are tested on the LIVE image quality database and shown to yield performance commensurate with SSIM and MS- SSIM but with much lower computational complexity.
Abstract: The development of real-time image quality assessment algorithms is an important direction on which little research has focused. This paper presents a design of real-time implementable full-reference image quality algorithms based on the SSIM index [2] and multi-scale SSIM (MS-SSIM) index [3]. The proposed algorithms, which modify SSIM/MS-SSIM to achieve speed, are tested on the LIVE image quality database [13] and shown to yield performance commensurate with SSIM and MS-SSIM but with much lower computational complexity.
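
The kind of speed-oriented modification described can be illustrated by replacing the usual Gaussian windows with box filters and operating on downsampled frames. This is a generic fast-SSIM sketch under those assumptions, not the specific design proposed in the paper.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def fast_ssim(ref, dis, win=8, c1=(0.01 * 255) ** 2, c2=(0.03 * 255) ** 2, downsample=2):
    """SSIM with box-filter windows and optional downsampling (illustrative)."""
    ref = ref.astype(float)[::downsample, ::downsample]
    dis = dis.astype(float)[::downsample, ::downsample]
    mu_r, mu_d = uniform_filter(ref, win), uniform_filter(dis, win)
    var_r = uniform_filter(ref * ref, win) - mu_r ** 2
    var_d = uniform_filter(dis * dis, win) - mu_d ** 2
    cov = uniform_filter(ref * dis, win) - mu_r * mu_d
    ssim_map = ((2 * mu_r * mu_d + c1) * (2 * cov + c2)) / \
               ((mu_r ** 2 + mu_d ** 2 + c1) * (var_r + var_d + c2))
    return float(ssim_map.mean())
```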

Journal ArticleDOI
TL;DR: Eye tracking experiments on naturalistic stereo images presented through a haploscope found that fixated luminance contrast and luminance gradient are generally higher than at randomly selected locations, which agrees with previous literature, but fixated disparity contrast and disparity gradient are generally lower.
Abstract: Analysis of the statistics of natural scene features at observers' fixations can help us understand the mechanism of fixation selection and visual attention of the human vision system. Previous studies revealed that several low-level luminance features at fixations are statistically different from those at randomly selected locations. In our study, we conducted eye tracking experiments on naturalistic stereo images presented through a haploscope and found that fixated luminance contrast and luminance gradient are generally higher than randomly selected luminance contrast and luminance gradient, which agrees with previous literature, but the fixated disparity contrast and disparity gradient are generally lower than randomly selected disparity contrast and disparity gradient. We discuss the relevance of our findings in the context of the complexity of disparity calculations and the metabolic needs of disparity processing.
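
A small sketch of the comparison described: compute local RMS luminance contrast and gradient magnitude in patches centered on fixated locations versus randomly selected locations. The patch size and the definitions of "contrast" and "gradient" here are illustrative assumptions.

```python
import numpy as np

def patch_stats(image, points, half=16):
    """Local RMS contrast and mean gradient magnitude around (row, col) points."""
    image = image.astype(float)
    gy, gx = np.gradient(image)
    gmag = np.hypot(gx, gy)
    contrasts, gradients = [], []
    for r, c in points:
        if r < half or c < half or r + half > image.shape[0] or c + half > image.shape[1]:
            continue  # skip points too close to the image border
        patch = image[r - half:r + half, c - half:c + half]
        contrasts.append(patch.std() / (patch.mean() + 1e-12))  # RMS contrast
        gradients.append(gmag[r - half:r + half, c - half:c + half].mean())
    return float(np.mean(contrasts)), float(np.mean(gradients))

# Hypothetical usage: compare statistics at measured fixations against an equal
# number of uniformly sampled random locations.
# fix_c, fix_g = patch_stats(luminance, fixations)
# rnd_c, rnd_g = patch_stats(luminance, random_points)
```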

Journal ArticleDOI
TL;DR: The PULP algorithm uses a perceptual weighting scheme to seek an efficient forward error correction assignment that balances efficiency and fairness by controlling the size of the identified salient region(s) relative to the channel state.
Abstract: We describe a method for achieving perceptually minimal video distortion over packet-erasure networks using perceptually unequal loss protection (PULP). There are two main ingredients in the algorithm. First, a perceptual weighting scheme is employed wherein the compressed video is weighted as a function of the nonuniform distribution of retinal photoreceptors. Second, packets are assigned temporal importance within each group of pictures (GOP), recognizing that the severity of error propagation increases with elapsed time within a GOP. Using both frame-level perceptual importance and GOP-level hierarchical importance, the PULP algorithm seeks an efficient forward error correction assignment that balances efficiency and fairness by controlling the size of identified salient region(s) relative to the channel state. PULP demonstrates robust performance and significantly improved subjective and objective visual quality in the face of burst packet losses.
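
A toy sketch of the allocation idea: distribute a fixed budget of forward error correction (parity) packets across slices in proportion to the product of a frame-level perceptual weight and a GOP-position weight. The weighting functions and the rounding rule are illustrative, not the PULP optimization.

```python
import numpy as np

def allocate_fec(perceptual_weight, gop_position_weight, parity_budget):
    """Assign parity packets per slice from a total budget (illustrative).

    perceptual_weight   : per-slice saliency-based importance.
    gop_position_weight : per-slice weight that decreases with position in the
                          GOP (earlier frames cause longer error propagation).
    parity_budget       : total number of parity packets available.
    """
    w = np.asarray(perceptual_weight, float) * np.asarray(gop_position_weight, float)
    shares = parity_budget * w / w.sum()
    parity = np.floor(shares).astype(int)
    # Hand out any leftover packets to the slices with the largest remainders.
    leftover = parity_budget - parity.sum()
    parity[np.argsort(shares - parity)[::-1][:leftover]] += 1
    return parity
```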

Proceedings ArticleDOI
TL;DR: This paper proposes a new VQA algorithm - the spatio-temporal video SSIM based on the essence of MOVIE, and explains the algorithm and demonstrates its conceptual similarity to MOVIE; it explores its computational complexity and evaluates its performance on the popular VQEG dataset.
Abstract: Recently, Seshadrinathan and Bovik proposed the Motion-based Video Integrity Evaluation (MOVIE) index for VQA [1, 2]. MOVIE utilized a multi-scale spatio-temporal Gabor filter bank to decompose the videos and to compute motion vectors. Apart from its psychovisual inspiration, MOVIE is an interesting option for VQA owing to its performance. However, the use of MOVIE in a practical setting may prove to be difficult owing to the presence of the multi-scale optical flow computation. In order to bridge the gap between the conceptual elegance of MOVIE and a practical VQA algorithm, we propose a new VQA algorithm - the spatio-temporal video SSIM based on the essence of MOVIE. Spatio-temporal video SSIM utilizes motion information computed from a block-based motion-estimation algorithm and quality measures using a localized set of oriented spatio-temporal filters. In this paper we explain the algorithm and demonstrate its conceptual similarity to MOVIE; we explore its computational complexity and evaluate its performance on the popular VQEG dataset. We show that the proposed algorithm allows for efficient FR VQA without compromising on the performance while retaining the conceptual elegance of MOVIE.

Proceedings ArticleDOI
23 May 2010
TL;DR: In this paper, an algorithm is presented to automatically detect near-surface ice layers in images from the Shallow Subsurface Radar (SHARAD) on NASA's Mars Reconnaissance Orbiter.
Abstract: An algorithm is presented to automatically detect near-surface ice layers in images from the Shallow Subsurface Radar (SHARAD) on NASA's Mars Reconnaissance Orbiter. Mars' ice-rich Northern Polar Layered Deposits (NPLD) represent an extensive geologic record of climate history. Identifying ice layers in cross-sectional images leads to understanding the three-dimensional structure of ice layers. Scientists have manually identified layers in large data volumes, but the automated algorithm will allow studying more images from over a thousand orbital crossings. A unique coordinate transformation, based upon the surface reflection, makes subsequent filtering and detection more effective on near-surface layers. Results show promising capabilities for automatically detecting ice layers on Mars.
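
A minimal sketch of a surface-based coordinate transformation of the kind described: detect the strongest (surface) return in each radargram column and shift that column so the surface sits on row zero, so that subsequent layer filtering and detection operate on depth below the surface. The per-column argmax detection rule is an assumption for illustration.

```python
import numpy as np

def flatten_to_surface(radargram, depth_rows=200):
    """Re-sample each column of a radargram so the surface echo is row 0.

    radargram  : 2-D array (fast-time rows x along-track columns) of echo power.
    depth_rows : number of rows to keep below the detected surface.
    """
    rows, cols = radargram.shape
    surface = radargram.argmax(axis=0)          # strongest return per column
    out = np.zeros((depth_rows, cols), dtype=radargram.dtype)
    for c in range(cols):
        top = surface[c]
        n = min(depth_rows, rows - top)
        out[:n, c] = radargram[top:top + n, c]  # column aligned to the surface
    return out
```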

Proceedings ArticleDOI
03 Dec 2010
TL;DR: A novel two stage framework for distortion-independent blind image quality assessment based on natural scene statistics (NSS) is proposed, which can be extended beyond the distortion-pool considered here, and each module proposed can be replaced by better-performing ones in the future.
Abstract: Most present day no-reference/blind image quality assessment (NR IQA) algorithms are distortion specific - i.e., they assume that the distortion affecting the image is known. Here we propose a novel two stage framework for distortion-independent blind image quality assessment based on natural scene statistics (NSS). The proposed framework is modular in that it can be extended beyond the distortion-pool considered here, and each module proposed can be replaced by better-performing ones in the future. We describe a 4-distortion demonstration of the proposed framework and show that it performs competitively with the full-reference peak-signal-to-noise-ratio on the LIVE IQA database. A software release of the proposed index has been made available online: http://live.ece.utexas.edu/research/quality/BIQI_4D_release.zip.

Proceedings ArticleDOI
11 Nov 2010
TL;DR: Algorithms that seek to assess the similarity of 3D faces, such that similar and dissimilar faces may be classified with high correlation relative to human perception of facial similarity are developed.
Abstract: We develop algorithms that seek to assess the similarity of 3D faces, such that similar and dissimilar faces may be classified with high correlation relative to human perception of facial similarity. To obtain human facial similarity ratings, we conduct a subjective study, where a set of human subjects rate the similarity of pairs of faces. Such similarity scores are obtained from 12 subjects on 180 3D faces, with a total of 5490 pairs of similarity scores. We then extract Gabor features from automatically detected fiducial points on the range and texture images from the 3D face and demonstrate that these features correlate well with human judgements of similarity. Finally, we demonstrate the application of using such facial similarity ratings for scalable face recognition.

Proceedings ArticleDOI
03 Dec 2010
TL;DR: The various motion situations are categorized and appropriate perceptual models are deployed to each category, creating a new approach to objective video quality assessment.
Abstract: Emerging multimedia applications have increased the need for video quality measurement. Motion is critical to this task, but is complicated owing to a variety of object movements and movement of the camera. Here, we categorize the various motion situations and deploy appropriate perceptual models to each category. We use these models to create a new approach to objective video quality assessment. Performance evaluation on the Laboratory for Image and Video Engineering (LIVE) Video Quality Database shows competitive performance compared to the leading contemporary VQA algorithms.

Proceedings ArticleDOI
TL;DR: This work uses the recently introduced Maximum Likelihood Difference Scaling (MLDS) method to quantify suprathreshold perceptual differences between pairs of images and examines how perceived image quality estimated through MLDS changes as the compression rate is increased.
Abstract: A crucial step in image compression is the evaluation of its performance, and more precisely the available way to measure the final quality of the compressed image. Usually, performance is measured by computing some measure of the covariation between subjective ratings of image quality and the degree of compression. Nevertheless, local variations are not well taken into account. We use the recently introduced Maximum Likelihood Difference Scaling (MLDS) method to quantify suprathreshold perceptual differences between pairs of images and examine how perceived image quality estimated through MLDS changes as the compression rate is increased. This approach circumvents the limitations inherent to subjective rating methods.
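
For readers unfamiliar with MLDS, here is a compact sketch of the standard difference-scaling observer model: on each trial the subject sees two pairs of images drawn from an ordered series and judges which pair differs more, and perceptual scale values are fit by maximizing a probit likelihood. This generic formulation and parameterization (scale anchored at zero, unit decision noise) are assumptions; the code is not from the paper.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def fit_mlds(quads, responses, n_levels):
    """Fit perceptual scale values psi_0..psi_{n-1}, with psi_0 fixed at 0.

    quads     : (T, 4) array of stimulus indices (a, b, c, d) per trial,
                with a < b and c < d on the physical scale.
    responses : length-T array, 1 if pair (c, d) was judged more different.
    """
    quads = np.asarray(quads)
    responses = np.asarray(responses, float)

    def neg_log_lik(params):
        psi = np.concatenate(([0.0], params))          # anchor the scale at 0
        a, b, c, d = quads.T
        delta = (psi[d] - psi[c]) - (psi[b] - psi[a])  # decision variable
        p = np.clip(norm.cdf(delta), 1e-9, 1 - 1e-9)   # probit link, unit noise
        return -np.sum(responses * np.log(p) + (1 - responses) * np.log(1 - p))

    x0 = np.linspace(0.1, 1.0, n_levels - 1)
    fitted = minimize(neg_log_lik, x0, method="L-BFGS-B").x
    return np.concatenate(([0.0], fitted))
```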

Proceedings ArticleDOI
22 Mar 2010
TL;DR: Eye tracking experiments on naturalistic stereo images presented through a haploscope found that fixated luminance contrast and luminance gradient were generally higher than randomly selected luminance contrast and luminance gradient, which agrees with previous literature.
Abstract: We conducted eye tracking experiments on naturalistic stereo images presented through a haploscope, and found that fixated luminance contrast and luminance gradient were generally higher than randomly selected luminance contrast and luminance gradient, which agrees with previous literature. However, we also found that the fixated disparity contrast and disparity gradient were generally lower than randomly selected disparity contrast and disparity gradient. We discuss the implications of this remarkable result.

Proceedings ArticleDOI
23 May 2010
TL;DR: The results from the initial classification experiment demonstrate the strong potential of snakules as an image analysis technique to extract features specific to spicules and spiculated masses, which can subsequently be used to distinguish true spiculated mass locations from non-lesion locations on a mammogram and improve the specificity of computer-aided detection (CADe) algorithms.
Abstract: In this paper, we describe a novel approach for the automatic classification of candidate spiculated mass locations on mammography. Our approach is based on “Snakules” — an evidence-based active contour algorithm that we have recently developed for the annotation of spicules on mammography. We use snakules to extract features characteristic of spicules and spiculated masses, and use these features to classify whether a region of a mammogram contains a spiculated mass or not. The results from our initial classification experiment demonstrate the strong potential of snakules as an image analysis technique to extract features specific to spicules and spiculated masses, which can subsequently be used to distinguish true spiculated mass locations from non-lesion locations on a mammogram and improve the specificity of computer-aided detection (CADe) algorithms.

Proceedings ArticleDOI
03 Dec 2010
TL;DR: A discrete cosine transform (DCT) statistics-based support vector machine (SVM) approach that uses only 3 features in the DCT domain is proposed and shown to correlate highly with human visual perception of quality.
Abstract: General-purpose no-reference image quality assessment approaches still lag the advances in full-reference methods. Most no-reference methods are either distortion specific (i.e. they quantify one or more distortions such as blur, blockiness, or ringing), or they train a learning machine based on a large number of features. Here, we propose a discrete cosine transform (DCT) statistics-based support vector machine (SVM) approach that uses only 3 features in the DCT domain. The approach extracts a very small number of features and is entirely in the DCT domain, making it computationally convenient. The results are shown to correlate highly with human visual perception of quality.

Journal ArticleDOI
TL;DR: A quantitative accuracy evaluation wherein the proposed method outperforms a microcanonical annealing approach by Barnard and a cooperative approach by Zitnick and Kanade, while using fewer match quality evaluations than either.
Abstract: We present an efficient method that computes dense stereo correspondences by stochastically sampling match quality values. Nonexhaustive sampling facilitates the use of quality metrics that take unique values at noninteger disparities. Depth estimates are iteratively refined with a stochastic cooperative search by perturbing the estimates, sampling match quality, and reweighting and aggregating the perturbations. The approach gains significant efficiencies when applied to video, where initial estimates are seeded using information from the previous pair in a novel application of the Z-buffering algorithm. This significantly reduces the number of search iterations required. We present a quantitative accuracy evaluation wherein the proposed method outperforms a microcanonical annealing approach by Barnard and a cooperative approach by Zitnick and Kanade, while using fewer match quality evaluations than either. The approach is shown to have more attractive memory usage and scaling than alternatives based on exhaustive sampling.
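
A highly simplified sketch of the perturb-sample-update idea: each pixel's disparity estimate is randomly perturbed, the match cost of the perturbation is sampled, and the estimate keeps perturbations that improve the match. This toy SAD-based refinement (with integer sampling of the right image, for brevity) only illustrates the general stochastic-sampling strategy, not the authors' cooperative algorithm.

```python
import numpy as np

def refine_disparity(left, right, disp, iters=20, sigma=1.0, half=3):
    """Stochastically refine a disparity map by sampling match quality (toy)."""
    H, W = left.shape
    left, right = left.astype(float), right.astype(float)

    def cost(r, c, d):
        c2 = int(round(c - d))
        if half <= c2 < W - half:
            lp = left[r - half:r + half + 1, c - half:c + half + 1]
            rp = right[r - half:r + half + 1, c2 - half:c2 + half + 1]
            return np.abs(lp - rp).mean()  # SAD match cost
        return np.inf

    disp = disp.astype(float).copy()
    for _ in range(iters):
        for r in range(half, H - half):
            for c in range(half, W - half):
                cand = disp[r, c] + np.random.normal(0.0, sigma)
                if cost(r, c, cand) < cost(r, c, disp[r, c]):
                    disp[r, c] = cand  # keep perturbations that improve the match
    return disp
```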

Proceedings ArticleDOI
03 Dec 2010
TL;DR: This work extends previous work on MICA by introducing a Fast-MICA algorithm that demonstrates the same improvement over classical ICA as the original MICA algorithm while improving the computational speed by two polynomial orders of magnitude.
Abstract: We extend our previous work on Multilinear Independent Component Analysis (MICA) by introducing a Fast-MICA algorithm that demonstrates the same improvement over classical ICA as the original MICA algorithm [1] while improving the computational speed by two polynomial orders of magnitude. Apart from enabling a faster determination of the multilinear structure of image patch probability density, this new approach opens up, for the first time, the possibility of computing a novel non-stationarity index based on the relative change in mutual information. We demonstrate the performance of our Fast-MICA algorithm together with an illustration of our novel non-stationarity index.

Proceedings ArticleDOI
23 May 2010
TL;DR: This work explores the use of Distributed Ray Tracing (DRT), an anti-aliasing technique from computer graphics, in multi-view computational stereo and finds it improves ABM accuracy by 18% and can be generalized to improve other stereo algorithms.
Abstract: We explore the use of Distributed Ray Tracing (DRT), an anti-aliasing technique from computer graphics, in multi-view computational stereo. As an example, we study ABM, a multi-view stereo algorithm based on a set of Hough transform accumulation operations. Augmenting ABM with DRT improves both internal signal quality and reconstruction accuracy. Results are given for both fundamental and complex “super-resolution reconstruction” tasks, where the voxel side length is less than the image ground sample distance. DRT improves ABM accuracy by 18% and can be generalized to improve other stereo algorithms.

Journal ArticleDOI
TL;DR: Cats and owls have a much larger shear than humans, which matches their typical height and fixation distance, and it has been proposed that this shear has ecological value, bringing the vertical horopter into the ground plane to aid in navigating the world.
Abstract: The theoretical vertical horopter is a line passing through the fixation point and perpendicular to the horizontal plane, when the fixation is symmetric and on the horizontal median plane. However, the empirical vertical horopter measured psychophysically deviates from the true vertical, as its top inclines backward by an angle. Thus the two corresponding retinal images of the empirical vertical horopter also deviate from the theoretical corresponding vertical meridians of the two eyes. The average angle between the two empirical vertical meridians is 2 deg, which is called the Helmholtz shear of empirical vertical meridians. It has been proposed that this shear has ecological value, bringing the vertical horopter into the ground plane to aid in navigating the world. Further evidence was found in cats and owls: they have a much larger shear than humans, which matches their typical height and fixation distance.