Author

Alan C. Bovik

Bio: Alan C. Bovik is an academic researcher at the University of Texas at Austin. The author has contributed to research on the topics of image quality and video quality. The author has an h-index of 102 and has co-authored 837 publications receiving 96,088 citations. Previous affiliations of Alan C. Bovik include the University of Illinois at Urbana–Champaign and the University of Sydney.


Papers
Proceedings Article
29 Apr 2005
TL;DR: A key aspect of this work is that each parameter of the filter has been incorporated to capture the variation in physical characteristics of spiculated masses and architectural distortions and that the parameters of the stage-one detection algorithm are determined by the physical measurements.
Abstract: Mass detection algorithms generally consist of two stages. The aim of the first stage is to detect all potential masses. In the second stage, the aim is to reduce the false-positives by classifying the detected objects as masses or normal tissue. In this paper, we present a new evidence based, stage-one algorithm for the detection of spiculated masses and architectural distortions. By evidence based, we mean that we use the statistics of the physical characteristics of these abnormalities to determine the parameters of the detection algorithm. Our stage-one algorithm consists of two steps, an enhancement step followed by a filtering step. In the first step, we propose a new technique for the enhancement of spiculations in which a linear filter is applied to the Radon transform of the image. In the second step, we filter the enhanced images with a new class of linear image filters called Radial Spiculation Filters. We have invented these filters specifically for detecting spiculated masses and architectural distortions that are marked by converging lines or spiculations. These filters are highly specific narrowband filters, which are designed to match the expected structures of these abnormalities and form a new class of wavelet-type filterbanks derived from optimal theories of filtering. A key aspect of this work is that each parameter of the filter has been incorporated to capture the variation in physical characteristics of spiculated masses and architectural distortions and that the parameters of the stage-one detection algorithm are determined by the physical measurements.

61 citations

Proceedings Article
01 Dec 2002
TL;DR: This paper presents an algorithm for blindly determining the quality of JPEG2000 compressed images, trained and tested on data obtained from human observers, and performs close to the limit on useful prediction imposed by the variability between human subjects.
Abstract: Measurement of image quality is crucial for many image-processing algorithms, such as acquisition, compression, restoration, enhancement and reproduction. Traditionally, image quality assessment algorithms have focused on measuring image fidelity, where quality is measured as fidelity with respect to a 'reference' or 'perfect' image. The field of blind quality assessment has been largely unexplored. In this paper we present an algorithm for blindly determining the quality of JPEG2000 compressed images. Our algorithm assigns quality scores that are in good agreement with human evaluations. Our algorithm utilizes a statistical model for wavelet coefficients and computes features that exploit the fact that quantization produces more zero coefficients than expected for natural images. The algorithm is trained and tested on data obtained from human observers, and performs close to the limit on useful prediction imposed by the variability between human subjects.

61 citations
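
The zero-coefficient observation in the JPEG2000 abstract above can be illustrated with a minimal sketch. It is not the paper's statistical model: the one-level Haar transform, the dead-zone quantizer, the threshold, the test signal, and all function names are assumptions chosen for illustration. The sketch only shows that quantizing wavelet detail coefficients raises the fraction of near-zero coefficients, which is the effect the described features exploit.

```python
import math

def haar_details(signal):
    """One-level Haar detail coefficients of an even-length 1-D signal."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    return [(signal[i] - signal[i + 1]) / math.sqrt(2)
            for i in range(0, len(signal), 2)]

def dead_zone_quantize(coeffs, step):
    """Uniform dead-zone quantizer: magnitudes below `step` collapse to zero."""
    return [math.copysign(step * math.floor(abs(c) / step), c) for c in coeffs]

def near_zero_fraction(coeffs, threshold=0.5):
    """Fraction of coefficients whose magnitude is below `threshold`."""
    return sum(abs(c) < threshold for c in coeffs) / len(coeffs)

# Hypothetical test signal standing in for one image row.
row = [40 * math.sin(i / 3) + (i % 5) for i in range(64)]
details = haar_details(row)
quantized = dead_zone_quantize(details, step=8.0)

feature_original = near_zero_fraction(details)
feature_quantized = near_zero_fraction(quantized)
```

A blind predictor along these lines would map such features to quality scores using human opinion data; that training step is outside the scope of this sketch.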

Journal Article
15 Jul 2005-Spine
TL;DR: In this paper, a measurement technique to assess dynamic motion of the lumbar spine using enhanced digital fluoroscopic video (DFV) and a distortion compensated roentgen analysis (DCRA) was developed.
Abstract: Study Design. Methodological reliability. Objective. Develop a measurement technique to assess dynamic motion of the lumbar spine using enhanced digital fluoroscopic video (DFV) and a distortion compensated roentgen analysis (DCRA). Summary of Background Data. Controversy over both the definition and consequences of lumbar segmental instability persists. Information from static imaging has had limited success in providing an understanding of this disorder. DFV has the potential to provide further information about lumbar segmental instability; however, the image quality is poor and clinical application is limited. Methods. DFVs from 20 male subjects (11 with and 9 without low back pain) were obtained during eccentric lumbar flexion (30 Hz). Each DFV was enhanced with a series of filters to accentuate the vertebral edges. An adapted DCRA algorithm was applied to determine segmental angular and linear displacement. Both intraimage and interimage reliability were assessed using intraclass correlation coefficients (ICC) and standard error of the measurement (SEM). Results. Intraimage reliability yielded an average ICC of 0.986, and the SEM ranged from 0.4‐0.7° and 0.2‐0.3 mm. Interimage reliability yielded an average ICC of 0.878, and the SEM ranged from 0.7‐1.4° and 0.4‐0.7 mm. Conclusions. Enhanced DFV combined with a DCRA resulted in reliable assessment of lumbar spine kinematics. The error values associated with this technique were low and were comparable to published error measurements obtained when using a similar algorithm on hand-drawn outlines from static radiographs.

61 citations
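
The relationship between the two reliability statistics reported in the lumbar spine abstract above can be sketched with the commonly used formula SEM = SD * sqrt(1 - ICC). The abstract does not say which SEM formula the authors used, so this formula is an assumption, as are the function name and the 3.0 degree standard deviation. Only the two average ICC values come from the abstract.

```python
import math

def standard_error_of_measurement(sd, icc):
    """Common reliability formula: SEM = SD * sqrt(1 - ICC), for ICC in [0, 1]."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie in [0, 1]")
    return sd * math.sqrt(1.0 - icc)

# Hypothetical between-subject SD of 3.0 degrees, combined with the
# average ICCs reported in the abstract (0.986 and 0.878).
sem_intraimage = standard_error_of_measurement(3.0, 0.986)
sem_interimage = standard_error_of_measurement(3.0, 0.878)
```

For a fixed standard deviation, a lower ICC gives a larger SEM, which matches the direction of the intraimage and interimage results in the abstract.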

Journal Article
TL;DR: An unequal power allocation scheme for transmission of JPEG compressed images over multiple-input multiple-output systems employing spatial multiplexing provides significant image quality improvement as compared to different equal power allocation schemes.
Abstract: With the introduction of multiple transmit and receive antennas in next generation wireless systems, real-time image and video communication are expected to become quite common, since very high data rates will become available along with improved data reliability. New joint transmission and coding schemes that explore advantages of multiple antenna systems matched with source statistics are expected to be developed. Based on this idea, we present an unequal power allocation scheme for transmission of JPEG compressed images over multiple-input multiple-output systems employing spatial multiplexing. The JPEG-compressed image is divided into different quality layers, and different layers are transmitted simultaneously from different transmit antennas using unequal transmit power, with a constraint on the total transmit power during any symbol period. Results show that our unequal power allocation scheme provides significant image quality improvement as compared to different equal power allocation schemes, with the peak-signal-to-noise-ratio gain as high as 14 dB at low signal-to-noise-ratios.

60 citations
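
The total-power constraint in the MIMO abstract above can be illustrated with a minimal sketch. The abstract does not state how the per-antenna powers are chosen, so the proportional split below is a stand-in rather than the authors' scheme; the function name and the layer weights are hypothetical.

```python
def allocate_power(layer_weights, total_power):
    """Split `total_power` across antennas in proportion to layer weights."""
    if total_power <= 0:
        raise ValueError("total_power must be positive")
    if any(w < 0 for w in layer_weights):
        raise ValueError("layer weights must be non-negative")
    weight_sum = sum(layer_weights)
    if weight_sum == 0:
        raise ValueError("at least one weight must be positive")
    return [total_power * w / weight_sum for w in layer_weights]

# Hypothetical importance weights for four quality layers, most important first.
unequal = allocate_power([4, 2, 1, 1], total_power=1.0)
# Equal allocation is the special case of identical weights.
equal = allocate_power([1, 1, 1, 1], total_power=1.0)
```

Both allocations use the same total power per symbol period; the unequal one simply moves power toward the layer that matters most for image quality.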

Posted Content
TL;DR: The largest (by far) subjective video quality dataset is created, containing 38,811 real-world distorted videos, 116,433 space-time localized video patches ('v-patches'), and 5.5M human perceptual quality annotations, which are used to build two unique NR-VQA models.
Abstract: No-reference (NR) perceptual video quality assessment (VQA) is a complex, unsolved, and important problem to social and streaming media applications. Efficient and accurate video quality predictors are needed to monitor and guide the processing of billions of shared, often imperfect, user-generated content (UGC). Unfortunately, current NR models are limited in their prediction capabilities on real-world, "in-the-wild" UGC video data. To advance progress on this problem, we created the largest (by far) subjective video quality dataset, containing 39,000 real-world distorted videos and 117,000 space-time localized video patches ('v-patches'), and 5.5M human perceptual quality annotations. Using this, we created two unique NR-VQA models: (a) a local-to-global region-based NR VQA architecture (called PVQ) that learns to predict global video quality and achieves state-of-the-art performance on 3 UGC datasets, and (b) a first-of-a-kind space-time video quality mapping engine (called PVQ Mapper) that helps localize and visualize perceptual distortions in space and time. We will make the new database and prediction models available immediately following the review process.

60 citations


Cited by
Journal Article
TL;DR: A structural similarity index is proposed for image quality assessment based on the degradation of structural information, and it is compared with both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000.
Abstract: Objective methods for assessing perceptual image quality traditionally attempted to quantify the visibility of errors (differences) between a distorted image and a reference image using a variety of known properties of the human visual system. Under the assumption that human visual perception is highly adapted for extracting structural information from a scene, we introduce an alternative complementary framework for quality assessment based on the degradation of structural information. As a specific example of this concept, we develop a structural similarity index and demonstrate its promise through a set of intuitive examples, as well as comparison to both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000. A MATLAB implementation of the proposed algorithm is available online at http://www.cns.nyu.edu/~lcv/ssim/.

40,609 citations
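
The structural similarity index named in the abstract above can be sketched in a simplified form. The abstract gives no formula, so this sketch assumes the commonly cited form that combines means, variances, and covariance with two small stabilizing constants. It also computes the index once over the whole input instead of over local windows, and uses population statistics, both of which are simplifications made here. The function name and the default constants (k1 = 0.01, k2 = 0.03, dynamic range 255) are assumptions of this sketch.

```python
from statistics import fmean

def global_ssim(x, y, dynamic_range=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM over two equal-length sequences of pixel values.

    SSIM = ((2*mu_x*mu_y + c1) * (2*cov_xy + c2))
           / ((mu_x^2 + mu_y^2 + c1) * (var_x + var_y + c2))
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    mu_x, mu_y = fmean(x), fmean(y)
    var_x = fmean((a - mu_x) ** 2 for a in x)
    var_y = fmean((b - mu_y) ** 2 for b in y)
    cov_xy = fmean((a - mu_x) * (b - mu_y) for a, b in zip(x, y))
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

# Hypothetical flattened pixel values for a reference and an inverted copy.
reference = [10.0, 50.0, 90.0, 130.0, 170.0, 210.0]
inverted = [255.0 - v for v in reference]
score_same = global_ssim(reference, reference)
score_inverted = global_ssim(reference, inverted)
```

Comparing an image with itself gives a score of 1 up to rounding, while the inverted copy, whose structure is anti-correlated with the reference, gives a negative score.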

Book
01 Jan 1998
TL;DR: An introduction to a Transient World and an Approximation Tour of Wavelet Packet and Local Cosine Bases.
Abstract: Introduction to a Transient World. Fourier Kingdom. Discrete Revolution. Time Meets Frequency. Frames. Wavelet Zoom. Wavelet Bases. Wavelet Packet and Local Cosine Bases. An Approximation Tour. Estimations are Approximations. Transform Coding. Appendix A: Mathematical Complements. Appendix B: Software Toolboxes.

17,693 citations

Proceedings Article
21 Jul 2017
TL;DR: Conditional adversarial networks are investigated as a general-purpose solution to image-to-image translation problems and it is demonstrated that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks.
Abstract: We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Moreover, since the release of the pix2pix software associated with this paper, hundreds of Twitter users have posted their own artistic experiments using our system. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.

11,958 citations

Posted Content
TL;DR: Conditional adversarial networks are investigated as a general-purpose solution to image-to-image translation problems; the approach is shown to be effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks.
Abstract: We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Indeed, since the release of the pix2pix software associated with this paper, a large number of internet users (many of them artists) have posted their own experiments with our system, further demonstrating its wide applicability and ease of adoption without the need for parameter tweaking. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.

11,127 citations

Journal Article
01 Apr 1988-Nature
TL;DR: In this paper, a sedimentological core and petrographic characterisation of samples from eleven boreholes from the Lower Carboniferous of Bowland Basin (Northwest England) is presented.
Abstract: Deposits of clastic carbonate-dominated (calciclastic) sedimentary slope systems in the rock record have been identified mostly as linearly-consistent carbonate apron deposits, even though most ancient clastic carbonate slope deposits fit the submarine fan systems better. Calciclastic submarine fans are consequently rarely described and are poorly understood. As a result, very little is known, especially about mud-dominated calciclastic submarine fan systems. Presented in this study is a sedimentological core and petrographic characterisation of samples from eleven boreholes from the Lower Carboniferous of Bowland Basin (Northwest England) that reveals a >250 m thick calciturbidite complex deposited in a calciclastic submarine fan setting. Seven facies are recognised from core and thin section characterisation and are grouped into three carbonate turbidite sequences. They include: 1) Calciturbidites, consisting mostly of high- to low-density, wavy-laminated bioclast-rich facies; 2) low-density densite mudstones which are characterised by planar laminated and unlaminated mud-dominated facies; and 3) Calcidebrites which are muddy or hyper-concentrated debris-flow deposits occurring as poorly-sorted, chaotic, mud-supported floatstones. These

9,929 citations