Author

Alan C. Bovik

Bio: Alan C. Bovik is an academic researcher from the University of Texas at Austin. The author has contributed to research in topics including image quality and video quality, has an h-index of 102, and has co-authored 837 publications receiving 96,088 citations. Previous affiliations of Alan C. Bovik include the University of Illinois at Urbana–Champaign and the University of Sydney.


Papers
Proceedings ArticleDOI
01 Feb 1991
TL;DR: In this article, the use of chromatic photometric constraints for solving the dense stereo correspondence problem is investigated, and a theoretical construction for developing dense stereo correspondence algorithms that use chromatic information is proposed.
Abstract: We investigate the use of chromatic information in dense stereo correspondence. Specifically, the chromatic photometric constraint, which is used to specify a mathematical optimality criterion for solving the dense stereo correspondence problem, is developed. The result is a theoretical construction for developing dense stereo correspondence algorithms which use chromatic information. The efficacy of using chromatic information via this construction is tested by implementing single- and multi-resolution versions of a stereo correspondence algorithm which uses simulated annealing as a means of solving the optimization problem. Results demonstrate that the use of chromatic information can significantly improve the performance of dense stereo correspondence.
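As a rough, hedged sketch of the general approach described above (not the authors' implementation): a dense disparity field is optimized by simulated annealing under a cost that combines a chromatic (full-RGB) photometric term with a smoothness term. The cost weight, cooling schedule, and disparity range below are illustrative assumptions.

```python
# Minimal sketch: dense stereo correspondence with a chromatic photometric
# cost, optimized by simulated annealing. Illustrative only; a real
# implementation would update the energy incrementally per move and wrap
# this in a coarse-to-fine (multi-resolution) loop.
import numpy as np

def chromatic_cost(left, right, disp, lam=0.1):
    """Data term: squared RGB difference between matched pixels; smoothness
    term: absolute disparity difference between horizontal neighbours."""
    h, w, _ = left.shape
    cols = np.arange(w)
    data = 0.0
    for y in range(h):
        matched = np.clip(cols - disp[y], 0, w - 1)   # left-to-right match
        data += np.sum((left[y, cols] - right[y, matched]) ** 2)
    smooth = np.sum(np.abs(np.diff(disp, axis=1)))
    return data + lam * smooth

def anneal_disparity(left, right, max_disp=16, iters=20000, t0=10.0):
    """Simulated annealing over a dense integer disparity field."""
    left = left.astype(np.float64)
    right = right.astype(np.float64)
    h, w, _ = left.shape
    rng = np.random.default_rng(0)
    disp = rng.integers(0, max_disp + 1, size=(h, w))
    energy = chromatic_cost(left, right, disp)
    for k in range(iters):
        t = t0 * (1.0 - k / iters) + 1e-6             # linear cooling schedule
        y, x = rng.integers(h), rng.integers(w)       # pick a random pixel
        old = disp[y, x]
        disp[y, x] = rng.integers(0, max_disp + 1)    # propose a new disparity
        new_energy = chromatic_cost(left, right, disp)
        accept = (new_energy <= energy or
                  rng.random() < np.exp((energy - new_energy) / t))
        if accept:
            energy = new_energy
        else:
            disp[y, x] = old                          # reject the move
    return disp
```

The point of the chromatic term is simply that matching full RGB vectors, rather than intensities alone, gives the optimality criterion more information with which to disambiguate correspondences.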

1 citation

01 Jan 1992
TL;DR: This work provides support for the hypothesis that the computation of LSF information is a general-purpose stage in low-level vision by finding LSF-based solutions to three different vision problems.
Abstract: The use of local spatial frequency (LSF) methods for analysis of texture images is explored. Analyses and solutions of three specific problems are presented: (1) the measurement of the fractal dimension of images locally and accurately; (2) the measurement of the three-dimensional orientation of planar textured surfaces; (3) the computation of the three-dimensional shape of curved textured surfaces. The solutions presented apply to surfaces with both rough and reflectance textures. All solutions have been implemented and tested on images of real surfaces, including ones with complex, irregular textures. The novelty of the work is in the use of LSF representations as the basis for the required computations. An LSF representation of any texture can be computed easily with low-level, parallelizable, decision-free computations. This is an advantage over traditional texture-element- and edge-element-based methods, which require pre-processing that is more complex, higher-level, not decision-free, and not applicable to all textures. The methods presented can use different LSF representations; however, Gabor wavelet decompositions are favored for their sampling efficiency and optimal joint localization in the spatial and spatial-frequency domains, which results in highly localized, accurate measurements. By finding LSF-based solutions to three different vision problems, this work provides support for the hypothesis that the computation of LSF information is a general-purpose stage in low-level vision. This work develops novel texture projection models that describe the relationships among image LSFs, surface LSFs, and surface geometry; these models are potentially applicable to other vision problems as well.
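As a rough illustration of an LSF representation built from a Gabor decomposition (the frequencies and orientations below are assumed parameters for the sketch, not the author's), one can record, at each pixel, the Gabor channel with the largest response magnitude:

```python
# Sketch of a local-spatial-frequency (LSF) representation via a Gabor filter
# bank: per pixel, keep the (frequency, orientation) of the strongest channel.
import numpy as np
from skimage.filters import gabor

def lsf_representation(image, frequencies=(0.05, 0.1, 0.2, 0.4), orientations=8):
    """Return per-pixel dominant frequency, orientation, and magnitude."""
    thetas = np.linspace(0, np.pi, orientations, endpoint=False)
    best_mag = np.zeros(image.shape)
    best_freq = np.zeros(image.shape)
    best_theta = np.zeros(image.shape)
    for f in frequencies:
        for theta in thetas:
            real, imag = gabor(image, frequency=f, theta=theta)
            mag = np.hypot(real, imag)        # complex Gabor response magnitude
            mask = mag > best_mag
            best_mag[mask] = mag[mask]
            best_freq[mask] = f
            best_theta[mask] = theta
    return best_freq, best_theta, best_mag
```

On a planar textured surface viewed under perspective, the dominant local frequency varies systematically across the image, and that variation carries information about surface orientation; this is the kind of relationship the texture projection models above formalize.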

1 citation

BookDOI
01 Jan 2002

1 citation

01 Jan 2015
TL;DR: It is found that this particular I/VQA model is not apt for evaluating collections with varied content, but that research into other I/VQA models is promising, and that their implementation at large scale can narrow the problem of curating very large digital video collections and lead to preservation and access decisions based on informed priorities.
Abstract: As the production, variety, and consumption of born-digital video grow, so does the demand for acquiring, curating, and preserving large-scale digital video collections. As a multidisciplinary team of curators, computer scientists, and video engineers, we explore the use of no-reference Image and Video Quality Assessment (I/VQA) algorithms, specifically BRISQUE in this paper, to automatically derive ranges of video quality. An important characteristic of these algorithms is that they are modeled on human perception. We run the algorithms in a High Performance Computing (HPC) environment to obtain results for many videos at the same time, accelerating time to results and improving precision in computing per-frame and per-video quality assessment scores. The results, which were evaluated quantitatively and qualitatively, suggest that BRISQUE identifies the distortions on which it was trained and performs well on videos that have natural scenes and no drastic scene changes. While we found that this particular model is not apt for evaluating collections with varied content, the results suggest that research into other I/VQA models is promising, and that their implementation at large scale can narrow the problem of curating very large digital video collections and lead to preservation and access decisions based on informed priorities.

Introduction

The use of video has become significant and pervasive in our daily lives, going beyond traditional education and entertainment functions into areas such as personal communications, criminal evidence, surveillance, and marketing. With this functional diversity comes a variety of formats, including advancing compression and editing mechanisms that facilitate video creation and distribution. The advancements in video technology are important to cultural institutions, which are responsible for documenting society and for preserving video collections. Over time, these video collections grow without bound, severely encumbering the curation task. Accordingly, collecting institutions realize that individual and manual inspection, the traditional approach to assessing video quality and making subsequent preservation and access decisions, is an insurmountable task. Instead, novel, reliable, and automated methods are required for this purpose. Motivated by the need to develop curation solutions for large and varied video collections, this project investigates the use of Image and Video Quality Assessment (I/VQA) algorithms to generate data-driven, perceptually relevant indicators of video quality levels for large video collections. I/VQA algorithms are designed to predict the subjective quality of a natural image or video that has been digitally acquired, processed, communicated, and displayed, as it would be perceived and reported by users [1]. Currently, such algorithms are used to assess the quality of images and videos in streaming applications and to dynamically correct their distortions. In this project we explore whether, and which, I/VQA algorithms can be used to conduct large-scale automated assessment from which the need for more in-depth video analysis can be prioritized. We conducted experiments to understand the adequacy and scope of the I/VQA algorithm BRISQUE, and to refine it, using a reference set of videos and a set of artistic videos as testbeds. All the experiments were run using High Performance Computing (HPC) resources.
Running parallel computational processes on HPC systems allows generating results for the individual frames of every video in a collection promptly and accurately within one workflow. Interpreting these results entailed a qualitative evaluation, that is, viewing videos with frame-level quality predictions along with a graph indicating a holistic measure of quality over the entire video. In the context of a digital curation project, experimenting with these algorithms in an HPC environment benefits from an interdisciplinary approach. As a collaboration between the Laboratory for Image and Video Engineering (LIVE, http://live.ece.utexas.edu), which conducts research in I/VQA, and the Texas Advanced Computing Center (TACC, http://www.tacc.utexas.edu), which deploys computational resources for open science research, our team combines the expertise of data curators and computational scientists with that of video engineers. In this paper we introduce the I/VQA algorithms, explain how they compare to current methods for estimating video quality in heritage video collections, describe the experiments conducted to understand the fitness of the model for assessing video collections, and discuss the results obtained from testing the model on reference video sets and on a regular video collection.

I/VQA Algorithms

State-of-the-art I/VQA algorithms are based on natural scene statistics (NSS), which function under the premise that natural scenes have statistical regularities. Because the human visual system is tuned to distinguish regularities from irregularities, statistics sensitive to these variations in regularity have been shown to correlate well with difference mean opinion scores (DMOS) of images and video. To successfully map these statistics to a single perceptual quality score, these algorithms are trained on images and videos that have corresponding opinion scores. The DMOS scores are computed from a set of subjective evaluations obtained from humans watching sets of videos that have specific types and degrees of distortions. These videos are rated using a continuous sliding scale with the labels "Worst," "Poor," "Fair," "Good," and "Excellent." The user scores are combined to compute the DMOS score on the range [0, 100], where 0 is "Excellent" and 100 is "Worst." These human scores are necessary for measuring the impact that different distortions have on perceptual quality [1]. I/VQA algorithms can be full-reference (FR) or no-reference (NR). The former require as input a high-quality reference image or video against which a distorted copy can be compared. In the context of curation, an FR algorithm, the Structural Similarity Index (SSIM), was used to verify whether, and to what degree, the conversion of original video files involved information loss [2]. By contrast, NR algorithms measure the perceived quality of images and videos for which there is no original or pristine version available for comparison [1]. We propose that NR algorithms could be useful for understanding a collection's quality without the need for humans to review each video. However, studies have to be conducted to understand which models can be used to assess quality in video collections that are varied in content and distortions. The focus of this paper is evaluating whether BRISQUE, an NR algorithm for image quality assessment that can also be used to assess video, is appropriate for digital video curation.
Related Work

Collecting institutions have traditionally focused on digitizing analogue video for preservation and access, and a number of video QC tools have been introduced for purposes of automatic and objective quality assessment of digitized files [3, 4]. This is a great improvement over the traditional approach in which humans reviewed the files to detect both errors originating in the analogue media that was digitized and errors resulting from the digitization process. Indeed, while humans can identify different types of video distortions, manually recording them with precision is extremely time-consuming and inconsistent [5]. Aside from individual differences, popular QC tools identify various types of artifacts and noise in individual frames and across frame differences, producing frame-by-frame features [3] or averaged features [4] for each type of detected distortion. In turn, these results have to be interpreted to derive a holistic quality condition per video. Therefore, while these tools assist the human curation task, none of them eliminates the need for humans to view the videos. To accurately assess the condition of a video in a perceptually relevant context, these features must be mapped to a quality score that correlates significantly with human-based DMOS scores. Our work differs in methods and scope from the above, serving a complementary function. As opposed to detecting errors based on distortion-specific filters and corresponding ranges of normalcy, we are introducing perceptual subjective measures based on models of the human visual system to understand the quality of individual digital videos within collections. Importantly, the scores produced by the I/VQA algorithms are statistically significant through their correlation with the consensus scores obtained from people who have rated the distortions in reference video sets. Such consensus can be understood as the collective interpretation of quality. In addition, our project does not focus on detecting analogue distortions or on evaluating the results of the digitization process, but on distortions that are typical of compression algorithms. Because we are interested in processing large video collections, we run the model on a supercomputer, allowing us to obtain DMOS predictions both holistically and at the per-frame scale. In addition, we performed a study without training on rated distortions, to remove subjectivity. In the following section we describe the testbed collections used to build and to evaluate our model, and the studies performed to determine its fitness for assessing the condition of large-scale video collections.
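For context, here is a minimal sketch of the natural-scene-statistics front end that BRISQUE builds on, applied frame by frame to a video: mean-subtracted contrast-normalized (MSCN) coefficients. The feature extraction and the trained regression that maps features to a DMOS-like score are omitted, and the window size and constants are illustrative rather than taken from the paper.

```python
# Per-frame MSCN coefficients, the NSS front end underlying BRISQUE.
import cv2
import numpy as np

def mscn(gray, ksize=7, sigma=7 / 6, C=1.0):
    """Mean-subtracted, contrast-normalized coefficients of a grayscale frame."""
    gray = gray.astype(np.float64)
    mu = cv2.GaussianBlur(gray, (ksize, ksize), sigma)           # local mean
    sigma_map = np.sqrt(np.abs(
        cv2.GaussianBlur(gray * gray, (ksize, ksize), sigma) - mu * mu))
    return (gray - mu) / (sigma_map + C)

def per_frame_stats(video_path):
    """Yield (frame index, variance of MSCN coefficients) for every frame.
    Heavily distorted frames tend to deviate from natural-scene statistics."""
    cap = cv2.VideoCapture(video_path)
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        yield idx, float(np.var(mscn(gray)))
        idx += 1
    cap.release()
```

In a workflow like the one described above, an independent per-frame computation of this kind is exactly what parallelizes cleanly across HPC nodes, since no frame depends on any other.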

1 citation


Cited by
Journal ArticleDOI
TL;DR: In this article, a structural similarity index is proposed for image quality assessment based on the degradation of structural information, and its promise is demonstrated through comparison with both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000.
Abstract: Objective methods for assessing perceptual image quality traditionally attempted to quantify the visibility of errors (differences) between a distorted image and a reference image using a variety of known properties of the human visual system. Under the assumption that human visual perception is highly adapted for extracting structural information from a scene, we introduce an alternative complementary framework for quality assessment based on the degradation of structural information. As a specific example of this concept, we develop a structural similarity index and demonstrate its promise through a set of intuitive examples, as well as comparison to both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000. A MATLAB implementation of the proposed algorithm is available online at http://www.cns.nyu.edu/~lcv/ssim/.
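For reference, a hedged usage sketch computing SSIM with scikit-image rather than the MATLAB implementation linked above; the file names are placeholders.

```python
# Compute the mean SSIM score and the per-pixel similarity map between a
# reference image and a distorted copy.
from skimage import img_as_float, io
from skimage.metrics import structural_similarity

reference = img_as_float(io.imread("reference.png", as_gray=True))
distorted = img_as_float(io.imread("distorted.png", as_gray=True))

score, ssim_map = structural_similarity(
    reference, distorted, data_range=1.0, full=True)
print(f"SSIM = {score:.4f}")   # 1.0 indicates structurally identical images
```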

40,609 citations

Book
01 Jan 1998
TL;DR: A chapter-level overview of the book, spanning an introduction to a transient world, Fourier and wavelet bases, wavelet packets and local cosine bases, approximation, estimation, and transform coding.
Abstract: Introduction to a Transient World. Fourier Kingdom. Discrete Revolution. Time Meets Frequency. Frames. Wavelet Zoom. Wavelet Bases. Wavelet Packet and Local Cosine Bases. An Approximation Tour. Estimations are Approximations. Transform Coding. Appendix A: Mathematical Complements. Appendix B: Software Toolboxes.

17,693 citations

Proceedings ArticleDOI
21 Jul 2017
TL;DR: Conditional adversarial networks are investigated as a general-purpose solution to image-to-image translation problems and it is demonstrated that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks.
Abstract: We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Moreover, since the release of the pix2pix software associated with this paper, hundreds of Twitter users have posted their own artistic experiments using our system. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.
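For reference, the objective that this kind of conditional image-to-image model optimizes can be written as a conditional GAN loss combined with an L1 reconstruction term (x: input image, y: target output, z: noise, G: generator, D: discriminator, λ: weight on the reconstruction term); the notation below is a standard rendering rather than a quotation from the paper.

```latex
\mathcal{L}_{\mathrm{cGAN}}(G,D) =
    \mathbb{E}_{x,y}\bigl[\log D(x,y)\bigr]
  + \mathbb{E}_{x,z}\bigl[\log\bigl(1 - D(x, G(x,z))\bigr)\bigr],
\qquad
\mathcal{L}_{L1}(G) = \mathbb{E}_{x,y,z}\bigl[\lVert y - G(x,z)\rVert_1\bigr],
\qquad
G^{*} = \arg\min_{G}\max_{D}\;
    \mathcal{L}_{\mathrm{cGAN}}(G,D) + \lambda\,\mathcal{L}_{L1}(G).
```

The generator is trained to fool the discriminator while staying close to the target output in an L1 sense, which is what lets a single generic objective cover the different translation tasks listed above.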

11,958 citations

Posted Content
TL;DR: Conditional adversarial networks, as discussed by the authors, offer a general-purpose solution to image-to-image translation problems and can be used for synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks.
Abstract: We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Indeed, since the release of the pix2pix software associated with this paper, a large number of internet users (many of them artists) have posted their own experiments with our system, further demonstrating its wide applicability and ease of adoption without the need for parameter tweaking. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.

11,127 citations

Journal ArticleDOI
01 Apr 1988-Nature
TL;DR: In this paper, a sedimentological core and petrographic characterisation of samples from eleven boreholes from the Lower Carboniferous of Bowland Basin (Northwest England) is presented.
Abstract: Deposits of clastic carbonate-dominated (calciclastic) sedimentary slope systems in the rock record have been identified mostly as linearly-consistent carbonate apron deposits, even though most ancient clastic carbonate slope deposits fit submarine fan systems better. Calciclastic submarine fans are consequently rarely described and are poorly understood, and very little is known about mud-dominated calciclastic submarine fan systems in particular. Presented in this study are a sedimentological core and petrographic characterisation of samples from eleven boreholes from the Lower Carboniferous of the Bowland Basin (Northwest England) that reveals a >250 m thick calciturbidite complex deposited in a calciclastic submarine fan setting. Seven facies are recognised from core and thin-section characterisation and are grouped into three carbonate turbidite sequences. They include: 1) calciturbidites, comprising mostly high- to low-density, wavy-laminated, bioclast-rich facies; 2) low-density densite mudstones, characterised by planar-laminated and unlaminated mud-dominated facies; and 3) calcidebrites, which are muddy or hyper-concentrated debris-flow deposits occurring as poorly-sorted, chaotic, mud-supported floatstones. These

9,929 citations