Alan C. Bovik

Other affiliations: University of Illinois at Urbana–Champaign, University of Sydney, Intel ...

Bio: Alan C. Bovik is an academic researcher at the University of Texas at Austin. He has contributed to research on the topics of image quality and video quality. He has an h-index of 102 and has co-authored 837 publications that have received 96,088 citations. Previous affiliations of Alan C. Bovik include the University of Illinois at Urbana–Champaign and the University of Sydney.

Papers published on a yearly basis: chart omitted (years shown: 1982, 1983, 1985 to 2023)

Papers

Book Chapter

Optimizing Image Quality

Dominique Brunet¹, Sumohana S. Channappayya², Zhou Wang¹, Edward R. Vrscay¹, Alan C. Bovik³

University of Waterloo¹, Indian Institute of Technology, Hyderabad², University of Texas at Austin³

01 Jan 2018

TL;DR: In this chapter, a systematic framework for optimization with respect to a perceptual quality assessment algorithm is presented and the Structural SIMilarity (SSIM) index is the representative image quality assessment model that is studied.

Abstract: The fact that multimedia services have become the major driver for next generation wireless networks underscores their technological and economic impact. A vast majority of these multimedia services are consumer-centric and therefore must guarantee a certain level of perceptual quality. Given the massive volumes of image and video data in question, it is only natural to adopt automatic quality prediction and optimization tools. The past decade has seen the invention of several excellent automatic quality prediction tools for natural images and videos. While these tools predict perceptual quality scores accurately, they do not necessarily lend themselves to standard optimization techniques. In this chapter, a systematic framework for optimization with respect to a perceptual quality assessment algorithm is presented. The Structural SIMilarity (SSIM) index, which has found vast commercial acceptance owing to its high performance and low complexity, is the representative image quality assessment model that is studied. Specifically, a detailed exposition of the mathematical properties of the SSIM index is presented first, followed by a discussion on the design of linear and non-linear SSIM-optimal image restoration algorithms.

14 citations

Journal Article

ST-GREED: Space-Time Generalized Entropic Differences for Frame Rate Dependent Video Quality Prediction

Pavan C. Madhusudana¹, Neil Birkbeck², Yilin Wang², Balu Adsumilli², Alan C. Bovik¹

University of Texas at Austin¹, Google²

27 Aug 2021-IEEE Transactions on Image Processing

TL;DR: In this paper, a generalized Gaussian distribution (GGD) is used to model band-pass responses, while entropy variations between reference and distorted videos under the GGD model are used to capture video quality variations arising from frame rate changes.

Abstract: We consider the problem of conducting frame rate dependent video quality assessment (VQA) on videos of diverse frame rates, including high frame rate (HFR) videos. More generally, we study how perceptual quality is affected by frame rate, and how frame rate and compression combine to affect perceived quality. We devise an objective VQA model called Space-Time GeneRalized Entropic Difference (GREED) which analyzes the statistics of spatial and temporal band-pass video coefficients. A generalized Gaussian distribution (GGD) is used to model band-pass responses, while entropy variations between reference and distorted videos under the GGD model are used to capture video quality variations arising from frame rate changes. The entropic differences are calculated across multiple temporal and spatial subbands, and merged using a learned regressor. We show through extensive experiments that GREED achieves state-of-the-art performance on the LIVE-YT-HFR Database when compared with existing VQA models. The features used in GREED are highly generalizable and obtain competitive performance even on standard, non-HFR VQA databases. The implementation of GREED has been made available online: https://github.com/pavancm/GREED .

14 citations

Journal Article

Towards Perceptually Optimized Adaptive Video Streaming - A Realistic Quality of Experience Database

Christos G. Bampis¹, Zhi Li², Ioannis Katsavounidis³, Te-Yuan Huang², Chaitanya Ekanadham², Alan C. Bovik¹

University of Texas at Austin¹, Netflix², Facebook³

20 Apr 2021-IEEE Transactions on Image Processing

TL;DR: This article presents the LIVE-NFLX-II database, which contains subjective QoE responses to various design dimensions, such as bitrate adaptation algorithms, network conditions and video content.

Abstract: Measuring Quality of Experience (QoE) and integrating these measurements into video streaming algorithms is a multi-faceted problem that fundamentally requires the design of comprehensive subjective QoE databases and objective QoE prediction models. To achieve this goal, we have recently designed the LIVE-NFLX-II database, a highly-realistic database which contains subjective QoE responses to various design dimensions, such as bitrate adaptation algorithms, network conditions and video content. Our database builds on recent advancements in content-adaptive encoding and incorporates actual network traces to capture realistic network variations on the client device. The new database focuses on low bandwidth conditions which are more challenging for bitrate adaptation algorithms, which often must navigate tradeoffs between rebuffering and video quality. Using our database, we study the effects of multiple streaming dimensions on user experience and evaluate video quality and quality of experience models and analyze their strengths and weaknesses. We believe that the tools introduced here will help inspire further progress on the development of perceptually-optimized client adaptation and video streaming strategies. The database is publicly available at http://live.ece.utexas.edu/research/LIVE_NFLX_II/live_nflx_plus.html .

14 citations

Journal Article

Closed-form correlation model of oriented bandpass natural images

Che-Chun Su¹, Lawrence K. Cormack, Alan C. Bovik¹

University of Texas at Austin¹

01 Jan 2015-IEEE Signal Processing Letters

TL;DR: This work proposes a new closed-form spatial-oriented correlation model that captures statistical regularities between perceptually decomposed natural image luminance samples, and validates the new correlation model on a variety of natural images.

Abstract: Most prevalent statistical models of natural images characterize only the univariate distributions of divisively normalized bandpass responses or wavelet-like decompositions of them. However, the higher-order dependencies between spatially neighboring responses are not yet well understood. Towards filling this gap, we propose a new closed-form spatial-oriented correlation model that captures statistical regularities between perceptually decomposed natural image luminance samples. We validate the new correlation model on a variety of natural images. Experimental results demonstrate the robustness of the new correlation model across image content. A software release that implements the new closed-form spatial-oriented correlation model is available at http://live.ece.utexas.edu/research/3dnss/bicorr_release.zip.

14 citations

Journal Article

Studying the Statistics of Natural X-ray Pictures

Praful Gupta¹, Jack L. Glover², Nicholas G. Paulter², Alan C. Bovik¹

University of Texas at Austin¹, National Institute of Standards and Technology²

14 May 2018-Journal of Testing and Evaluation

TL;DR: This article studies and analyzes the statistics of both pristine and distorted bandpass X-ray images, and devises an application of natural scene statistic (NSS) models to an image modality classification task, whereby visible light (VL), X-ray, infrared, and millimeter-wave images can be effectively and automatically distinguished.

Abstract: In this article, we have studied and analyzed the statistics of both pristine and distorted bandpass X-ray images. In the past, we have shown that the statistics of natural, bandpass-filtered visible light (VL) pictures, commonly expressed by natural scene statistic (NSS) models, can be used to create remarkably powerful, perceptually relevant predictors of perceptual picture quality. We find that similar models can be developed that apply quite well to X-ray image data. We have also studied the potential of applying these statistical X-ray NSS models to the design of algorithms for automatic image quality prediction of X-ray images, such as might occur in security, medicine, and material inspection applications. As a demonstration of the discrimination power of these models, we devised an application of NSS models to an image modality classification task, whereby VL, X-ray, infrared, and millimeter-wave images can be effectively and automatically distinguished. Our study is conducted on a dataset of X-ray images made available by the National Institute of Standards and Technology.

14 citations

Cited by

Journal Article

Image quality assessment: from error visibility to structural similarity

Zhou Wang¹, Alan C. Bovik², Hamid R. Sheikh², Eero P. Simoncelli³

Center for Neural Science¹, University of Texas at Austin², Howard Hughes Medical Institute³

01 Apr 2004-IEEE Transactions on Image Processing

TL;DR: In this article, a structural similarity index is proposed for image quality assessment based on the degradation of structural information, and its promise is demonstrated through comparison to both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000.

Abstract: Objective methods for assessing perceptual image quality traditionally attempted to quantify the visibility of errors (differences) between a distorted image and a reference image using a variety of known properties of the human visual system. Under the assumption that human visual perception is highly adapted for extracting structural information from a scene, we introduce an alternative complementary framework for quality assessment based on the degradation of structural information. As a specific example of this concept, we develop a structural similarity index and demonstrate its promise through a set of intuitive examples, as well as comparison to both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000. A MATLAB implementation of the proposed algorithm is available online at http://www.cns.nyu.edu/~lcv/ssim/.

40,609 citations

Book

A wavelet tour of signal processing

Stéphane Mallat

01 Jan 1998

TL;DR: The book covers Fourier analysis, time-frequency methods, frames, wavelet bases, wavelet packet and local cosine bases, approximation, estimation, and transform coding.

Abstract: Introduction to a Transient World. Fourier Kingdom. Discrete Revolution. Time Meets Frequency. Frames. Wavelet Zoom. Wavelet Bases. Wavelet Packet and Local Cosine Bases. An Approximation Tour. Estimations are Approximations. Transform Coding. Appendix A: Mathematical Complements. Appendix B: Software Toolboxes.

17,693 citations

Proceedings Article

Image-to-Image Translation with Conditional Adversarial Networks

Phillip Isola¹, Jun-Yan Zhu¹, Tinghui Zhou¹, Alexei A. Efros¹

University of California, Berkeley¹

21 Jul 2017

TL;DR: Conditional adversarial networks are investigated as a general-purpose solution to image-to-image translation problems and it is demonstrated that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks.

Abstract: We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Moreover, since the release of the pix2pix software associated with this paper, hundreds of Twitter users have posted their own artistic experiments using our system. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.

11,958 citations

Posted Content

Image-to-Image Translation with Conditional Adversarial Networks

Phillip Isola¹, Jun-Yan Zhu¹, Tinghui Zhou¹, Alexei A. Efros¹

University of California, Berkeley¹

21 Nov 2016-arXiv: Computer Vision and Pattern Recognition

TL;DR: Conditional adversarial networks are investigated as a general-purpose solution to image-to-image translation problems; the approach can synthesize photos from label maps, reconstruct objects from edge maps, and colorize images, among other tasks.

Abstract: We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Indeed, since the release of the pix2pix software associated with this paper, a large number of internet users (many of them artists) have posted their own experiments with our system, further demonstrating its wide applicability and ease of adoption without the need for parameter tweaking. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.

11,127 citations

Journal Article

PhD by thesis

Richard Lathe¹

French Institute of Health and Medical Research¹

01 Apr 1988-Nature

TL;DR: In this paper, a sedimentological core and petrographic characterisation of samples from eleven boreholes from the Lower Carboniferous of Bowland Basin (Northwest England) is presented.

Abstract: Deposits of clastic carbonate-dominated (calciclastic) sedimentary slope systems in the rock record have been identified mostly as linearly-consistent carbonate apron deposits, even though most ancient clastic carbonate slope deposits fit the submarine fan systems better. Calciclastic submarine fans are consequently rarely described and are poorly understood. Subsequently, very little is known especially in mud-dominated calciclastic submarine fan systems. Presented in this study are a sedimentological core and petrographic characterisation of samples from eleven boreholes from the Lower Carboniferous of Bowland Basin (Northwest England) that reveals a >250 m thick calciturbidite complex deposited in a calciclastic submarine fan setting. Seven facies are recognised from core and thin section characterisation and are grouped into three carbonate turbidite sequences. They include: 1) Calciturbidites, comprising mostly of high- to low-density, wavy-laminated bioclast-rich facies; 2) low-density densite mudstones which are characterised by planar laminated and unlaminated mud-dominated facies; and 3) Calcidebrites which are muddy or hyper-concentrated debris-flow deposits occurring as poorly-sorted, chaotic, mud-supported floatstones. These

9,929 citations
