Author

Orazio Gallo

Other affiliations: Smith-Kettlewell Institute, University of California, Santa Cruz, Polytechnic University of Milan

Bio: Orazio Gallo is an academic researcher at Nvidia who has contributed to research on view synthesis and image processing. The author has an h-index of 22 and has co-authored 68 publications that have received 3,018 citations. Previous affiliations of Orazio Gallo include the Smith-Kettlewell Institute and the University of California, Santa Cruz.

Papers published on a yearly basis (chart; years shown: 2004-2006, 2008-2021, and 2023)

Papers

Journal Article•DOI•

Loss Functions for Image Restoration With Neural Networks

Hang Zhao¹, Orazio Gallo¹, Iuri Frosio¹, Jan Kautz¹•Institutions (1)

Nvidia¹

01 Mar 2017-IEEE Transactions on Computational Imaging

TL;DR: It is shown that the quality of the results improves significantly with better loss functions, even when the network architecture is left unchanged, and a novel, differentiable error function is proposed.

Abstract: Neural networks are becoming central in several areas of computer vision and image processing and different architectures have been proposed to solve specific problems. The impact of the loss layer of neural networks, however, has not received much attention in the context of image processing: the default and virtually only choice is $\ell_2$. In this paper, we bring attention to alternative choices for image restoration. In particular, we show the importance of perceptually-motivated losses when the resulting image is to be evaluated by a human observer. We compare the performance of several losses, and propose a novel, differentiable error function. We show that the quality of the results improves significantly with better loss functions, even when the network architecture is left unchanged.

1,758 citations

Journal Article•DOI•

FlexISP: a flexible camera image processing framework

Felix Heide¹, Markus Steinberger², Yun-Ta Tsai¹, Mushfiqur Rouf¹, Dawid Pająk¹, Dikpal Reddy¹, Orazio Gallo¹, Jing Liu³, Wolfgang Heidrich⁴, Karen Egiazarian¹, Jan Kautz¹, Kari Pulli¹•Institutions (4)

Nvidia¹, Graz University of Technology², University of California, Santa Cruz³, King Abdullah University of Science and Technology⁴

19 Nov 2014

TL;DR: This work proposes an end-to-end system that is aware of the camera and image model, enforces natural-image priors, while jointly accounting for common image processing steps like demosaicking, denoising, deconvolution, and so forth, all directly in a given output representation.

Abstract: Conventional pipelines for capturing, displaying, and storing images are usually defined as a series of cascaded modules, each responsible for addressing a particular problem. While this divide-and-conquer approach offers many benefits, it also introduces a cumulative error, as each step in the pipeline only considers the output of the previous step, not the original sensor data. We propose an end-to-end system that is aware of the camera and image model, enforces natural-image priors, while jointly accounting for common image processing steps like demosaicking, denoising, deconvolution, and so forth, all directly in a given output representation (e.g., YUV, DCT). Our system is flexible and we demonstrate it on regular Bayer images as well as images from custom sensors. In all cases, we achieve large improvements in image quality and signal reconstruction compared to state-of-the-art techniques. Finally, we show that our approach is capable of very efficiently handling high-resolution images, making even mobile implementations feasible.

319 citations

Proceedings Article•DOI•

Artifact-free High Dynamic Range imaging

Orazio Gallo¹, Natasha Gelfand², Wei-Chao Chen², Marius Tico², Kari Pulli²•Institutions (2)

University of California, Santa Cruz¹, Nokia²

16 Apr 2009

TL;DR: This work presents a technique capable of dealing with a large amount of movement in the scene: it finds, in all the available exposures, patches consistent with a reference image previously selected from the stack and generates the HDR image by averaging the radiance estimates of all such regions.

Abstract: The contrast in real world scenes is often beyond what consumer cameras can capture. For these situations, High Dynamic Range (HDR) images can be generated by taking multiple exposures of the same scene. When fusing information from different images, however, the slightest change in the scene can generate artifacts which dramatically limit the potential of this solution. We present a technique capable of dealing with a large amount of movement in the scene: we find, in all the available exposures, patches consistent with a reference image previously selected from the stack. We generate the HDR image by averaging the radiance estimates of all such regions and we compensate for camera calibration errors by removing potential seams. We show that our method works even in cases when many moving objects cover large regions of the scene.

261 citations

Proceedings Article•DOI•

HDR Deghosting: How to Deal with Saturation?

Jun Hu¹, Orazio Gallo², Kari Pulli², Xiaobai Sun¹•Institutions (2)

Duke University¹, Nvidia²

23 Jun 2013

TL;DR: A novel method for aligning images in an HDR (high-dynamic-range) image stack to produce a new exposure stack where all the images are aligned and appear as if they were taken simultaneously, even in the case of highly dynamic scenes.

Abstract: We present a novel method for aligning images in an HDR (high-dynamic-range) image stack to produce a new exposure stack where all the images are aligned and appear as if they were taken simultaneously, even in the case of highly dynamic scenes. Our method produces plausible results even where the image used as a reference is either too dark or bright to allow for an accurate registration.

238 citations

Posted Content•

Loss Functions for Neural Networks for Image Processing

Hang Zhao, Orazio Gallo, Iuri Frosio, Jan Kautz

28 Nov 2015-arXiv: Computer Vision and Pattern Recognition

Abstract: Neural networks are becoming central in several areas of computer vision and image processing and different architectures have been proposed to solve specific problems. The impact of the loss layer of neural networks, however, has not received much attention in the context of image processing: the default and virtually only choice is L2. In this paper, we bring attention to alternative choices for image restoration. In particular, we show the importance of perceptually-motivated losses when the resulting image is to be evaluated by a human observer. We compare the performance of several losses, and propose a novel, differentiable error function. We show that the quality of the results improves significantly with better loss functions, even when the network architecture is left unchanged.

229 citations

Cited by

Proceedings Article•DOI•

Enhanced Deep Residual Networks for Single Image Super-Resolution

Bee Oh Lim¹, Sanghyun Son¹, Heewon Kim¹, Seungjun Nah¹, Kyoung Mu Lee¹•Institutions (1)

Seoul National University¹

21 Jul 2017

TL;DR: This paper develops an enhanced deep super-resolution network (EDSR) with performance exceeding that of current state-of-the-art SR methods, and proposes a new multi-scale deep super-resolution system (MDSR) and training method, which can reconstruct high-resolution images of different upscaling factors in a single model.

Abstract: Recent research on super-resolution has progressed with the development of deep convolutional neural networks (DCNN). In particular, residual learning techniques exhibit improved performance. In this paper, we develop an enhanced deep super-resolution network (EDSR) with performance exceeding those of current state-of-the-art SR methods. The significant performance improvement of our model is due to optimization by removing unnecessary modules in conventional residual networks. The performance is further improved by expanding the model size while we stabilize the training procedure. We also propose a new multi-scale deep super-resolution system (MDSR) and training method, which can reconstruct high-resolution images of different upscaling factors in a single model. The proposed methods show superior performance over the state-of-the-art methods on benchmark datasets and prove its excellence by winning the NTIRE2017 Super-Resolution Challenge [26].

3,221 citations

Proceedings Article•DOI•

Unsupervised Monocular Depth Estimation with Left-Right Consistency

Clément Godard¹, Oisin Mac Aodha¹, Gabriel J. Brostow¹•Institutions (1)

University College London¹

21 Jul 2017

TL;DR: The authors propose a novel training objective that enables CNNs to learn single-image depth estimation despite the absence of ground-truth depth data, generating disparity images by training the network with an image reconstruction loss.

Abstract: Learning based methods have shown very promising results for the task of depth estimation in single images. However, most existing approaches treat depth prediction as a supervised regression problem and as a result, require vast quantities of corresponding ground truth depth data for training. Just recording quality depth data in a range of environments is a challenging problem. In this paper, we innovate beyond existing approaches, replacing the use of explicit depth data during training with easier-to-obtain binocular stereo footage. We propose a novel training objective that enables our convolutional neural network to learn to perform single image depth estimation, despite the absence of ground truth depth data. Exploiting epipolar geometry constraints, we generate disparity images by training our network with an image reconstruction loss. We show that solving for image reconstruction alone results in poor quality depth images. To overcome this problem, we propose a novel training loss that enforces consistency between the disparities produced relative to both the left and right images, leading to improved performance and robustness compared to existing approaches. Our method produces state of the art results for monocular depth estimation on the KITTI driving dataset, even outperforming supervised methods that have been trained with ground truth depth.

2,239 citations

Posted Content•

Enhanced Deep Residual Networks for Single Image Super-Resolution

Bee Oh Lim¹, Sanghyun Son¹, Heewon Kim¹, Seungjun Nah¹, Kyoung Mu Lee¹•Institutions (1)

Seoul National University¹

10 Jul 2017-arXiv: Computer Vision and Pattern Recognition

TL;DR: The authors develop an enhanced deep super-resolution network (EDSR) whose performance exceeds that of current state-of-the-art SR methods, achieved by removing unnecessary modules in conventional residual networks.

1,589 citations

Journal Article•DOI•

FFDNet: Toward a Fast and Flexible Solution for CNN-Based Image Denoising

Kai Zhang¹, Wangmeng Zuo¹, Lei Zhang²•Institutions (2)

Harbin Institute of Technology¹, Hong Kong Polytechnic University²

25 May 2018-IEEE Transactions on Image Processing

TL;DR: FFDNet is a fast and flexible denoising convolutional neural network that takes a tunable noise level map as input and can handle a wide range of noise levels effectively with a single network.

Abstract: Due to the fast inference and good performance, discriminative learning methods have been widely studied in image denoising. However, these methods mostly learn a specific model for each noise level, and require multiple models for denoising images with different noise levels. They also lack flexibility to deal with spatially variant noise, limiting their applications in practical denoising. To address these issues, we present a fast and flexible denoising convolutional neural network, namely FFDNet, with a tunable noise level map as the input. The proposed FFDNet works on downsampled sub-images, achieving a good trade-off between inference speed and denoising performance. In contrast to the existing discriminative denoisers, FFDNet enjoys several desirable properties, including: 1) the ability to handle a wide range of noise levels (i.e., [0, 75]) effectively with a single network; 2) the ability to remove spatially variant noise by specifying a non-uniform noise level map; and 3) faster speed than benchmark BM3D even on CPU without sacrificing denoising performance. Extensive experiments on synthetic and real noisy images are conducted to evaluate FFDNet in comparison with state-of-the-art denoisers. The results show that FFDNet is effective and efficient, making it highly attractive for practical denoising applications.

1,430 citations

Proceedings Article•DOI•

Learning Deep CNN Denoiser Prior for Image Restoration

Kai Zhang¹, Wangmeng Zuo¹, Shuhang Gu, Lei Zhang•Institutions (1)

Harbin Institute of Technology¹

21 Jul 2017

TL;DR: This paper trains a set of fast and effective CNN (convolutional neural network) denoisers and integrates them into a model-based optimization method to solve other inverse problems (e.g., deblurring).

Abstract: Model-based optimization methods and discriminative learning methods have been the two dominant strategies for solving various inverse problems in low-level vision. Typically, those two kinds of methods have their respective merits and drawbacks, e.g., model-based optimization methods are flexible for handling different inverse problems but are usually time-consuming with sophisticated priors for the purpose of good performance, in the meanwhile, discriminative learning methods have fast testing speed but their application range is greatly restricted by the specialized task. Recent works have revealed that, with the aid of variable splitting techniques, denoiser prior can be plugged in as a modular part of model-based optimization methods to solve other inverse problems (e.g., deblurring). Such an integration induces considerable advantage when the denoiser is obtained via discriminative learning. However, the study of integration with fast discriminative denoiser prior is still lacking. To this end, this paper aims to train a set of fast and effective CNN (convolutional neural network) denoisers and integrate them into model-based optimization method to solve other inverse problems. Experimental results demonstrate that the learned set of denoisers can not only achieve promising Gaussian denoising results but also can be used as prior to deliver good performance for various low-level vision applications.

1,216 citations
