Author

Alex Harvill

Bio: Alex Harvill is an academic researcher. The author has contributed to research in the topics of artificial neural networks and Monte Carlo methods. The author has an h-index of 4 and has co-authored 6 publications receiving 322 citations.

Papers
Journal ArticleDOI
TL;DR: A supervised learning approach that allows the filtering kernel to be more complex and general by leveraging a deep convolutional neural network (CNN) architecture, together with a kernel-prediction network that uses the CNN to estimate the local weighting kernels used to compute each denoised pixel from its neighbors.
Abstract: Regression-based algorithms have been shown to be effective at denoising Monte Carlo (MC) renderings by leveraging their inexpensive by-products (e.g., feature buffers). However, when using higher-order models to handle complex cases, these techniques often overfit to noise in the input. For this reason, supervised learning methods have been proposed that train on a large collection of reference examples, but they use explicit filters that limit their denoising ability. To address these problems, we propose a novel, supervised learning approach that allows the filtering kernel to be more complex and general by leveraging a deep convolutional neural network (CNN) architecture. In one embodiment of our framework, the CNN directly predicts the final denoised pixel value as a highly non-linear combination of the input features. In a second approach, we introduce a novel kernel-prediction network which uses the CNN to estimate the local weighting kernels used to compute each denoised pixel from its neighbors. We train and evaluate our networks on production data and observe improvements over state-of-the-art MC denoisers, showing that our methods generalize well to a variety of scenes. We conclude by analyzing various components of our architecture and identifying areas of further research in deep learning for MC denoising.
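
The kernel-prediction step described above can be made concrete with a short sketch: the network outputs a weight vector per pixel, and the denoised pixel is a normalized weighted sum over its neighborhood. The snippet below is only an illustrative NumPy reconstruction under assumed conventions; the array layout, the 21x21 kernel size, and the softmax normalization are assumptions for the example, not details taken from the paper.

import numpy as np

def apply_kernel_prediction(noisy, kernels, k=21):
    """Reconstruct each pixel as a weighted sum of its k x k neighborhood.

    noisy   : (H, W, 3) float array of noisy radiance
    kernels : (H, W, k*k) per-pixel weights predicted by the CNN
    The weights are softmax-normalized here so each kernel sums to one.
    """
    H, W, _ = noisy.shape
    r = k // 2
    # Normalize the predicted weights over the kernel dimension.
    w = np.exp(kernels - kernels.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    # Pad the image so border pixels have a full neighborhood.
    padded = np.pad(noisy, ((r, r), (r, r), (0, 0)), mode="edge")
    out = np.zeros_like(noisy)
    for dy in range(k):
        for dx in range(k):
            idx = dy * k + dx
            shifted = padded[dy:dy + H, dx:dx + W, :]
            out += shifted * w[:, :, idx:idx + 1]
    return out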

278 citations

Journal ArticleDOI
TL;DR: A theoretical analysis of convergence rates of kernel-predicting architectures is presented, shedding light on why kernel prediction performs better than synthesizing the colors directly, complementing the empirical evidence presented in this and previous works.
Abstract: We present a modular convolutional architecture for denoising rendered images. We expand on the capabilities of kernel-predicting networks by combining them with a number of task-specific modules, and optimizing the assembly using an asymmetric loss. The source-aware encoder, the first module in the assembly, extracts low-level features and embeds them into a common feature space, enabling quick adaptation of a trained network to novel data. The spatial and temporal modules extract abstract, high-level features for kernel-based reconstruction, which is performed at three different spatial scales to reduce low-frequency artifacts. The complete network is trained using a class of asymmetric loss functions that are designed to preserve details and provide the user with a direct control over the variance-bias trade-off during inference. We also propose an error-predicting module for inferring reconstruction error maps that can be used to drive adaptive sampling. Finally, we present a theoretical analysis of convergence rates of kernel-predicting architectures, shedding light on why kernel prediction performs better than synthesizing the colors directly, complementing the empirical evidence presented in this and previous works. We demonstrate that our networks attain results that compare favorably to state-of-the-art methods in terms of detail preservation, low-frequency noise removal, and temporal stability on a variety of production and academic datasets.
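
The asymmetric loss mentioned above can be illustrated with a small sketch: the per-pixel error is scaled up whenever the denoised value lands on the opposite side of the reference from the noisy input, i.e., when the network introduces bias rather than merely failing to remove noise. The L1 base loss and the slope parameter lam below are assumptions chosen for illustration; the exact formulation in the paper may differ.

import numpy as np

def asymmetric_l1(denoised, reference, noisy, lam=2.0):
    """L1 loss scaled by lam wherever the denoised value and the noisy
    input lie on opposite sides of the reference, i.e. where the network
    pushes the result past the reference, away from the input.
    lam = 1 recovers plain L1; larger lam trades variance for less bias."""
    err = np.abs(denoised - reference)
    opposite_side = (denoised - reference) * (noisy - reference) < 0
    scale = np.where(opposite_side, lam, 1.0)
    return np.mean(scale * err)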

152 citations

Patent
31 Jul 2018
TL;DR: In this patent, a modular architecture for denoising Monte Carlo renderings using neural networks is presented, which includes separate single-frame or temporal denoising modules for individual scales, and one or more scale compositor neural networks configured to adaptively blend individual scales.
Abstract: A modular architecture is provided for denoising Monte Carlo renderings using neural networks. The temporal approach extracts and combines feature representations from neighboring frames rather than building a temporal context using recurrent connections. A multiscale architecture includes separate single-frame or temporal denoising modules for individual scales, and one or more scale compositor neural networks configured to adaptively blend individual scales. An error-predicting module is configured to produce adaptive sampling maps for a renderer to achieve more uniform residual noise distribution. An asymmetric loss function may be used for training the neural networks, which can provide control over the variance-bias trade-off during denoising.
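
One way to picture the scale compositing described above is a per-pixel blend in which a coarse-scale denoised result replaces the low frequencies of the fine-scale result wherever the compositor's weight is high. The sketch below uses simple average-pool downsampling, nearest-neighbor upsampling, and assumed array shapes purely for illustration; it is not the patented construction.

import numpy as np

def downsample2(img):
    """2x2 average pooling (assumes even height and width)."""
    H, W, C = img.shape
    return img.reshape(H // 2, 2, W // 2, 2, C).mean(axis=(1, 3))

def upsample2(img):
    """Nearest-neighbor 2x upsampling."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def composite_scales(fine, coarse, alpha):
    """Blend a fine-scale and a coarse-scale denoised result.

    fine   : (H, W, 3) output of the fine-scale denoising module
    coarse : (H/2, W/2, 3) output of the coarse-scale denoising module
    alpha  : (H, W, 1) per-pixel blend weight from the scale compositor

    Where alpha is close to one, the low frequencies of the fine result
    are swapped for the coarse result's low frequencies.
    """
    low_from_fine = upsample2(downsample2(fine))
    low_from_coarse = upsample2(coarse)
    return fine + alpha * (low_from_coarse - low_from_fine)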

5 citations

Patent
03 Oct 2019
TL;DR: In this patent, a modular architecture for denoising Monte Carlo renderings using neural networks is presented, which includes separate single-frame or temporal denoising modules for individual scales, and one or more scale compositor neural networks configured to adaptively blend individual scales.
Abstract: A modular architecture is provided for denoising Monte Carlo renderings using neural networks. The temporal approach extracts and combines feature representations from neighboring frames rather than building a temporal context using recurrent connections. A multiscale architecture includes separate single-frame or temporal denoising modules for individual scales, and one or more scale compositor neural networks configured to adaptively blend individual scales. An error-predicting module is configured to produce adaptive sampling maps for a renderer to achieve more uniform residual noise distribution. An asymmetric loss function may be used for training the neural networks, which can provide control over the variance-bias trade-off during denoising.

4 citations


Cited by
Journal ArticleDOI
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations

Journal ArticleDOI
TL;DR: A comparative study of deep techniques for image denoising, classifying deep convolutional neural networks (CNNs) for additive white noisy images, deep CNNs for real noisy images, deep CNNs for blind denoising, and deep networks for hybrid noisy images.

518 citations

Proceedings ArticleDOI
18 Jun 2018
TL;DR: In this paper, a convolutional neural network architecture is proposed for predicting spatially varying kernels that can both align and denoise frames, together with a synthetic data generation approach based on a realistic noise formation model and an optimization guided by an annealed loss function to avoid undesirable local minima.
Abstract: We present a technique for jointly denoising bursts of images taken from a handheld camera. In particular, we propose a convolutional neural network architecture for predicting spatially varying kernels that can both align and denoise frames, a synthetic data generation approach based on a realistic noise formation model, and an optimization guided by an annealed loss function to avoid undesirable local minima. Our model matches or outperforms the state-of-the-art across a wide range of noise levels on both real and synthetic data.
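
The annealed loss mentioned in the abstract can be sketched as a schedule: an auxiliary term on each frame's individual output is weighted heavily at the start of training and decays toward zero, discouraging the degenerate solution in which the network ignores every frame except the reference one. The L2 base loss and the decay constants below are illustrative assumptions rather than the paper's exact values.

import numpy as np

def annealed_burst_loss(final_pred, per_frame_preds, target, step,
                        beta=100.0, alpha=0.9998):
    """Loss on the combined burst output plus an annealed per-frame term.

    final_pred      : (H, W, 3) combined output over the whole burst
    per_frame_preds : (N, H, W, 3) output using each frame's kernel alone
    target          : (H, W, 3) ground-truth image
    step            : current training iteration
    The auxiliary term starts with weight beta and decays by alpha each
    step, so early training pushes every frame to align to the target
    while late training optimizes only the combined result.
    """
    main = np.mean((final_pred - target) ** 2)
    aux = np.mean((per_frame_preds - target[None]) ** 2)
    return main + beta * (alpha ** step) * aux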

387 citations

Proceedings ArticleDOI
01 Oct 2019
TL;DR: Li et al. propose a Laplacian pyramid based kernel prediction network (LP-KPN) that efficiently learns per-pixel kernels to recover the high-resolution (HR) image, achieving better visual quality with sharper edges and finer textures on real-world scenes.
Abstract: Most of the existing learning-based single image super-resolution (SISR) methods are trained and evaluated on simulated datasets, where the low-resolution (LR) images are generated by applying a simple and uniform degradation (i.e., bicubic downsampling) to their high-resolution (HR) counterparts. However, the degradations in real-world LR images are far more complicated. As a consequence, the SISR models trained on simulated data become less effective when applied to practical scenarios. In this paper, we build a real-world super-resolution (RealSR) dataset where paired LR-HR images on the same scene are captured by adjusting the focal length of a digital camera. An image registration algorithm is developed to progressively align the image pairs at different resolutions. Considering that the degradation kernels are naturally non-uniform in our dataset, we present a Laplacian pyramid based kernel prediction network (LP-KPN), which efficiently learns per-pixel kernels to recover the HR image. Our extensive experiments demonstrate that SISR models trained on our RealSR dataset deliver better visual quality with sharper edges and finer textures on real-world scenes than those trained on simulated datasets. Though our RealSR dataset is built by using only two cameras (Canon 5D3 and Nikon D810), the trained model generalizes well to other camera devices such as Sony a7II and mobile phones.
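
A Laplacian pyramid underlies the LP-KPN described above: the image is split into band-pass detail layers plus a low-frequency residual, so kernels predicted at the coarser levels cover a large effective receptive field at low cost. The sketch below shows only the decomposition and its exact inverse, using simple average-pool and nearest-neighbor resampling as assumptions; it is not the paper's implementation.

import numpy as np

def build_laplacian_pyramid(img, levels=3):
    """Split an image into band-pass detail layers plus a coarse residual."""
    def down(x):  # 2x2 average pooling (assumes even sizes at every level)
        H, W, C = x.shape
        return x.reshape(H // 2, 2, W // 2, 2, C).mean(axis=(1, 3))

    def up(x):    # nearest-neighbor 2x upsampling
        return x.repeat(2, axis=0).repeat(2, axis=1)

    pyramid, current = [], img
    for _ in range(levels - 1):
        smaller = down(current)
        pyramid.append(current - up(smaller))  # band-pass detail layer
        current = smaller
    pyramid.append(current)                    # low-frequency residual
    return pyramid

def reconstruct(pyramid):
    """Invert build_laplacian_pyramid: upsample and add, coarse to fine."""
    out = pyramid[-1]
    for detail in reversed(pyramid[:-1]):
        out = detail + out.repeat(2, axis=0).repeat(2, axis=1)
    return out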

318 citations

Journal ArticleDOI
TL;DR: This work proposes a variant of deep convolutional networks better suited to the class of noise present in Monte Carlo rendering, which allows for much larger pixel neighborhoods to be taken into account, while also improving execution speed by an order of magnitude.
Abstract: We describe a machine learning technique for reconstructing image sequences rendered using Monte Carlo methods. Our primary focus is on reconstruction of global illumination with extremely low sampling budgets at interactive rates. Motivated by recent advances in image restoration with deep convolutional networks, we propose a variant of these networks better suited to the class of noise present in Monte Carlo rendering. We allow for much larger pixel neighborhoods to be taken into account, while also improving execution speed by an order of magnitude. Our primary contribution is the addition of recurrent connections to the network in order to drastically improve temporal stability for sequences of sparsely sampled input images. Our method also has the desirable property of automatically modeling relationships based on auxiliary per-pixel input channels, such as depth and normals. We show significantly higher quality results compared to existing methods that run at comparable speeds, and furthermore argue a clear path for making our method run at realtime rates in the near future.
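
The recurrent connections highlighted above can be illustrated with a minimal convolutional block that carries a hidden state from one frame to the next; a real denoising autoencoder would place such blocks at several encoder stages and combine them with the auxiliary depth and normal channels. The layer sizes and the PyTorch formulation below are assumptions made for the example, not the paper's architecture.

import torch
import torch.nn as nn

class RecurrentConvBlock(nn.Module):
    """Convolution whose input includes the previous frame's activations.

    Carrying the hidden state across frames lets information persist in a
    sparsely sampled sequence, which is what stabilizes the output
    temporally.
    """
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels * 2, channels, kernel_size=3, padding=1)

    def forward(self, features, hidden=None):
        if hidden is None:
            hidden = torch.zeros_like(features)
        out = torch.relu(self.conv(torch.cat([features, hidden], dim=1)))
        return out, out  # the output doubles as the next frame's hidden state

# Illustrative use over a sequence of per-frame feature maps:
# block, hidden = RecurrentConvBlock(32), None
# for features in frame_features:          # each tensor shaped (1, 32, H, W)
#     features, hidden = block(features, hidden)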

277 citations