Open Access · Posted Content

Learning to Predict Indoor Illumination from a Single Image

TL;DR: In this article, the authors train an end-to-end deep neural network that directly regresses a limited field-of-view photo to HDR illumination, without strong assumptions on scene geometry, material properties, or lighting.
Abstract
We propose an automatic method to infer high dynamic range illumination from a single, limited field-of-view, low dynamic range photograph of an indoor scene. In contrast to previous work that relies on specialized image capture, user input, and/or simple scene models, we train an end-to-end deep neural network that directly regresses a limited field-of-view photo to HDR illumination, without strong assumptions on scene geometry, material properties, or lighting. We show that this can be accomplished in a three step process: 1) we train a robust lighting classifier to automatically annotate the location of light sources in a large dataset of LDR environment maps, 2) we use these annotations to train a deep neural network that predicts the location of lights in a scene from a single limited field-of-view photo, and 3) we fine-tune this network using a small dataset of HDR environment maps to predict light intensities. This allows us to automatically recover high-quality HDR illumination estimates that significantly outperform previous state-of-the-art methods. Consequently, using our illumination estimates for applications like 3D object insertion, we can achieve results that are photo-realistic, which is validated via a perceptual user study.
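The three-step process hinges on step 1's light-source annotations. As a toy illustration of what such an annotation might look like, the sketch below thresholds the luminance of an LDR panorama at a high percentile to produce a binary light mask; the paper instead trains a robust classifier to generate these labels at scale, so the `annotate_lights` helper and its percentile parameter are illustrative assumptions, not the authors' method.

```python
import numpy as np

def annotate_lights(ldr_env_map, percentile=98.0):
    """Toy light-source annotation for an LDR environment map.

    Marks pixels whose luminance strictly exceeds a high percentile as
    'light'. This is a stand-in heuristic; the paper trains a robust
    lighting classifier to produce such annotations instead.
    """
    # Rec. 709 luma from an RGB panorama with values in [0, 1]
    luma = (0.2126 * ldr_env_map[..., 0]
            + 0.7152 * ldr_env_map[..., 1]
            + 0.0722 * ldr_env_map[..., 2])
    threshold = np.percentile(luma, percentile)
    return luma > threshold

# Example: a dark 64x128 panorama with one bright 4x8 patch
env = np.zeros((64, 128, 3))
env[10:14, 50:58] = 1.0  # hypothetical light source
mask = annotate_lights(env)
```

Masks like this would serve as training targets for the light-location network in step 2, before the HDR fine-tuning of step 3 attaches intensities.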


Citations
Journal Article · DOI

Scene-Aware Audio Rendering via Deep Acoustic Analysis

TL;DR: A new method captures the acoustic characteristics of real-world rooms using commodity devices and uses the captured characteristics, via deep neural networks, to generate similar-sounding sources with virtual models.
Posted Content

Seeing the World in a Bag of Chips

TL;DR: This work addresses the dual problems of novel view synthesis and environment reconstruction from hand-held RGBD sensors, and generates highly detailed environment images, revealing room composition, objects, people, buildings, and trees visible through windows.
Posted Content · DOI

Single-image Full-body Human Relighting

TL;DR: A new deep learning architecture, tailored to the decomposition performed in precomputed radiance transfer (PRT), is trained using a combination of L1, logarithmic, and rendering losses, and outperforms the state of the art for full-body human relighting on both synthetic images and photographs.
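As a rough sketch of how an L1-plus-logarithmic loss combination might look (the `relighting_loss` helper and its weights are illustrative assumptions; the cited paper's third, rendering-based loss needs a differentiable renderer and is omitted here):

```python
import numpy as np

def relighting_loss(pred, target, w_l1=1.0, w_log=1.0):
    """Toy combination of L1 and logarithmic losses on predicted images.

    The logarithmic term compresses dynamic range so bright pixels do
    not dominate the objective; weights w_l1 and w_log are assumed, not
    taken from the paper.
    """
    l1 = np.abs(pred - target).mean()
    log = np.abs(np.log1p(pred) - np.log1p(target)).mean()
    return w_l1 * l1 + w_log * log

# Identical prediction and target give zero loss
pred = np.array([0.5, 1.0, 2.0])
target = np.array([0.5, 1.0, 2.0])
loss = relighting_loss(pred, target)
```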
Proceedings Article · DOI

Extracting Regular FOV Shots from 360 Event Footage

TL;DR: This paper studies human preferences for static and moving-camera RFOV shots generated from 360 footage, and derives design guidelines used to develop automatic algorithms, demonstrated in a prototype user interface for extracting RFOV shots from 360 videos.
Posted Content

Illuminant Spectra-based Source Separation Using Flash Photography

TL;DR: This work derives a novel physics-based relationship between color variations in the observed flash/no-flash intensities and the spectra and surface shading corresponding to individual scene illuminants based on their spectral differences.
References
Proceedings Article · DOI

Deep Residual Learning for Image Recognition

TL;DR: In this article, the authors propose a residual learning framework to ease the training of networks substantially deeper than those used previously; it won 1st place in the ILSVRC 2015 classification task.
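The framework's key mechanism, an identity shortcut around a learned residual function, can be sketched in a few lines. This is a minimal NumPy illustration with a two-layer MLP as the residual branch, not the paper's convolutional architecture:

```python
import numpy as np

def residual_block(x, w1, w2):
    """A minimal residual unit: y = x + F(x), with F a two-layer ReLU MLP.

    The identity shortcut lets gradients bypass F, the core idea that
    eases optimization of very deep stacks.
    """
    h = np.maximum(w1 @ x, 0.0)   # first layer + ReLU
    return x + w2 @ h             # skip connection adds the input back

# With zero weights, F(x) = 0 and the block is exactly the identity,
# showing why adding residual layers cannot hurt representational power.
x = np.array([1.0, -2.0, 3.0])
w1 = np.zeros((3, 3))
w2 = np.zeros((3, 3))
y = residual_block(x, w1, w2)
```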
Proceedings Article · DOI

Histograms of oriented gradients for human detection

TL;DR: It is shown experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection, and the influence of each stage of the computation on performance is studied.
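As a rough illustration of the descriptor's core step (not the full pipeline, which adds block normalization and a sliding-window SVM detector), the sketch below bins gradient orientations for a single cell:

```python
import numpy as np

def hog_cell_histogram(patch, n_bins=9):
    """Orientation histogram for one HOG cell (toy version).

    Gradients come from simple finite differences and are binned over
    [0, 180) degrees, weighted by gradient magnitude.
    """
    gy, gx = np.gradient(patch.astype(float))  # per-axis gradients
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0  # unsigned orientation
    bins = np.minimum((angle / (180.0 / n_bins)).astype(int), n_bins - 1)
    hist = np.zeros(n_bins)
    np.add.at(hist, bins.ravel(), magnitude.ravel())
    return hist

# A horizontal intensity ramp: the gradient points along x (0 degrees),
# so all the histogram mass lands in the first bin.
patch = np.tile(np.arange(8.0), (8, 1))
hist = hog_cell_histogram(patch)
```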
Posted Content

Adam: A Method for Stochastic Optimization

TL;DR: In this article, the authors present Adam, a method for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments of the gradient.
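The Adam update is compact enough to sketch directly from that description: exponential moving averages of the gradient and its square, with bias correction for their zero initialization (a minimal NumPy version; hyperparameter defaults follow the paper's suggestions):

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update on parameters theta given gradient grad at step t."""
    m = b1 * m + (1 - b1) * grad        # first-moment (mean) estimate
    v = b2 * v + (1 - b2) * grad ** 2   # second-moment (uncentered variance)
    m_hat = m / (1 - b1 ** t)           # bias correction for zero init
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Minimize f(x) = x^2 starting from x = 5; the gradient is 2x.
theta = np.array([5.0])
m = np.zeros(1)
v = np.zeros(1)
for t in range(1, 3001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t, lr=0.05)
```

After a few thousand steps the iterate settles near the minimum at zero; the per-coordinate scaling by the second-moment estimate is what makes the step size roughly invariant to gradient magnitude.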
Proceedings Article · DOI

Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-scale Convolutional Architecture

TL;DR: This paper addresses three different computer vision tasks using a single basic architecture: depth prediction, surface normal estimation, and semantic labeling using a multiscale convolutional network that is able to adapt easily to each task using only small modifications.
Proceedings Article · DOI

Rendering synthetic objects into real scenes: bridging traditional and image-based graphics with global illumination and high dynamic range photography

TL;DR: A method that uses measured scene radiance and global illumination in order to add new objects to light-based models with correct lighting and the relevance of the technique to recovering surface reflectance properties in uncontrolled lighting situations is discussed.