Journal ArticleDOI

Data-driven hallucination of different times of day from a single outdoor photo

TL;DR: This paper introduces the first data-driven approach to automatically creating a plausible-looking photo that appears as though it were taken at a different time of day, using a database of time-lapse videos of various scenes.
Abstract: We introduce "time hallucination": synthesizing a plausible image at a different time of day from an input image. This challenging task often requires dramatically altering the color appearance of the picture. In this paper, we introduce the first data-driven approach to automatically creating a plausible-looking photo that appears as though it were taken at a different time of day. The time of day is specified by a semantic time label, such as "night". Our approach relies on a database of time-lapse videos of various scenes. These videos provide rich information about the variations in color appearance of a scene throughout the day. Our method transfers the color appearance from videos with a similar scene as the input photo. We propose a locally affine model learned from the video for the transfer, allowing our model to synthesize new color data while retaining image details. We show that this model can hallucinate a wide range of different times of day. The model generates a large sparse linear system, which can be solved by off-the-shelf solvers. We validate our method by transforming photos of various outdoor scenes to four times of interest: daytime, the golden hour, the blue hour, and nighttime.
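
To make the locally affine transfer concrete, here is a minimal sketch of the core idea (my own simplification with an assumed window size and independent per-pixel fits; the paper instead couples the fits with a regularization term, which is what yields the large sparse linear system):

    import numpy as np

    def local_affine_transfer(input_img, matched, target, win=15):
        """For each pixel, fit a 3x4 affine color transform that maps the
        matched time-lapse frame to the target frame over a local window
        (least squares), then apply it to the input pixel.  Images are
        float arrays of shape (h, w, 3) in [0, 1]."""
        h, w, _ = input_img.shape
        out = np.zeros_like(input_img, dtype=np.float64)
        r = win // 2
        for y in range(h):
            for x in range(w):
                y0, y1 = max(0, y - r), min(h, y + r + 1)
                x0, x1 = max(0, x - r), min(w, x + r + 1)
                m = matched[y0:y1, x0:x1].reshape(-1, 3)
                t = target[y0:y1, x0:x1].reshape(-1, 3)
                M = np.hstack([m, np.ones((len(m), 1))])   # homogeneous colors
                A, *_ = np.linalg.lstsq(M, t, rcond=None)  # 4x3 affine fit
                out[y, x] = np.append(input_img[y, x], 1.0) @ A
        return np.clip(out, 0.0, 1.0)

Because each pixel gets its own affine map estimated from a neighborhood, low-frequency color shifts toward the target time of day while the input's high-frequency details survive, which matches the "synthesize new color data while retaining image details" claim above.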

Citations
Proceedings ArticleDOI
21 Jul 2017
TL;DR: Conditional adversarial networks are investigated as a general-purpose solution to image-to-image translation problems and it is demonstrated that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks.
Abstract: We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Moreover, since the release of the pix2pix software associated with this paper, hundreds of Twitter users have posted their own artistic experiments using our system. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.
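
For reference, the objective stated in the pix2pix paper pairs the conditional GAN loss with an L1 reconstruction term, where $x$ is the input image, $y$ the ground truth, and $z$ the noise:

$$\mathcal{L}_{cGAN}(G,D) = \mathbb{E}_{x,y}[\log D(x,y)] + \mathbb{E}_{x,z}[\log(1 - D(x,G(x,z)))]$$

$$G^* = \arg\min_G \max_D \; \mathcal{L}_{cGAN}(G,D) + \lambda\,\mathbb{E}_{x,y,z}[\lVert y - G(x,z)\rVert_1]$$

The L1 term keeps outputs close to the ground truth while the adversarial term pushes them toward the distribution of real images; this combination is what lets one generic loss replace the hand-designed objectives mentioned above.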

11,958 citations


Cites background from "Data-driven hallucination of different times of day from a single outdoor photo"

  • ..., [14, 23, 18, 8, 10, 50, 30, 36, 16, 55, 58]), despite the fact that the setting is always the same: predict pixels from pixels....

Posted Content
TL;DR: Conditional adversarial networks, as discussed by the authors, are a general-purpose solution to image-to-image translation problems that can synthesize photos from label maps, reconstruct objects from edge maps, and colorize images, among other tasks.
Abstract: We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Indeed, since the release of the pix2pix software associated with this paper, a large number of internet users (many of them artists) have posted their own experiments with our system, further demonstrating its wide applicability and ease of adoption without the need for parameter tweaking. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.

11,127 citations

Posted Content
TL;DR: This work presents an approach for learning to translate an image from a source domain X to a target domain Y in the absence of paired examples, and introduces a cycle consistency loss to push F(G(X)) ≈ X (and vice versa).
Abstract: Image-to-image translation is a class of vision and graphics problems where the goal is to learn the mapping between an input image and an output image using a training set of aligned image pairs. However, for many tasks, paired training data will not be available. We present an approach for learning to translate an image from a source domain $X$ to a target domain $Y$ in the absence of paired examples. Our goal is to learn a mapping $G: X \rightarrow Y$ such that the distribution of images from $G(X)$ is indistinguishable from the distribution $Y$ using an adversarial loss. Because this mapping is highly under-constrained, we couple it with an inverse mapping $F: Y \rightarrow X$ and introduce a cycle consistency loss to push $F(G(X)) \approx X$ (and vice versa). Qualitative results are presented on several tasks where paired training data does not exist, including collection style transfer, object transfiguration, season transfer, photo enhancement, etc. Quantitative comparisons against several prior methods demonstrate the superiority of our approach.
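
In symbols, the cycle consistency loss from the abstract is

$$\mathcal{L}_{cyc}(G,F) = \mathbb{E}_{x \sim p_{data}(x)}[\lVert F(G(x)) - x\rVert_1] + \mathbb{E}_{y \sim p_{data}(y)}[\lVert G(F(y)) - y\rVert_1]$$

and the full objective adds it, weighted by $\lambda$, to the two adversarial losses: $\mathcal{L}(G,F,D_X,D_Y) = \mathcal{L}_{GAN}(G,D_Y,X,Y) + \mathcal{L}_{GAN}(F,D_X,Y,X) + \lambda\,\mathcal{L}_{cyc}(G,F)$. An adversarial loss alone could map every input to any output that fools the discriminator; the cycle term is what ties each output back to its particular input.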

4,465 citations

Book ChapterDOI
08 Oct 2016
TL;DR: This paper proposes to learn the natural image manifold directly from data using a generative adversarial neural network, defines a class of image editing operations, and constrains their output to lie on that learned manifold at all times.
Abstract: Realistic image manipulation is challenging because it requires modifying the image appearance in a user-controlled way, while preserving the realism of the result. Unless the user has considerable artistic skill, it is easy to “fall off” the manifold of natural images while editing. In this paper, we propose to learn the natural image manifold directly from data using a generative adversarial neural network. We then define a class of image editing operations, and constrain their output to lie on that learned manifold at all times. The model automatically adjusts the output, keeping all edits as realistic as possible. All our manipulations are expressed in terms of constrained optimization and are applied in near-real time. We evaluate our algorithm on the task of realistic photo manipulation of shape and color. The presented method can further be used for changing one image to look like another, as well as generating novel imagery from scratch based on a user's scribbles.
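
The constrained optimization can be illustrated by its core step: projecting a user-edited image back onto the learned manifold by optimizing the generator's latent code. The sketch below is a generic version of that idea rather than the paper's exact formulation; the tiny untrained generator, the masked L2 loss, and all hyperparameters are stand-ins:

    import torch

    # Toy stand-in for a trained GAN generator (16-d latent -> 3x8x8 image).
    G = torch.nn.Sequential(torch.nn.Linear(16, 64), torch.nn.Tanh(),
                            torch.nn.Linear(64, 3 * 8 * 8), torch.nn.Tanh())

    def project_edit(edited, mask, steps=200, lr=0.05):
        """Find a latent code whose generated image matches the user's edit
        where mask == 1, i.e. the edit projected onto the manifold."""
        z = torch.zeros(1, 16, requires_grad=True)
        opt = torch.optim.Adam([z], lr=lr)
        for _ in range(steps):
            img = G(z).view(1, 3, 8, 8)
            loss = ((img - edited) ** 2 * mask).mean()  # penalize only edited pixels
            opt.zero_grad()
            loss.backward()
            opt.step()
        return G(z).view(3, 8, 8).detach()

    # Hypothetical usage: an image where the user painted a few pixels.
    edited = torch.rand(1, 3, 8, 8)
    mask = (torch.rand(1, 1, 8, 8) > 0.7).float()
    result = project_edit(edited, mask)

Whatever the optimizer returns is by construction an output of the generator, so the result stays on the learned manifold no matter how unrealistic the raw edit was.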

1,116 citations


Cites methods from "Data-driven hallucination of different times of day from a single outdoor photo"

  • ...The data term relaxes the color constancy assumption by introducing a locally affine color transfer model A [32] while the spatial and color regularization terms encourage smoothness in both the motion and color change....

  • ...We solve the objective by iteratively estimating the flow (u, v) using a traditional optical flow algorithm, and computing the color change A by solving a system of linear equations [32]....

Journal ArticleDOI
TL;DR: In this paper, a convolutional neural network is used to predict the coefficients of a locally affine model in bilateral space, which is then applied to the full-resolution image.
Abstract: Performance is a critical challenge in mobile image processing. Given a reference imaging pipeline, or even human-adjusted pairs of images, we seek to reproduce the enhancements and enable real-time evaluation. For this, we introduce a new neural network architecture inspired by bilateral grid processing and local affine color transforms. Using pairs of input/output images, we train a convolutional neural network to predict the coefficients of a locally-affine model in bilateral space. Our architecture learns to make local, global, and content-dependent decisions to approximate the desired image transformation. At runtime, the neural network consumes a low-resolution version of the input image, produces a set of affine transformations in bilateral space, upsamples those transformations in an edge-preserving fashion using a new slicing node, and then applies those upsampled transformations to the full-resolution image. Our algorithm processes high-resolution images on a smartphone in milliseconds, provides a real-time viewfinder at 1080p resolution, and matches the quality of state-of-the-art approximation techniques on a large class of image operators. Unlike previous work, our model is trained off-line from data and therefore does not require access to the original operator at runtime. This allows our model to learn complex, scene-dependent transformations for which no reference implementation is available, such as the photographic edits of a human retoucher.
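
The slicing step can be sketched independently of the network. Assume the CNN has already produced a coarse grid of 3x4 affine matrices, one per spatial cell and per guide-intensity bin; each full-resolution pixel then fetches its matrix by trilinear interpolation at (row, column, guide value) and applies it. The shapes, the single-channel guide, and the function name below are illustrative assumptions, not the paper's code:

    import numpy as np
    from scipy.ndimage import map_coordinates

    def slice_and_apply(grid, guide, image):
        """grid: (gh, gw, gd, 3, 4) affine matrices; guide: (h, w) in [0, 1];
        image: (h, w, 3).  Trilinearly interpolate a per-pixel affine
        transform from the grid and apply it."""
        gh, gw, gd = grid.shape[:3]
        h, w = guide.shape
        ys, xs = np.mgrid[0:h, 0:w]
        cy = ys * (gh - 1) / (h - 1)        # continuous grid row
        cx = xs * (gw - 1) / (w - 1)        # continuous grid column
        cz = guide * (gd - 1)               # guide picks the intensity bin
        coords = np.stack([cy.ravel(), cx.ravel(), cz.ravel()])
        coeffs = np.stack([map_coordinates(grid[..., i, j], coords, order=1)
                           for i in range(3) for j in range(4)], axis=-1)
        A = coeffs.reshape(h, w, 3, 4)
        x_h = np.concatenate([image, np.ones((h, w, 1))], axis=-1)  # homogeneous
        return np.einsum('hwij,hwj->hwi', A, x_h)

Because the guide supplies the third lookup coordinate, pixels on opposite sides of an intensity edge land in different bins and receive different transforms; that is what makes the upsampling edge-preserving.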

510 citations

References
Proceedings ArticleDOI
20 Jun 2005
TL;DR: It is shown experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection, and the influence of each stage of the computation on performance is studied.
Abstract: We study the question of feature sets for robust visual object recognition; adopting linear SVM based human detection as a test case. After reviewing existing edge and gradient based descriptors, we show experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection. We study the influence of each stage of the computation on performance, concluding that fine-scale gradients, fine orientation binning, relatively coarse spatial binning, and high-quality local contrast normalization in overlapping descriptor blocks are all important for good results. The new approach gives near-perfect separation on the original MIT pedestrian database, so we introduce a more challenging dataset containing over 1800 annotated human images with a large range of pose variations and backgrounds.
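
The time-hallucination paper uses these descriptors for scene matching (see the quoted passage below). A minimal example of computing one with scikit-image, with parameters echoing the abstract's conclusions; the random image is a placeholder:

    import numpy as np
    from skimage.feature import hog

    image = np.random.rand(128, 64)          # placeholder grayscale image
    descriptor = hog(image,
                     orientations=9,          # fine orientation binning
                     pixels_per_cell=(8, 8),  # relatively coarse spatial binning
                     cells_per_block=(2, 2),  # overlapping descriptor blocks
                     block_norm='L2-Hys')     # strong local contrast normalization
    print(descriptor.shape)                   # one flat feature vector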

31,952 citations


"Data-driven hallucination of differ..." refers methods in this paper

  • ...We tried the different descriptors suggested in Xiao’s paper, and found that the Histograms of Oriented Gradients (HOG) [Dalal and Triggs 2005] works well for our data....

Proceedings ArticleDOI
13 Jun 2010
TL;DR: This paper proposes the extensive Scene UNderstanding (SUN) database that contains 899 categories and 130,519 images and uses 397 well-sampled categories to evaluate numerous state-of-the-art algorithms for scene recognition and establish new bounds of performance.
Abstract: Scene categorization is a fundamental problem in computer vision. However, scene understanding research has been constrained by the limited scope of currently-used databases, which do not capture the full variety of scene categories. Whereas standard databases for object categorization contain hundreds of different classes of objects, the largest available dataset of scene categories contains only 15 classes. In this paper we propose the extensive Scene UNderstanding (SUN) database that contains 899 categories and 130,519 images. We use 397 well-sampled categories to evaluate numerous state-of-the-art algorithms for scene recognition and establish new bounds of performance. We measure human scene classification performance on the SUN database and compare this with computational methods. Additionally, we study a finer-grained scene representation to detect scenes embedded inside of larger scenes.
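
Combined with the HOG example above, scene matching of the kind the time-hallucination paper borrows (see the quoted passage below) reduces to nearest-neighbor search over descriptors. A brute-force sketch with hypothetical array sizes:

    import numpy as np

    def nearest_scene(query_desc, database_descs):
        """Index of the database descriptor closest to the query
        (Euclidean distance, brute force)."""
        dists = np.linalg.norm(database_descs - query_desc, axis=1)
        return int(np.argmin(dists))

    # Hypothetical sizes: 1000 database frames, 3780-dim descriptors each.
    database = np.random.rand(1000, 3780)
    query = np.random.rand(3780)
    best = nearest_scene(query, database)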

2,960 citations


"Data-driven hallucination of differ..." refers methods in this paper

  • ...We achieve these two tasks using existing scene and image matching techniques [Xiao et al. 2010]....

Proceedings ArticleDOI
01 Aug 2001
TL;DR: This work uses quilting as a fast and very simple texture synthesis algorithm which produces surprisingly good results for a wide range of textures and extends the algorithm to perform texture transfer — rendering an object with a texture taken from a different object.
Abstract: We present a simple image-based method of generating novel visual appearance in which a new image is synthesized by stitching together small patches of existing images. We call this process image quilting. First, we use quilting as a fast and very simple texture synthesis algorithm which produces surprisingly good results for a wide range of textures. Second, we extend the algorithm to perform texture transfer — rendering an object with a texture taken from a different object. More generally, we demonstrate how an image can be re-rendered in the style of a different image. The method works directly on the images and does not require 3D information.
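
A minimal sketch of the quilting idea for a single row of blocks: candidates are random blocks from the source texture, and the one whose left edge best matches the already-placed overlap region (by sum of squared differences) wins. Block size, overlap, and candidate count are arbitrary choices here, and the minimum-error boundary cut the paper stitches along is omitted:

    import numpy as np

    def quilt_row(texture, block=32, overlap=6, n_blocks=8, rng=None):
        """Quilt one row of texture left to right, keeping at each step the
        random candidate block whose left strip best matches the overlap."""
        rng = np.random.default_rng(0) if rng is None else rng
        H, W = texture.shape[:2]
        step = block - overlap
        out = np.zeros((block, step * n_blocks + overlap) + texture.shape[2:])

        def sample():
            y, x = rng.integers(0, H - block), rng.integers(0, W - block)
            return texture[y:y + block, x:x + block]

        out[:, :block] = sample()
        for i in range(1, n_blocks):
            x0 = i * step
            prev = out[:, x0:x0 + overlap]            # strip already placed
            cands = [sample() for _ in range(200)]
            errs = [np.sum((c[:, :overlap] - prev) ** 2) for c in cands]
            out[:, x0:x0 + block] = cands[int(np.argmin(errs))]
        return out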

2,649 citations


"Data-driven hallucination of differ..." refers background or methods in this paper

  • ...Image Analogies Our work relates to Image Analogies [Hertzmann et al. 2001; Efros and Freeman 2001] in the sense that input : hallucinated image :: matched frame : target frame where the matched and target frames are from the time-lapse video....

Journal ArticleDOI
TL;DR: This work uses a simple statistical analysis to impose one image's color characteristics on another, achieving color correction by choosing an appropriate source image and applying its characteristics to the target image.
Abstract: We use a simple statistical analysis to impose one image's color characteristics on another. We can achieve color correction by choosing an appropriate source image and applying its characteristics to another image.
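
The statistical analysis comes down to matching per-channel means and standard deviations. A minimal sketch done directly in RGB for brevity; the paper performs the matching in the decorrelated lαβ color space, so treat this as an approximation:

    import numpy as np

    def color_transfer(source, target):
        """Shift and scale each channel of `target` so its mean and standard
        deviation match `source`'s, imposing the source's color mood.
        Both are float arrays of shape (h, w, 3) in [0, 1]."""
        s_mu, s_std = source.mean(axis=(0, 1)), source.std(axis=(0, 1))
        t_mu, t_std = target.mean(axis=(0, 1)), target.std(axis=(0, 1))
        result = (target - t_mu) * (s_std / (t_std + 1e-8)) + s_mu
        return np.clip(result, 0.0, 1.0)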

2,615 citations


"Data-driven hallucination of differ..." refers methods in this paper

  • ...Approaches for color transfer such as [Reinhard et al. 2001; Pouli and Reinhard 2011; Pitie et al. 2005] apply a global color mapping to match color statistics between images....

  • ...Figure 12 compares our approach to techniques based on a global color transfer [Reinhard et al. 2001; Pitie et al. 2005]....

Journal ArticleDOI
TL;DR: This work built on another training-based super-resolution algorithm and developed a faster and simpler algorithm for one-pass super-resolution that requires only a nearest-neighbor search in the training set for a vector derived from each patch of local image data.
Abstract: We call methods for achieving high-resolution enlargements of pixel-based images super-resolution algorithms. Many applications in graphics or image processing could benefit from such resolution independence, including image-based rendering (IBR), texture mapping, enlarging consumer photographs, and converting NTSC video content to high-definition television. We built on another training-based super-resolution algorithm and developed a faster and simpler algorithm for one-pass super-resolution. Our algorithm requires only a nearest-neighbor search in the training set for a vector derived from each patch of local image data. This one-pass super-resolution algorithm is a step toward achieving resolution independence in image-based representations. We don't expect perfect resolution independence (even the polygon representation doesn't have that), but increasing the resolution independence of pixel-based representations is an important task for IBR.
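
The one-pass search can be sketched as a patchwise nearest-neighbor lookup. The paired arrays `lowres_patches` (mean-subtracted low-frequency patches) and `highfreq_patches` (their stored high-frequency detail), both flattened row-wise, are assumed precomputed; the paper additionally conditions the search on already-placed neighboring patches, which this simplification drops:

    import numpy as np

    def one_pass_superres(upsampled, lowres_patches, highfreq_patches, patch=5):
        """For each patch of the (e.g. bicubically) upsampled grayscale input,
        find the nearest training patch and paste in its high-frequency
        detail."""
        h, w = upsampled.shape
        detail = np.zeros_like(upsampled)
        for y in range(0, h - patch + 1, patch):
            for x in range(0, w - patch + 1, patch):
                p = upsampled[y:y + patch, x:x + patch].ravel()
                p = p - p.mean()  # normalize as the training patches were
                idx = np.argmin(((lowres_patches - p) ** 2).sum(axis=1))
                detail[y:y + patch, x:x + patch] = \
                    highfreq_patches[idx].reshape(patch, patch)
        return upsampled + detail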

2,576 citations


"Data-driven hallucination of differ..." refers methods in this paper

  • ...Image Collections Recent research demonstrates convincing graphics application with big data, such as scene completion [Hays and Efros 2007], tone adjustment [Bychkovsky et al. 2011], and super-resolution [Freeman et al. 2002]....
