Home
/
Authors
/
Ting-Chun Wang

Author

Ting-Chun Wang

Other affiliations: University of California, University of California, Berkeley, National Taiwan University

Bio: Ting-Chun Wang is an academic researcher from Nvidia. The author has contributed to research in topics: Rendering (computer graphics) & Convolutional neural network. The author has an hindex of 26, co-authored 42 publications receiving 7638 citations. Previous affiliations of Ting-Chun Wang include University of California & University of California, Berkeley.

Papers

PDF

Open Access

More filters

Proceedings Article•DOI•

High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs

[...]

Ting-Chun Wang¹, Ming-Yu Liu¹, Jun-Yan Zhu², Andrew Tao¹, Jan Kautz¹, Bryan Catanzaro¹ - Show less +2 more•Institutions (2)

Nvidia¹, University of California, Berkeley²

18 Jun 2018

TL;DR: In this paper, a new method for synthesizing high-resolution photo-realistic images from semantic label maps using conditional generative adversarial networks (conditional GANs) is presented.

...read moreread less

Abstract: We present a new method for synthesizing high-resolution photo-realistic images from semantic label maps using conditional generative adversarial networks (conditional GANs). Conditional GANs have enabled a variety of applications, but the results are often limited to low-resolution and still far from realistic. In this work, we generate 2048 A— 1024 visually appealing results with a novel adversarial loss, as well as new multi-scale generator and discriminator architectures. Furthermore, we extend our framework to interactive visual manipulation with two additional features. First, we incorporate object instance segmentation information, which enables object manipulations such as removing/adding objects and changing the object category. Second, we propose a method to generate diverse results given the same input, allowing users to edit the object appearance interactively. Human opinion studies demonstrate that our method significantly outperforms existing methods, advancing both the quality and the resolution of deep image synthesis and editing.

...read moreread less

3,457 citations

Proceedings Article•DOI•

Semantic Image Synthesis With Spatially-Adaptive Normalization

[...]

Taesung Park¹, Ming-Yu Liu², Ting-Chun Wang², Jun-Yan Zhu³•Institutions (3)

University of California, Berkeley¹, Nvidia², Massachusetts Institute of Technology³

18 Mar 2019

TL;DR: S spatially-adaptive normalization is proposed, a simple but effective layer for synthesizing photorealistic images given an input semantic layout that allows users to easily control the style and content of image synthesis results as well as create multi-modal results.

...read moreread less

Abstract: We propose spatially-adaptive normalization, a simple but effective layer for synthesizing photorealistic images given an input semantic layout. Previous methods directly feed the semantic layout as input to the network, forcing the network to memorize the information throughout all the layers. Instead, we propose using the input layout for modulating the activations in normalization layers through a spatially-adaptive, learned affine transformation. Experiments on several challenging datasets demonstrate the superiority of our method compared to existing approaches, regarding both visual fidelity and alignment with input layouts. Finally, our model allows users to easily control the style and content of image synthesis results as well as create multi-modal results. Code is available upon publication.

...read moreread less

2,159 citations

Book Chapter•DOI•

Image Inpainting for Irregular Holes Using Partial Convolutions

[...]

Guilin Liu¹, Fitsum A. Reda¹, Kevin J. Shih¹, Ting-Chun Wang¹, Andrew Tao¹, Bryan Catanzaro¹ - Show less +2 more•Institutions (1)

Nvidia¹

08 Sep 2018

TL;DR: This work proposes the use of partial convolutions, where the convolution is masked and renormalized to be conditioned on only valid pixels, and outperforms other methods for irregular masks.

...read moreread less

Abstract: Existing deep learning based image inpainting methods use a standard convolutional network over the corrupted image, using convolutional filter responses conditioned on both valid pixels as well as the substitute values in the masked holes (typically the mean value). This often leads to artifacts such as color discrepancy and blurriness. Post-processing is usually used to reduce such artifacts, but are expensive and may fail. We propose the use of partial convolutions, where the convolution is masked and renormalized to be conditioned on only valid pixels. We further include a mechanism to automatically generate an updated mask for the next layer as part of the forward pass. Our model outperforms other methods for irregular masks. We show qualitative and quantitative comparisons with other methods to validate our approach.

...read moreread less

1,606 citations

Posted Content•

Image Inpainting for Irregular Holes Using Partial Convolutions

[...]

Guilin Liu¹, Fitsum A. Reda¹, Kevin J. Shih¹, Ting-Chun Wang¹, Andrew Tao¹, Bryan Catanzaro¹ - Show less +2 more•Institutions (1)

Nvidia¹

20 Apr 2018-arXiv: Computer Vision and Pattern Recognition

TL;DR: In this paper, the convolution is masked and renormalized to be conditioned on only valid pixels, and a mechanism is proposed to automatically generate an updated mask for the next layer as part of the forward pass.

...read moreread less

536 citations

Proceedings Article•

Video-to-Video Synthesis

[...]

Ting-Chun Wang¹, Ming-Yu Liu¹, Jun-Yan Zhu², Guilin Liu³, Andrew Tao¹, Jan Kautz¹, Bryan Catanzaro¹ - Show less +3 more•Institutions (3)

Nvidia¹, University of California, Berkeley², Adobe Systems³

20 Aug 2018

TL;DR: This paper proposes a novel video-to-video synthesis approach under the generative adversarial learning framework, capable of synthesizing 2K resolution videos of street scenes up to 30 seconds long, which significantly advances the state-of-the-art of video synthesis.

...read moreread less

Abstract: We study the problem of video-to-video synthesis, whose goal is to learn a mapping function from an input source video (e.g., a sequence of semantic segmentation masks) to an output photorealistic video that precisely depicts the content of the source video. While its image counterpart, the image-to-image translation problem, is a popular topic, the video-to-video synthesis problem is less explored in the literature. Without modeling temporal dynamics, directly applying existing image synthesis approaches to an input video often results in temporally incoherent videos of low visual quality. In this paper, we propose a video-to-video synthesis approach under the generative adversarial learning framework. Through carefully-designed generators and discriminators, coupled with a spatio-temporal adversarial objective, we achieve high-resolution, photorealistic, temporally coherent video results on a diverse set of input formats including segmentation masks, sketches, and poses. Experiments on multiple benchmarks show the advantage of our method compared to strong baselines. In particular, our model is capable of synthesizing 2K resolution videos of street scenes up to 30 seconds long, which significantly advances the state-of-the-art of video synthesis. Finally, we apply our method to future video prediction, outperforming several competing systems. Code, models, and more results are available at our website: https://github.com/NVIDIA/vid2vid. (Please use Adobe Reader to see the embedded videos in the paper.)

...read moreread less

437 citations

1
2
3
4
…
5
6
7
8
9

Collapse

Cited by

PDF

Open Access

More filters

Journal Article•

“Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”の学習報告

[...]

杉山拓海

12 Sep 2017-Computers & Graphics

3,940 citations

Proceedings Article•DOI•

Semantic Image Synthesis With Spatially-Adaptive Normalization

[...]

Taesung Park¹, Ming-Yu Liu², Ting-Chun Wang², Jun-Yan Zhu³•Institutions (3)

University of California, Berkeley¹, Nvidia², Massachusetts Institute of Technology³

18 Mar 2019

...read moreread less

2,159 citations

Posted Content•

Self-Attention Generative Adversarial Networks

[...]

Han Zhang¹, Ian Goodfellow¹, Dimitris N. Metaxas², Augustus Odena¹•Institutions (2)

Google¹, Rutgers University²

21 May 2018-arXiv: Machine Learning

TL;DR: Self-Attention Generative Adversarial Network (SAGAN) as mentioned in this paper uses attention-driven, long-range dependency modeling for image generation tasks and achieves state-of-the-art results.

...read moreread less

Abstract: In this paper, we propose the Self-Attention Generative Adversarial Network (SAGAN) which allows attention-driven, long-range dependency modeling for image generation tasks. Traditional convolutional GANs generate high-resolution details as a function of only spatially local points in lower-resolution feature maps. In SAGAN, details can be generated using cues from all feature locations. Moreover, the discriminator can check that highly detailed features in distant portions of the image are consistent with each other. Furthermore, recent work has shown that generator conditioning affects GAN performance. Leveraging this insight, we apply spectral normalization to the GAN generator and find that this improves training dynamics. The proposed SAGAN achieves the state-of-the-art results, boosting the best published Inception score from 36.8 to 52.52 and reducing Frechet Inception distance from 27.62 to 18.65 on the challenging ImageNet dataset. Visualization of the attention layers shows that the generator leverages neighborhoods that correspond to object shapes rather than local regions of fixed shape.

...read moreread less

2,106 citations

Proceedings Article•

A morphable model for the synthesis of 3D faces

[...]

Matthew Turk

01 Jan 1999

2,010 citations

Book Chapter•DOI•

Multimodal Unsupervised Image-to-Image Translation

[...]

Xun Huang¹, Ming-Yu Liu², Serge Belongie¹, Jan Kautz²•Institutions (2)

Cornell University¹, Nvidia²

08 Sep 2018

TL;DR: In this article, the authors propose a multimodal unsupervised image-to-image (MUNIT) framework, where the image representation can be decomposed into a content code that is domain-invariant and a style code that captures domain-specific properties.

...read moreread less

Abstract: Unsupervised image-to-image translation is an important and challenging problem in computer vision. Given an image in the source domain, the goal is to learn the conditional distribution of corresponding images in the target domain, without seeing any examples of corresponding image pairs. While this conditional distribution is inherently multimodal, existing approaches make an overly simplified assumption, modeling it as a deterministic one-to-one mapping. As a result, they fail to generate diverse outputs from a given source domain image. To address this limitation, we propose a Multimodal Unsupervised Image-to-image \(\text{ Translation } \text{(MUNIT) }\) framework. We assume that the image representation can be decomposed into a content code that is domain-invariant, and a style code that captures domain-specific properties. To translate an image to another domain, we recombine its content code with a random style code sampled from the style space of the target domain. We analyze the proposed framework and establish several theoretical results. Extensive experiments with comparisons to state-of-the-art approaches further demonstrate the advantage of the proposed framework. Moreover, our framework allows users to control the style of translation outputs by providing an example style image. Code and pretrained models are available at https://github.com/nvlabs/MUNIT.

...read moreread less

1,874 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse