Multimodal Unsupervised Image-to-Image Translation

Open Access, Posted Content

- 12 Apr 2018 -

arXiv: Computer Vision and Pattern Recog...

TLDR

MUNIT is a Multimodal Unsupervised Image-to-image Translation framework that assumes an image representation can be decomposed into a domain-invariant content code and a style code that captures domain-specific properties.

Abstract:

Unsupervised image-to-image translation is an important and challenging problem in computer vision. Given an image in the source domain, the goal is to learn the conditional distribution of corresponding images in the target domain, without seeing any pairs of corresponding images. While this conditional distribution is inherently multimodal, existing approaches make an overly simplified assumption, modeling it as a deterministic one-to-one mapping. As a result, they fail to generate diverse outputs from a given source domain image. To address this limitation, we propose a Multimodal Unsupervised Image-to-image Translation (MUNIT) framework. We assume that the image representation can be decomposed into a content code that is domain-invariant, and a style code that captures domain-specific properties. To translate an image to another domain, we recombine its content code with a random style code sampled from the style space of the target domain. We analyze the proposed framework and establish several theoretical results. Extensive experiments with comparisons to state-of-the-art approaches further demonstrate the advantage of the proposed framework. Moreover, our framework allows users to control the style of translation outputs by providing an example style image. Code and pretrained models are available at this https URL
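
The recombination step described in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: images are stand-in flat feature vectors, the hypothetical `encode` and `decode` functions simply split and concatenate those vectors instead of using learned networks, and the style space is assumed to be a unit Gaussian.

```python
import random

# Hypothetical sizes chosen only for this toy example.
CONTENT_DIM = 4
STYLE_DIM = 2


def encode(image):
    """Toy encoder: split a flat vector into (content code, style code)."""
    return image[:CONTENT_DIM], image[CONTENT_DIM:]


def decode(content, style):
    """Toy decoder: rebuild an 'image' from a content code and a style code."""
    return list(content) + list(style)


def sample_style(rng):
    """Draw a random style code (assumed here to follow a unit Gaussian)."""
    return [rng.gauss(0.0, 1.0) for _ in range(STYLE_DIM)]


def translate(image, rng, n_samples=3):
    """Keep the content code, pair it with several random style codes."""
    content, _ = encode(image)
    return [decode(content, sample_style(rng)) for _ in range(n_samples)]


def translate_with_example(image, style_image):
    """Keep the content of `image`, take the style code of `style_image`."""
    content, _ = encode(image)
    _, style = encode(style_image)
    return decode(content, style)


if __name__ == "__main__":
    rng = random.Random(0)
    source = [1.0, 2.0, 3.0, 4.0, 0.5, -0.5]
    for output in translate(source, rng):
        print(output)
```

Each call to `translate` returns several outputs that share the source content code but differ in style, which is the sense in which the mapping is multimodal. `translate_with_example` mirrors the example-guided control mentioned in the abstract. A real system would use separate learned encoders and decoders for each domain.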

Citations

Journal Article

Study Report on “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”

杉山拓海

- 12 Sep 2017 -

Computers & Graphics

Proceedings Article

Semantic Image Synthesis With Spatially-Adaptive Normalization

Taesung Park, +3 more

TL;DR: Spatially-adaptive normalization is proposed, a simple but effective layer for synthesizing photorealistic images given an input semantic layout that allows users to easily control the style and content of image synthesis results as well as create multi-modal results.

Posted Content

A Style-Based Generator Architecture for Generative Adversarial Networks

Tero Karras, +2 more

- 12 Dec 2018 -

arXiv: Neural and Evolutionary Computing

TL;DR: This article proposed an alternative generator architecture for GANs, borrowing from style transfer literature, which leads to an automatically learned, unsupervised separation of high-level attributes (e.g., pose and identity when trained on human faces) and stochastic variation in the generated images.

Posted Content

StarGAN v2: Diverse Image Synthesis for Multiple Domains

Yunjey Choi, +3 more

- 04 Dec 2019 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: StarGAN v2 is a single framework that addresses two limitations of image-to-image translation models, limited diversity and the need for multiple models to cover all domains, and shows significantly improved results over the baselines.

Journal Article

Diverse Image-to-Image Translation via Disentangled Representations

Hsin-Ying Lee, +4 more

- 01 Nov 2020 -

International Journal of Computer Vision

TL;DR: This work presents an approach based on disentangled representation for producing diverse outputs without paired training images, and proposes to embed images onto two spaces: a domain-invariant content space capturing shared information across domains and a domain-specific attribute space.

References

Proceedings Article

Deep Residual Learning for Image Recognition

Kaiming He, +3 more

TL;DR: A residual learning framework is proposed to ease the training of networks that are substantially deeper than those used previously; it won 1st place on the ILSVRC 2015 classification task.

Proceedings Article

Adam: A Method for Stochastic Optimization

Diederik P. Kingma, +1 more

TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.

Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

Alex Krizhevsky, +2 more

TL;DR: A deep convolutional neural network consisting of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax achieved state-of-the-art performance on ImageNet classification.

Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

Karen Simonyan, +1 more

TL;DR: In this paper, the authors investigated the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting and showed that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 layers.

Journal Article

Generative Adversarial Nets

Ian Goodfellow, +7 more

TL;DR: A new framework for estimating generative models via an adversarial process, in which two models are simultaneously trained: a generative model G that captures the data distribution and a discriminative model D that estimates the probability that a sample came from the training data rather than G.

Related Papers (5)

Generative Adversarial Nets

Image-to-Image Translation with Conditional Adversarial Networks

Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks

Deep Residual Learning for Image Recognition

Perceptual Losses for Real-Time Style Transfer and Super-Resolution