Proceedings ArticleDOI

StackGAN: Text to Photo-Realistic Image Synthesis with Stacked Generative Adversarial Networks

TL;DR: This paper proposes Stacked Generative Adversarial Networks (StackGAN) to generate 256×256 photo-realistic images conditioned on text descriptions and introduces a novel Conditioning Augmentation technique that encourages smoothness in the latent conditioning manifold.
Abstract: Synthesizing high-quality images from text descriptions is a challenging problem in computer vision and has many practical applications. Samples generated by existing text-to-image approaches can roughly reflect the meaning of the given descriptions, but they fail to contain necessary details and vivid object parts. In this paper, we propose Stacked Generative Adversarial Networks (StackGAN) to generate 256×256 photo-realistic images conditioned on text descriptions. We decompose the hard problem into more manageable sub-problems through a sketch-refinement process. The Stage-I GAN sketches the primitive shape and colors of the object based on the given text description, yielding Stage-I low-resolution images. The Stage-II GAN takes Stage-I results and text descriptions as inputs, and generates high-resolution images with photo-realistic details. It is able to rectify defects in Stage-I results and add compelling details with the refinement process. To improve the diversity of the synthesized images and stabilize the training of the conditional GAN, we introduce a novel Conditioning Augmentation technique that encourages smoothness in the latent conditioning manifold. Extensive experiments and comparisons with state-of-the-art methods on benchmark datasets demonstrate that the proposed method achieves significant improvements on generating photo-realistic images conditioned on text descriptions.
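The Conditioning Augmentation described above maps each text embedding to a Gaussian with learned mean and diagonal covariance and samples the conditioning variable from it, with a KL penalty toward the standard normal acting as the smoothness regularizer. A minimal PyTorch sketch of that idea follows; the module name and dimensions are illustrative, not the authors' code.

```python
import torch
import torch.nn as nn

class ConditioningAugmentation(nn.Module):
    """Map a text embedding to a Gaussian and sample from it with the
    reparameterization trick (illustrative sketch, not the paper's code)."""
    def __init__(self, embed_dim=1024, cond_dim=128):
        super().__init__()
        # one linear layer predicts both the mean and the log-variance
        self.fc = nn.Linear(embed_dim, cond_dim * 2)

    def forward(self, text_embedding):
        stats = self.fc(text_embedding)
        mu, logvar = stats.chunk(2, dim=-1)
        std = torch.exp(0.5 * logvar)
        eps = torch.randn_like(std)          # reparameterization trick
        c_hat = mu + eps * std               # sampled conditioning variable
        # KL(N(mu, sigma^2) || N(0, I)) serves as the smoothness regularizer
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1)
        return c_hat, kl.mean()
```

The KL term is added to the generator objective, so nearby text embeddings yield overlapping conditioning distributions rather than isolated points.
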
Citations
Journal ArticleDOI
TL;DR: This survey presents existing methods for Data Augmentation, a data-space solution to the problem of limited data, along with promising developments and meta-level decisions for implementing Data Augmentation.
Abstract: Deep convolutional neural networks have performed remarkably well on many Computer Vision tasks. However, these networks are heavily reliant on big data to avoid overfitting. Overfitting refers to the phenomenon when a network learns a function with very high variance such as to perfectly model the training data. Unfortunately, many application domains do not have access to big data, such as medical image analysis. This survey focuses on Data Augmentation, a data-space solution to the problem of limited data. Data Augmentation encompasses a suite of techniques that enhance the size and quality of training datasets such that better Deep Learning models can be built using them. The image augmentation algorithms discussed in this survey include geometric transformations, color space augmentations, kernel filters, mixing images, random erasing, feature space augmentation, adversarial training, generative adversarial networks, neural style transfer, and meta-learning. The application of augmentation methods based on GANs are heavily covered in this survey. In addition to augmentation techniques, this paper will briefly discuss other characteristics of Data Augmentation such as test-time augmentation, resolution impact, final dataset size, and curriculum learning. This survey will present existing methods for Data Augmentation, promising developments, and meta-level decisions for implementing Data Augmentation. Readers will understand how Data Augmentation can improve the performance of their models and expand limited datasets to take advantage of the capabilities of big data.
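As a concrete illustration of the geometric, color-space, and random-erasing augmentation families surveyed above, here is a small, hypothetical torchvision pipeline; the specific transforms and parameters are illustrative, not prescribed by the survey.

```python
from torchvision import transforms

# Illustrative augmentation pipeline combining several families of transforms
train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224),                 # geometric: random crop + rescale
    transforms.RandomHorizontalFlip(),                 # geometric: mirror
    transforms.ColorJitter(0.2, 0.2, 0.2, 0.05),       # color space: brightness/contrast/saturation/hue
    transforms.ToTensor(),
    transforms.RandomErasing(p=0.25),                  # random erasing (operates on tensors)
])
```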

5,782 citations


Cites background from "StackGAN: Text to Photo-Realistic I..."

  • ...Many of the newer GAN architectures such as StackGAN [130] and Progressively-Growing GANs [34] are designed to produce higher resolution images....


Proceedings ArticleDOI
18 Jun 2018
TL;DR: In this paper, a new method for synthesizing high-resolution photo-realistic images from semantic label maps using conditional generative adversarial networks (conditional GANs) is presented.
Abstract: We present a new method for synthesizing high-resolution photo-realistic images from semantic label maps using conditional generative adversarial networks (conditional GANs). Conditional GANs have enabled a variety of applications, but the results are often limited to low-resolution and still far from realistic. In this work, we generate 2048 × 1024 visually appealing results with a novel adversarial loss, as well as new multi-scale generator and discriminator architectures. Furthermore, we extend our framework to interactive visual manipulation with two additional features. First, we incorporate object instance segmentation information, which enables object manipulations such as removing/adding objects and changing the object category. Second, we propose a method to generate diverse results given the same input, allowing users to edit the object appearance interactively. Human opinion studies demonstrate that our method significantly outperforms existing methods, advancing both the quality and the resolution of deep image synthesis and editing.
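One of the multi-scale ideas this abstract mentions, running the same discriminator architecture at several image resolutions, can be sketched roughly as follows. This is an illustrative PyTorch sketch, not the paper's implementation, and make_discriminator is a hypothetical factory for a patch discriminator.

```python
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleDiscriminator(nn.Module):
    """Run identical discriminators on progressively downsampled inputs
    (illustrative sketch of a multi-scale discriminator design)."""
    def __init__(self, make_discriminator, num_scales=3):
        super().__init__()
        self.discriminators = nn.ModuleList(
            [make_discriminator() for _ in range(num_scales)]
        )

    def forward(self, x):
        outputs = []
        for d in self.discriminators:
            outputs.append(d(x))
            # halve the resolution before feeding the next discriminator
            x = F.avg_pool2d(x, kernel_size=3, stride=2, padding=1)
        return outputs
```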

3,457 citations

Proceedings ArticleDOI
18 Jun 2018
TL;DR: StarGAN as discussed by the authors proposes a unified model architecture to perform image-to-image translation for multiple domains using only a single model, which leads to superior quality of translated images compared to existing models as well as the capability of flexibly translating an input image to any desired target domain.
Abstract: Recent studies have shown remarkable success in image-to-image translation for two domains. However, existing approaches have limited scalability and robustness in handling more than two domains, since different models should be built independently for every pair of image domains. To address this limitation, we propose StarGAN, a novel and scalable approach that can perform image-to-image translations for multiple domains using only a single model. Such a unified model architecture of StarGAN allows simultaneous training of multiple datasets with different domains within a single network. This leads to StarGAN's superior quality of translated images compared to existing models as well as the novel capability of flexibly translating an input image to any desired target domain. We empirically demonstrate the effectiveness of our approach on facial attribute transfer and facial expression synthesis tasks.
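A common way to realize the single-generator, multi-domain conditioning this abstract describes is to tile the target-domain label over the spatial grid and concatenate it with the input image channels. The helper below is a hypothetical sketch of that conditioning step, not StarGAN's actual code.

```python
import torch

def concat_domain_label(image, label):
    """Tile a one-hot domain label over the spatial grid and concatenate it
    to the image channels (illustrative label-conditioning sketch)."""
    # image: (N, C, H, W); label: (N, num_domains)
    n, _, h, w = image.size()
    label_map = label.view(n, -1, 1, 1).expand(n, label.size(1), h, w)
    return torch.cat([image, label_map], dim=1)

# Usage sketch: the generator sees the source image plus the target domain,
# e.g. x_fake = G(concat_domain_label(x_real, target_label))
```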

2,479 citations

Posted Content
TL;DR: This work redesigns the generator normalization, revisits progressive growing, and regularizes the generator to encourage good conditioning in the mapping from latent codes to images, thereby redefining the state of the art in unconditional image modeling.
Abstract: The style-based GAN architecture (StyleGAN) yields state-of-the-art results in data-driven unconditional generative image modeling. We expose and analyze several of its characteristic artifacts, and propose changes in both model architecture and training methods to address them. In particular, we redesign the generator normalization, revisit progressive growing, and regularize the generator to encourage good conditioning in the mapping from latent codes to images. In addition to improving image quality, this path length regularizer yields the additional benefit that the generator becomes significantly easier to invert. This makes it possible to reliably attribute a generated image to a particular network. We furthermore visualize how well the generator utilizes its output resolution, and identify a capacity problem, motivating us to train larger models for additional quality improvements. Overall, our improved model redefines the state of the art in unconditional image modeling, both in terms of existing distribution quality metrics as well as perceived image quality.
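The path length regularizer mentioned above penalizes how much the norm of the generator's Jacobian (from latent codes to images) deviates from a running target. Below is a rough PyTorch sketch of one common way to implement such a penalty; it is an assumption-laden illustration, not the official StyleGAN2 code, and it assumes 2-D latents of shape (N, latent_dim).

```python
import math
import torch

def path_length_penalty(fake_images, latents, mean_path_length, decay=0.01):
    """Rough sketch of a path-length regularizer: penalize the norm of
    J^T y (y ~ N(0, I)) deviating from a running-average target."""
    noise = torch.randn_like(fake_images) / math.sqrt(
        fake_images.shape[2] * fake_images.shape[3]
    )
    grad, = torch.autograd.grad(
        outputs=(fake_images * noise).sum(), inputs=latents, create_graph=True
    )
    path_lengths = grad.pow(2).sum(dim=1).sqrt()
    # exponential moving average of the observed path lengths
    mean_path_length = mean_path_length + decay * (
        path_lengths.mean().detach() - mean_path_length
    )
    penalty = (path_lengths - mean_path_length).pow(2).mean()
    return penalty, mean_path_length
```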

2,411 citations

Proceedings ArticleDOI
18 Mar 2019
TL;DR: Spatially-adaptive normalization is proposed, a simple but effective layer for synthesizing photorealistic images given an input semantic layout that allows users to easily control the style and content of image synthesis results as well as create multi-modal results.
Abstract: We propose spatially-adaptive normalization, a simple but effective layer for synthesizing photorealistic images given an input semantic layout. Previous methods directly feed the semantic layout as input to the network, forcing the network to memorize the information throughout all the layers. Instead, we propose using the input layout for modulating the activations in normalization layers through a spatially-adaptive, learned affine transformation. Experiments on several challenging datasets demonstrate the superiority of our method compared to existing approaches, regarding both visual fidelity and alignment with input layouts. Finally, our model allows users to easily control the style and content of image synthesis results as well as create multi-modal results. Code is available upon publication.
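The core idea, normalizing activations and then modulating them with a per-pixel scale and bias predicted from the semantic layout, can be sketched as below. The layer and parameter names are illustrative; this is not the published implementation.

```python
import torch.nn as nn
import torch.nn.functional as F

class SpatiallyAdaptiveNorm(nn.Module):
    """Illustrative sketch: normalize activations, then modulate them with a
    per-pixel scale and bias predicted from the semantic layout."""
    def __init__(self, num_features, label_channels, hidden=128):
        super().__init__()
        self.norm = nn.BatchNorm2d(num_features, affine=False)
        self.shared = nn.Sequential(
            nn.Conv2d(label_channels, hidden, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        self.gamma = nn.Conv2d(hidden, num_features, kernel_size=3, padding=1)
        self.beta = nn.Conv2d(hidden, num_features, kernel_size=3, padding=1)

    def forward(self, x, segmap):
        normalized = self.norm(x)
        # resize the layout to the feature map's spatial size
        segmap = F.interpolate(segmap, size=x.shape[2:], mode='nearest')
        h = self.shared(segmap)
        return normalized * (1 + self.gamma(h)) + self.beta(h)
```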

2,159 citations


Cites background from "StackGAN: Text to Photo-Realistic I..."

  • ...Researchers have explored various models for generating images based on text [18,41,49, 52]....


References
Proceedings ArticleDOI
27 Jun 2016
TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won the 1st place on the ILSVRC 2015 classification task.
Abstract: Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers, 8× deeper than VGG nets [40] but still having lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers. The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to the ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.
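The residual reformulation amounts to having a stack of layers learn F(x) while the block outputs F(x) + x through an identity shortcut. A minimal, illustrative PyTorch residual block (not the exact ImageNet architecture) looks like this:

```python
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    """Basic residual block: the convolutions learn F(x) and the block
    returns F(x) + x via an identity shortcut (illustrative sketch)."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return F.relu(out + x)   # residual (identity shortcut) connection
```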

123,388 citations

Journal ArticleDOI
08 Dec 2014
TL;DR: A new framework for estimating generative models via an adversarial process, in which two models are simultaneously trained: a generative model G that captures the data distribution and a discriminative model D that estimates the probability that a sample came from the training data rather than G.
Abstract: We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G. The training procedure for G is to maximize the probability of D making a mistake. This framework corresponds to a minimax two-player game. In the space of arbitrary functions G and D, a unique solution exists, with G recovering the training data distribution and D equal to ½ everywhere. In the case where G and D are defined by multilayer perceptrons, the entire system can be trained with backpropagation. There is no need for any Markov chains or unrolled approximate inference networks during either training or generation of samples. Experiments demonstrate the potential of the framework through qualitative and quantitative evaluation of the generated samples.
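The alternating minimax training this abstract describes can be illustrated with a single training step in PyTorch. The sketch below uses the common non-saturating heuristic for the generator and assumes D ends in a sigmoid and returns probabilities of shape (N, 1); the function and argument names are hypothetical.

```python
import torch
import torch.nn.functional as F

def gan_training_step(G, D, real, opt_g, opt_d, latent_dim=100):
    """One alternating update of the two-player minimax game (sketch)."""
    n = real.size(0)
    ones = torch.ones(n, 1, device=real.device)
    zeros = torch.zeros(n, 1, device=real.device)

    # --- update D: maximize log D(x) + log(1 - D(G(z))) ---
    z = torch.randn(n, latent_dim, device=real.device)
    fake = G(z).detach()
    d_loss = F.binary_cross_entropy(D(real), ones) + \
             F.binary_cross_entropy(D(fake), zeros)
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # --- update G: maximize log D(G(z)), i.e. the chance D is fooled ---
    z = torch.randn(n, latent_dim, device=real.device)
    g_loss = F.binary_cross_entropy(D(G(z)), ones)
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()
```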

38,211 citations


"StackGAN: Text to Photo-Realistic I..." refers background in this paper

  • ...Generative Adversarial Networks (GAN) [8] are composed of two models that are alternatively trained to compete with each other....


  • ...Recently, Generative Adversarial Networks (GAN) [8] have shown promising performance for generating sharper images....


  • ...Recently, Generative Adversarial Networks (GAN) [8, 5, 23] have shown promising results in synthesizing real-world images....


Proceedings Article
Sergey Ioffe, Christian Szegedy
06 Jul 2015
TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
Abstract: Training Deep Neural Networks is complicated by the fact that the distribution of each layer's inputs changes during training, as the parameters of the previous layers change. This slows down the training by requiring lower learning rates and careful parameter initialization, and makes it notoriously hard to train models with saturating nonlinearities. We refer to this phenomenon as internal covariate shift, and address the problem by normalizing layer inputs. Our method draws its strength from making normalization a part of the model architecture and performing the normalization for each training mini-batch. Batch Normalization allows us to use much higher learning rates and be less careful about initialization, and in some cases eliminates the need for Dropout. Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin. Using an ensemble of batch-normalized networks, we improve upon the best published result on ImageNet classification: reaching 4.82% top-5 test error, exceeding the accuracy of human raters.
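The normalization step described here standardizes each channel using the statistics of the current mini-batch and then rescales with learned parameters. A small training-mode sketch follows (running statistics for inference are omitted; names are illustrative):

```python
import torch

def batch_norm_2d(x, gamma, beta, eps=1e-5):
    """Normalize each channel over the batch and spatial dimensions, then
    apply the learned scale and shift (training-mode sketch)."""
    # x: (N, C, H, W); gamma, beta: (C,)
    mean = x.mean(dim=(0, 2, 3), keepdim=True)
    var = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)
    x_hat = (x - mean) / torch.sqrt(var + eps)
    return gamma.view(1, -1, 1, 1) * x_hat + beta.view(1, -1, 1, 1)
```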

30,843 citations


"StackGAN: Text to Photo-Realistic I..." refers methods in this paper

  • ...Batch normalization [11] and ReLU activation are applied after every convolution except the last one....


Book ChapterDOI
06 Sep 2014
TL;DR: A new dataset is presented with the goal of advancing the state of the art in object recognition by placing it in the broader context of scene understanding, achieved by gathering images of complex everyday scenes containing common objects in their natural context.
Abstract: We present a new dataset with the goal of advancing the state-of-the-art in object recognition by placing the question of object recognition in the context of the broader question of scene understanding. This is achieved by gathering images of complex everyday scenes containing common objects in their natural context. Objects are labeled using per-instance segmentations to aid in precise object localization. Our dataset contains photos of 91 object types that would be easily recognizable by a 4-year-old. With a total of 2.5 million labeled instances in 328k images, the creation of our dataset drew upon extensive crowd worker involvement via novel user interfaces for category detection, instance spotting and instance segmentation. We present a detailed statistical analysis of the dataset in comparison to PASCAL, ImageNet, and SUN. Finally, we provide baseline performance analysis for bounding box and segmentation detection results using a Deformable Parts Model.

30,462 citations


"StackGAN: Text to Photo-Realistic I..." refers methods in this paper

  • ...We compare our results with the state-of-the-art text-to-image methods [24, 26] on CUB, Oxford-102 and COCO datasets....


  • ...To show the generalization capability of our approach, a more challenging dataset, MS COCO [16] is also utilized for evaluation....


  • ...Following the experimental setup in [26], we directly use the training and validation sets provided by COCO, meanwhile we split CUB and Oxford-102 into class-disjoint training and test sets....


  • ...In our experiments, we directly use the pre-trained Inception model for COCO dataset....


  • ...Each image in COCO has 5 descriptions, while 10 descriptions are provided by [25] for every image in CUB and Oxford-102 datasets....


Proceedings Article
01 Jan 2014
TL;DR: A stochastic variational inference and learning algorithm that scales to large datasets and, under some mild differentiability conditions, even works in the intractable case is introduced.
Abstract: How can we perform efficient inference and learning in directed probabilistic models, in the presence of continuous latent variables with intractable posterior distributions, and large datasets? We introduce a stochastic variational inference and learning algorithm that scales to large datasets and, under some mild differentiability conditions, even works in the intractable case. Our contributions are two-fold. First, we show that a reparameterization of the variational lower bound yields a lower bound estimator that can be straightforwardly optimized using standard stochastic gradient methods. Second, we show that for i.i.d. datasets with continuous latent variables per datapoint, posterior inference can be made especially efficient by fitting an approximate inference model (also called a recognition model) to the intractable posterior using the proposed lower bound estimator. Theoretical advantages are reflected in experimental results.
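The reparameterization mentioned here, which StackGAN also uses for its Conditioning Augmentation (see the excerpts below), expresses a sample from N(mu, sigma^2) as a deterministic function of the distribution parameters plus unit Gaussian noise, so gradients can flow through the sampling step. A minimal, illustrative sketch:

```python
import torch

def reparameterize(mu, logvar):
    """Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I),
    so the sampling step is differentiable w.r.t. mu and logvar (sketch)."""
    std = torch.exp(0.5 * logvar)
    eps = torch.randn_like(std)
    return mu + eps * std

# ELBO sketch: reconstruction term minus the KL divergence to the prior,
# e.g. elbo = log_likelihood(x, decoder(z)) - kl_divergence(mu, logvar)
```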

20,769 citations


"StackGAN: Text to Photo-Realistic I..." refers background or methods in this paper

  • ...Variational Autoencoders (VAE) [13, 28] formulated the problem with probabilistic graphical models whose goal was to maximize the lower bound of data likelihood....


  • ...Using the reparameterization trick introduced in [13], both μ0(φt) and Σ0(φt) are learned jointly with the rest of the network....
