Image-to-image Translation via Hierarchical Style Disentanglement
Xinyang Li, Shengchuan Zhang, Jie Hu, Liujuan Cao, Xiaopeng Hong, Xudong Mao, Feiyue Huang, Yongjian Wu, Rongrong Ji
pp. 8639–8648
TLDR
The authors propose Hierarchical Style Disentanglement (HiSD), which disentangles labels into independent tags, exclusive attributes, and disentangled styles from top to bottom.
Abstract
Recently, image-to-image translation has made significant progress in achieving both multi-label (i.e., translation conditioned on different labels) and multi-style (i.e., generation with diverse styles) tasks. However, due to the unexplored independence and exclusiveness in the labels, existing endeavors are defeated by involving uncontrolled manipulations to the translation results. In this paper, we propose Hierarchical Style Disentanglement (HiSD) to address this issue. Specifically, we organize the labels into a hierarchical tree structure, in which independent tags, exclusive attributes, and disentangled styles are allocated from top to bottom. Correspondingly, a new translation process is designed to adapt the above structure, in which the styles are identified for controllable translations. Both qualitative and quantitative results on the CelebA-HQ dataset verify the ability of the proposed HiSD. The code has been released at https://github.com/imlixinyang/HiSD.
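The tag/attribute/style hierarchy described in the abstract can be pictured as a small data structure. Below is an illustrative sketch (not the authors' released code); the tag names, attribute names, and style dimension are hypothetical examples in the spirit of the CelebA-HQ experiments.

```python
# Hypothetical sketch of HiSD's three-level label hierarchy:
# independent tags -> exclusive attributes -> disentangled styles.
label_hierarchy = {
    # Independent tags: each can be manipulated without affecting the others.
    "bangs": {
        # Exclusive attributes: exactly one holds per image for this tag.
        "with": {"style_dim": 256},     # a disentangled style code per attribute
        "without": {"style_dim": 256},
    },
    "glasses": {
        "with": {"style_dim": 256},
        "without": {"style_dim": 256},
    },
}

def attributes_of(tag: str) -> list[str]:
    """Return the mutually exclusive attributes under one independent tag."""
    return list(label_hierarchy[tag])

print(attributes_of("bangs"))  # the two exclusive attributes of this tag
```

A translation then picks one tag, one target attribute under it, and a style code for that attribute, which is what makes the manipulation controllable.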
Citations
Proceedings ArticleDOI
Wavelet Knowledge Distillation: Towards Efficient Image-to-Image Translation
TL;DR: This paper proposes a novel knowledge distillation method, referred to as wavelet knowledge distillation, which first decomposes the generated images of teachers into different frequency bands with a discrete wavelet transformation and then distills only the high-frequency bands.
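The decompose-then-distill idea can be sketched in a few lines. This is a toy illustration, not the paper's implementation: it uses a simplified, unnormalized one-level Haar transform on 1D signals in place of a full 2D discrete wavelet transform, and a plain squared error on the high-frequency band.

```python
def haar_1d(x: list[float]) -> tuple[list[float], list[float]]:
    # One-level (unnormalized) Haar split: pairwise averages form the
    # low-frequency band, pairwise differences form the high-frequency band.
    low = [(x[2 * i] + x[2 * i + 1]) / 2 for i in range(len(x) // 2)]
    high = [(x[2 * i] - x[2 * i + 1]) / 2 for i in range(len(x) // 2)]
    return low, high

def wavelet_distill_loss(teacher: list[float], student: list[float]) -> float:
    # Distill only the high-frequency band, ignoring the low-frequency one.
    _, h_teacher = haar_1d(teacher)
    _, h_student = haar_1d(student)
    return sum((a - b) ** 2 for a, b in zip(h_teacher, h_student)) / len(h_teacher)

low, high = haar_1d([1.0, 3.0, 2.0, 2.0])
print(low, high)  # [2.0, 2.0] [-1.0, 0.0]
```

A real pipeline would apply a 2D transform (e.g., via PyWavelets) to teacher and student generator outputs and backpropagate the band loss into the student.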
Proceedings ArticleDOI
Controllable 3D Face Synthesis with Conditional Generative Occupancy Fields
TL;DR: A new NeRF-based conditional 3D face synthesis framework is proposed, which enables 3D controllability over the generated face images by imposing explicit 3D conditions from 3D face priors; at its core is a conditional Generative Occupancy Field (cGOF) that effectively enforces the shape of the generated face to commit to a given 3D Morphable Model (3DMM) mesh.
Proceedings ArticleDOI
Exploring Negatives in Contrastive Learning for Unpaired Image-to-Image Translation
TL;DR: This paper studies the negatives from an information-theoretic perspective and introduces a new negative Pruning technique for Unpaired image-to-image Translation (PUT) by sparsifying and ranking the patches.
Journal ArticleDOI
Hierarchical Fine-Grained Image Forgery Detection and Localization
TL;DR: This paper proposes a hierarchical fine-grained formulation for image forgery detection and localization (IFDL) representation learning, in which each branch of the feature extractor learns to classify forgery attributes at one level, while localization and classification modules segment the pixel-level forgery region and detect image-level forgery, respectively.
References
Journal ArticleDOI
Generative Adversarial Nets
Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio
TL;DR: A new framework for estimating generative models via an adversarial process, in which two models are simultaneously trained: a generative model G that captures the data distribution and a discriminative model D that estimates the probability that a sample came from the training data rather than G.
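The adversarial objective behind this description can be written out numerically. A minimal sketch, assuming scalar discriminator outputs in (0, 1) and the non-saturating generator loss commonly used in practice:

```python
import math

def d_loss(d_real: float, d_fake: float) -> float:
    # D maximizes log D(x) + log(1 - D(G(z))); we return the negated
    # objective so it can be minimized like an ordinary loss.
    return -(math.log(d_real) + math.log(1.0 - d_fake))

def g_loss(d_fake: float) -> float:
    # Non-saturating generator loss: G minimizes -log D(G(z)).
    return -math.log(d_fake)

# At the theoretical optimum D outputs 0.5 everywhere, giving D a loss
# of 2*log(2) ~ 1.386: it can no longer tell real from generated samples.
print(d_loss(0.5, 0.5))
```

Training alternates gradient steps on `d_loss` (updating D) and `g_loss` (updating G) until neither can improve against the other.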
Proceedings ArticleDOI
Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks
TL;DR: CycleGAN learns a mapping G : X → Y such that the distribution of images from G(X) is indistinguishable from the distribution of Y using an adversarial loss, coupled with an inverse mapping F : Y → X and a cycle-consistency loss.
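The cycle-consistency term that makes unpaired training possible is just an L1 reconstruction penalty on the round trip. A toy sketch on flattened pixel lists (the `G`/`F` mappings below are hypothetical stand-ins for the learned translators):

```python
def l1(a: list[float], b: list[float]) -> float:
    # Mean absolute difference between two flattened "images".
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)

def cycle_consistency(x, y, G, F) -> float:
    # || F(G(x)) - x ||_1 + || G(F(y)) - y ||_1 : translating to the other
    # domain and back should reconstruct the original image.
    return l1(F(G(x)), x) + l1(G(F(y)), y)

# Toy translators: G doubles intensities, F halves them, so the cycle
# reconstructs each input exactly and the loss is zero.
G = lambda v: [2 * p for p in v]
F = lambda v: [p / 2 for p in v]
print(cycle_consistency([1.0, 2.0], [3.0, 4.0], G, F))  # 0.0
```

In the full method this term is added, with a weighting factor, to the adversarial losses of both mapping directions.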
Posted Content
Conditional Generative Adversarial Nets
Mehdi Mirza, Simon Osindero
TL;DR: The conditional version of generative adversarial nets is introduced, which can be constructed by simply feeding the data, y, to the generator and discriminator, and it is shown that this model can generate MNIST digits conditioned on class labels.
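"Simply feeding the data y" usually means concatenating a class label onto the model's input. A minimal sketch with a one-hot label and a plain Python vector (the 100-dim noise size and 10 classes are illustrative choices matching the MNIST setting):

```python
def one_hot(label: int, num_classes: int) -> list[float]:
    # Encode an integer class label as a one-hot vector.
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def condition_input(z: list[float], label: int, num_classes: int = 10) -> list[float]:
    # Condition the generator (or discriminator) by concatenating the
    # label encoding y onto the noise (or image) vector.
    return z + one_hot(label, num_classes)

z = [0.0] * 100                  # hypothetical 100-dim noise vector
conditioned = condition_input(z, 3)
print(len(conditioned))          # 110: noise dims + 10 label dims
```

The generator then learns to produce a digit of the requested class because the label is part of every input it sees.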
Proceedings ArticleDOI
Deep Learning Face Attributes in the Wild
TL;DR: A novel deep learning framework for attribute prediction in the wild that cascades two CNNs, LNet and ANet, which are fine-tuned jointly with attribute tags, but pre-trained differently.
Proceedings Article
GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium
TL;DR: In this paper, a two time-scale update rule (TTUR) is proposed for training GANs with stochastic gradient descent on arbitrary GAN loss functions, using an individual learning rate for the discriminator and for the generator.
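Operationally, TTUR just means the two players' optimizers use different step sizes. A minimal sketch with plain SGD updates; the specific learning-rate values below are hypothetical (the paper tunes them per task, often with the discriminator's rate larger):

```python
def sgd_step(params: list[float], grads: list[float], lr: float) -> list[float]:
    # One plain SGD update: p <- p - lr * g.
    return [p - lr * g for p, g in zip(params, grads)]

# TTUR's key idea: separate time scales for the two players.
LR_D = 4e-4   # hypothetical discriminator learning rate (faster)
LR_G = 1e-4   # hypothetical generator learning rate (slower)

d_params = sgd_step([0.5, -0.2], [1.0, 2.0], LR_D)
g_params = sgd_step([0.1], [1.0], LR_G)
```

With the discriminator adapting on a faster time scale than the generator, the paper proves convergence to a local Nash equilibrium under suitable assumptions.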
Related Papers (5)
NewsEmbed: Modeling News through Pre-trained Document Representations
Jialu Liu, Tianqi Liu, Cong Yu +2 more
Multi-Task Learning of Hierarchical Vision-Language Representation
Duy-Kien Nguyen, Takayuki Okatani +1 more