Open Access · Posted Content

HyperStyle: StyleGAN Inversion with HyperNetworks for Real Image Editing

TL;DR
HyperStyle is a hypernetwork that learns to modulate StyleGAN's weights to faithfully express a given image in editable regions of the latent space, yielding reconstructions comparable to those of optimization techniques with the near real-time inference capabilities of encoders.
Abstract
The inversion of real images into StyleGAN's latent space is a well-studied problem. Nevertheless, applying existing approaches to real-world scenarios remains an open challenge, due to an inherent trade-off between reconstruction and editability: latent space regions which can accurately represent real images typically suffer from degraded semantic control. Recent work proposes to mitigate this trade-off by fine-tuning the generator to add the target image to well-behaved, editable regions of the latent space. While promising, this fine-tuning scheme is impractical for prevalent use as it requires a lengthy training phase for each new image. In this work, we introduce this approach into the realm of encoder-based inversion. We propose HyperStyle, a hypernetwork that learns to modulate StyleGAN's weights to faithfully express a given image in editable regions of the latent space. A naive modulation approach would require training a hypernetwork with over three billion parameters. Through careful network design, we reduce this to be in line with existing encoders. HyperStyle yields reconstructions comparable to those of optimization techniques with the near real-time inference capabilities of encoders. Lastly, we demonstrate HyperStyle's effectiveness on several applications beyond the inversion task, including the editing of out-of-domain images which were never seen during training.
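The weight-modulation idea in the abstract can be sketched in a few lines. Predicting a full offset tensor for every generator weight is what would blow the hypernetwork up to billions of parameters; predicting one offset per output channel is one way to shrink it. All shapes and the exact offset scheme below are illustrative assumptions, not HyperStyle's precise design:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy generator conv weight: (out_channels, in_channels, k, k)
w = rng.standard_normal((8, 4, 3, 3))

def modulate(w, per_channel_offsets):
    """Apply per-output-channel multiplicative offsets: w_hat = w * (1 + delta).

    Predicting one scalar per output channel (8 values here) instead of a
    full offset tensor (8 * 4 * 3 * 3 = 288 values) is the kind of
    parameter-sharing that keeps a weight-predicting hypernetwork small.
    """
    delta = per_channel_offsets.reshape(-1, 1, 1, 1)  # broadcast over in/k/k
    return w * (1.0 + delta)

# A hypernetwork would predict these offsets from the target image and the
# current reconstruction; fixed values are used here purely for illustration.
delta = np.array([0.1, -0.2, 0.0, 0.05, 0.0, 0.0, 0.3, -0.1])
w_hat = modulate(w, delta)

assert w_hat.shape == w.shape
# Channel 2 has delta == 0, so its weights are left untouched.
assert np.allclose(w_hat[2], w[2])
```

The modulated weights replace the originals only for the current image, so the pretrained generator itself stays fixed and no per-image fine-tuning phase is needed.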


Citations
Proceedings ArticleDOI

Stitch it in Time: GAN-Based Facial Editing of Real Videos

TL;DR: This work leverages the natural alignment of StyleGAN and the tendency of neural networks to learn low-frequency functions, showing that together they provide a strongly consistent prior for semantic editing of faces in videos, with significant improvements over the current state of the art.
Journal ArticleDOI

GAN Inversion: A Survey

TL;DR: GAN inversion aims to invert a given image back into the latent space of a pretrained GAN model, so that the image can be faithfully reconstructed from the inverted code by the generator.
Book ChapterDOI

Third Time's the Charm? Image and Video Editing with StyleGAN3

TL;DR: In this article, the authors explore the recent StyleGAN3 architecture, compare it to its predecessor, and investigate its unique advantages as well as its drawbacks; they propose an encoding scheme that is trained solely on aligned data, yet can still invert unaligned images.
Proceedings ArticleDOI

DUET: A Tuning-Free Device-Cloud Collaborative Parameters Generation Framework for Efficient Device Model Generalization

TL;DR: This work argues that the computational cost of fine-tuning can be unnecessary and presents a novel perspective on improving device model generalization (DMG) without increasing computational cost: device-specific parameter generation, which directly maps a data distribution to model parameters.
Journal ArticleDOI

Survey on leveraging pre-trained generative adversarial networks for image editing and restoration

TL;DR: In this article, the authors briefly review recent progress on leveraging pretrained large-scale GAN models from three aspects: (1) the training of large-scale generative adversarial networks, (2) exploring and understanding the pretrained GAN models, and (3) leveraging these models for subsequent tasks like image restoration and editing.
References
Posted Content

Deep Residual Learning for Image Recognition

TL;DR: This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
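The residual idea summarized above reduces to computing y = x + F(x), so each block only has to learn the residual F rather than the full mapping. A minimal sketch with a toy two-layer F (shapes and weights are hypothetical):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """y = x + F(x): the skip connection means an identity mapping is
    trivially recoverable by driving the residual branch F toward zero,
    which is what eases optimization of very deep stacks."""
    return x + w2 @ relu(w1 @ x)

x = np.ones(4)
# With zero weights the residual branch vanishes and the block is the identity.
zeros = np.zeros((4, 4))
assert np.allclose(residual_block(x, zeros, zeros), x)
```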
Journal ArticleDOI

Generative Adversarial Nets

TL;DR: A new framework for estimating generative models via an adversarial process, in which two models are trained simultaneously: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than from G.
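The adversarial process described above can be illustrated with the standard minimax losses on scalar discriminator outputs (a toy sketch, not a training loop; the non-saturating generator variant noted in the comment is a common practical choice):

```python
import numpy as np

def d_loss(d_real, d_fake):
    """Discriminator maximizes log D(x) + log(1 - D(G(z)));
    equivalently, it minimizes the negative of that sum."""
    return -(np.log(d_real) + np.log(1.0 - d_fake))

def g_loss(d_fake):
    """Generator minimizes log(1 - D(G(z))); in practice the
    non-saturating variant -log D(G(z)) is often used instead."""
    return -np.log(d_fake)

# At the theoretical equilibrium the discriminator is perfectly confused
# and outputs 0.5 everywhere:
assert np.isclose(d_loss(0.5, 0.5), -2 * np.log(0.5))
```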
Posted Content

MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications

TL;DR: This work introduces two simple global hyper-parameters that efficiently trade off between latency and accuracy, and demonstrates the effectiveness of MobileNets across a wide range of applications and use cases, including object detection, fine-grained classification, face attributes, and large-scale geo-localization.
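The latency/accuracy trade-off mentioned above comes from depthwise separable convolutions combined with the two global hyper-parameters, a width multiplier (alpha) and a resolution multiplier (rho). A rough multiply-add count with illustrative layer sizes shows the savings:

```python
def standard_conv_madds(k, m, n, f):
    """Multiply-adds for a standard k x k conv: k^2 * M * N * F^2."""
    return k * k * m * n * f * f

def mobilenet_madds(k, m, n, f, alpha=1.0, rho=1.0):
    """Depthwise separable conv with width multiplier alpha and resolution
    multiplier rho: a depthwise pass costing k^2 * (aM) * (rF)^2 plus a
    1x1 pointwise pass costing (aM) * (aN) * (rF)^2."""
    am, an, rf = alpha * m, alpha * n, rho * f
    return k * k * am * rf * rf + am * an * rf * rf

# Example layer: 3x3 conv, 512 -> 512 channels, 14x14 feature map.
full = standard_conv_madds(3, 512, 512, 14)
sep = mobilenet_madds(3, 512, 512, 14)
# The separable form needs roughly 1/N + 1/k^2 of the computation,
# i.e. about 8-9x fewer multiply-adds for this layer.
assert sep / full < 0.13
# Halving the width multiplier shrinks cost roughly quadratically,
# since the dominant pointwise term scales with alpha^2.
assert mobilenet_madds(3, 512, 512, 14, alpha=0.5) < 0.3 * sep
```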
Proceedings ArticleDOI

A Style-Based Generator Architecture for Generative Adversarial Networks

TL;DR: This paper proposes an alternative generator architecture for GANs, borrowing from the style transfer literature, which leads to an automatically learned, unsupervised separation of high-level attributes (e.g., pose and identity when trained on human faces) and stochastic variation in the generated images.
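The style-injection mechanism borrowed from the style transfer literature is commonly described via adaptive instance normalization (AdaIN): each feature channel is normalized, then rescaled and shifted by parameters derived from the latent code. A minimal sketch with toy tensor shapes (not the full generator):

```python
import numpy as np

def adain(x, style_scale, style_bias, eps=1e-8):
    """Adaptive instance normalization: normalize each channel of x over its
    spatial dimensions, then rescale/shift with style-derived parameters.
    This is how the style code controls the statistics of every layer."""
    mu = x.mean(axis=(1, 2), keepdims=True)
    sigma = x.std(axis=(1, 2), keepdims=True)
    scale = style_scale.reshape(-1, 1, 1)
    bias = style_bias.reshape(-1, 1, 1)
    return scale * (x - mu) / (sigma + eps) + bias

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 4, 4))  # (channels, H, W)
out = adain(x, np.array([2.0, 1.0]), np.array([0.5, 0.0]))

# After AdaIN, each channel's mean matches its bias and its std its scale.
assert np.allclose(out[0].mean(), 0.5, atol=1e-6)
assert np.allclose(out[0].std(), 2.0, atol=1e-3)
```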
Proceedings ArticleDOI

Deep Learning Face Attributes in the Wild

TL;DR: This work proposes a novel deep learning framework for attribute prediction in the wild that cascades two CNNs, LNet and ANet, which are fine-tuned jointly with attribute tags but pre-trained differently.