Open Access · Proceedings Article · DOI

Generating the Future with Adversarial Transformers

TL;DR: This work presents a model that generates the future by transforming pixels in the past, and explicitly disentangles the model's memory from the prediction, which helps the model learn desirable invariances.
Abstract
We learn models to generate the immediate future in video. This problem has two main challenges. First, since the future is uncertain, models should be multi-modal, which can be difficult to learn. Second, since the future is similar to the past, models store low-level details, which complicates learning of high-level semantics. We propose a framework to tackle both of these challenges. We present a model that generates the future by transforming pixels in the past. Our approach explicitly disentangles the model's memory from the prediction, which helps the model learn desirable invariances. Experiments suggest that this model can generate short videos of plausible futures. We believe predictive models have many applications in robotics, health-care, and video understanding.
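The abstract's central idea, generating the future by transforming pixels in the past rather than synthesizing them from scratch, can be illustrated with a minimal sketch. This is a hypothetical illustration, not the authors' implementation: per output pixel, the model is assumed to predict weights over a small set of shifted copies of the last frame, and the next frame is their weighted sum.

```python
import numpy as np

def transform_frame(last_frame, weights, shifts):
    """Sketch of pixel-transformation prediction (hypothetical shapes).

    last_frame: (H, W) array, the most recent observed frame.
    weights:    (K, H, W) array of per-pixel mixing weights, assumed
                normalized over K (e.g. by a softmax the model outputs).
    shifts:     list of K (dy, dx) integer offsets.
    Returns the predicted next frame as a weighted sum of shifted copies.
    """
    out = np.zeros_like(last_frame)
    for k, (dy, dx) in enumerate(shifts):
        # Shift the previous frame by (dy, dx) with wrap-around for brevity.
        shifted = np.roll(np.roll(last_frame, dy, axis=0), dx, axis=1)
        out += weights[k] * shifted
    return out
```

Because the output is constrained to be a rearrangement of existing pixels, low-level appearance is carried over from the past for free, which is one way to read the paper's claim that memory is disentangled from prediction.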



Citations
Proceedings ArticleDOI

Future Frame Prediction for Anomaly Detection - A New Baseline

TL;DR: In this article, Liu et al. propose to detect abnormal events by enforcing the optical flow between predicted frames and ground truth frames to be consistent, and this is the first work that introduces a temporal constraint into the video prediction task.
Posted Content

Stochastic Adversarial Video Prediction

TL;DR: This work shows that latent variable models that explicitly model underlying stochasticity and adversarially-trained models that aim to produce naturalistic images are in fact complementary, and combines the two to produce predictions that look more realistic to human raters and better cover the range of possible futures.
Posted Content

Video-to-Video Synthesis

TL;DR: In this article, a video-to-video synthesis approach under the generative adversarial learning framework is proposed, which achieves high-resolution, photorealistic, temporally coherent video results on a diverse set of input formats.
Proceedings Article

Stochastic Variational Video Prediction

TL;DR: In this paper, a stochastic variational video prediction (SV2P) method is proposed to predict a different possible future for each sample of its latent variables for real-world video.
Book ChapterDOI

R2P2: A ReparameteRized Pushforward Policy for Diverse, Precise Generative Path Forecasting

TL;DR: A method to forecast a vehicle’s ego-motion as a distribution over spatiotemporal paths, conditioned on features embedded in an overhead map, and obtains expressions for the cross-entropy metrics that can be efficiently evaluated and differentiated, enabling stochastic-gradient optimization.
References
Proceedings Article

Adam: A Method for Stochastic Optimization

TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
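The summary above describes Adam's core mechanism: adaptive estimates of the first and second moments of the gradient with bias correction. A minimal single-parameter sketch of one update step (default hyperparameters from the paper; not the reference implementation):

```python
def adam_step(theta, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update for scalar parameter theta at timestep t (1-indexed)."""
    m = b1 * m + (1 - b1) * grad          # first-moment (mean) estimate
    v = b2 * v + (1 - b2) * grad ** 2     # second-moment (uncentered variance) estimate
    m_hat = m / (1 - b1 ** t)             # bias correction: moments start at zero
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (v_hat ** 0.5 + eps)
    return theta, m, v
```

Dividing by the root of the second-moment estimate gives each parameter an effective per-coordinate step size, which is what makes the method well suited to the noisy, sparse gradients of stochastic objectives.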
Journal ArticleDOI

Generative Adversarial Nets

TL;DR: A new framework for estimating generative models via an adversarial process, in which two models are simultaneously trained: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G.
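The adversarial objective described above can be written out for a single sample. A hedged sketch of the two losses (using the non-saturating generator loss the paper recommends in practice; not the paper's code), where D outputs a probability that its input is real:

```python
import math

def d_loss(d_real, d_fake):
    """Discriminator maximizes log D(x) + log(1 - D(G(z))); return the negative
    so it can be minimized. d_real = D(x), d_fake = D(G(z))."""
    return -(math.log(d_real) + math.log(1.0 - d_fake))

def g_loss(d_fake):
    """Non-saturating generator loss: maximize log D(G(z)) instead of
    minimizing log(1 - D(G(z))), for stronger early gradients."""
    return -math.log(d_fake)
```

When the discriminator classifies well (high d_real, low d_fake) its loss is small; when the generator fools it (high d_fake) the generator's loss is small, which is the minimax game in miniature.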
Journal ArticleDOI

The Pascal Visual Object Classes (VOC) Challenge

TL;DR: The state of the art in evaluated methods for both classification and detection is reviewed, along with whether the methods are statistically different, what they learn from the images, and what they find easy or confusing.
Posted Content

Conditional Generative Adversarial Nets

Mehdi Mirza, +1 more
06 Nov 2014
TL;DR: The conditional version of generative adversarial nets is introduced, which can be constructed by simply feeding the data, y, to the generator and discriminator, and it is shown that this model can generate MNIST digits conditioned on class labels.
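The conditioning mechanism described above, simply feeding y to both networks, amounts to concatenating a label encoding onto each network's input. A minimal sketch with hypothetical shapes (not the paper's code):

```python
import numpy as np

def one_hot(y, num_classes=10):
    """Encode class label y as a one-hot vector."""
    v = np.zeros(num_classes)
    v[y] = 1.0
    return v

def generator_input(z, y, num_classes=10):
    """cGAN generator sees [z; y]: noise concatenated with the label."""
    return np.concatenate([z, one_hot(y, num_classes)])

def discriminator_input(x, y, num_classes=10):
    """cGAN discriminator sees [x; y]: the sample concatenated with the label."""
    return np.concatenate([x, one_hot(y, num_classes)])
```

Because both networks observe y, the generator is pushed to produce samples consistent with the requested class, which is how the model generates MNIST digits conditioned on class labels.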
Posted Content

Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks

TL;DR: This work introduces a class of CNNs called deep convolutional generative adversarial networks (DCGANs), that have certain architectural constraints, and demonstrates that they are a strong candidate for unsupervised learning.