FlowNet: Learning Optical Flow with Convolutional Networks

Open Access · Posted Content

Philipp Fischer, +8 more

26 Apr 2015

arXiv: Computer Vision and Pattern Recog...

TLDR

This paper constructs CNNs which are capable of solving the optical flow estimation problem as a supervised learning task, and proposes and compares two architectures: a generic architecture and another one including a layer that correlates feature vectors at different image locations.

Abstract:

Convolutional neural networks (CNNs) have recently been very successful in a variety of computer vision tasks, especially on those linked to recognition. Optical flow estimation has not been among the tasks where CNNs were successful. In this paper we construct appropriate CNNs which are capable of solving the optical flow estimation problem as a supervised learning task. We propose and compare two architectures: a generic architecture and another one including a layer that correlates feature vectors at different image locations. Since existing ground truth data sets are not sufficiently large to train a CNN, we generate a synthetic Flying Chairs dataset. We show that networks trained on this unrealistic data still generalize very well to existing datasets such as Sintel and KITTI, achieving competitive accuracy at frame rates of 5 to 10 fps.
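The abstract describes "a layer that correlates feature vectors at different image locations". A minimal NumPy sketch of that idea follows. It assumes single-pixel patches, a square search window, and zero padding at the borders; the function name `correlate` and the parameter `max_disp` are illustrative choices, not names taken from the paper, and the sketch is not the authors' implementation.

```python
import numpy as np


def correlate(f1, f2, max_disp=1):
    """Correlate two feature maps over a window of displacements.

    f1, f2: arrays of shape (C, H, W).
    Returns an array of shape (D * D, H, W) with D = 2 * max_disp + 1.
    Channel i * D + j holds the dot product between f1 at (y, x) and
    f2 at (y + dy, x + dx), where dy = i - max_disp and dx = j - max_disp.
    """
    _, h, w = f1.shape
    d = 2 * max_disp + 1
    # Assumption: zero padding, so lookups outside the image contribute 0.
    pad = ((0, 0), (max_disp, max_disp), (max_disp, max_disp))
    f2p = np.pad(f2, pad)
    out = np.zeros((d * d, h, w), dtype=np.result_type(f1, f2))
    for i in range(d):
        for j in range(d):
            # Window of the padded map aligned with displacement (dy, dx).
            shifted = f2p[:, i:i + h, j:j + w]
            # Dot product over the channel axis at every pixel.
            out[i * d + j] = (f1 * shifted).sum(axis=0)
    return out


# Usage: a single bright feature moves one pixel to the right.
a = np.zeros((1, 3, 3))
b = np.zeros((1, 3, 3))
a[0, 1, 1] = 1.0
b[0, 1, 2] = 1.0
best = int(np.argmax(correlate(a, b)[:, 1, 1]))  # channel 5: dy = 0, dx = +1
```

The output has one channel per candidate displacement, so a later layer can read off which shift matches best at each pixel. Limiting the search to a window keeps the cost proportional to the window area rather than to the number of pixel pairs.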

Citations

Book Chapter · DOI

Fully-Convolutional Siamese Networks for Object Tracking

Luca Bertinetto, +4 more

TL;DR: A basic tracking algorithm is equipped with a novel fully-convolutional Siamese network trained end-to-end on the ILSVRC15 dataset for object detection in video and achieves state-of-the-art performance in multiple benchmarks.

Posted Content

Fully-Convolutional Siamese Networks for Object Tracking

Luca Bertinetto, +4 more

30 Jun 2016

arXiv: Computer Vision and Pattern Recog...

TL;DR: In this paper, a fully-convolutional Siamese network is trained end-to-end on the ILSVRC15 dataset for object detection in video, which achieves state-of-the-art performance.

Book Chapter · DOI

Playing for Data: Ground Truth from Computer Games

Stephan R. Richter, +3 more

TL;DR: In this paper, the authors present an approach to rapidly create pixel-accurate semantic label maps for images extracted from modern computer games, which enables rapid propagation of semantic labels within and across images synthesized by the game, without access to the source code or the content.

Book Chapter · DOI

RAFT: Recurrent All-Pairs Field Transforms for Optical Flow

Zachary Teed, +1 more

TL;DR: RAFT extracts per-pixel features, builds multi-scale 4D correlation volumes for all pairs of pixels, and iteratively updates a flow field through a recurrent unit that performs lookups on the correlation volumes.

Book Chapter · DOI

Learning to Track at 100 FPS with Deep Regression Networks

David Held, +2 more

TL;DR: This work proposes a method for offline training of neural networks that can track novel objects at test time at 100 fps, significantly faster than previous neural-network trackers, which are typically too slow for real-time applications.

References

Proceedings Article

Adam: A Method for Stochastic Optimization

Diederik P. Kingma, +1 more

TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.

Journal Article · DOI

Generative Adversarial Nets

Ian Goodfellow, +7 more

TL;DR: This work proposes a new framework for estimating generative models via an adversarial process, in which two models are trained simultaneously: a generative model G that captures the data distribution and a discriminative model D that estimates the probability that a sample came from the training data rather than G.

Proceedings Article · DOI

Fully convolutional networks for semantic segmentation

Jonathan Long, +2 more

TL;DR: The key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning.

Proceedings Article · DOI

Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation

Ross Girshick, +3 more

TL;DR: R-CNN combines CNNs with bottom-up region proposals to localize and segment objects, and shows that when labeled training data is scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost.

Book Chapter · DOI

Visualizing and Understanding Convolutional Networks

Matthew D. Zeiler, +1 more

TL;DR: A novel visualization technique is introduced that gives insight into the function of intermediate feature layers and the operation of the classifier in large convolutional network models; used in a diagnostic role, it helps find model architectures that outperform Krizhevsky et al. on the ImageNet classification benchmark.

Related Papers (5)

Deep Residual Learning for Image Recognition

Are we ready for autonomous driving? The KITTI vision benchmark suite

Fully convolutional networks for semantic segmentation

U-Net: Convolutional Networks for Biomedical Image Segmentation

ImageNet Classification with Deep Convolutional Neural Networks