Sparsity Invariant CNNs

doi:10.1109/3DV.2017.00012

Proceedings Article

Jonas Uhrig, +5 more

- pp 11-20

TLDR

This paper proposes a simple yet effective sparse convolution layer which explicitly considers the location of missing data during the convolution operation, and demonstrates the benefits of the proposed network architecture in synthetic and real experiments with respect to various baseline approaches.

Abstract:

In this paper, we consider convolutional neural networks operating on sparse inputs with an application to depth upsampling from sparse laser scan data. First, we show that traditional convolutional networks perform poorly when applied to sparse data even when the location of missing data is provided to the network. To overcome this problem, we propose a simple yet effective sparse convolution layer which explicitly considers the location of missing data during the convolution operation. We demonstrate the benefits of the proposed network architecture in synthetic and real experiments with respect to various baseline approaches. Compared to dense baselines, the proposed sparse convolution network generalizes well to novel datasets and is invariant to the level of sparsity in the data. For our evaluation, we derive a novel dataset from the KITTI benchmark, comprising 93k depth annotated RGB images. Our dataset allows for training and evaluating depth upsampling and depth prediction techniques in challenging real-world settings and will be made available upon publication.
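The abstract does not spell out the layer itself, so the following is only a minimal sketch of one way a convolution can "explicitly consider the location of missing data". It assumes a binary validity mask, a convolution over valid inputs that is normalized by the number of valid inputs in each window, and mask propagation by a max over the window. The function name `sparse_conv2d` and its parameters are hypothetical and chosen for illustration; this is a single-channel NumPy version, not the authors' implementation.

```python
import numpy as np

def sparse_conv2d(x, mask, weight, bias=0.0, eps=1e-8):
    """Masked, normalized 2D convolution (illustrative sketch).

    x:      (H, W) input values; entries where mask == 0 are ignored
    mask:   (H, W) binary validity map (1 = observed, 0 = missing)
    weight: (k, k) kernel with odd k
    Returns the filtered output and the propagated validity mask.
    """
    k = weight.shape[0]
    pad = k // 2
    # Zero out missing inputs so they cannot contribute to the sum.
    xp = np.pad(x * mask, pad)
    mp = np.pad(mask, pad)
    H, W = x.shape
    out = np.zeros((H, W), dtype=float)
    new_mask = np.zeros((H, W), dtype=float)
    for i in range(H):
        for j in range(W):
            xw = xp[i:i + k, j:j + k]
            mw = mp[i:i + k, j:j + k]
            # Normalize by the number of valid inputs in the window
            # (assumption: this is what makes the response independent
            # of how densely the window is observed).
            out[i, j] = (xw * weight).sum() / (mw.sum() + eps) + bias
            # An output location is valid if any input in its window was.
            new_mask[i, j] = mw.max()
    return out, new_mask

# Usage: a constant depth map observed at only two pixels.
x = np.full((6, 6), 5.0)
mask = np.zeros((6, 6))
mask[2, 3] = 1.0
mask[0, 0] = 1.0
out, new_mask = sparse_conv2d(x, mask, np.ones((3, 3)))
```

With an all-ones kernel, every output location whose window contains at least one observation recovers the constant input value regardless of how many neighbours are missing, which illustrates the kind of invariance to sparsity level that the abstract describes.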

Citations

Book Chapter

Image Inpainting for Irregular Holes Using Partial Convolutions

Guilin Liu, +5 more

TL;DR: This work proposes the use of partial convolutions, where the convolution is masked and renormalized to be conditioned on only valid pixels, and outperforms other methods for irregular masks.

Proceedings Article

Digging Into Self-Supervised Monocular Depth Estimation

Clément Godard, +3 more

TL;DR: This paper proposes a set of improvements that together yield quantitatively and qualitatively better depth maps than competing self-supervised methods, demonstrates the effectiveness of each component in isolation, and shows high-quality, state-of-the-art results on the KITTI benchmark.

Posted Content

Digging Into Self-Supervised Monocular Depth Estimation

Clément Godard, +3 more

- 04 Jun 2018 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: It is shown that a surprisingly simple model, and associated design choices, lead to superior predictions, and together result in both quantitatively and qualitatively improved depth maps compared to competing self-supervised methods.

Proceedings Article

Multi-Task Multi-Sensor Fusion for 3D Object Detection

Ming Liang, +4 more

TL;DR: This work presents an end-to-end learnable architecture that reasons about 2D and 3D object detection as well as ground estimation and depth completion, and that leads the KITTI benchmark on 2D, 3D and bird's eye view object detection while being real-time.

Proceedings Article

PointFusion: Deep Sensor Fusion for 3D Bounding Box Estimation

Danfei Xu, +2 more

TL;DR: PointFusion is a generic 3D object detection method that leverages both image and 3D point cloud information and predicts multiple 3D box hypotheses and their confidences, using the input 3D points as spatial anchors.

References

Proceedings Article

Deep Residual Learning for Image Recognition

Kaiming He, +3 more

TL;DR: This paper proposes a residual learning framework that eases the training of networks substantially deeper than those used previously; the approach won 1st place in the ILSVRC 2015 classification task.

Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

Alex Krizhevsky, +2 more

TL;DR: A deep convolutional neural network consisting of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax achieved state-of-the-art performance.

Proceedings Article

Going deeper with convolutions

Christian Szegedy, +8 more

TL;DR: Inception is a deep convolutional neural network architecture that achieved a new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).

Proceedings Article

Fully convolutional networks for semantic segmentation

Jonathan Long, +2 more

TL;DR: The key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning.

Proceedings Article

Are we ready for autonomous driving? The KITTI vision benchmark suite

Andreas Geiger, +2 more

TL;DR: The autonomous driving platform is used to develop novel challenging benchmarks for the tasks of stereo, optical flow, visual odometry/SLAM and 3D object detection, revealing that methods ranking high on established datasets such as Middlebury perform below average when being moved outside the laboratory to the real world.

Related Papers (5)

Deep Residual Learning for Image Recognition

Depth Map Prediction from a Single Image using a Multi-Scale Deep Network

Indoor segmentation and support inference from RGBD images

Are we ready for autonomous driving? The KITTI vision benchmark suite

Vision meets robotics: The KITTI dataset