Multi-Scale Context Aggregation by Dilated Convolutions

Fisher Yu, +1 more

- 23 Nov 2015 -

arXiv: Computer Vision and Pattern Recog...

TLDR

The paper proposes a new convolutional network module that aggregates multi-scale contextual information without losing resolution. The architecture relies on the fact that dilated convolutions support exponential expansion of the receptive field without loss of resolution or coverage.

Abstract:

State-of-the-art models for semantic segmentation are based on adaptations of convolutional networks that had originally been designed for image classification. However, dense prediction and image classification are structurally different. In this work, we develop a new convolutional network module that is specifically designed for dense prediction. The presented module uses dilated convolutions to systematically aggregate multi-scale contextual information without losing resolution. The architecture is based on the fact that dilated convolutions support exponential expansion of the receptive field without loss of resolution or coverage. We show that the presented context module increases the accuracy of state-of-the-art semantic segmentation systems. In addition, we examine the adaptation of image classification networks to dense prediction and show that simplifying the adapted network can increase accuracy.
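
The claim about exponential receptive-field growth can be checked with a small sketch. It is an illustration only, not the authors' implementation: it works in one dimension, assumes 3-tap kernels at stride 1 with zero padding, and assumes the dilation doubles at each layer (1, 2, 4, 8). The function names `receptive_field` and `dilated_conv1d` are hypothetical.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field along one axis of stacked stride-1 dilated convolutions."""
    rf = 1
    for d in dilations:
        # A kernel with k taps and dilation d spans (k - 1) * d + 1 inputs,
        # so each layer widens the field by (k - 1) * d positions.
        rf += (kernel_size - 1) * d
    return rf


def dilated_conv1d(signal, kernel, dilation):
    """Same-length 1-D dilated convolution (correlation form), zero padded."""
    k = len(kernel)
    half = (k - 1) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for t in range(k):
            j = i + (t - half) * dilation  # taps are spaced `dilation` apart
            if 0 <= j < len(signal):
                acc += kernel[t] * signal[j]
        out.append(acc)
    return out


dilations = [1, 2, 4, 8]  # assumed schedule: dilation doubles per layer

# Receptive field after 1, 2, 3 and 4 layers: grows as 2^(n+1) - 1.
fields = [receptive_field(3, dilations[:n]) for n in range(1, len(dilations) + 1)]

# Push a unit impulse through the stack to see which positions it reaches.
response = [0.0] * 33
response[16] = 1.0
for d in dilations:
    response = dilated_conv1d(response, [1.0, 1.0, 1.0], d)

support = sum(1 for v in response if v != 0.0)
print(fields)                  # [3, 7, 15, 31]
print(len(response), support)  # 33 31
```

The number of parameters grows linearly with depth (three weights per layer here) while the receptive field grows exponentially. The output keeps the input length of 33, and the impulse reaches a contiguous block of 31 positions, which illustrates "without loss of resolution or coverage" in this simplified setting.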

Citations

Journal Article

SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation

Vijay Badrinarayanan, +2 more

- 01 Dec 2017 -

IEEE Transactions on Pattern Analysis an...

TL;DR: Quantitative assessments show that SegNet provides good performance with competitive inference time and most efficient inference memory-wise as compared to other architectures, including FCN and DeconvNet.

Proceedings Article

Pyramid Scene Parsing Network

Hengshuang Zhao, +4 more

TL;DR: This paper exploits the capability of global context information by different-region-based context aggregation through the pyramid pooling module together with the proposed pyramid scene parsing network (PSPNet) to produce good quality results on the scene parsing task.

Posted Content

DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs

Liang-Chieh Chen, +4 more

- 02 Jun 2016 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: DeepLab proposes atrous spatial pyramid pooling (ASPP) to segment objects at multiple scales by probing an incoming convolutional feature layer with filters at multiple sampling rates and effective fields-of-view.

Posted Content

Fully Convolutional Networks for Semantic Segmentation

Jonathan Long, +2 more

- 14 Nov 2014 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: It is shown that convolutional networks by themselves, trained end-to-end, pixels-to-pixels, improve on the previous best result in semantic segmentation.

Proceedings Article

The Cityscapes Dataset for Semantic Urban Scene Understanding

Marius Cordts, +8 more

TL;DR: This work introduces Cityscapes, a benchmark suite and large-scale dataset to train and test approaches for pixel-level and instance-level semantic labeling, and exceeds previous attempts in terms of dataset size, annotation richness, scene variability, and complexity.

References

Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

Alex Krizhevsky, +2 more

TL;DR: The authors train a deep convolutional neural network that achieves state-of-the-art ImageNet classification performance; it consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax.

Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

Karen Simonyan, +1 more

TL;DR: In this paper, the authors investigated the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting and showed that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 layers.

Book Chapter

Microsoft COCO: Common Objects in Context

Tsung-Yi Lin, +7 more

TL;DR: The paper presents a new dataset that aims to advance the state of the art in object recognition by placing it in the context of the broader question of scene understanding; the dataset gathers images of complex everyday scenes containing common objects in their natural context.

Journal Article

Learning representations by back-propagating errors

David E. Rumelhart, +2 more

- 01 Jan 1988 -

Nature

TL;DR: Back-propagation repeatedly adjusts the weights of the connections in the network so as to minimize a measure of the difference between the actual output vector of the net and the desired output vector, which helps to represent important features of the task domain.

Journal Article

The Pascal Visual Object Classes (VOC) Challenge

Mark Everingham, +4 more

- 01 Jun 2010 -

International Journal of Computer Vision

TL;DR: The paper reviews the state of the art in evaluated methods for both classification and detection, and analyses whether the methods are statistically different, what they are learning from the images, and what the methods find easy or confuse.
