Open Access, Proceedings Article

Fully convolutional networks for semantic segmentation

TLDR
The key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning.
Abstract
Convolutional networks are powerful visual models that yield hierarchies of features. We show that convolutional networks by themselves, trained end-to-end, pixels-to-pixels, exceed the state-of-the-art in semantic segmentation. Our key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning. We define and detail the space of fully convolutional networks, explain their application to spatially dense prediction tasks, and draw connections to prior models. We adapt contemporary classification networks (AlexNet [20], the VGG net [31], and GoogLeNet [32]) into fully convolutional networks and transfer their learned representations by fine-tuning [3] to the segmentation task. We then define a skip architecture that combines semantic information from a deep, coarse layer with appearance information from a shallow, fine layer to produce accurate and detailed segmentations. Our fully convolutional network achieves state-of-the-art segmentation of PASCAL VOC (20% relative improvement to 62.2% mean IU on 2012), NYUDv2, and SIFT Flow, while inference takes less than one fifth of a second for a typical image.
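The key insight stated above, that a network becomes "fully convolutional" once its fully connected layers are treated as convolutions, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single input channel, unit stride, and a toy classifier head, and the names `dense` and `dense_as_conv` are hypothetical. It shows that the same weights produce one score vector on a fixed-size input and a correspondingly sized score map on a larger input.

```python
import random


def dense(weights, bias, patch):
    """Fully connected head: one score per class for a fixed k x k patch."""
    k = len(patch)
    return [
        bias[c] + sum(weights[c][i][j] * patch[i][j]
                      for i in range(k) for j in range(k))
        for c in range(len(weights))
    ]


def dense_as_conv(weights, bias, image):
    """Slide the same k x k weights over an image of any size >= k.

    Returns scores[c][y][x], a spatial score map per class
    (a 'valid' cross-correlation with unit stride).
    """
    k = len(weights[0])
    h, w = len(image), len(image[0])
    scores = []
    for c in range(len(weights)):
        plane = []
        for y in range(h - k + 1):
            row = []
            for x in range(w - k + 1):
                row.append(bias[c] + sum(weights[c][i][j] * image[y + i][x + j]
                                         for i in range(k) for j in range(k)))
            plane.append(row)
        scores.append(plane)
    return scores


# Toy setup (assumed values, for illustration only).
random.seed(0)
k, n_classes = 3, 2
weights = [[[random.uniform(-1, 1) for _ in range(k)] for _ in range(k)]
           for _ in range(n_classes)]
bias = [random.uniform(-1, 1) for _ in range(n_classes)]

# On a k x k input, the convolutional form reduces to the dense layer.
small = [[random.random() for _ in range(k)] for _ in range(k)]
fixed = dense(weights, bias, small)
as_map = dense_as_conv(weights, bias, small)
assert all(abs(as_map[c][0][0] - fixed[c]) < 1e-9 for c in range(n_classes))

# On a larger 5 x 6 input, the same weights yield a 3 x 4 score map per class.
large = [[random.random() for _ in range(6)] for _ in range(5)]
big_map = dense_as_conv(weights, bias, large)
print(len(big_map), len(big_map[0]), len(big_map[0][0]))  # prints: 2 3 4
```

Each entry `big_map[c][y][x]` equals `dense` applied to the k x k window whose top-left corner is at (y, x), so the convolutional form evaluates every window in a single pass. The networks described in the abstract also involve multiple channels, strides, and upsampling, which this sketch leaves out.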


Citations
Book Chapter

Deeply Learned Compositional Models for Human Pose Estimation

TL;DR: A novel framework, termed the Deeply Learned Compositional Model (DLCM), is introduced. It exploits deep neural networks to learn the compositionality of human bodies and proposes a novel bone-based part representation that not only compactly encodes the orientations, scales and shapes of parts, but also avoids their potentially large state spaces.
Proceedings Article

InstanceCut: From Edges to Instances with MultiCut

TL;DR: To reason globally about the optimal partitioning of an image into instances, the authors combine these two modalities into a novel MultiCut formulation, which achieves the best result among all published methods, and performs particularly well for rare object classes.
Journal Article

Image-based post-disaster inspection of reinforced concrete bridge systems using deep learning with Bayesian optimization

TL;DR: A three-level image-based approach for post-disaster inspection of reinforced concrete bridges is proposed, using deep learning with novel training strategies and a principled manner of such selection. Very promising results (accuracies well over 90%) and robustness are observed on all three-level deep learning models.
Proceedings Article

Spatially Adaptive Computation Time for Residual Networks

TL;DR: Experimental results show that this model improves the computational efficiency of Residual Networks on the challenging ImageNet classification and COCO object detection datasets, and that the computation time maps on the visual saliency dataset cat2000 correlate surprisingly well with human eye fixation positions.
Posted Content

Recurrent Instance Segmentation

TL;DR: This work proposes a new instance segmentation paradigm: an end-to-end method, based on a recurrent neural network, that learns to find objects and their segmentations sequentially, one at a time.
References
Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

TL;DR: The authors train a deep convolutional neural network consisting of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax, and it achieves state-of-the-art performance.
Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

TL;DR: In this paper, the authors investigated the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting and showed that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 layers.
Book

Pattern Recognition and Machine Learning

TL;DR: Topics covered include probability distributions, linear models for regression, linear models for classification, neural networks, graphical models, mixture models and EM, sampling methods, continuous latent variables, and sequential data.
Proceedings Article

Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation

TL;DR: R-CNN combines CNNs with bottom-up region proposals to localize and segment objects. When labeled training data is scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost.
Book

A wavelet tour of signal processing

TL;DR: An introduction to a Transient World and an Approximation Tour of Wavelet Packet and Local Cosine Bases.