Fully convolutional networks for semantic segmentation
Jonathan Long, Evan Shelhamer, Trevor Darrell
- pp 3431-3440
TL;DR: The key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning.
Abstract:
Convolutional networks are powerful visual models that yield hierarchies of features. We show that convolutional networks by themselves, trained end-to-end, pixels-to-pixels, exceed the state-of-the-art in semantic segmentation. Our key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning. We define and detail the space of fully convolutional networks, explain their application to spatially dense prediction tasks, and draw connections to prior models. We adapt contemporary classification networks (AlexNet [20], the VGG net [31], and GoogLeNet [32]) into fully convolutional networks and transfer their learned representations by fine-tuning [3] to the segmentation task. We then define a skip architecture that combines semantic information from a deep, coarse layer with appearance information from a shallow, fine layer to produce accurate and detailed segmentations. Our fully convolutional network achieves state-of-the-art segmentation of PASCAL VOC (20% relative improvement to 62.2% mean IU on 2012), NYUDv2, and SIFT Flow, while inference takes less than one fifth of a second for a typical image.
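The idea of adapting a classification network into a fully convolutional one can be illustrated with a minimal sketch. It assumes a toy single-channel "classifier" whose fully connected layer sees a fixed 2 x 2 window; reusing the same weights as a convolution kernel lets a larger input yield a spatial map of scores instead of a single score. The function names, weights, and sizes are hypothetical, and the upsampling and skip connections described in the abstract are omitted.

```python
# Toy illustration only: names, weights, and sizes are assumptions,
# not taken from the paper.

def fc_score(window, weights, bias):
    """Fully connected layer: one score from a fixed-size k x k window."""
    k = len(weights)
    return bias + sum(window[i][j] * weights[i][j]
                      for i in range(k) for j in range(k))


def conv_score_map(image, weights, bias, stride=1):
    """Slide the same weights over the image as a k x k convolution (no padding)."""
    k = len(weights)
    height, width = len(image), len(image[0])
    score_map = []
    for top in range(0, height - k + 1, stride):
        row = []
        for left in range(0, width - k + 1, stride):
            window = [r[left:left + k] for r in image[top:top + k]]
            row.append(fc_score(window, weights, bias))
        score_map.append(row)
    return score_map


weights = [[1.0, 0.0], [0.0, -1.0]]  # 2 x 2 "classifier" weights
bias = 0.5

small = [[1.0, 2.0], [3.0, 4.0]]     # input of the size the classifier expects
large = [[1.0, 2.0, 3.0],
         [4.0, 5.0, 6.0],
         [7.0, 8.0, 9.0],
         [1.0, 0.0, 1.0]]            # larger 4 x 3 input

small_map = conv_score_map(small, weights, bias)  # 1 x 1 map: the classifier score
large_map = conv_score_map(large, weights, bias)  # 3 x 2 map: one score per location

print(len(small_map), len(small_map[0]))
print(len(large_map), len(large_map[0]))
```

On the expected input size the convolutional form reproduces the classifier's single score; on the larger input the output grows with the input, which is the property that makes dense, per-location prediction possible.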
Citations
Posted Content
PointContrast: Unsupervised Pre-training for 3D Point Cloud Understanding
TL;DR: This work aims at facilitating research on 3D representation learning by selecting a suite of diverse datasets and tasks to measure the effect of unsupervised pre-training on a large source set of 3D scenes and achieving improvement over recent best results in segmentation and detection across 6 different benchmarks.
Journal Article
Classification for High Resolution Remote Sensing Imagery Using a Fully Convolutional Network
TL;DR: An accurate classification approach for high resolution remote sensing imagery based on the improved FCN model is proposed, which improves the density of output class maps by introducing Atrous convolution, and designs a multi-scale network architecture by adding a skip-layer structure to make it capable for multi-resolution image classification.
Proceedings Article
3D Packing for Self-Supervised Monocular Depth Estimation
TL;DR: This paper proposes a self-supervised monocular depth estimation method that combines geometry with a new deep network, PackNet, learned only from unlabeled monocular videos; PackNet leverages symmetrical packing and unpacking blocks to jointly learn to compress and decompress detail-preserving representations using 3D convolutions.
Journal Article
AI-assisted CT imaging analysis for COVID-19 screening: Building and deploying a medical AI system.
Bo Wang, Shuo Jin, Qingsen Yan, Haibo Xu, Chuan Luo, Lai Wei, Wei Zhao, Hou Xuexue, Wenshuo Ma, Zhengqing Xu, Zhuozhao Zheng, Wenbo Sun, Lan Lan, Wei Zhang, Xiangdong Mu, Chenxi Shi, Zhong-Xiao Wang, Jihae Lee, Zijian Jin, Minggui Lin, Jin Hongbo, Liang Zhang, Jun Guo, Benqi Zhao, Zhizhong Ren, Shuhao Wang, Wei Xu, Xinghuan Wang, Jianming Wang, Jianming Wang, Zheng You, Jiahong Dong
TL;DR: An AI system that automatically analyzes CT images and provides the probability of infection to rapidly detect COVID-19 pneumonia and is able to overcome a series of challenges in this particular situation and deploy the system in four weeks.
Proceedings Article
Learning to Learn Single Domain Generalization
Fengchun Qiao, Long Zhao, Xi Peng
TL;DR: A new method named adversarial domain augmentation is proposed to solve the Out-of-Distribution (OOD) generalization problem by leveraging adversarial training to create "fictitious" yet "challenging" populations, from which a model can learn to generalize with theoretical guarantees.
References
Proceedings Article
ImageNet Classification with Deep Convolutional Neural Networks
TL;DR: A deep convolutional neural network with five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax achieved state-of-the-art ImageNet classification performance.
Proceedings Article
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan, Andrew Zisserman
TL;DR: In this paper, the authors investigated the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting and showed that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 layers.
Book
Pattern Recognition and Machine Learning
TL;DR: Covers probability distributions, linear models for regression, linear models for classification, neural networks, graphical models, mixture models and EM, sampling methods, continuous latent variables, and sequential data.
Proceedings Article
Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation
TL;DR: R-CNN combines CNNs with bottom-up region proposals to localize and segment objects; when labeled training data is scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost.
Book
A wavelet tour of signal processing
TL;DR: An introduction to a Transient World and an Approximation Tour of Wavelet Packet and Local Cosine Bases.