The Cityscapes Dataset for Semantic Urban Scene Understanding

doi:10.1109/CVPR.2016.350

Open Access, Proceedings Article

- pp 3213-3223

TLDR

This work introduces Cityscapes, a benchmark suite and large-scale dataset to train and test approaches for pixel-level and instance-level semantic labeling, and exceeds previous attempts in terms of dataset size, annotation richness, scene variability, and complexity.

Abstract:

Visual understanding of complex urban street scenes is an enabling factor for a wide range of applications. Object detection has benefited enormously from large-scale datasets, especially in the context of deep learning. For semantic urban scene understanding, however, no current dataset adequately captures the complexity of real-world urban scenes. To address this, we introduce Cityscapes, a benchmark suite and large-scale dataset to train and test approaches for pixel-level and instance-level semantic labeling. Cityscapes comprises a large, diverse set of stereo video sequences recorded in streets from 50 different cities. Of these images, 5000 have high-quality pixel-level annotations; 20 000 additional images have coarse annotations to enable methods that leverage large volumes of weakly-labeled data. Crucially, our effort exceeds previous attempts in terms of dataset size, annotation richness, scene variability, and complexity. Our accompanying empirical study provides an in-depth analysis of the dataset characteristics, as well as a performance evaluation of several state-of-the-art approaches based on our benchmark.

Citations

Proceedings Article

Mask R-CNN

Kaiming He, +3 more

TL;DR: This work presents a conceptually simple, flexible, and general framework for object instance segmentation, which extends Faster R-CNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition.

Journal Article

SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation

Vijay Badrinarayanan, +2 more

- 01 Dec 2017 -

IEEE Transactions on Pattern Analysis an...

TL;DR: Quantitative assessments show that SegNet provides good performance with competitive inference time and the most memory-efficient inference compared to other architectures, including FCN and DeconvNet.

Proceedings Article

Image-to-Image Translation with Conditional Adversarial Networks

Phillip Isola, +3 more

TL;DR: Conditional adversarial networks are investigated as a general-purpose solution to image-to-image translation problems and it is demonstrated that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks.

Journal Article

DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs

Liang-Chieh Chen, +4 more

- 01 Apr 2018 -

IEEE Transactions on Pattern Analysis an...

TL;DR: This work addresses the task of semantic image segmentation with deep learning. It proposes atrous spatial pyramid pooling (ASPP) to robustly segment objects at multiple scales, and improves the localization of object boundaries by combining methods from DCNNs and probabilistic graphical models.

Proceedings Article

Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks

Jun-Yan Zhu, +3 more

TL;DR: CycleGAN learns a mapping G : X → Y such that, under an adversarial loss, the distribution of images from G(X) is indistinguishable from the distribution Y.

References

Proceedings Article

Semantic Image Segmentation via Deep Parsing Network

Ziwei Liu, +4 more

TL;DR: Deep Parsing Network (DPN) uses a convolutional neural network (CNN) to model unary terms, with additional layers carefully devised to approximate the mean field (MF) algorithm for pairwise terms.

Proceedings Article

From image-level to pixel-level labeling with Convolutional Networks

Pedro O. Pinheiro, +1 more

TL;DR: A Convolutional Neural Network-based model is proposed that is constrained during training to put more weight on pixels that are important for classifying the image, and that beats the state-of-the-art results on the weakly supervised object segmentation task by a large margin.

Proceedings Article

Constrained Convolutional Neural Networks for Weakly Supervised Segmentation

Deepak Pathak, +2 more

TL;DR: This work proposes Constrained CNN (CCNN), a method which uses a novel loss function to optimize for any set of linear constraints on the output space of a CNN, and demonstrates the generality of this new learning framework.

Proceedings Article

Efficient Piecewise Training of Deep Structured Models for Semantic Segmentation

Guosheng Lin, +3 more

TL;DR: This work explores patch-patch context between image regions and patch-background context, and formulates conditional random fields (CRFs) with CNN-based pairwise potential functions to capture semantic correlations between neighboring patches.

Proceedings Article

Pedestrian detection at 100 frames per second

Rodrigo Benenson, +3 more

TL;DR: A new pedestrian detector that improves on the state of the art in both speed and quality. Detection speed is improved by efficiently handling different scales and by transferring computation from test time to training time.

Related Papers (5)

Deep Residual Learning for Image Recognition

Fully convolutional networks for semantic segmentation

Microsoft COCO: Common Objects in Context

U-Net: Convolutional Networks for Biomedical Image Segmentation

ImageNet: A large-scale hierarchical image database
