The Cityscapes Dataset for Semantic Urban Scene Understanding

doi:10.1109/CVPR.2016.350

Open Access Proceedings Article

Marius Cordts, +8 more

- pp 3213-3223

TLDR

This work introduces Cityscapes, a benchmark suite and large-scale dataset to train and test approaches for pixel-level and instance-level semantic labeling, and exceeds previous attempts in terms of dataset size, annotation richness, scene variability, and complexity.

Abstract:

Visual understanding of complex urban street scenes is an enabling factor for a wide range of applications. Object detection has benefited enormously from large-scale datasets, especially in the context of deep learning. For semantic urban scene understanding, however, no current dataset adequately captures the complexity of real-world urban scenes. To address this, we introduce Cityscapes, a benchmark suite and large-scale dataset to train and test approaches for pixel-level and instance-level semantic labeling. Cityscapes comprises a large, diverse set of stereo video sequences recorded in streets from 50 different cities. 5 000 of these images have high-quality pixel-level annotations; 20 000 additional images have coarse annotations to enable methods that leverage large volumes of weakly-labeled data. Crucially, our effort exceeds previous attempts in terms of dataset size, annotation richness, scene variability, and complexity. Our accompanying empirical study provides an in-depth analysis of the dataset characteristics, as well as a performance evaluation of several state-of-the-art approaches based on our benchmark.

Citations

Proceedings Article

Mask R-CNN

Kaiming He, +3 more

TL;DR: This work presents a conceptually simple, flexible, and general framework for object instance segmentation, which extends Faster R-CNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition.

Journal Article

SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation

Vijay Badrinarayanan, +2 more

- 01 Dec 2017 -

IEEE Transactions on Pattern Analysis an...

TL;DR: Quantitative assessments show that SegNet provides good performance with competitive inference time and most efficient inference memory-wise as compared to other architectures, including FCN and DeconvNet.

Proceedings Article

Image-to-Image Translation with Conditional Adversarial Networks

Phillip Isola, +3 more

TL;DR: Conditional adversarial networks are investigated as a general-purpose solution to image-to-image translation problems and it is demonstrated that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks.

Journal Article

DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs

Liang-Chieh Chen, +4 more

- 01 Apr 2018 -

IEEE Transactions on Pattern Analysis an...

TL;DR: This work addresses semantic image segmentation with deep learning. It proposes atrous spatial pyramid pooling (ASPP) to robustly segment objects at multiple scales, and improves the localization of object boundaries by combining methods from DCNNs and probabilistic graphical models.

Proceedings Article

Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks

Jun-Yan Zhu, +3 more

TL;DR: CycleGAN learns a mapping G : X → Y such that the distribution of images from G(X) is indistinguishable from the distribution Y using an adversarial loss.

References

Posted Content

Fast R-CNN

Ross Girshick

- 30 Apr 2015 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: This paper proposes a Fast Region-based Convolutional Network method (Fast R-CNN) for object detection that builds on previous work to efficiently classify object proposals using deep convolutional networks.

Proceedings Article

Faster R-CNN: towards real-time object detection with region proposal networks

Shaoqing Ren, +3 more

TL;DR: This work proposes a region proposal network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals.

Journal Article

SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation

Vijay Badrinarayanan, +2 more

- 01 Dec 2017 -

IEEE Transactions on Pattern Analysis an...

TL;DR: Quantitative assessments show that SegNet provides good performance with competitive inference time and most efficient inference memory-wise as compared to other architectures, including FCN and DeconvNet.

Posted Content

Rich feature hierarchies for accurate object detection and semantic segmentation

Ross Girshick, +3 more

- 11 Nov 2013 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: This paper proposes a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012, achieving a mAP of 53.3%.

Journal Article

Object Detection with Discriminatively Trained Part-Based Models

Pedro F. Felzenszwalb, +3 more

- 01 Sep 2010 -

IEEE Transactions on Pattern Analysis an...

TL;DR: An object detection system based on mixtures of multiscale deformable part models that is able to represent highly variable object classes and achieves state-of-the-art results in the PASCAL object detection challenges is described.

Related Papers (5)

Deep Residual Learning for Image Recognition

Fully convolutional networks for semantic segmentation

Microsoft COCO: Common Objects in Context

U-Net: Convolutional Networks for Biomedical Image Segmentation

ImageNet: A large-scale hierarchical image database
