The Cityscapes Dataset for Semantic Urban Scene Understanding
Marius Cordts, Mohamed Omran, Sebastian Ramos, Timo Rehfeld, Markus Enzweiler, Rodrigo Benenson, Uwe Franke, Stefan Roth, Bernt Schiele
- pp 3213-3223
TLDR
This work introduces Cityscapes, a benchmark suite and large-scale dataset to train and test approaches for pixel-level and instance-level semantic labeling, and exceeds previous attempts in terms of dataset size, annotation richness, scene variability, and complexity.
Abstract:
Visual understanding of complex urban street scenes is an enabling factor for a wide range of applications. Object detection has benefited enormously from large-scale datasets, especially in the context of deep learning. For semantic urban scene understanding, however, no current dataset adequately captures the complexity of real-world urban scenes. To address this, we introduce Cityscapes, a benchmark suite and large-scale dataset to train and test approaches for pixel-level and instance-level semantic labeling. Cityscapes is comprised of a large, diverse set of stereo video sequences recorded in streets from 50 different cities. 5000 of these images have high-quality pixel-level annotations; 20 000 additional images have coarse annotations to enable methods that leverage large volumes of weakly-labeled data. Crucially, our effort exceeds previous attempts in terms of dataset size, annotation richness, scene variability, and complexity. Our accompanying empirical study provides an in-depth analysis of the dataset characteristics, as well as a performance evaluation of several state-of-the-art approaches based on our benchmark.
Citations
Proceedings Article
Mask R-CNN
TL;DR: This work presents a conceptually simple, flexible, and general framework for object instance segmentation, which extends Faster R-CNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition.
Journal Article
SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation
TL;DR: Quantitative assessments show that SegNet provides good performance with competitive inference time and the most memory-efficient inference among the compared architectures, including FCN and DeconvNet.
Proceedings Article
Image-to-Image Translation with Conditional Adversarial Networks
TL;DR: Conditional adversarial networks are investigated as a general-purpose solution to image-to-image translation problems and it is demonstrated that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks.
Journal Article
DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs
TL;DR: This work addresses the task of semantic image segmentation with deep learning, proposes atrous spatial pyramid pooling (ASPP) to robustly segment objects at multiple scales, and improves the localization of object boundaries by combining methods from DCNNs and probabilistic graphical models.
Proceedings Article
Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks
TL;DR: CycleGAN uses an adversarial loss to learn a mapping G : X → Y such that the distribution of images from G(X) is indistinguishable from the distribution Y.
References
Proceedings Article
Semantic Image Segmentation via Deep Parsing Network
TL;DR: Deep Parsing Network (DPN) uses a convolutional neural network (CNN) to model unary terms, with additional layers carefully devised to approximate the mean field (MF) algorithm for pairwise terms.
Proceedings Article
From image-level to pixel-level labeling with Convolutional Networks
TL;DR: A Convolutional Neural Network-based model is proposed, which is constrained during training to put more weight on pixels that are important for classifying the image, and which beats the state-of-the-art results on the weakly supervised object segmentation task by a large margin.
Proceedings Article
Constrained Convolutional Neural Networks for Weakly Supervised Segmentation
TL;DR: This work proposes Constrained CNN (CCNN), a method which uses a novel loss function to optimize for any set of linear constraints on the output space of a CNN, and demonstrates the generality of this new learning framework.
Proceedings Article
Efficient Piecewise Training of Deep Structured Models for Semantic Segmentation
TL;DR: This work exploits patch-patch context between image regions as well as patch-background context, and formulates conditional random fields (CRFs) with CNN-based pairwise potential functions to capture semantic correlations between neighboring patches.
Proceedings Article
Pedestrian detection at 100 frames per second
TL;DR: A new pedestrian detector that improves over the state of the art in both speed and quality; detection speed is improved by efficiently handling different scales and transferring computation from test time to training time.