Confidence-and-Refinement Adaptation Model for Cross-Domain Semantic Segmentation
TLDR
A novel multi-level UDA model named the Confidence-and-Refinement Adaptation Model (CRAM) combines a confidence-aware entropy alignment (CEA) module with a style feature alignment (SFA) module, and achieves performance comparable to existing state-of-the-art works with advantages in simplicity and convergence speed.
Abstract
With the rapid development of convolutional neural networks (CNNs), significant progress has been achieved in semantic segmentation. Despite this success, such deep learning approaches require large-scale real-world datasets with pixel-level annotations. Because pixel-level labeling of semantics is extremely laborious, many researchers turn to synthetic data with free annotations. However, due to the clear domain gap, a segmentation model trained on synthetic images tends to perform poorly on real-world datasets. Unsupervised domain adaptation (UDA) for semantic segmentation, which aims at alleviating this domain discrepancy, has recently gained increasing research attention. Existing methods in this scope either simply align features or outputs across the source and target domains, or must deal with complex image processing and post-processing problems. In this work, we propose a novel multi-level UDA model named the Confidence-and-Refinement Adaptation Model (CRAM), which contains a confidence-aware entropy alignment (CEA) module and a style feature alignment (SFA) module. Through CEA, adaptation is performed locally via adversarial learning in the output space, making the segmentation model focus on high-confidence predictions. Furthermore, to enhance model transfer in the shallow feature space, the SFA module is applied to minimize the appearance gap across domains. Experiments on two challenging UDA benchmarks, “GTA5-to-Cityscapes” and “SYNTHIA-to-Cityscapes”, demonstrate the effectiveness of CRAM. We achieve performance comparable to existing state-of-the-art works, with advantages in simplicity and convergence speed.
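The CEA idea rests on per-pixel prediction entropy in the output space: low-entropy pixels are treated as high-confidence. As a rough illustration only (a generic sketch of entropy-based output-space adaptation, not the authors' CEA implementation; the 19-class shape and the 0.9 threshold below are illustrative assumptions), the following NumPy snippet computes a normalized entropy map and a confidence mask:

```python
import numpy as np

def entropy_map(logits):
    """Per-pixel normalized Shannon entropy of softmax predictions.

    logits: (C, H, W) array of raw class scores.
    Returns an (H, W) map in [0, 1]; high values mark uncertain pixels.
    """
    z = logits - logits.max(axis=0, keepdims=True)      # numerical stability
    p = np.exp(z) / np.exp(z).sum(axis=0, keepdims=True)
    ent = -(p * np.log(p + 1e-12)).sum(axis=0)          # Shannon entropy per pixel
    return ent / np.log(logits.shape[0])                # normalize by log C

# A confidence mask keeps only the low-entropy (high-confidence) pixels:
logits = np.random.randn(19, 4, 4)      # 19 classes, as in Cityscapes
conf_mask = entropy_map(logits) < 0.9   # hypothetical confidence threshold
```

In an adversarial setup, such a map (or mask) would weight where the discriminator's alignment signal is applied, so the model attends to confident regions first.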
Citations
Journal Article (DOI)
Threshold-Adaptive Unsupervised Focal Loss for Domain Adaptation of Semantic Segmentation
TL;DR: A two-stage entropy-based UDA method for semantic segmentation: the first stage introduces a threshold-adaptive unsupervised focal loss to regularize predictions in the target domain, and the second stage employs a data augmentation method named cross-domain image mixing (CIM) to bridge semantic knowledge between the two domains.
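The general idea of an unsupervised focal loss can be sketched as follows: pixels whose top softmax probability clears a threshold are treated as pseudo-labels, and a focal factor down-weights pixels the model already finds easy. The fixed `threshold` and `gamma` below are illustrative placeholders, not the paper's adaptive schedule:

```python
import numpy as np

def unsupervised_focal_loss(probs, threshold=0.8, gamma=2.0):
    """Focal-style self-training loss on confident target pixels.

    probs: (C, H, W) softmax outputs on an unlabeled target image.
    Pixels whose max probability exceeds `threshold` act as pseudo-labels;
    the (1 - p)^gamma factor down-weights already-confident pixels.
    """
    p_max = probs.max(axis=0)              # top class probability per pixel
    mask = p_max > threshold               # confident pixels only
    if not mask.any():
        return 0.0
    p = p_max[mask]
    loss = -((1.0 - p) ** gamma) * np.log(p)
    return float(loss.mean())
```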
Journal Article (DOI)
Combining Pixel-Level and Structure-Level Adaptation for Semantic Segmentation
Journal Article (DOI)
Category-Level Adversaries for Outdoor LiDAR Point Clouds Cross-Domain Semantic Segmentation
TL;DR: In this article, a multi-scale domain-conditioned block is proposed to extract critical low-level domain-dependent knowledge and reduce the domain gap caused by distinct LiDAR sampling patterns.
References
Proceedings Article (DOI)
Deep Residual Learning for Image Recognition
TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously; this approach won first place in the ILSVRC 2015 classification task.
Proceedings Article (DOI)
ImageNet: A large-scale hierarchical image database
TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, which is much larger in scale and diversity, and more accurate, than current image datasets.
Proceedings Article (DOI)
Fully convolutional networks for semantic segmentation
TL;DR: The key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning.
Journal Article (DOI)
SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation
TL;DR: Quantitative assessments show that SegNet provides good performance with competitive inference time and the most efficient inference memory usage compared to other architectures, including FCN and DeconvNet.
Journal Article (DOI)
DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs
TL;DR: This work addresses semantic image segmentation with deep learning, proposes atrous spatial pyramid pooling (ASPP) to robustly segment objects at multiple scales, and improves the localization of object boundaries by combining methods from DCNNs and probabilistic graphical models.
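Atrous (dilated) convolution, the building block of ASPP, enlarges the receptive field by sampling the input with gaps, without adding parameters. A minimal 1-D NumPy analogue (illustrative only, not DeepLab's 2-D implementation):

```python
import numpy as np

def dilated_conv1d(x, w, rate):
    """1-D convolution with dilation `rate`: taps of w are spaced
    `rate` samples apart, giving an effective receptive field of
    (len(w) - 1) * rate + 1 without extra weights."""
    k = len(w)
    span = (k - 1) * rate + 1                 # effective receptive field
    out = np.empty(len(x) - span + 1)
    for i in range(len(out)):
        out[i] = sum(w[j] * x[i + j * rate] for j in range(k))
    return out
```

ASPP applies several such convolutions in parallel with different rates and fuses the results, so one layer sees objects at multiple scales.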