Proceedings ArticleDOI

RDFNet: RGB-D Multi-level Residual Feature Fusion for Indoor Semantic Segmentation

TL;DR
This paper presents a novel network that extends the core idea of residual learning to RGB-D semantic segmentation by including multi-modal feature fusion blocks and multi-level feature refinement blocks, and achieves state-of-the-art accuracy on two challenging RGB-D indoor datasets, NYUDv2 and SUN RGB-D.
Abstract
In multi-class indoor semantic segmentation using RGB-D data, it has been shown that incorporating depth features into RGB features helps improve segmentation accuracy. However, previous studies have not fully exploited the potential of multi-modal feature fusion, e.g., simply concatenating RGB and depth features or averaging RGB and depth score maps. To learn the optimal fusion of multi-modal features, this paper presents a novel network that extends the core idea of residual learning to RGB-D semantic segmentation. Our network effectively captures multi-level RGB-D CNN features by including multi-modal feature fusion blocks and multi-level feature refinement blocks. Feature fusion blocks learn residual RGB and depth features and their combinations to fully exploit the complementary characteristics of RGB and depth data. Feature refinement blocks learn the combination of fused features from multiple levels to enable high-resolution prediction. Our network can efficiently train discriminative multi-level features from each modality end-to-end by taking full advantage of skip-connections. Our comprehensive experiments demonstrate that the proposed architecture achieves state-of-the-art accuracy on two challenging RGB-D indoor datasets, NYUDv2 and SUN RGB-D.
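To make the residual-fusion idea concrete, here is a minimal NumPy sketch of a fusion block in the spirit the abstract describes: each modality passes through its own residual unit, and the refined depth features are added to the refined RGB features as a residual correction. All function names, weight shapes, and the use of 1x1 channel mixes in place of full convolutions are simplifying assumptions for illustration, not the paper's actual implementation.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_unit(feat, w1, w2):
    # Two 1x1 "convolutions" (per-pixel channel mixes) with a
    # skip-connection, mirroring the residual-learning building block.
    out = relu(np.einsum('chw,cd->dhw', feat, w1))   # channel mix + ReLU
    out = np.einsum('dhw,dc->chw', out, w2)          # channel mix back
    return relu(feat + out)                          # residual addition

def fuse_rgbd(rgb_feat, depth_feat, params):
    # Hypothetical fusion-block sketch: refine each modality with its own
    # residual unit, then sum so the depth stream contributes a residual
    # correction to the RGB stream.
    r = residual_unit(rgb_feat, params['w_r1'], params['w_r2'])
    d = residual_unit(depth_feat, params['w_d1'], params['w_d2'])
    return relu(r + d)

# Toy example: C=4 channels on an 8x8 spatial map.
rng = np.random.default_rng(0)
C, H, W = 4, 8, 8
params = {k: rng.standard_normal((C, C)) * 0.1
          for k in ('w_r1', 'w_r2', 'w_d1', 'w_d2')}
fused = fuse_rgbd(rng.standard_normal((C, H, W)),
                  rng.standard_normal((C, H, W)),
                  params)
print(fused.shape)  # (4, 8, 8)
```

The fused map keeps the input resolution and channel count, which is what lets such blocks be inserted at multiple levels of the backbone and later combined by the refinement blocks.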


Citations
Posted Content

Image Segmentation Using Deep Learning: A Survey

TL;DR: A comprehensive review of recent pioneering efforts in semantic and instance segmentation is provided, covering convolutional pixel-labeling networks, encoder-decoder architectures, multi-scale and pyramid-based approaches, recurrent networks, visual attention models, and generative models in adversarial settings.
Journal ArticleDOI

Beyond RGB: Very high resolution urban remote sensing with multimodal deep networks

TL;DR: In this paper, the authors investigate various methods to deal with semantic labeling of very high-resolution multi-modal remote sensing data and propose an efficient multi-scale approach to leverage both a large spatial context and the high resolution data, and investigate early and late fusion of Lidar and multispectral data.
Journal ArticleDOI

Survey on semantic segmentation using deep learning techniques

TL;DR: A survey of semantic segmentation methods is presented, categorizing them into ten classes according to the common concepts underlying their architectures and providing an overview of the publicly available datasets on which they have been assessed.
Proceedings ArticleDOI

Progressively Complementarity-Aware Fusion Network for RGB-D Salient Object Detection

TL;DR: A novel complementarity-aware fusion (CA-Fuse) module is proposed within a convolutional neural network (CNN); the resulting RGB-D fusion network disambiguates both cross-modal and cross-level fusion processes and enables more complete fusion results.
References
Proceedings ArticleDOI

Deep Residual Learning for Image Recognition

TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won the 1st place on the ILSVRC 2015 classification task.
Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

TL;DR: State-of-the-art image classification performance was achieved with a deep convolutional neural network consisting of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax, as discussed by the authors.
Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
Book ChapterDOI

U-Net: Convolutional Networks for Biomedical Image Segmentation

TL;DR: Ronneberger et al. propose a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently; the network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks.