RGBD Salient Object Detection via Deep Fusion
TLDR
This paper designs a new convolutional neural network (CNN) to automatically learn the interaction mechanism for RGBD salient object detection. The method takes advantage of the knowledge obtained in traditional saliency detection by adopting various flexible and interpretable saliency feature vectors as inputs.
Abstract:
Numerous efforts have been made to design various low-level saliency cues for RGBD saliency detection, such as color and depth contrast features as well as background and color compactness priors. However, how these low-level saliency cues interact with each other and how they can be effectively incorporated to generate a master saliency map remain challenging problems. In this paper, we design a new convolutional neural network (CNN) to automatically learn the interaction mechanism for RGBD salient object detection. In contrast to existing works, in which raw image pixels are fed directly to the CNN, the proposed method takes advantage of the knowledge obtained in traditional saliency detection by adopting various flexible and interpretable saliency feature vectors as inputs. This guides the CNN to learn a combination of existing features to predict saliency more effectively, which presents a less complex problem than operating on the pixels directly. We then integrate a superpixel-based Laplacian propagation framework with the trained CNN to extract a spatially consistent saliency map by exploiting the intrinsic structure of the input image. Extensive quantitative and qualitative experimental evaluations on three data sets demonstrate that the proposed method consistently outperforms the state-of-the-art methods.
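The abstract does not spell out the propagation formulation, so the following is only a minimal sketch of one common form of graph-Laplacian smoothing over superpixels, not the authors' implementation. It assumes per-superpixel saliency scores (for example, CNN outputs) and minimizes sum_i (f_i - y_i)^2 + lam * f^T L f with a Jacobi iteration. The names `edge_weight` and `laplacian_propagation`, the parameters `lam` and `sigma`, the Gaussian affinity, and the toy four-superpixel graph are all hypothetical choices made for illustration.

```python
import math


def edge_weight(feat_a, feat_b, sigma=0.1):
    """Gaussian affinity between two superpixel feature vectors (assumed form)."""
    dist_sq = sum((a - b) ** 2 for a, b in zip(feat_a, feat_b))
    return math.exp(-dist_sq / (2.0 * sigma ** 2))


def laplacian_propagation(initial, features, edges, lam=1.0, sigma=0.1, iters=200):
    """Smooth per-superpixel saliency over an adjacency graph.

    Solves (I + lam * L) f = y, where L = D - W is the graph Laplacian,
    using Jacobi updates: f_i = (y_i + lam * sum_j w_ij f_j) / (1 + lam * d_i).
    """
    n = len(initial)
    neighbors = [[] for _ in range(n)]
    for i, j in edges:
        w = edge_weight(features[i], features[j], sigma)
        neighbors[i].append((j, w))
        neighbors[j].append((i, w))
    degree = [sum(w for _, w in nbrs) for nbrs in neighbors]

    f = list(initial)
    for _ in range(iters):
        # Each update is a weighted average of the initial score and the
        # current scores of adjacent superpixels.
        f = [
            (initial[i] + lam * sum(w * f[j] for j, w in neighbors[i]))
            / (1.0 + lam * degree[i])
            for i in range(n)
        ]
    return f


# Toy example: a chain of four superpixels. Superpixels 0-1 look alike,
# 2-3 look alike, and the 1-2 edge crosses an appearance boundary.
features = [[0.10], [0.12], [0.80], [0.82]]  # e.g. mean intensity (hypothetical)
initial = [0.1, 0.4, 0.9, 0.6]               # e.g. raw per-superpixel scores
edges = [(0, 1), (1, 2), (2, 3)]

smoothed = laplacian_propagation(initial, features, edges)
print([round(v, 3) for v in smoothed])
```

In this sketch, scores are pulled together within visually similar regions while the near-zero affinity across the 1-2 edge keeps the two regions distinct, which is the sense in which propagation can yield a spatially consistent map.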
Citations
Journal Article
Rethinking RGB-D Salient Object Detection: Models, Data Sets, and Large-Scale Benchmarks
TL;DR: It is demonstrated that D3Net can be used to efficiently extract salient object masks from real scenes, enabling an effective background-changing application at a speed of 65 frames/s on a single GPU.
Proceedings Article
Contrast Prior and Fluid Pyramid Integration for RGBD Salient Object Detection
TL;DR: The contrast prior, which used to be a dominant cue in non-deep-learning-based SOD approaches, is incorporated into a CNN-based architecture to enhance the depth information, which is then integrated with RGB features for SOD using a novel fluid pyramid integration.
Proceedings Article
Depth-Induced Multi-Scale Recurrent Attention Network for Saliency Detection
TL;DR: This work proposes a novel depth-induced multi-scale recurrent attention network for saliency detection that performs strongly, especially in complex scenarios, and boosts its performance with a novel recurrent attention module inspired by the Internal Generative Mechanism of the human brain.
Journal Article
Review of Visual Saliency Detection With Comprehensive Information
TL;DR: This paper reviews different types of saliency detection algorithms, summarizes the important issues of the existing methods, and discusses open problems and future work. Experimental analysis and discussion are conducted to provide a holistic overview of different saliency detectors.
Journal Article
Multi-modal fusion network with multi-scale multi-path and cross-modal interactions for RGB-D salient object detection
TL;DR: This work proposes a novel multi-scale multi-path fusion network with cross-modal interactions (MMCI), which advances the traditional two-stream fusion architecture with a single fusion path by diversifying the fusion path into a global reasoning path and a local capturing path, while introducing cross-modal interactions in multiple layers.
References
Proceedings Article
ImageNet Classification with Deep Convolutional Neural Networks
TL;DR: A deep convolutional neural network consisting of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax achieved state-of-the-art performance on ImageNet classification.
Journal Article
ImageNet classification with deep convolutional neural networks
TL;DR: A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes and employed a recently developed regularization method called "dropout" that proved to be very effective.
Proceedings Article
Fully convolutional networks for semantic segmentation
TL;DR: The key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning.
Journal Article
DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs
TL;DR: This work addresses the task of semantic image segmentation with deep learning. It proposes atrous spatial pyramid pooling (ASPP) to robustly segment objects at multiple scales, and improves the localization of object boundaries by combining methods from DCNNs and probabilistic graphical models.