Open Access | Proceedings Article

Multi-Scale Interactive Network for Salient Object Detection

TLDR
Aggregate interaction modules integrate features from adjacent levels, introducing less noise because only small up-/down-sampling rates are used, and a consistency-enhanced loss highlights the fore-/background difference and preserves intra-class consistency.
Abstract
Deep-learning-based salient object detection methods have made great progress. However, the variable scale and unknown category of salient objects remain persistent challenges, and both are closely tied to how multi-level and multi-scale features are used. In this paper, we propose aggregate interaction modules to integrate features from adjacent levels; because only small up-/down-sampling rates are used, less noise is introduced. To obtain more efficient multi-scale features from the integrated features, self-interaction modules are embedded in each decoder unit. In addition, the class imbalance caused by scale variation weakens the effect of the binary cross-entropy loss and leads to spatially inconsistent predictions. We therefore use a consistency-enhanced loss to highlight the fore-/background difference and preserve intra-class consistency. Experimental results on five benchmark datasets demonstrate that the proposed method, without any post-processing, performs favorably against 23 state-of-the-art approaches. The source code will be publicly available at https://github.com/lartpang/MINet.
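
The class-imbalance argument in the abstract can be illustrated with a small numerical sketch. The abstract does not state the formula of the consistency-enhanced loss, so the region-level ratio below (soft false positives plus soft false negatives, divided by the sum of prediction and ground truth) is an assumption chosen for illustration, and the function names `bce_loss` and `consistency_enhanced_loss` are hypothetical. The example uses NumPy on a toy 10x10 map with a single salient pixel.

```python
import numpy as np


def bce_loss(prob, target, eps=1e-7):
    """Mean per-pixel binary cross entropy between probabilities and a 0/1 mask."""
    prob = np.clip(prob, eps, 1.0 - eps)  # avoid log(0)
    per_pixel = target * np.log(prob) + (1.0 - target) * np.log(1.0 - prob)
    return float(-np.mean(per_pixel))


def consistency_enhanced_loss(prob, target, eps=1e-6):
    """Assumed region-level loss: (soft FP + soft FN) / (sum(prob) + sum(target)).

    This form is an illustrative assumption, not a definition taken from the abstract.
    """
    intersection = prob * target
    soft_false_pos = np.sum(prob - intersection)    # predicted mass outside the object
    soft_false_neg = np.sum(target - intersection)  # object mass that was missed
    return float((soft_false_pos + soft_false_neg) / (np.sum(prob) + np.sum(target) + eps))


# Toy ground truth: one salient pixel in a 10x10 map (heavy class imbalance).
target = np.zeros((10, 10))
target[4, 4] = 1.0

# A prediction that misses the object entirely, and a perfect prediction.
all_background = np.full((10, 10), 0.01)
perfect = target.copy()

print("BCE, object missed:", bce_loss(all_background, target))
print("CEL, object missed:", consistency_enhanced_loss(all_background, target))
print("CEL, perfect      :", consistency_enhanced_loss(perfect, target))
```

Under these assumptions, the mean cross entropy stays small when the tiny object is missed, because the many correct background pixels dominate the average, whereas the region-level ratio stays close to its maximum. This is one way to read the abstract's claim that scale-induced class imbalance weakens the binary cross-entropy loss.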


Citations
Book Chapter

Suppress and Balance: A Simple Gated Network for Salient Object Detection

TL;DR: This paper proposes a simple gated network (GateNet) to solve two key problems that arise when the encoder exchanges information with the decoder: the lack of interference control between them, and the failure to account for the differing contributions of different encoder blocks.
Posted Content

Suppress and Balance: A Simple Gated Network for Salient Object Detection

TL;DR: A novel gated dual-branch structure is designed to build cooperation among different levels of features and improve the discriminability of the whole network, and atrous spatial pyramid pooling based on the proposed "Fold" operation (Fold-ASPP) is adopted to accurately localize salient objects of various scales.
Proceedings Article

Camouflaged Object Segmentation with Distraction Mining

TL;DR: This paper develops a bio-inspired framework, termed Positioning and Focus Network (PFNet), which mimics the process of predation in nature and contains two key modules: the positioning module (PM) and the focus module (FM).
Book Chapter

Hierarchical Dynamic Filtering Network for RGB-D Salient Object Detection

TL;DR: This paper integrates the features of different modalities through densely connected structures, uses the mixed features to generate dynamic filters with receptive fields of different sizes, and designs a hybrid enhanced loss function to further optimize the results.
Book Chapter

A Single Stream Network for Robust and Real-Time RGB-D Salient Object Detection

TL;DR: This paper designs a single-stream network that directly uses the depth map to guide early and middle fusion between RGB and depth, which removes the need for a separate feature encoder for the depth stream and yields a lightweight, real-time model.
References
Proceedings Article

Deep Residual Learning for Image Recognition

TL;DR: This paper proposes a residual learning framework to ease the training of networks that are substantially deeper than those used previously; an ensemble of these residual networks won 1st place on the ILSVRC 2015 classification task.
Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
Proceedings Article

Fully convolutional networks for semantic segmentation

TL;DR: The key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning.
Proceedings Article

Feature Pyramid Networks for Object Detection

TL;DR: This paper exploits the inherent multi-scale, pyramidal hierarchy of deep convolutional networks to construct feature pyramids with marginal extra cost and achieves state-of-the-art single-model results on the COCO detection benchmark without bells and whistles.