Proceedings ArticleDOI

Deep Level Sets for Salient Object Detection

01 Jul 2017 - pp. 540-549
TL;DR: This work proposes a deep Level Set network that learns a Level Set function for salient objects, producing compact, uniform saliency maps with more accurate boundaries.
Abstract: Deep learning has been applied to saliency detection in recent years. Its superior performance has shown that deep networks can model the semantic properties of salient objects. Yet it is difficult for a deep network to discriminate pixels that fall within similar receptive fields around object boundaries, so deep networks may output maps with blurred saliency and inaccurate boundaries. To tackle this issue, we propose a deep Level Set network that produces compact and uniform saliency maps. Our method drives the network to learn a Level Set function for salient objects, so it outputs more accurate boundaries and compact saliency. In addition, to propagate saliency information among pixels and recover a full-resolution saliency map, we extend a superpixel-based guided filter into a layer of the network. The proposed network has a simple structure and is trained end-to-end. At test time, the network produces saliency maps in a single efficient feed-forward pass, running at over 12 FPS on a GPU. Evaluations on benchmark datasets show that the proposed method achieves state-of-the-art performance.
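To make the level-set idea concrete, here is a minimal PyTorch sketch, assuming a Chan-Vese-style two-region energy with an arctan-smoothed Heaviside applied to the network output. The function names and the exact energy are illustrative assumptions, not necessarily the paper's formulation.

```python
import math
import torch

def smoothed_heaviside(phi, eps=0.1):
    # Arctan approximation of the Heaviside step used in level-set
    # methods: maps the level-set output phi into (0, 1).
    return 0.5 * (1.0 + (2.0 / math.pi) * torch.atan(phi / eps))

def level_set_loss(phi, gt):
    # Hypothetical training loss, NOT the paper's exact objective.
    # phi: network output interpreted as a level-set function, (B,1,H,W).
    # gt:  binary ground-truth saliency mask in {0,1}, same shape.
    h = smoothed_heaviside(phi)
    # Chan-Vese-style two-region energy with the ground truth playing the
    # role of the image: c1/c2 are mean gt values inside/outside the contour.
    c1 = (gt * h).sum() / (h.sum() + 1e-8)
    c2 = (gt * (1.0 - h)).sum() / ((1.0 - h).sum() + 1e-8)
    energy = (h * (gt - c1) ** 2).sum() + ((1.0 - h) * (gt - c2) ** 2).sum()
    return energy / gt.numel()
```

Minimizing such an energy pushes the zero level set of phi toward the boundary between salient and non-salient regions, which is what gives the compact, sharp-boundary behavior the abstract describes.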


Citations
Proceedings ArticleDOI
01 Jun 2019
TL;DR: Experimental results on six public datasets show that the proposed predict-refine architecture, BASNet, outperforms the state-of-the-art methods both in terms of regional and boundary evaluation measures.
Abstract: Deep Convolutional Neural Networks have been adopted for salient object detection and have achieved state-of-the-art performance. Most previous works, however, focus on region accuracy rather than boundary quality. In this paper, we propose a predict-refine architecture, BASNet, and a new hybrid loss for Boundary-Aware Salient object detection. Specifically, the architecture is composed of a densely supervised Encoder-Decoder network and a residual refinement module, which are respectively in charge of saliency prediction and saliency map refinement. The hybrid loss guides the network to learn the transformation between the input image and the ground truth in a three-level hierarchy -- pixel-, patch- and map-level -- by fusing Binary Cross Entropy (BCE), Structural SIMilarity (SSIM) and Intersection-over-Union (IoU) losses. Equipped with the hybrid loss, the proposed predict-refine architecture is able to effectively segment the salient object regions and accurately predict the fine structures with clear boundaries. Experimental results on six public datasets show that our method outperforms the state-of-the-art methods in terms of both regional and boundary evaluation measures. Our method runs at over 25 fps on a single GPU. The code is available at: https://github.com/NathanUA/BASNet.
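The hybrid loss is concrete enough to sketch. Below is a minimal PyTorch version that fuses the BCE, SSIM and IoU terms; the SSIM window here is a box filter via average pooling rather than the usual Gaussian, so treat it as an approximation of the paper's loss, not the reference implementation.

```python
import torch
import torch.nn.functional as F

def iou_loss(pred, gt, eps=1e-7):
    # Map-level term: 1 - soft intersection-over-union.
    inter = (pred * gt).sum(dim=(2, 3))
    union = pred.sum(dim=(2, 3)) + gt.sum(dim=(2, 3)) - inter
    return (1.0 - (inter + eps) / (union + eps)).mean()

def ssim_loss(pred, gt, window=11, C1=0.01 ** 2, C2=0.03 ** 2):
    # Patch-level term: 1 - SSIM, with average pooling as a box-filter
    # window (simplification; the usual choice is a Gaussian window).
    pad = window // 2
    mu_p = F.avg_pool2d(pred, window, 1, pad)
    mu_g = F.avg_pool2d(gt, window, 1, pad)
    sigma_p = F.avg_pool2d(pred * pred, window, 1, pad) - mu_p ** 2
    sigma_g = F.avg_pool2d(gt * gt, window, 1, pad) - mu_g ** 2
    sigma_pg = F.avg_pool2d(pred * gt, window, 1, pad) - mu_p * mu_g
    ssim = ((2 * mu_p * mu_g + C1) * (2 * sigma_pg + C2)) / (
        (mu_p ** 2 + mu_g ** 2 + C1) * (sigma_p + sigma_g + C2))
    return (1.0 - ssim).mean()

def hybrid_loss(pred, gt):
    # pred: sigmoid probability map in [0,1], (B,1,H,W); gt: binary mask.
    # Pixel-, patch- and map-level terms fused with equal weights.
    return F.binary_cross_entropy(pred, gt) + ssim_loss(pred, gt) + iou_loss(pred, gt)
```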

962 citations


Cites background from "Deep Level Sets for Salient Object ..."

  • ...[18] proposed to learn a Level Set [48] function to output accurate boundaries and compact saliency....


Proceedings ArticleDOI
01 Oct 2019
TL;DR: An edge guidance network (EGNet) for salient object detection models salient edge information and salient object information simultaneously in a single network, in three steps, which helps locate salient objects, and especially their boundaries, more accurately.
Abstract: Fully convolutional neural networks (FCNs) have shown their advantages in the salient object detection task. However, most existing FCN-based methods still suffer from coarse object boundaries. In this paper, to solve this problem, we focus on the complementarity between salient edge information and salient object information. Accordingly, we present an edge guidance network (EGNet) for salient object detection with three steps to simultaneously model these two kinds of complementary information in a single network. In the first step, we extract the salient object features in a progressive fusion manner. In the second step, we integrate the local edge information and global location information to obtain the salient edge features. Finally, to sufficiently leverage these complementary features, we couple the same salient edge features with salient object features at various resolutions. Benefiting from the rich edge and location information in the salient edge features, the fused features help locate salient objects, and especially their boundaries, more accurately. Experimental results demonstrate that the proposed method performs favorably against state-of-the-art methods on six widely used datasets without any pre-processing or post-processing. The source code is available at http://mmcheng.net/egnet/.
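A minimal sketch of the coupling step (the third step) as read from the abstract: shared salient-edge features are resized to each resolution and fused with the salient-object features. The module and parameter names are hypothetical, not EGNet's actual code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EdgeGuidedFusion(nn.Module):
    # Hypothetical sketch: couple salient-edge features with salient-object
    # features at one resolution by concatenation + convolution.
    def __init__(self, obj_channels, edge_channels, out_channels):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(obj_channels + edge_channels, out_channels, 3, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, obj_feat, edge_feat):
        # Resize the shared edge features to this object-feature resolution,
        # then fuse; the same edge features are coupled at every scale.
        edge_feat = F.interpolate(edge_feat, size=obj_feat.shape[2:],
                                  mode='bilinear', align_corners=False)
        return self.fuse(torch.cat([obj_feat, edge_feat], dim=1))
```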

803 citations

Proceedings ArticleDOI
18 Jun 2018
TL;DR: A pixel-wise contextual attention network (PiCANet) learns to selectively attend to informative context locations for each pixel, generating an attention map in which each attention weight corresponds to the contextual relevance at each context location.
Abstract: Contexts play an important role in the saliency detection task. However, given a context region, not all contextual information is helpful for the final task. In this paper, we propose a novel pixel-wise contextual attention network, i.e., the PiCANet, to learn to selectively attend to informative context locations for each pixel. Specifically, for each pixel, it can generate an attention map in which each attention weight corresponds to the contextual relevance at each context location. An attended contextual feature can then be constructed by selectively aggregating the contextual information. We formulate the proposed PiCANet in both global and local forms to attend to global and local contexts, respectively. Both models are fully differentiable and can be embedded into CNNs for joint training. We also incorporate the proposed models with the U-Net architecture to detect salient objects. Extensive experiments show that the proposed PiCANets can consistently improve saliency detection performance. The global and local PiCANets facilitate learning global contrast and homogeneousness, respectively. As a result, our saliency model can detect salient objects more accurately and uniformly, thus performing favorably against the state-of-the-art methods.
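To make the mechanism concrete, here is a sketch of a global pixel-wise attention layer: every pixel produces a weight over all spatial locations and aggregates their features by its own weights. Note that the paper generates attention with recurrent (ReNet-style) layers; this sketch substitutes a plain dot-product formulation for brevity, so it is an assumption-laden stand-in rather than PiCANet's actual module.

```python
import torch
import torch.nn as nn

class GlobalPixelAttention(nn.Module):
    # Each pixel gets its own attention map over all HW context locations
    # and aggregates the context features accordingly.
    def __init__(self, channels):
        super().__init__()
        self.query = nn.Conv2d(channels, channels, 1)
        self.key = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)    # (B, HW, C)
        k = self.key(x).flatten(2)                      # (B, C, HW)
        attn = torch.softmax(q @ k / c ** 0.5, dim=-1)  # (B, HW, HW): per-pixel maps
        v = x.flatten(2).transpose(1, 2)                # (B, HW, C)
        out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        return out                                      # attended contextual feature
```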

631 citations

Proceedings ArticleDOI
15 Jun 2019
TL;DR: Exhaustive experiments confirm that the proposed pyramid attention and salient edges are effective for salient object detection, and the deep saliency model outperforms state-of-the-art approaches on several benchmarks with a fast processing speed (25 fps on one GPU).
Abstract: This paper presents a new method for detecting salient objects in images using convolutional neural networks (CNNs). The proposed network, named PAGE-Net, offers two key contributions. The first is the exploitation of an essential pyramid attention structure for salient object detection. This enables the network to concentrate more on salient regions while considering multi-scale saliency information. Such a stacked attention design provides a powerful tool to efficiently improve the representation ability of the corresponding network layer with an enlarged receptive field. The second contribution lies in the emphasis on the importance of salient edges. Salient edge information offers a strong cue to better segment salient objects and refine object boundaries. To this end, our model is equipped with a salient edge detection module, which is learned for precise salient boundary estimation. This encourages better edge-preserving salient object segmentation. Exhaustive experiments confirm that the proposed pyramid attention and salient edges are effective for salient object detection. We show that our deep saliency model outperforms state-of-the-art approaches on several benchmarks with a fast processing speed (25 fps on one GPU).
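A schematic of what stacked pyramid attention could look like: attention maps estimated at several pooled scales are upsampled and applied to the features, enlarging the effective receptive field while keeping multi-scale saliency information. This is a sketch under my own assumptions, not PAGE-Net's actual module.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PyramidAttention(nn.Module):
    # Hypothetical sketch: per-scale attention maps, stacked residually.
    def __init__(self, channels, scales=(1, 2, 4)):
        super().__init__()
        self.scales = scales
        self.attend = nn.ModuleList(nn.Conv2d(channels, 1, 1) for _ in scales)

    def forward(self, x):
        out = x
        for s, conv in zip(self.scales, self.attend):
            feat = F.avg_pool2d(x, s) if s > 1 else x   # coarser context at scale s
            attn = torch.sigmoid(conv(feat))
            attn = F.interpolate(attn, size=x.shape[2:], mode='bilinear',
                                 align_corners=False)
            out = out * (1.0 + attn)                    # residual-style attention stacking
        return out
```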

464 citations


Cites methods from "Deep Level Sets for Salient Object ..."

  • ...We compare the proposed PAGE-Net against 19 recent deep learning based alternatives: MDF [21], LEGS [34], DS [24], DCL [22], ELD [20], MC [57], RFCN [36], DHS [26], HEDS [14], KSR [38], NLDF [29], DLS [15], AMU [54], UCF [55], SRM [37], FSN [8], PAGR [56], RAS [7] and C2S [23]. We use either the implementations with the recommended parameter settings or the saliency maps shared by the authors....


  • ...Fig. 7 shows a visual comparison of the results of our method against those of five other top-performing methods. Table 2: Runtime comparison (GPU time, in seconds) with previous deep learning based saliency models:

        LEGS [34]: 1.54    MDF [21]: 7.83    DS [24]: 0.13     DCL [22]: 0.39    ELD [20]: 0.55
        RFCN [36]: 4.65    DHS [26]: 0.04    HEDS [14]: 0.57   KSR [38]: 49.64   NLDF [29]: 0.09
        DLS [15]: 0.08     AMU [54]: 0.07    UCF [55]: 0.04    SRM [37]: 0.07    PAGE-Net: 0.04




  • ...For example, some methods integrate deep learning models with hand-crafted features [20], heuristic saliency priors [36], level set [15], contextual information [57], or explicit visual fixation [40]....


Book ChapterDOI
08 Sep 2018
TL;DR: An accurate yet compact deep network for efficient salient object detection employs residual learning to learn side-output residual features for saliency refinement, which can be achieved with very limited convolutional parameters while keeping accuracy.
Abstract: Benefiting from the rapid development of deep learning techniques, salient object detection has achieved remarkable progress recently. However, two major challenges still hinder its application in embedded devices: low-resolution output and heavy model weight. To this end, this paper presents an accurate yet compact deep network for efficient salient object detection. More specifically, given a coarse saliency prediction in the deepest layer, we first employ residual learning to learn side-output residual features for saliency refinement, which can be achieved with very limited convolutional parameters while keeping accuracy. Secondly, we further propose reverse attention to guide such side-output residual learning in a top-down manner. By erasing the current predicted salient regions from side-output features, the network can eventually explore the missing object parts and details, which results in high resolution and accuracy. Experiments on six benchmark datasets demonstrate that the proposed approach compares favorably against state-of-the-art methods, with advantages in terms of simplicity, efficiency (45 FPS) and model size (81 MB).
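The erasing step has a simple form: weight the side-output features by the complement of the upsampled coarse prediction, so the refinement branch focuses on what is still missing. A minimal sketch (tensor names are illustrative):

```python
import torch
import torch.nn.functional as F

def reverse_attention(side_feat, coarse_pred):
    # side_feat:   side-output features at some resolution, (B,C,H,W).
    # coarse_pred: saliency logits from a deeper / coarser stage, (B,1,h,w).
    pred = F.interpolate(coarse_pred, size=side_feat.shape[2:],
                         mode='bilinear', align_corners=False)
    attn = 1.0 - torch.sigmoid(pred)   # reverse attention: erase predicted regions
    return side_feat * attn            # residual branch sees only the remainder
```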

448 citations


Cites background or methods from "Deep Level Sets for Salient Object ..."

  • ...We compare the proposed method with 10 state-of-the-art ones, including 9 recent CNN-based approaches, DCL+ [22], DHS [26], SSD [16], RFCN [39], DLS [10], NLDF [30], DSS and DSS+ [8], Amulet [45], UCF [46], and one conventional top approach, DRFI [13], where symbol “+” indicates that the network includes CRF-based post-processing....



  • ...Recently, dilated convolution [23] and dense connections [17] have further been incorporated to obtain high-resolution saliency maps....


  • ...[23] extended a superpixel-based guided filter to be a layer in the network for boundary refinement....


  • ..., superpixel-based filter [23], fully connected conditional random field (CRF) [8,11,24]....


References
Proceedings ArticleDOI
27 Jun 2016
TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won the 1st place on the ILSVRC 2015 classification task.
Abstract: Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers—8× deeper than VGG nets [40] but still having lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers. The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions1, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.
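For reference, the reformulation amounts to blocks that compute F(x) + x. A basic residual block in the identity-shortcut case:

```python
import torch.nn as nn

class ResidualBlock(nn.Module):
    # The stacked layers learn a residual F(x); the block outputs F(x) + x,
    # so optimization only has to model a correction to the identity.
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.body(x) + x)   # identity shortcut
```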

123,388 citations

Proceedings Article
01 Jan 2015
TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
Abstract: We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. The method is straightforward to implement, is computationally efficient, has little memory requirements, is invariant to diagonal rescaling of the gradients, and is well suited for problems that are large in terms of data and/or parameters. The method is also appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients. The hyper-parameters have intuitive interpretations and typically require little tuning. Some connections to related algorithms, on which Adam was inspired, are discussed. We also analyze the theoretical convergence properties of the algorithm and provide a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework. Empirical results demonstrate that Adam works well in practice and compares favorably to other stochastic optimization methods. Finally, we discuss AdaMax, a variant of Adam based on the infinity norm.
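The update rule is compact enough to restate in code. A single-parameter sketch of one Adam step with the paper's default betas and epsilon (the 1e-4 learning rate simply echoes the setting quoted in the citing snippet below):

```python
import math

def adam_step(theta, grad, m, v, t, lr=1e-4,
              beta1=0.9, beta2=0.999, eps=1e-8):
    # One Adam update for a scalar parameter theta at timestep t (1-based):
    # exponential moving averages of the gradient and squared gradient,
    # bias correction, then the scaled parameter update.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    m_hat = m / (1 - beta1 ** t)      # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)      # bias-corrected second moment
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v
```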

111,197 citations


"Deep Level Sets for Salient Object ..." refers methods in this paper

  • ...We use Adam [23] with an initial learning rate of 1e-4 to update the weights....


Proceedings Article
01 Jan 2015
TL;DR: In this paper, the authors investigated the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting and showed that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 layers.
Abstract: In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evaluation of networks of increasing depth using an architecture with very small (3x3) convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers. These findings were the basis of our ImageNet Challenge 2014 submission, where our team secured the first and the second places in the localisation and classification tracks respectively. We also show that our representations generalise well to other datasets, where they achieve state-of-the-art results. We have made our two best-performing ConvNet models publicly available to facilitate further research on the use of deep visual representations in computer vision.
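The configuration is easy to sketch: VGG stacks very small 3x3 convolutions between 2x2 max-pooling stages; two stacked 3x3 convolutions cover a 5x5 receptive field with fewer parameters and an extra non-linearity. A representative block (channel counts and depth are left to the caller):

```python
import torch.nn as nn

def vgg_block(in_ch, out_ch, num_convs):
    # A VGG-style stage: num_convs 3x3 conv + ReLU pairs, then 2x2 max-pool.
    layers = []
    for i in range(num_convs):
        layers += [nn.Conv2d(in_ch if i == 0 else out_ch, out_ch, 3, padding=1),
                   nn.ReLU(inplace=True)]
    layers.append(nn.MaxPool2d(2))
    return nn.Sequential(*layers)
```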

49,914 citations

Proceedings ArticleDOI
21 Jul 2017
TL;DR: DenseNet as mentioned in this paper proposes to connect each layer to every other layer in a feed-forward fashion, which can alleviate the vanishing gradient problem, strengthen feature propagation, encourage feature reuse, and substantially reduce the number of parameters.
Abstract: Recent work has shown that convolutional networks can be substantially deeper, more accurate, and efficient to train if they contain shorter connections between layers close to the input and those close to the output. In this paper, we embrace this observation and introduce the Dense Convolutional Network (DenseNet), which connects each layer to every other layer in a feed-forward fashion. Whereas traditional convolutional networks with L layers have L connections—one between each layer and its subsequent layer—our network has L(L+1)/2 direct connections. For each layer, the feature-maps of all preceding layers are used as inputs, and its own feature-maps are used as inputs into all subsequent layers. DenseNets have several compelling advantages: they alleviate the vanishing-gradient problem, strengthen feature propagation, encourage feature reuse, and substantially reduce the number of parameters. We evaluate our proposed architecture on four highly competitive object recognition benchmark tasks (CIFAR-10, CIFAR-100, SVHN, and ImageNet). DenseNets obtain significant improvements over the state-of-the-art on most of them, whilst requiring less memory and computation to achieve high performance. Code and pre-trained models are available at https://github.com/liuzhuang13/DenseNet.
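The connectivity pattern can be sketched directly: layer i consumes the concatenation of the block input and all earlier layers' outputs, giving the L(L+1)/2 direct connections the abstract counts. A minimal dense block (omitting the paper's bottleneck and transition layers):

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    # Each layer sees the concatenated feature maps of all preceding layers
    # and contributes growth_rate new channels.
    def __init__(self, in_channels, growth_rate, num_layers):
        super().__init__()
        self.layers = nn.ModuleList()
        for i in range(num_layers):
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(in_channels + i * growth_rate),
                nn.ReLU(inplace=True),
                nn.Conv2d(in_channels + i * growth_rate, growth_rate,
                          3, padding=1, bias=False),
            ))

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            features.append(layer(torch.cat(features, dim=1)))
        return torch.cat(features, dim=1)
```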

27,821 citations

Book
01 Jan 1998
TL;DR: This book presents representation formulas for solutions and a theory for linear and nonlinear PDEs: Sobolev spaces, second-order elliptic equations, linear evolution equations, the calculus of variations, Hamilton-Jacobi equations, and systems of conservation laws.
Abstract: Introduction. Part I, Representation formulas for solutions: four important linear partial differential equations; nonlinear first-order PDE; other ways to represent solutions. Part II, Theory for linear partial differential equations: Sobolev spaces; second-order elliptic equations; linear evolution equations. Part III, Theory for nonlinear partial differential equations: the calculus of variations; nonvariational techniques; Hamilton-Jacobi equations; systems of conservation laws. Appendices. Bibliography. Index.

25,734 citations