Book Chapter•DOI•

Visual Saliency Detection via Convolutional Gated Recurrent Units

TL;DR: This work proposes a novel end-to-end framework with a Contextual Unit (CTU) module that models scene contextual information to produce efficient saliency maps with the help of a Convolutional GRU (Conv-GRU).
Abstract: Context is an important aspect of accurate saliency detection. However, how to formally model image context within saliency detection frameworks is still an open problem. Recent saliency detection models use complex Deep Neural Networks to extract robust features, yet they often fail to select the right contextual features. These methods generally utilize physical attributes of objects to generate the final saliency maps but ignore scene contextual information. In this paper, we overcome this limitation using (i) a novel end-to-end framework with a Contextual Unit (CTU) module that models scene contextual information to produce efficient saliency maps with the help of a Convolutional GRU (Conv-GRU). This is the first work reported so far that utilizes a Conv-GRU to generate image saliency maps. In addition, (ii) we propose a novel way of using the Conv-GRU that helps refine saliency maps based on the input image context. The proposed model has been evaluated on challenging benchmark saliency datasets, where it outperforms prominent state-of-the-art methods.
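The abstract names a Convolutional GRU as the core recurrent component. As a point of reference only, here is a minimal PyTorch sketch of a generic Conv-GRU cell that refines a feature map over a few recurrent steps; the class name ConvGRUCell, the channel sizes, and the three-step refinement loop are illustrative assumptions, not the authors' CTU implementation.

```python
import torch
import torch.nn as nn

class ConvGRUCell(nn.Module):
    """A minimal convolutional GRU cell: the GRU gates are computed with
    2-D convolutions so the hidden state keeps its spatial layout."""
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        p = k // 2
        # update (z) and reset (r) gates from the concatenated input/state
        self.gates = nn.Conv2d(in_ch + hid_ch, 2 * hid_ch, k, padding=p)
        # candidate hidden state
        self.cand = nn.Conv2d(in_ch + hid_ch, hid_ch, k, padding=p)

    def forward(self, x, h):
        z, r = torch.sigmoid(self.gates(torch.cat([x, h], dim=1))).chunk(2, dim=1)
        h_tilde = torch.tanh(self.cand(torch.cat([x, r * h], dim=1)))
        return (1 - z) * h + z * h_tilde

# Toy usage: refine a coarse CNN feature map over a few recurrent steps.
cell = ConvGRUCell(in_ch=64, hid_ch=64)
feat = torch.randn(1, 64, 28, 28)   # backbone feature map (assumed shape)
h = torch.zeros(1, 64, 28, 28)      # initial hidden state
for _ in range(3):
    h = cell(feat, h)
```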
Citations
Proceedings Article•DOI•
14 Jun 2020
TL;DR: A novel saliency detection framework with a Contextual Refinement Module (CRM) consisting of two sub-networks, an Object Relation Unit (ORU) and a Scene Context Unit (SCU), which capture complementary contextual information to give a holistic estimation of salient regions.
Abstract: Context plays an important role in the saliency prediction task. In this work, we propose a saliency detection framework that not only extracts visual features but also models two kinds of context: object-object relationships within a single image and scene contextual information. Specifically, we develop a novel saliency detection framework with a Contextual Refinement Module (CRM) which consists of two sub-networks, an Object Relation Unit (ORU) and a Scene Context Unit (SCU). The ORU encodes the object-object relationship based on relative object position and object co-occurrence patterns in an image using a graph-based approach, while the SCU incorporates the scene contextual information of an image. Together, the ORU and SCU capture complementary contextual information to give a holistic estimation of salient regions. Extensive experiments show the effectiveness of modelling object relations and scene context in boosting the performance of saliency prediction. In particular, our framework outperforms the state-of-the-art models on challenging benchmark datasets.
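For intuition about what such sub-networks might look like, below is a hedged PyTorch sketch of the two ideas the abstract describes: a scene-level unit that injects a global descriptor back into the feature map, and an object-relation unit that performs one round of message passing over region features using an affinity matrix. The class names SceneContextUnit and ObjectRelationUnit and all shapes are illustrative assumptions, not the paper's CRM.

```python
import torch
import torch.nn as nn

class SceneContextUnit(nn.Module):
    """Compute a global scene descriptor and broadcast it back over
    the spatial feature map (scene-context injection)."""
    def __init__(self, ch):
        super().__init__()
        self.fc = nn.Linear(ch, ch)

    def forward(self, x):                      # x: (B, C, H, W)
        g = x.mean(dim=(2, 3))                 # global average pool -> (B, C)
        g = torch.relu(self.fc(g))
        return x + g[:, :, None, None]         # inject scene context

class ObjectRelationUnit(nn.Module):
    """One round of message passing over N region features, weighted
    by a (softmax-normalized) pairwise affinity matrix."""
    def __init__(self, ch):
        super().__init__()
        self.proj = nn.Linear(ch, ch)

    def forward(self, regions, affinity):      # (B, N, C), (B, N, N)
        msg = torch.bmm(torch.softmax(affinity, dim=-1), regions)
        return regions + torch.relu(self.proj(msg))
```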

6 citations

Journal Article•DOI•
TL;DR: Experimental results show that the proposed saliency target detection algorithm can not only accurately and comprehensively extract significant target regions but also retain more texture information and complete edge information while satisfying the human visual experience.
Abstract: In the field of computer vision, image saliency target detection can not only improve the accuracy of image detection but also accelerate it. To address problems of current saliency target detection algorithms, such as inconspicuous texture details and incomplete edge contours, this paper proposes a saliency target detection algorithm that integrates multiple sources of information. The algorithm consists of three stages: preprocessing, multi-information extraction, and fusion optimization. It calculates the frequency-domain features of the image and introduces a power-law transform and feature normalization to improve them, preserving the information of the target region while suppressing that of the background region. On three public image datasets, MSRA, SED2, and ECSSD, the proposed algorithm is compared with classical algorithms in subjective and objective experiments. Experimental results show that the proposed algorithm can not only accurately and comprehensively extract salient target regions but also retain more texture information and complete edge information while satisfying the human visual experience. All evaluation indexes are significantly better than those of the comparison algorithms, showing good reliability and adaptability.
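The frequency-domain step described above (spectrum, power-law transform, feature normalization) can be sketched in a few lines of NumPy/SciPy. This is a hypothetical reconstruction in the spirit of spectral saliency methods, not the paper's exact algorithm; the function name frequency_saliency and the parameter values are assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def frequency_saliency(gray, gamma=0.5, sigma=3.0):
    """Frequency-domain saliency sketch: apply a power-law transform to
    the amplitude spectrum (compressing background energy), reconstruct
    with the original phase, smooth, and normalize to [0, 1]."""
    F = np.fft.fft2(gray.astype(np.float64))
    amp, phase = np.abs(F), np.angle(F)
    recon = np.fft.ifft2((amp ** gamma) * np.exp(1j * phase))  # power-law on amplitude
    sal = gaussian_filter(np.abs(recon) ** 2, sigma)           # smooth the energy map
    return (sal - sal.min()) / (sal.max() - sal.min() + 1e-8)  # feature normalization
```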
Journal Article•DOI•
Zhengyi Liu, Yuan Wang, Yacheng Tan, Wei Li, Yun Xiao 
TL;DR: The authors propose an Attention Gated Recurrent Unit (AGRU) for RGB-D saliency detection that reduces the influence of low-quality depth images and retains more semantic features in the progressive fusion process.
Abstract: RGB-D saliency detection aims to identify the most attractive objects in a pair of color and depth images. However, most existing models adopt the classic U-Net framework, which progressively decodes two-stream features. In this paper, we decode the cross-modal and multi-level features in a unified unit, named Attention Gated Recurrent Unit (AGRU). It reduces the influence of low-quality depth images and retains more semantic features in the progressive fusion process. Specifically, the features of different modalities and different levels are organized as a sequential input and recurrently fed into the AGRU, which consists of a reset gate, an update gate, and a memory unit, to be selectively fused and adaptively memorized based on an attention mechanism. Further, a two-stage AGRU serves as the decoder of the RGB-D salient object detection network, named AGRFNet. Due to its recurrent nature, it achieves the best performance with few parameters. To further improve performance, three auxiliary modules are designed to better fuse semantic information, refine the features of shallow layers, and enhance local detail. Extensive experiments on seven widely used benchmark datasets demonstrate that AGRFNet performs favorably against 18 state-of-the-art RGB-D SOD approaches.
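As a rough illustration of the gated sequential fusion the abstract describes, here is a hedged PyTorch sketch: multi-level (or cross-modal) feature maps arrive one at a time and are merged into a running memory through reset/update gates, with a channel-attention term that can down-weight a low-quality input. The class AttentionGatedFusion and all of its details are illustrative assumptions, not the published AGRU.

```python
import torch
import torch.nn as nn

class AttentionGatedFusion(nn.Module):
    """GRU-style sequential fusion: features are fed in one at a time and
    selectively merged into a running memory via reset/update gates,
    after channel attention re-weights each incoming feature."""
    def __init__(self, ch):
        super().__init__()
        self.gates = nn.Conv2d(2 * ch, 2 * ch, 3, padding=1)
        self.cand = nn.Conv2d(2 * ch, ch, 3, padding=1)
        self.att = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                 nn.Conv2d(ch, ch, 1), nn.Sigmoid())

    def step(self, x, h):
        x = x * self.att(x)   # attention can suppress a low-quality input (e.g. depth)
        z, r = torch.sigmoid(self.gates(torch.cat([x, h], 1))).chunk(2, 1)
        h_new = torch.tanh(self.cand(torch.cat([x, r * h], 1)))
        return (1 - z) * h + z * h_new

    def forward(self, feats):                  # feats: list of (B, C, H, W)
        h = torch.zeros_like(feats[0])         # empty memory
        for x in feats:                        # sequential, recurrent fusion
            h = self.step(x, h)
        return h
```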
References
Proceedings Article•DOI•
23 Jun 2013
TL;DR: This work considers both foreground and background cues in a different way, ranking the similarity of image elements to foreground or background cues via graph-based manifold ranking; saliency is defined based on the elements' relevance to the given seeds or queries.
Abstract: Most existing bottom-up methods measure the foreground saliency of a pixel or region based on its contrast within a local context or the entire image, whereas a few methods focus on segmenting out background regions and thereby salient objects. Instead of considering the contrast between the salient objects and their surrounding regions, we consider both foreground and background cues in a different way. We rank the similarity of the image elements (pixels or regions) with foreground cues or background cues via graph-based manifold ranking. The saliency of the image elements is defined based on their relevances to the given seeds or queries. We represent the image as a close-loop graph with superpixels as nodes. These nodes are ranked based on their similarity to background and foreground queries, using affinity matrices. Saliency detection is carried out in a two-stage scheme to extract background regions and foreground salient objects efficiently. Experimental results on two large benchmark databases demonstrate that the proposed method performs well against the state-of-the-art methods in terms of accuracy and speed. We also create a more difficult benchmark database containing 5,172 images to test the proposed saliency model and make this database publicly available with this paper for further studies in the saliency field.
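Manifold ranking has a standard closed form: given an affinity matrix W over superpixel nodes and a query indicator vector y, the ranking scores can be computed as f* = (I - alpha*S)^-1 y, with S a normalized affinity. The NumPy sketch below uses the symmetric normalization; the paper's exact normalization may differ, and the two-stage background-then-foreground querying follows the abstract above. The function name is an assumption.

```python
import numpy as np

def manifold_ranking(W, y, alpha=0.99):
    """Closed-form manifold ranking on a graph.
    W: (N, N) affinity matrix over superpixels.
    y: (N,) query indicator (1 for seed/query nodes, 0 elsewhere).
    Returns ranking scores f* = (I - alpha*S)^-1 y, where
    S = D^{-1/2} W D^{-1/2} is the symmetrically normalized affinity."""
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d + 1e-12))
    S = D_inv_sqrt @ W @ D_inv_sqrt
    N = W.shape[0]
    return np.linalg.solve(np.eye(N) - alpha * S, y)
```

In the two-stage scheme, boundary superpixels would first serve as background queries (the complement of the resulting ranks gives a coarse saliency map), and the thresholded foreground would then serve as queries for the second ranking pass.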

2,278 citations

Proceedings Article•DOI•
07 Dec 2015
TL;DR: HED turns pixel-wise edge classification into image-to-image prediction by means of a deep learning model that leverages fully convolutional neural networks and deeply-supervised nets to approach the human ability to resolve the challenging ambiguity in edge and object boundary detection.
Abstract: We develop a new edge detection algorithm that addresses two critical issues in this long-standing vision problem: (1) holistic image training, and (2) multi-scale feature learning. Our proposed method, holistically-nested edge detection (HED), turns pixel-wise edge classification into image-to-image prediction by means of a deep learning model that leverages fully convolutional neural networks and deeply-supervised nets. HED automatically learns rich hierarchical representations (guided by deep supervision on side responses) that are crucially important in order to approach the human ability to resolve the challenging ambiguity in edge and object boundary detection. We significantly advance the state-of-the-art on the BSD500 dataset (ODS F-score of 0.782) and the NYU Depth dataset (ODS F-score of 0.746), and do so with an improved speed (0.4 second per image) that is orders of magnitude faster than recent CNN-based edge detection algorithms.
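HED's two ingredients, side outputs and deep supervision, are easy to picture in code. The PyTorch sketch below attaches a 1x1 side classifier to each backbone stage, upsamples every side map to image resolution, and fuses them with a learned 1x1 convolution; a loss would be applied to each side map and the fused map. The class SideOutputs and the VGG-like stage channel counts are illustrative assumptions, not the released HED code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SideOutputs(nn.Module):
    """HED-style deep supervision sketch: one 1x1 side classifier per
    backbone stage, each upsampled to image size, plus a learned 1x1
    fusion over the stacked side maps."""
    def __init__(self, stage_channels=(64, 128, 256, 512, 512)):
        super().__init__()
        self.side = nn.ModuleList(nn.Conv2d(c, 1, 1) for c in stage_channels)
        self.fuse = nn.Conv2d(len(stage_channels), 1, 1)

    def forward(self, stage_feats, out_hw):
        sides = [F.interpolate(s(f), size=out_hw, mode="bilinear",
                               align_corners=False)
                 for s, f in zip(self.side, stage_feats)]
        fused = self.fuse(torch.cat(sides, dim=1))
        return sides, fused   # supervise every side map and the fused map
```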

2,173 citations

Journal Article•DOI•
TL;DR: An original approach of attentional guidance by global scene context is presented that combines bottom-up saliency, scene context, and top-down mechanisms at an early stage of visual processing and predicts the image regions likely to be fixated by human observers performing natural search tasks in real-world scenes.
Abstract: Many experiments have shown that the human visual system makes extensive use of contextual information for facilitating object search in natural scenes. However, the question of how to formally model contextual influences is still open. On the basis of a Bayesian framework, the authors present an original approach of attentional guidance by global scene context. The model comprises 2 parallel pathways; one pathway computes local features (saliency) and the other computes global (scene-centered) features. The contextual guidance model of attention combines bottom-up saliency, scene context, and top-down mechanisms at an early stage of visual processing and predicts the image regions likely to be fixated by human observers performing natural search tasks in real-world scenes.
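The model's final step combines the two pathways. A minimal NumPy sketch of that combination, assuming a precomputed bottom-up saliency map and a scene-dependent spatial prior over likely target locations, is below; a tempering exponent on the saliency term is one common way to weight the two pathways, but the function name and default value here are assumptions, not the paper's fitted parameters.

```python
import numpy as np

def contextual_guidance(local_saliency, scene_prior, gamma=0.5):
    """Combine the local (bottom-up saliency) and global (scene-centered)
    pathways: modulate a (H, W) saliency map by a (H, W) scene-dependent
    spatial prior, with gamma tempering the saliency term."""
    s = (local_saliency ** gamma) * scene_prior
    return s / (s.max() + 1e-12)   # normalize for display/comparison
```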

1,613 citations

Journal Article•DOI•
TL;DR: It is found that the models designed specifically for salient object detection generally work better than models in closely related areas, which provides a precise definition and suggests an appropriate treatment of this problem that distinguishes it from other problems.
Abstract: We extensively compare, qualitatively and quantitatively, 41 state-of-the-art models (29 salient object detection, 10 fixation prediction, 1 objectness, and 1 baseline) over seven challenging data sets for the purpose of benchmarking salient object detection and segmentation methods. From the results obtained so far, our evaluation shows a consistent rapid progress over the last few years in terms of both accuracy and running time. The top contenders in this benchmark significantly outperform the models identified as the best in the previous benchmark conducted three years ago. We find that the models designed specifically for salient object detection generally work better than models in closely related areas, which in turn provides a precise definition and suggests an appropriate treatment of this problem that distinguishes it from other problems. In particular, we analyze the influences of center bias and scene complexity in model performance, which, along with the hard cases for the state-of-the-art models, provide useful hints toward constructing more challenging large-scale data sets and better saliency models. Finally, we propose probable solutions for tackling several open problems, such as evaluation scores and data set bias, which also suggest future research directions in the rapidly growing field of salient object detection.

1,372 citations

Proceedings Article•DOI•
23 Jun 2014
TL;DR: An extensive evaluation of fixation prediction and salient object segmentation algorithms, together with statistics of major datasets, identifies serious design flaws in existing salient object benchmarks and motivates a new high-quality dataset that offers both fixation and salient object segmentation ground truth.
Abstract: In this paper we provide an extensive evaluation of fixation prediction and salient object segmentation algorithms as well as statistics of major datasets. Our analysis identifies a serious design flaw of existing salient object benchmarks, which we call dataset design bias, caused by over-emphasising stereotypical concepts of saliency. The dataset design bias not only creates a discomforting disconnection between fixations and salient object segmentation but also misleads algorithm design. Based on our analysis, we propose a new high-quality dataset that offers both fixation and salient object segmentation ground truth. With fixations and salient objects presented simultaneously, we are able to bridge the gap between fixations and salient objects and propose a novel method for salient object segmentation. Finally, we report significant benchmark progress on 3 existing datasets of segmenting salient objects.

1,089 citations