Journal ArticleDOI

Salient Object Detection via Structured Matrix Decomposition

TL;DR: A novel structured matrix decomposition model is proposed with two structural regularizations: a tree-structured sparsity-inducing regularization that captures the image structure and enforces patches from the same object to have similar saliency values, and a Laplacian regularization that enlarges the gaps between salient objects and the background in feature space.
Abstract: Low-rank recovery models have shown potential for salient object detection, where a matrix is decomposed into a low-rank matrix representing image background and a sparse matrix identifying salient objects. Two deficiencies, however, still exist. First, previous work typically assumes the elements in the sparse matrix are mutually independent, ignoring the spatial and pattern relations of image regions. Second, when the low-rank and sparse matrices are relatively coherent, e.g., when there are similarities between the salient objects and background or when the background is complicated, it is difficult for previous models to disentangle them. To address these problems, we propose a novel structured matrix decomposition model with two structural regularizations: (1) a tree-structured sparsity-inducing regularization that captures the image structure and enforces patches from the same object to have similar saliency values, and (2) a Laplacian regularization that enlarges the gaps between salient objects and the background in feature space. Furthermore, high-level priors are integrated to guide the matrix decomposition and boost the detection. We evaluate our model for salient object detection on five challenging datasets including single object, multiple objects and complex scene images, and show competitive results as compared with 24 state-of-the-art methods in terms of seven performance metrics.
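Schematically, the decomposition described in the abstract can be written as follows; the symbols and weights are illustrative, not the paper's exact notation:

```latex
% F : feature matrix of image patches; L : low-rank background part;
% S : structured-sparse salient part; alpha, beta : trade-off weights.
\min_{L,S}\ \|L\|_{*}
  \;+\; \alpha\,\Omega(S)      % tree-structured sparsity-inducing regularization
  \;+\; \beta\,\Theta(L,S)     % Laplacian regularization separating L and S
\quad \text{s.t.}\quad F = L + S
```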
Citations
Posted Content
TL;DR: This paper reviews deep SOD algorithms from different perspectives, including network architecture, level of supervision, learning paradigm, and object-/instance-level detection, and looks into the generalization and difficulty of existing SOD datasets.
Abstract: As an essential problem in computer vision, salient object detection (SOD) has attracted an increasing amount of research attention over the years. Recent advances in SOD are predominantly led by deep learning-based solutions (named deep SOD). To enable in-depth understanding of deep SOD, in this paper, we provide a comprehensive survey covering various aspects, ranging from algorithm taxonomy to unsolved issues. In particular, we first review deep SOD algorithms from different perspectives, including network architecture, level of supervision, learning paradigm, and object-/instance-level detection. Following that, we summarize and analyze existing SOD datasets and evaluation metrics. Then, we benchmark a large group of representative SOD models, and provide detailed analyses of the comparison results. Moreover, we study the performance of SOD algorithms under different attribute settings, which has not been thoroughly explored previously, by constructing a novel SOD dataset with rich attribute annotations covering various salient object types, challenging factors, and scene categories. We further analyze, for the first time in the field, the robustness of SOD models to random input perturbations and adversarial attacks. We also look into the generalization and difficulty of existing SOD datasets. Finally, we discuss several open issues of SOD and outline future research directions.

428 citations

Journal ArticleDOI
TL;DR: Zhang et al. review different types of saliency detection algorithms, summarize the important issues of existing methods, and discuss open problems and future work; experimental analysis and discussion provide a holistic overview of different saliency detection methods.
Abstract: The visual saliency detection model simulates the human visual system to perceive the scene and has been widely used in many vision tasks. With the development of acquisition technology, more comprehensive information, such as depth cue, inter-image correspondence, or temporal relationship, is available to extend image saliency detection to RGBD saliency detection, co-saliency detection, or video saliency detection. The RGBD saliency detection model focuses on extracting the salient regions from RGBD images by combining the depth information. The co-saliency detection model introduces the inter-image correspondence constraint to discover the common salient object in an image group. The goal of the video saliency detection model is to locate the motion-related salient object in video sequences, which considers the motion cue and spatiotemporal constraint jointly. In this paper, we review different types of saliency detection algorithms, summarize the important issues of the existing methods, and discuss the open problems and future work. Moreover, the evaluation datasets and quantitative measurements are briefly introduced, and the experimental analysis and discussion are conducted to provide a holistic overview of different saliency detection methods.

328 citations

Journal ArticleDOI
TL;DR: Wang et al. propose a weakly supervised color transfer method to correct color distortion, which relaxes the need for paired underwater images for training and allows the underwater images to be taken in unknown locations.
Abstract: Underwater vision suffers from severe degradation due to selective attenuation and scattering when light propagates through water. Such degradation not only affects the quality of underwater images but also limits the ability of vision tasks. Different from existing methods that either ignore the wavelength dependence of the attenuation or assume a specific spectral profile, we tackle the color distortion problem of underwater images from a new view. In this letter, we propose a weakly supervised color transfer method to correct color distortion. The proposed method relaxes the need for paired underwater images for training and allows the underwater images to be taken in unknown locations. Inspired by cycle-consistent adversarial networks, we design a multiterm loss function including an adversarial loss, a cycle consistency loss, and a structural similarity index measure (SSIM) loss, which keeps the content and structure of the outputs the same as the inputs while making the color similar to images taken without the effect of water. Experiments on underwater images captured in diverse scenes show that our method produces visually pleasing results and even outperforms state-of-the-art methods. Besides, our method can improve the performance of vision tasks.
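The multiterm loss can be sketched in NumPy with a simplified global SSIM (no sliding window) and a least-squares adversarial term; `d_score` stands in for a discriminator output and all weights are illustrative, not the authors' implementation:

```python
import numpy as np

def ssim_global(x, y, c1=0.01**2, c2=0.03**2):
    # Simplified global SSIM (no sliding window); inputs assumed in [0, 1].
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx**2 + my**2 + c1) * (vx + vy + c2))

def multiterm_loss(x, x_cycled, y_fake, d_score,
                   w_adv=1.0, w_cyc=10.0, w_ssim=1.0):
    adv = (d_score - 1.0) ** 2             # least-squares adversarial term
    cyc = np.abs(x - x_cycled).mean()      # L1 cycle-consistency term
    ssim_l = 1.0 - ssim_global(x, y_fake)  # structural similarity term
    return w_adv * adv + w_cyc * cyc + w_ssim * ssim_l
```

An identity mapping with a fully fooled discriminator drives all three terms to zero, which is the intended optimum of this kind of objective.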

308 citations

Journal ArticleDOI
TL;DR: An attention steered interweave fusion network (ASIF-Net) is proposed to detect salient objects, which progressively integrates cross-modal and cross-level complementarity from the RGB image and corresponding depth map via steering of an attention mechanism.
Abstract: Salient object detection from RGB-D images is an important yet challenging vision task, which aims at detecting the most distinctive objects in a scene by combining color information and depth constraints. Unlike prior fusion manners, we propose an attention steered interweave fusion network (ASIF-Net) to detect salient objects, which progressively integrates cross-modal and cross-level complementarity from the RGB image and corresponding depth map via steering of an attention mechanism. Specifically, the complementary features from RGB-D images are jointly extracted and hierarchically fused in a dense and interweaved manner. Such a manner breaks down the barriers of inconsistency existing in the cross-modal data and also sufficiently captures the complementarity. Meanwhile, an attention mechanism is introduced to locate the potential salient regions in an attention-weighted fashion, which advances in highlighting the salient objects and suppressing the cluttered background regions. Instead of focusing only on pixelwise saliency, we also ensure that the detected salient objects have the objectness characteristics (e.g., complete structure and sharp boundary) by incorporating the adversarial learning that provides a global semantic constraint for RGB-D salient object detection. Quantitative and qualitative experiments demonstrate that the proposed method performs favorably against 17 state-of-the-art saliency detectors on four publicly available RGB-D salient object detection datasets. The code and results of our method are available at https://github.com/Li-Chongyi/ASIF-Net .

188 citations


Cites background or methods from "Salient Object Detection via Struct..."

  • ...[13] formulated saliency detection as a structured matrix decomposition problem guided by high-level priors....


  • ...For example, the salient objects (e.g., the Athena sculpture in the seventh image, the white cat in the fourth last image, and the cake in the second last image) are not effectively detected by the SMD [13] and RCRR [14] methods....


  • ...We extensively compare the proposed method with 17 state-of-the-art methods on four datasets, including four unsupervised RGB saliency detection methods (DSG [11], MILPS [12], SMD [13], and RCRR [14]), three deep-learning-based RGB saliency detection methods (DCL [15], DSS [16], and R3Net [19]), three unsupervised RGB-D saliency detection methods (ACSD [56], DCMC [58], and MBP [55]), and seven deep-learning-based RGB-D saliency detection methods (DF [63], CTMF [47], PCFN [64], MMCI [65], CPFP [67], DMRA [68], and TANet [66])....


Journal ArticleDOI
TL;DR: A cascaded R-CNN obtains multiscale features in pyramids to reduce missed and false detections of traffic signs, and a data augmentation method expands the German traffic sign training dataset by simulating complex environmental changes.
Abstract: In recent years, deep learning has been applied to traffic sign detection and achieves excellent performance. However, two main challenges in traffic sign detection remain to be solved urgently. For one thing, traffic signs of small size are more difficult to detect than those of large size, so small traffic signs often go undetected. For another, false signs are often detected because of interference caused by illumination variation, bad weather, and signs similar to true traffic signs. Therefore, to reduce missed and false detections, we first propose a cascaded R-CNN to obtain multiscale features in pyramids. Each layer of the cascaded network except the first fuses the output bounding boxes of the previous layer for joint training. This method contributes to traffic sign detection. Then, we propose a multiscale attention method that obtains weighted multiscale features by dot-product and softmax; the weighted features are summed to refine the features, highlight the traffic sign features, and improve the accuracy of traffic sign detection. Finally, we increase the number of difficult negative samples for dataset balance and data augmentation in training, to mitigate interference from complex environments and similar false traffic signs. The data augmentation method expands the German traffic sign training dataset by simulating complex environmental changes. We conduct numerous experiments to verify the effectiveness of the proposed algorithm. The accuracy and recall rate of our method are 98.7% and 90.5% on GTSDB, 99.7% and 83.62% on CCTSDB, and 98.9% and 85.6% on the LISA dataset, respectively.
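The weighted-multiscale-feature idea (dot-product scores, softmax, then a weighted sum) can be sketched generically; the shapes and the guidance vector below are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

def softmax(z, axis=0):
    # Numerically stable softmax.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def multiscale_attention(features, query):
    # features: (S, C), one C-dim descriptor per scale; query: (C,) guidance vector.
    scores = features @ query    # dot-product similarity per scale
    weights = softmax(scores)    # attention weights over scales, summing to 1
    return weights @ features    # attention-weighted sum of scale features
```

When one scale's score dominates, the output approaches that scale's feature vector; with equal scores it reduces to a plain average.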

182 citations


Cites result from "Salient Object Detection via Struct..."

  • ...[17], HOG+SVM [30], CCNN [32], RBD [48], SMD [49], and SRM [50], the results are shown in Table 3, and the specific experimental details are shown in Figure 10....


References
Journal ArticleDOI
TL;DR: A new hypothesis about the role of focused attention is proposed, which offers a new set of criteria for distinguishing separable from integral features and a new rationale for predicting which tasks will show attention limits and which will not.

11,452 citations


"Salient Object Detection via Struct..." refers background in this paper

  • ...The foundation of most saliency detection algorithms can be traced back to the theories of center-surround difference [38] and multiple feature integration [39]....


Journal ArticleDOI
TL;DR: In this article, a visual attention system inspired by the behavior and the neuronal architecture of the early primate visual system is presented, where multiscale image features are combined into a single topographical saliency map.
Abstract: A visual attention system, inspired by the behavior and the neuronal architecture of the early primate visual system, is presented. Multiscale image features are combined into a single topographical saliency map. A dynamical neural network then selects attended locations in order of decreasing saliency. The system breaks down the complex problem of scene understanding by rapidly selecting, in a computationally efficient manner, conspicuous locations to be analyzed in detail.
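The center-surround mechanism described here is commonly approximated with a difference of Gaussians; a minimal single-channel sketch (not Itti et al.'s full multiscale pyramid, and with illustrative sigma values) might look like:

```python
import numpy as np

def gaussian_blur(img, sigma):
    # Separable Gaussian blur via two passes of 1-D convolution.
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, rows)

def center_surround_saliency(img, sigma_c=1.0, sigma_s=4.0):
    # Difference of Gaussians: fine (center) minus coarse (surround) response,
    # normalized to [0, 1].
    dog = np.abs(gaussian_blur(img, sigma_c) - gaussian_blur(img, sigma_s))
    return (dog - dog.min()) / (dog.max() - dog.min() + 1e-12)
```

A small bright blob on a flat background produces its strongest response at and around the blob, which is the conspicuity behavior the model relies on.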

10,525 citations

01 Jan 1998
TL;DR: A visual attention system, inspired by the behavior and the neuronal architecture of the early primate visual system, is presented, which breaks down the complex problem of scene understanding by rapidly selecting conspicuous locations to be analyzed in detail.

8,566 citations


"Salient Object Detection via Struct..." refers background in this paper

  • ...[11], who derive saliency from the difference of Gaussians on multiple feature maps....


  • ...Bottom-up models [7], [11]–[17] are stimulus-driven and essentially based upon local and/or global center-surround difference, using low-level features, such as color, texture and location....


Journal ArticleDOI
TL;DR: A new superpixel algorithm is introduced, simple linear iterative clustering (SLIC), which adapts a k-means clustering approach to efficiently generate superpixels and is faster and more memory efficient, improves segmentation performance, and is straightforward to extend to supervoxel generation.
Abstract: Computer vision applications have come to rely increasingly on superpixels in recent years, but it is not always clear what constitutes a good superpixel algorithm. In an effort to understand the benefits and drawbacks of existing methods, we empirically compare five state-of-the-art superpixel algorithms for their ability to adhere to image boundaries, speed, memory efficiency, and their impact on segmentation performance. We then introduce a new superpixel algorithm, simple linear iterative clustering (SLIC), which adapts a k-means clustering approach to efficiently generate superpixels. Despite its simplicity, SLIC adheres to boundaries as well as or better than previous methods. At the same time, it is faster and more memory efficient, improves segmentation performance, and is straightforward to extend to supervoxel generation.
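A minimal SLIC-flavored sketch, reduced to plain k-means over joint (position, intensity) features for a grayscale image; the real algorithm restricts the search to a local window around each center and works in CIELAB color, so this is only an illustration of the idea:

```python
import numpy as np

def slic_like(img, n_segments=16, compactness=0.1, n_iter=10):
    # SLIC-style superpixels: k-means in joint (y, x, intensity) space.
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w]
    # Spatial coordinates scaled by compactness, stacked with intensity.
    feats = np.stack([compactness * yy.ravel(),
                      compactness * xx.ravel(),
                      img.ravel()], axis=1).astype(float)
    # Initialize cluster centers on a regular grid, as SLIC does.
    step = max(1, int((h * w / n_segments) ** 0.5))
    mask = ((yy % step == step // 2) & (xx % step == step // 2)).ravel()
    centers = feats[mask].copy()
    for _ in range(n_iter):
        d = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = d.argmin(axis=1)
        for k in range(len(centers)):
            if np.any(labels == k):
                centers[k] = feats[labels == k].mean(axis=0)
    return labels.reshape(h, w)
```

The `compactness` factor trades spatial regularity against boundary adherence, mirroring the role of the compactness parameter in SLIC proper.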

7,849 citations


"Salient Object Detection via Struct..." refers methods in this paper

  • ...Then, we perform the simple linear iterative clustering (SLIC) algorithm [81] to over-segment the image into N atom patches (superpixels) P = {P1, P2, · · · , PN}....


Journal ArticleDOI
TL;DR: In this paper, the authors prove that under some suitable assumptions, it is possible to recover both the low-rank and the sparse components exactly by solving a very convenient convex program called Principal Component Pursuit; among all feasible decompositions, simply minimize a weighted combination of the nuclear norm and of the ℓ1 norm.
Abstract: This article is about a curious phenomenon. Suppose we have a data matrix, which is the superposition of a low-rank component and a sparse component. Can we recover each component individually? We prove that under some suitable assumptions, it is possible to recover both the low-rank and the sparse components exactly by solving a very convenient convex program called Principal Component Pursuit; among all feasible decompositions, simply minimize a weighted combination of the nuclear norm and of the ℓ1 norm. This suggests the possibility of a principled approach to robust principal component analysis, since our methodology and results assert that one can recover the principal components of a data matrix even though a positive fraction of its entries are arbitrarily corrupted. This extends to the situation where a fraction of the entries are missing as well. We discuss an algorithm for solving this optimization problem, and present applications in the area of video surveillance, where our methodology allows for the detection of objects in a cluttered background, and in the area of face recognition, where it offers a principled way of removing shadows and specularities in images of faces.
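The Principal Component Pursuit program above can be sketched with a simple ADMM scheme in NumPy; the fixed penalty `mu`, the default `lam`, and the iteration count are illustrative choices, not the paper's exact algorithm:

```python
import numpy as np

def shrink(x, tau):
    # Soft-thresholding: the proximal operator of the l1 norm.
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def rpca_pcp(M, lam=None, mu=None, n_iter=200):
    # Principal Component Pursuit via ADMM:
    #   min ||L||_* + lam * ||S||_1   s.t.   M = L + S
    m, n = M.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))
    if mu is None:
        mu = 0.25 * m * n / (np.abs(M).sum() + 1e-12)
    L = np.zeros_like(M); S = np.zeros_like(M); Y = np.zeros_like(M)
    for _ in range(n_iter):
        # Singular value thresholding step for the low-rank part.
        U, sig, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = U @ np.diag(shrink(sig, 1.0 / mu)) @ Vt
        # Soft-thresholding step for the sparse part.
        S = shrink(M - L + Y / mu, lam / mu)
        # Dual ascent on the equality constraint M = L + S.
        Y = Y + mu * (M - L - S)
    return L, S
```

On a small synthetic matrix built as a rank-2 component plus sparse gross corruption, this scheme separates the two parts to good accuracy, which is the behavior the saliency models above exploit (background as L, salient objects as S).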

6,783 citations
