Proceedings ArticleDOI

Saliency Detection via Dense and Sparse Reconstruction

01 Dec 2013-pp 2976-2983
TL;DR: A visual saliency detection algorithm from the perspective of reconstruction errors that applies the Bayes formula to integrate saliency measures based on dense and sparse reconstruction errors and refined by an object-biased Gaussian model is proposed.
Abstract: In this paper, we propose a visual saliency detection algorithm from the perspective of reconstruction errors. The image boundaries are first extracted via superpixels as likely cues for background templates, from which dense and sparse appearance models are constructed. For each image region, we first compute dense and sparse reconstruction errors. Second, the reconstruction errors are propagated based on the contexts obtained from K-means clustering. Third, pixel-level saliency is computed by an integration of multi-scale reconstruction errors and refined by an object-biased Gaussian model. We apply the Bayes formula to integrate saliency measures based on dense and sparse reconstruction errors. Experimental results show that the proposed algorithm performs favorably against seventeen state-of-the-art methods in terms of precision and recall. In addition, the proposed algorithm is demonstrated to be more effective in highlighting salient objects uniformly and robust to background noise.
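The two appearance models described in the abstract can be sketched in numpy. The dense model reconstructs each region from the PCA bases of the background templates; the sparse model is illustrated here with a simple orthogonal matching pursuit, a stand-in for the ℓ1-regularized coding the paper actually uses. All array layouts and parameter values below are assumptions, not the paper's implementation.

```python
import numpy as np

def dense_error(X, B, d_prime=8):
    """Dense reconstruction error of segments X (M x D) from background
    templates B (N x D) via the top-d' PCA bases of the templates."""
    mu = B.mean(axis=0)
    cov = np.cov((B - mu).T)                       # D x D covariance
    vals, vecs = np.linalg.eigh(cov)
    U = vecs[:, np.argsort(vals)[::-1][:d_prime]]  # largest-eigenvalue bases
    Z = (X - mu) @ U                               # PCA coefficients
    R = Z @ U.T + mu                               # reconstruction
    return np.linalg.norm(X - R, axis=1) ** 2

def sparse_error(X, B, k=3):
    """Sparse reconstruction error via greedy orthogonal matching pursuit
    over the template dictionary (an illustrative stand-in for l1 coding)."""
    D = B.T / (np.linalg.norm(B, axis=1) + 1e-12)  # columns = unit templates
    errs = []
    for x in X:
        idx, r = [], x.copy()
        for _ in range(k):
            idx.append(int(np.argmax(np.abs(D.T @ r))))
            sub = D[:, sorted(set(idx))]
            coef, *_ = np.linalg.lstsq(sub, x, rcond=None)
            r = x - sub @ coef                     # residual after refit
        errs.append(float(r @ r))
    return np.array(errs)
```

Regions that the background templates reconstruct poorly (under either model) receive a high error, which the paper then treats as a saliency cue.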


Figures (11)


Citations
Journal ArticleDOI
TL;DR: It is found that the models designed specifically for salient object detection generally work better than models in closely related areas, which provides a precise definition and suggests an appropriate treatment of this problem that distinguishes it from other problems.
Abstract: We extensively compare, qualitatively and quantitatively, 41 state-of-the-art models (29 salient object detection, 10 fixation prediction, 1 objectness, and 1 baseline) over seven challenging data sets for the purpose of benchmarking salient object detection and segmentation methods. From the results obtained so far, our evaluation shows a consistent rapid progress over the last few years in terms of both accuracy and running time. The top contenders in this benchmark significantly outperform the models identified as the best in the previous benchmark conducted three years ago. We find that the models designed specifically for salient object detection generally work better than models in closely related areas, which in turn provides a precise definition and suggests an appropriate treatment of this problem that distinguishes it from other problems. In particular, we analyze the influences of center bias and scene complexity in model performance, which, along with the hard cases for the state-of-the-art models, provide useful hints toward constructing more challenging large-scale data sets and better saliency models. Finally, we propose probable solutions for tackling several open problems, such as evaluation scores and data set bias, which also suggest future research directions in the rapidly growing field of salient object detection.

1,372 citations


Additional excerpts

  • ...E .528 15 GMR [78] CVPR 2013 M .149 16 DRFI [79] CVPR 2013 C .697 17 PCA [80] CVPR 2013 M + C 4.34 18 LBI [81] CVPR 2013 M + C 251. 19 GC [82] ICCV 2013 C .037 20 CHM [83] ICCV 2013 M + C 15.4 21 DSR [84] ICCV 2013 M + C 10.2 22 MC [85] ICCV 2013 M + C .195 23 UFO [86] ICCV 2013 M + C 20.3 24 MNP [52] Vis.Comp. 2013 M + C 21.0 25 GR [87] SPL 2013 M + C 1.35 26 RBD [88] CVPR 2014 M .269 27 HDCT [89] CV...

    [...]

Proceedings ArticleDOI
23 Jun 2013
TL;DR: This paper regards saliency map computation as a regression problem based on multi-level image segmentation, uses a supervised learning approach to map the regional feature vector to a saliency score, and finally fuses the saliency scores across multiple levels, yielding the saliency map.
Abstract: Salient object detection has been attracting a lot of interest, and recently various heuristic computational models have been designed. In this paper, we regard saliency map computation as a regression problem. Our method, which is based on multi-level image segmentation, uses the supervised learning approach to map the regional feature vector to a saliency score, and finally fuses the saliency scores across multiple levels, yielding the saliency map. The contributions are two-fold. One is that we show our approach, which integrates the regional contrast, regional property and regional backgroundness descriptors together to form the master saliency map, is able to produce superior saliency maps to existing algorithms, most of which combine saliency maps heuristically computed from different types of features. The other is that we introduce a new regional feature vector, backgroundness, to characterize the background, which can be regarded as a counterpart of the objectness descriptor [2]. The performance evaluation on several popular benchmark data sets validates that our approach outperforms existing state-of-the-art methods.

1,057 citations
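The regression view in the abstract above — map a regional feature vector to a saliency score with a supervised learner, then fuse scores across segmentation levels — can be illustrated with a linear stand-in. The cited paper learns the mapping with a more powerful regressor; the ridge solver and feature layout here are purely illustrative.

```python
import numpy as np

def fit_saliency_regressor(F, y, lam=1e-2):
    """Ridge regression mapping regional feature vectors F (n x d) to
    ground-truth saliency scores y (n,); a linear stand-in for the
    supervised regressor used in the cited paper."""
    d = F.shape[1]
    return np.linalg.solve(F.T @ F + lam * np.eye(d), F.T @ y)

def fuse_levels(score_maps):
    """Fuse per-level saliency scores (a list of equally shaped arrays)
    by averaging, one simple instance of multi-level fusion."""
    return np.mean(score_maps, axis=0)
```

Given regional descriptors at each segmentation level, the learned weights score every region, and `fuse_levels` combines the per-level maps into the final saliency map.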


Cites methods from "Saliency Detection via Dense and Sp..."

  • ...To save the space, we only consider the top four models ranked in the survey [23]: SVO [51], CA [17], CB [32], and RC [15] and recently-developed methods: SF [21], LRK [78], HS [33], GMR [48], PCA [31], MC [50], DSR [49], RBD [55] that are not covered in [23]....

    [...]

Journal ArticleDOI
TL;DR: A new saliency method is proposed by introducing short connections to the skip-layer structures within the HED architecture, which produces state-of-the-art results on 5 widely tested salient object detection benchmarks, with advantages in terms of efficiency, effectiveness, and simplicity over the existing algorithms.
Abstract: Recent progress on salient object detection is substantial, benefiting mostly from the explosive development of Convolutional Neural Networks (CNNs). Semantic segmentation and salient object detection algorithms developed lately have been mostly based on Fully Convolutional Neural Networks (FCNs). There is still large room for improvement over generic FCN models that do not explicitly deal with the scale-space problem. The Holistically-Nested Edge Detector (HED) provides a skip-layer structure with deep supervision for edge and boundary detection, but the performance gain of HED on saliency detection is not obvious. In this paper, we propose a new salient object detection method by introducing short connections to the skip-layer structures within the HED architecture. Our framework takes full advantage of multi-level and multi-scale features extracted from FCNs, providing more advanced representations at each layer, a property that is critically needed to perform segment detection. Our method produces state-of-the-art results on 5 widely tested salient object detection benchmarks, with advantages in terms of efficiency (0.08 seconds per image), effectiveness, and simplicity over the existing algorithms. Beyond that, we conduct an exhaustive analysis of the role of training data on performance. We provide a training set for future research and fair comparisons.

1,041 citations


Cites methods from "Saliency Detection via Dense and Sp..."

  • ...Four classical methods are also considered including RC [41], CHM [61], DSR [62], and DRFI [25], which have been proven to be the best in the benchmark study of Borji et al....

    [...]

  • ...Ours DCL [35] DHS [36] RFCN [51] MDF [47] ELD [50] MC [49] DRFI [25] DSR [62] CHM [61] RC [41]...

    [...]

  • ...We also compare our approach with 4 classical methods: RC [7], CHM [31], DSR [32], and DRFI [24], which have been proven to be the best in the benchmark study of Borji et al. [1]....

    [...]

Proceedings ArticleDOI
07 Jun 2015
TL;DR: This paper proposes a multi-context deep learning framework for salient object detection that employs deep Convolutional Neural Networks to model saliency of objects in images and investigates different pre-training strategies to provide a better initialization for training the deep neural networks.
Abstract: Low-level saliency cues or priors do not produce good enough saliency detection results, especially when the salient object appears in a low-contrast background with a confusing visual appearance. This issue raises a serious problem for conventional approaches. In this paper, we tackle this problem by proposing a multi-context deep learning framework for salient object detection. We employ deep Convolutional Neural Networks to model saliency of objects in images. Global context and local context are both taken into account, and are jointly modeled in a unified multi-context deep learning framework. To provide a better initialization for training the deep neural networks, we investigate different pre-training strategies, and a task-specific pre-training scheme is designed to make the multi-context modeling suited for saliency detection. Furthermore, recently proposed contemporary deep models in the ImageNet Image Classification Challenge are tested, and their effectiveness in saliency detection is investigated. Our approach is extensively evaluated on five public datasets, and experimental results show significant and consistent improvements over the state-of-the-art methods.

983 citations


Cites background from "Saliency Detection via Dense and Sp..."

  • ...A large number of approaches [63, 52, 40, 39, 32, 35, 60, 57, 56, 47, 41, 31, 27, 25, 24, 23, 11, 44, 17, 8, 13, 1, 21] are proposed to capture different saliency cues....

    [...]

Proceedings ArticleDOI
21 Jul 2017
TL;DR: This paper develops a weakly supervised learning method for saliency detection using image-level tags only, which outperforms unsupervised methods by a large margin and achieves performance comparable or even superior to fully supervised counterparts.
Abstract: Deep Neural Networks (DNNs) have substantially improved the state-of-the-art in salient object detection. However, training DNNs requires costly pixel-level annotations. In this paper, we leverage the observation that image-level tags provide important cues of foreground salient objects, and develop a weakly supervised learning method for saliency detection using image-level tags only. The Foreground Inference Network (FIN) is introduced for this challenging task. In the first stage of our training method, FIN is jointly trained with a fully convolutional network (FCN) for image-level tag prediction. A global smooth pooling layer is proposed, enabling the FCN to assign object category tags to corresponding object regions, while FIN is capable of capturing all potential foreground regions with the predicted saliency maps. In the second stage, FIN is fine-tuned with its predicted saliency maps as ground truth. For refinement of the ground truth, an iterative Conditional Random Field is developed to enforce spatial label consistency and further boost performance. Our method alleviates annotation efforts and allows the usage of existing large-scale training sets with image-level tags. Our model runs at 60 FPS, outperforms unsupervised methods by a large margin, and achieves performance comparable or even superior to its fully supervised counterparts.

909 citations


Cites methods from "Saliency Detection via Dense and Sp..."

  • ...We compare WSS with 16 existing methods, including 7 unsupervised ones: FT [1], DSR [28], HS [53], MR [53], wCtr [59], MBS [56], BSCA [40]; and 9 fully supervised ones: DRFI [17], HDCT [19], LEGS [46], MC [57], MDF [26], DS [29], SELD [25], DCL [27], RFCN [49]....

    [...]

References
Journal ArticleDOI
TL;DR: In this article, a visual attention system inspired by the behavior and the neuronal architecture of the early primate visual system is presented, where multiscale image features are combined into a single topographical saliency map.
Abstract: A visual attention system, inspired by the behavior and the neuronal architecture of the early primate visual system, is presented. Multiscale image features are combined into a single topographical saliency map. A dynamical neural network then selects attended locations in order of decreasing saliency. The system breaks down the complex problem of scene understanding by rapidly selecting, in a computationally efficient manner, conspicuous locations to be analyzed in detail.

10,525 citations
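The center-surround mechanism this model builds on — conspicuity as the difference between a feature at a fine scale and the same feature at coarser scales — can be illustrated crudely with numpy. This is a single-feature sketch only; the scales, blur, and normalization are arbitrary choices, not the cited system's.

```python
import numpy as np

def _blur(img, sigma):
    """Separable Gaussian blur implemented with numpy only."""
    r = int(3 * sigma)
    x = np.arange(-r, r + 1)
    k = np.exp(-x**2 / (2 * sigma**2)); k /= k.sum()
    out = np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), 0, img)
    return np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), 1, out)

def center_surround_saliency(img, scales=(1, 2, 4)):
    """Sum of absolute center-surround differences between a fine scale
    and several coarser scales, rescaled to [0, 1]."""
    fine = _blur(img, scales[0])
    sal = sum(np.abs(fine - _blur(img, s)) for s in scales[1:])
    return (sal - sal.min()) / (sal.max() - sal.min() + 1e-12)
```

A small bright region on a uniform background produces a strong fine-vs-coarse difference, so the map peaks on the conspicuous region, mimicking the "conspicuous locations selected in order of decreasing saliency" behavior described above.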

01 Jan 1998
TL;DR: A visual attention system, inspired by the behavior and the neuronal architecture of the early primate visual system, is presented, which breaks down the complex problem of scene understanding by rapidly selecting conspicuous locations to be analyzed in detail.

8,566 citations


"Saliency Detection via Dense and Sp..." refers background in this paper

  • ...Original IT [13] GB [10] SR [11] FT [2] CA [9] RA [18] DW [8] CB [14] RC [7] SVO [6] LR [19] DSR DSR cut GT...

    [...]

  • ...We evaluate the proposed algorithm with seventeen state-of-the-art algorithms including IT98 [13], MZ03 [24], LC06 [25], GB06 [10], SR07 [11], AC08 [1], FT09 [2], CA10 [9], RA10 [18], RC11 [7], CB11 [14], SVO11 [6], DW11 [8], SF12 [17], LR12 [19], GS12 [21] and XL13 [22] on three benchmark data sets: ASD, MSRA and SOD....

    [...]

  • ...[13] define visual attention as the local center-surround difference and propose a saliency model based on multi-scale image features....

    [...]

Journal ArticleDOI
TL;DR: Five important trends have emerged from recent work on computational models of focal visual attention that emphasize the bottom-up, image-based control of attentional deployment, providing a framework for a computational and neurobiological understanding of visual attention.
Abstract: Five important trends have emerged from recent work on computational models of focal visual attention that emphasize the bottom-up, image-based control of attentional deployment. First, the perceptual saliency of stimuli critically depends on the surrounding context. Second, a unique 'saliency map' that topographically encodes for stimulus conspicuity over the visual scene has proved to be an efficient and plausible bottom-up control strategy. Third, inhibition of return, the process by which the currently attended location is prevented from being attended again, is a crucial element of attentional deployment. Fourth, attention and eye movements tightly interplay, posing computational challenges with respect to the coordinate system used to control attention. And last, scene understanding and object recognition strongly constrain the selection of attended locations. Insights from these five key areas provide a framework for a computational and neurobiological understanding of visual attention.

4,485 citations


"Saliency Detection via Dense and Sp..." refers background in this paper

  • ...Numerous biologically plausible models have been developed to explain the cognitive process of humans and animals [12]....

    [...]

Proceedings ArticleDOI
20 Jun 2009
TL;DR: This paper introduces a method for salient region detection that outputs full resolution saliency maps with well-defined boundaries of salient objects that outperforms the five algorithms both on the ground-truth evaluation and on the segmentation task by achieving both higher precision and better recall.
Abstract: Detection of visually salient image regions is useful for applications like object segmentation, adaptive compression, and object recognition. In this paper, we introduce a method for salient region detection that outputs full resolution saliency maps with well-defined boundaries of salient objects. These boundaries are preserved by retaining substantially more frequency content from the original image than other existing techniques. Our method exploits features of color and luminance, is simple to implement, and is computationally efficient. We compare our algorithm to five state-of-the-art salient region detection methods with a frequency domain analysis, ground truth, and a salient object segmentation application. Our method outperforms the five algorithms both on the ground-truth evaluation and on the segmentation task by achieving both higher precision and better recall.

3,723 citations
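The core computation of this frequency-tuned method — the per-pixel distance between the mean image feature and a blurred version of the image — is simple enough to sketch directly. A box blur stands in for the Gaussian, and the input is assumed to be any H x W x C feature image (the paper works in Lab color).

```python
import numpy as np

def _boxblur(ch, r=2):
    """Small box blur on one channel, standing in for a Gaussian blur."""
    pad = np.pad(ch, r, mode="edge")
    out = np.zeros_like(ch)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            out += pad[r + dy : r + dy + ch.shape[0],
                       r + dx : r + dx + ch.shape[1]]
    return out / (2 * r + 1) ** 2

def frequency_tuned_saliency(img):
    """Full-resolution saliency map: per-pixel distance between the mean
    feature vector of the image and the blurred image."""
    mu = img.reshape(-1, img.shape[2]).mean(axis=0)
    blur = np.stack([_boxblur(img[..., c]) for c in range(img.shape[2])], axis=2)
    return np.linalg.norm(blur - mu, axis=2)
```

Because the map is computed per pixel at the original resolution, object boundaries remain well defined, which is the property the abstract emphasizes.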


"Saliency Detection via Dense and Sp..." refers background or methods or result in this paper

  • ...The evaluation of F-measure is similar to [2]....

    [...]

  • ...Original IT [13] GB [10] SR [11] FT [2] CA [9] RA [18] DW [8] CB [14] RC [7] SVO [6] LR [19] DSR DSR cut GT...

    [...]

  • ...The ASD database [2] includes 1000 images selected from the MSRA database, where each image is manually segmented into foreground and background....

    [...]

  • ...We evaluate the proposed algorithm with seventeen state-of-the-art algorithms including IT98 [13], MZ03 [24], LC06 [25], GB06 [10], SR07 [11], AC08 [1], FT09 [2], CA10 [9], RA10 [18], RC11 [7], CB11 [14], SVO11 [6], DW11 [8], SF12 [17], LR12 [19], GS12 [21] and XL13 [22] on three benchmark data sets: ASD, MSRA and SOD....

    [...]

Proceedings ArticleDOI
20 Jun 2011
TL;DR: This work proposes a regional contrast based saliency extraction algorithm, which simultaneously evaluates global contrast differences and spatial coherence, and consistently outperforms existing saliency detection methods.
Abstract: Automatic estimation of salient object regions across images, without any prior assumption or knowledge of the contents of the corresponding scenes, enhances many computer vision and computer graphics applications. We introduce a regional contrast based salient object detection algorithm, which simultaneously evaluates global contrast differences and spatial weighted coherence scores. The proposed algorithm is simple, efficient, naturally multi-scale, and produces full-resolution, high-quality saliency maps. These saliency maps are further used to initialize a novel iterative version of GrabCut, namely SaliencyCut, for high quality unsupervised salient object segmentation. We extensively evaluated our algorithm using traditional salient object detection datasets, as well as a more challenging Internet image dataset. Our experimental results demonstrate that our algorithm consistently outperforms 15 existing salient object detection and segmentation methods, yielding higher precision and better recall rates. We also show that our algorithm can be used to efficiently extract salient object masks from Internet images, enabling effective sketch-based image retrieval (SBIR) via simple shape comparisons. Despite such noisy internet images, where the saliency regions are ambiguous, our saliency guided image retrieval achieves a superior retrieval rate compared with state-of-the-art SBIR methods, and additionally provides important target object region information.

3,653 citations
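The regional-contrast idea above — a region's saliency is its color difference to every other region, weighted by that region's size and spatial distance — can be sketched in numpy. The Gaussian spatial weighting and the parameter values are illustrative assumptions, not the cited paper's exact formulation.

```python
import numpy as np

def regional_contrast(colors, sizes, centers, sigma=0.4):
    """Global-contrast saliency per region: size- and distance-weighted
    colour differences to all other regions, rescaled to [0, 1]."""
    n = len(colors)
    sal = np.zeros(n)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d_s = np.linalg.norm(centers[i] - centers[j])  # spatial distance
            d_c = np.linalg.norm(colors[i] - colors[j])    # colour distance
            sal[i] += np.exp(-d_s**2 / sigma**2) * sizes[j] * d_c
    return sal / (sal.max() + 1e-12)
```

A small region whose color differs from all large neighboring regions scores highest, which matches the global-contrast intuition the abstract describes.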


"Saliency Detection via Dense and Sp..." refers background or methods in this paper

  • ...We evaluate the proposed algorithm with seventeen state-of-the-art algorithms including IT98 [13], MZ03 [24], LC06 [25], GB06 [10], SR07 [11], AC08 [1], FT09 [2], CA10 [9], RA10 [18], RC11 [7], CB11 [14], SVO11 [6], DW11 [8], SF12 [17], LR12 [19], GS12 [21] and XL13 [22] on three benchmark data sets: ASD, MSRA and SOD....

    [...]

  • ...The approach in [7] (referred as RC11) is also presented as a baseline model for comparisons....

    [...]

  • ...Figure 7(a) shows that the sparse reconstruction error based on background templates achieves better accuracy in detecting salient objects than RC11 [7], while the dense one is comparable with it....

    [...]

  • ...Recent methods [7, 8] measure global contrast-based saliency based on spatially weighted feature dissimilarities....

    [...]

  • ...Original IT [13] GB [10] SR [11] FT [2] CA [9] RA [18] DW [8] CB [14] RC [7] SVO [6] LR [19] DSR DSR cut GT...

    [...]

Frequently Asked Questions (8)
Q1. What contributions have the authors mentioned in the paper "Saliency Detection via Dense and Sparse Reconstruction"?

In this paper, the authors propose a visual saliency detection algorithm from the perspective of reconstruction errors. 

The eigenvectors of the normalized covariance matrix of $B$, $U_B = [u_1, u_2, \ldots, u_{D'}]$, corresponding to the largest $D'$ eigenvalues, are computed to form the PCA bases of the background templates.

The reconstruction error of a pixel is assigned by integrating the multiscale reconstruction errors, which helps generate more accurate and uniform saliency maps. 

To combine the two saliency maps via dense and sparse reconstruction, the authors introduce a Bayesian integration method which performs better than the conventional integration strategy. 

For multi-scale reconstruction errors, the authors generate superpixels at eight different scales, with between 50 and 400 superpixels per scale.

The authors evaluate the performance of the Bayesian integrated saliency map $S_B$ by comparing it with the integration strategies formulated in [5]:

$$S_c = \frac{1}{Z}\sum_i Q(S_i) \quad \text{or} \quad S_c = \frac{1}{Z}\prod_i Q(S_i), \qquad (15)$$

where $Z$ is the partition function.
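The conventional integration in equation (15) can be sketched in numpy. The text does not define $Q(\cdot)$ or $Z$ precisely, so this sketch assumes $Q$ rescales each map to $[0, 1]$ and $Z$ simply renormalizes the combined map; both are assumptions.

```python
import numpy as np

def normalize(S):
    """Assumed Q(.): rescale a saliency map to [0, 1]."""
    return (S - S.min()) / (S.max() - S.min() + 1e-12)

def integrate(maps, mode="sum"):
    """Equation (15): normalized sum or product of the individual
    saliency maps, with Z taken as a final rescaling constant."""
    Q = [normalize(S) for S in maps]
    comb = np.sum(Q, axis=0) if mode == "sum" else np.prod(Q, axis=0)
    return normalize(comb)
```

The Bayesian integration the paper proposes replaces this fixed rule with posterior probabilities computed from the dense and sparse maps; the sketch above is only the baseline it is compared against.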

The authors integrate multi-scale reconstruction errors and compute the pixel-level reconstruction error by

$$E(z) = \frac{\sum_{s=1}^{N_s} \omega_z^{n(s)}\,\tilde{\varepsilon}_{n(s)}}{\sum_{s=1}^{N_s} \omega_z^{n(s)}}, \qquad \omega_z^{n(s)} = \frac{1}{\|f_z - x_{n(s)}\|_2}, \qquad (7)$$

where $f_z$ is a $D$-dimensional feature of pixel $z$ and $n(s)$ denotes the label of the segment containing pixel $z$ at scale $s$. Similarly to [14], the authors utilize the similarity between pixel $z$ and its corresponding segment $n(s)$ as the weight to average the multi-scale reconstruction errors.
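Equation (7) is a weighted average over scales, which can be sketched directly. The data layout here is an assumption: `labels_z[s]` holds the label $n(s)$ of the pixel's segment at scale $s$, `seg_feats[s]` the segment features $x$, and `seg_errs[s]` the propagated segment errors $\tilde{\varepsilon}$.

```python
import numpy as np

def pixel_error(f_z, labels_z, seg_feats, seg_errs):
    """Equation (7): pixel-level reconstruction error as a weighted
    average of segment errors over N_s scales, weighting each scale by
    the inverse feature distance between the pixel and its segment."""
    n_s = len(labels_z)
    w = np.array([1.0 / (np.linalg.norm(f_z - seg_feats[s][labels_z[s]]) + 1e-12)
                  for s in range(n_s)])
    e = np.array([seg_errs[s][labels_z[s]] for s in range(n_s)])
    return float((w * e).sum() / w.sum())
```

Scales whose segment closely resembles the pixel in feature space dominate the average, which is what yields the more accurate and uniform maps the summary mentions.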

The propagated reconstruction error of segment $i$ belonging to cluster $k$ ($k = 1, 2, \ldots, K$) is modified by considering its appearance-based context, which consists of the other segments in cluster $k$, as follows:

$$\tilde{\varepsilon}_i = \tau \sum_{j=1}^{N_c} w_{ik_j} \tilde{\varepsilon}_{k_j} + (1 - \tau)\,\varepsilon_i, \qquad (5)$$

$$w_{ik_j} = \frac{\exp\!\left(-\frac{\|x_i - x_{k_j}\|^2}{2\sigma_x^2}\right)\bigl(1 - \delta(k_j - i)\bigr)}{\sum_{j=1}^{N_c} \exp\!\left(-\frac{\|x_i - x_{k_j}\|^2}{2\sigma_x^2}\right)}, \qquad (6)$$

where $\{k_1, k_2, \ldots, k_{N_c}\}$ denote the $N_c$ segment labels in cluster $k$ and $\tau$ is a weight parameter.
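The context propagation of equations (5) and (6) can be sketched with numpy, given per-segment features, initial errors, and K-means cluster labels; the array layout is an assumption, and the self-exclusion term $(1 - \delta(k_j - i))$ is applied to the numerator as written in the equation.

```python
import numpy as np

def propagate_errors(errs, feats, clusters, tau=0.5, sigma_x=0.25):
    """Equations (5)-(6): smooth each segment's reconstruction error with
    the appearance-based context formed by the other segments in its
    K-means cluster, using Gaussian feature-similarity weights."""
    out = np.empty_like(errs)
    for i in range(len(errs)):
        idx = np.flatnonzero(clusters == clusters[i])   # segments k_1..k_Nc
        e_w = np.exp(-np.sum((feats[idx] - feats[i])**2, axis=1)
                     / (2 * sigma_x**2))
        w = e_w * (idx != i) / e_w.sum()                # zero the self term
        out[i] = tau * (w * errs[idx]).sum() + (1 - tau) * errs[i]
    return out
```

A segment whose error disagrees with its appearance-similar cluster mates is pulled toward their consensus, while a segment alone in its cluster keeps only the $(1 - \tau)\varepsilon_i$ term.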