Proceedings ArticleDOI

GrabCut in One Cut

01 Dec 2013-pp 1769-1776
TL;DR: This work proposes a new energy term that explicitly measures the L1 distance between the object and background appearance models and can be globally maximized in one graph cut, and shows that in many applications this simple term makes NP-hard segmentation functionals unnecessary.
Abstract: Among image segmentation algorithms there are two major groups: (a) methods assuming known appearance models and (b) methods estimating appearance models jointly with segmentation. Typically, the first group optimizes appearance log-likelihoods in combination with some spatial regularization. This problem is relatively simple and many methods guarantee globally optimal results. The second group treats model parameters as additional variables, transforming simple segmentation energies into high-order NP-hard functionals (Zhu-Yuille, Chan-Vese, GrabCut, etc.). It is known that such methods indirectly minimize the appearance overlap between the segments. We propose a new energy term explicitly measuring the L1 distance between the object and background appearance models that can be globally maximized in one graph cut. We show that in many applications our simple term makes NP-hard segmentation functionals unnecessary. Our one-cut algorithm effectively replaces approximate iterative optimization techniques based on block coordinate descent.
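
The key identity behind this one-cut construction can be sketched as follows (the notation is ours, reconstructed from the abstract, not quoted from the paper). With unnormalized K-bin color histograms $\theta_S = (n^1_S, \dots, n^K_S)$ and $\theta_{\bar{S}}$ for the object and background segments of an image with $|\Omega|$ pixels, each bin total $n^k = n^k_S + n^k_{\bar{S}}$ is a constant, so

$$ \|\theta_S - \theta_{\bar{S}}\|_{L_1} \;=\; \sum_{k=1}^{K} \bigl| n^k_S - n^k_{\bar{S}} \bigr| \;=\; |\Omega| \;-\; 2 \sum_{k=1}^{K} \min\bigl( n^k_S,\, n^k_{\bar{S}} \bigr). $$

Maximizing the L1 distance is therefore equivalent to minimizing $\sum_k \min(n^k_S, n^k_{\bar{S}})$, a submodular quantity that can be encoded in the cut graph (e.g. with one auxiliary node per color bin), so a single min-cut suffices.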


Citations
Proceedings ArticleDOI
20 Jun 2011
TL;DR: This work proposes a regional contrast based saliency extraction algorithm, which simultaneously evaluates global contrast differences and spatial coherence, and consistently outperforms existing saliency detection methods.
Abstract: Automatic estimation of salient object regions across images, without any prior assumption or knowledge of the contents of the corresponding scenes, enhances many computer vision and computer graphics applications. We introduce a regional contrast based salient object detection algorithm, which simultaneously evaluates global contrast differences and spatial weighted coherence scores. The proposed algorithm is simple, efficient, naturally multi-scale, and produces full-resolution, high-quality saliency maps. These saliency maps are further used to initialize a novel iterative version of GrabCut, namely SaliencyCut, for high quality unsupervised salient object segmentation. We extensively evaluated our algorithm using traditional salient object detection datasets, as well as a more challenging Internet image dataset. Our experimental results demonstrate that our algorithm consistently outperforms 15 existing salient object detection and segmentation methods, yielding higher precision and better recall rates. We also show that our algorithm can be used to efficiently extract salient object masks from Internet images, enabling effective sketch-based image retrieval (SBIR) via simple shape comparisons. Despite such noisy internet images, where the saliency regions are ambiguous, our saliency guided image retrieval achieves a superior retrieval rate compared with state-of-the-art SBIR methods, and additionally provides important target object region information.
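
As a rough illustration of the global-contrast idea in this abstract, here is a minimal sketch of histogram-based contrast (HC), where each quantized color scores by its frequency-weighted Lab distance to every other color occurring in the image. The bin count, the use of bin centers, and the omission of the paper's region-level contrast, spatial weighting, and color-space smoothing are all our simplifications.

    import numpy as np

    def hc_saliency(img_lab, bins=12):
        """Simplified histogram-contrast saliency sketch (not the full RC method).

        `img_lab` is an HxWx3 uint8 image in Lab space (OpenCV convention).
        """
        # Quantize each Lab channel into `bins` levels; flatten to one bin index.
        q = np.clip((img_lab / (256.0 / bins)).astype(int), 0, bins - 1)
        idx = q[..., 0] * bins * bins + q[..., 1] * bins + q[..., 2]
        counts = np.bincount(idx.ravel(), minlength=bins ** 3).astype(float)
        occupied = np.flatnonzero(counts)                  # colors that occur
        # Approximate each occupied bin by its center in Lab space.
        centers = np.stack(np.unravel_index(occupied, (bins,) * 3), axis=1)
        centers = (centers + 0.5) * (256.0 / bins)
        dist = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
        # Frequency-weighted contrast of each color against all others.
        sal = (dist * counts[occupied][None, :]).sum(axis=1)
        lut = np.zeros(bins ** 3)
        lut[occupied] = sal / sal.max()
        return lut[idx]                                    # per-pixel saliency map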

3,653 citations


Cites methods from "GrabCut in One Cut"

  • ...Although several new methods [53], [54] have been developed since the initial version of this work [1], to the best of our knowledge, our salient object segmentation results are still the best results reported on the most widely used benchmark [33]....


Proceedings ArticleDOI
18 Jun 2018
TL;DR: This paper explores the use of extreme points in an object as input to obtain precise object segmentation for images and videos by adding an extra channel to the image in the input of a convolutional neural network (CNN), which contains a Gaussian centered in each of the extreme points.
Abstract: This paper explores the use of extreme points in an object (left-most, right-most, top, bottom pixels) as input to obtain precise object segmentation for images and videos. We do so by adding an extra channel to the image in the input of a convolutional neural network (CNN), which contains a Gaussian centered in each of the extreme points. The CNN learns to transform this information into a segmentation of an object that matches those extreme points. We demonstrate the usefulness of this approach for guided segmentation (grabcut-style), interactive segmentation, video object segmentation, and dense segmentation annotation. We show that we obtain the most precise results to date, also with less user input, in an extensive and varied selection of benchmarks and datasets. All our models and code are publicly available on http://www.vision.ee.ethz.ch/~cvlsegmentation/dextr/.
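
The extra guidance channel this abstract describes can be sketched in a few lines; the Gaussian width below and the choice of combining the bumps by pixel-wise maximum are our assumptions, not necessarily the paper's settings.

    import numpy as np

    def extreme_point_channel(shape, points, sigma=10.0):
        """Heatmap channel with one 2D Gaussian per extreme point.

        `shape` is (H, W); `points` holds the four (x, y) extreme points
        (left-most, right-most, top, bottom).
        """
        h, w = shape
        ys, xs = np.mgrid[0:h, 0:w]
        channel = np.zeros((h, w), dtype=np.float32)
        for x, y in points:
            bump = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))
            channel = np.maximum(channel, bump)   # combine bumps by maximum
        return channel

    # 4-channel network input: RGB image plus the guidance heatmap, e.g.
    # inp = np.dstack([rgb, extreme_point_channel(rgb.shape[:2], pts)])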

371 citations

Posted Content
TL;DR: In this article, the authors use extreme points in an object (left-most, right-most, top, bottom pixels) as input to obtain precise object segmentation for images and videos by adding an extra channel to the image in the input of a convolutional neural network.
Abstract: This paper explores the use of extreme points in an object (left-most, right-most, top, bottom pixels) as input to obtain precise object segmentation for images and videos. We do so by adding an extra channel to the image in the input of a convolutional neural network (CNN), which contains a Gaussian centered in each of the extreme points. The CNN learns to transform this information into a segmentation of an object that matches those extreme points. We demonstrate the usefulness of this approach for guided segmentation (grabcut-style), interactive segmentation, video object segmentation, and dense segmentation annotation. We show that we obtain the most precise results to date, also with less user input, in an extensive and varied selection of benchmarks and datasets. All our models and code are publicly available on http://www.vision.ee.ethz.ch/~cvlsegmentation/dextr/.

145 citations

Proceedings ArticleDOI
15 Jun 2019
TL;DR: An interactive image segmentation algorithm, which accepts user annotations about a target object and the background, is proposed, and a backpropagating refinement scheme (BRS) is developed that corrects mislabeled pixels in the initial result.
Abstract: An interactive image segmentation algorithm, which accepts user annotations about a target object and the background, is proposed in this work. We convert user annotations into interaction maps by measuring distances of each pixel to the annotated locations. Then, we perform the forward pass in a convolutional neural network, which outputs an initial segmentation map. However, the user-annotated locations can be mislabeled in the initial result. Therefore, we develop the backpropagating refinement scheme (BRS), which corrects the mislabeled pixels. Experimental results demonstrate that the proposed algorithm outperforms the conventional algorithms on four challenging datasets. Furthermore, we demonstrate the generality and applicability of BRS in other computer vision tasks, by transforming existing convolutional neural networks into user-interactive ones.
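
The interaction maps described here are plain distance transforms; a minimal sketch (truncation and normalization details are our assumptions):

    import numpy as np
    from scipy.ndimage import distance_transform_edt

    def interaction_map(shape, clicks, truncate=None):
        """Per-pixel Euclidean distance to the nearest annotated location.

        `clicks` is a list of (x, y) positions for one class of annotation
        (foreground or background).
        """
        mask = np.ones(shape, dtype=bool)       # True everywhere except clicks
        for x, y in clicks:
            mask[y, x] = False
        dist = distance_transform_edt(mask)     # 0 at clicks, grows outward
        if truncate is not None:
            dist = np.minimum(dist, truncate)
        return dist.astype(np.float32)

    # One map for foreground clicks and one for background clicks are stacked
    # with the RGB image to form the network input.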

144 citations

Journal ArticleDOI
TL;DR: A very fast segmentation technique that still achieves very high-quality results is demonstrated; it replaces the time-consuming iterative refinement of global colour models in the traditional GrabCut formulation with a densely connected CRF.
Abstract: Figure-ground segmentation from bounding box input, provided either automatically or manually, has been extremely popular in the last decade and has influenced various applications. A lot of research has focused on high-quality segmentation, using complex formulations which often lead to slow techniques and hamper practical usage. In this paper we demonstrate a very fast segmentation technique which still achieves very high-quality results. We propose to replace the time-consuming iterative refinement of global colour models in the traditional GrabCut formulation by a densely connected CRF. To motivate this decision, we show that a dense CRF implicitly models unnormalized global colour models for foreground and background. This relationship provides an insightful analysis bridging the dense CRF and the GrabCut functional. We extensively evaluate our algorithm using two famous benchmarks. Our experimental results demonstrate that the proposed algorithm achieves an order-of-magnitude (10×) speed-up with respect to the closest competitor, and at the same time achieves considerably higher accuracy.
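
A minimal sketch of the one-shot dense-CRF refinement this abstract proposes in place of iterated colour-model refitting, using the third-party pydensecrf wrapper; all kernel parameters below are placeholders, not the paper's settings.

    import numpy as np
    import pydensecrf.densecrf as dcrf
    from pydensecrf.utils import unary_from_labels

    def densecrf_refine(img, init_labels, n_iters=5):
        """One dense-CRF pass over a rough binary labeling.

        `img` is an HxWx3 uint8 image and `init_labels` an HxW array in
        {0, 1}, e.g. derived from the input bounding box. The bilateral
        kernel plays the role of the implicit global colour models.
        """
        h, w = init_labels.shape
        d = dcrf.DenseCRF2D(w, h, 2)
        # Unary term from the initial labels (shifted to 1..2; 0 means "unsure").
        d.setUnaryEnergy(unary_from_labels(init_labels + 1, 2, gt_prob=0.9))
        d.addPairwiseGaussian(sxy=3, compat=3)                        # smoothness
        d.addPairwiseBilateral(sxy=60, srgb=10, rgbim=img, compat=5)  # appearance
        q = d.inference(n_iters)
        return np.argmax(q, axis=0).reshape(h, w)                     # 1 = foreground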

86 citations

References
Journal ArticleDOI
TL;DR: A new active contour model for detecting objects in a given image is proposed, based on techniques of curve evolution, the Mumford-Shah (1989) functional for segmentation, and level sets; it can detect objects whose boundaries are not necessarily defined by the gradient.
Abstract: We propose a new model for active contours to detect objects in a given image, based on techniques of curve evolution, Mumford-Shah (1989) functional for segmentation and level sets. Our model can detect objects whose boundaries are not necessarily defined by the gradient. We minimize an energy which can be seen as a particular case of the minimal partition problem. In the level set formulation, the problem becomes a "mean-curvature flow"-like evolving the active contour, which will stop on the desired boundary. However, the stopping term does not depend on the gradient of the image, as in the classical active contour models, but is instead related to a particular segmentation of the image. We give a numerical algorithm using finite differences. Finally, we present various experimental results and in particular some examples for which the classical snakes methods based on the gradient are not applicable. Also, the initial curve can be anywhere in the image, and interior contours are automatically detected.
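
For reference, the functional this model minimizes (a standard statement of the Chan-Vese energy, matching the abstract's description) is

$$ F(c_1, c_2, C) \;=\; \mu\,\mathrm{Length}(C) \;+\; \nu\,\mathrm{Area}(\mathrm{inside}(C)) \;+\; \lambda_1 \int_{\mathrm{inside}(C)} |u_0 - c_1|^2 \, dx\,dy \;+\; \lambda_2 \int_{\mathrm{outside}(C)} |u_0 - c_2|^2 \, dx\,dy, $$

where $u_0$ is the image, $C$ the evolving contour, and $c_1, c_2$ the average intensities inside and outside $C$; this piecewise-constant two-phase case is the "particular case of the minimal partition problem" mentioned above.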

10,404 citations


"GrabCut in One Cut" refers methods in this paper

  • ...Some well-known approaches to segmentation [25, 19, 6] consider model parameters as extra optimization variables in their segmentation energies....


Journal ArticleDOI
01 Aug 2004
TL;DR: A more powerful, iterative version of the graph-cut optimisation is developed, and the power of the iterative algorithm is used to substantially simplify the user interaction needed for a given quality of result.
Abstract: The problem of efficient, interactive foreground/background segmentation in still images is of great practical importance in image editing. Classical image segmentation tools use either texture (colour) information, e.g. Magic Wand, or edge (contrast) information, e.g. Intelligent Scissors. Recently, an approach based on optimization by graph-cut has been developed which successfully combines both types of information. In this paper we extend the graph-cut approach in three respects. First, we have developed a more powerful, iterative version of the optimisation. Secondly, the power of the iterative algorithm is used to simplify substantially the user interaction needed for a given quality of result. Thirdly, a robust algorithm for "border matting" has been developed to estimate simultaneously the alpha-matte around an object boundary and the colours of foreground pixels. We show that for moderately difficult examples the proposed method outperforms competitive tools.
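
This iterative algorithm is what OpenCV ships as cv2.grabCut; a minimal usage sketch (the file path and box coordinates are placeholders):

    import cv2
    import numpy as np

    img = cv2.imread("input.jpg")                    # placeholder path
    mask = np.zeros(img.shape[:2], np.uint8)
    bgd = np.zeros((1, 65), np.float64)              # internal GMM state
    fgd = np.zeros((1, 65), np.float64)
    rect = (50, 50, 300, 400)                        # user box: x, y, w, h
    # Each of the 5 iterations refits the colour GMMs to the current
    # segmentation and re-solves one graph cut under the fixed models.
    cv2.grabCut(img, mask, rect, bgd, fgd, 5, cv2.GC_INIT_WITH_RECT)
    fg = ((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD)).astype(np.uint8)
    result = img * fg[:, :, None]                    # border matting not included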

5,670 citations

Journal ArticleDOI
TL;DR: This paper compares the running times of several standard algorithms, as well as a recently developed new algorithm, which works several times faster than any of the other methods, making near real-time performance possible.
Abstract: Minimum cut/maximum flow algorithms on graphs have emerged as an increasingly useful tool for exact or approximate energy minimization in low-level vision. The combinatorial optimization literature provides many min-cut/max-flow algorithms with different polynomial time complexity. Their practical efficiency, however, has to date been studied mainly outside the scope of computer vision. The goal of this paper is to provide an experimental comparison of the efficiency of min-cut/max-flow algorithms for applications in vision. We compare the running times of several standard algorithms, as well as a new algorithm that we have recently developed. The algorithms we study include both Goldberg-Tarjan style "push-relabel" methods and algorithms based on Ford-Fulkerson style "augmenting paths." We benchmark these algorithms on a number of typical graphs in the contexts of image restoration, stereo, and segmentation. In many cases, our new algorithm works several times faster than any of the other methods, making near real-time performance possible. An implementation of our max-flow/min-cut algorithm is available upon request for research purposes.
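
The authors' max-flow implementation is widely available through wrappers such as PyMaxflow; a sketch of the binary image-restoration setup used as one of the benchmarks (the input and edge weights are our placeholder choices):

    import numpy as np
    import maxflow

    noisy = (np.random.rand(64, 64) > 0.5).astype(np.float64)  # placeholder input

    g = maxflow.Graph[float]()
    nodes = g.add_grid_nodes(noisy.shape)
    g.add_grid_edges(nodes, weight=1.0)           # 4-connected smoothness (Potts)
    g.add_grid_tedges(nodes, noisy, 1.0 - noisy)  # per-pixel data terms
    g.maxflow()                                   # BK augmenting-paths max-flow
    restored = g.get_grid_segments(nodes)         # boolean label per pixel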

4,463 citations

Proceedings ArticleDOI
20 Jun 2009
TL;DR: This paper introduces a method for salient region detection that outputs full-resolution saliency maps with well-defined boundaries of salient objects, and that outperforms five state-of-the-art algorithms on both the ground-truth evaluation and the segmentation task, achieving both higher precision and better recall.
Abstract: Detection of visually salient image regions is useful for applications like object segmentation, adaptive compression, and object recognition. In this paper, we introduce a method for salient region detection that outputs full resolution saliency maps with well-defined boundaries of salient objects. These boundaries are preserved by retaining substantially more frequency content from the original image than other existing techniques. Our method exploits features of color and luminance, is simple to implement, and is computationally efficient. We compare our algorithm to five state-of-the-art salient region detection methods with a frequency domain analysis, ground truth, and a salient object segmentation application. Our method outperforms the five algorithms both on the ground-truth evaluation and on the segmentation task by achieving both higher precision and better recall.
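
The method reduces to a few lines; a sketch in OpenCV/NumPy (the Gaussian blur below stands in for the paper's small binomial filter):

    import cv2
    import numpy as np

    def ft_saliency(bgr):
        """Frequency-tuned saliency: per-pixel Lab distance between the
        image-wide mean colour and a slightly blurred image."""
        lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB).astype(np.float64)
        mean = lab.reshape(-1, 3).mean(axis=0)        # mean Lab vector
        blurred = cv2.GaussianBlur(lab, (5, 5), 0)    # drop high-frequency noise
        sal = np.linalg.norm(blurred - mean, axis=2)  # full-resolution map
        return sal / sal.max()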

3,723 citations

Proceedings ArticleDOI
20 Jun 2011
TL;DR: This work proposes a regional contrast based saliency extraction algorithm, which simultaneously evaluates global contrast differences and spatial coherence, and consistently outperforms existing saliency detection methods.
Abstract: Automatic estimation of salient object regions across images, without any prior assumption or knowledge of the contents of the corresponding scenes, enhances many computer vision and computer graphics applications. We introduce a regional contrast based salient object detection algorithm, which simultaneously evaluates global contrast differences and spatial weighted coherence scores. The proposed algorithm is simple, efficient, naturally multi-scale, and produces full-resolution, high-quality saliency maps. These saliency maps are further used to initialize a novel iterative version of GrabCut, namely SaliencyCut, for high quality unsupervised salient object segmentation. We extensively evaluated our algorithm using traditional salient object detection datasets, as well as a more challenging Internet image dataset. Our experimental results demonstrate that our algorithm consistently outperforms 15 existing salient object detection and segmentation methods, yielding higher precision and better recall rates. We also show that our algorithm can be used to efficiently extract salient object masks from Internet images, enabling effective sketch-based image retrieval (SBIR) via simple shape comparisons. Despite such noisy internet images, where the saliency regions are ambiguous, our saliency guided image retrieval achieves a superior retrieval rate compared with state-of-the-art SBIR methods, and additionally provides important target object region information.

3,653 citations


"GrabCut in One Cut" refers background or methods or result in this paper

  • ...Optimizing E3(S) results in precision = 91%, recall = 85% and F-measure = 86%, whereas incorporating the appearance term in E4(S) yields precision = 89%, recall = 89% and F-measure = 89%, which is comparable to the state-of-the-art results reported in literature [7](precision = 90%, recall = 90%, F-measure = 90%)....


  • ...Salient objects usually have an appearance that is distinct from the background [1, 7, 17]....


  • ...Fig. 14 compares the performance of E3(S) and E4(S) with that of FT [1], CA [9], LC [24], HC [7] and RC [7] in terms of precision, recall and F-measure defined as...


  • ...Note that our optimization requires one graph-cut only, rather than the iterated EM-style grab-cut refinement in [7]....


  • ...Saliency segmentation results reported for dataset [1]: Precision-Recall and F-measure bars for E3(S), E4(S) are compared to FT [1], CA [9], LC [24], HC [7] and RC [7]....
