ClassCut for unsupervised class segmentation
Summary (4 min read)
1 Introduction
- Image segmentation is a fundamental problem in computer vision.
- Interestingly, most previous approaches to unsupervised segmentation do not use energy functions similar to those in interactive and supervised segmentation, but instead use topic models [2] or other specialized generative models [10, 12] to find recurring patterns in the images.
- The authors propose ClassCut, a novel method for unsupervised segmentation based on a binary pairwise energy function similar to those used in interactive/supervised segmentation.
- Finally, their approach is also related to co-segmentation [21] where the goal is to segment a specific object from two images at the same time.
2 Overview of Our Method
- The goal is to jointly segment objects of an unknown class from a set of images.
- Analogous to the scheme of GrabCut [1], ClassCut alternates two stages: (1) learning/updating a class model given the current segmentations (sec. 4); (2) jointly segmenting the objects in all images given the current class model (sec. 3).
- It converges when the segmentation is unchanged in two consecutive iterations.
- As the class model is used in the next segmentation iteration it transfers knowledge across images, typically from easier images to more difficult ones, aiding their segmentation.
- In the next iteration, this will help in images where the airplane is difficult to segment (e.g. because of low contrast).
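The alternation described above can be sketched as a small driver loop. This is a minimal sketch, not the paper's implementation; the helper names (`segment`, `update_model`, `init_model`) are hypothetical stand-ins for the two stages and the model initialization.

```python
# Hedged sketch of the ClassCut-style alternation: (1) update the class model
# from the current segmentations, (2) re-segment all images given that model,
# repeating until the segmentation is unchanged between consecutive iterations.

def class_cut(images, segment, update_model, init_model, max_iters=20):
    """Alternate model updates and joint segmentation until convergence."""
    model = init_model(images)
    labels = None
    for _ in range(max_iters):
        new_labels = [segment(img, model) for img in images]
        if new_labels == labels:   # unchanged in two consecutive iterations
            break
        labels = new_labels
        model = update_model(images, labels)   # knowledge crosses images here
    return labels, model
```

Because the model is re-learned from all current segmentations, information from easy images reaches the harder ones on the next pass.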
3 Segmentation
- Each image I^n (given either as a full image or as an automatically determined reference frame) consists of superpixels {S_1^n, ..., S_{K_n}^n}.
- A segmentation assigns label l_k^n = 1 to every superpixel S_k^n on the foreground and l_j^n = 0 to every superpixel S_j^n on the background.
3.1 Prior ΦΘ(L, I)
- It penalizes neighboring superpixels having different labels.
- Thus, the penalty is smaller if the two superpixels are separated by high gradients.
- Objects rarely touch the boundary of the reference frame.
- This term penalizes labeling as foreground those superpixels that touch the border of the reference frame (fig. 2).
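The border term can be sketched directly from its definition (eq. (5) in the paper): each foreground-labeled superpixel pays a penalty proportional to the fraction of its perimeter lying on the reference-frame border. The flat per-superpixel lists used here are an assumed simplified representation.

```python
# Sketch of the border penalty: sum of l_k * border(S_k) / perimeter(S_k),
# where border(S_k) counts the superpixel's pixels on the reference-frame
# border and perimeter(S_k) normalizes by its total perimeter.

def border_penalty(labels, border_px, perimeter_px):
    """Penalize foreground superpixels touching the reference-frame border."""
    return sum(l * b / p for l, b, p in zip(labels, border_px, perimeter_px))
```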
3.2 Class Model ΨΘ(L, I)
- The scalars w are part of the model parameters Θ and weight the terms.
- To compute the energy contribution for a superpixel S_k^n labeled foreground, the authors average over all positions in S_k^n and incorporate this into eq. (7) as Ω_Θ(L, I) = Σ_n Σ_k (1/|S_k^n|) Σ_{s ∈ S_k^n} −log p_Ω(l_k^n | s) (8). Fig. 3a shows a final location model obtained after convergence.
- Fig. 4 shows an initial shape model and a shape model after convergence.
- As visual descriptors f the authors use color distributions (col) and bag-of-words [23] of SURF descriptors [24] (bow).
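The per-superpixel averaging in the location term (eq. (8)) can be sketched as follows. `p_fg` (the location model's per-position foreground probability) and the pixel-list representation of superpixels are assumptions for illustration, not the paper's interface.

```python
import math

# Sketch of eq. (8): for each superpixel, average the negative log-likelihood
# of its label over all pixel positions s inside it, then sum over superpixels.

def location_term(superpixels, labels, p_fg):
    """superpixels: list of pixel-position lists; labels: 0/1 per superpixel."""
    total = 0.0
    for pixels, lab in zip(superpixels, labels):
        nll = [-math.log(p_fg(s) if lab == 1 else 1.0 - p_fg(s))
               for s in pixels]
        total += sum(nll) / len(pixels)   # average over positions in S_k^n
    return total
```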
3.3 Energy Minimization
- To label these superpixels the authors use TRW-S [15].
- TRW-S not only labels them but also computes a lower bound on the energy which may be used to assess how far from the global optimum the solution is.
- In their experiments, the authors observed that QPBO labels on average 91% of the superpixels according to the global optimum.
- Furthermore, the authors observed that the minimization problem is hardest in the first few iterations and easier in the later iterations: over the iterations QPBO labels more superpixels and the difference between the lower bound and the actual energy of the solutions is also decreased.
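The quality checks mentioned above amount to simple bookkeeping, sketched here under the assumption that per-iteration energies, TRW-S lower bounds, and QPBO labelings (with `None` marking an unlabeled superpixel) are recorded.

```python
# Sketch of tracking minimization difficulty over iterations: the fraction of
# superpixels QPBO managed to label, and the gap between each solution's
# energy and the TRW-S lower bound (which bounds the distance to the optimum).

def solution_quality(energies, lower_bounds, qpbo_labels):
    labeled_frac = [sum(l is not None for l in labs) / len(labs)
                    for labs in qpbo_labels]
    gaps = [e - lb for e, lb in zip(energies, lower_bounds)]
    return labeled_frac, gaps
```

Shrinking gaps and growing labeled fractions over iterations reproduce the trend the authors report: the problem gets easier as the class model improves.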
4.1 Location Model
- The location model Ω is initialized uniformly.
- At each iteration, the authors update the parameters of the location model using the current segmentation of all images of the current class according to the maximum likelihood criterion (fig. 3a): for each cell in the 32×32 grid they reestimate the empirical probability of foreground using the current segmentations.
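The maximum-likelihood update above reduces to per-cell counting. The `{(i, j): [0/1 labels]}` mapping of pixels to grid cells is an assumed data layout for this sketch.

```python
# ML update sketch for the location model: each cell of the grid gets the
# empirical fraction of foreground pixels currently falling into it; cells
# that receive no pixels stay at an uninformative 0.5.

def update_location_model(cell_labels, grid=32):
    model = [[0.5] * grid for _ in range(grid)]
    for (i, j), labs in cell_labels.items():
        model[i][j] = sum(labs) / len(labs)
    return model
```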
4.2 Shape Model
- The shape model Π is initialized by accumulating the boundaries of all superpixels in the reference frame over all images.
- As the boundaries of superpixels follow likely object boundaries, they will reoccur consistently along the true object boundaries across multiple images.
- The initial shape model (fig. 4) already contains a rough outline of the unknown object class.
- At each iteration, the authors update the parameters of the shape model using the current segmentation of all images according to the maximum likelihood criterion: for each of the 5 orientations in the 32×32 grid, they reestimate the empirical probability for a label-change at this position and with this orientation.
- While the shape model only knows about the boundaries of an object but not on which side is foreground or background, jointly with the location model (and with the between-image smoothness) it will encourage similar shapes in similar spatial arrangements to be segmented in all the images.
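The shape-model update is again an empirical-ratio computation. The count arrays indexed `[orientation][row][col]` are an assumed bookkeeping layout; the paper stores 5 orientations on a 32×32 grid.

```python
# ML update sketch for the shape model: for each (orientation, grid cell),
# the probability of a label change is the ratio of observed boundary events
# (change_counts) to opportunities (total_counts) under the current
# segmentations; cells never visited get probability 0.

def update_shape_model(change_counts, total_counts):
    return [[[c / t if t else 0.0 for c, t in zip(crow, trow)]
             for crow, trow in zip(cgrid, tgrid)]
            for cgrid, tgrid in zip(change_counts, total_counts)]
```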
4.3 Appearance Model
- The appearance models Υ^f are initialized using the color/SURF observations from all images, based on an initial segmentation.
- This initial segmentation is obtained from a generic prior of object location trained on an external set of images with objects of other classes and their ground-truth segmentations (fig. 3b).
- From this object location prior, the authors select the top 75% of pixels as foreground and the remaining 25% as background.
- The authors observe that this location prior is essentially a Gaussian in the middle of the reference frame.
- If the authors are using automatically determined reference frames, the observations for the background are collected from both pixels outside the reference frame and pixels inside the reference frame but labelled as background.
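The 75%/25% split above can be sketched by ranking pixels by the generic location prior. The flat list of per-pixel prior scores is an assumption for illustration.

```python
# Sketch of the appearance-model initialization: the top 75% of pixels by
# location-prior value seed the foreground model, the rest the background.

def split_by_location_prior(prior_values, fg_fraction=0.75):
    order = sorted(range(len(prior_values)),
                   key=lambda i: prior_values[i], reverse=True)
    cut = int(round(fg_fraction * len(prior_values)))
    return set(order[:cut]), set(order[cut:])
```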
5 Finding the Reference Frame
- To find the reference frame, the authors use the objectness measure of [18] which quantifies how likely it is for an image window to contain an object of any class.
- Objectness is trained to distinguish windows containing an object with a well-defined boundary and center, such as cows and telephones, from amorphous background windows, such as grass and road.
- The authors sample 1000 windows likely to contain an object from this measure, project the object location prior (sec. 4.3) into these windows and accumulate into an objectness map M (fig. 5, bottom).
- M will have peaks on the objects in the image.
- In the experiments the authors demonstrate that this method improves the results of unsupervised segmentation compared to using the full images (sec. 6).
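The accumulation into the map M can be sketched as below. Projecting the location prior into each window is omitted here; a uniform per-window contribution weighted by the window's score is assumed instead.

```python
# Sketch of building an objectness map M: each sampled window
# (x0, y0, x1, y1), half-open, adds its score to every pixel it covers,
# so pixels covered by many high-scoring windows form peaks on the objects.

def objectness_map(height, width, windows, scores):
    M = [[0.0] * width for _ in range(height)]
    for (x0, y0, x1, y1), s in zip(windows, scores):
        for y in range(y0, y1):
            for x in range(x0, x1):
                M[y][x] += s
    return M
```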
6.1 Datasets
- The authors evaluate their unsupervised segmentation method on three datasets of varying difficulty and compare the results to a single-image GrabCut and to other state-of-the-art methods.
- In no experiment are training images with segmentations of the unknown class used.
- The authors use the experimental setup of [9]: for the classes airplanes, cars, faces, and motorbikes, they use the test images of [27] and segment the objects using no training data.
- The authors use an experimental setup similar to [2]: for 28 classes, they randomly select 30 images each and determine the segmentations of the objects.
- Note that [2] additionally uses 30 training images for each class and solves a joint segmentation and classification task (not done here).
6.2 Baselines and the State of the Art
- To initialize GrabCut, the authors train a foreground color model from the central 25% of the area of the image and a background model from the rest.
- Using these models, GrabCut is iterated until convergence for each image individually.
- Notice how the automatic reference frame improves the results of GrabCut from line (c) to (d) and how GrabCut is a strong competitor for previous methods [2, 9] that were designed for unsupervised segmentation.
- For the datasets for which results are available, the authors compare their approach to Spatial Topic Models [2].
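The GrabCut-baseline initialization described above (foreground from the central 25% of the image area) can be sketched as a centered rectangle mask; since sqrt(0.25) = 0.5, scaling each side by half gives a rectangle covering a quarter of the area.

```python
# Sketch of the GrabCut-baseline seed: a centered rectangle covering
# area_fraction of the image seeds the foreground color model; all remaining
# pixels seed the background model.

def central_fg_mask(height, width, area_fraction=0.25):
    scale = area_fraction ** 0.5   # side scale so rect area = fraction * H * W
    rh, rw = int(round(scale * height)), int(round(scale * width))
    y0, x0 = (height - rh) // 2, (width - rw) // 2
    return [[y0 <= y < y0 + rh and x0 <= x < x0 + rw
             for x in range(width)] for y in range(height)]
```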
6.3 ClassCut
- The authors evaluate the ability of ClassCut to segment objects of an unknown class in a set of images.
- Note also, how ClassCut improves its accuracy over iterations (line (e) to (f)), showing that it is properly learning about the class.
- Using ClassCut the authors obtain a segmentation accuracy of 83.6%, outperforming both GrabCut (line (c)) and the spatial topic model [2] (line (a)).
- Since neither [2, 9] use any such measure the authors compare to the GrabCut baseline.
- This shows that the segmentations obtained using ClassCut are better aligned to the ground-truth segmentation than those from GrabCut.
7 Conclusion
- The authors presented a novel approach to unsupervised class segmentation.
- The authors' approach alternates between jointly segmenting the objects in all images and updating a class model, which allows it to benefit from insights gained in interactive segmentation and object class detection.
- Their model comprises inter-image priors and a comprehensive class model accounting for object appearance, shape, and location w.r.t. an automatically determined reference frame.
- The authors demonstrate that the reference frame enables learning a novel type of shape model and aids the segmentation process.
Frequently Asked Questions (14)
Q2. What is the purpose of the class model?
As the class model is used in the next segmentation iteration it transfers knowledge across images, typically from easier images to more difficult ones, aiding their segmentation.
Q3. What fraction of pairwise terms in the final model are non-submodular?
The authors observed that on average only about 2% of the pairwise terms in the final model (i.e. incorporating all cues) are non-submodular.
Q4. What is the effect of the shape model jointly with the location model?
While the shape model only knows about the boundaries of an object but not on which side is foreground or background, jointly with the location model (and with the between-image smoothness) it will encourage similar shapes in similar spatial arrangements to be segmented in all the images.
Q5. What is the effect of the appearance model?
Note that their appearance model extends the model of GrabCut [1] with a bag-of-words of SURF descriptors, which is known to perform well for object classes.
Q6. What is the class model the authors propose?
The class model the authors propose (sec. 3.2) consists of several components modeling different class characteristics: appearance, location, and shape.
Q7. How do the authors set the weights and object location prior?
Weights and the generic object location prior are set by leave-one-out (setting parameters on 27 classes and testing on the remaining one; this is repeated 28 times).
Q8. What is the effect of QPBO on the energy of superpixels?
The authors observed that the minimization problem is hardest in the first few iterations and easier in the later iterations: over the iterations QPBO labels more superpixels and the difference between the lower bound and the actual energy of the solutions also decreases.
Q9. How is the objectness measure used to find the reference frame?
To find the reference frame, the authors use the objectness measure of [18] which quantifies how likely it is for an image window to contain an object of any class.
Q10. What is the probability of a label change?
At each iteration, the authors update the parameters of the shape model using the current segmentation of all images according to the maximum likelihood criterion: for each of the 5 orientations in the 32×32 grid, the authors reestimate the empirical probability for a label-change at this position and with this orientation.
Q11. What is the appearance model for a superpixel?
Υ^f_Θ(L, I) = Σ_n Σ_k −(1/|S_k^n|) Σ_{s ∈ S_k^n} log p_f(l_k^n | s) (11). The appearance models capture the appearance of the foreground and background regions.
Q12. What is the gradient grad between Sjn and S k n?
The gradient grad(S_j^n, S_k^n) between S_j^n and S_k^n is computed by summing the gradient magnitudes [26] along the boundary between S_j^n and S_k^n (fig. 2a), normalized w.r.t. the length of the boundary.
Q13. What is the effect of the priors?
If a common reference frame on the objects is available, their method exploits it to anchor the location and shape models to it and to improve the effectiveness of some of the priors.
Q14. What is the penalty for superpixels touching the border of the reference frame?
The border penalty Γ(L, I) = Σ_n Σ_k l_k^n · border(S_k^n) / perimeter(S_k^n) (5) assigns to each superpixel S_k^n a penalty proportional to the number of its pixels touching the reference-frame border (border(S_k^n)), normalized by its perimeter (perimeter(S_k^n)).