SLIC Superpixels Compared to State-of-the-Art Superpixel Methods

01 Nov 2012-IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE Computer Society)-Vol. 34, Iss: 11, pp 2274-2282

TL;DR: A new superpixel algorithm is introduced, simple linear iterative clustering (SLIC), which adapts a k-means clustering approach to efficiently generate superpixels and is faster and more memory efficient, improves segmentation performance, and is straightforward to extend to supervoxel generation.
Abstract: Computer vision applications have come to rely increasingly on superpixels in recent years, but it is not always clear what constitutes a good superpixel algorithm. In an effort to understand the benefits and drawbacks of existing methods, we empirically compare five state-of-the-art superpixel algorithms for their ability to adhere to image boundaries, speed, memory efficiency, and their impact on segmentation performance. We then introduce a new superpixel algorithm, simple linear iterative clustering (SLIC), which adapts a k-means clustering approach to efficiently generate superpixels. Despite its simplicity, SLIC adheres to boundaries as well as or better than previous methods. At the same time, it is faster and more memory efficient, improves segmentation performance, and is straightforward to extend to supervoxel generation.
Topics: Image segmentation (53%), Cluster analysis (52%)

JOURNAL OF LATEX CLASS FILES, VOL. 6, NO. 1, DECEMBER 2011
SLIC Superpixels Compared to State-of-the-art
Superpixel Methods
Radhakrishna Achanta, Appu Shaji, Kevin Smith,
Aurelien Lucchi, Pascal Fua, and Sabine Süsstrunk
Index Terms—Superpixels, segmentation, clustering, k-means.
I. INTRODUCTION
Superpixel algorithms group pixels into perceptually mean-
ingful atomic regions, which can be used to replace the rigid
structure of the pixel grid. They capture image redundancy,
provide a convenient primitive from which to compute image
features, and greatly reduce the complexity of subsequent
image processing tasks. They have become key building blocks
of many computer vision algorithms, such as top scoring multi-
class object segmentation entries to the PASCAL VOC Chal-
lenge [9], [29], [11], depth estimation [30], segmentation [16],
body model estimation [22], and object localization [9].
There are many approaches to generate superpixels, each
with its own advantages and drawbacks that may be better
suited to a particular application. For example, if adherence to
image boundaries is of paramount importance, the graph-based
method of [8] may be an ideal choice. However, if superpixels
are to be used to build a graph, a method that produces a more
regular lattice, such as [23], is probably a better choice. While
it is difficult to define what constitutes an ideal approach for all
applications, we believe the following properties are generally
desirable:
1) Superpixels should adhere well to image boundaries.
2) When used to reduce computational complexity as a pre-
processing step, superpixels should be fast to compute,
memory efficient, and simple to use.
3) When used for segmentation purposes, superpixels
should both increase the speed and improve the quality
of the results.
All authors are with the School of Computer and Communication Sciences
(IC), École Polytechnique Fédérale de Lausanne (EPFL), Switzerland.
E-mail: firstname.lastname@epfl.ch

Fig. 1: Images segmented using SLIC into superpixels of size 64, 256,
and 1024 pixels (approximately).

We therefore performed an empirical comparison of five
state-of-the-art superpixel methods [8], [23], [26], [25], [15],
evaluating their speed, ability to adhere to image boundaries,
and impact on segmentation performance. We also provide
a qualitative review of these, and other, superpixel methods.
Our conclusion is that no existing method is satisfactory in all
regards.
To address this, we propose a new superpixel algorithm:
simple linear iterative clustering (SLIC), which adapts k-
means clustering to generate superpixels in a manner similar
to [30]. While strikingly simple, SLIC is shown to yield state-
of-the-art adherence to image boundaries on the Berkeley
benchmark [20], and outperforms existing methods when used
for segmentation on the PASCAL [7] and MSRC [24] data
sets. Furthermore, it is faster and more memory efficient than
existing methods. In addition to these quantifiable benefits,
SLIC is easy to use, offers flexibility in the compactness and
number of the superpixels it generates, is straightforward to
extend to higher dimensions, and is freely available.¹
II. EXISTING SUPERPIXEL METHODS
Algorithms for generating superpixels can be broadly cat-
egorized as either graph-based or gradient ascent methods.
Below, we review popular superpixel methods for each of
these categories, including some that were not originally de-
signed specifically to generate superpixels. Table I provides a
qualitative and quantitative summary of the reviewed methods,
including their relative performance.
A. Graph-based algorithms
Graph-based approaches to superpixel generation treat each
pixel as a node in a graph. Edge weights between two nodes
are proportional to the similarity between neighboring pixels.
Superpixels are created by minimizing a cost function defined
over the graph.
NC05 The Normalized cuts algorithm [23] recursively
partitions a graph of all pixels in the image using contour
and texture cues, globally minimizing a cost function defined
on the edges at the partition boundaries. It produces very
regular, visually pleasing superpixels. However, the boundary
adherence of NC05 is relatively poor and it is the slowest
among the methods (particularly for large images), although
attempts to speed up the algorithm exist [5]. NC05 has a
complexity of $O(N^{3/2})$ [15], where N is the number of pixels.

¹ Cross-platform executables and source code for SLIC superpixels and
supervoxels can be found at http://ivrg.epfl.ch/research/superpixels

TABLE I: Summary of existing superpixel algorithms. The ability of a superpixel method to adhere to boundaries found in the Berkeley data set [20]
is measured and ranked according to two standard metrics: under-segmentation error and boundary recall (for 500 superpixels). We also report
the average time required to segment images using an Intel Dual Core 2.26 GHz processor with 2GB RAM, and the class-averaged segmentation
accuracy obtained on the MSRC data set using the method described in [11]. Bold entries indicate best performance in each category. Ability to
specify the amount of superpixels, control their compactness, and ability to generate supervoxels is also provided.

Category                                     Graph-based                                  Gradient-ascent-based
Method                                       GS04     NC05     SL08     GCa10ᵇ   GCb10ᵇ   WS91     MS02     TP09ᵇ    QS09     SLIC
Reference                                    [8]      [23]     [21]     [26]     [26]     [28]     [4]      [15]     [25]
Adherence to boundaries
  Under-segmentation error (rank)            0.23     0.22     -        0.22     0.22     -        -        0.24     0.20     0.19
  Boundary recall (rank)                     0.84     0.68     -        0.69     0.70     -        -        0.61     0.79     0.82
Segmentation speed
  320 × 240 image                            1.08sᵃ   178.15s  -        5.30s    4.12s    -        -        8.10s    4.66s    0.36s
  2048 × 1536 image                          90.95sᵃ  N/Aᶜ     -        315s     235s     -        -        800s     181s     14.94s
Segmentation accuracy (using [11] on MSRC)   74.6%    75.9%    -        -        73.2%    -        -        62.0%    75.1%    76.9%
Control over amount of superpixels           No       Yes      Yes      Yes      Yes      No       No       Yes      No       Yes
Control over superpixel compactness          No       No       No       Noᵈ      Noᵈ      No       No       No       No       Yes
Supervoxel extension                         No       No       No       Yes      Yes      Yes      No       No       No       Yes

ᵃ Reported time includes parameter search.
ᵇ Considers intensity only, ignores color.
ᶜ NC05 failed to segment 2048 × 1536 images, producing “out of memory” errors.
ᵈ Constant-intensity (GCa10) or compact (GCb10) superpixels can be selected.
GS04 Felzenszwalb and Huttenlocher [8] propose an alter-
native graph-based approach that has been applied to generate
superpixels. It performs an agglomerative clustering of pixels
as nodes on a graph, such that each superpixel is the minimum
spanning tree of the constituent pixels. GS04 adheres well to
image boundaries in practice, but produces superpixels with
very irregular sizes and shapes. It is O(N log N) complex and
fast in practice. However, it does not offer an explicit control
over the amount of superpixels or their compactness.
SL08 Moore et al. propose a method to generate superpixels
that conform to a grid by finding optimal paths, or seams, that
split the image into smaller vertical or horizontal regions [21].
Optimal paths are found using a graph cuts method similar
to Seam Carving [1]. While the complexity of SL08 is
$O(N^{3/2} \log N)$ according to the authors, this does not account
for the pre-computed boundary maps, which strongly influence
the quality and speed of the output.
GCa10 and GCb10 In [26], Veksler et al. use a global
optimization approach similar to the texture synthesis work
of [14]. Superpixels are obtained by stitching together over-
lapping image patches such that each pixel belongs to only
one of the overlapping regions. They suggest two variants of
their method, one for generating compact superpixels (GCa10)
and one for constant-intensity superpixels (GCb10).
B. Gradient-ascent-based algorithms
Starting from a rough initial clustering of pixels, gradient
ascent methods iteratively refine the clusters until some con-
vergence criterion is met to form superpixels.
MS02 In [4], mean shift, an iterative mode-seeking pro-
cedure for locating local maxima of a density function, is
applied to find modes in the color or intensity feature space
of an image. Pixels that converge to the same mode define the
superpixels. MS02 is an older approach, producing irregularly
shaped superpixels of non-uniform size. It is $O(N^2)$ complex,
making it relatively slow, and it does not offer direct control
over the amount, size, or compactness of superpixels.
QS09 Quick shift [25] also uses a mode-seeking segmentation
scheme. It initializes the segmentation using a medoid
shift procedure. It then moves each point in the feature space to
the nearest neighbor that increases the Parzen density estimate.
While it has relatively good boundary adherence, QS09 is quite
slow, with an $O(dN^2)$ complexity (d is a small constant [25]).
QS09 does not allow for explicit control over the size or
number of superpixels. Previous works have used QS09 for
object localization [9] and motion segmentation [2].
WS91 The watershed approach [28] performs a gradient
ascent starting from local minima to produce watersheds, lines
that separate catchment basins. The resulting superpixels are
often highly irregular in size and shape, and do not exhibit
good boundary adherence. The approach of [28] is relatively
fast (O(N log N) complexity), but does not offer control over
the amount of superpixels or their compactness.
TP09 The Turbopixel method progressively dilates a set
of seed locations using level-set based geometric flow [15].
The geometric flow relies on local image gradients, aiming
to regularly distribute superpixels on the image plane. Unlike
WS91, TP09 superpixels are constrained to have uniform size,
compactness, and boundary adherence. TP09 relies on
algorithms of varying complexity, but in practice, as the authors
claim, has approximately O(N) behaviour [15]. However, it is
among the slowest algorithms examined and exhibits relatively
poor boundary adherence.
III. SLIC SUPERPIXELS
We propose a new method for generating superpixels
which is faster than existing methods, more memory efficient,
exhibits state-of-the-art boundary adherence, and improves
the performance of segmentation algorithms. Simple linear
iterative clustering (SLIC) is an adaptation of k-means for
superpixel generation, with two important distinctions:
1) The number of distance calculations in the optimization
is dramatically reduced by limiting the search space to a
region proportional to the superpixel size. This reduces
the complexity to be linear in the number of pixels N
and independent of the number of superpixels k.
2) A weighted distance measure combines color and spatial
proximity, while simultaneously providing control over
the size and compactness of the superpixels.
SLIC is similar to the approach used as a preprocessing step
for depth estimation described in [30], which was not fully
explored in the context of superpixel generation.
A. Algorithm
SLIC is simple to use and understand. By default, the
only parameter of the algorithm is k, the desired number
of approximately equally-sized superpixels.² For color images
in the CIELAB color space, the clustering procedure begins
with an initialization step where k initial cluster centers
$C_i = [l_i\ a_i\ b_i\ x_i\ y_i]^T$ are sampled on a regular grid spaced
S pixels apart. To produce roughly equally sized superpixels,
the grid interval is $S = \sqrt{N/k}$. The centers are moved to
seed locations corresponding to the lowest gradient position
in a 3 × 3 neighborhood. This is done to avoid centering a
superpixel on an edge, and to reduce the chance of seeding a
superpixel with a noisy pixel.
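
The initialization step can be illustrated with a short Python sketch. It is not the authors' implementation: the function names `init_grid_centers` and `move_to_lowest_gradient` are hypothetical, the image is assumed to be a plain list of rows of gray values, and the gradient is assumed to be the sum of squared central differences, which the text above does not specify.

```python
import math

def init_grid_centers(width, height, k):
    """Place roughly k seeds on a regular grid with spacing S = sqrt(N / k)."""
    n_pixels = width * height
    step = max(1, int(round(math.sqrt(n_pixels / k))))  # grid interval S
    centers = []
    # Start half a step in from the border so seeds sit near cell middles.
    for y in range(step // 2, height, step):
        for x in range(step // 2, width, step):
            centers.append((x, y))
    return step, centers

def move_to_lowest_gradient(gray, x, y):
    """Return the 3x3 neighbor of (x, y) with the smallest gradient.

    Assumption: gradient = squared central differences in x and y.
    Border pixels are skipped because they lack both neighbors.
    """
    height, width = len(gray), len(gray[0])
    best, best_grad = (x, y), float("inf")
    for ny in range(max(1, y - 1), min(height - 1, y + 2)):
        for nx in range(max(1, x - 1), min(width - 1, x + 2)):
            gx = gray[ny][nx + 1] - gray[ny][nx - 1]
            gy = gray[ny + 1][nx] - gray[ny - 1][nx]
            grad = gx * gx + gy * gy
            if grad < best_grad:
                best, best_grad = (nx, ny), grad
    return best
```

For a 320 × 240 image and k = 300, N/k = 256, so the sketch uses a grid interval of 16 pixels. A seed that starts on a vertical intensity edge is shifted to a neighboring flat position.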
Next, in the assignment step, each pixel i is associated with
the nearest cluster center whose search region overlaps its
location, as depicted in Fig. 2. This is the key to speeding up
our algorithm because limiting the size of the search region
significantly reduces the number of distance calculations, and
results in a significant speed advantage over conventional k-
means clustering where each pixel must be compared with all
cluster centers. This is only possible through the introduction
of a distance measure D, which determines the nearest cluster
center for each pixel, as discussed in Section III-B. Since
the expected spatial extent of a superpixel is a region of
approximate size S × S, the search for similar pixels is done
in a region 2S × 2S around the superpixel center.
Once each pixel has been associated to the nearest cluster
center, an update step adjusts the cluster centers to be the mean
$[l\ a\ b\ x\ y]^T$ vector of all the pixels belonging to the cluster.
The $L_2$ norm is used to compute a residual error E between
the new cluster center locations and previous cluster center
locations. The assignment and update steps can be repeated
iteratively until the error converges, but we have found that
10 iterations suffice for most images, and report all results
in this paper using this criterion. Finally, a post-processing
step enforces connectivity by re-assigning disjoint pixels to
nearby superpixels. The entire algorithm is summarized in
Algorithm 1.
² Optionally, the compactness of the superpixels can be controlled by
adjusting m, which is discussed in Section III-B.

(a) Standard k-means searches the entire image. (b) SLIC searches
a limited region.
Fig. 2: Reducing the superpixel search regions. The complexity of SLIC
is linear in the number of pixels in the image, O(N), while the
conventional k-means algorithm is O(kNI), where I is the number of iterations.
This is achieved by limiting the search space of each cluster center in the
assignment step. (a) In the conventional k-means algorithm, distances
are computed from each cluster center to every pixel in the image.
(b) SLIC only computes distances from each cluster center to pixels
within a 2S × 2S region. Note that the expected superpixel size is only
S × S, indicated by the smaller square. This approach not only reduces
distance computations but also makes SLIC’s complexity independent
of the number of superpixels.
Algorithm 1 SLIC superpixel segmentation
/* Initialization */
Initialize cluster centers C_k = [l_k, a_k, b_k, x_k, y_k]^T by
sampling pixels at regular grid steps S.
Move cluster centers to the lowest gradient position in a
3 × 3 neighborhood.
Set label l(i) = −1 for each pixel i.
Set distance d(i) = ∞ for each pixel i.
repeat
  /* Assignment */
  for each cluster center C_k do
    for each pixel i in a 2S × 2S region around C_k do
      Compute the distance D between C_k and i.
      if D < d(i) then
        set d(i) = D
        set l(i) = k
      end if
    end for
  end for
  /* Update */
  Compute new cluster centers.
  Compute residual error E.
until E ≤ threshold
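
The loop structure of Algorithm 1 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the released SLIC code: the function name `slic` and its parameters are hypothetical, the image is a list of rows of (l, a, b) tuples, the gradient-based seed perturbation and the connectivity post-processing are omitted, and the per-pixel distances are reset at every iteration (a common implementation choice that Algorithm 1 does not spell out). The distance is the measure D of Section III-B.

```python
import math

def slic(lab, k, m=10.0, max_iters=10, threshold=1e-3):
    """Cluster a CIELAB image (rows of (l, a, b) tuples) into ~k superpixels."""
    h, w = len(lab), len(lab[0])
    S = max(1, int(round(math.sqrt(h * w / k))))  # grid interval
    # Initialization: centers [l, a, b, x, y] sampled on a regular grid.
    centers = [[*lab[y][x], float(x), float(y)]
               for y in range(S // 2, h, S)
               for x in range(S // 2, w, S)]
    labels = [[-1] * w for _ in range(h)]
    for _ in range(max_iters):
        dist = [[math.inf] * w for _ in range(h)]
        # Assignment: each center only visits its 2S x 2S window.
        for ci, (cl, ca, cb, cx, cy) in enumerate(centers):
            for y in range(max(0, int(cy) - S), min(h, int(cy) + S + 1)):
                for x in range(max(0, int(cx) - S), min(w, int(cx) + S + 1)):
                    l, a, b = lab[y][x]
                    d_c2 = (l - cl) ** 2 + (a - ca) ** 2 + (b - cb) ** 2
                    d_s2 = (x - cx) ** 2 + (y - cy) ** 2
                    D = math.sqrt(d_c2 + d_s2 / (S * S) * m * m)
                    if D < dist[y][x]:
                        dist[y][x], labels[y][x] = D, ci
        # Update: each center becomes the mean labxy vector of its pixels.
        sums = [[0.0] * 5 for _ in centers]
        counts = [0] * len(centers)
        for y in range(h):
            for x in range(w):
                ci = labels[y][x]
                if ci >= 0:
                    counts[ci] += 1
                    for j, v in enumerate((*lab[y][x], x, y)):
                        sums[ci][j] += v
        # Residual error E: total L2 shift of the center locations.
        E = 0.0
        for ci, n in enumerate(counts):
            if n:
                new = [s / n for s in sums[ci]]
                E += math.hypot(new[3] - centers[ci][3], new[4] - centers[ci][4])
                centers[ci] = new
        if E <= threshold:
            break
    return labels
```

On a small synthetic image whose left half is dark and right half is bright, the sketch keeps every superpixel on one side of the intensity edge, because crossing the edge costs far more in color distance than the spatial term can offset at m = 10.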
B. Distance measure
SLIC superpixels correspond to clusters in the labxy color-image
plane space. This presents a problem in defining the
distance measure D, which may not be immediately obvious.
D computes the distance between a pixel i and cluster center
$C_k$ in Algorithm 1. A pixel’s color is represented in the
CIELAB color space $[l\ a\ b]^T$, whose range of possible values
is known. The pixel’s position $[x\ y]^T$, on the other
hand, may take a range of values that varies according to the
size of the image.
Simply defining D to be the five-dimensional Euclidean
distance in labxy space will cause inconsistencies in clustering
behavior for different superpixel sizes. For large superpixels,
spatial distances outweigh color proximity, giving more relative
importance to spatial proximity than color. This produces
compact superpixels that do not adhere well to image boundaries.
For smaller superpixels, the converse is true.
To combine the two distances into a single measure, it is
necessary to normalize color proximity and spatial proximity
by their respective maximum distances within a cluster, $N_c$
and $N_s$. Doing so, $D'$ is written
$$d_c = \sqrt{(l_j - l_i)^2 + (a_j - a_i)^2 + (b_j - b_i)^2},$$
$$d_s = \sqrt{(x_j - x_i)^2 + (y_j - y_i)^2},$$
$$D' = \sqrt{\left(\frac{d_c}{N_c}\right)^2 + \left(\frac{d_s}{N_s}\right)^2}. \tag{1}$$
The maximum spatial distance expected within a given cluster
should correspond to the sampling interval, $N_s = S = \sqrt{N/k}$.
Determining the maximum color distance $N_c$ is
not so straightforward, as color distances can vary significantly
from cluster to cluster and image to image. This problem can
be avoided by fixing $N_c$ to a constant m so that Eq. 1 becomes
$$D' = \sqrt{\left(\frac{d_c}{m}\right)^2 + \left(\frac{d_s}{S}\right)^2}, \tag{2}$$
which simplifies to the distance measure we use in practice
$$D = \sqrt{d_c^2 + \left(\frac{d_s}{S}\right)^2 m^2}. \tag{3}$$
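
The simplification from Eq. 2 to Eq. 3 can be made explicit. The following short derivation is a sketch that assumes m > 0: multiplying Eq. 2 by m gives Eq. 3.

```latex
\begin{align*}
m\,D' &= m\sqrt{\left(\frac{d_c}{m}\right)^2 + \left(\frac{d_s}{S}\right)^2}
       = \sqrt{d_c^2 + \left(\frac{d_s}{S}\right)^2 m^2} = D .
\end{align*}
```

Since m is a positive constant for a given run, D and D' order the candidate cluster centers of a pixel identically, so using D in the assignment step yields the same labels as using D'.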
By defining D in this manner, m also allows us to weigh the
relative importance between color similarity and spatial
proximity. When m is large, spatial proximity is more important
and the resulting superpixels are more compact (i.e. they have
a higher area to perimeter ratio). When m is small, the resulting
superpixels adhere more tightly to image boundaries, but have
less regular size and shape. When using the CIELAB color
space, m can be in the range [1, 40].
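
The effect of m can be checked numerically with a small Python sketch of Eq. 3. The function name `slic_distance` and the tuple layout (l, a, b, x, y) are assumptions made for illustration.

```python
import math

def slic_distance(pixel, center, S, m=10.0):
    """Eq. 3 for two (l, a, b, x, y) tuples; S is the grid interval."""
    # Color distance d_c over the first three components.
    d_c = math.sqrt(sum((p - c) ** 2 for p, c in zip(pixel[:3], center[:3])))
    # Spatial distance d_s over the remaining components.
    d_s = math.sqrt(sum((p - c) ** 2 for p, c in zip(pixel[3:], center[3:])))
    return math.sqrt(d_c ** 2 + (d_s / S) ** 2 * m ** 2)
```

With S = 5, a pixel of identical color located 5 pixels away from a center has D = m, so raising m from 10 to 40 makes the same spatial offset four times as costly relative to a fixed color difference. Because the spatial part is taken over all trailing components, the same sketch also accepts (l, a, b, x, y, z) tuples.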
Equation 3 can be adapted for grayscale images by setting
$d_c = \sqrt{(l_j - l_i)^2}$. It can also be extended to handle 3D
supervoxels, as depicted in Figure 3, by including the depth
dimension in the spatial proximity term of Eq. 3
$$d_s = \sqrt{(x_j - x_i)^2 + (y_j - y_i)^2 + (z_j - z_i)^2}. \tag{4}$$
C. Post-processing
Like some other superpixel algorithms [8], SLIC does not
explicitly enforce connectivity. At the end of the clustering
procedure, some “orphaned” pixels that do not belong to the
same connected component as their cluster center may remain.
To correct for this, such pixels are assigned the label of the
nearest cluster center using a connected components algorithm.
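
One common way to realize such a post-processing step is sketched below in Python. It is an assumption-laden illustration rather than the authors' procedure: the function name `enforce_connectivity` and the `min_size` threshold are hypothetical, 4-connectivity is assumed, and a small fragment is merged into an adjacent, already processed component instead of the nearest cluster center.

```python
from collections import deque

def enforce_connectivity(labels, min_size):
    """Relabel 4-connected components; absorb fragments below min_size."""
    h, w = len(labels), len(labels[0])
    out = [[-1] * w for _ in range(h)]
    next_label = 0
    for sy in range(h):
        for sx in range(w):
            if out[sy][sx] != -1:
                continue
            # In raster order the left and upper neighbors are already done.
            adjacent = None
            for nx, ny in ((sx - 1, sy), (sx, sy - 1)):
                if nx >= 0 and ny >= 0 and out[ny][nx] != -1:
                    adjacent = out[ny][nx]
            # Breadth-first flood fill of the component containing (sx, sy).
            component = [(sx, sy)]
            out[sy][sx] = next_label
            queue = deque(component)
            while queue:
                x, y = queue.popleft()
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < w and 0 <= ny < h and out[ny][nx] == -1
                            and labels[ny][nx] == labels[sy][sx]):
                        out[ny][nx] = next_label
                        component.append((nx, ny))
                        queue.append((nx, ny))
            if len(component) < min_size and adjacent is not None:
                # Orphaned fragment: hand it to the neighboring component.
                for x, y in component:
                    out[y][x] = adjacent
            else:
                next_label += 1
    return out
```

In the sketch, a single pixel carrying the label of a distant superpixel is absorbed by the component that surrounds it, while components of at least `min_size` pixels keep their own label.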
D. Complexity
Fig. 3: SLIC supervoxels computed for a video sequence. (top) Frames
from a short video sequence of a flag waving, shown at t = 0, 20, 40,
60, 80, 100, 120, and 140. (bottom left) A volume
containing the video. The last frame appears at the top of the volume.
(bottom right) A supervoxel segmentation of the video. Supervoxels with
orange cluster centers are removed for display purposes.

By localizing the search in the clustering procedure, SLIC
avoids performing thousands of redundant distance calculations.
In practice, a pixel falls in the neighborhood of fewer than
eight cluster centers, meaning that SLIC is O(N) complex.
In contrast, the trivial upper bound for the classical k-means
algorithm is $O(k^N)$ [17], and the practical time complexity
is O(NkI) [6], where I is the number of iterations required
for convergence. While schemes to reduce the complexity of
k-means have been proposed using prime number length
sampling [27], random sampling [13], local cluster swapping [12],
and by setting lower and upper bounds [6], these methods
are very general in nature. SLIC is specifically tailored to
the problem of superpixel clustering. Finally, unlike most
superpixel methods and the aforementioned approaches to
speed up k-means, the complexity of SLIC is linear in the
number of pixels, irrespective of k.
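
The independence from k can be seen with a rough count of distance evaluations per iteration. The Python sketch below is a back-of-the-envelope estimate, not a measurement: the function name is hypothetical, and clipping of search windows at the image border is ignored.

```python
import math

def distance_evals_per_iteration(n_pixels, k):
    """Approximate distance computations per iteration for both schemes."""
    # Conventional k-means: every pixel is compared with every center.
    kmeans = n_pixels * k
    # SLIC: each of the k centers scans a 2S x 2S window, S = sqrt(N / k),
    # so the total is k * 4 * (N / k) = 4N, and k cancels out.
    S = math.sqrt(n_pixels / k)
    slic = k * (2 * S) ** 2
    return kmeans, slic
```

For a 320 × 240 image with k = 500, this gives 38,400,000 evaluations for conventional k-means against roughly 307,200 (that is, 4N) for SLIC. An average of four candidate centers per pixel is consistent with the statement above that a pixel falls in the neighborhood of fewer than eight cluster centers.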
IV. COMPARISON WITH STATE-OF-THE-ART
We performed a quantitative comparison of SLIC and five
state-of-the-art superpixel methods using publicly available
source code. These algorithms include GS04,³ NC05,⁴ TP09,⁵
QS09,⁶ and two versions of the algorithm proposed in [26],
GCa10 and GCb10.⁷ Examples of superpixel segmentations
produced by each method appear in Fig. 7.
A. Adherence to boundaries
Arguably, the most important property of a superpixel
method is its ability to adhere to image boundaries. Boundary
recall and under-segmentation error are standard measures for
boundary adherence [15], [26]. In Fig. 4(a) and (b), SLIC,
GS04, NC05, TP09, QS09, and GC10, are compared using
these measures on the Berkeley database [20]. In addition, a
baseline performance obtained by segmenting the image into
uniform squares is denoted as “Squares”. The Berkeley data set
contains three-hundred 321 × 481 images, and approximately
10 human-annotated ground truth segmentations correspond-
ing to each image.
³ http://people.cs.uchicago.edu/~pff/segment/
⁴ http://www.cs.sfu.ca/~mori/research/superpixels/
⁵ http://www.cs.toronto.edu/~babalex/turbopixels_supplementary.tar.gz
⁶ http://www.vlfeat.org/download.html
⁷ http://www.csd.uwo.ca/faculty/olga/Code/superpixels1pt1.zip

[Figure 4 panels: (a) Boundary Recall, plotted against the number of
superpixels (500 to 2000); (b) Under-segmentation Error, plotted against
the number of superpixels (500 to 2000); (c) Segmentation Speed. Legend
entries in (a) and (b): GS04, NC05, TP09, QS09, GCa10, GCb10, Squares,
SLIC, GSLIC, ASLIC.]
Fig. 4: Boundary adherence and segmentation speed. (a) Boundary
recall measures the fraction of the ground truth edges that fall within
two pixels of a superpixel boundary. While GS04 demonstrates
the best boundary recall, reducing m from the default value increases
the boundary recall of SLIC over that of GS04. (b) Under-segmentation
error measures the amount of superpixel “leak” for a given ground truth
region. SLIC outperforms the other methods, showing the lowest
under-segmentation error for most of the useful operating regime. (c) Time
required to generate superpixels for images of increasing size. SLIC is
the fastest superpixel method, followed closely by GS04, and then a
significant gap. NC05 is not plotted due to its particularly slow speed.

Boundary recall measures what fraction of the ground truth
edges fall within two pixels of a superpixel boundary.
The boundary recall of each method is plotted in Fig. 4(a)
for increasing numbers of superpixels. A high boundary recall
indicates that very few true edges were missed. Superpixels
generated by SLIC and GS04 demonstrated the best boundary
recall performance. If we reduce SLIC’s compactness m from
its default value of 10, SLIC shows superior performance to
GS04.
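
A boundary recall computation along these lines can be sketched in Python. The sketch is illustrative only: the function name `boundary_recall` is hypothetical, boundaries are assumed to be given as boolean masks (lists of rows), and the two-pixel tolerance is assumed to be a square window around each ground truth boundary pixel.

```python
def boundary_recall(gt_boundary, sp_boundary, tol=2):
    """Fraction of ground truth boundary pixels with a superpixel
    boundary pixel within `tol` pixels (square window)."""
    h, w = len(gt_boundary), len(gt_boundary[0])
    hits = total = 0
    for y in range(h):
        for x in range(w):
            if not gt_boundary[y][x]:
                continue
            total += 1
            found = any(
                sp_boundary[ny][nx]
                for ny in range(max(0, y - tol), min(h, y + tol + 1))
                for nx in range(max(0, x - tol), min(w, x + tol + 1))
            )
            hits += found
    # An image without ground truth boundaries counts as fully recalled.
    return hits / total if total else 1.0
```

A superpixel boundary that runs two columns away from a ground truth edge is still counted as a hit, whereas one that runs three columns away is counted as a miss.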
Under-segmentation error, shown in Fig. 4(b), is another
measure of boundary adherence. Given a region from the
ground truth segmentation $g_i$ and the set of superpixels
required to cover it, $s_j \mid s_j \cap g_i$, it measures how many pixels
from $s_j$ “leak” across the boundary of $g_i$. If $|\cdot|$ is the size of a
segment in pixels, M is the number of ground truth segments,
and B is a minimum number of pixels in $s_j$ overlapping $g_i$,
under-segmentation error is expressed as
$$U = \frac{1}{N}\left[\sum_{i=1}^{M}\left(\sum_{s_j \,\mid\, |s_j \cap g_i| > B} |s_j|\right) - N\right]. \tag{5}$$
B is set to 5% of $|s_j|$ in our experiments to account for
ambiguities in the ground truth. Superpixels that do not tightly
fit the ground truth result in a high value of U.
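
The under-segmentation error can be computed directly from two label maps, as in the following Python sketch. The function name and the label-map representation (lists of rows of integer labels) are assumptions; the 5% overlap threshold follows the text above.

```python
from collections import Counter

def under_segmentation_error(gt, sp, overlap_frac=0.05):
    """Eq. 5 for a ground truth label map gt and a superpixel map sp."""
    h, w = len(gt), len(gt[0])
    n = h * w
    pixels = [(gt[y][x], sp[y][x]) for y in range(h) for x in range(w)]
    sp_size = Counter(s for _, s in pixels)   # |s_j|
    overlap = Counter(pixels)                 # |s_j ∩ g_i|
    covered = 0
    for (g, s), count in overlap.items():
        # Count the whole superpixel once per ground truth segment it
        # overlaps by more than B = overlap_frac * |s_j| pixels.
        if count > overlap_frac * sp_size[s]:
            covered += sp_size[s]
    return (covered - n) / n
```

When the superpixels coincide with the ground truth segments, U is 0. A single superpixel that spans two equally sized ground truth segments is counted twice in full, giving U = 1.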
B. Computational and memory efficiency
Superpixels are often used to replace the pixel-grid to
help speed up other algorithms. Thus, it is important that
superpixels can be generated efficiently in the first place.
In Fig. 4(c), we compare the time required for the various
superpixel methods to segment images of increasing size on
an Intel Dual Core 2.26 GHz processor with 2GB RAM.
SLIC, with its O(N) complexity, is the fastest superpixel
method, and its advantage increases with the size of the image.
While GS04 is competitive with O(N log N) complexity, the
remaining methods show a significant gap in processing speed.
It is also important that a superpixel algorithm is memory
efficient in order to handle large images. SLIC is the most
memory efficient method, requiring only N floats to store
the distance from each pixel to its nearest cluster center.
Other methods have comparatively high memory requirements:
GS04 and GC10 require 5N floats to store edge weights and
thresholds for 4-connectivity (or 9N for 8-connectivity).
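
To put these figures in perspective, the buffer sizes can be worked out for the largest image size used in the timing experiments. The Python sketch below assumes 4-byte single-precision floats, which the text does not state; the function name is hypothetical.

```python
def float_buffer_mib(width, height, floats_per_pixel, bytes_per_float=4):
    """Size in MiB of a buffer holding floats_per_pixel floats per pixel."""
    return width * height * floats_per_pixel * bytes_per_float / 2 ** 20
```

Under that assumption, a 2048 × 1536 image needs 12 MiB for the N floats of the SLIC distance buffer, compared with 60 MiB for 5N floats and 108 MiB for 9N floats.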
C. Segmentation performance
Superpixels are commonly used as a pre-processing step in
segmentation algorithms. A good superpixel algorithm should
improve the performance of the segmentation algorithm that
uses it. We compared the segmentation resulting from SLIC,
GS04, NC05, TP09, QS09, and GC10 on the MSRC data
set [24]. These results were obtained using the method of [11],
which uses superpixels to compute color, texture, geometry,
and location features. It then trains classifiers for the 21 object
classes and learns a CRF model. The results appearing in Table
I show that SLIC superpixels yield the best performance. SLIC
also reduces the computational time by a factor of over
500 compared to NC05, the method used in [11]. Example images
segmented using SLIC are shown in Fig. 5.
We also tested on the PASCAL VOC 2010 data set [7] using
the approach of [10]. As shown in Table II, SLIC provided a
boost in segmentation accuracy over QS09 and reduced the
time spent generating superpixels by an order of magnitude.
D. Discussion
In addition to the properties discussed above, other consider-
ations should factor into the quality of a superpixel algorithm.
One such consideration is the ease of use. Superpixel methods
with many difficult-to-tune parameters can result in lost time

References

Journal ArticleDOI
Jianbo Shi¹, Jitendra Malik² (2 institutions)
TL;DR: This work treats image segmentation as a graph partitioning problem and proposes a novel global criterion, the normalized cut, for segmenting the graph, which measures both the total dissimilarity between the different groups as well as the total similarity within the groups.
Abstract: We propose a novel approach for solving the perceptual grouping problem in vision. Rather than focusing on local features and their consistencies in the image data, our approach aims at extracting the global impression of an image. We treat image segmentation as a graph partitioning problem and propose a novel global criterion, the normalized cut, for segmenting the graph. The normalized cut criterion measures both the total dissimilarity between the different groups as well as the total similarity within the groups. We show that an efficient computational technique based on a generalized eigenvalue problem can be used to optimize this criterion. We applied this approach to segmenting static images, as well as motion sequences, and found the results to be very encouraging.

13,025 citations
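
The normalized cut criterion described in the abstract above can be illustrated on a tiny graph. This is a hedged sketch using the standard definition Ncut(A, B) = cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V); the function name `normalized_cut` and the example weight matrix `W` are hypothetical, and the sketch only evaluates the criterion for a given split rather than optimizing it through the generalized eigenvalue problem.

```python
def normalized_cut(weights, group_a):
    """Normalized cut value for splitting the nodes into group_a and the rest.

    weights: symmetric matrix (list of lists) of non-negative similarities.
    group_a: set of node indices forming one side of the partition.
    Both sides must have non-zero total connection weight.
    """
    n = len(weights)
    group_b = set(range(n)) - set(group_a)
    # Total weight of edges crossing the partition.
    cut = sum(weights[i][j] for i in group_a for j in group_b)
    # Total connection weight from each side to all nodes in the graph.
    assoc_a = sum(weights[i][j] for i in group_a for j in range(n))
    assoc_b = sum(weights[i][j] for i in group_b for j in range(n))
    return cut / assoc_a + cut / assoc_b


# Hypothetical graph: nodes 0-1 and nodes 2-3 are tightly linked pairs.
W = [
    [0.0, 1.0, 0.1, 0.0],
    [1.0, 0.0, 0.0, 0.1],
    [0.1, 0.0, 0.0, 1.0],
    [0.0, 0.1, 1.0, 0.0],
]
good_split = normalized_cut(W, {0, 1})  # separates the two pairs
bad_split = normalized_cut(W, {0, 2})   # breaks both pairs apart
```

Here `good_split` is much smaller than `bad_split`, reflecting that a low normalized cut means little weight is severed relative to the weight each group retains.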


Journal ArticleDOI
Mark Everingham¹, Luc Van Gool², Christopher Williams³, John Winn⁴, +1 more (5 institutions)
TL;DR: This paper reviews the state of the art in evaluated methods for both classification and detection, and analyses whether the methods are statistically different, what they are learning from the images, and what the methods find easy or confuse.
Abstract: The Pascal Visual Object Classes (VOC) challenge is a benchmark in visual object category recognition and detection, providing the vision and machine learning communities with a standard dataset of images and annotation, and standard evaluation procedures. Organised annually from 2005 to present, the challenge and its associated dataset has become accepted as the benchmark for object detection. This paper describes the dataset and evaluation procedure. We review the state-of-the-art in evaluated methods for both classification and detection, analyse whether the methods are statistically different, what they are learning from the images (e.g. the object or its context), and what the methods find easy or confuse. The paper concludes with lessons learnt in the three year history of the challenge, and proposes directions for future improvement and extension.

11,545 citations


Journal ArticleDOI
Dorin Comaniciu¹, Peter Meer¹ (1 institution)
TL;DR: The convergence of a recursive mean shift procedure to the nearest stationary point of the underlying density function is proved, establishing its utility in detecting the modes of the density.
Abstract: A general non-parametric technique is proposed for the analysis of a complex multimodal feature space and to delineate arbitrarily shaped clusters in it. The basic computational module of the technique is an old pattern recognition procedure: the mean shift. For discrete data, we prove the convergence of a recursive mean shift procedure to the nearest stationary point of the underlying density function and, thus, its utility in detecting the modes of the density. The relation of the mean shift procedure to the Nadaraya-Watson estimator from kernel regression and the robust M-estimators of location is also established. Algorithms for two low-level vision tasks, discontinuity-preserving smoothing and image segmentation, are described as applications. In these algorithms, the only user-set parameter is the resolution of the analysis, and either gray-level or color images are accepted as input. Extensive experimental results illustrate their excellent performance.

11,014 citations


"SLIC Superpixels Compared to State-..." refers background in this paper

  • ...Superpixels are created by minimizing a cost function defined over the graph....

    [...]
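
The recursive mean shift procedure mentioned in the mean shift abstract above can be sketched in one dimension. This is an illustrative sketch under stated assumptions, not the paper's implementation: the Gaussian kernel, the function name `mean_shift_mode`, and the parameters `bandwidth`, `tol`, and `max_iter` are all choices made here for the example.

```python
import math


def mean_shift_mode(points, start, bandwidth=1.0, tol=1e-6, max_iter=500):
    """Follow the mean shift recursion from `start` toward a nearby density mode (1-D)."""
    x = start
    for _ in range(max_iter):
        # Gaussian kernel weight of every sample relative to the current estimate.
        w = [math.exp(-0.5 * ((p - x) / bandwidth) ** 2) for p in points]
        # The next estimate is the kernel-weighted mean of the samples.
        new_x = sum(wi * p for wi, p in zip(w, points)) / sum(w)
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    return x


# Hypothetical samples forming two clusters, near 1.0 and near 5.0.
samples = [1.0, 1.1, 0.9, 5.0, 5.2, 4.8]
low_mode = mean_shift_mode(samples, start=0.5)
high_mode = mean_shift_mode(samples, start=5.5)
```

Starting points on either side end up at different modes, which is how the procedure delineates clusters: each sample is assigned to the mode its own trajectory converges to. The `bandwidth` plays the role of the resolution parameter described in the abstract.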


Proceedings ArticleDOI
Jianbo Shi¹, Jitendra Malik¹ (1 institution)
17 Jun 1997
TL;DR: This work treats image segmentation as a graph partitioning problem and proposes a novel global criterion, the normalized cut, for segmenting the graph, which measures both the total dissimilarity between the different groups as well as the total similarity within the groups.
Abstract: We propose a novel approach for solving the perceptual grouping problem in vision. Rather than focusing on local features and their consistencies in the image data, our approach aims at extracting the global impression of an image. We treat image segmentation as a graph partitioning problem and propose a novel global criterion, the normalized cut, for segmenting the graph. The normalized cut criterion measures both the total dissimilarity between the different groups as well as the total similarity within the groups. We show that an efficient computational technique based on a generalized eigenvalue problem can be used to optimize this criterion. We have applied this approach to segmenting static images and found results very encouraging.

10,996 citations


"SLIC Superpixels Compared to State-..." refers background in this paper

  • ...Table 1 provides a qualitative and quantitative summary of the reviewed methods, including their relative performance....

    [...]

  • ...Index Terms: Superpixels, segmentation, clustering, k-means...

    [...]


Journal ArticleDOI
S. P. Lloyd¹ (1 institution)
Abstract: It has long been realized that in pulse-code modulation (PCM), with a given ensemble of signals to handle, the quantum values should be spaced more closely in the voltage regions where the signal amplitude is more likely to fall. It has been shown by Panter and Dite that, in the limit as the number of quanta becomes infinite, the asymptotic fractional density of quanta per unit voltage should vary as the one-third power of the probability density per unit voltage of signal amplitudes. In this paper the corresponding result for any finite number of quanta is derived; that is, necessary conditions are found that the quanta and associated quantization intervals of an optimum finite quantization scheme must satisfy. The optimization criterion used is that the average quantization noise power be a minimum. It is shown that the result obtained here goes over into the Panter and Dite result as the number of quanta becomes large. The optimum quantization schemes for $2^{b}$ quanta, $b = 1, 2, \cdots, 7$, are given numerically for Gaussian and for Laplacian distribution of signal amplitudes.

9,657 citations
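
The necessary conditions referred to in the abstract above are commonly stated as two rules: each interval boundary lies midway between adjacent quanta, and each quantum is the centroid of its interval. The sketch below alternates these two rules on a finite set of samples. It is an assumption-laden illustration: the paper works with a probability density rather than samples, and the function name `lloyd_quantizer` and its parameters are hypothetical.

```python
def lloyd_quantizer(samples, levels, iterations=50):
    """Alternate the midpoint and centroid conditions on a finite sample set (1-D)."""
    levels = sorted(levels)
    for _ in range(iterations):
        # Condition 1: interval boundaries lie midway between adjacent levels.
        bounds = [(a + b) / 2 for a, b in zip(levels, levels[1:])]
        # Assign each sample to the interval it falls in.
        cells = [[] for _ in levels]
        for s in samples:
            k = sum(s > b for b in bounds)  # number of boundaries below s
            cells[k].append(s)
        # Condition 2: each level moves to the mean of its interval;
        # an empty interval keeps its previous level.
        levels = [sum(c) / len(c) if c else q for c, q in zip(cells, levels)]
    return levels


# Hypothetical signal amplitudes clustered near 0.2 and near 9.8.
amplitudes = [0.0, 0.2, 0.4, 9.6, 9.8, 10.0]
quanta = lloyd_quantizer(amplitudes, levels=[1.0, 2.0])
```

In this example the two quanta settle on the centres of the two amplitude clusters. The same alternation, applied to multidimensional data, is the k-means iteration that SLIC adapts.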


Performance Metrics
No. of citations received by the Paper in previous years

Year    Citations
2022    18
2021    693
2020    796
2019    960
2018    997
2017    955