Intelligent Scissors for Image Composition

Abstract
We present a new, interactive tool called Intelligent Scissors which we use for image segmentation and composition. Fully automated segmentation is an unsolved problem, while manual tracing is inaccurate and unacceptably laborious. However, Intelligent Scissors allow objects within digital images to be extracted quickly and accurately using simple gesture motions with a mouse. When the gestured mouse position comes in proximity to an object edge, a live-wire boundary “snaps” to, and wraps around, the object of interest.

Live-wire boundary detection formulates discrete dynamic programming (DP) as a two-dimensional graph searching problem. DP provides mathematically optimal boundaries while greatly reducing sensitivity to local noise or other intervening structures. Robustness is further enhanced with on-the-fly training, which causes the boundary to adhere to the specific type of edge currently being followed, rather than simply the strongest edge in the neighborhood. Boundary cooling automatically freezes unchanging segments and automates input of additional seed points. Cooling also allows the user to be much more free with the gesture path, thereby increasing the efficiency and finesse with which boundaries can be extracted.

Extracted objects can be scaled, rotated, and composited using live-wire masks and spatial frequency equivalencing. Frequency equivalencing is performed by applying a Butterworth filter which matches the lowest frequency spectra to all other image components. Intelligent Scissors allow creation of convincing compositions from existing images while dramatically increasing the speed and precision with which objects can be extracted.
1. Introduction
Digital image composition has recently received much attention for special effects in movies and in a variety of desktop applications. In movies, image composition, combined with other digital manipulation techniques, has also been used to realistically blend old film into a new script. The goal of image composition is to combine objects or regions from various still photographs or movie frames to create a seamless, believable image or image sequence which appears convincing and real. Fig. 9(d) shows a believable composition created by combining objects extracted from three images, Fig. 9(a-c). These objects were digitally extracted and combined in a few minutes using a new, interactive tool called Intelligent Scissors.

When using existing images, objects of interest must be extracted and segmented from a surrounding background of unpredictable complexity. Manual segmentation is tedious and time consuming, lacking in precision, and impractical when applied to long image sequences. Further, due to the wide variety of image types and content, most current computer-based segmentation techniques are slow, inaccurate, and require significant user input to initialize or control the segmentation process.

This paper describes a new, interactive, digital image segmentation tool called “Intelligent Scissors” which allows rapid object extraction from arbitrarily complex backgrounds. Intelligent Scissors boundary detection formulates discrete dynamic programming (DP) as a two-dimensional graph searching problem. Presented as part of this tool are boundary cooling and on-the-fly training, which reduce user input and dynamically adapt the tool to specific types of edges. Finally, we present live-wire masking and spatial frequency equivalencing for convincing image compositions.
2. Background
Digital image segmentation techniques are used to extract image components from their surrounding natural background. However, currently available computer-based segmentation tools are typically primitive and often offer little more advantage than manual tracing.

Region-based magic wands, provided in many desktop applications, use an interactively selected seed point to “grow” a region by adding adjacent neighboring pixels. Since this type of region growing does not provide interactive visual feedback, resulting region boundaries must usually be edited or modified.

Other popular boundary definition methods use active contours or snakes [1, 5, 8, 15] to improve a manually entered rough approximation. After being initialized with a rough boundary approximation, snakes iteratively adjust the boundary points in parallel in an attempt to minimize an energy functional and achieve an optimal boundary. The energy functional is a combination of internal forces, such as boundary curvature, and external forces, like image gradient magnitude. Snakes can track frame-to-frame boundary motion provided the boundary hasn’t moved drastically. However, active contours follow a pattern of initialization followed by energy minimization; as a result, the user does not know what the final boundary will look like when the rough approximation is input. If the resulting boundary is not satisfactory, the process must be repeated or the boundary must be manually edited. We provide a detailed comparison of snakes and Intelligent Scissors in Section 3.6.

Another class of image segmentation techniques uses a graph searching formulation of DP (or similar concepts) to find globally optimal boundaries [2, 4, 10, 11, 14]. These techniques differ from snakes in that boundary points are generated in a stage-wise optimal cost fashion, whereas snakes iteratively minimize an energy functional for all points on a contour in parallel (giving the appearance of wiggling). However, like snakes, these graph searching techniques typically require a boundary template (in the form of a manually entered rough approximation, a figure of merit, etc.) which is used to impose directional sampling and/or searching constraints. This limits these techniques to a boundary search with one degree of freedom within a window about the two-dimensional boundary template. Thus, boundary extraction using previous graph searching techniques is non-interactive (beyond template specification), losing the benefits of further human guidance and expertise.
Eric N. Mortensen¹, William A. Barrett²
Brigham Young University
¹ enm@cs.byu.edu, Dept. of Comp. Sci., BYU, Provo, UT 84602, (801) 378-7605
² barrett@cs.byu.edu, Dept. of Comp. Sci., BYU, Provo, UT 84602, (801) 378-7430

Permission to make digital/hard copy of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication and its date appear, and notice is given that copying is by permission of ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee.
©1995 ACM-0-89791-701-4/95/008…$3.50

The most important difference between previous boundary finding techniques and the Intelligent Scissors presented here lies not in the boundary defining criteria per se, but in the method of interaction. Namely, previous methods exhibit a pattern of boundary approximation followed by boundary refinement, whereas Intelligent Scissors allow the user to interactively select the most suitable boundary from a set of all optimal boundaries emanating from a seed point. In addition, previous approaches do not incorporate on-the-fly training or cooling, and are not as computationally efficient. Finally, it appears that the problem of automated matching of spatial frequencies for digital image composition has not been addressed previously.
3. Intelligent Scissors
Boundary definition via dynamic programming can be formulated as a graph searching problem [10] where the goal is to find the optimal path between a start node and a set of goal nodes. As applied to image boundary finding, the graph search consists of finding the globally optimal path from a start pixel to a goal pixel; in particular, pixels represent nodes and edges are created between each pixel and its 8 neighbors. For this paper, optimality is defined as the minimum cumulative cost path from a start pixel to a goal pixel, where the cumulative cost of a path is the sum of the local edge (or link) costs on the path.
3.1. Local Costs
Since a minimum cost path should correspond to an image component boundary, pixels (or, more accurately, links between neighboring pixels) that exhibit strong edge features should have low local costs, and vice versa. Thus, local component costs are created from the following edge features:

    Image Feature                Formulation
    Laplacian zero-crossing      f_Z
    Gradient magnitude           f_G
    Gradient direction           f_D

The local costs are computed as a weighted sum of these component functionals. Letting l(p,q) represent the local cost on the directed link from pixel p to a neighboring pixel q, the local cost function is

    l(p,q) = ω_Z · f_Z(q) + ω_D · f_D(p,q) + ω_G · f_G(q)    (1)

where each ω is the weight of the corresponding feature function. (Empirically, weights of ω_Z = 0.43, ω_D = 0.43, and ω_G = 0.14 seem to work well in a wide range of images.)

The laplacian zero-crossing is a binary edge feature used for edge localization [7, 9]. Convolution of an image with a laplacian kernel approximates the 2nd partial derivative of the image. The laplacian image zero-crossing corresponds to points of maximal (or minimal) gradient magnitude. Thus, laplacian zero-crossings represent “good” edge properties and should therefore have a low local cost. If I_L(q) is the laplacian of an image I at pixel q, then

    f_Z(q) = 0,  if I_L(q) = 0
    f_Z(q) = 1,  if I_L(q) ≠ 0    (2)

However, application of a discrete laplacian kernel to a digital image produces very few zero-valued pixels. Rather, a zero-crossing is represented by two neighboring pixels that change from positive to negative. Of the two pixels, the one closest to zero is used to represent the zero-crossing. The resulting feature cost contains single-pixel wide cost “canyons” used for boundary localization.

Since the laplacian zero-crossing creates a binary feature, f_Z(q) does not distinguish between strong, high gradient edges and weak, low gradient edges. However, gradient magnitude provides a direct correlation between edge strength and local cost. If I_x and I_y represent the partials of an image I in x and y respectively, then the gradient magnitude G is approximated with

    G = sqrt(I_x² + I_y²)

The gradient is scaled and inverted so high gradients produce low costs and vice versa. Thus, the gradient component function is

    f_G = (max(G) − G) / max(G) = 1 − G / max(G)    (3)

giving an inverse linear ramp function. Finally, gradient magnitude costs are scaled by Euclidean distance. To keep the resulting maximum gradient at unity, f_G(q) is scaled by 1 if q is a diagonal neighbor to p and by 1/√2 if q is a horizontal or vertical neighbor.

The gradient direction adds a smoothness constraint to the boundary by associating a high cost with sharp changes in boundary direction. The gradient direction is the unit vector defined by I_x and I_y. Letting D(p) be the unit vector perpendicular (rotated 90 degrees clockwise) to the gradient direction at point p (i.e., D(p) = (I_y(p), −I_x(p))), the formulation of the gradient direction feature cost is

    f_D(p,q) = (1/π) { cos⁻¹[d_p(p,q)] + cos⁻¹[d_q(p,q)] }    (4)

where

    d_p(p,q) = D(p) · L(p,q)
    d_q(p,q) = L(p,q) · D(q)

are vector dot products and

    L(p,q) = q − p,  if D(p) · (q − p) ≥ 0
    L(p,q) = p − q,  if D(p) · (q − p) < 0    (5)

is the bidirectional link or edge vector between pixels p and q. Links are either horizontal, vertical, or diagonal (relative to the position of q in p’s neighborhood) and point such that the dot product of D(p) and L(p,q) is positive, as noted in (5). The neighborhood link direction associates a high cost with an edge or link between two pixels that have similar gradient directions but are perpendicular, or near perpendicular, to the link between them. Therefore, the direction feature cost is low when the gradient directions of the two pixels are similar to each other and to the link between them.
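As a concrete illustration of equations (1)–(5), the sketch below computes the feature maps and the composite link cost in NumPy. This is not the authors' code: the central-difference gradient and 3×3 laplacian stand in for the paper's multi-scale kernels, and the border handling (wrap-around) is a simplification.

```python
import numpy as np

def local_cost_maps(img):
    """Per-pixel feature maps for the local costs of equations (1)-(3).

    `img` is a 2-D float array.  The simple kernels below stand in for
    the paper's multi-scale gradient kernels and laplacian."""
    Iy, Ix = np.gradient(img)              # partials in y (rows) and x (cols)
    G = np.sqrt(Ix ** 2 + Iy ** 2)         # gradient magnitude
    # f_G: scaled, inverted gradient magnitude (equation 3).
    fG = 1.0 - G / G.max() if G.max() > 0 else np.zeros_like(G)
    # f_Z: binary laplacian zero-crossing feature (equation 2), marking
    # the pixel of each sign-changing pair whose laplacian is closer to 0.
    lap = (np.roll(img, 1, 0) + np.roll(img, -1, 0) +
           np.roll(img, 1, 1) + np.roll(img, -1, 1) - 4.0 * img)
    fZ = np.ones_like(img)
    for axis in (0, 1):
        nb = np.roll(lap, -1, axis)        # note: wraps at the border
        crossing = lap * nb < 0
        fZ[crossing & (np.abs(lap) <= np.abs(nb))] = 0.0
    return fZ, fG, Ix, Iy

def link_cost(p, q, fZ, fG, Ix, Iy, wZ=0.43, wD=0.43, wG=0.14):
    """l(p,q) of equation (1) for 8-connected neighbors p, q = (row, col)."""
    def D(r, c):                           # unit vector perpendicular to gradient
        v = np.array([Iy[r, c], -Ix[r, c]])
        n = np.linalg.norm(v)
        return v / n if n > 0 else v
    L = np.array([q[0] - p[0], q[1] - p[1]], float)
    if np.dot(D(*p), L) < 0:               # orient the link per equation (5)
        L = -L
    L /= np.linalg.norm(L)
    dp = np.clip(np.dot(D(*p), L), -1.0, 1.0)
    dq = np.clip(np.dot(L, D(*q)), -1.0, 1.0)
    fD = (np.arccos(dp) + np.arccos(dq)) / np.pi      # equation (4)
    diagonal = p[0] != q[0] and p[1] != q[1]
    scale = 1.0 if diagonal else 1.0 / np.sqrt(2)     # Euclidean scaling of f_G
    return wZ * fZ[q] + wD * fD + wG * scale * fG[q]
```

On a vertical intensity ramp, for example, a horizontal link perpendicular to the iso-brightness lines receives the full direction penalty, while f_G is zero along the uniform gradient.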
3.2. Two-Dimensional Dynamic Programming
As mentioned, dynamic programming can be formulated as a directed graph search for an optimal path. This paper utilizes an optimal graph search similar to that presented by Dijkstra [6] and extended by Nilsson [13]; further, this technique builds on and extends previous boundary tracking methods in 4 important ways:
1. It imposes no directional sampling or searching constraints.
2. It utilizes a new set of edge features and costs: laplacian zero-crossing, gradient magnitude, and gradient direction, maximized over multiple gradient kernels.
3. The active list is sorted with an O(N) sort for N nodes/pixels.
4. No a priori goal nodes/pixels are specified.
First, formulation of boundary finding as a 2-D graph search eliminates the directed sampling and searching restrictions of previous implementations, thereby allowing boundaries of arbitrary complexity to be extracted. Second, the edge features used here are more robust and comprehensive than previous implementations: we maximize over different gradient kernel sizes to encompass the various edge types and scales while simultaneously attempting to balance edge detail with noise suppression [7], and we use the laplacian zero-crossing for boundary localization and fine detail live-wire “snapping”. Third, the discrete, bounded nature of the local edge costs permits the use of a specialized sorting algorithm that inserts points into a sorted list (called the active list) in constant time. Fourth, the live-wire tool is free to define a goal pixel interactively, at any “free” point in the image, after minimum cost paths are computed to all pixels. The latter happens fast enough that the free point almost always falls within an expanding cost wavefront and interactivity is not impeded.
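The constant-time sorted active list of point 3 can be realized as a bucket (circular) queue when link costs are quantized to small integers: because cumulative costs leave the queue in non-decreasing order and each link cost is bounded, active priorities always span a window of at most max_link_cost + 1 consecutive values, so each bucket holds entries of a single priority. This is a sketch of one such structure, not necessarily the authors' exact layout:

```python
from collections import deque

class BucketQueue:
    """Monotone priority queue with O(1) push and pop for bounded
    integer priorities (a sketch of the constant-time active list)."""
    def __init__(self, max_link_cost):
        # One bucket per residue of the priority modulo the window size.
        self.nbuckets = max_link_cost + 1
        self.buckets = [deque() for _ in range(self.nbuckets)]
        self.cur = 0          # current minimum priority
        self.size = 0

    def push(self, priority, item):
        self.buckets[priority % self.nbuckets].append((priority, item))
        self.size += 1

    def pop_min(self):
        # Advance circularly to the next non-empty bucket; amortized O(1)
        # because `cur` only moves forward with the minimum priority.
        while not self.buckets[self.cur % self.nbuckets]:
            self.cur += 1
        self.size -= 1
        return self.buckets[self.cur % self.nbuckets].popleft()
```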
The Live-Wire 2-D dynamic programming (DP) graph search
algorithm is as follows:
Algorithm: Live-Wire 2-D DP graph search.
Input:
  s        {Start (or seed) pixel.}
  l(q,r)   {Local cost function for link between pixels q and r.}
Data Structures:
  L        {List of active pixels sorted by total cost (initially empty).}
  N(q)     {Neighborhood set of q (contains 8 neighbors of pixel).}
  e(q)     {Boolean function indicating if q has been expanded/processed.}
  g(q)     {Total cost function from seed point to q.}
Output:
  p        {Pointers from each pixel indicating the minimum cost path.}
Algorithm:
  g(s) ← 0; L ← s;                    {Initialize active list with zero cost seed pixel.}
  while L ≠ ∅ do begin                {While still points to expand:}
    q ← min(L);                       {Remove minimum cost pixel q from active list.}
    e(q) ← TRUE;                      {Mark q as expanded (i.e., processed).}
    for each r ∈ N(q) such that not e(r) do begin
      g_tmp ← g(q) + l(q,r);          {Compute total cost to neighbor.}
      if r ∈ L and g_tmp < g(r) then
        remove r from L;              {Remove higher cost neighbor from list.}
      if r ∉ L then begin             {If neighbor not on list,}
        g(r) ← g_tmp;                 {  assign neighbor’s total cost,}
        p(r) ← q;                     {  set (or reset) back pointer,}
        L ← r;                        {  and place on (or return to) active list.}
      end
    end
  end
Notice that since the active list is sorted, when a new, lower cumu-
lative cost is computed for a pixel already on the list then that point
must be removed from the list in order to be added back to the list
with the new lower cost. Similar to adding a point to the sorted list,
this operation is also performed in constant time.
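The algorithm above can be rendered compactly in Python. This sketch substitutes a binary heap with lazy deletion for the paper's constant-time bucket-sorted active list (a simplification that changes the complexity constants, not the result), and takes the local cost and neighborhood functions as parameters:

```python
import heapq

def live_wire_paths(seed, link_cost, neighbors):
    """Expand minimum-cost paths from `seed` to every reachable pixel.

    link_cost(q, r): local cost l(q,r) for adjacent pixels q, r.
    neighbors(q): iterable of q's 8-connected neighbors.
    Returns (g, back): total path costs and optimal-path back pointers."""
    g = {seed: 0.0}
    back = {seed: None}
    expanded = set()
    active = [(0.0, seed)]
    while active:
        cost, q = heapq.heappop(active)
        if q in expanded or cost > g[q]:
            continue                      # stale entry (lazy deletion)
        expanded.add(q)                   # e(q) <- TRUE
        for r in neighbors(q):
            if r in expanded:
                continue
            g_tmp = g[q] + link_cost(q, r)
            if r not in g or g_tmp < g[r]:
                g[r] = g_tmp              # assign (or lower) total cost
                back[r] = q               # set (or reset) back pointer
                heapq.heappush(active, (g_tmp, r))
    return g, back

def trace(free_point, back):
    """Follow optimal-path pointers from a free point back to the seed."""
    path = [free_point]
    while back[path[-1]] is not None:
        path.append(back[path[-1]])
    return path[::-1]
```

Once `live_wire_paths` has run for a seed, `trace` is all the interactive tool needs per mouse movement: selecting a boundary segment is just a pointer walk, which is what makes the live wire feel instantaneous.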
Figure 1 demonstrates the use of the 2-D DP graph search algorithm to create a minimum cumulative cost path map (with corresponding optimal path pointers). Figure 1(a) is the initial local cost map with the seed point circled. For simplicity of demonstration, the local costs in this example are pixel based rather than link based and can be thought of as representing the gradient magnitude cost feature. Figure 1(b) shows a portion of the cumulative cost and pointer map after expanding the seed point (with a cumulative cost of zero). Notice how the diagonal local costs have been scaled by Euclidean distance (consistent with the gradient magnitude cost feature described previously). Though complicating the example, weighting by Euclidean distance is necessary to demonstrate that the cumulative costs to points currently on the active list can change if even lower cumulative costs are computed from as yet unexpanded neighbors. This is demonstrated in Figure 1(c), where two points have now been expanded: the seed point and the next lowest cumulative cost point on the active list. Notice how the points diagonal to the seed point have changed cumulative cost and direction pointers. The Euclidean weighting between the seed and diagonal points makes them more costly than non-diagonal paths. Figures 1(d), 1(e), and 1(f) show the cumulative cost/direction pointer map at various stages of completion. Note how the algorithm produces a “wavefront” of active points emanating from the initial start point, called the seed point, and that the wavefront grows out faster where there are lower costs.
3.3. Interactive “Live-Wire” Segmentation Tool
Once the optimal path pointers are generated, a desired boundary segment can be chosen dynamically via a “free” point. Interactive movement of the free point by the mouse cursor causes the boundary to behave like a live-wire as it adapts to the new minimum cost path by following the optimal path pointers from the free point back
Figure 1: (a) Initial local cost matrix. (b) Seed point (shaded) expanded. (c) 2 points (shaded) expanded. (d) 5 points (shaded) expanded. (e) 47 points expanded. (f) Finished total cost and path matrix with two of many paths (free points shaded) indicated.

to the seed point. By constraining the seed point and free points to lie near a given edge, the user is able to interactively “snap” and “wrap” the live-wire boundary around the object of interest. Figure 2 demonstrates how a live-wire boundary segment adapts to changes in the free point (cursor position) by latching onto more and more of an object boundary. Specifically, note the live-wire segments corresponding to user-specified free point positions at times t0, t1, and t2. Although Fig. 2 only shows live-wire segments for three discrete time instances, live-wire segments are actually updated dynamically and interactively (on-the-fly) with each movement of the free point.

When movement of the free point causes the boundary to digress from the desired object edge, interactive input of a new seed point prior to the point of departure reinitiates the 2-D DP boundary detection. This causes potential paths to be recomputed from the new seed point while effectively “tying off” the boundary computed up to the new seed point.

Note again that optimal paths are computed from the seed point to all points in the image (since the 2-D DP graph search produces a minimum cost spanning tree of the image [6]). Thus, by selecting a free point with the mouse cursor, the interactive live-wire tool is simply selecting an optimal boundary segment from a large collection of optimal paths.

Since each pixel (or free point) defines only one optimal path to a seed point, a minimum of two seed points must be placed to ensure a closed object boundary. The path map from the first seed point of every object is maintained during the course of an object’s boundary definition to provide a closing boundary path from the free point. The closing boundary segment from the free point to the first seed point expedites boundary closure.

Placing seed points directly on an object’s edge is often difficult and tedious. If a seed point is not localized to an object edge, then spikes result on the segmented boundary at those seed points (since
Figure 2: Image demonstrating how the live-wire segment adapts and snaps to an object boundary as the free point moves (via cursor movement). The path of the free point is shown in white. Live-wire segments from previous free point positions (t0, t1, and t2) are shown in green.

Figure 3: Comparison of live-wire without (a) and with (b) cooling. Without cooling (a), all seed points must be placed manually on the object edge. With cooling (b), seed points are generated automatically as the live-wire segment freezes.
the boundary is forced to pass through the seed points). To facilitate seed point placement, a cursor snap is available which forces the mouse pointer to the maximum gradient magnitude pixel within a user-specified neighborhood. The neighborhood can be anywhere from 1×1 (resulting in no cursor snap) to 15×15 (where the cursor can snap as much as 7 pixels in both x and y). Thus, as the mouse cursor is moved by the user, it snaps or jumps to a neighborhood pixel representing a “good” static edge point.
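The cursor snap described above reduces to an argmax over a clamped window of the gradient-magnitude image; a minimal sketch (function name and tie-breaking behavior are illustrative, not from the paper):

```python
import numpy as np

def snap_cursor(G, pos, k=15):
    """Snap `pos` = (row, col) to the maximum-gradient pixel in a k x k
    window of the gradient-magnitude image G.  k=1 disables the snap;
    k=15 allows jumps of up to 7 pixels in both x and y."""
    h = k // 2
    # Clamp the window to the image bounds.
    r0, r1 = max(pos[0] - h, 0), min(pos[0] + h + 1, G.shape[0])
    c0, c1 = max(pos[1] - h, 0), min(pos[1] + h + 1, G.shape[1])
    window = G[r0:r1, c0:c1]
    dr, dc = np.unravel_index(np.argmax(window), window.shape)
    return (r0 + dr, c0 + dc)
```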
3.4. Path Cooling
Generating closed boundaries around objects of interest can require as few as two seed points (for reasons given previously). Simple objects typically require two to five seed points, but complex objects may require many more. Even with cursor snap, manual placement of seed points can be tedious and often requires a large portion of the overall boundary definition time.
Figure 4: Comparison of live-wire (a) without and (b) with dynamic training. (a) Without training, the live-wire segment snaps to nearby strong edges. (b) With training, it favors edges with similar characteristics as those just learned. (c) The static gradient magnitude cost map shows that without training, high gradients are favored since they map to low costs. However, with training, the dynamic cost map (d) favors gradients similar to those sampled from the previous boundary segment.

Automatic seed point generation relieves the user from precise manual placement of seed points by automatically selecting a pixel on the current active boundary segment to be a new seed point. Selection is based on “path cooling”, which in turn relies on path coalescence. Though a single minimum cost path exists from each pixel to a given seed point, many paths “coalesce” and share portions of their optimal path with paths from other pixels. Due to Bellman’s Principle of Optimality [3], if any two optimal paths from two distinct pixels share a common point or pixel, then the two paths are identical from that pixel back to the seed point. This is particularly noticeable if the seed point is placed near an object edge and the free point is moved away from the seed point but remains in the vicinity of the object edge. Though a new optimal path is selected and displayed every time the mouse cursor moves, the paths are typically identical near the seed point and object edges and only change local to the free point. As the free point moves farther and farther away from the seed point, the portion of the active live-wire boundary segment that does not change becomes longer. New seed points are generated at the end of a stable segment (i.e., one that has not changed recently). Stability is measured by time (in milliseconds) on the active boundary and path coalescence (number of times the path has been redrawn from distinct free points).

This measure of stability provides the live-wire segment with a sense of “cooling”. The longer a pixel is on a stable section of the live-wire boundary, the cooler it becomes until it eventually freezes and automatically produces a new seed point.

Figure 3 illustrates the benefit of path cooling. In Fig. 3(a), the user must place each seed point manually on the object boundary. However, with cooling (Fig. 3(b)), only the first seed point (and last free point) need to be specified manually; the other seed points were generated automatically via cooling.
3.5. Interactive Dynamic Training
On occasion, a section of the desired object boundary may have a weak gradient magnitude relative to a nearby strong gradient edge. Since the nearby strong edge has a relatively lower cost, the live-wire segment snaps to the strong edge rather than the desired weaker edge. This can be seen in Fig. 4(a). The desired boundary is the woman’s (Harriet’s) cheek. However, since part of it is so close to the high contrast shoulder of the man (Ozzie), the live-wire snaps to the shoulder.

Training allows dynamic adaptation of the cost function based on a sample boundary segment. Training exploits an object’s boundary segment that is already considered to be good and is performed dynamically as part of the boundary segmentation process. As a result, trained features are updated interactively as an object boundary is being defined. On-the-fly training eliminates the need for a separate training phase and allows the trained feature cost functions to adapt within the object being segmented as well as between objects in the image. Fig. 4(b) demonstrates how a trained live-wire segment latches onto the edge that is similar to the previous training segment rather than the nearby stronger edge.

To facilitate training and trained cost computation, a gradient magnitude feature map or image is precomputed by scaling the minimized gradient magnitude image, G′, into an integer range of size n_G (i.e., from 0 to n_G − 1). The actual feature cost is determined by mapping these feature values through a look-up table which contains the scaled (weighted) cost for each value. Fig. 4(c) illustrates edge cost based on gradient magnitude without training. Note that with training (Fig. 4(d)) edge cost plummets for gradients that are specific to the object of interest’s edges.
Selection of a “good” boundary segment for training is made interactively using the live-wire tool. To allow training to adapt to slow (or smooth) changes in edge characteristics, the trained gradient magnitude cost function is based only on the most recent or closest portion of the currently defined object boundary. A training length, t, specifies how many of the most recent boundary pixels are used to generate the training statistics. A monotonically decreasing weight function (either linear or Gaussian based) determines the contribution from each of the closest t pixels. This permits adaptive training with local dependence, preventing the trained features from being overly influenced by old edge characteristics. The closest pixel (i.e., the current active boundary segment endpoint) gets a weight of 1 and the point that is t pixels away, along the boundary from the current active endpoint, gets a minimal weight (which can be determined by the user). The training algorithm samples the precomputed feature maps along the closest t pixels of the edge segment and increments the feature histogram element by the corresponding pixel weight to generate a histogram for each feature involved in training.

After sampling and smoothing, each feature histogram is then scaled and inverted (by subtracting the scaled histogram values from its maximum value) to create the feature cost map needed to convert feature values to trained cost functions.
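The two paragraphs above amount to a weighted histogram followed by smoothing, scaling, and inversion. A minimal sketch, assuming quantized gradient features in [0, n_bins) and a linear weight ramp (the smoothing kernel and default parameters are illustrative choices, not the paper's):

```python
import numpy as np

def trained_gradient_cost(feature_values, t=64, n_bins=256, w_min=0.1):
    """Build a trained cost lookup table from the most recent t boundary
    pixels' quantized gradient features.

    A linearly decreasing weight (1 at the current endpoint, w_min at the
    t-th pixel back) fills a histogram; the smoothed histogram is then
    scaled and inverted so frequently sampled gradients map to low cost."""
    recent = feature_values[-t:]                     # closest t pixels
    weights = np.linspace(1.0, w_min, len(recent))[::-1]
    hist = np.zeros(n_bins)
    for value, weight in zip(recent, weights):
        hist[value] += weight
    # Light smoothing so near-miss gradient values also benefit.
    kernel = np.array([1.0, 2.0, 1.0]) / 4.0
    hist = np.convolve(hist, kernel, mode="same")
    # Scale and invert: cost = (max - hist) / max, normalized to [0, 1].
    return (hist.max() - hist) / hist.max()
```

The returned array plays the role of the look-up table described earlier: indexing it by a pixel's quantized gradient value yields that pixel's trained f_G cost.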
Since training is based on learned edge characteristics from the most recent portion of an object’s boundary, training is most effective for those objects with edge properties that are relatively consistent along the object boundary (or, if changing, at least change smoothly enough for the training algorithm to adapt). In fact, training can be counter-productive for objects with sudden and/or dramatic changes in edge features. However, training can be turned on and off interactively throughout the definition of an object boundary so that it can be used (if needed) in a section of the boundary with similar edge characteristics and then turned off before a drastic change occurs.
3.6 Comparison with Snakes
Due to the recent popularity of snakes and other active contours
models and since the interactive boundary wrapping of the live-
wire may seem similar to the “wiggling” of snakes, we highlight
what we feel are the similarities and their corresponding differences
between snakes and Intelligent Scissors.
Similarities (compare with corresponding differences below):
1. The gradient magnitude cost in Intelligent Scissors is similar to
the edge energy functional used in snakes.
2. Both methods employ a smoothing term to minimize the effects
of noise in the boundary.
3. Snakes and live-wire boundaries are both attracted towards
strong edge features.
4. Both techniques attempt to find globally optimal boundaries to
try to overcome the effects of noise and edge dropout.
5. Snakes and Intelligent Scissors both require interaction as part of
the boundary segmentation process.
Differences (compare with corresponding similarities above):
1. The laplacian zero-crossing binary cost feature seems to have not
been used previously in active contours models
1
(or DP bound-
ary tracking methods for that matter).
2. The active contour smoothing term is internal (i.e., based on the
contours point positions) whereas the smoothing term for live-
wire boundaries is computed from external image gradient direc-
tions
2(next page)
.
1. Kass et al. [8] did use a squared laplacian energy functional to show the relationship of scale-space continuation to the Marr-Hildreth edge detection theory. However, the squared laplacian does not represent a binary condition, nor could it, since the variational calculus minimization used in [8] required that all functionals be differentiable.
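The binary laplacian zero-crossing cost of difference 1 can be sketched as follows: a pixel costs 0 if it lies on a zero-crossing of the laplacian (its laplacian is zero, or a neighbour has opposite sign and larger magnitude) and 1 otherwise. This is a simplified single-kernel, 4-neighbour version (Python/NumPy); the paper convolves with laplacian kernels of more than one size, so details may differ:

```python
import numpy as np

def laplacian_zero_crossing_cost(image):
    """Binary laplacian zero-crossing cost f_Z (simplified sketch).

    f_Z(q) = 0 if q lies on a zero-crossing of the laplacian (its
    laplacian is zero, or a 4-neighbour has opposite sign and larger
    magnitude), and 1 otherwise.  A single 5-point stencil is used
    here; borders are excluded for simplicity.
    """
    img = image.astype(float)
    lap = np.zeros_like(img)
    lap[1:-1, 1:-1] = (img[:-2, 1:-1] + img[2:, 1:-1] +
                       img[1:-1, :-2] + img[1:-1, 2:] -
                       4.0 * img[1:-1, 1:-1])
    cost = np.ones_like(img)
    for dy, dx in ((0, 1), (0, -1), (1, 0), (-1, 0)):
        nb = np.roll(lap, (-dy, -dx), axis=(0, 1))   # neighbour's laplacian
        cross = (lap * nb < 0) & (np.abs(lap) <= np.abs(nb))
        cost[cross] = 0.0                # this pixel is nearer the crossing
    cost[lap == 0] = 0.0                 # exactly on a zero-crossing
    cost[0, :] = cost[-1, :] = cost[:, 0] = cost[:, -1] = 1.0
    return cost
```

On a vertical step edge, the two columns flanking the step receive cost 0 while the rest stay at 1, which is what makes this a binary edge condition rather than the differentiable squared laplacian discussed in footnote 1.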
References
- E. W. Dijkstra. "A Note on Two Problems in Connexion with Graphs."
- M. Kass, A. Witkin, and D. Terzopoulos. "Snakes: Active Contour Models." (cited as [8] above)
- R. Bellman. Dynamic Programming.
- D. Marr and E. Hildreth. "Theory of Edge Detection."
- D. H. Ballard and C. M. Brown. Computer Vision.