On Symmetry and Multiple-View Geometry: Structure, Pose, and Calibration from a Single Image

01 Dec 2004-International Journal of Computer Vision (Kluwer Academic Publishers)-Vol. 60, Iss: 3, pp 241-265


International Journal of Computer Vision 60(3), 241–265, 2004
© 2004 Kluwer Academic Publishers. Manufactured in The Netherlands.
On Symmetry and Multiple-View Geometry:
Structure, Pose, and Calibration from a Single Image
WEI HONG, ALLEN YANG YANG, KUN HUANG AND YI MA
Department of Electrical & Computer Engineering, University of Illinois at Urbana-Champaign,
1308 West Main St., Urbana, IL 61801, USA
weihong@uiuc.edu
yangyang@uiuc.edu
kunhuang@uiuc.edu
yima@uiuc.edu
Received October 16, 2002; Revised March 17, 2004; Accepted March 17, 2004
Abstract. In this paper, we provide a principled explanation of how knowledge in global 3-D structural invariants,
typically captured by a group action on a symmetric structure, can dramatically facilitate the task of reconstructing
a 3-D scene from one or more images. More importantly, since every symmetric structure admits a “canonical”
coordinate frame with respect to which the group action can be naturally represented, the canonical pose between
the viewer and this canonical frame can be recovered too, which explains why symmetric objects (e.g., buildings)
provide us overwhelming clues to their orientation and position. We give the necessary and sufficient conditions
in terms of the symmetry (group) admitted by a structure under which this pose can be uniquely determined. We
also characterize, when such conditions are not satisfied, to what extent this pose can be recovered. We show how
algorithms from conventional multiple-view geometry, after properly modified and extended, can be directly applied
to perform such recovery, from all “hidden images” of one image of the symmetric structure. We also apply our
results to a wide range of applications in computer vision and image processing such as camera self-calibration,
image segmentation and global orientation, large baseline feature matching, image rendering and photo editing, as
well as visual illusions (caused by symmetry if incorrectly assumed).
Keywords: structure from symmetry, multiple-view geometry, symmetry group, reflective symmetry, rotational
symmetry, and translational symmetry
1. Introduction
One of the main goals of computer vision is the study
of how to infer three-dimensional (3-D) information
(e.g., shape, layout and motion) of a scene from its
two-dimensional (2-D) image(s). A particular thrust of
effort is to extract 3-D geometric information from 2-
D images by exploiting geometric relationships among
multiple images of the same set of features on a 3-D
object. This gives rise to the subject of multiple-view
geometry, a primary focus of study in the computer
vision community for the past two decades or so.
Unfortunately, certain relationships among features
themselves have been, to a large extent, ignored or at
least under-studied. Some of those relationships, as we
will see from this paper, have significant impact on the
way that 3-D information can be (and should be)
inferred from images.

∗This work is supported by UIUC ECE/CSL startup fund and NSF
Career Award IIS-0347456.
Before we proceed further, let us pause and examine
the images given in Fig. 1 below. What do they have
in common? Notice that these images are just a few

242 Hong et al.
Figure 1. Symmetry is in: architecture, machines, textures, crystals, molecules, ornaments, and nature, etc.
representatives of a common phenomenon exhibited in
nature and man-made environments: symmetry. It is not
so hard to convince ourselves that even from only a
single image, we are able to perceive clearly the 3-D
structure and relative pose (orientation and location)
of the object being seen, even though in the image the
shape of the object is distorted by the perspective
projection. The reason is, simply put, that there is
symmetry at play.¹
The goals of this paper are to provide a principled
explanation why symmetry could encode 3-D informa-
tion within a single perspective image and to develop
algorithms based on multiple-view geometry that ef-
ficiently extract the 3-D information from single im-
ages. There are two things which we want to point out
already:
1. Symmetry is not the only cue which encodes 3-D in-
formation through relationships among a set of fea-
tures (in one image or more images). For instance,
incidence relations among points, lines, and planes
may as well provide 3-D information to the viewer;
2. The concept of symmetry that we consider here is
not just the (bilateral) reflective symmetry or the
(statistical) isotropic symmetry which has been
studied to a certain extent in the computer vision
literature. Instead it is a more general notion describing
global structural invariants of an object under the
action of any group of transformations. To clarify
this notion is one of the goals of this paper.
Symmetry, as a useful geometric cue to 3-D informa-
tion, has been extensively discussed in psychological
vision literature (Marr, 1982; Palmer, 1999). Nevertheless,
its contribution to computational vision so far has
been explored often through statistical methods, such as
the study of isotropic textures (e.g., for the 4th image of
Fig. 1) (Gibson, 1950; Witkin, 1988; Zabrodsky et al.,
1995; Mukherjee et al., 1995; Malik and Rosenholtz,
1997; Rosenholtz and Malik, 1997; Leung and Malik,
1997). It is the works of Garding (1992, 1993) and
Malik and Rosenholtz (1997) that have provided peo-
ple a wide range of efficient algorithms for recov-
ering the shape (i.e. the slant and tilt) of a textured
plane based on the assumption of isotropy (or weak
isotropy). These methods are mainly based on collect-
ing statistical characteristics (e.g., the distribution of
edge directions) from sample patches of the texture
and comparing them with those of adjacent patches
against the isotropic hypothesis. Information about the

On Symmetry and Multiple-View Geometry 243
surface shape is then often conveniently encoded in the
discrepancy or variation of these characteristics.
But symmetry is by nature a geometric property!
Although in many cases the result of symmetry indeed
causes certain statistical homogeneity (like the 4th im-
age of Fig. 1), there are reasons to believe that more
accurate and reliable 3-D geometric information can
be retrieved if we can directly exploit this property
through geometric means. For example, for the texture
shown in the 4th image of Fig. 1, shouldn’t we directly
exploit the fact that the tiling is invariant under certain
proper translations parallel to the plane? To a large ex-
tent, such a geometric approach is complementary to
extant statistical approaches: if statistical homogeneity
can be exploited for shape recovery, so can geometric
homogeneity, especially in cases where symmetry is the
underlying cause for such homogeneity. Of course, for
cases where statistical methods no longer apply (e.g.,
the 5th image of Fig. 1), geometric methods remain
the only option. One may call this approach structure
from symmetry.
We are by no means the first to notice that symme-
try, especially reflective symmetry, can be exploited
by geometric means for retrieving 3-D geometric in-
formation. Mitsumoto et al. (1992) studied how to re-
construct a 3-D object using the mirror image based
on planar symmetry, Vetter and Poggio (1994) proved
that for any reflective symmetric 3-D object one non-
accidental 2-D model view is sufficient for recognition,
Zabrodsky and Weinshall (1997) used bilateral symme-
try assumption to improve 3-D reconstruction from im-
age sequences, and Zabrodsky et al. (1995) provided
a good survey on studies of reflective symmetry and
rotational symmetry in computer vision at the time.
In 3-D object and pose recognition, Rothwell et al.
(1993) pointed out that the assumption of reflective
symmetry can be used in the construction of projective
invariants and is able to eliminate certain restrictions
on the corresponding points. Cham and Cipolla (1996)
built the correspondences of contours from reflective
symmetry. For translational symmetry, Schaffalitzky
and Zisserman (2000) used it to detect the vanish-
ing lines and points. Liu et al. (1995) analyzed the
error of obtaining 3-D invariants derived from trans-
lational symmetry. In addition to isometric symme-
try, Liebowitz and Zisserman (1998), Criminisi and
Zisserman (1999) and Criminisi and Zisserman (2000)
showed that other knowledge (e.g., length ratio, vanish-
ing line, etc.) in 3-D also allows accurate reconstruction
of structural metric and camera pose.
For the detection of symmetry from images, Marola
(1989), Kiryati and Gofman (1998) and Mukherjee
et al. (1995) presented efficient algorithms to find axes
of reflective symmetry in 2-D images, Sun and Sherrah
(1997) discussed reflective symmetry detection in 3-D
space, and Zabrodsky et al. (1995) introduced a symme-
try distance to classify reflective and rotational symme-
try in 2-D and 3-D spaces (with some related comments
given in Kanatani (1997)). Carlsson (1998) and Gool
et al. (1996) derived methods to find 3-D symmetry
from invariants in the 2-D projection. Liu and Collins
(1998) proposed a method to classify any images with
translational symmetry into the 7 Frieze groups and 17
wallpaper groups.
However, there is still a lack of formal and unified
analysis as well as efficient algorithms which would
allow people to easily make use of numerous and dif-
ferent types of symmetry that nature offers. Is there a
unified approach to study 3-D information encoded in a
2-D perspective image of an object that exhibits certain
symmetry? This paper will try to provide a definite an-
swer to this question. Our work differs from previous
results in at least the following three aspects:
1. We study symmetry under perspective projection
based on existing theory of multiple-view geometry.²
We claim that in order to fully understand
such 3-D information encoded in a single image,
one must understand geometry among multiple
images.
2. In addition to recovering the 3-D structure of a
symmetric object from its image, we show that any type of
symmetry is naturally equipped with a canonical
(world) coordinate frame, from which the viewer’s
relative pose to the object can be recovered.
3. We give the necessary and sufficient conditions in
terms of the symmetry group of the object under
which the canonical pose can be uniquely recov-
ered, and we characterize the inherent ambiguity for
each fundamental type of symmetry. Thus, for the
first time, geometric group theory and (perspective)
multiple-view geometry are elegantly and tightly
integrated.
During the development, an important principle asso-
ciated with images of symmetric objects will be ex-
amined with care: One image of a symmetric object is
equivalent to multiple images. This principle is how-
ever not entirely correct since, as we will see, often
relationships among such “images” will not be the

same as those among conventional images. It in fact
requires careful modifications to existing theories and
algorithms in multiple-view geometry if they are to be
correctly applied to images of symmetric objects.
2. Problem Formulation
Before we formulate the problem in a more abstract
form, let us take a look at a simple example: a planar
board with a symmetric pattern as shown in Fig. 2. It is
easy to see that, from any generic viewpoint, there are
at least four equivalent vantage points (with only the
rotational symmetry considered, for now) which give
rise to an identical image. The only question is which
corners in the image correspond to the ones on the
board. In this sense, these images are in fact different
from the original one. We may call these images
“hidden.”³ For instance, in Fig. 2, we have labeled in
brackets the corresponding corner numbers for such a hidden image.
Figure 2. Left: a checker board whose symmetry includes reflection along the x and y axes and rotation about o by 90°. Right: an image taken
at location o_1. Notice that the image would appear to be exactly the same if it were taken at o_2 instead. g_0 is the relative pose of the board we
perceive from the image on the right.
Figure 3. I_x: Corner correspondence between the original image of the board and an “image” with the board reflected in the x-axis by 180°;
I_y: Corner correspondence between the original image of the board and an “image” with the board reflected in the y-axis by 180°.
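The four equivalent vantage points can be checked directly. The following sketch (in Python; the corner labels and coordinates are illustrative choices of ours, not taken from the paper) verifies that each rotation by a multiple of 90° maps the corner set of a square board onto itself, merely permuting the labels, which is why all four images are pixel-identical while their point-to-point correspondences differ:

```python
import math

# Hypothetical corners of the board in its canonical frame, labeled 1..4
# (labels and coordinates are illustrative, not from the paper).
corners = {1: (1, 1), 2: (-1, 1), 3: (-1, -1), 4: (1, -1)}

def rotate(p, theta):
    """Rotate a 2-D point about the center o (the origin) by angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    x, y = p
    return (round(c * x - s * y, 9), round(s * x + c * y, 9))

# The rotational symmetry group: rotations by 0, 90, 180, and 270 degrees.
for k in range(4):
    theta = k * math.pi / 2
    rotated = {rotate(p, theta) for p in corners.values()}
    # The corner SET is invariant; the group action only permutes labels,
    # so all four vantage points yield an identical image.
    assert rotated == set(corners.values())

# The label permutation induced by a single 90-degree rotation:
perm = {a: b for a, p in corners.items()
        for b, q in corners.items() if rotate(p, math.pi / 2) == q}
print(perm)  # {1: 2, 2: 3, 3: 4, 4: 1}
```

Each hidden image thus carries the same pixels but a different, and informative, corner correspondence.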
In addition to the rotational symmetry, another kind
of symmetry, the reflective symmetry, can give rise to
a not so conventional type of hidden images, as shown
in Fig. 3. Notice that, in the figure, the two “hidden
images” with the four corners labeled by numbers in
brackets cannot be an image of the same board from any
(physically viable) vantage point!⁴ Nevertheless, as we
will see below, just like the rotational symmetry, this
type of hidden images also encodes rich 3-D geometric
information about the object.
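Why no physical vantage point can produce these reflective hidden images follows from a determinant argument: a reflection is orientation-reversing (determinant −1), and composing it with any rotation (determinant +1) still gives determinant −1, which no rigid motion achieves. A small sketch (our own illustration; the matrices are assumed examples):

```python
import math

# A reflection of 3-D space in the yz-plane: orthogonal, but with
# determinant -1, so it lies in O(3) and not SO(3) (illustrative example).
reflect = [[-1, 0, 0],
           [0, 1, 0],
           [0, 0, 1]]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    a, b, c = m[0]
    d, e, f = m[1]
    g, h, i = m[2]
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def matmul3(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

theta = 0.7  # an arbitrary rotation angle for a candidate vantage point
rot = [[math.cos(theta), -math.sin(theta), 0],
       [math.sin(theta), math.cos(theta), 0],
       [0, 0, 1]]

print(det3(reflect))                       # -1: orientation-reversing
print(round(det3(matmul3(rot, reflect))))  # still -1 after any rotation
```

Since the rotational part of every rigid motion has determinant +1, the reflected “hidden image” can never be realized by physically moving the camera; yet, as the text notes, it still encodes rich 3-D information.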
There is yet another type of symmetry “hidden” in
a pattern like a checker board. As shown in Fig. 4 be-
low, for a pattern that repeats a fundamental region
indefinitely along one or more directions, the so-called
“infinite rapport,” one would obtain exactly “the same”
image had the images been taken at vantage points that
differ from each other by multiples nT of one basic
translation T . Although all images would appear to be
the same, features (e.g., points, lines) in these images
correspond to different physical features in the world.

Figure 4. The checker pattern is repeated indefinitely along the
x-axis. Images taken at o_1, o_2, and o_3 will be the same.
Therefore, for an image like the 4th one in Fig. 1, it in
fact may give rise to many (in theory, possibly infinitely
many) “hidden images.” There is clearly a reason to
believe that it is these (many) hidden images that give
away the geometry of the plane (e.g., tilt, slant) to the
viewer’s eyes.
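This can be made concrete with a toy perspective camera. In the sketch below (a pinhole model with an assumed focal length; the pattern and spacing are our own illustrative choices), a pattern repeating along the x-axis with basic translation T yields, up to the truncation boundary, the same image from vantage points o_1 and o_2 = o_1 + T, even though each image point now corresponds to a different physical feature:

```python
T = 2.0   # basic translation of the repeating pattern (assumed)
f = 1.0   # focal length of a toy pinhole camera (assumed)

def project(point, camera_x):
    """Perspective projection of a point (x, y, z), with the camera
    translated along the x-axis to position camera_x."""
    x, y, z = point
    return (round(f * (x - camera_x) / z, 9), round(f * y / z, 9))

# A (truncated) periodic pattern: one feature repeated along x at spacing T,
# lying on a plane at depth z = 4.
pattern = [(n * T, 0.5, 4.0) for n in range(-50, 51)]

img1 = {project(p, 0.0) for p in pattern}   # image taken at o_1
img2 = {project(p, T) for p in pattern}     # image taken at o_2 = o_1 + T

# Away from the ends of the truncated pattern the two images coincide,
# even though matching image points come from different physical features.
overlap = img1 & img2
print(len(img1), len(overlap))  # 101 100
```

For a truly infinite pattern the overlap would be total, which is exactly the “infinite rapport” of hidden images described above.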
It is then not hard to imagine that the combination of
the rotational, reflective and translational symmetries
will give rise to all sorts of symmetric objects in 2-D
or 3-D space, many of which could be rather compli-
cated. In our man-made world, symmetric objects are
ubiquitous, under the names of “ornament,” “mosaic,”
“pattern,” or “tiling,” etc. Fascination with symmetric
objects can be traced back to the ancient Egyptians and
Greeks.⁵ Nevertheless, a formal mathematical inquiry
into symmetry is known as Hilbert’s 18th problem, and
a complete answer to it was not found until 1910 by
Bieberbach (1910). While in the appendix we briefly
review results of a complete list for 2-D and 3-D sym-
metric structures and groups, this paper will focus on
how to combine this knowledge about symmetry with
multiple-view geometry so as to infer 3-D information
of a symmetric object from its image(s).
In order to explain why symmetry gives away accu-
rate information about structure and location of a sym-
metric 3-D object from a single 2-D perspective image,
we will need a mathematical framework within which
all types of symmetries (that we have mentioned or not
mentioned in the above examples) can be uniformly
taken into account. Only if we can do that, will the in-
troduction of symmetry into multiple-view geometry
become natural and convenient.
Definition 1 (Symmetric structure and its group action).
A set of points S ⊂ R^3 is called a symmetric structure if
there exists a non-trivial subgroup G of the Euclidean
group E(3) that acts on it. That is, for any element
g ∈ G, it defines a bijection (i.e., a one-to-one, onto)
map from S to itself:

    g ∈ G : S → S.

Sometimes we say that S has a symmetry group G, or
that G is a group of symmetries of S.

In particular, we have g(S) = g^(-1)(S) = S for any
g ∈ G. Mathematically, symmetric structures and
groups are equivalent ways to capture symmetry: any
symmetric structure is invariant under the action of its
symmetry group; and any group (here as a subgroup
of E(3)) defines a class of (3-D) structures that are
invariant under this group action (see Appendix A). Here
we emphasize that G is in general a subgroup of the
Euclidean group E(3) but not of the special one SE(3).
This is because many symmetric structures that we are
going to consider are invariant under reflection, which
is an element of O(3) but not of SO(3).⁶ For simplicity,
in this paper we consider G to be a discontinuous (or
discrete) group.⁷
Using the homogeneous representation of E(3), any
element g = (R, T) in G can be represented as a 4 × 4
matrix of the form

    g = [ R  T ]
        [ 0  1 ]  ∈ R^(4×4),        (1)

where R ∈ R^(3×3) is an orthogonal matrix (“R” for both
rotation and reflection) and T ∈ R^3 is a vector (“T” for
translation). Note that in order to represent G in this
way, a world coordinate frame must have been chosen.
It is conventional to choose the origin of the world
coordinate frame to be the center of rotation and its
axes to line up with the axes of rotation and reflection
or the direction of translation. Often the canonical world
coordinate frame results in the simplest representation
of the symmetry (Ma et al., 2003).
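The representation in Eq. (1) is straightforward to manipulate. The following sketch (plain Python; the particular rotation and reflection are illustrative choices of ours, not taken from the paper) builds two elements of the form [R T; 0 1], checks that their product keeps the same homogeneous form, and applies one to a point in homogeneous coordinates:

```python
def make_g(R, T):
    """Homogeneous 4x4 representation g = [R T; 0 1] of an element of E(3)."""
    return [R[0] + [T[0]], R[1] + [T[1]], R[2] + [T[2]], [0, 0, 0, 1]]

def matmul4(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

# A rotation by 90 degrees about the z-axis ("R" as a rotation) ...
Rz = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
# ... and a reflection in the yz-plane ("R" as a reflection, in O(3) \ SO(3)).
Rf = [[-1, 0, 0], [0, 1, 0], [0, 0, 1]]

g1 = make_g(Rz, [0, 0, 0])
g2 = make_g(Rf, [0, 0, 0])

# Closure: the product again has the form [R T; 0 1].
g12 = matmul4(g1, g2)
print(g12[3])  # [0, 0, 0, 1]

# The action on a point, written in homogeneous coordinates:
p = [1, 2, 3, 1]
q = [sum(g1[i][j] * p[j] for j in range(4)) for i in range(4)]
print(q)  # [-2, 1, 3, 1]
```

The bottom row [0, 0, 0, 1] is what keeps rotations, reflections, and translations composable by ordinary matrix multiplication.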
Now suppose that an image of a symmetric structure
S is taken at a vantage point g_0 = (R_0, T_0) ∈ SE(3),
denoting the pose of the structure relative to the viewer
or the camera. Here g_0 is assumed to be represented
with respect to the canonical world coordinate frame
for the symmetry. If so, we call g_0 the canonical pose.
As we will see shortly, the canonical pose g_0 from the
viewer to the object can be uniquely determined from
a single image as long as the symmetry admitted by the
object (or the scene) is “rich” enough.
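The role of the canonical pose can be sketched numerically. If S is invariant under a symmetry g, i.e. g(S) = S, then the image taken at pose g_0 coincides, as a set of image points, with the image that would be taken at the “hidden” pose g_0 g. The example below is our own illustration (a square structure, a 90° rotational symmetry, and an arbitrarily assumed g_0):

```python
import math

# A symmetric structure S: the four corners of a square in the z = 0 plane.
S = [(1, 1, 0), (-1, 1, 0), (-1, -1, 0), (1, -1, 0)]

def apply(R, T, p):
    """Apply the Euclidean motion (R, T) to the point p."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + T[i] for i in range(3))

def project(p):
    """Pinhole projection with unit focal length."""
    x, y, z = p
    return (round(x / z, 9), round(y / z, 9))

# g: rotation by 90 degrees about the z-axis, a symmetry of S (g(S) = S).
Rg = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]

# g0 = (R0, T0): an assumed canonical pose of the structure w.r.t. the camera.
c, s = math.cos(0.3), math.sin(0.3)
R0 = [[1, 0, 0], [0, c, -s], [0, s, c]]
T0 = [0.2, 0.1, 5.0]

img_at_g0 = {project(apply(R0, T0, p)) for p in S}
# Pose g0*g: first move the points by g, then by g0.
img_at_g0g = {project(apply(R0, T0, apply(Rg, [0, 0, 0], p))) for p in S}

print(img_at_g0 == img_at_g0g)  # True: the hidden image equals the real one
```

Each element of the symmetry group thus contributes one hidden image at pose g_0 g, and it is from these hidden images that g_0 itself can be recovered.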

Citations
Proceedings ArticleDOI
17 Oct 2005
TL;DR: This work shows that it can estimate the coarse geometric properties of a scene by learning appearance-based models of geometric classes, even in cluttered natural scenes, and provides a multiple-hypothesis framework for robustly estimating scene structure from a single image and obtaining confidences for each geometric label.
Abstract: Many computer vision algorithms limit their performance by ignoring the underlying 3D geometric structure in the image. We show that we can estimate the coarse geometric properties of a scene by learning appearance-based models of geometric classes, even in cluttered natural scenes. Geometric classes describe the 3D orientation of an image region with respect to the camera. We provide a multiple-hypothesis framework for robustly estimating scene structure from a single image and obtaining confidences for each geometric label. These confidences can then be used to improve the performance of many other applications. We provide a thorough quantitative evaluation of our algorithm on a set of outdoor images and demonstrate its usefulness in two applications: object detection and automatic single-view reconstruction.

792 citations

Book ChapterDOI
07 May 2006
TL;DR: It is shown how symmetric pairs of features can be efficiently detected, how the symmetry bonding each pair is extracted and evaluated, and how these can be grouped into symmetric constellations that specify the dominant symmetries present in the image.
Abstract: A novel and efficient method is presented for grouping feature points on the basis of their underlying symmetry and characterising the symmetries present in an image. We show how symmetric pairs of features can be efficiently detected, how the symmetry bonding each pair is extracted and evaluated, and how these can be grouped into symmetric constellations that specify the dominant symmetries present in the image. Symmetries over all orientations and radii are considered simultaneously, and the method is able to detect local or global symmetries, locate symmetric figures in complex backgrounds, detect bilateral or rotational symmetry, and detect multiple incidences of symmetry.

387 citations

Book
17 Jun 2010
TL;DR: A computational treatment of symmetry and group theory, the ultimate mathematical formalization of symmetry, has the potential to play an important role in the computational sciences.
Abstract: In the arts and sciences, as well as in our daily lives, symmetry has made a profound and lasting impact. Likewise, a computational treatment of symmetry and group theory (the ultimate mathematical formalization of symmetry) has the potential to play an important role in computational sciences. Though the term Computational Symmetry was formally defined a decade ago by the first author, referring to algorithmic treatment of symmetries, seeking symmetry from digital data has been attempted for over four decades. Computational symmetry on real world data turns out to be challenging enough that, after decades of effort, a fully automated symmetry-savvy system remains elusive for real world applications. The recent resurging interests in computational symmetry for computer vision and computer graphics applications have shown promising results. Recognizing the fundamental relevance and potential power that computational symmetry affords, we offer this survey to the computer vision and computer graphics communities. This survey provides a succinct summary of the relevant mathematical theory, a historic perspective of some important symmetry-related ideas, a partial yet timely report on the state of the arts symmetry detection algorithms along with its first quantitative benchmark, a diverse set of real world applications, suggestions for future directions and a comprehensive reference list.

235 citations


Cites background from "On Symmetry and Multiple-View Geome..."

  • ...The key observation there is that the symmetry group actions associated with any symmetric structure allows us to interpret a single perspective image of the structure as multiple images, called “hidden images” in [102], taken from viewpoints related by the same group actions....


  • ...In particular, [102] has provided a clear characterization of the relationship between 3D symmetric structures and their 2D perspective images....


  • ...Based on these constraints, [102] has derived the...


  • ...Interested readers may refer to [102] for a more complete survey on that subject....


Journal ArticleDOI
27 Jul 2014
TL;DR: This work presents a method that enables users to perform the full range of 3D manipulations, including scaling, rotation, translation, and nonrigid deformations, to an object in a photograph.
Abstract: Photo-editing software restricts the control of objects in a photograph to the 2D image plane. We present a method that enables users to perform the full range of 3D manipulations, including scaling, rotation, translation, and nonrigid deformations, to an object in a photograph. As 3D manipulations often reveal parts of the object that are hidden in the original photograph, our approach uses publicly available 3D models to guide the completion of the geometry and appearance of the revealed areas of the object. The completion process leverages the structure and symmetry in the stock 3D model to factor out the effects of illumination, and to complete the appearance of the object. We demonstrate our system by producing object manipulations that would be impossible in traditional 2D photo-editing programs, such as turning a car over, making a paper-crane flap its wings, or manipulating airplanes in a historical photograph to change its story.

187 citations


Cites methods from "On Symmetry and Multiple-View Geome..."

  • ...In using symmetries to complete appearance, our work is related to approaches that extract symmetries from images and 3D models [Hong et al. 2004; Gal and Cohen-Or 2006; Pauly et al. 2005], and that use symmetries to complete geometry [Terzopoulos et al....


  • ...In using symmetries to complete appearance, our work is related to approaches that extract symmetries from images and 3D models [Hong et al. 2004; Gal and Cohen-Or 2006; Pauly et al. 2005], and that use symmetries to complete geometry [Terzopoulos et al. 1987; Mitra et al. 2006; Mitra and Pauly…...


Proceedings ArticleDOI
14 Jun 2020
TL;DR: HybridPose as discussed by the authors utilizes a hybrid intermediate representation to express different geometric information in the input image, including keypoints, edge vectors, and symmetry correspondences, which allows pose regression to exploit more and diverse features when one type of predicted representation is inaccurate.
Abstract: We introduce HybridPose, a novel 6D object pose estimation approach. HybridPose utilizes a hybrid intermediate representation to express different geometric information in the input image, including keypoints, edge vectors, and symmetry correspondences. Compared to a unitary representation, our hybrid representation allows pose regression to exploit more and diverse features when one type of predicted representation is inaccurate (e.g., because of occlusion). Different intermediate representations used by HybridPose can all be predicted by the same simple neural network, and outliers in predicted intermediate representations are filtered by a robust regression module. Compared to state-of-the-art pose estimation approaches, HybridPose is comparable in running time and is significantly more accurate. For example, on Occlusion Linemod dataset, our method achieves a prediction speed of 30 fps with a mean ADD(-S) accuracy of 79.2%, representing a 67.4% improvement from the current state-of-the-art approach.

126 citations


Cites background from "On Symmetry and Multiple-View Geome..."

  • ...Traditional applications of symmetry detection include face recognition [31], depth estimation [21], and 3D reconstruction [13, 43]....


References
Journal ArticleDOI
TL;DR: New results are derived on the minimum number of landmarks needed to obtain a solution, and algorithms are presented for computing these minimum-landmark solutions in closed form that provide the basis for an automatic system that can solve the Location Determination Problem under difficult viewing.
Abstract: A new paradigm, Random Sample Consensus (RANSAC), for fitting a model to experimental data is introduced. RANSAC is capable of interpreting/smoothing data containing a significant percentage of gross errors, and is thus ideally suited for applications in automated image analysis where interpretation is based on the data provided by error-prone feature detectors. A major portion of this paper describes the application of RANSAC to the Location Determination Problem (LDP): Given an image depicting a set of landmarks with known locations, determine that point in space from which the image was obtained. In response to a RANSAC requirement, new results are derived on the minimum number of landmarks needed to obtain a solution, and algorithms are presented for computing these minimum-landmark solutions in closed form. These results provide the basis for an automatic system that can solve the LDP under difficult viewing

23,396 citations

Journal ArticleDOI
TL;DR: It is proved the convergence of a recursive mean shift procedure to the nearest stationary point of the underlying density function and, thus, its utility in detecting the modes of the density.
Abstract: A general non-parametric technique is proposed for the analysis of a complex multimodal feature space and to delineate arbitrarily shaped clusters in it. The basic computational module of the technique is an old pattern recognition procedure: the mean shift. For discrete data, we prove the convergence of a recursive mean shift procedure to the nearest stationary point of the underlying density function and, thus, its utility in detecting the modes of the density. The relation of the mean shift procedure to the Nadaraya-Watson estimator from kernel regression and the robust M-estimators of location is also established. Algorithms for two low-level vision tasks, discontinuity-preserving smoothing and image segmentation, are described as applications. In these algorithms, the only user-set parameter is the resolution of the analysis, and either gray-level or color images are accepted as input. Extensive experimental results illustrate their excellent performance.

11,727 citations


"On Symmetry and Multiple-View Geome..." refers methods in this paper

  • ...Using the techniques introduced earlier in this paper, we can test whether certain image segments, obtained by other low-level segmentation algorithms such as mean shift (Comanicu and Meer, 2002), can be the perspective projection of symmetric objects in 3-D....


Book
01 Jan 1950

3,843 citations


"On Symmetry and Multiple-View Geome..." refers background in this paper

  • ...Nevertheless, its contribution to computational vision so far has been explored often through statistical methods, such as the study of isotropic textures (e.g., for the 4th image of Fig. 1) ( Gibson, 1950; Witkin, 1988; Zabrodsky et al., 1995; Mukherjee et al., 1995; Malik and Rosenholtz, 1997; Rosenholtz and Malik, 1997; Leung and Malik, 1997)....

    [...]

Journal ArticleDOI
TL;DR: The authors present a comprehensive reference on tiling theory, bringing together older results that had not been collected before and containing a wealth of new material.
Abstract: "Remarkable...It will surely remain the unique reference in this area for many years to come." Roger Penrose , Nature "...an outstanding achievement in mathematical education." Bulletin of The London Mathematical Society "I am enormously impressed...Will be the definitive reference on tiling theory for many decades. Not only does the book bring together older results that have not been brought together before, but it contains a wealth of new material...I know of no comparable book." Martin Gardner

1,894 citations


"On Symmetry and Multiple-View Geome..." refers background in this paper

  • ...Interested readers can find a full description of all the 17 patterns in (Grünbaum and Shephard, 1987)....

    [...]

  • ...All facts and statements will be given without proofs, and interested readers may refer to (Weyl, 1952; Grünbaum and Shephard, 1987; Martin, 1975)....

    [...]

Frequently Asked Questions (14)
Q1. What have the authors contributed in "On symmetry and multiple-view geometry: structure, pose, and calibration from a single image"?

In this paper, the authors provide a principled explanation of how knowledge in global 3-D structural invariants, typically captured by a group action on a symmetric structure, can dramatically facilitate the task of reconstructing a 3-D scene from one or more images. More importantly, since every symmetric structure admits a “canonical” coordinate frame with respect to which the group action can be naturally represented, the canonical pose between the viewer and this canonical frame can be recovered too, which explains why symmetric objects (e.g., buildings) provide us with overwhelming clues to their orientation and position. The authors show how algorithms from conventional multiple-view geometry, after being properly modified and extended, can be directly applied to perform such recovery from all “hidden images” of one image of the symmetric structure. The authors give the necessary and sufficient conditions, in terms of the symmetry (group) admitted by a structure, under which this pose can be uniquely determined. The authors also characterize, when such conditions are not satisfied, to what extent this pose can be recovered.

Obviously this is a more principled way to study assumptions about 3-D structure that people have exploited before in multiple-view geometry, such as orthogonality and parallelism (hence vanishing points), etc. Furthermore, such information can be readily utilized to establish correspondence across images taken with a large baseline or change of view angle: as long as one common (local) symmetry can be recognized and aligned properly, the rest of the structures in the scene can then be correctly registered and reconstructed. The authors believe that, together with conventional geometric constraints among multiple images, symmetry is indeed an important cue which eventually makes 3-D reconstruction a more well-conditioned problem.

Points, lines, and planes are special symmetric objects which have been extensively studied as primitive geometric features for reconstructing a 3-D scene from 2-D images.

Probably the most important observation from this paper is that, in addition to the 3-D structure, the “canonical” pose between the canonical world coordinate frame of a symmetric object and the camera can also be recovered. 

As the authors have suggested before, although symmetry is a phenomenon associated with a single image, a full understanding of its effect on 3-D reconstruction depends on the theory of multiple-view geometry. 

Given an image of a structure S with a reflective symmetry with respect to a plane in 3-D, the canonical pose g0 can be determined up to an arbitrary choice of an orthonormal frame in this plane, which is a 3-parameter family of ambiguity (i.e., SE(2)).
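The SE(2) ambiguity can be checked numerically: conjugating a reflection by any in-plane motion (a rotation about the plane's normal plus a translation within the plane) gives back the same reflection, so no single-reflection observation can distinguish between canonical frames related by such a motion. The following numpy sketch is illustrative only (not the paper's algorithm) and fixes the symmetry plane to be the yz-plane:

```python
import numpy as np

rng = np.random.default_rng(0)

# Reflection across the yz-plane (unit normal n = e_x): F = I - 2 n n^T.
n = np.array([1.0, 0.0, 0.0])
F = np.eye(3) - 2.0 * np.outer(n, n)

for _ in range(5):
    # An in-plane motion g = (R, t): rotation about the plane's normal
    # plus a translation within the plane -- a 3-parameter SE(2) family.
    theta = rng.uniform(0.0, 2.0 * np.pi)
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])
    t = np.array([0.0, *rng.standard_normal(2)])  # first component 0: t lies in the plane

    x = rng.standard_normal(3)
    # g o F o g^{-1} applied to x equals F applied to x, so the choice of
    # orthonormal frame within the symmetry plane is unobservable.
    conj = R @ (F @ (R.T @ (x - t))) + t
    assert np.allclose(conj, F @ x)
```

The key facts the sketch relies on are that R commutes with F (both are block-diagonal with respect to the plane/normal split) and that F fixes every translation lying in the plane.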

Then the real kernel of L is a 3-dimensional space with the basis {[0, 0, v1], [Im(v2), Re(v2), 0], [−Re(v2), Im(v2), 0]} ⊂ R^{3×3}.

The ground truth for the length ratios of the white board and the table are 1.51 and 1.00, and the recovered length ratios are 1.506 and 1.003, respectively.
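As a quick arithmetic check on the figures quoted above, both recovered ratios lie well within 1% of ground truth:

```python
# Reported figures: ground-truth vs. recovered length ratios.
ratios = {"white board": (1.51, 1.506), "table": (1.00, 1.003)}
for name, (truth, recovered) in ratios.items():
    rel_err = abs(recovered - truth) / truth
    assert rel_err < 0.01  # both reconstructions are within 1%
```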

Using the techniques introduced earlier in this paper, the authors can test whether certain image segments, obtained by other low-level segmentation algorithms such as mean shift (Comaniciu and Meer, 2002), can be the perspective projection of symmetric objects in 3-D.

Using their methods, the camera poses can be easily obtained as a “by-product” when the authors align the symmetric objects in different images. 

As the authors will see shortly, the canonical pose g0 from the viewer to the object can be uniquely determined from a single image as long as the symmetry admitted by the object (or the scene) is “rich” enough.

As the authors have seen in the sections above, there is always some ambiguity in determining the relative pose (g0) from the vantage point to the canonical world coordinate frame (where the symmetry group G was represented in the first place) if only one type of symmetry is considered.
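One way to see why richer symmetry reduces this ambiguity: two reflective symmetries about non-parallel planes compose to a proper rotation about the planes' intersection line, pinning down rotational freedom that either reflection alone leaves undetermined. An illustrative numpy check (an assumption for the sketch: both planes pass through the origin, so each reflection is a Householder matrix):

```python
import numpy as np

def reflection(normal):
    # Householder form of a reflection across the plane through the
    # origin with the given unit normal: F = I - 2 n n^T.
    n = np.asarray(normal, dtype=float)
    n = n / np.linalg.norm(n)
    return np.eye(3) - 2.0 * np.outer(n, n)

# Two reflective symmetries whose planes meet at dihedral angle theta.
theta = np.deg2rad(30.0)
F1 = reflection([1.0, 0.0, 0.0])
F2 = reflection([np.cos(theta), np.sin(theta), 0.0])

R = F2 @ F1  # composition of the two reflections
# R is a proper rotation (det = +1) by 2*theta about the planes'
# intersection line; its trace matches 1 + 2*cos(2*theta).
assert np.isclose(np.linalg.det(R), 1.0)
assert np.isclose(np.trace(R), 1.0 + 2.0 * np.cos(2.0 * theta))
```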

Figure 13 shows a comprehensive example of symmetry-based photo editing, which includes removing occlusions, copying and replacing objects in the scene, and adding new objects.

The checkerboard is a planar structure that is symmetric with respect to its central line (in fact, there are many more local reflective symmetries on parts of the board).
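The reflective symmetry of the checkerboard's coloring can be illustrated with a small numpy sketch (a caveat not from the paper: an odd-width board preserves its colors under the flip about the central line, while an even-width board swaps the two colors):

```python
import numpy as np

# Coloring of an n-by-n checkerboard: square (i, j) has color (i + j) mod 2.
def board(n):
    return np.add.outer(np.arange(n), np.arange(n)) % 2

# With an odd width the central column is its own mirror image, so a
# left-right flip about the central line reproduces the same coloring.
assert np.array_equal(board(9), np.fliplr(board(9)))

# With an even width (e.g. the standard 8x8 board) the shape is still
# symmetric, but the flip exchanges the two colors.
assert np.array_equal(1 - board(8), np.fliplr(board(8)))
```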