
Showing papers by "Andrew Zisserman published in 1999"


01 Jan 1999
TL;DR: A perspective (central) projection camera is represented by a matrix that can be computed from the correspondence of four (or more) points.
Abstract: A perspective (central) projection camera is represented by a matrix. The most general perspective transformation between two planes (a world plane and the image plane, or two image planes induced by a world plane) is a plane projective transformation. This can be computed from the correspondence of four (or more) points. The epipolar geometry between two views is represented by the fundamental matrix. This can be computed from the correspondence of seven (or more) points.

1,301 citations
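The plane projective transformation from four (or more) point correspondences can be estimated with the standard Direct Linear Transform (DLT). The numpy sketch below is illustrative only (the function name is mine, not the chapter's, and the point normalisation used in practice on noisy data is omitted):

```python
import numpy as np

def homography_dlt(src, dst):
    """Estimate the 3x3 plane projective transformation H (up to scale)
    mapping src[i] -> dst[i], from n >= 4 point correspondences,
    via the Direct Linear Transform."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two rows of the design matrix.
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector with smallest singular value.
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    return Vt[-1].reshape(3, 3)
```

With exact correspondences this recovers H up to scale; with measured points one would normalise the coordinates first and use more than four correspondences in a least-squares sense.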


Journal ArticleDOI
01 Sep 1999
TL;DR: Methods are presented for creating 3D graphical models of scenes from a limited number of images, i.e. one or two, in situations where no scene co-ordinate measurements are available.
Abstract: We present methods for creating 3D graphical models of scenes from a limited number of images, i.e. one or two, in situations where no scene co-ordinate measurements are available. The methods employ constraints available from geometric relationships that are common in architectural scenes - such as parallelism and orthogonality - together with constraints available from the camera. In particular, by using the circular points of a plane, simple linear algorithms are given for computing plane rectification, plane orientation and camera calibration from a single image. Examples of image-based 3D modelling are given for both single images and image pairs.

310 citations


Journal ArticleDOI
TL;DR: An uncertainty analysis which includes both the errors in image localization and the uncertainty in the imaging transformation is developed, and the distribution of correspondences can be chosen to achieve a particular bound on the uncertainty.

265 citations


Proceedings ArticleDOI
01 Jun 1999
TL;DR: The novelty of the approach lies in the use of inter-image homographies to validate and best estimate the plane, and in the minimal initialization requirements: only a single 3D line with a textured neighbourhood is required to generate a plane hypothesis.
Abstract: A new method is described for automatically reconstructing 3D planar faces from multiple images of a scene. The novelty of the approach lies in the use of inter-image homographies to validate and best estimate the plane, and in the minimal initialization requirements: only a single 3D line with a textured neighbourhood is required to generate a plane hypothesis. The planar facets enable line grouping and also the construction of parts of the wireframe which were missed due to the inevitable shortcomings of feature detection and matching. The method allows a piecewise planar model of a scene to be built completely automatically, with no user intervention at any stage, given only the images and camera projection matrices as input. The robustness and reliability of the method are illustrated on several examples, from both aerial and interior views.

237 citations


Proceedings Article
08 Sep 1999
TL;DR: An algorithm for automatically matching line segments over multiple images and generating a piecewise planar reconstruction based on the matched lines shows that a planar facet hypothesis can be generated from a single 3D line, using an inter-image homography applied to the line neighbourhood.
Abstract: This paper describes two developments in the automatic reconstruction of buildings from aerial images. The first is an algorithm for automatically matching line segments over multiple images. The algorithm employs geometric constraints based on the multi-view geometry together with photometric constraints derived from the line neighbourhood, and achieves a performance of better than 95% correct matches over three views. The second development is a method for automatically computing a piecewise planar reconstruction based on the matched lines. The novelty here is that a planar facet hypothesis can be generated from a single 3D line, using an inter-image homography applied to the line neighbourhood. The algorithm has successfully generated near complete roof reconstructions from multiple images. This work has been carried out as part of the EC IMPACT project. A summary of the project is included.

225 citations


Book ChapterDOI
21 Sep 1999
TL;DR: This report is a brief overview of the use of “feature based” methods in structure and motion computation; a companion paper by Irani and Anandan reviews “direct” methods.
Abstract: This report is a brief overview of the use of “feature based” methods in structure and motion computation. A companion paper by Irani and Anandan [16] reviews “direct” methods.

202 citations


Journal ArticleDOI
TL;DR: The aim of this work is the recovery of 3D structure and camera projection matrices for each frame of an uncalibrated image sequence, and investigates two strategies for tackling degeneracies, including a statistical model selection test to identify when degeneracies occur.
Abstract: The aim of this work is the recovery of 3D structure and camera projection matrices for each frame of an uncalibrated image sequence. In order to achieve this, correspondences are required throughout the sequence. A significant and successful mechanism for automatically establishing these correspondences is by the use of geometric constraints arising from scene rigidity. However, problems arise with such geometry guided matching if general viewpoint and general structure are assumed whilst frames in the sequence and/or scene structure do not conform to these assumptions. Such cases are termed degenerate. In this paper we describe two important cases of degeneracy and their effects on geometry guided matching. The cases are a motion degeneracy where the camera does not translate between frames, and a structure degeneracy where the viewed scene structure is planar. The effects include the loss of correspondences due to under- or over-fitting of geometric models estimated from image data, leading to the failure of the tracking method. These degeneracies are not a theoretical curiosity, but commonly occur in real sequences where models are statistically estimated from image points with measurement error. We investigate two strategies for tackling such degeneracies: the first uses a statistical model selection test to identify when degeneracies occur; the second uses multiple motion models to overcome the degeneracies. The strategies are evaluated on real sequences varying in motion, scene type, and length from 13 to 120 frames.

187 citations


Proceedings ArticleDOI
01 Jan 1999
TL;DR: A simple approach to combining scene and auto-calibration constraints for the calibration of cameras from single views and stereo pairs and examples of various cases of constraint combination and degeneracy as well as computational techniques are presented.
Abstract: We present a simple approach to combining scene and auto-calibration constraints for the calibration of cameras from single views and stereo pairs. Calibration constraints are provided by imaged scene structure, such as vanishing points of orthogonal directions, or rectified planes. In addition, constraints are available from the nature of the cameras and the motion between views. We formulate these constraints in terms of the geometry of the imaged absolute conic and its relationship to pole-polar pairs and the imaged circular points of planes. Three significant advantages result: first, constraints from scene features, camera characteristics and auto-calibration constraints provide linear equations in the elements of the image of the absolute conic. This means that constraints may easily be combined, and their solution is straightforward. Second, the degeneracies that occur when constraints are not independent may be easily identified. Lastly, the constraints from scene planes and image planes may be treated uniformly. Examples of various cases of constraint combination and degeneracy as well as computational techniques are presented.

152 citations
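One ingredient of this approach can be sketched concretely: the vanishing points v_i of three mutually orthogonal scene directions give linear constraints v_i^T ω v_j = 0 on the image of the absolute conic ω = K^-T K^-1, from which K follows by Cholesky factorisation. The code below is a hypothetical minimal version (names mine, noise-free input assumed, zero skew and square pixels assumed), not the paper's algorithm:

```python
import numpy as np

def K_from_orthogonal_vps(v1, v2, v3):
    """Recover the calibration matrix K from the vanishing points of three
    mutually orthogonal directions, assuming zero skew and square pixels.
    Each orthogonal pair gives one linear constraint vi^T w vj = 0 on the
    image of the absolute conic w = K^-T K^-1, parametrised here as
    w = [[w1, 0, w2], [0, w1, w3], [w2, w3, w4]]."""
    def row(u, v):
        return [u[0] * v[0] + u[1] * v[1],   # coefficient of w1
                u[0] * v[2] + u[2] * v[0],   # coefficient of w2
                u[1] * v[2] + u[2] * v[1],   # coefficient of w3
                u[2] * v[2]]                 # coefficient of w4
    A = np.array([row(v1, v2), row(v1, v3), row(v2, v3)], dtype=float)
    _, _, Vt = np.linalg.svd(A)
    w1, w2, w3, w4 = Vt[-1]
    if w1 < 0:  # fix the overall sign so w is positive definite
        w1, w2, w3, w4 = -w1, -w2, -w3, -w4
    w = np.array([[w1, 0, w2], [0, w1, w3], [w2, w3, w4]])
    # w = K^-T K^-1, so K is the normalised inverse transpose of the
    # lower-triangular Cholesky factor of w.
    L = np.linalg.cholesky(w)
    K = np.linalg.inv(L).T
    return K / K[2, 2]
```

The linearity in ω is exactly why, as the abstract notes, such constraints combine easily with others of the same form.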


Proceedings ArticleDOI
01 Sep 1999
TL;DR: An algebraic representation is developed which unifies the three types of measurement and, amongst other advantages, permits a first order error propagation analysis to be performed, associating an uncertainty with each measurement.
Abstract: We describe how 3D affine measurements may be computed from a single perspective view of a scene given only minimal geometric information determined from the image. This minimal information is typically the vanishing line of a reference plane and a vanishing point for a direction not parallel to the plane. It is shown that affine scene structure may then be determined from the image, without knowledge of the camera's internal calibration (e.g. focal length), nor of the explicit relation between camera and world (pose). In particular we show how to: compute the distance between planes parallel to the reference plane (up to a common scale factor); compute area and length ratios on any plane parallel to the reference plane; determine the camera's (viewer's) location. Simple geometric derivations are given for these results. We also develop an algebraic representation which unifies the three types of measurement and, amongst other advantages, permits a first order error propagation analysis to be performed, associating an uncertainty with each measurement. We demonstrate the technique for a variety of applications, including height measurements in forensic images and 3D graphical modelling from single images.

130 citations
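The core relation can be sketched as follows: with b and t the imaged base and top of a vertical object, v the vertical vanishing point and l the vanishing line of the reference plane (all homogeneous 3-vectors), the quantity ‖b × t‖ / (|l · b| ‖v × t‖) is proportional to the object's world height with the same constant for every object, so one known reference height fixes the scale. A minimal numpy sketch (names mine; it omits the error propagation the paper develops):

```python
import numpy as np

def height_term(b, t, v, l):
    """Quantity proportional to the world height of an object with imaged
    base b (on the reference plane) and top t; the constant of
    proportionality is the same for every object in the image."""
    return np.linalg.norm(np.cross(b, t)) / (
        abs(l @ b) * np.linalg.norm(np.cross(v, t)))

def measure_height(b, t, b_ref, t_ref, z_ref, v, l):
    """Height of an unknown object from one reference object of known
    height z_ref, using the same v and l for both."""
    return z_ref * height_term(b, t, v, l) / height_term(b_ref, t_ref, v, l)
```

Because the unknown homogeneous scales of b, t, v and l cancel in the ratio, the measurement needs no camera calibration, matching the claim in the abstract.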


Book ChapterDOI
01 Jan 1999
TL;DR: It is shown that structures that repeat on a scene plane are related by particular parametrized transformations in perspective images, which provide powerful grouping constraints and can be used at the heart of hypothesize and verify grouping algorithms.
Abstract: The objective of this work is the automatic detection and grouping of imaged elements which repeat on a plane in a scene (for example tiled floorings). It is shown that structures that repeat on a scene plane are related by particular parametrized transformations in perspective images. These image transformations provide powerful grouping constraints, and can be used at the heart of hypothesize and verify grouping algorithms. The parametrized transformations are global across the image plane and may be computed without knowledge of the pose of the plane or camera calibration. Parametrized transformations are given for several classes of repeating operation in the world as well as groupers based on these. These groupers are demonstrated on a number of real images, where both the elements and the grouping are determined automatically. It is shown that the repeating element can be learnt from the image, and hence provides an image descriptor. Also, information on the plane pose, such as its vanishing line, can be recovered from the grouping.

86 citations


Proceedings ArticleDOI
04 Feb 1999
TL;DR: In this paper, a new measurement algorithm is presented which generates height measurements and their associated errors from a single known physical measurement in an image, which draws on results from projective geometry and computer vision.
Abstract: In this paper a new measurement algorithm is presented which generates height measurements and their associated errors from a single known physical measurement in an image. The method draws on results from projective geometry and computer vision. A height measurement is obtained from each frame of the video. A 'stereo like' correspondence between images is not required, nor is any explicit camera calibration. The accuracy of the algorithm is demonstrated by a number of examples when ground truth is known. Finally, the height measurements and their variation are described for a person in motion. We draw attention to the uncertainty in heights associated with humans in motion, and the limitations of using this description for identification.

Proceedings ArticleDOI
07 Jun 1999
TL;DR: A method to completely automatically recover 3D scene structure together with a camera for each frame from a sequence of images acquired by an unknown camera undergoing unknown movement is described.
Abstract: We describe a method to completely automatically recover 3D scene structure together with a camera for each frame from a sequence of images acquired by an unknown camera undergoing unknown movement. Previous approaches have used calibration objects or landmarks to recover this information, and are therefore often limited to a particular scale. The approach of this paper is far more general, since the "landmarks" are derived directly from the imaged scene texture. The method can be applied to a large class of scenes and motions, and is demonstrated for sequences of interior and exterior scenes using both controlled-motion and hand-held cameras. We demonstrate two applications of this technology. The first is the construction of 3D graphical models of the scene; the second is the insertion of virtual objects into the original image sequence. Other applications include image compression and frame interpolation.

Proceedings ArticleDOI
01 Sep 1999
TL;DR: This paper investigates the multiple view geometry of smooth surfaces and a plane, where the plane provides a planar homography mapping between the views, and new solutions are given for the computation of epipolar and trifocal geometry for this type of scene.
Abstract: This paper investigates the multiple view geometry of smooth surfaces and a plane, where the plane provides a planar homography mapping between the views. Innovations are made in three areas. First, new solutions are given for the computation of epipolar and trifocal geometry for this type of scene. In particular it is shown that the epipole may be determined from bitangents between the homography-registered occluding contours, and a new minimal solution is given for computing the trifocal tensor. Second, algorithms are demonstrated for automatically estimating the fundamental matrix and trifocal tensor from images of such scenes. Third, a method is developed for estimating camera matrices for a sequence of images of these scenes. These three areas are combined in a "freehand scanner" application where 3D texture-mapped graphical models of smooth objects are acquired directly from a video sequence of the object and plane.

Proceedings ArticleDOI
01 Jan 1999
TL;DR: By incorporating image-based geometric constraints over multiple views, this paper improves on traditional techniques which use purely 3D information and directly target perceptual cues, important to the human visual system, by which errors in AR are most readily perceived.
Abstract: The goal of augmented reality is to insert virtual objects into real video sequences. This paper shows that by incorporating image-based geometric constraints over multiple views, we improve on traditional techniques which use purely 3D information. The constraints imposed are chosen to directly target perceptual cues, important to the human visual system, by which errors in AR are most readily perceived. Imposition of the constraints is achieved by constrained maximum-likelihood estimation, and blends projective, affine and Euclidean geometry as appropriate in different cases. We introduce a number of examples of augmented reality tasks, show how image-based constraints can be incorporated into current 3D-based systems, and demonstrate the improvements conferred.

Book ChapterDOI
01 Jan 1999
TL;DR: Two image matching techniques that owe their success to a combination of geometric and photometric constraints are described and it is shown that these two techniques may be combined and are complementary for the application of image retrieval from an image database.
Abstract: We describe two image matching techniques that owe their success to a combination of geometric and photometric constraints. In the first, images are matched under similarity transformations by using local intensity invariants and semi-local geometric constraints. In the second, 3D curves and lines are matched between images using epipolar geometry and local photometric constraints. Both techniques are illustrated on real images. We show that these two techniques may be combined and are complementary for the application of image retrieval from an image database. Given a query image, local intensity invariants are used to obtain a set of potential candidate matches from the database. This is very efficient as it is implemented as an indexing algorithm. Curve matching is then used to obtain a more significant ranking score. It is shown that for correctly retrieved images many curves are matched, whilst incorrect candidates obtain very low ranking.

Proceedings ArticleDOI
04 Feb 1999
TL;DR: The aim of this work is the removal of distracting background patterns from forensic evidence so that the evidence is rendered more visible.
Abstract: The aim of this work is the removal of distracting background patterns from forensic evidence so that the evidence is rendered more visible. An example is the image of a fingerprint on a non-periodic background. The method involves registering the image with a control image of the background pattern that we seek to remove. A statistical comparison of the registered images identifies the latent mark. The registration of the images involves both a geometric and a photometric component. The geometric registration is invariant to perspective distortion and the photometric registration invariant to affine colour-space transformations. The algorithm is based on a robust Maximum Likelihood Estimator. Both the registration and the comparison algorithms are automatic. The paper briefly explains these algorithms. The method applies in situations where periodic background removal (e.g. Fourier techniques) would not be successful. The method has proven effective in extracting latent fingermark detail overlaying non-periodic backgrounds, and examples are shown of its success in removing such backgrounds that would otherwise hamper fingermark identification. Indeed, the process has succeeded in rendering visible fingerprints that were totally camouflaged by the background pattern on bank notes. It is also applicable in removing backgrounds in other forensic cases such as footprints, for example.
Keywords: Forensic images, fingerprints, latent marks, image registration, photometric registration, spatial deformations
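The photometric half of the comparison can be sketched as a per-channel affine fit between the registered images. Note the paper uses a robust Maximum Likelihood Estimator; this hypothetical sketch (names mine) uses plain least squares and assumes the geometric registration has already been applied:

```python
import numpy as np

def affine_photometric_residual(control, evidence):
    """Fit evidence ~ gain * control + offset per colour channel by least
    squares (an affine colour-space map) and return the residual image,
    in which structure not explained by the background pattern, such as a
    latent mark, stands out."""
    residual = np.empty_like(evidence, dtype=float)
    for c in range(evidence.shape[-1]):
        x = control[..., c].ravel().astype(float)
        y = evidence[..., c].ravel().astype(float)
        A = np.column_stack([x, np.ones_like(x)])
        (gain, offset), *_ = np.linalg.lstsq(A, y, rcond=None)
        residual[..., c] = evidence[..., c] - (gain * control[..., c] + offset)
    return residual
```

A robust estimator, as in the paper, would additionally down-weight the latent-mark pixels themselves so they do not bias the fitted gain and offset.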

21 Sep 1999
TL;DR: The thoroughly refereed post-workshop proceedings of the International Workshop on Vision Algorithms, held in Corfu, Greece in September 1999 in conjunction with ICCV'99: 15 revised full papers selected from 65 submissions, each complemented by a brief transcription of the discussion that followed its presentation.
Abstract: From the Publisher: "This book constitutes the thoroughly refereed post-workshop proceedings of the International Workshop on Vision Algorithms held in Corfu, Greece in September 1999 in conjunction with ICCV'99. The 15 revised full papers presented were carefully reviewed and selected from 65 submissions; each paper is complemented by a brief transcription of the discussion that followed its presentation. Also included are two invited contributions and two expert reviews as well as a panel discussion. The volume spans the whole range of algorithms for geometric vision. The authors and volume editors succeeded in providing added value beyond a mere collection of papers and made the volume a state-of-the-art survey of their field."--BOOK JACKET.