Showing papers by "Luc Van Gool" published in 1999


Journal ArticleDOI
TL;DR: A theoretical proof is given which shows that the absence of skew in the image plane is sufficient to allow for self-calibration, and a method to detect critical motion sequences is proposed.
Abstract: In this paper the theoretical and practical feasibility of self-calibration in the presence of varying intrinsic camera parameters is investigated. The paper's main contribution is a self-calibration method which efficiently deals with all kinds of constraints on the intrinsic camera parameters. Within this framework a practical method is proposed which can retrieve a metric reconstruction from image sequences obtained with uncalibrated zooming/focusing cameras. The feasibility of the approach is illustrated on real and synthetic examples. In addition, a theoretical proof is given which shows that the absence of skew in the image plane is sufficient to allow for self-calibration. A counting argument is developed which, depending on the set of constraints, gives the minimum sequence length for self-calibration, and a method to detect critical motion sequences is proposed.
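The counting argument lends itself to a small illustration. The sketch below assumes the standard accounting (8 constraints are needed to reduce the 15-dof projective ambiguity to the 7-dof metric one; each known intrinsic gives one constraint per view, each constant-but-unknown intrinsic one per additional view); the function name and bookkeeping are ours, not the paper's.

```python
def min_sequence_length(n_known: int, n_fixed: int) -> int:
    """Smallest (generic) number of views n satisfying
    n * n_known + (n - 1) * n_fixed >= 8, i.e. enough constraints to
    upgrade the 15-dof projective ambiguity to the 7-dof metric one.
    n_known: intrinsics with known values (1 constraint per view);
    n_fixed: intrinsics constant but unknown (1 per extra view)."""
    if n_known == 0 and n_fixed == 0:
        raise ValueError("no constraints: self-calibration impossible")
    n = 2  # at least two views for structure-from-motion
    while n * n_known + (n - 1) * n_fixed < 8:
        n += 1
    return n

# Zero skew known; aspect ratio and principal point constant but
# unknown; focal length varying freely:
print(min_sequence_length(n_known=1, n_fixed=3))  # -> 3
```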

829 citations


Book ChapterDOI
TL;DR: This contribution develops a new technique for content-based image retrieval that classifies images based on local invariants, which represent the image in a very compact way and allow fast comparison and feature matching with images in the database.
Abstract: This contribution develops a new technique for content-based image retrieval. Where most existing image retrieval systems mainly focus on color and color distribution or texture, we classify the images based on local invariants. These features represent the image in a very compact way and allow fast comparison and feature matching with images in the database. Using local features makes the system robust to occlusions and changes in the background. Using invariants makes it robust to changes in viewpoint and illumination.
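As a rough illustration of the matching stage, a minimal voting scheme over precomputed local invariant vectors might look as follows; the descriptor contents, the distance threshold, and all names are our assumptions, since the abstract does not specify them.

```python
import numpy as np

def retrieve(query: np.ndarray, database: dict[str, np.ndarray],
             max_dist: float = 0.25) -> list[tuple[str, int]]:
    """Vote-based retrieval: each local invariant vector of the query
    (one row per image region) votes for the database image holding
    its nearest descriptor, provided the distance stays below
    max_dist. Images are returned ranked by vote count."""
    votes: dict[str, int] = {}
    for q in query:
        best_img, best_d = None, max_dist
        for name, descs in database.items():
            d = float(np.linalg.norm(descs - q, axis=1).min())
            if d < best_d:
                best_img, best_d = name, d
        if best_img is not None:
            votes[best_img] = votes.get(best_img, 0) + 1
    return sorted(votes.items(), key=lambda kv: -kv[1])
```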

223 citations


Book ChapterDOI
15 Sep 1999
TL;DR: The image sequence is calibrated with a structure-from-motion approach that accounts for the special viewing geometry of plenoptic scenes, and dense depth maps are recovered locally for each viewpoint by applying a stereo matching technique.
Abstract: In this contribution we focus on plenoptic scene modeling and rendering from long image sequences taken with a hand-held camera. The image sequence is calibrated with a structure-from-motion approach that considers the special viewing geometry of plenoptic scenes. By applying a stereo matching technique, dense depth maps are recovered locally for each viewpoint.
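For the stereo stage, the core computation behind the recovered depth maps is the rectified-pair relation between disparity and depth; a minimal sketch, assuming a rectified pair with the focal length in pixels and a known baseline (the abstract does not give the paper's exact formulation):

```python
import numpy as np

def depth_from_disparity(disparity: np.ndarray, focal_px: float,
                         baseline: float) -> np.ndarray:
    """Convert a dense disparity map (in pixels) into depth for a
    rectified stereo pair via Z = f * B / d; zero (invalid)
    disparities map to +inf."""
    with np.errstate(divide="ignore"):
        return np.where(disparity > 0,
                        focal_px * baseline / disparity, np.inf)
```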

135 citations


Proceedings ArticleDOI
04 Feb 1999
TL;DR: In this paper, a new measurement algorithm is presented which generates height measurements and their associated errors from a single known physical measurement in an image, which draws on results from projective geometry and computer vision.
Abstract: In this paper a new measurement algorithm is presented which generates height measurements and their associated errors from a single known physical measurement in an image. The method draws on results from projective geometry and computer vision. A height measurement is obtained from each frame of the video. A 'stereo-like' correspondence between images is not required, nor is any explicit camera calibration. The accuracy of the algorithm is demonstrated on a number of examples where ground truth is known. Finally, the height measurements and their variation are described for a person in motion. We draw attention to the uncertainty in heights associated with humans in motion, and the limitations of using this description for identification.
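The abstract does not reproduce the measurement formula. A plausible stand-in is the standard single-view metrology relation between the image of an object's base b and top t, the vertical vanishing point v and the ground-plane vanishing line l (all homogeneous 3-vectors), with a metric scale alpha fixed once from the single known reference measurement; treat the sketch below as that stand-in, not as the paper's exact method.

```python
import numpy as np

def _ratio(b, t, v, l):
    # alpha * Z = -||b x t|| / ((l . b) * ||v x t||)
    b, t, v, l = (np.asarray(a, dtype=float) for a in (b, t, v, l))
    num = np.linalg.norm(np.cross(b, t))
    den = np.dot(l, b) * np.linalg.norm(np.cross(v, t))
    return -num / den

def height_scale(b_ref, t_ref, v, l, ref_height):
    """Fix the metric scale alpha from a reference object of known
    height (the single known physical measurement)."""
    return _ratio(b_ref, t_ref, v, l) / ref_height

def measure_height(b, t, v, l, alpha):
    """Height of an object with homogeneous image base b and top t."""
    return _ratio(b, t, v, l) / alpha
```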

76 citations


Book ChapterDOI
15 Sep 1999
TL;DR: This contribution focuses on calibration and 3D surface modeling from uncalibrated images, which are collected with a hand-held camera by simply waving the camera around the objects to be modeled.
Abstract: In this contribution we focus on calibration and 3D surface modeling from uncalibrated images. A large number of images from a scene is collected with a hand-held camera by simply waving the camera around the objects to be modeled. The images need not be taken in sequential order, thus either video streams or sets of still images may be processed. Since images are taken from all possible viewpoints and directions, we are effectively sampling the viewing sphere around the objects.

42 citations


Book ChapterDOI
01 Jan 1999
TL;DR: This paper contributes to the viewpoint and illumination independent recognition of planar color patterns such as labels, logos, signs, pictograms, etc. by means of moment invariants, introducing the idea of using powers of the intensities in the different color bands of a color image, and combinations thereof, for the construction of the moments.
Abstract: This paper contributes to the viewpoint and illumination independent recognition of planar color patterns such as labels, logos, signs, pictograms, etc. by means of moment invariants. It introduces the idea of using powers of the intensities in the different color bands of a color image, and combinations thereof, for the construction of the moments. First, a complete classification is made of all functions of such moments which are invariant under both affine deformations of the pattern (thus achieving viewpoint invariance) and linear changes of the intensity values in the individual color bands (hence coping with changes in the irradiance pattern due to different lighting conditions and/or viewpoints). The discriminant power and classification performance of these new invariants for color pattern recognition have been tested on a data set consisting of images of real outdoor advertising panels. Furthermore, a comparison is made to moment invariants presented in the literature ([1] and [2]) that come closest to the type of invariants aimed at here, and new approaches to improve their performance are presented.
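The moments in question can be written as generalized color moments, i.e. geometric moments of powers of the color bands, m_pq^{abc} = sum over pixels of x^p y^q R^a G^b B^c; the invariants are then particular combinations of such moments. A minimal sketch (normalizing coordinates to [0, 1] is our choice, not necessarily the paper's):

```python
import numpy as np

def color_moment(img: np.ndarray, p: int, q: int,
                 a: int, b: int, c: int) -> float:
    """Generalized color moment m_pq^{abc} of a float RGB patch:
    sum of x^p * y^q * R^a * G^b * B^c over all pixels, with image
    coordinates normalized to [0, 1]."""
    h, w = img.shape[:2]
    yy, xx = np.mgrid[0:h, 0:w]
    xx = xx / max(w - 1, 1)
    yy = yy / max(h - 1, 1)
    R, G, B = img[..., 0], img[..., 1], img[..., 2]
    return float(np.sum(xx**p * yy**q * R**a * G**b * B**c))
```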

40 citations


Proceedings ArticleDOI
01 Jan 1999
TL;DR: The generation of realistic 3D models for a virtual exhibition of the archaeological excavation site in Sagalassos, Turkey is demonstrated; the approach operates independently of object scale and requires only a single low-cost consumer photo or video camera.
Abstract: This paper addresses the problem of obtaining photo-realistic 3D models of a scene from images alone with a structure-from-motion approach. The 3D scene is observed from multiple viewpoints by freely moving a camera around the object. No restrictions on camera movement and internal camera parameters like zoom are imposed, as the camera pose and intrinsic parameters are calibrated from the sequence. The only restrictions on the scene content are the rigidity of the scene objects and opaque, piecewise smooth object surfaces. The approach operates independently of object scale and requires only a single low-cost consumer photo or video camera. The modeling system described here uses a three-step approach. First, the camera pose and intrinsic parameters are calibrated on-line by tracking salient feature points between the different views. Next, consecutive images of the sequence are treated as stereoscopic image pairs and dense correspondence maps are computed by area matching. Finally, dense and accurate depth maps are computed by linking together all correspondences over the viewpoints. The depth maps are converted to triangular surface meshes that are texture mapped for photo-realistic appearance. The resulting surface models are stored in VRML format for easy exchange and visualization. The feasibility of the approach has been tested extensively and will be illustrated on several real scenes. In particular we will demonstrate the generation of realistic 3D models for a virtual exhibition of the archaeological excavation site in Sagalassos, Turkey.
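As one concrete piece of the pipeline, the VRML output named in the abstract is straightforward to illustrate; a minimal exporter for an untextured triangle mesh (texture mapping omitted for brevity) could look like this:

```python
def write_vrml(path: str, vertices, faces) -> None:
    """Write a triangle mesh as a minimal VRML 2.0 IndexedFaceSet.
    vertices: iterable of (x, y, z); faces: iterable of vertex-index
    triples, each terminated by -1 as VRML requires."""
    with open(path, "w") as f:
        f.write("#VRML V2.0 utf8\n")
        f.write("Shape { geometry IndexedFaceSet {\n")
        f.write("  coord Coordinate { point [\n")
        for x, y, z in vertices:
            f.write(f"    {x} {y} {z},\n")
        f.write("  ] }\n  coordIndex [\n")
        for i, j, k in faces:
            f.write(f"    {i}, {j}, {k}, -1,\n")
        f.write("  ]\n} }\n")
```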

28 citations


Proceedings Article
01 Sep 1999
TL;DR: In this paper, a geometry-based structure-from-motion approach is proposed that computes camera calibration and local depth estimates for the lightfield data structure; the estimated geometry need not be globally consistent but is updated locally depending on the rendering viewpoint.
Abstract: Lightfield rendering allows fast visualization of complex scenes by view interpolation from images of densely spaced camera viewpoints. The lightfield data structure requires calibrated viewpoints, and rendering quality can be improved substantially when local scene depth is known for each viewpoint. In this contribution we propose to combine lightfield rendering with a geometry-based structure-from-motion approach that computes camera calibration and local depth estimates. The advantage of the combined approach w.r.t. a pure geometric structure recovery is that the estimated geometry need not be globally consistent but is updated locally depending on the rendering viewpoint. We concentrate on the viewpoint calibration that is computed directly from the image data by tracking image feature points. Ground-truth experiments on real lightfield sequences confirm the quality of calibration.
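The benefit of per-viewpoint depth is that samples can be transferred consistently into the rendering view. A generic sketch of that per-pixel transfer, assuming the convention X_cam = R * X_world + t (a textbook formulation, not necessarily the paper's):

```python
import numpy as np

def reproject(pix, depth, K_src, R_src, t_src, K_dst, R_dst, t_dst):
    """Transfer an image point from a source view into a target view
    using its local depth: back-project, map to world coordinates,
    project again."""
    x = np.array([pix[0], pix[1], 1.0])
    X_cam = depth * (np.linalg.inv(K_src) @ x)    # back-project
    X_world = R_src.T @ (X_cam - t_src)           # source cam -> world
    x_dst = K_dst @ (R_dst @ X_world + t_dst)     # world -> target image
    return x_dst[:2] / x_dst[2]
```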

12 citations


Book ChapterDOI
01 Jan 1999
TL;DR: Systems of coupled, non-linear diffusion equations are proposed as a computational tool for grouping; grouping tasks are divided into two classes, local and bilocal, each with a prototypical set of equations, and it is shown how different cues can be used for grouping given these two blueprints plus cue-specific specialisations.
Abstract: Systems of coupled, non-linear diffusion equations are proposed as a computational tool for grouping. Grouping tasks are divided into two classes - local and bilocal - and for each a prototypical set of equations is presented. It is shown how different cues can be used for grouping given these two blueprints plus cue-specific specialisations. Results are shown for intensity, texture orientation, stereo disparity, optical flow, mirror symmetry, and regular textures. The proposed equations are particularly well suited for parallel implementations. They also show some interesting analogies with basic architectural characteristics of the cortex.
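The abstract does not give the coupled systems themselves, but their building block is a nonlinear diffusion step on a field; a minimal single-field sketch with a Perona-Malik-type diffusivity and explicit time stepping (both our choices, not the paper's):

```python
import numpy as np

def diffusion_step(u: np.ndarray, dt: float = 0.1, k: float = 0.1) -> np.ndarray:
    """One explicit step of nonlinear diffusion du/dt = div(g grad u)
    with g(|grad u|) = 1 / (1 + |grad u|^2 / k^2). Coupling several
    such equations, as the paper proposes, would make g of one field
    depend on the others."""
    gy, gx = np.gradient(u)
    g = 1.0 / (1.0 + (gx**2 + gy**2) / k**2)
    div = np.gradient(g * gx, axis=1) + np.gradient(g * gy, axis=0)
    return u + dt * div
```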

9 citations


Proceedings ArticleDOI
28 Jun 1999
TL;DR: A research framework for information retrieval from remote sensing images is presented, based on a new concept developed at the German Aerospace Center, DLR Oberpfaffenhofen in collaboration with the Swiss Federal Institute of Technology, ETH Zurich, which emphasises user access to the intrinsic information in RS images.
Abstract: This article presents a research framework for information retrieval from remote sensing (RS) images, based on a new concept developed at the German Aerospace Center, DLR Oberpfaffenhofen in collaboration with the Swiss Federal Institute of Technology, ETH Zurich. The newly developed technology emphasises user access to the intrinsic information in RS images, and the dissemination of information rather than raw data. Users are thus enabled to make more robust decisions and to extend the range of applications.

Proceedings ArticleDOI
11 Nov 1999
TL;DR: In this paper, a tree-based clustering algorithm is proposed to automatically collect and label the necessary set of training samples for crops that are planted in rows, thus eliminating all user interaction and user knowledge.
Abstract: Recognizing crops and weeds online makes it possible to reduce the use of chemicals in agriculture. First, a sensor and classifier are proposed to measure and classify the plant reflectance online. However, as plant reflectance varies with unknown field-dependent plant stress factors, the classifier must be trained on each field separately in order to recognize crop and weeds accurately on that field. Collecting the samples manually requires user knowledge and time and is therefore economically not feasible. The proposed tree-based clustering algorithm automatically collects and labels the necessary set of training samples for crops that are planted in rows, thus eliminating all user interaction and user knowledge. The classifier, trained with the automatically collected and labeled training samples, is able to recognize crop and weeds with an accuracy of almost 94 percent. This results in acceptable weed hit rates and significant herbicide reductions; spot-spraying on the weeds only becomes economically feasible.
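The abstract omits the clustering details; as a much-simplified stand-in for the row-based auto-labeling idea, one can estimate the periodic row grid from plant positions and call plants near a row 'crop' and plants between rows 'weed' (everything below, names and thresholds included, is our assumption):

```python
import numpy as np

def label_by_rows(x_positions: np.ndarray, row_spacing: float,
                  tol: float = 0.1) -> np.ndarray:
    """Label detected plants as crop (True) or weed (False) from their
    position across the rows: plants lying close to the periodic row
    grid are taken as crop, off-row plants as weed. Positions and
    spacing share one unit (e.g. metres)."""
    phase = np.median(x_positions % row_spacing)      # row-grid offset
    resid = (x_positions - phase) % row_spacing
    dist = np.minimum(resid, row_spacing - resid)     # to nearest row
    return dist < tol * row_spacing
```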

Book ChapterDOI
01 Sep 1999
TL;DR: Two tourist guides are described: one supports a virtual tour through an archaeological site, the other a tour through a real exhibition, giving information about the paintings at the ongoing Van Dijck exhibition in the Antwerp museum of fine arts.
Abstract: Two tourist guides are described. One supports a virtual tour through an archaeological site, the other a tour through a real exhibition. The first system is based on the 3D reconstruction of the ancient city of Sagalassos. A virtual guide, represented by an animated mask, can be given commands using natural speech. Through its expressions the mask makes clear whether the questions have been understood, whether they make sense, etc. Its presence largely increases the intuitiveness of the interface. This system is described only very concisely. A second system is a palmtop assistant that gives information about the paintings at the ongoing Van Dijck exhibition in the Antwerp museum of fine arts. The system consists of a handheld PC with camera and Ethernet radio link. Images are taken of paintings or details thereof. The images are analysed by a server, which sends back information about the particular painting or the details. It gives visitors more autonomy in deciding in which order to look at pieces and how much information is required about each. The system is based on image database retrieval, where interest points are characterised by geometric/photometric invariants of their neighbourhoods.

Book ChapterDOI
01 Jan 1999
TL;DR: The aim of this paper is to assess the feasibility of extracting 3-dimensional information about man-made objects from very high resolution satellite imagery and to this end, a 3-dimensional, multi-view based paradigm is proposed.
Abstract: The aim of this paper is to assess the feasibility of extracting 3-dimensional information about man-made objects from very high resolution satellite imagery. To this end, a 3-dimensional, multi-view based paradigm is proposed. The underlying philosophy is to generate from the images reliable geometric 3D features, exploiting the multi-view geometric constraints for blunder correction. Scene analysis is performed by reasoning in 3D world space and verified in the images by means of a hypothesis generation and verification procedure in which the decisions are taken on the basis of a multi-view consensus. As an example of such an approach, a method for automatic modelling and 3D reconstruction of buildings is discussed, and the effects of the resolution of the new generation satellite data on the performance of the algorithm and the metric accuracy of the final reconstruction are investigated.
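The hypothesis generation and verification loop can be sketched generically; the support function and vote threshold below are hypothetical placeholders for the application-specific image evidence, not the paper's actual interface:

```python
def multi_view_consensus(hypothesis, cameras, images, support_fn,
                         min_votes: int) -> bool:
    """Accept a 3D feature hypothesis only if enough views confirm it.
    support_fn(hypothesis, camera, image) -> bool is the (hypothetical)
    per-view test, e.g. edge evidence along the projected feature."""
    votes = sum(1 for cam, img in zip(cameras, images)
                if support_fn(hypothesis, cam, img))
    return votes >= min_votes
```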

Book ChapterDOI
26 Mar 1999
TL;DR: Two examples are discussed: non-Euclidean 3D reconstruction from multiple, uncalibrated views, and scene description based on local, affinely invariant surface patches that can be extracted from single views.
Abstract: Object recognition, visual robot guidance, and several other vision applications require models of objects or scenes. Computer vision has a tradition of building these models from inherent object characteristics. The problem is that such characteristics are difficult to extract. Recently, a pure view-based object recognition approach was proposed that performs surprisingly well. It is based on a model that is extracted directly from raw image data. Limitations of both strands raise the question whether there is room for middle-ground solutions that combine the strengths but avoid the weaknesses. Two examples are discussed where in each case the only input required is images, but where nevertheless substantial feature extraction and analysis are involved. These are non-Euclidean 3D reconstruction from multiple, uncalibrated views and scene description based on local, affinely invariant surface patches that can be extracted from single views. Both models are useful for robot vision tasks such as visual navigation.

Book ChapterDOI
01 Sep 1999
TL;DR: In this paper, a scheme based on an adapted OLA (Optimal Level Allocation) video codec is shown, with which substantial data rate reductions can be achieved.
Abstract: As cheap and powerful 3D render engines become commonplace, demand for nearly realistic 3D scenes is increasing. Besides more detailed geometric and texture information, this presupposes the ability to map dynamic textures. This is obviously needed to model movies, computer and TV screens, but also, for example, the landscape as seen from inside a moving vehicle, or shadow and lighting effects that are not modeled separately. Downloading the complete scene to the user before letting him interact with the scene becomes impractical and inefficient with huge scenes; if the texture is not a canned sequence but a stream, it is altogether impossible. Often a back channel is available, which allows on-demand downloading so the user can start interacting with the scene immediately. This can save considerable amounts of bandwidth. Specifically for dynamic texture, if we know the viewpoint of the user (or several users), we can code the texture taking into account the viewing conditions, i.e. coding and transmitting each part of the texture with the required resolution only. Applications that would benefit from view-dependent coding of dynamic textures include (but are not limited to) multiplayer 3D games, walkthroughs of dynamic constructions or scenes, and 3D simulations of dynamic systems. In this paper, a scheme based on an adapted OLA (Optimal Level Allocation) video codec is shown, with which substantial data rate reductions can be achieved.
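The core of view-dependent coding is deciding, per texture region, the coarsest resolution level the current viewpoint tolerates. A generic sketch of that decision (this is not the OLA codec itself; the projection model and names are our assumptions):

```python
import math

def required_level(texel_size_world: float, distance: float,
                   focal_px: float, full_res_level: int = 0) -> int:
    """Coarsest usable resolution level for a texture region: a texel
    projects to roughly focal_px * texel_size_world / distance pixels,
    and every halving of the projected size permits one coarser level."""
    projected_px = focal_px * texel_size_world / max(distance, 1e-9)
    if projected_px >= 1.0:
        return full_res_level
    return full_res_level + int(math.floor(-math.log2(projected_px)))
```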