
Showing papers by "Luc Van Gool" published in 2007


Proceedings ArticleDOI
29 Jul 2007
TL;DR: Algorithms to automatically derive 3D models of high visual quality from single facade images of arbitrary resolutions are described to give rise to three exciting applications: urban reconstruction based on low resolution oblique aerial imagery, reconstruction of facades based on higher resolution ground-based imagery, and the automatic derivation of shape grammar rules from facade images to build a rule base for procedural modeling technology.
Abstract: This paper describes algorithms to automatically derive 3D models of high visual quality from single facade images of arbitrary resolutions. We combine the procedural modeling pipeline of shape grammars with image analysis to derive a meaningful hierarchical facade subdivision. Our system gives rise to three exciting applications: urban reconstruction based on low resolution oblique aerial imagery, reconstruction of facades based on higher resolution ground-based imagery, and the automatic derivation of shape grammar rules from facade images to build a rule base for procedural modeling technology.

490 citations
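
As a rough illustration of the hierarchical facade subdivision mentioned in the abstract above, the sketch below splits a facade rectangle into floors and each floor into tiles. The `Rect` type, the `split` helper, the labels, and the uniform split sizes are all assumptions made for illustration; the paper derives its subdivision from image analysis, which is not reproduced here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Rect:
    """Axis-aligned facade region with a (hypothetical) shape label."""
    label: str
    x: float
    y: float
    width: float
    height: float
    children: List["Rect"] = field(default_factory=list)


def split(rect: Rect, axis: str, count: int, label: str) -> List[Rect]:
    """Split rect into `count` equal parts along 'x' or 'y' and attach them as children."""
    if count < 1:
        raise ValueError("count must be at least 1")
    if axis not in ("x", "y"):
        raise ValueError("axis must be 'x' or 'y'")
    parts = []
    for i in range(count):
        if axis == "y":
            h = rect.height / count
            parts.append(Rect(label, rect.x, rect.y + i * h, rect.width, h))
        else:
            w = rect.width / count
            parts.append(Rect(label, rect.x + i * w, rect.y, w, rect.height))
    rect.children = parts
    return parts


def subdivide_facade(width: float, height: float, floors: int, tiles_per_floor: int) -> Rect:
    """Two-level split: facade -> floors (vertical) -> tiles (horizontal)."""
    facade = Rect("facade", 0.0, 0.0, width, height)
    for floor in split(facade, "y", floors, "floor"):
        split(floor, "x", tiles_per_floor, "tile")
    return facade


# Hypothetical 12 m x 9 m facade with 3 floors of 4 tiles each.
facade = subdivide_facade(width=12.0, height=9.0, floors=3, tiles_per_floor=4)
```

Each tile could in turn be split further (for example into wall and window regions), which is what makes the result a hierarchy rather than a flat grid.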


Journal ArticleDOI
TL;DR: In this paper, the authors present a system for autonomous mobile robot navigation with only an omnidirectional camera as sensor, which is able to build automatically and robustly accurate topologically organized environment maps of a complex, natural environment.
Abstract: In this work we present a novel system for autonomous mobile robot navigation. With only an omnidirectional camera as sensor, this system is able to build automatically and robustly accurate topologically organised environment maps of a complex, natural environment. It can localise itself using such a map at each moment, including both at startup (kidnapped robot) or using knowledge of former localisations. The topological nature of the map is similar to the intuitive maps humans use, is memory-efficient and enables fast and simple path planning towards a specified goal. We developed a real-time visual servoing technique to steer the system along the computed path. A key technology making this all possible is the novel fast wide baseline feature matching, which yields an efficient description of the scene, with a focus on man-made environments.

189 citations
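
The abstract above notes that a topological map enables fast and simple path planning. A minimal sketch of that idea, assuming the map is stored as an adjacency list of named places (the place names and the `plan_path` function are hypothetical, and breadth-first search is a stand-in rather than the planner used in the article):

```python
from collections import deque
from typing import Dict, List, Optional


def plan_path(topo_map: Dict[str, List[str]], start: str, goal: str) -> Optional[List[str]]:
    """Breadth-first search: return a path with the fewest hops, or None if unreachable."""
    if start not in topo_map or goal not in topo_map:
        return None
    parents: Dict[str, Optional[str]] = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the parent links back to the start, then reverse.
            path = []
            current: Optional[str] = node
            while current is not None:
                path.append(current)
                current = parents[current]
            return path[::-1]
        for neighbour in topo_map.get(node, []):
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None


# Hypothetical indoor map: nodes are places, edges are traversable links.
topo_map = {
    "hall": ["corridor"],
    "corridor": ["hall", "office", "lab"],
    "office": ["corridor"],
    "lab": ["corridor", "storage"],
    "storage": ["lab"],
}
route = plan_path(topo_map, "hall", "storage")
```

Because the graph only has one node per place, the search touches a handful of nodes even for a large environment, which is the memory and speed advantage the abstract alludes to.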


Proceedings ArticleDOI
26 Nov 2007
TL;DR: This paper details the process, based on archaeological data, to simulate ancient Pompeii life in real time, and describes the system pipeline, which allows for the simulation of thousands of Virtual Romans in real time.
Abstract: Pompeii was a Roman city, destroyed and completely buried during an eruption of the volcano Mount Vesuvius. We have revived its past by creating a 3D model of its previous appearance and populated it with crowds of Virtual Romans. In this paper, we detail the process, based on archaeological data, to simulate ancient Pompeii life in real time. In a first step, an annotated city model is generated using procedural modelling. These annotations contain semantic data, such as land usage, building age, and window/door labels. In a second phase, the semantics are automatically interpreted to populate the scene and trigger special behaviors in the crowd, depending on the location of the characters. Finally, we describe the system pipeline, which allows for the simulation of thousands of Virtual Romans in real time.

99 citations


Patent
09 Jul 2007
TL;DR: A novel endoscope, or optical large view endoscopic system, with improved depth perception is described; the multiple viewpoint system is particularly adapted for localising internal structures within a cavity or an enclosing structure, such as an animal body.
Abstract: The present invention relates to a novel endoscope or an optical large view endoscopic system with improved depth perception. In particular, a multiple viewpoint endoscope system comprising a multiple viewpoint camera setup and/or an intelligent or cognitive image control system and display device particularly adapted for localising internal structures within a cavity or an enclosing structure, such as an animal body, for instance the abdomen of an animal or human, or for localising a real or synthetic image of such internal structures within an overview image or on an overview 3D model.

63 citations


01 Jan 2007
TL;DR: Building facade reconstruction algorithms that process single images and exploit expectations about facade composition are discussed; they make heavy use of the repetitions that tend to occur, e.g. in windows and balconies.
Abstract: Interest in the automatic production of 3D building models has increased over the last years. The reconstruction of buildings, particularly their facades, is a hard subproblem, given the large variety in their appearances and structures. This paper discusses building facade reconstruction algorithms that process single images and exploit expectations about facade composition. In particular, we make heavy use of the repetitions that tend to occur, e.g. in windows and balconies. But this is only an example of the kind of rules found in recent architectural shape grammars. We distinguish between cases without and with substantial perspective effects in the input image. The focus is on the latter case, where also some depth layering in the facade can be performed automatically. We give several examples of real building reconstructions.

46 citations


Proceedings ArticleDOI
01 Jan 2007
TL;DR: This paper investigates several aspects of 3D-2D camera pose estimation, aimed at robot navigation in poorly-textured scenes, and proposes a fast, linear algorithm for the general case with six or more points, together with a specialisation to only four or five points, which is of utmost importance in a test and hypothesis framework.
Abstract: This paper investigates several aspects of 3D-2D camera pose estimation, aimed at robot navigation in poorly-textured scenes. The major contribution is a fast, linear algorithm for the general case with six or more points. We show how to specialise this to work with only four or five points, which is of utmost importance in a test and hypothesis framework. Our formulation allows for an easy inclusion of lines, as well as the handling of other camera geometries, such as stereo rigs. We also treat the special case of planar motion, a valid restriction for most indoor environments. We conclude the paper with extensive simulated tests and a real test case, which substantiate the algorithm’s usability for our application domain.

25 citations


01 Jan 2007
TL;DR: A novel approach is presented for the automatic creation of vegetation scenarios in real or virtual 3D cities, in order to simplify the complex design process and time-consuming modeling tasks in urban landscape planning; shape grammars are introduced as a practical tool for the rule-based generation of urban open spaces.
Abstract: This paper presents a novel approach for the automatic creation of vegetation scenarios in real or virtual 3D cities in order to simplify the complex design process and time consuming modeling tasks in urban landscape planning. We introduce shape grammars as a practical tool for the rule-based generation of urban open spaces. The automatically generated designs can be used for pre-visualization, master planning, guided design variation and digital content creation in general (e.g. for the entertainment industry). In a first step, we extend the CGA shape grammar by Muller et al. (2006) with urban planning operations. In a second step, we employ the possibilities of shape grammars to encode design patterns (Alexander et al., 1977). Therefore, we propose several examples of design patterns allowing for an intuitive high-level placement of objects common in urban open spaces (e.g. plants). Furthermore, arbitrary interactions between distinct instances of the vegetation and the urban environment can be encoded. With the resulting system, the designer can efficiently vegetate landscape and city parks, alleys, gardens, patios and even single buildings by applying the corresponding shape grammar rules. Our results demonstrate the procedural design process on two practical example scenarios, each one covering a different scale and different contexts of planning. The first example illustrates a derivation of the Garden of Versailles and the second example describes the usage of high-level rule sets to generate a suburbia model.

25 citations


Journal Article
TL;DR: In this article, a low-dimensional embedding of the pose manifolds using Locally Linear Embedding (LLE) is learned, as well as the statistical relationship between body poses and their image appearance.
Abstract: We present a method to simultaneously estimate 3d body pose and action categories from monocular video sequences. Our approach learns a low-dimensional embedding of the pose manifolds using Locally Linear Embedding (LLE), as well as the statistical relationship between body poses and their image appearance. In addition, the dynamics in these pose manifolds are modelled. Sparse kernel regressors capture the nonlinearities of these mappings efficiently. Body poses are inferred by a recursive Bayesian sampling algorithm with an activity-switching mechanism based on learned transfer functions. Using a rough foreground segmentation, we compare Binary PCA and distance transforms to encode the appearance. As a postprocessing step, the globally optimal trajectory through the entire sequence is estimated, yielding a single pose estimate per frame that is consistent throughout the sequence. We evaluate the algorithm on challenging sequences with subjects that are alternating between running and walking movements. Our experiments show how the dynamical model helps to track through poorly segmented low-resolution image sequences where tracking otherwise fails, while at the same time reliably classifying the activity type.

18 citations
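
The abstract above mentions a postprocessing step that estimates the globally optimal trajectory through the whole sequence, yielding one pose per frame. It does not say how this is computed; the sketch below uses Viterbi-style dynamic programming as one common way to do it. The scalar "poses", the cost values, and the `best_trajectory` function are hypothetical stand-ins.

```python
from typing import Callable, List, Sequence


def best_trajectory(
    candidates: Sequence[Sequence[float]],
    unary_cost: Sequence[Sequence[float]],
    transition_cost: Callable[[float, float], float],
) -> List[float]:
    """Pick one candidate per frame minimising the sum of unary and transition costs."""
    n_frames = len(candidates)
    if n_frames == 0:
        return []
    cost = [list(unary_cost[0])]                      # best cost of reaching each candidate
    back: List[List[int]] = [[-1] * len(candidates[0])]  # index of best predecessor
    for t in range(1, n_frames):
        row_cost, row_back = [], []
        for j, cand in enumerate(candidates[t]):
            best_i = min(
                range(len(candidates[t - 1])),
                key=lambda i: cost[t - 1][i] + transition_cost(candidates[t - 1][i], cand),
            )
            row_cost.append(
                cost[t - 1][best_i]
                + transition_cost(candidates[t - 1][best_i], cand)
                + unary_cost[t][j]
            )
            row_back.append(best_i)
        cost.append(row_cost)
        back.append(row_back)
    # Backtrack from the cheapest final candidate.
    j = min(range(len(cost[-1])), key=lambda k: cost[-1][k])
    path = []
    for t in range(n_frames - 1, -1, -1):
        path.append(candidates[t][j])
        j = back[t][j]
    return path[::-1]


# Hypothetical data: two candidate "poses" (scalars) per frame.
candidates = [[0.0, 5.0], [1.0, 6.0], [2.0, 9.0]]
unary = [[0.1, 0.0], [0.1, 0.0], [0.1, 0.0]]
trajectory = best_trajectory(candidates, unary, lambda a, b: abs(a - b))
```

In this toy input, picking the lowest unary cost in each frame independently would select the second candidate every time, but the smoother path through the first candidates has the lower total cost once transitions are counted. That is the sense in which a global optimum can differ from per-frame estimates.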


Book ChapterDOI
18 Nov 2007
TL;DR: A generative model of the relationship between body pose and image appearance is learned using a sparse kernel regressor, together with a prior model of likely body poses and a nonlinear dynamical model, making both pose and bounding box estimation more robust.
Abstract: We consider the problem of monocular 3d body pose tracking from video sequences. This task is inherently ambiguous. We propose to learn a generative model of the relationship of body pose and image appearance using a sparse kernel regressor. Within a particle filtering framework, the potentially multimodal posterior probability distributions can then be inferred. The 2d bounding box location of the person in the image is estimated along with its body pose. Body poses are modelled on a low-dimensional manifold, obtained by LLE dimensionality reduction. In addition to the appearance model, we learn a prior model of likely body poses and a nonlinear dynamical model, making both pose and bounding box estimation more robust. The approach is evaluated on a number of challenging video sequences, showing the ability of the approach to deal with low-resolution images and noise.

15 citations


Book ChapterDOI
20 Oct 2007
TL;DR: The dynamical model helps to track through poorly segmented low-resolution image sequences where tracking otherwise fails, while at the same time reliably classifying the activity type, and is evaluated on challenging sequences with subjects that are alternating between running and walking movements.
Abstract: We present a method to simultaneously estimate 3d body pose and action categories from monocular video sequences. Our approach learns a low-dimensional embedding of the pose manifolds using Locally Linear Embedding (LLE), as well as the statistical relationship between body poses and their image appearance. In addition, the dynamics in these pose manifolds are modelled. Sparse kernel regressors capture the nonlinearities of these mappings efficiently. Body poses are inferred by a recursive Bayesian sampling algorithm with an activity-switching mechanism based on learned transfer functions. Using a rough foreground segmentation, we compare Binary PCA and distance transforms to encode the appearance. As a postprocessing step, the globally optimal trajectory through the entire sequence is estimated, yielding a single pose estimate per frame that is consistent throughout the sequence. We evaluate the algorithm on challenging sequences with subjects that are alternating between running and walking movements. Our experiments show how the dynamical model helps to track through poorly segmented low-resolution image sequences where tracking otherwise fails, while at the same time reliably classifying the activity type.

15 citations


Journal ArticleDOI
TL;DR: An algorithm for efficient depth calculation and view synthesis is presented; it applies a min-cut/max-flow algorithm on a graph, implemented on the CPU, to ameliorate the result by a global optimisation.

Journal Article
TL;DR: A generative model of the relationship of body pose and image appearance using a sparse kernel regressor and a nonlinear dynamical model is proposed, making both pose and bounding box estimation more robust.
Abstract: We consider the problem of monocular 3d body pose tracking from video sequences. This task is inherently ambiguous. We propose to learn a generative model of the relationship of body pose and image appearance using a sparse kernel regressor. Within a particle filtering framework, the potentially multimodal posterior probability distributions can then be inferred. The 2d bounding box location of the person in the image is estimated along with its body pose. Body poses are modelled on a low-dimensional manifold, obtained by LLE dimensionality reduction. In addition to the appearance model, we learn a prior model of likely body poses and a nonlinear dynamical model, making both pose and bounding box estimation more robust. The approach is evaluated on a number of challenging video sequences, showing the ability of the approach to deal with low-resolution images and noise.

01 Jan 2007
TL;DR: Using the LIDAR data as reference, this paper provides an evaluation procedure for the two major parts of the 3D model building pipeline: camera self-calibration and dense multi-view depth estimation.
Abstract: In this paper we want to start the discussion on whether image based 3-D modelling techniques and especially multi-view stereo can possibly be used to replace LIDAR systems for outdoor 3D data acquisition. Two main issues have to be addressed in this context: (i) camera self-calibration and (ii) dense multi-view depth estimation. To investigate both, we have acquired test data from outdoor scenes with LIDAR and cameras. Using the LIDAR data as reference we provide an evaluation procedure to these two major parts of the 3D model building pipeline. The test images are available for the community as benchmark data.

Journal ArticleDOI
TL;DR: This method is able to build absolute scale 3D without the need of a known baseline length, traditionally acquired by odometry, and uses the ground plane assumption together with the camera system's height to determine the scale factor.
Abstract: We propose a method for computing the absolute distances to static obstacles using a single omnidirectional camera. The method is applied to mobile robots. We achieve this without restricting the application to predetermined translations or the use of artificial markers. In contrast to prior work, our method is able to build absolute scale 3D without the need of a known baseline length, traditionally acquired by odometry. Instead we use the ground plane assumption together with the camera system's height to determine the scale factor. Using only one omnidirectional camera our method is cheaper, more informative and more compact than the traditional methods for distance determination, especially when a robot is already equipped with a camera for e.g. navigation. It also provides more information since it determines distances in a 3D space instead of in one plane. The experiments show promising results. The algorithm is indeed capable of determining the distances in meters to features and obstacles and is able to locate all major obstacles in the scene.
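
The scale recovery described in the abstract above can be illustrated with a small sketch. It assumes a reconstruction that is correct up to an unknown scale, with the camera at the origin and the z axis pointing up, so that ground points have negative z. The function names, the averaging step, and the sample coordinates are hypothetical and are not taken from the article.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def scale_from_ground_plane(ground_points: Sequence[Point], camera_height_m: float) -> float:
    """Scale factor = known camera height / reconstructed camera-to-ground distance."""
    if not ground_points:
        raise ValueError("need at least one ground point")
    # Assumption: z is vertical, so -z is the distance of a ground point below the camera.
    depth_below = sum(-p[2] for p in ground_points) / len(ground_points)
    if depth_below <= 0:
        raise ValueError("ground points must lie below the camera (negative z)")
    return camera_height_m / depth_below


def metric_distance(point: Point, scale: float) -> float:
    """Distance in metres from the camera to a reconstructed point."""
    return scale * math.sqrt(sum(c * c for c in point))


# Hypothetical up-to-scale reconstruction: ground plane at z = -0.5 units,
# camera mounted 1.0 m above the floor.
ground = [(1.0, 0.0, -0.5), (0.0, 2.0, -0.5), (-1.5, 1.0, -0.5)]
scale = scale_from_ground_plane(ground, camera_height_m=1.0)
obstacle_distance = metric_distance((3.0, 4.0, 0.0), scale)
```

With these numbers one reconstruction unit corresponds to 2 m, so a point 5 units from the camera is reported at 10 m. A practical system would fit a plane robustly rather than average z values.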