
Showing papers by "Luc Van Gool published in 2008"


Journal ArticleDOI
TL;DR: A novel scale- and rotation-invariant detector and descriptor, coined SURF (Speeded-Up Robust Features), which approximates or even outperforms previously proposed schemes with respect to repeatability, distinctiveness, and robustness, yet can be computed and compared much faster.
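For readers who want to try SURF directly, here is a minimal detection-and-matching sketch. It assumes an OpenCV build that includes the non-free xfeatures2d contrib module (SURF is patented and missing from default wheels), and the image file names are placeholders.

```python
# Minimal SURF keypoint detection and matching sketch.
# Assumes opencv-contrib-python with the non-free xfeatures2d module enabled.
import cv2

img1 = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)    # placeholder inputs
img2 = cv2.imread("object.jpg", cv2.IMREAD_GRAYSCALE)

surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
kp1, des1 = surf.detectAndCompute(img1, None)
kp2, des2 = surf.detectAndCompute(img2, None)

# Match the 64-D descriptors with Lowe's ratio test.
matcher = cv2.BFMatcher(cv2.NORM_L2)
good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
        if m.distance < 0.7 * n.distance]
print(f"{len(kp1)}/{len(kp2)} keypoints, {len(good)} ratio-test matches")
```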

12,449 citations


Book ChapterDOI
12 Oct 2008
TL;DR: In this article, the Hessian scale-invariant saliency measure is used to detect spatio-temporal interest points that are at the same time scale invariant and densely cover the video content.
Abstract: Over the years, several spatio-temporal interest point detectors have been proposed. While some detectors can only extract a sparse set of scale-invariant features, others allow for the detection of a larger amount of features at user-defined scales. This paper presents for the first time spatio-temporal interest points that are at the same time scale-invariant (both spatially and temporally) and densely cover the video content. Moreover, as opposed to earlier work, the features can be computed efficiently. Applying scale-space theory, we show that this can be achieved by using the determinant of the Hessian as the saliency measure. Computations are speeded-up further through the use of approximative box-filter operations on an integral video structure. A quantitative evaluation and experimental results on action recognition show the strengths of the proposed detector in terms of repeatability, accuracy and speed, in comparison with previously proposed detectors.
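As a rough illustration of the saliency measure (not the paper's box-filter approximation), the sketch below computes the determinant of the spatio-temporal Hessian densely over a toy video volume using exact Gaussian derivatives from SciPy; scale selection and non-maximum suppression are omitted.

```python
# Determinant-of-the-Hessian saliency over a (T, H, W) video volume at one
# (spatial, temporal) scale pair. The paper speeds this up with box filters
# on an integral video; here we use exact Gaussian derivatives instead.
import numpy as np
from scipy.ndimage import gaussian_filter

def hessian_saliency(video, sigma_s=2.0, sigma_t=2.0):
    """video: (T, H, W) float array; returns per-voxel |det(Hessian)|."""
    sig = (sigma_t, sigma_s, sigma_s)
    d = {}
    for i in range(3):                       # second derivatives along (t, y, x)
        for j in range(i, 3):
            order = [0, 0, 0]
            order[i] += 1
            order[j] += 1
            d[(i, j)] = gaussian_filter(video, sig, order=order)
    det = (d[(0,0)] * (d[(1,1)] * d[(2,2)] - d[(1,2)]**2)
           - d[(0,1)] * (d[(0,1)] * d[(2,2)] - d[(1,2)] * d[(0,2)])
           + d[(0,2)] * (d[(0,1)] * d[(1,2)] - d[(1,1)] * d[(0,2)]))
    return np.abs(det)

saliency = hessian_saliency(np.random.rand(32, 64, 64))  # toy volume
print(saliency.shape)
```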

759 citations


Proceedings ArticleDOI
07 Jul 2008
TL;DR: This paper describes an approach for mining images of objects from community photo collections in an unsupervised fashion, and demonstrates this approach on several urban areas, densely covering an area of over 700 square kilometers and mining over 200,000 photos, making it probably the largest experiment of its kind to date.
Abstract: In this paper, we describe an approach for mining images of objects (such as touristic sights) from community photo collections in an unsupervised fashion. Our approach relies on retrieving geotagged photos from photo-sharing websites using a grid of geospatial tiles. The downloaded photos are clustered into potentially interesting entities through a processing pipeline of several modalities, including visual, textual and spatial proximity. The resulting clusters are analyzed and automatically classified into objects and events. Using mining techniques, we then find text labels for these clusters, which are used to assign each cluster to a corresponding Wikipedia article in a fully unsupervised manner. A final verification step uses the contents (including images) of the selected Wikipedia article to verify the cluster-article assignment. We demonstrate this approach on several urban areas, densely covering an area of over 700 square kilometers and mining over 200,000 photos, making it probably the largest experiment of its kind to date.
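A toy version of the geospatial tile grid used for retrieval might look as follows; the tile size and coordinates are invented, and the actual photo-API calls are omitted.

```python
# Split a bounding box into fixed-size tiles whose centres can be sent as
# location queries to a geotagged-photo API (the API calls are omitted here).
def make_tiles(lat_min, lat_max, lon_min, lon_max, step_deg=0.01):
    tiles = []
    lat = lat_min
    while lat < lat_max:
        lon = lon_min
        while lon < lon_max:
            tiles.append((lat + step_deg / 2, lon + step_deg / 2))  # tile centre
            lon += step_deg
        lat += step_deg
    return tiles

# Roughly central Zurich; 0.01 deg is about 1 km in latitude.
print(len(make_tiles(47.35, 47.40, 8.50, 8.57)))
```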

295 citations


Journal ArticleDOI
TL;DR: A novel city modeling framework that creates 3D content at high speed and integrates an object recognition module that automatically detects cars in the input video streams and localizes them in 3D.
Abstract: Supplying realistically textured 3D city models at ground level promises to be useful for pre-visualizing upcoming traffic situations in car navigation systems. Because this pre-visualization can be rendered from the expected future viewpoints of the driver, the required maneuver will be more easily understandable. 3D city models can be reconstructed from the imagery recorded by surveying vehicles. The vastness of image material gathered by these vehicles, however, puts extreme demands on vision algorithms to ensure their practical usability. Algorithms need to be as fast as possible and should result in compact, memory efficient 3D city models for future ease of distribution and visualization. For the considered application, these are not contradictory demands. Simplified geometry assumptions can speed up vision algorithms while automatically guaranteeing compact geometry models. In this paper, we present a novel city modeling framework which builds upon this philosophy to create 3D content at high speed. Objects in the environment, such as cars and pedestrians, may however disturb the reconstruction, as they violate the simplified geometry assumptions, leading to visually unpleasant artifacts and degrading the visual realism of the resulting 3D city model. Unfortunately, such objects are prevalent in urban scenes. We therefore extend the reconstruction framework by integrating it with an object recognition module that automatically detects cars in the input video streams and localizes them in 3D. The two components of our system are tightly integrated and benefit from each other's continuous input. 3D reconstruction delivers geometric scene context, which greatly helps improve detection precision. The detected car locations, on the other hand, are used to instantiate virtual placeholder models which augment the visual realism of the reconstructed city model.

256 citations


Journal ArticleDOI
TL;DR: This work constructs a biologically plausible hierarchy of neural detectors, which can discriminate seven basic emotional states from static views of associated body poses, and is evaluated against human test subjects on a recent set of stimuli manufactured for research on emotional body language.

164 citations


Journal Article
TL;DR: In this article, the authors address the problem of 3D articulated multi-person tracking in busy street scenes from a moving, human-level observer and propose a two-stage strategy.
Abstract: In this paper, we address the problem of 3D articulated multi-person tracking in busy street scenes from a moving, human-level observer. In order to handle the complexity of multi-person interactions, we propose to pursue a two-stage strategy. A multi-body detection-based tracker first analyzes the scene and recovers individual pedestrian trajectories, bridging sensor gaps and resolving temporary occlusions. A specialized articulated tracker is then applied to each recovered pedestrian trajectory in parallel to estimate the tracked person's precise body pose over time. This articulated tracker is implemented in a Gaussian Process framework and operates on global pedestrian silhouettes using a learned statistical representation of human body dynamics. We interface the two tracking levels through a guided segmentation stage, which combines traditional bottom-up cues with top-down information from a human detector and the articulated tracker's shape prediction. We show the proposed approach's viability and demonstrate its performance for articulated multi-person tracking on several challenging video sequences of a busy inner-city scenario.
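As a loose illustration of the articulated stage, the sketch below regresses pose parameters from silhouette descriptors with a Gaussian Process using scikit-learn; the data are random stand-ins, and the paper's learned model of human body dynamics over time is omitted.

```python
# Toy stand-in for the articulated stage: Gaussian Process regression from a
# silhouette descriptor to body-pose parameters. The paper's tracker also
# encodes learned body dynamics, which this static regression omits.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
X_train = rng.random((200, 50))      # 200 silhouette descriptors (toy data)
Y_train = rng.random((200, 10))      # corresponding 10-D pose vectors (toy)

gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-3)
gp.fit(X_train, Y_train)

pose_mean, pose_std = gp.predict(rng.random((1, 50)), return_std=True)
print(pose_mean.shape, pose_std.shape)   # predicted pose and its uncertainty
```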

54 citations


Book ChapterDOI
26 Mar 2008
TL;DR: A system which allows users to request information on physical objects by taking a picture of them, using a mobile phone with an integrated camera, and which identifies an object from a query image through multiple recognition stages, including local visual features, global geometry, and optionally also metadata such as GPS location.
Abstract: We present a system which allows users to request information on physical objects by taking a picture of them. This way, using a mobile phone with an integrated camera, users can interact with objects or "things" in a very simple manner. A further advantage is that the objects themselves don't have to be tagged with any kind of markers. At the core of our system lies an object recognition method, which identifies an object from a query image through multiple recognition stages, including local visual features, global geometry, and optionally also metadata such as GPS location. We present two applications for our system, namely a slide-tagging application for presentation screens in smart meeting rooms and a city guide on a mobile phone. Both systems are fully functional, including an application on the mobile phone which allows simple point-and-shoot interaction with objects. Experiments evaluate the performance of our approach in both application scenarios and show good recognition results under challenging conditions.
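A much-simplified sketch of such a multi-stage pipeline is shown below, using ORB features as a freely available stand-in and a RANSAC homography as the global-geometry check; an optional GPS filter would simply pre-select which database images to match against.

```python
# Sketch of a multi-stage matching pipeline in the spirit of the paper:
# local features first, then a global-geometry check via RANSAC homography.
import cv2
import numpy as np

def recognize(query_path, db_path, min_inliers=15):
    q = cv2.imread(query_path, cv2.IMREAD_GRAYSCALE)
    d = cv2.imread(db_path, cv2.IMREAD_GRAYSCALE)
    orb = cv2.ORB_create(1000)
    kq, dq = orb.detectAndCompute(q, None)
    kd, dd = orb.detectAndCompute(d, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(dq, dd)
    if len(matches) < 4:                     # homography needs 4 points
        return False
    src = np.float32([kq[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kd[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    # Accept only if enough matches survive the geometric verification.
    return mask is not None and int(mask.sum()) >= min_inliers
```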

49 citations


Book ChapterDOI
12 Oct 2008
TL;DR: This paper addresses the problem of 3D articulated multi-person tracking in busy street scenes from a moving, human-level observer and proposes to pursue a two-stage strategy, which combines traditional bottom-up cues with top-down information from a human detector and the articulated tracker's shape prediction.
Abstract: In this paper, we address the problem of 3D articulated multi-person tracking in busy street scenes from a moving, human-level observer. In order to handle the complexity of multi-person interactions, we propose to pursue a two-stage strategy. A multi-body detection-based tracker first analyzes the scene and recovers individual pedestrian trajectories, bridging sensor gaps and resolving temporary occlusions. A specialized articulated tracker is then applied to each recovered pedestrian trajectory in parallel to estimate the tracked person's precise body pose over time. This articulated tracker is implemented in a Gaussian Process framework and operates on global pedestrian silhouettes using a learned statistical representation of human body dynamics. We interface the two tracking levels through a guided segmentation stage, which combines traditional bottom-up cues with top-down information from a human detector and the articulated tracker's shape prediction. We show the proposed approach's viability and demonstrate its performance for articulated multi-person tracking on several challenging video sequences of a busy inner-city scenario.

43 citations


Proceedings ArticleDOI
30 Oct 2008
TL;DR: A new method for robust content-based video copy detection based on local spatio-temporal features which, as shown by experimental validation, brings additional robustness and discriminativity to the task of video footage reuse detection in news broadcasts.
Abstract: In this paper, we present a new method for robust content-based video copy detection based on local spatio-temporal features. As we show by experimental validation, the use of local spatio-temporal features instead of purely spatial ones brings additional robustness and discriminativity. Efficient operation is ensured by using the new spatio-temporal features proposed in [20]. To cope with the high dimensionality of the resulting descriptors, these features are incorporated in a disk-based index and query system based on p-stable locality sensitive hashing. The system is applied to the task of video footage reuse detection in news broadcasts. Results are reported on 88 hours of news broadcast data from the TRECVID 2006 dataset.
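The p-stable LSH idea can be sketched in a few lines: project a descriptor onto random Gaussian directions and quantize, so that nearby vectors share hash keys with high probability under the L2 norm. The dimensions and bucket width below are illustrative assumptions, not the paper's configuration.

```python
# Minimal p-stable LSH: each hash is floor((a.v + b) / w) with Gaussian a.
import numpy as np
from collections import defaultdict

class PStableLSH:
    def __init__(self, dim, n_hashes=8, w=4.0, seed=0):
        rng = np.random.default_rng(seed)
        self.a = rng.normal(size=(n_hashes, dim))  # 2-stable (Gaussian) projections
        self.b = rng.uniform(0, w, size=n_hashes)
        self.w = w
        self.table = defaultdict(list)

    def _key(self, v):
        return tuple(np.floor((self.a @ v + self.b) / self.w).astype(int))

    def insert(self, v, label):
        self.table[self._key(v)].append(label)

    def query(self, v):
        return self.table[self._key(v)]

lsh = PStableLSH(dim=64)
rng = np.random.default_rng(1)
x = rng.normal(size=64)
lsh.insert(x, "clip_0042")
print(lsh.query(x + 0.01 * rng.normal(size=64)))   # likely ['clip_0042']
```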

39 citations


Proceedings ArticleDOI
01 Jan 2008
TL;DR: This work presents an online learning approach for robustly combining unreliable observations from a pedestrian detector to estimate the rough 3D scene geometry from video sequences of a static camera, based on an entropy modelling framework.
Abstract: We present an online learning approach for robustly combining unreliable observations from a pedestrian detector to estimate the rough 3D scene geometry from video sequences of a static camera. Our approach is based on an entropy modelling framework, which allows us to simultaneously adapt the detector parameters such that the expected information gain about the scene structure is maximised. As a result, our approach automatically restricts the detector scale range for each image region as the estimation results become more confident, thus improving detector run-time and limiting false positives.

21 citations


Proceedings ArticleDOI
01 Jan 2008
TL;DR: A novel multi-class object detector is proposed that optimizes detection costs while retaining a desired detection rate, using a cascade that unites the handling of similar object classes while separating off classes at appropriate levels of the cascade.
Abstract: We propose a novel multi-class object detector that optimizes the detection costs while retaining a desired detection rate. The detector uses a cascade that unites the handling of similar object classes while separating off classes at appropriate levels of the cascade. No prior knowledge about the relationship between classes is needed, as the classifier structure is automatically determined during the training phase. The detection nodes in the cascade use Haar wavelet features and Gentle AdaBoost; however, the approach is not dependent on the specific features used and can easily be extended to other cases. Experiments are presented for several numbers of object classes, and the approach is compared to other classification schemes. The results demonstrate a large efficiency gain that is particularly prominent for a greater number of classes. The complexity of training also scales well with the number of classes.
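For the weak-learner side, a minimal Gentle AdaBoost with regression stumps could look as follows; Haar feature extraction and the automatic cascade construction are omitted, and all parameters are illustrative.

```python
# Gentle AdaBoost with regression stumps on labels y in {-1, +1}.
import numpy as np

def fit_stump(X, y, w):
    """Weighted least-squares regression stump; returns (feat, thr, left, right)."""
    best = None
    for f in range(X.shape[1]):
        for thr in np.unique(X[:, f]):
            m = X[:, f] <= thr
            if not m.any() or m.all():
                continue
            left = np.average(y[m], weights=w[m])
            right = np.average(y[~m], weights=w[~m])
            err = np.sum(w * (y - np.where(m, left, right)) ** 2)
            if best is None or err < best[0]:
                best = (err, f, thr, left, right)
    return best[1:]

def gentle_adaboost(X, y, rounds=20):
    w = np.full(len(y), 1.0 / len(y))
    stumps = []
    for _ in range(rounds):
        f, thr, left, right = fit_stump(X, y, w)
        pred = np.where(X[:, f] <= thr, left, right)
        w *= np.exp(-y * pred)              # Gentle AdaBoost weight update
        w /= w.sum()
        stumps.append((f, thr, left, right))
    return stumps

def predict(stumps, X):
    F = sum(np.where(X[:, f] <= t, l, r) for f, t, l, r in stumps)
    return np.sign(F)
```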

Book ChapterDOI
01 Jan 2008
TL;DR: This chapter describes a method to automatically build topological maps for robot navigation out of a sequence of visual observations taken from a camera mounted on the robot based on Dempster-Shafer probability theory.
Abstract: This chapter describes a method to automatically build topological maps for robot navigation out of a sequence of visual observations taken from a camera mounted on the robot. This direct, non-metrical approach relies completely on the detection of loop closings, i.e. repeated visitations of one particular place. In natural environments, visual loop closing can be very hard, for two reasons. Firstly, one place can look different at different time instances due to illumination changes and viewpoint differences. Secondly, there can be different places that look alike, i.e. the environment is self-similar. Here we propose a method that combines state-of-the-art visual comparison techniques with evidence collection based on Dempster-Shafer theory to tackle this problem.
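The evidence-collection step can be illustrated with Dempster's rule of combination on a two-hypothesis frame {same, diff}, where "same" means two observations come from the same place; the mass values below are invented.

```python
# Dempster's rule of combination for a two-hypothesis frame {same, diff}.
# Mass may also sit on the full frame Theta, expressing ignorance.
def combine(m1, m2):
    """m = {'same': .., 'diff': .., 'theta': ..}; returns the fused mass."""
    conflict = m1['same'] * m2['diff'] + m1['diff'] * m2['same']
    k = 1.0 - conflict                       # normalisation constant
    return {
        'same': (m1['same'] * m2['same'] + m1['same'] * m2['theta']
                 + m1['theta'] * m2['same']) / k,
        'diff': (m1['diff'] * m2['diff'] + m1['diff'] * m2['theta']
                 + m1['theta'] * m2['diff']) / k,
        'theta': m1['theta'] * m2['theta'] / k,
    }

# Two weak visual-similarity cues jointly yield a strong loop-closure belief.
m = combine({'same': 0.6, 'diff': 0.1, 'theta': 0.3},
            {'same': 0.5, 'diff': 0.2, 'theta': 0.3})
print(m)
```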

Journal ArticleDOI
TL;DR: This work presents a 3-D measurement technique capable of optically measuring microchip devices using a camera-projector system, and improves the dynamic range of the imaging system through the use of a set of gray-code and phase-shift measurements with different CCD integration times.
Abstract: The industry dealing with microchip inspection requires fast, flexible, repeatable, and stable 3-D measuring systems. The typical devices used for this purpose are coordinate measurement machines (CMMs). These systems have limitations such as high cost, low measurement speed, and a small quantity of measured 3-D points. Optical techniques are now beginning to replace the typical touch probes because of their noncontact nature, their full-field measurement capability, their high measurement density, as well as their low cost and high measurement speed. However, typical properties of microchip devices, including a strongly spatially varying reflectance, make the direct use of classical optical 3-D measurement techniques impossible. We present a 3-D measurement technique capable of optically measuring these devices using a camera-projector system. The proposed method improves the dynamic range of the imaging system through the use of a set of gray-code (GC) and phase-shift (PS) measurements with different CCD integration times. A set of extended-range GC and PS images is obtained and used to acquire a dense 3-D measurement of the object. We measured the 3-D shape of an integrated circuit and obtained satisfactory results.
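The two ingredients can be sketched as follows: four-step phase-shift decoding, and a per-pixel selection of the longest unsaturated CCD integration time. Gray-code unwrapping and calibration are omitted, and the saturation threshold is an assumption.

```python
# (i) 4-step phase-shift decoding; (ii) extended dynamic range by keeping,
# per pixel, the longest integration time that does not saturate.
import numpy as np

def phase_from_shifts(I0, I1, I2, I3):
    """Wrapped phase from four patterns shifted by 90 degrees each."""
    return np.arctan2(I3 - I1, I0 - I2)

def fuse_exposures(stacks, saturation=250):
    """stacks: list of (I0..I3) tuples, ordered short -> long integration time.
    Per pixel, use the longest exposure in which no pattern saturates."""
    phase = phase_from_shifts(*stacks[0])
    for imgs in stacks[1:]:
        ok = np.max(np.stack(imgs), axis=0) < saturation   # unsaturated pixels
        phase = np.where(ok, phase_from_shifts(*imgs), phase)
    return phase
```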

Proceedings ArticleDOI
25 Jun 2008
TL;DR: A system that is able to recognize objects of a certain class in an image and to identify their parts for potential interactions is presented, demonstrated for object instances that have never been observed during training, and under partial occlusion and against cluttered backgrounds.
Abstract: In the transition from industrial to service robotics, robots will have to deal with increasingly unpredictable and variable environments. We present a system that is able to recognize objects of a certain class in an image and to identify their parts for potential interactions. This is demonstrated for object instances that have never been observed during training, and under partial occlusion and against cluttered backgrounds. Our approach builds on the Implicit Shape Model of Leibe and Schiele, and extends it to couple recognition to the provision of meta-data useful for a task. Meta-data can for example consist of part labels or depth estimates. We present experimental results on wheelchairs and cars.
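A toy flavour of the meta-data transfer is sketched below: codebook matches vote for the object centre, and the votes that support the winning centre carry their part labels back into the image. The codebook entries are invented, and real ISM matching, scales and vote weighting are omitted.

```python
# Hough-style voting with meta-data transfer, heavily simplified.
import numpy as np
from collections import Counter

# Hypothetical activated codebook entries: each stores an offset to the
# object centre and the part label seen during training.
votes = [
    {"pos": (40, 30), "offset": (10, 20), "part": "armrest"},
    {"pos": (60, 35), "offset": (-10, 15), "part": "wheel"},
    {"pos": (52, 48), "offset": (-2, 2), "part": "seat"},
]

centre_votes = [tuple(np.add(v["pos"], v["offset"])) for v in votes]
centre = Counter(centre_votes).most_common(1)[0][0]      # crude vote mode
# Back-project: features whose vote landed on the winning centre transfer
# their part labels back to their image locations.
support = [v for v, c in zip(votes, centre_votes) if c == centre]
print(centre, [v["part"] for v in support])
```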

01 Jan 2008
TL;DR: Instead of a monolithic application, this work proposes a viewer architecture that builds upon a module concept and a scripting language, permitting the design, with reasonable effort, of non-trivial interaction components for the exploration and inspection of individual models as well as complex 3D scenes.
Abstract: The presentation of CH artefacts is technically demanding because it has to meet a variety of requirements: a plethora of file formats, compatibility with numerous application scenarios from powerwall to web browser, sustainability and long-term availability, extensibility with respect to digital model representations, and last but not least good usability. Instead of a monolithic application, we propose a viewer architecture that builds upon a module concept and a scripting language. This permits the design, with reasonable effort, of non-trivial interaction components for the exploration and inspection of individual models as well as of complex 3D scenes. Furthermore, some specific CH models are discussed in more detail.
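A minimal module-plus-scripting skeleton in this spirit might look as follows; the module names and the trivial "script" format are invented for illustration.

```python
# Modules register themselves under a name; a tiny script (here just a list
# of commands) wires them into an interaction component.
class Viewer:
    def __init__(self):
        self.modules = {}

    def register(self, name, fn):
        self.modules[name] = fn

    def run_script(self, script):
        for name, args in script:            # a "script" = sequence of calls
            self.modules[name](*args)

viewer = Viewer()
viewer.register("load", lambda path: print(f"loading model {path}"))
viewer.register("orbit", lambda deg: print(f"orbiting camera {deg} deg"))
viewer.run_script([("load", ("statue.ply",)), ("orbit", (45,))])
```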

Proceedings Article
01 Jan 2008
TL;DR: In this article, a two-stage procedure is proposed to estimate the scene geometry and an overcomplete set of object detections, and then address object-object interactions, tracking and prediction in a second step.
Abstract: In this paper, we address the problem of multi-person tracking in busy pedestrian zones, using a stereo rig mounted on a mobile platform. The complexity of the problem calls for an integrated solution, which extracts as much visual information as possible and combines it through cognitive feedback. We propose such an approach, which jointly estimates camera position, stereo depth, object detection, and tracking. We model the interplay between these components using a graphical model. Since the model has to incorporate object-object interactions, and temporal links to past frames, direct inference is intractable. We therefore propose a two-stage procedure: for each frame we first solve a simplified version of the model (disregarding interactions and temporal continuity) to estimate the scene geometry and an overcomplete set of object detections. Conditioned on these results, we then address object interactions, tracking, and prediction in a second step. The approach is experimentally evaluated on several long and difficult video sequences from busy inner-city locations. Our results show that the proposed integration makes it possible to deliver stable tracking performance in scenes of realistic complexity.


Journal ArticleDOI
TL;DR: A system prototype for self-determination and privacy enhancement in video surveilled areas by integrating computer vision and cryptographic techniques into networked building automation systems is presented.
Abstract: We present a system prototype for self-determination and privacy enhancement in video surveilled areas by integrating computer vision and cryptographic techniques into networked building automation systems. This paper describes a system prototype and research work that has been conducted by an interdisciplinary team of researchers. People appearing in a video stream control their visibility on a per-viewer basis and can choose to allow either the real view or an obscured image to be seen. The parts of the video stream containing a person's image are protected by an AES cipher and can be sent over untrusted networks. This paper presents experimental results with the example of a meeting room scenario. We conclude with remarks on the system's usability and on the problems encountered.
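The protection step can be sketched with PyCryptodome: the pixels inside a person's bounding box are encrypted with AES so that only key holders can restore the real view. The paper does not specify this exact cipher mode or library; both are assumptions here.

```python
# Encrypt a region of interest in place; AES-CTR preserves the byte length,
# so the ciphertext can be displayed as an obscured, reversible region.
import numpy as np
from Crypto.Cipher import AES
from Crypto.Random import get_random_bytes

key = get_random_bytes(16)                   # shared with authorised viewers

def protect_roi(frame, box, key):
    x0, y0, x1, y1 = box
    roi = frame[y0:y1, x0:x1]
    cipher = AES.new(key, AES.MODE_CTR)
    ct = cipher.encrypt(roi.tobytes())
    frame[y0:y1, x0:x1] = np.frombuffer(ct, np.uint8).reshape(roi.shape)
    return cipher.nonce                      # needed for later decryption

frame = np.random.randint(0, 256, (120, 160, 3), dtype=np.uint8)
nonce = protect_roi(frame, (40, 20, 100, 100), key)
```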

01 Jan 2008
TL;DR: A complete free-software pipeline for the 3D digital acquisition of Cultural Heritage assets based on standard photographic equipment, using Arc3D and MeshLab, is presented.
Abstract: The paper presents a complete free-software pipeline for the 3D digital acquisition of Cultural Heritage assets based on standard photographic equipment. The presented solution makes use of two main tools: Arc3D and MeshLab. Arc3D is a web-based reconstruction service that uses computer vision techniques, based on the automatic matching of image features, to compute a depth map for each photo. MeshLab is a tool that allows these range maps to be imported and processed in order to obtain a ready-to-use 3D model. Through the combined use of these two systems it is possible to digitally acquire CH artifacts and monuments in an affordable way.

Book ChapterDOI
10 Jun 2008
TL;DR: In an evaluation on two standard datasets, the method outperforms the state-of-the-art, confirming that the combination of form and motion improves recognition.
Abstract: We present a method for human action recognition from video, which exploits both form (local shape) and motion (local flow). Inspired by models of the human visual system, the two feature sets are processed independently in separate channels. The form channel extracts a dense local shape representation from every frame, while the motion channel extracts dense optic flow from the frame and its immediate predecessor. The same processing pipeline is applied in both channels: feature maps are pooled locally, down-sampled, and compared to a collection of learnt templates, yielding a vector of similarity scores. In a final step, the two score vectors are merged, and recognition is performed with a discriminative classifier. In an evaluation on two standard datasets our method outperforms the state-of-the-art, confirming that the combination of form and motion improves recognition.
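A rough two-channel skeleton is sketched below, with HOG as a stand-in for the shape features and Farneback optical flow for the motion channel; the paper's biologically inspired features and template learning are not reproduced.

```python
# Shape channel per frame, flow channel over frame pairs, each reduced to
# similarity scores against learnt templates, then merged for a classifier.
# Frames are expected as grayscale uint8 images.
import cv2
import numpy as np

hog = cv2.HOGDescriptor()                       # stand-in shape features

def form_features(frame):
    return hog.compute(cv2.resize(frame, (64, 128))).ravel()

def motion_features(prev, frame):
    flow = cv2.calcOpticalFlowFarneback(prev, frame, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    return cv2.resize(flow, (16, 16)).ravel()   # pooled, down-sampled flow

def score_vector(feat, templates):
    """Similarity of one channel's features to each learnt template."""
    return -np.linalg.norm(np.stack(templates) - feat, axis=1)

# Per frame: concatenate the two channels' score vectors and feed them to a
# discriminative classifier (e.g. an SVM) for the final action label.
```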

Proceedings Article
31 Oct 2008
TL;DR: The 1st ACM Workshop on Analysis and Retrieval of Events, Actions and Workflows in Video Streams as discussed by the authors was held this year in Vancouver, Canada, with 16 papers that cover a variety of topics.
Abstract: It is our great pleasure to welcome you to the 1st ACM Workshop on Analysis and Retrieval of Events, Actions and Workflows in Video Streams -- ACM AREA 2008, which is held this year in Vancouver, Canada. The mission of this workshop is to present current research advances in the area of cognitive video supervision and the analysis of events, actions and workflows, a critical research task for many real-life multimedia applications. ACM AREA 2008 gives researchers a unique opportunity to share their perspectives with colleagues interested in the various aspects of video supervision and event analysis. The call for papers attracted submissions from Asia, Europe and the United States. The program committee accepted 16 papers that cover a variety of topics. These papers have been organized in four sessions. More specifically, the first session is dedicated to new algorithms and methods for object tracking in complex environments as well as to object labeling/matching techniques. The second session presents new research outcomes in the area of detecting events, actions and workflows in video sequences. Event-driven analysis of videos is presented in the third session, while the workshop ends with a special session that covers recent advances in ongoing research projects in this field. We hope that these proceedings will serve as a valuable reference for event detection and retrieval in video streams.

01 Jan 2008
TL;DR: This paper deals with the computation of a true orthographic image given a set of overlapping perspective images by using a Bayesian approach and defining a generative model of the input images.
Abstract: Orthographic images constitute an efficient and economic way to represent aerial images. This kind of information makes it possible to measure two-dimensional objects and relate them to Geographic Information Systems. This paper deals with the computation of a true orthographic image given a set of overlapping perspective images. These are, together with the internal and external calibration, the only input to our approach. These few requirements are a large advantage over systems where a digital surface model (DSM), e.g. provided by LIDAR data, is necessary. We use a Bayesian approach and define a generative model of the input images. In this model, the input images are regarded as noisy measurements of an underlying true and hence unknown orthoimage. These measurements are obtained by an image formation process (the generative model) that involves, apart from the true orthoimage, several additional parameters. Our goal is to invert the image formation process by estimating those parameters which make our input images most likely. We present results on aerial images of a complex urban environment.
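In a highly simplified form, the generative view reduces to per-pixel fusion: each photo warped into the ortho frame is a noisy measurement of the unknown orthoimage, and under i.i.d. Gaussian noise the maximum-likelihood estimate is the per-pixel mean (a median being a robust variant). The sketch below assumes the warping has already been done and ignores the additional formation parameters the paper estimates.

```python
# Per-pixel fusion of photos already warped into the ortho frame.
import numpy as np

def fuse_orthoimage(projected, robust=True):
    """projected: (N, H, W) stack of warped images, with NaN where a photo
    does not cover a pixel; returns a per-pixel median (or mean) estimate."""
    fuse = np.nanmedian if robust else np.nanmean
    return fuse(projected, axis=0)

stack = np.random.rand(5, 100, 100)
stack[0, :50] = np.nan                       # first photo covers only half
ortho = fuse_orthoimage(stack)
```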

Proceedings ArticleDOI
31 Oct 2008
TL;DR: A monocular object tracker, able to detect and track multiple object classes in non-controlled environments, using Bayesian per-pixel classification to segment an image into foreground and background objects, based on observations of object appearances and motions in real-time.
Abstract: This paper describes a monocular object tracker, able to detect and track multiple object classes in non-controlled environments. Our tracking framework uses Bayesian per-pixel classification to segment an image into foreground and background objects, based on observations of object appearances and motions in real time. Furthermore, semantically high-level events are automatically extracted from the tracking data for performance evaluation. The reliability of the event detection is demonstrated by applying it to state-of-the-art methods and comparing the results to human-annotated ground-truth data for multiple public datasets.
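The per-pixel decision can be illustrated with grey-level likelihood histograms and a foreground prior; the histograms and prior below are toy assumptions.

```python
# Per-pixel Bayesian foreground/background posterior from colour (here
# grey-level) likelihood histograms and a prior.
import numpy as np

def pixel_posterior(frame, fg_hist, bg_hist, prior_fg=0.3):
    """frame: (H, W) uint8; *_hist: 256-bin normalised likelihoods."""
    p_fg = fg_hist[frame] * prior_fg
    p_bg = bg_hist[frame] * (1.0 - prior_fg)
    return p_fg / (p_fg + p_bg + 1e-12)      # posterior foreground map

fg = np.ones(256) / 256                      # toy: flat foreground model
bg = np.zeros(256); bg[:128] = 1 / 128       # background favours dark pixels
frame = np.random.randint(0, 256, (120, 160), dtype=np.uint8)
mask = pixel_posterior(frame, fg, bg) > 0.5
```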


Proceedings ArticleDOI
26 Oct 2008
TL;DR: This workshop consists of 16 high-quality papers organized in four thematic sessions, dedicated to new object tracking algorithms for complex environments, object labeling techniques, and the detection of high-level semantics in video sequences.
Abstract: AREA 2008 is the first ACM international workshop on analysis and retrieval of events, actions and workflows in video streams. Such research is nowadays critical for many real-life applications, such as area supervision, semantic characterization and annotation of video streams, quality assurance, and security. This workshop consists of 16 high-quality papers organized in four thematic sessions. More specifically, the first session is dedicated to new object tracking algorithms for complex environments and to object labeling techniques. The second session deals with methods, tools and architectures for detecting high-level semantics (events, actions, and workflows) in video sequences. The third session presents new algorithms for analyzing video sequences, oriented towards detecting human actions or implicitly annotating multimedia content. Finally, the fourth is a special session on the recent advances of ongoing research projects in the fields of multimedia analysis, cognitive video supervision, personalized video annotation, fast retrieval of multimedia content in the compressed domain, and scheduling tools for interactive multimedia services. We hope that these proceedings will serve as a valuable reference for the analysis of events in video streams.

Book ChapterDOI
01 Jun 2008
TL;DR: In this paper, the authors present a system for autonomous mobile robot navigation that, with only an omnidirectional camera as sensor, is able to automatically and robustly build accurate, topologically organized environment maps of a complex, natural environment.
Abstract: In this work we present a novel system for autonomous mobile robot navigation. With only an omnidirectional camera as sensor, this system is able to automatically and robustly build accurate, topologically organised environment maps of a complex, natural environment. It can localise itself using that map at any moment, both at startup (the kidnapped-robot problem) and using knowledge of former localisations. The topological nature of the map is similar to the intuitive maps humans use, is memory-efficient, and enables fast and simple path planning towards a specified goal. We developed a real-time visual servoing technique to steer the system along the computed path. The key technology making this all possible is the novel fast wide-baseline feature matching, which yields an efficient description of the scene, with a focus on man-made environments.
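A toy topological map reduces to a graph whose nodes are places and whose edges come from traversals and detected loop closings; path planning is then plain shortest path, here with networkx. The place names are invented.

```python
# Topological map as a graph: loop closings add edges, planning is a
# shortest-path query.
import networkx as nx

G = nx.Graph()
G.add_edges_from([("hall", "corridor"), ("corridor", "lab"),
                  ("corridor", "stairs"), ("stairs", "exit")])
# A detected loop closing between two visually matching places adds an edge.
G.add_edge("lab", "exit")

print(nx.shortest_path(G, "hall", "exit"))   # e.g. hall -> corridor -> lab -> exit
```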

Proceedings ArticleDOI
20 Oct 2008
TL;DR: An audio-visual platform has been constructed with the goal of helping users with disabilities or a high cognitive load to deal with unexpected events, and algorithmic approaches to the detection of such events have been developed.
Abstract: It is of prime importance in everyday human life to cope with and respond appropriately to events that are not foreseen by prior experience. Machines to a large extent lack the ability to respond appropriately to such inputs. An important class of unexpected events is defined by incongruent combinations of inputs from different modalities, and therefore multimodal information provides a crucial cue for the identification of such events, e.g., the sound of a voice is heard while the person in the field of view does not move her lips. In the project DIRAC ("Detection and Identification of Rare Audio-visual Cues") we have been developing algorithmic approaches to the detection of such events, as well as an experimental hardware platform to test them. An audio-visual platform ("AWEAR" - audio-visual wearable device) has been constructed with the goal of helping users with disabilities or a high cognitive load to deal with unexpected events. Key hardware components include stereo panoramic vision sensors and 6-channel worn-behind-the-ear (hearing aid) microphone arrays. Data have been recorded to study audio-visual tracking, a/v scene/object classification and a/v detection of incongruencies.
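The lip-motion example can be turned into a toy detector: flag time windows where speech energy is high while the visible lip region barely moves. Both signals and thresholds below are invented for illustration.

```python
# Toy audio-visual incongruence test over per-window scores in [0, 1].
import numpy as np

def incongruent_windows(audio_energy, lip_motion,
                        e_thresh=0.5, m_thresh=0.1):
    """Return indices where audio says 'speech' but the lips barely move."""
    audio_energy = np.asarray(audio_energy)
    lip_motion = np.asarray(lip_motion)
    return np.where((audio_energy > e_thresh) & (lip_motion < m_thresh))[0]

print(incongruent_windows([0.1, 0.8, 0.9, 0.2],
                          [0.2, 0.05, 0.3, 0.0]))   # -> [1]
```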

01 Jan 2008
TL;DR: In this paper, a unified approach for managing cultural heritage information is proposed to handle the storage and exchange of data, and an implementation demonstrates its use in two different cultural heritage applications in the context of large-scale projects.
Abstract: This paper explores the infrastructure needs of cultural heritage applications. Small dedicated applications as well as very large projects are considered. A unified approach for managing cultural heritage information is proposed to handle the storage and exchange of data. An implementation demonstrates its use in two different cultural heritage applications.

