Author

Martin Spengler

Bio: Martin Spengler is an academic researcher from ETH Zurich. The author has contributed to research on the topics Robustness (computer science) and Video tracking. The author has an h-index of 6 and has co-authored 9 publications receiving 362 citations.

Papers
Journal ArticleDOI
01 Apr 2003
TL;DR: It is argued that the principles of sensor and model integration can increase the robustness of today's computer-vision systems substantially and, as an example, multi-cue tracking of faces is discussed.
Abstract: Even though many of today's vision algorithms are very successful, they lack robustness, since they are typically tailored to a particular situation. In this paper, we argue that the principles of sensor and model integration can increase the robustness of today's computer-vision systems substantially. As an example, multi-cue tracking of faces is discussed. The approach is based on the principles of self-organization of the integration mechanism and self-adaptation of the cue models during tracking. Experiments show that the robustness of simple models is leveraged significantly by sensor and model integration.

173 citations

Book ChapterDOI
07 Jul 2001
TL;DR: It is argued that the principles of sensor and model integration can increase the robustness of today's computer vision systems substantially and multi-cue tracking of faces is discussed.
Abstract: Even though many of today's vision algorithms are very successful, they lack robustness since they are typically limited to a particular situation. In this paper we argue that the principles of sensor and model integration can increase the robustness of today's computer vision systems substantially. As an example multi-cue tracking of faces is discussed. The approach is based on the principles of self-organization of the integration mechanism and self-adaptation of the cue models during tracking. Experiments show that the robustness of simple models is leveraged significantly by sensor and model integration.

115 citations

01 Jan 2003
TL;DR: A tracking/surveillance system that supports a human operator by automatically detecting abandoned objects and drawing the operator’s attention to such events and the system provides the appropriate key frames for interpreting the incident.
Abstract: In recent years, visual surveillance has gained importance in security, law enforcement and military applications. Due to the wide use of video surveillance systems, the amount of data that has to be monitored and interpreted has increased enormously. Nowadays, the pure mass of information that has to be handled by the operators of such systems has overgrown their capabilities. It is therefore crucial to support human operators with (semi-)automatic surveillance systems which notify their supervisors in case of an incident potentially relevant to security. In this paper, we present a tracking/surveillance system that supports a human operator by automatically detecting abandoned objects and drawing the operator’s attention to such events. It consists of two major parts: A Bayesian multi-people tracker that explains as much of the scene as possible and a blob-based object detection system that identifies abandoned objects using the unexplained image parts. If a potentially abandoned object is detected, the operator is notified and the system provides the appropriate key frames for interpreting the incident.

42 citations

Journal ArticleDOI
TL;DR: A novel method for automatically choosing the object model which best fits the current context based on information-theoretic concepts is introduced and integrated into a multi-cue face tracking system and experimentally evaluated.

13 citations

01 Jul 2001
TL;DR: In this article, a method for automatically choosing the object model which best fits the current context based on information-theoretic concepts is proposed to increase robustness of visual tracking.
Abstract: A major challenge for real-world object tracking is the dynamic nature of the environmental conditions with respect to illumination, motion, visibility, etc. For such an environment which may experience drastic changes at any time, integration of multiple and complementary cues promises to increase robustness of visual tracking. Nevertheless, one has to expect that false positive tracking will occur. In order to be able to recover from such tracking failure this paper introduces a novel method for automatically choosing the object model which best fits the current context based on information-theoretic concepts. In order to validate the effectiveness of the proposed model switching, it is integrated into a multi-cue face tracking system and experimentally evaluated.

7 citations


Cited by
Book ChapterDOI
28 May 2002
TL;DR: This work introduces a new Monte Carlo tracking technique based on the same principle of color histogram distance, but within a probabilistic framework, and introduces the following ingredients: multi-part color modeling to capture a rough spatial layout ignored by global histograms, incorporation of a background color model when relevant, and extension to multiple objects.
Abstract: Color-based trackers recently proposed in [3,4,5] have been proved robust and versatile for a modest computational cost. They are especially appealing for tracking tasks where the spatial structure of the tracked objects exhibits such a dramatic variability that trackers based on a space-dependent appearance reference would break down very fast. Trackers in [3,4,5] rely on the deterministic search of a window whose color content matches a reference histogram color model. Relying on the same principle of color histogram distance, but within a probabilistic framework, we introduce a new Monte Carlo tracking technique. The use of a particle filter allows us to better handle color clutter in the background, as well as complete occlusion of the tracked entities over a few frames. This probabilistic approach is very flexible and can be extended in a number of useful ways. In particular, we introduce the following ingredients: multi-part color modeling to capture a rough spatial layout ignored by global histograms, incorporation of a background color model when relevant, and extension to multiple objects.

1,549 citations
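
As a rough illustration of the "color histogram distance within a probabilistic framework" idea in the abstract above, the sketch below weights candidate windows by how closely their histogram matches a reference histogram. It is not taken from the paper: the use of the Bhattacharyya distance, the exponential likelihood, the single-channel histograms, the parameter `lam`, and all function names are assumptions made for this example.

```python
import math


def normalized_histogram(pixels, bins=4, max_value=256):
    """Histogram of integer intensities in [0, max_value), normalized to sum to 1.

    Assumes `pixels` is non-empty. A real tracker would use color (e.g. 3-D) bins.
    """
    counts = [0] * bins
    for value in pixels:
        counts[value * bins // max_value] += 1
    total = sum(counts)
    return [c / total for c in counts]


def bhattacharyya_distance(p, q):
    """Distance in [0, 1] between two normalized histograms (assumed metric)."""
    coefficient = sum(math.sqrt(a * b) for a, b in zip(p, q))
    # Clamp to guard against tiny negative values from floating-point rounding.
    return math.sqrt(max(0.0, 1.0 - coefficient))


def particle_weights(reference, candidates, lam=20.0):
    """Normalized weights: candidates close to the reference get more weight.

    `lam` is a hypothetical sharpness parameter, not a value from the source.
    """
    raw = [
        math.exp(-lam * bhattacharyya_distance(reference, hist) ** 2)
        for hist in candidates
    ]
    total = sum(raw)
    return [w / total for w in raw]


reference = normalized_histogram([10, 20, 200, 220])
matching = normalized_histogram([15, 25, 210, 230])      # same bins as reference
mismatched = normalized_histogram([100, 120, 130, 140])  # disjoint bins
weights = particle_weights(reference, [matching, mismatched])
print(weights)
```

In a particle filter, each candidate histogram would come from the image window predicted by one particle, and the weights would drive the resampling step.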

Journal ArticleDOI
TL;DR: An overview of the current state of the art of pedestrian detection from both methodological and experimental perspectives is provided and a clear advantage of HOG/linSVM at higher image resolutions and lower processing speeds is indicated.
Abstract: Pedestrian detection is a rapidly evolving area in computer vision with key applications in intelligent vehicles, surveillance, and advanced robotics. The objective of this paper is to provide an overview of the current state of the art from both methodological and experimental perspectives. The first part of the paper consists of a survey. We cover the main components of a pedestrian detection system and the underlying models. The second (and larger) part of the paper contains a corresponding experimental study. We consider a diverse set of state-of-the-art systems: wavelet-based AdaBoost cascade, HOG/linSVM, NN/LRF, and combined shape-texture detection. Experiments are performed on an extensive data set captured onboard a vehicle driving through urban environment. The data set includes many thousands of training samples as well as a 27-minute test sequence involving more than 20,000 images with annotated pedestrian locations. We consider a generic evaluation setting and one specific to pedestrian detection onboard a vehicle. Results indicate a clear advantage of HOG/linSVM at higher image resolutions and lower processing speeds, and a superiority of the wavelet-based AdaBoost cascade approach at lower image resolutions and (near) real-time processing speeds. The data set (8.5 GB) is made public for benchmarking purposes.

1,263 citations

Journal ArticleDOI
30 Sep 2008
TL;DR: This paper presents a survey on crowd analysis methods employed in computer vision research and discusses perspectives from other research disciplines and how they can contribute to the computer vision approach.
Abstract: In the year 1999 the world population reached 6 billion, doubling the previous census estimate of 1960. Recently, the United States Census Bureau issued a revised forecast for world population showing a projected growth to 9.4 billion by 2050 (US Census Bureau, http://www.census.gov/ipc/www/worldpop.html). Different research disciplines have studied the crowd phenomenon and its dynamics from a social, psychological and computational standpoint respectively. This paper presents a survey on crowd analysis methods employed in computer vision research and discusses perspectives from other research disciplines and how they can contribute to the computer vision approach.

584 citations

Journal ArticleDOI
08 Nov 2004
TL;DR: Generic importance sampling mechanisms for data fusion are introduced and it is shown how each of the three cues can be modeled by an appropriate data likelihood function, and how the intermittent cues are best handled by generating proposal distributions from their likelihood functions.
Abstract: The effectiveness of probabilistic tracking of objects in image sequences has been revolutionized by the development of particle filtering. Whereas Kalman filters are restricted to Gaussian distributions, particle filters can propagate more general distributions, albeit only approximately. This is of particular benefit in visual tracking because of the inherent ambiguity of the visual world that stems from its richness and complexity. One important advantage of the particle filtering framework is that it allows the information from different measurement sources to be fused in a principled manner. Although this fact has been acknowledged before, it has not been fully exploited within a visual tracking context. Here we introduce generic importance sampling mechanisms for data fusion and discuss them for fusing color with either stereo sound, for teleconferencing, or with motion, for surveillance with a still camera. We show how each of the three cues can be modeled by an appropriate data likelihood function, and how the intermittent cues (sound or motion) are best handled by generating proposal distributions from their likelihood functions. Finally, the effective fusion of the cues by particle filtering is demonstrated on real teleconference and surveillance data.

561 citations

Journal ArticleDOI
01 Mar 2012
TL;DR: This paper provides an overview of MHI-based human motion recognition techniques and applications and points out some areas for further research based on the MHI method and its variants.
Abstract: The motion history image (MHI) approach is a view-based temporal template method which is simple but robust in representing movements and is widely employed by various research groups for action recognition, motion analysis and other related applications. In this paper, we provide an overview of MHI-based human motion recognition techniques and applications. Since the inception of the MHI template for motion representation, various approaches have been adopted to improve this basic MHI technique. We present all important variants of the MHI method. This paper points out some areas for further research based on the MHI method and its variants.

292 citations
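
For readers unfamiliar with the motion history image mentioned in the abstract above, the following is a minimal sketch of the basic update rule as it is commonly described: pixels where motion is detected are set to a maximum value, and all other pixels decay by one per frame, so brighter values mark more recent motion. The frame-differencing motion test, the `threshold` and `tau` values, and the function name `update_mhi` are assumptions for illustration, not details from the survey.

```python
def update_mhi(mhi, prev_frame, frame, tau=5, threshold=30):
    """Return the next motion history image.

    mhi, prev_frame and frame are equally sized 2-D lists of integers.
    A pixel counts as "moving" when its intensity changed by at least
    `threshold` between the two frames (a simple, assumed motion test).
    """
    new_mhi = []
    for mhi_row, prev_row, cur_row in zip(mhi, prev_frame, frame):
        new_row = []
        for history, previous, current in zip(mhi_row, prev_row, cur_row):
            if abs(current - previous) >= threshold:
                new_row.append(tau)                  # fresh motion: reset to maximum
            else:
                new_row.append(max(0, history - 1))  # no motion: decay toward zero
        new_mhi.append(new_row)
    return new_mhi


# A bright pixel moving left to right across a 1x3 image.
frames = [
    [[0, 0, 0]],
    [[255, 0, 0]],
    [[0, 255, 0]],
    [[0, 0, 255]],
]
mhi = [[0, 0, 0]]
for prev, cur in zip(frames, frames[1:]):
    mhi = update_mhi(mhi, prev, cur, tau=5)
print(mhi)
```

The leftmost pixel ends with a lower value than the other two because motion there stopped one frame earlier, which is how the template encodes the direction and recency of movement.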