Book Chapter

A multiview approach to tracking people in crowded scenes using a planar homography constraint

07 May 2006 - pp. 133-146
TL;DR: In this paper, a multi-view approach is presented to track people in crowded scenes where people may be partially or completely occluding each other, by using multiple views in synergy so that information from all views is combined to detect objects.
Abstract: Occlusion and lack of visibility in dense crowded scenes make it very difficult to track individual people correctly and consistently. This problem is particularly hard to tackle in single-camera systems. We present a multi-view approach to tracking people in crowded scenes where people may be partially or completely occluding each other. Our approach is to use multiple views in synergy so that information from all views is combined to detect objects. To achieve this, we present a novel planar homography constraint to resolve occlusions and robustly determine locations on the ground plane corresponding to the feet of the people. To find tracks, we obtain feet regions over a window of frames and stack them, creating a space-time volume. Feet regions belonging to the same person form contiguous spatio-temporal regions that are clustered using a graph cuts segmentation approach. Each cluster is the track of a person, and a slice in time of this cluster gives the tracked location. Experimental results are shown in scenes of dense crowds where severe occlusions are quite common. The algorithm is able to accurately track people in all views, maintaining correct correspondences across views. Our algorithm is ideally suited for conditions where occlusions between people would seriously hamper tracking performance or where there simply are not enough features to distinguish between different people.
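As a concrete illustration of the homography constraint described in the abstract, the sketch below warps per-view binary foreground masks onto a common ground-plane reference and keeps only the pixels supported by all views; under the planar constraint those are the ground-plane locations where feet project consistently. This is a minimal reading of the idea, not the authors' implementation: the pre-computed homographies, the agreement threshold, and the use of OpenCV warping are assumptions for illustration.

```python
import numpy as np
import cv2


def ground_plane_feet_map(fg_masks, homographies, ref_shape, min_views=None):
    """Sketch of the planar homography constraint (illustrative, not the paper's code).

    fg_masks     : list of binary foreground masks (uint8, 0/255), one per view.
    homographies : list of 3x3 matrices mapping each view onto a common
                   reference ground plane (assumed pre-calibrated).
    ref_shape    : (height, width) of the reference ground-plane image.
    Returns a binary map whose "on" pixels are ground-plane locations (feet)
    supported by at least `min_views` views.
    """
    h, w = ref_shape
    if min_views is None:
        min_views = len(fg_masks)  # require agreement from every view

    votes = np.zeros((h, w), dtype=np.uint16)
    for mask, H in zip(fg_masks, homographies):
        warped = cv2.warpPerspective(mask, H, (w, h), flags=cv2.INTER_NEAREST)
        votes += (warped > 0).astype(np.uint16)

    # Pixels on the ground plane fall inside the foreground of every view;
    # pixels belonging to occluding upper bodies do not warp consistently.
    return (votes >= min_views).astype(np.uint8) * 255
```

Stacking these per-frame maps over a temporal window yields the space-time volume described above, whose contiguous regions are then clustered (the paper uses graph cuts) to recover one track per person.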


Citations
Journal Article
TL;DR: This survey reviews recent trends in video-based human capture and analysis, and discusses open problems for future research toward automatic visual analysis of human movement.

2,738 citations

Proceedings Article
17 Jun 2006
TL;DR: This paper proposes a novel on-line AdaBoost feature selection method and demonstrates the versatility of the method on such diverse tasks as learning complex background models, visual tracking and object detection.
Abstract: Boosting has become very popular in computer vision, showing impressive performance in detection and recognition tasks. Mainly off-line training methods have been used, which implies that all training data has to be given a priori; training and usage of the classifier are separate steps. Training the classifier on-line and incrementally as new data becomes available has several advantages and opens new areas of application for boosting in computer vision. In this paper we propose a novel on-line AdaBoost feature selection method. In conjunction with efficient feature extraction methods the method is real-time capable. We demonstrate the versatility of the method on such diverse tasks as learning complex background models, visual tracking and object detection. All approaches benefit significantly from the on-line training.
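To make the on-line boosting idea concrete, here is a toy sketch in the spirit of Oza and Russell's online AdaBoost, where the ensemble is updated one sample at a time by Poisson-weighted training of the weak learners; the cited paper builds its feature-selection mechanism on top of updates of this kind. The class, its weak-learner interface, and all constants are illustrative assumptions, not the paper's implementation.

```python
import numpy as np


class OnlineAdaBoost:
    """Toy online AdaBoost (Oza/Russell style) updated sample-by-sample.

    Weak learners must expose .update(x, y) and .predict(x) -> {-1, +1}.
    """

    def __init__(self, weak_learners):
        self.learners = weak_learners
        self.sc = np.zeros(len(weak_learners))  # weighted correct counts
        self.sw = np.zeros(len(weak_learners))  # weighted wrong counts

    def update(self, x, y, rng=np.random):
        lam = 1.0  # importance weight of the current sample
        for m, h in enumerate(self.learners):
            # Train the weak learner k ~ Poisson(lam) times on this sample.
            for _ in range(rng.poisson(lam)):
                h.update(x, y)
            if h.predict(x) == y:
                self.sc[m] += lam
                lam *= (self.sc[m] + self.sw[m]) / (2.0 * self.sc[m])
            else:
                self.sw[m] += lam
                lam *= (self.sc[m] + self.sw[m]) / (2.0 * self.sw[m])

    def predict(self, x):
        eps = 1e-9
        err = self.sw / (self.sc + self.sw + eps)
        alpha = np.log((1.0 - err + eps) / (err + eps))
        score = sum(a * h.predict(x) for a, h in zip(alpha, self.learners))
        return 1 if score >= 0 else -1
```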

1,159 citations

Journal Article
TL;DR: This paper reviews the recent development of relevant technologies from the perspectives of computer vision and pattern recognition, and discusses how to face emerging challenges of intelligent multi-camera video surveillance.

695 citations

Journal Article
TL;DR: This paper proposes a novel approach for multiperson tracking-by-detection in a particle filtering framework that detects and tracks a large number of dynamically moving people in complex scenes with occlusions, requires no camera or ground plane calibration, and only makes use of information from the past.
Abstract: In this paper, we address the problem of automatically detecting and tracking a variable number of persons in complex scenes using a monocular, potentially moving, uncalibrated camera. We propose a novel approach for multiperson tracking-by-detection in a particle filtering framework. In addition to final high-confidence detections, our algorithm uses the continuous confidence of pedestrian detectors and online-trained, instance-specific classifiers as a graded observation model. Thus, generic object category knowledge is complemented by instance-specific information. The main contribution of this paper is to explore how these unreliable information sources can be used for robust multiperson tracking. The algorithm detects and tracks a large number of dynamically moving people in complex scenes with occlusions, does not rely on background modeling, requires no camera or ground plane calibration, and only makes use of information from the past. Hence, it imposes very few restrictions and is suitable for online applications. Our experiments show that the method yields good tracking performance in a large variety of highly dynamic scenarios, such as typical surveillance videos, webcam footage, or sports sequences. We demonstrate that our algorithm outperforms other methods that rely on additional information. Furthermore, we analyze the influence of different algorithm components on the robustness.
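A minimal sketch of the observation-weighting idea from this abstract: one bootstrap particle-filter step in which a detector confidence map acts as the graded observation model. The function name, constant-position motion model, and noise values are illustrative assumptions, not the paper's formulation.

```python
import numpy as np


def particle_filter_step(particles, weights, confidence_map, motion_std=5.0, rng=None):
    """One bootstrap particle-filter step for a single tracked person.

    particles      : (N, 2) array of image positions (x, y).
    weights        : (N,) normalized particle weights.
    confidence_map : 2D array of detector confidence over the image.
    """
    rng = rng or np.random.default_rng()
    n = len(particles)

    # 1. Resample particles in proportion to their current weights.
    idx = rng.choice(n, size=n, p=weights)
    particles = particles[idx]

    # 2. Predict: constant-position motion model with Gaussian noise.
    particles = particles + rng.normal(0.0, motion_std, size=particles.shape)

    # 3. Re-weight each particle by the detector confidence at its location.
    h, w = confidence_map.shape
    xs = np.clip(particles[:, 0].astype(int), 0, w - 1)
    ys = np.clip(particles[:, 1].astype(int), 0, h - 1)
    weights = confidence_map[ys, xs] + 1e-9
    weights /= weights.sum()

    estimate = (weights[:, None] * particles).sum(axis=0)  # weighted mean state
    return particles, weights, estimate
```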

658 citations

Journal Article
30 Sep 2008
TL;DR: This paper presents a survey on crowd analysis methods employed in computer vision research and discusses perspectives from other research disciplines and how they can contribute to the computer vision approach.
Abstract: In the year 1999 the world population reached 6 billion, doubling the previous census estimate of 1960. Recently, the United States Census Bureau issued a revised forecast for world population showing a projected growth to 9.4 billion by 2050 (US Census Bureau, http://www.census.gov/ipc/www/worldpop.html). Different research disciplines have studied the crowd phenomenon and its dynamics from social, psychological and computational standpoints, respectively. This paper presents a survey on crowd analysis methods employed in computer vision research and discusses perspectives from other research disciplines and how they can contribute to the computer vision approach.

584 citations

References
Book
01 Jan 1979
TL;DR: The relationship between Stimulation and Stimulus Information for visual perception is discussed in detail in this book, where the author also presents experimental evidence for direct perception of motion in the world and movement of the self.
Abstract: Contents: Preface. Introduction. Part I: The Environment To Be Perceived. The Animal And The Environment. Medium, Substances, Surfaces. The Meaningful Environment. Part II: The Information For Visual Perception. The Relationship Between Stimulation And Stimulus Information. The Ambient Optic Array. Events And The Information For Perceiving Events. The Optical Information For Self-Perception. The Theory Of Affordances. Part III: Visual Perception. Experimental Evidence For Direct Perception: Persisting Layout. Experiments On The Perception Of Motion In The World And Movement Of The Self. The Discovery Of The Occluding Edge And Its Implications For Perception. Looking With The Head And Eyes. Locomotion And Manipulation. The Theory Of Information Pickup And Its Consequences. Part IV: Depiction. Pictures And Visual Awareness. Motion Pictures And Visual Awareness. Conclusion. Appendixes: The Principal Terms Used in Ecological Optics. The Concept of Invariants in Ecological Optics.

21,493 citations

Journal Article
TL;DR: This work treats image segmentation as a graph partitioning problem and proposes a novel global criterion, the normalized cut, for segmenting the graph, which measures both the total dissimilarity between the different groups as well as the total similarity within the groups.
Abstract: We propose a novel approach for solving the perceptual grouping problem in vision. Rather than focusing on local features and their consistencies in the image data, our approach aims at extracting the global impression of an image. We treat image segmentation as a graph partitioning problem and propose a novel global criterion, the normalized cut, for segmenting the graph. The normalized cut criterion measures both the total dissimilarity between the different groups as well as the total similarity within the groups. We show that an efficient computational technique based on a generalized eigenvalue problem can be used to optimize this criterion. We applied this approach to segmenting static images, as well as motion sequences, and found the results to be very encouraging.
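For reference, a dense-matrix sketch of the two-way normalized cut: solve the relaxed generalized eigenproblem (D - W)y = λDy and threshold the eigenvector with the second-smallest eigenvalue. Real implementations use sparse eigensolvers and recursive or k-way partitioning; this toy version assumes a small affinity matrix in which every node has nonzero degree.

```python
import numpy as np
from scipy.linalg import eigh


def normalized_cut_bipartition(W):
    """Two-way normalized cut on a symmetric, nonnegative affinity matrix W.

    Returns a boolean array assigning each node to one of two groups.
    Assumes every node has nonzero degree so that D is positive definite.
    """
    d = W.sum(axis=1)
    D = np.diag(d)
    L = D - W                  # unnormalized graph Laplacian
    vals, vecs = eigh(L, D)    # generalized eigenproblem (D - W) y = lambda * D y
    y = vecs[:, 1]             # skip the trivial constant eigenvector
    # Threshold at the median; the splitting point can also be searched over.
    return y > np.median(y)
```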

13,789 citations

Proceedings Article
17 Jun 1997
TL;DR: This work treats image segmentation as a graph partitioning problem and proposes a novel global criterion, the normalized cut, for segmenting the graph, which measures both the total dissimilarity between the different groups as well as the total similarity within the groups.
Abstract: We propose a novel approach for solving the perceptual grouping problem in vision. Rather than focusing on local features and their consistencies in the image data, our approach aims at extracting the global impression of an image. We treat image segmentation as a graph partitioning problem and propose a novel global criterion, the normalized cut, for segmenting the graph. The normalized cut criterion measures both the total dissimilarity between the different groups as well as the total similarity within the groups. We show that an efficient computational technique based on a generalized eigenvalue problem can be used to optimize this criterion. We have applied this approach to segmenting static images and found results very encouraging.

11,827 citations

Proceedings Article
23 Jun 1999
TL;DR: This paper discusses modeling each pixel as a mixture of Gaussians and using an on-line approximation to update the model, resulting in a stable, real-time outdoor tracker which reliably deals with lighting changes, repetitive motions from clutter, and long-term scene changes.
Abstract: A common method for real-time segmentation of moving regions in image sequences involves "background subtraction", or thresholding the error between an estimate of the image without moving objects and the current image. The numerous approaches to this problem differ in the type of background model used and the procedure used to update the model. This paper discusses modeling each pixel as a mixture of Gaussians and using an on-line approximation to update the model. The Gaussian distributions of the adaptive mixture model are then evaluated to determine which are most likely to result from a background process. Each pixel is classified based on whether the Gaussian distribution which represents it most effectively is considered part of the background model. This results in a stable, real-time outdoor tracker which reliably deals with lighting changes, repetitive motions from clutter, and long-term scene changes. This system has been run almost continuously for 16 months, 24 hours a day, through rain and snow.
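The per-pixel update can be sketched as follows, simplified from the recipe above: K Gaussians with weights, a match defined as a value within 2.5 standard deviations, an online update of the matched component, and background status assigned to the components holding most of the weight. The scalar, single-pixel grayscale formulation and all parameter values are illustrative assumptions.

```python
import numpy as np


class PixelMOG:
    """Mixture-of-Gaussians background model for one grayscale pixel (sketch)."""

    def __init__(self, k=3, alpha=0.01, init_var=225.0, bg_ratio=0.7):
        self.mu = np.zeros(k)            # component means
        self.var = np.full(k, init_var)  # component variances
        self.w = np.full(k, 1.0 / k)     # component weights
        self.alpha, self.init_var, self.bg_ratio = alpha, init_var, bg_ratio

    def update(self, x):
        """Update the model with intensity x; return True if x looks like foreground."""
        d2 = (x - self.mu) ** 2
        matched = d2 < 6.25 * self.var   # within 2.5 standard deviations
        self.w *= (1.0 - self.alpha)

        if matched.any():
            m = int(np.argmin(np.where(matched, d2, np.inf)))  # closest matching Gaussian
            self.w[m] += self.alpha
            rho = self.alpha                                    # simplified learning rate
            self.mu[m] += rho * (x - self.mu[m])
            self.var[m] += rho * (d2[m] - self.var[m])
        else:
            m = int(np.argmin(self.w))   # replace the least probable component
            self.mu[m], self.var[m], self.w[m] = x, self.init_var, self.alpha

        self.w /= self.w.sum()
        # Background components: highest weight/sigma ratio covering bg_ratio of the mass.
        order = np.argsort(-self.w / np.sqrt(self.var))
        n_bg = np.searchsorted(np.cumsum(self.w[order]), self.bg_ratio) + 1
        background = set(order[:n_bg])
        return (not matched.any()) or (m not in background)
```

In practice the same update runs for every pixel of every frame (vectorized over the image), producing per-view foreground masks of the kind a multi-view tracker such as the one above can warp onto the ground plane.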

7,660 citations

Book
01 Jan 1982
TL;DR: Marr's posthumously published Vision (1982) influenced a generation of brain and cognitive scientists, inspiring many to enter the field of visual perception. In Marr's framework, the process of vision constructs a set of representations, starting from a description of the input image and culminating with three-dimensional objects in the surrounding environment; a central theme, one that has had far-reaching influence in both neuroscience and cognitive science, is the notion of different levels of analysis.
Abstract: "David Marr's posthumously published Vision (1982) influenced a generation of brain and cognitive scientists, inspiring many to enter the field. In Vision, Marr describes a general framework for understanding visual perception and touches on broader questions about how the brain and its functions can be studied and understood. Researchers from a range of brain and cognitive sciences have long valued Marr's creativity, intellectual power, and ability to integrate insights and data from neuroscience, psychology, and computation. This MIT Press edition makes Marr's influential work available to a new generation of students and scientists. In Marr's framework, the process of vision constructs a set of representations, starting from a description of the input image and culminating with a description of three-dimensional objects in the surrounding environment. A central theme, and one that has had far-reaching influence in both neuroscience and cognitive science, is the notion of different levels of analysis--in Marr's framework, the computational level, the algorithmic level, and the hardware implementation level. Now, thirty years later, the main problems that occupied Marr remain fundamental open problems in the study of perception. Vision provides inspiration for the continuing efforts to integrate knowledge from cognition and computation to understand vision and the brain."--MIT CogNet.

5,482 citations