Human motion analysis: a review

doi:10.1109/NAMW.1997.609859

Proceedings Article•DOI•

Jake K. Aggarwal, Qin Cai¹•Institutions (1)

University of Texas at Austin¹

16 Jun 1997-pp 90-102

TL;DR: The paper gives an overview of the various tasks involved in motion analysis of the human body, and focuses on three major areas related to interpreting human motion: motion analysis involving human body parts, tracking of human motion using single or multiple cameras, and recognizing human activities from image sequences.

Abstract: Human motion analysis is receiving increasing attention from computer vision researchers. This interest is motivated by a wide spectrum of applications, such as athletic performance analysis, surveillance, man-machine interfaces, content-based image storage and retrieval, and video conferencing. The paper gives an overview of the various tasks involved in motion analysis of the human body. The authors focus on three major areas related to interpreting human motion: 1) motion analysis involving human body parts, 2) tracking of human motion using single or multiple cameras, and 3) recognizing human activities from image sequences. Motion analysis of human body parts involves the low-level segmentation of the human body into segments connected by joints, and recovers the 3D structure of the human body using its 2D projections over a sequence of images. Tracking human motion using single or multiple cameras focuses on higher-level processing, in which moving humans are tracked without identifying specific parts of the body structure. After successfully matching the moving human image from one frame to another in image sequences, understanding the human movements or activities comes naturally, which leads to a discussion of recognizing human activities. The review is illustrated by examples.

Citations

Journal Article•DOI•

Object tracking: A survey

Alper Yilmaz¹, Omar Javed, Mubarak Shah²•Institutions (2)

Ohio State University¹, University of Central Florida²

25 Dec 2006-ACM Computing Surveys

TL;DR: This article reviews state-of-the-art tracking methods, classifies them into categories, and identifies new trends; it also discusses important issues related to tracking, including the use of appropriate image features, selection of motion models, and detection of objects.

Abstract: The goal of this article is to review the state-of-the-art tracking methods, classify them into different categories, and identify new trends. Object tracking, in general, is a challenging problem. Difficulties in tracking objects can arise due to abrupt object motion, changing appearance patterns of both the object and the scene, nonrigid object structures, object-to-object and object-to-scene occlusions, and camera motion. Tracking is usually performed in the context of higher-level applications that require the location and/or shape of the object in every frame. Typically, assumptions are made to constrain the tracking problem in the context of a particular application. In this survey, we categorize the tracking methods on the basis of the object and motion representations used, provide detailed descriptions of representative methods in each category, and examine their pros and cons. Moreover, we discuss the important issues related to tracking including the use of appropriate image features, selection of motion models, and detection of objects.

5,318 citations

Journal Article•DOI•

Kernel-based object tracking

Dorin Comaniciu¹, Visvanathan Ramesh, Peter Meer•Institutions (1)

Princeton University¹

01 May 2003-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A new approach toward target representation and localization, the central component in visual tracking of nonrigid objects, is proposed, which employs a metric derived from the Bhattacharyya coefficient as similarity measure, and uses the mean shift procedure to perform the optimization.

Abstract: A new approach toward target representation and localization, the central component in visual tracking of nonrigid objects, is proposed. The feature histogram-based target representations are regularized by spatial masking with an isotropic kernel. The masking induces spatially-smooth similarity functions suitable for gradient-based optimization, hence, the target localization problem can be formulated using the basin of attraction of the local maxima. We employ a metric derived from the Bhattacharyya coefficient as similarity measure, and use the mean shift procedure to perform the optimization. In the presented tracking examples, the new method successfully coped with camera motion, partial occlusions, clutter, and target scale variations. Integration with motion filters and data association techniques is also discussed. We describe only a few of the potential applications: exploitation of background information, Kalman tracking using motion models, and face tracking.
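
The similarity measure named in the abstract can be illustrated with a minimal sketch. It assumes the target and candidate are plain histograms and uses the commonly cited form d = sqrt(1 - rho) for the metric derived from the Bhattacharyya coefficient rho; the kernel weighting and the mean shift optimization are omitted, and all function names are hypothetical.

```python
import math


def normalize(hist):
    """Scale a histogram of non-negative counts so its bins sum to 1."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must have positive mass")
    return [h / total for h in hist]


def bhattacharyya_coefficient(p, q):
    """rho(p, q) = sum over bins of sqrt(p_i * q_i); 1.0 for identical distributions."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


def bhattacharyya_distance(p, q):
    """Assumed metric form d = sqrt(1 - rho); clamped to guard against rounding."""
    rho = bhattacharyya_coefficient(p, q)
    return math.sqrt(max(0.0, 1.0 - rho))


# Toy 4-bin histograms standing in for target and candidate colour models.
target = normalize([4, 3, 2, 1])
candidate = normalize([1, 2, 3, 4])
```

A tracker built on this measure would search for the candidate location that minimizes the distance to the target model.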

4,996 citations

Proceedings Article•DOI•

Recognizing human actions: a local SVM approach

Christian Schüldt¹, Ivan Laptev¹, Barbara Caputo¹•Institutions (1)

Royal Institute of Technology¹

23 Aug 2004

TL;DR: This paper constructs video representations in terms of local space-time features, integrates such representations with SVM classification schemes for recognition, and presents action recognition results on a new video database.

Abstract: Local space-time features capture local events in video and can be adapted to the size, the frequency and the velocity of moving patterns. In this paper, we demonstrate how such features can be used for recognizing complex motion patterns. We construct video representations in terms of local space-time features and integrate such representations with SVM classification schemes for recognition. For the purpose of evaluation we introduce a new video database containing 2391 sequences of six human actions performed by 25 people in four different scenarios. The presented results of action recognition justify the proposed method and demonstrate its advantage compared to other relative approaches for action recognition.

3,238 citations

Cites background from "Human motion analysis: a review"

...All of these conditions introduce challenging problems that have been addressed in computer vision in the past (see [1, 11] for a review)....

Journal Article•DOI•

The recognition of human movement using temporal templates

Aaron F. Bobick¹, James W. Davis²•Institutions (2)

Georgia Tech Research Institute¹, Ohio State University²

01 Mar 2001-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A view-based approach to the representation and recognition of human movement is presented, and a recognition method matching temporal templates against stored instances of views of known actions is developed.

Abstract: A view-based approach to the representation and recognition of human movement is presented. The basis of the representation is a temporal template, a static vector-image where the vector value at each point is a function of the motion properties at the corresponding spatial location in an image sequence. Using aerobics exercises as a test domain, we explore the representational power of a simple, two-component version of the templates: the first value is a binary value indicating the presence of motion and the second value is a function of the recency of motion in a sequence. We then develop a recognition method matching temporal templates against stored instances of views of known actions. The method automatically performs temporal segmentation, is invariant to linear changes in speed, and runs in real-time on standard platforms.
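
The two-component template described in the abstract can be sketched as follows. This is an illustrative assumption, not the authors' code: the recency component is set to a duration `tau` wherever motion is detected and otherwise decays by one per frame, and the binary component marks every pixel whose recency value is still positive. The function names and the toy masks are hypothetical.

```python
def update_history(history, motion_mask, tau):
    """Recency component: tau where motion is present, else decay by 1 toward 0."""
    return [
        [tau if moving else max(0, h - 1) for h, moving in zip(h_row, m_row)]
        for h_row, m_row in zip(history, motion_mask)
    ]


def energy_image(history):
    """Binary component: 1 wherever motion occurred within the last tau frames."""
    return [[1 if h > 0 else 0 for h in row] for row in history]


def temporal_template(masks, tau):
    """Fold a sequence of binary motion masks into (binary image, recency image)."""
    rows, cols = len(masks[0]), len(masks[0][0])
    history = [[0] * cols for _ in range(rows)]
    for mask in masks:
        history = update_history(history, mask, tau)
    return energy_image(history), history


# A single bright pixel moving left to right across a 1x3 image.
masks = [
    [[1, 0, 0]],
    [[0, 1, 0]],
    [[0, 0, 1]],
]
mei, mhi = temporal_template(masks, tau=3)
# mhi is brightest where motion was most recent, so it encodes direction.
```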

2,932 citations

Journal Article•DOI•

A survey of socially interactive robots

Terrence Fong¹,², Illah Nourbakhsh², Kerstin Dautenhahn³•Institutions (3)

École Polytechnique Fédérale de Lausanne¹, Carnegie Mellon University², University of Hertfordshire³

31 Mar 2003-Robotics and Autonomous Systems

TL;DR: The context for socially interactive robots is discussed, emphasizing the relationship to other research fields and the different forms of “social robots”, and a taxonomy of design methods and system components used to build socially interactive robots is presented.

2,869 citations

References

Proceedings Article•DOI•

Determining Optical Flow

Berthold K. P. Horn¹, Brian G. Schunck¹•Institutions (1)

Massachusetts Institute of Technology¹

12 Nov 1981

TL;DR: A method for finding the optical flow pattern is presented which assumes that the apparent velocity of the brightness pattern varies smoothly almost everywhere in the image, and an iterative implementation is shown which successfully computes the optical flow for a number of synthetic image sequences.

Abstract: Optical flow cannot be computed locally, since only one independent measurement is available from the image sequence at a point, while the flow velocity has two components. A second constraint is needed. A method for finding the optical flow pattern is presented which assumes that the apparent velocity of the brightness pattern varies smoothly almost everywhere in the image. An iterative implementation is shown which successfully computes the optical flow for a number of synthetic image sequences. The algorithm is robust in that it can handle image sequences that are quantized rather coarsely in space and time. It is also insensitive to quantization of brightness levels and additive noise. Examples are included where the assumption of smoothness is violated at singular points or along lines in the image.
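
The two constraints described in the abstract can be written out as a short worked sketch. The notation is an assumption taken from the standard presentation of the method rather than from this page: $I_x, I_y, I_t$ are the image brightness derivatives, $(u, v)$ is the flow, $\bar{u}, \bar{v}$ are local neighbourhood averages of the flow, and $\alpha$ weights the smoothness term.

```latex
% Brightness constancy gives one equation per pixel for two unknowns (u, v):
\begin{align}
  I_x u + I_y v + I_t &= 0 \\
% The smoothness assumption supplies the second constraint via a global energy:
  E(u, v) &= \iint \Big[ (I_x u + I_y v + I_t)^2
            + \alpha^2 \big( \lVert \nabla u \rVert^2 + \lVert \nabla v \rVert^2 \big) \Big] \, dx \, dy \\
% Minimizing E leads to an iterative update at each pixel:
  u^{(n+1)} &= \bar{u}^{(n)} - \frac{I_x \big( I_x \bar{u}^{(n)} + I_y \bar{v}^{(n)} + I_t \big)}{\alpha^2 + I_x^2 + I_y^2} \\
  v^{(n+1)} &= \bar{v}^{(n)} - \frac{I_y \big( I_x \bar{u}^{(n)} + I_y \bar{v}^{(n)} + I_t \big)}{\alpha^2 + I_x^2 + I_y^2}
\end{align}
```

Where the image gradient vanishes, the update reduces to copying the neighbourhood average, which is how flow estimates propagate into uniform regions.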

8,078 citations

Journal Article•DOI•

Pfinder: real-time tracking of the human body

Christopher R. Wren¹, Ali Azarbayejani¹, Trevor Darrell¹, Alex Pentland¹•Institutions (1)

Massachusetts Institute of Technology¹

01 Jul 1997-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: Pfinder is a real-time system for tracking people and interpreting their behavior that uses a multiclass statistical model of color and shape to obtain a 2D representation of head and hands in a wide range of viewing conditions.

Abstract: Pfinder is a real-time system for tracking people and interpreting their behavior. It runs at 10 Hz on a standard SGI Indy computer, and has performed reliably on thousands of people in many different physical locations. The system uses a multiclass statistical model of color and shape to obtain a 2D representation of head and hands in a wide range of viewing conditions. Pfinder has been successfully used in a wide range of applications including wireless interfaces, video databases, and low-bandwidth coding.

4,280 citations

Journal Article•DOI•

Representation and recognition of the spatial organization of three-dimensional shapes.

David Marr¹, H. K. Nishihara¹•Institutions (1)

Massachusetts Institute of Technology¹

23 Feb 1978-Proceedings of The Royal Society B: Biological Sciences

TL;DR: The human visual process can be studied by examining the computational problems associated with deriving useful information from retinal images; the paper applies this approach to the problem of representing three-dimensional shapes for the purpose of recognition.

Abstract: The human visual process can be studied by examining the computational problems associated with deriving useful information from retinal images. In this paper, we apply this approach to the problem of representing three-dimensional shapes for the purpose of recognition. 1. Three criteria, accessibility, scope and uniqueness, and stability and sensitivity, are presented for judging the usefulness of a representation for shape recognition. 2. Three aspects of a representation's design are considered, (i) the representation's coordinate system, (ii) its primitives, which are the primary units of shape information used in the representation, and (iii) the organization the representation imposes on the information in its descriptions. 3. In terms of these design issues and the criteria presented, a shape representation for recognition should: (i) use an object-centred coordinate system, (ii) include volumetric primitives of varied sizes, and (iii) have a modular organization. A representation based on a shape's natural axes (for example the axes identified by a stick figure) follows directly from these choices. 4. The basic process for deriving a shape description in this representation must involve: (i) a means for identifying the natural axes of a shape in its image and (ii) a mechanism for transforming viewer-centred axis specifications to specifications in an object-centred coordinate system. 5. Shape recognition involves: (i) a collection of stored shape descriptions, and (ii) various indexes into the collection that allow a newly derived description to be associated with an appropriate stored description. The most important of these indexes allows shape recognition to proceed conservatively from the general to the specific based on the specificity of the information available from the image. 6. New constraints supplied by a conservative recognition process can be used to extract more information from the image. A relaxation process for carrying out this constraint analysis is described.

2,256 citations

Proceedings Article•DOI•

Recognizing human action in time-sequential images using hidden Markov model

Junji Yamato, J. Ohya, K. Ishii

15 Jun 1992

TL;DR: The recognition rate is improved by increasing the number of people used to generate the training data, indicating the possibility of establishing a person-independent action recognizer.

Abstract: A human action recognition method based on a hidden Markov model (HMM) is proposed. It is a feature-based bottom-up approach that is characterized by its learning capability and time-scale invariability. To apply HMMs, one set of time-sequential images is transformed into an image feature vector sequence, and the sequence is converted into a symbol sequence by vector quantization. In learning human action categories, the parameters of the HMMs, one per category, are optimized so as to best describe the training sequences from the category. To recognize an observed sequence, the HMM which best matches the sequence is chosen. Experimental results for real time-sequential images of sports scenes show recognition rates higher than 90%. The recognition rate is improved by increasing the number of people used to generate the training data, indicating the possibility of establishing a person-independent action recognizer.
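
The recognition pipeline in the abstract (feature vectors, vector quantization, one HMM per category, best-matching model wins) can be sketched in a few lines. This is a toy illustration under stated assumptions: the codebook, the model parameters, and the category names `action_a` and `action_b` are hypothetical and hand-set, the training step that would optimize the parameters is omitted, and the match score is the forward-algorithm likelihood.

```python
def quantize(features, codebook):
    """Vector quantization: map each feature vector to its nearest codeword index."""
    def nearest(vec):
        return min(
            range(len(codebook)),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(vec, codebook[k])),
        )
    return [nearest(vec) for vec in features]


def forward_likelihood(symbols, start, trans, emit):
    """P(symbols | model) computed with the HMM forward algorithm."""
    n = len(start)
    alpha = [start[s] * emit[s][symbols[0]] for s in range(n)]
    for sym in symbols[1:]:
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][sym]
            for j in range(n)
        ]
    return sum(alpha)


def classify(symbols, models):
    """Choose the category whose HMM gives the observed sequence the highest likelihood."""
    return max(models, key=lambda name: forward_likelihood(symbols, *models[name]))


# Hypothetical two-state left-to-right models: (start, transition, emission).
trans = [[0.5, 0.5], [0.0, 1.0]]
models = {
    "action_a": ([1.0, 0.0], trans, [[0.9, 0.1], [0.1, 0.9]]),  # favours 0s then 1s
    "action_b": ([1.0, 0.0], trans, [[0.1, 0.9], [0.9, 0.1]]),  # favours 1s then 0s
}

codebook = [(0.0,), (1.0,)]  # hypothetical 1-D codewords
features = [(0.1,), (0.2,), (0.9,), (0.8,)]
symbols = quantize(features, codebook)
label = classify(symbols, models)
```

For long sequences the forward probabilities underflow, so a practical version would work with scaled or log-domain values.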

1,477 citations

"Human motion analysis: a review" refers background or methods in this paper

...In our later discussion, human body motion is addressed by the movement of the limbs and hands [50, 28, 6, 33], such as the velocities of the hand or limb segments, or the angular velocity of various body parts....
...HMM has been very popular in speech recognition, but only recently has it been adopted for recognition of human motion sequences in computer vision [50]....
...The work by Yamato et al. [50] is perhaps the first one on recognition of human action in this category....

Journal Article•DOI•

Visual motion perception.

S. Gunnar O. Johansson

01 Jun 1975-Scientific American

TL;DR: The author uses projective relations as the theoretical foundation of his investigations of visual space and motion and concludes that during locomotion the components of the human visual environment are interpreted as rigid structures in relative motion.

Abstract: In this article the author uses projective relations as the theoretical foundation of his investigations of visual space and motion. Several laboratory experiments involving perceptual vector analysis and its geometric basis are described. In most of the experiments the visual stimuli consisted of computer-controlled patterns displayed on a televisionlike screen and projected into the eyes of subjects by means of a collimating device that removed parallax as well as the possibility of seeing the screen. A common characteristic of the experiments was that the observer was evidently not free to choose between a Euclidean interpretation of the changing geometry of the figure in the display and a projective interpretation. For example, the observer could not persuade himself that what he was seeing was simply a square growing larger and smaller in the same visual plane; his visual system insisted on telling him that he was seeing a square of constant size approaching and receding. Hence he perceived rigid motion in depth, rotation in a specific slant, bending in depth and so on, paired with the highest possible degree of object constancy. Further experiments were conducted to determine if the principles of perceptual analysis hold true for the more complex patterns of motions encountered in everyday life. These experiments led to the conclusion that during locomotion the components of the human visual environment are interpreted as rigid structures in relative motion.

930 citations