Imitating Human Dance Motions through Motion Structure Analysis
Atsushi Nakazawa, Shinichiro Nakaoka, Katsushi Ikeuchi, Kazuhito Yokoi††
Japan Science and Technology Corporation
Institute of Industrial Science, The University of Tokyo
†† Intelligent Systems Institute, National Institute of Advanced Industrial Science and Technology
Abstract
This paper presents a method for importing human dance motion into humanoid robots through visual observation. The human motion data is acquired from a motion capture system consisting of 8 cameras and 8 PC clusters. The whole motion sequence is then divided into motion elements, which are clustered into groups according to the correlation of the end-effectors' trajectories. We call these segments 'motion primitives'. New dance motions are generated by concatenating these motion primitives. We are also trying to make a humanoid robot perform these original or generated motions using inverse kinematics and dynamic balancing techniques.
Keywords: human motion, humanoid robot, motion primitive, motion capture data
1 Introduction
Importing human motions into a robot through visual observation is one of the ultimate problems in humanoid robot studies [1]. This technology enables robots to imitate human motions easily, and is useful for programming the skills of robots that work in our living space. It also interests us from an AI perspective, because we must have acquired our own motion skills in a similar way.
Many studies have addressed this issue. Jenkins' approach starts by analyzing the silhouette of a movement [2]. The movements of the human hands are divided into basic motions (lines, circles, etc.) and their parameters. Whole human movements are then described with these basic motions, and finally the original motion is regenerated from the sequence of basic motions. Inamura et al. propose the idea of 'mimesis', sets of basic movements of the human joint angles [5]. The human joint angle movements are divided into a list of mimesis elements and their parameters. Our group has proposed the idea of 'learning from observation', with methods such as APO (Assembly Plan from Observation) and Attention Point Analysis [4][3]. A robot observes and imitates a human performing an assembly task by analyzing the trajectory of the human hand movements and the contact states of the objects. Basically, all these concepts indicate that human motions consist of variations of simple motions. It is natural to think that human motion is composed of a limited number of basic motions rather than being made from scratch. We also adopt this idea in this study, and call these basic motions "motion primitives".
Figure 1: Overview of our project.
We applied this idea to importing human dance motions into humanoid robots. An overview of our project is shown in Fig. 1. Dance motions are a good example of whole-body motion, and they are characterized by having a scenario. This means they must have a structure built from motion primitives. Our first step is to detect both the motion primitives and this structure. To generate robot movements from the motion primitives, we developed two further methods: concatenation of motion primitives and a modification technique for a humanoid robot.
2 Acquisition of the human motions
The human dance motion is acquired by a motion capture system that consists of 8 cameras (SONY DXC-9000) and PC clusters (Pentium III-800MHz Dual). The cameras are arranged to surround a person, and the PCs can acquire image frames of 720x480 pixels at nearly 30 Hz. The internal clocks of all PCs are synchronized by NTP in advance, and the acquisition time of each image frame is recorded with millisecond accuracy. We also acquire blur-free images of moving objects by using the frame-shutter functionality with which this camera is equipped (Fig. 2). The calibration parameters of all cameras are acquired, so that the geometrical relations between the cameras and the internal parameters of each camera can be determined.
Figure 2: Images acquired from the eight cameras.
Proceedings of the 2002 IEEE/RSJ Intl. Conference on Intelligent Robots and Systems, EPFL, Lausanne, Switzerland, October 2002. 0-7803-7398-7/02/$17.00 ©2002 IEEE
The human motion is acquired by attaching light markers at the desired positions on the human body. While the actor performs the dance, the PCs only record the multi-viewpoint images; afterwards, depth is measured by matching the markers between the images [6].
3 Analysis: detecting ’motion primitives’
The aim of our motion analysis is to detect similar motion elements ('motion primitives') and to describe the whole dance motion as a sequence of them. From the analysis result, we can recognize the structure of the dance motion, such as repeated motion segments, iterative motion sequences or other kinds of regularities. Furthermore, it becomes possible to detect the mutual relations between different dances by comparing their primitive motions.
To detect the motion primitives, we pay attention to the frames at which the velocity of the end effectors (hands and feet) reaches a local minimum, because these frames mark the borders of human motion segments: the start points and end points of the motion primitives. Many researchers have used this value to find motion segment points [7] [8].
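The segmentation idea above (borders at local velocity minima of an end effector) can be sketched as follows. This is only an illustration, not the authors' implementation: the function names, the Gaussian smoothing width `sigma`, and the representation of a trajectory as a list of (x, y, z) tuples are all assumptions.

```python
import math

def gaussian_smooth(values, sigma=2.0):
    """Smooth a 1-D sequence with a truncated Gaussian kernel, renormalized at the edges."""
    radius = max(1, int(3 * sigma))
    offsets = range(-radius, radius + 1)
    kernel = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    smoothed = []
    for i in range(len(values)):
        acc = weight = 0.0
        for k, w in zip(offsets, kernel):
            j = i + k
            if 0 <= j < len(values):
                acc += w * values[j]
                weight += w
        smoothed.append(acc / weight)
    return smoothed

def speed(points, dt=1.0):
    """Frame-to-frame speed of a 3-D trajectory given as (x, y, z) tuples."""
    return [math.dist(p, q) / dt for p, q in zip(points, points[1:])]

def segment_boundaries(points, sigma=2.0):
    """Indices of local minima of the smoothed speed, used as candidate segment borders."""
    s = gaussian_smooth(speed(points), sigma)
    return [i for i in range(1, len(s) - 1)
            if s[i] < s[i - 1] and s[i] <= s[i + 1]]

# A hand swinging back and forth along x: it slows down at each turning point.
swing = [(math.cos(math.pi * i / 20), 0.0, 0.0) for i in range(81)]
borders = segment_boundaries(swing)
```

In this toy trajectory the turning points fall near frames 20, 40 and 60, so those are the candidate borders the sketch reports.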
To evaluate the similarity of motion segments, we use the DP distance between the target points' trajectories in 3D space. According to this value, the motion segments are clustered into groups, so that identical motions of a target point receive the same label. They are registered as 'minimum motion primitives'.
In structured motions such as dances, much longer motion sequences (regularities among the minimum motion primitives) can be seen. Our algorithm detects these primitive motion patterns and obtains the final motion primitives by equalizing the identical motion sequences.
3.1 The motion analysis algorithm
We use 15 measurement points for analysis: hands (L, R), elbows (L, R), shoulders (L, R), head, hip, body center, waists (L, R), thighs (L, R) and feet (L, R). The analysis pipeline consists of the following steps (Fig. 3).
( 1 ) Define the body center coordinate system.
We define a body center coordinate system whose X-axis is the direction of the waist and whose Z-axis is the perpendicular direction. In order to detect symmetry between the movements of the right and left arms/feet, we use symmetric coordinate systems for the right and left halves of the body.
( 2 ) Coordinate conversion of target points.
The target points (both hands and feet) are converted into the body center coordinate system.
( 3 ) Preliminary segmentation.
Calculate the velocities of the target points and detect their local minima. A Gaussian filter is applied in advance to prevent segmentation errors.
( 4 ) Evaluate the correlation between the segments.
Evaluate the correlation between the target points' trajectories according to the DP matching distance, which is calculated by the following equations.
Assume that segments $m$ and $n$ are described as
$$V^m = \{v^m_1, v^m_2, \ldots, v^m_{i_m} \mid v^m_i \in \mathbb{R}^3\}, \qquad V^n = \{v^n_1, v^n_2, \ldots, v^n_{i_n} \mid v^n_i \in \mathbb{R}^3\};$$
then the distance $D(m, n)$ between these segments can be calculated as follows:
$$D(m, n) = S(V^m, V^n)$$
$$S(k, l) = d_{k,l} + \min\left(S_{k,l-1},\; S_{k-1,l-1},\; S_{k-1,l}\right)$$
$$d_{i,j} = \lvert v^m_i - v^n_j \rvert$$
( 5 ) Cluster and label the segments.
The detected segments are clustered with the nearest neighbor algorithm, so that segments in which the target point follows a similar locus receive the same label. Symmetrical motions are also detectable because we use symmetrical coordinate systems for the right and left sides of the body. Using these preliminary analysis procedures, the whole motion sequence is segmented and clustered into segments in which a target point draws the same trajectory. We call these segments 'minimum motion segments'. Figure 4 shows the preliminary analysis result for the Japanese folk dance 'Soran-Bushi'. The minimum motion segments represent very simple motions such as "swing the left arm down" or "step forward with the right leg".
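The DP matching distance of step (4) and the labelling of step (5) can be sketched as below. This is a minimal illustration under stated assumptions: the boundary condition S(0, 0) = 0, the greedy labelling scheme and the `threshold` parameter are not given in the text, which only names a "nearest neighbor algorithm".

```python
import math

def dp_distance(seg_m, seg_n):
    """DP matching distance between two trajectories of 3-D points.

    S[k][l] is the accumulated cost of aligning the first k points of
    seg_m with the first l points of seg_n, following
    S(k, l) = d(k, l) + min(S(k, l-1), S(k-1, l-1), S(k-1, l)).
    """
    inf = float("inf")
    rows, cols = len(seg_m), len(seg_n)
    S = [[inf] * (cols + 1) for _ in range(rows + 1)]
    S[0][0] = 0.0  # assumed boundary condition
    for k in range(1, rows + 1):
        for l in range(1, cols + 1):
            d = math.dist(seg_m[k - 1], seg_n[l - 1])
            S[k][l] = d + min(S[k][l - 1], S[k - 1][l - 1], S[k - 1][l])
    return S[rows][cols]

def label_segments(segments, threshold):
    """Give each segment the label of its nearest earlier segment if the
    DP distance is within threshold; otherwise open a new label."""
    labels = []
    for idx, seg in enumerate(segments):
        best_label, best_dist = None, threshold
        for prev_idx in range(idx):
            d = dp_distance(seg, segments[prev_idx])
            if d <= best_dist:
                best_label, best_dist = labels[prev_idx], d
        if best_label is None:
            best_label = max(labels, default=-1) + 1
        labels.append(best_label)
    return labels

a = [(0, 0, 0), (1, 0, 0), (2, 0, 0)]
b = [(0, 0, 0), (1, 0, 0), (1, 0, 0), (2, 0, 0)]  # same path, different timing
c = [(0, 1, 0), (1, 1, 0), (2, 1, 0)]             # shifted path
labels = label_segments([a, b, c], threshold=0.5)
```

Because the DP alignment absorbs timing differences, `a` and `b` end up with the same label while the shifted path `c` gets a new one.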

Figure 3: The preliminary motion analysis algorithm. Steps shown: (1) define the body center coordinate system; (2) convert the target points into the body center coordinate system; (3) segment the motion by detecting velocity changes; (4) evaluate the correlation of each segment's trajectory; (5) label each segment based on the correlation.
In addition, much longer motion units exist in dance motions. To find them, the following steps are applied to extract frequently appearing sequences of minimum motion segments.
( 6 ) Find the frequently appearing minimum motion segment sequences.
From the labeled segment sequence of each body portion, frequently appearing sequences are detected by using the apriori algorithm [9]. These results are registered as 'higher level motion segments'. This processing is applied to segment sequences of all possible lengths within a body portion. Consequently, we acquire multi-length, multi-hierarchical motion segments.
( 7 ) Find the motion primitives among the different target points.
In order to find motion primitives of the whole body, the correlations between different target points are evaluated. For motion primitives of any level, coincidence probabilities between the primitives of different target points are calculated. If this value is higher than a threshold, the primitives are regarded as related; the condition is defined as
$$\mathrm{co\_occurrence}(X_A, Y_B) = \frac{f(X_A, Y_B)^2}{f_X(X_A)\, f_Y(Y_B)} > \mathrm{thresh}$$
where $f(p)$ is the frequency of the label $p$.
( 8 ) Equalization.
Finally, the 3D trajectories of the motion segment sequences that are labeled as the same motion are equalized. In this process, the DP matching result is used to find the temporally matching points. The equalized results are preserved as the final motion primitives of this motion sequence.
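Steps (6) and (7) above can be illustrated with a small sketch. It is an assumption-laden simplification: frequent sequences are taken to be contiguous runs of labels grown level by level in the spirit of the apriori property, and the co-occurrence score is computed on two time-aligned label tracks. The function names, `min_count`, and the track representation are hypothetical.

```python
from collections import Counter

def frequent_sequences(labels, min_count=2):
    """Contiguous label runs occurring at least min_count times.

    Runs are grown one element at a time; a run of length n is kept only
    if both of its length n-1 sub-runs were already frequent.
    """
    frequent = {}
    length = 1
    while True:
        counts = Counter(tuple(labels[i:i + length])
                         for i in range(len(labels) - length + 1))
        level = {seq: c for seq, c in counts.items()
                 if c >= min_count
                 and (length == 1
                      or (seq[:-1] in frequent and seq[1:] in frequent))}
        if not level:
            return frequent
        frequent.update(level)
        length += 1

def co_occurrence(track_a, track_b, label_a, label_b):
    """f(X_A, Y_B)^2 / (f_X(X_A) * f_Y(Y_B)) on two aligned label tracks."""
    joint = sum(1 for a, b in zip(track_a, track_b)
                if a == label_a and b == label_b)
    f_a = track_a.count(label_a)
    f_b = track_b.count(label_b)
    return joint ** 2 / (f_a * f_b) if f_a and f_b else 0.0

patterns = frequent_sequences(list("ABABCAB"))
left_hand = list("AABBA")
right_hand = list("XXYYX")
score = co_occurrence(left_hand, right_hand, "A", "X")
```

Here the run A, B repeats three times and is reported as a higher-level segment, and the labels A and X always appear together, so their score is 1.0.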
Figure 5 shows the final analysis result for "Soran Bushi". We can notice the following features:
(a) The whole dance motion consists of iterative motion primitives and unique motion segments.
(b) Iterative motion primitives are connected by a unique primitive motion.
(c) Longer unique primitive motion sequences exist in parts of the whole dance motion.
From these results, we can understand the structure of this dance: it consists of several variations of iterative motion primitives and unique motion sequences.
Figure 6: The motion generation algorithm. Steps shown: (1) set up the support leg and translate; (2) translate the waist, body center and neck points (interpolation); (3) rotate the coordinate systems (rotation interpolation); (4) interpolate the arms and legs.
Figure 7: Generated movements connecting two motion primitives taken from two different folk dances, "Soran Bushi" and "Harukoma".
4 Generate new motions from motion primitives
As described in the last section, dance motions consist of iterative motion primitives and unique motions that connect the iterative motions. We also noticed that this structure exists not only in this example but in most Japanese folk dances. Based on this fact, new dance motions can be generated by concatenating motion primitives. For this purpose, studies of human motion planning are a great help. Hogan and Flash proposed the minimum jerk model for planning arm movements. In this theory, the human arm follows the path that minimizes the cost
$$C_j = \frac{1}{2} \int_0^{t_f} \left\{ \left( \frac{d^3 x}{dt^3} \right)^2 + \left( \frac{d^3 y}{dt^3} \right)^2 \right\} dt.$$

Figure 4: The preliminary detection result of the minimum segments and the correlation between the minimum segments. Rows: left hand, right hand, left foot, right foot; horizontal axis: time (frame); annotations mark pattern iteration, symmetric patterns and coincident patterns.
Figure 5: The analysis result of "Soran Bushi". Rows: left hand, right hand, left foot, right foot; horizontal axis: time (frame). Colored portions indicate segments that are highly related.
Here, $t_f$ is the duration of the motion primitive. A similar theory, the minimum torque change model, was proposed by Uno and Kawato [8]. These theories indicate that we can generate new motions from the boundary conditions (the posture parameters at the start and the end). Our motion generation algorithm is shown in Fig. 6. We employ the minimum joint angle jerk model because it is simple and sufficient for our purpose.
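As an illustration of generating a motion from boundary conditions alone, the sketch below uses the well-known closed-form minimum-jerk profile for a single coordinate. It assumes zero velocity and zero acceleration at both ends, which is a simplification not stated in the text; `min_jerk` and its arguments are hypothetical names.

```python
def min_jerk(q0, q1, T, t):
    """Minimum-jerk interpolation between q0 and q1 over duration T.

    Assumes the motion starts and ends at rest (zero velocity and
    acceleration), in which case the optimum is the quintic
    q0 + (q1 - q0) * (10 s^3 - 15 s^4 + 6 s^5) with s = t / T.
    """
    s = min(max(t / T, 0.0), 1.0)  # clamp to the transition interval
    return q0 + (q1 - q0) * (10 * s**3 - 15 * s**4 + 6 * s**5)

# Sample a 2-second transition of one joint angle from 0 to 1 rad at 10 Hz.
trajectory = [min_jerk(0.0, 1.0, 2.0, 0.1 * i) for i in range(21)]
```

The profile leaves the start pose slowly, moves fastest at the midpoint and settles smoothly into the end pose, which is the qualitative behaviour the minimum jerk model predicts.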
4.1 The motion generation algorithm
Assume that two motion primitives are selected to be concatenated. The transition between these motions is generated with the following steps (Fig. 6).
( 1 ) Set up a support leg for the transition. The latter primitive is translated so that this leg stays at the same position.
( 2 ) Calculate the positions of the unsupported foot, waist, body and neck during the transition. We employ 2nd-order polynomials to keep the position and velocity continuous:
$$v(t) = a\left(t - \frac{T}{2}\right)^2 + b, \quad \text{where } v(0) = v_0,\ v(T) = v_1,$$
$$x(t) = x(0) + \int v(t)\, dt$$
( 3 ) Linear interpolation is applied to the waist and neck coordinate systems with the following steps.
(a) Let the coordinate systems at the start and end times be ${}^0R_1$ and ${}^0R_2$. The rotation matrix between these coordinate systems is calculated as ${}^1R_2 = ({}^0R_1)^{-1}\, {}^0R_2$.
(b) Convert ${}^1R_2$ into the quaternion description $Q_{12} = \{x, y, z, w\}$.
(c) The interpolating rotation is calculated as $Q_{12}(t) = \{x, y, z, w(t)\}$, where $w(t) = w\, t / T$.
(d) Convert $Q_{12}(t)$ into the rotation matrix $R_{12}(t)$.
(e) The interpolated coordinate system is obtained as ${}^0R(t) = {}^0R_1\, R_{12}(t)$.
( 4 ) For the movements of the arms and the feet, the minimum joint angle jerk model is employed for interpolation. We assume each limb has 4 DOF (2 DOF at the shoulder or thigh, 2 DOF at the elbow or knee), and write the joint angle parameter vector during the transition as $\theta(t) = (\theta_1(t), \theta_2(t), \theta_3(t), \theta_4(t))$. Generating the transition motion amounts to determining these functions. We define each of them as a 5th-order polynomial, $\theta_n(t) = a_n t^5 + b_n t^4 + c_n t^3 + d_n t^2 + e_n t + f_n$, and set the boundary conditions $\theta_n(0)$ and $\theta_n(T)$. Finally, the parameters are determined so as to minimize the jerk of the joint angles during the transition:
$$\int_{0}^{T} \sum_n \left( \frac{d^3 \theta_n(\tau)}{d\tau^3} \right)^2 d\tau \rightarrow \min.$$
The duration T is determined by equalizing the times of the former and latter segments. Figure 7 shows the interpolation of motion primitives from the Japanese folk dances 'Soran Bushi' and 'Harukoma'.
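The coordinate-system interpolation of step (3) can be sketched as follows. The sketch reads $\{x, y, z, w\}$ as a rotation axis plus a rotation angle, which is an interpretation on our part: it is what makes scaling $w$ alone by $t/T$ a meaningful interpolation. The helper names are hypothetical, matrices are plain nested lists, and the relative rotation angle is assumed to be below 180 degrees.

```python
import math

def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def axis_angle_from_matrix(R):
    """Unit rotation axis and angle of a rotation matrix (angle < pi assumed)."""
    cos_w = max(-1.0, min(1.0, (R[0][0] + R[1][1] + R[2][2] - 1.0) / 2.0))
    w = math.acos(cos_w)
    if w < 1e-9:
        return (1.0, 0.0, 0.0), 0.0  # no rotation: the axis is arbitrary
    k = 2.0 * math.sin(w)
    axis = ((R[2][1] - R[1][2]) / k,
            (R[0][2] - R[2][0]) / k,
            (R[1][0] - R[0][1]) / k)
    return axis, w

def matrix_from_axis_angle(axis, w):
    """Rodrigues' rotation formula."""
    x, y, z = axis
    c, s = math.cos(w), math.sin(w)
    v = 1.0 - c
    return [[c + x * x * v, x * y * v - z * s, x * z * v + y * s],
            [y * x * v + z * s, c + y * y * v, y * z * v - x * s],
            [z * x * v - y * s, z * y * v + x * s, c + z * z * v]]

def interpolate_frame(R01, R02, t, T):
    """Steps (a)-(e): rotate from frame R01 towards R02 by the fraction t / T."""
    R12 = matmul(transpose(R01), R02)  # the inverse of a rotation is its transpose
    axis, w = axis_angle_from_matrix(R12)
    return matmul(R01, matrix_from_axis_angle(axis, w * t / T))

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
quarter_turn_z = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
halfway = interpolate_frame(identity, quarter_turn_z, 0.5, 1.0)
```

Halfway between the identity and a 90 degree turn about Z, the sketch yields a 45 degree turn about Z, and at t = T it reproduces the end frame.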

5 Presenting dance motions by a humanoid robot
In this section, we show a method for importing the dance motions into a humanoid robot.
A similar study has been done by Pollard et al. [10]. They used robot arms that have the same DOF as human arms. In our study, we employ the 28-DOF whole-body humanoid robot HRP-1S [12] and try to imitate whole-body motions. As a first trial, we assume that the feet are fixed and imitate the upper body motions. Although only the hand movements are imitated, we found that whole-body balance control is necessary to keep the robot standing. For this issue, we propose a new method that enables the robot to imitate human dance motions as closely as possible while keeping its body standing, through the dance motion structure analysis presented in the previous sections.
For these trials, we used the OpenHRP simulator [11] and a virtual model of HRP-1S. These enable us to test whether our data works on the real humanoid robot.
5.1 Acquiring joint angles; limiting them and their velocities
The joint angles of the dance motions can be solved from the original motion capture data using a simple inverse kinematics algorithm and the connection (link) model of the humanoid robot. However, these angle values cannot be imported directly because of two restrictions: singularities and the limits on joint angles and joint angular velocities.
Pollard et al. proposed a method to solve these problems. In their method, the joint angle values are deformed so that they stay within the limits by applying a filter resembling a PD controller:
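$$\dot{\theta}_i = \theta_i - \theta_{i-1} \qquad (1)$$
$$\ddot{\theta}_{F,i+1} = 2\sqrt{K_s}\,(\dot{\theta}_i - \dot{\theta}_{F,i}) + K_s(\theta_i - \theta_{F,i}) \qquad (2)$$
$$\dot{\theta}_{F,i+1} = \max(\dot{\theta}_L, \min(\dot{\theta}_U, \dot{\theta}_{F,i} + \ddot{\theta}_{F,i+1})) \qquad (3)$$
$$\theta_{F,i+1} = \theta_{F,i} + \dot{\theta}_{F,i+1} \qquad (4)$$
$$\dot{\theta}_i = \theta_i - \theta_{i+1} \qquad (5)$$
$$\ddot{\theta}_{B,i-1} = 2\sqrt{K_s}\,(\dot{\theta}_i - \dot{\theta}_{B,i}) + K_s(\theta_i - \theta_{B,i}) \qquad (6)$$
$$\dot{\theta}_{B,i-1} = \max(\dot{\theta}_L, \min(\dot{\theta}_U, \dot{\theta}_{B,i} + \ddot{\theta}_{B,i-1})) \qquad (7)$$
$$\theta_{B,i-1} = \theta_{B,i} + \dot{\theta}_{B,i-1} \qquad (8)$$
$$\theta_{V,i} = 0.5\,(\theta_{F,i} + \theta_{B,i}) \qquad (9)$$
where $\theta_i$ is the original joint angle, and $\dot{\theta}_L$ and $\dot{\theta}_U$ are the lower and upper velocity limits. Equations (1) to (4) are solved from the start frame to the end, and equations (5) to (8) are solved backward. The final joint angle $\theta_{V,i}$ is the average of the forward and backward passes (9).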
Using this method, the joint angle velocities are kept within the given limits. The problem of singularity is also resolved at the same time, because the part of the sequence near a singular point can be regarded as a region where the velocity is very high. As a result, a gimbal-locked part is decomposed into properly interpolated angles.
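The forward and backward passes of this filter can be sketched for a single joint as below. This is a hedged illustration, not the original implementation: it assumes the critically damped gain $2\sqrt{K_s}$, per-frame velocity units, a filtered trajectory that starts at the original angle with zero velocity, and a backward pass realized by running the same pass on the time-reversed sequence. All function and parameter names are hypothetical.

```python
import math

def limited_pass(theta, k_s, vel_lo, vel_hi):
    """One pass of the PD-like filter over a joint angle sequence, eqs. (1)-(4)."""
    out = [theta[0]]   # assumed start: filtered angle equals the original
    vel_f = 0.0        # assumed start: filtered velocity is zero
    for i in range(len(theta) - 1):
        vel = theta[i] - theta[i - 1] if i > 0 else 0.0            # eq. (1)
        acc = (2.0 * math.sqrt(k_s) * (vel - vel_f)
               + k_s * (theta[i] - out[i]))                        # eq. (2)
        vel_f = max(vel_lo, min(vel_hi, vel_f + acc))              # eq. (3)
        out.append(out[i] + vel_f)                                 # eq. (4)
    return out

def velocity_limited(theta, k_s, vel_lo, vel_hi):
    """Average of a forward pass and a backward pass, eq. (9)."""
    forward = limited_pass(theta, k_s, vel_lo, vel_hi)
    # Eqs. (5)-(8): the same recursion applied from the last frame to the first.
    backward = limited_pass(theta[::-1], k_s, vel_lo, vel_hi)[::-1]
    return [0.5 * (f + b) for f, b in zip(forward, backward)]

# A sudden 10-unit jump that would exceed a limit of 1 unit per frame.
step = [0.0] * 5 + [10.0] * 20
smooth = velocity_limited(step, k_s=0.25, vel_lo=-1.0, vel_hi=1.0)
```

Since each pass clamps its own per-frame velocity, the averaged sequence never changes by more than the limit between consecutive frames, while a constant input passes through unchanged.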
5.2 Keeping balance
Although we assume that the robot's feet are fixed while dancing, the balancing problem still exists. When the robot swings its arm in a wide arc (Fig. 8-a), it cannot keep its balance and falls down (Fig. 8-c). To keep the robot standing during the whole dance sequence, its ZMP (Zero Moment Point), the point on the ground where the forces between the robot and the ground are balanced, must stay within the support area enclosed by its soles [13]. The arm movements are the main factor that moves the ZMP outside the support area, because the arms are far from the contact points of the feet ('the fulcrum' of the body). As a result, the robot falls down as in Fig. 8-c.
In order to keep the ZMP within the support area, the motion must be modified to compensate for the ZMP trajectory. In Pollard's filter, the stiffness parameter Ks controls the motion dynamics. When Ks is reduced, the joint angle accelerations are limited and the whole motion becomes looser and more compact. The ZMP then stays inside the support area, so the robot can remain standing. However, too small a Ks results in a motion very different from the original (Fig. 8-b). Finding the most suitable Ks is therefore very important for good motion imitation while keeping balance.
The basic idea of our method is to keep Ks as large as possible over the whole motion sequence, as follows:
( 1 ) Detect frames where the ZMP is outside the support area.
( 2 ) Reduce the Ks value on the primitive segments that include a ZMP deviation.
( 3 ) Iterate the above process until the ZMP stays inside the support area over the whole motion sequence.
In this process, the Ks values are optimized and we achieve a motion that properly balances similarity and stability.
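The three-step loop above can be sketched as follows. Everything beyond the loop structure is an assumption: the multiplicative `shrink` factor, the lower bound `k_min`, the iteration cap, and the `zmp_violations` callback, which stands in for a dynamics simulation that filters the motion with the given per-segment Ks values and reports the frames where the ZMP leaves the support area.

```python
def tune_stiffness(segments, zmp_violations, k_init=1.0, shrink=0.8,
                   k_min=1e-3, max_iter=100):
    """Lower Ks only on primitive segments that contain a ZMP deviation.

    segments: list of (start, end) frame ranges, end exclusive.
    zmp_violations: hypothetical callable mapping the per-segment Ks list
        to the frame indices where the ZMP is outside the support area.
    """
    k_s = [k_init] * len(segments)
    for _ in range(max_iter):
        bad_frames = zmp_violations(k_s)                 # step (1)
        if not bad_frames:
            return k_s                                   # ZMP inside everywhere
        for idx, (start, end) in enumerate(segments):    # step (2)
            if any(start <= f < end for f in bad_frames):
                k_s[idx] = max(k_min, k_s[idx] * shrink)
    raise RuntimeError("ZMP still outside the support area after max_iter iterations")

# Toy stand-in for the simulator: frame 12 is unstable while the second
# segment's Ks is above 0.5.
def fake_simulation(k_s):
    return [12] if k_s[1] > 0.5 else []

tuned = tune_stiffness([(0, 10), (10, 20)], fake_simulation)
```

Only the segment containing the unstable frame is softened, which mirrors the intent of keeping Ks as large as possible wherever the motion is already balanced.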
We simulated the motions generated by this method and confirmed that they keep balance. We then experimented with the real robot (Fig. 8-d), which completed the whole motion standing by itself.
5.3 Clarify the dance motions’ poses
The problem with the above algorithm is that it makes the motion 'ambiguous'. This is because this kind of filter loses the 'stop motions' of the dance. As described in section 3, the local minimum frames of the target points' velocities represent the borders of primitive motions, and the dancer takes particular, important body poses at these frames.
Based on this idea, we have proposed a method for clar-
References
Formation and control of optimal trajectory in human multijoint arm movement (journal article)
Adapting human motion for the control of a humanoid robot (proceedings article)
Towards a system for the interpretation of moving light displays (journal article)
Virtual humanoid robot platform to develop controllers of real humanoid robots without porting (proceedings article)