Proceedings ArticleDOI

Optimization of parametrised kicking motion for humanoid soccer player

14 May 2014-pp 241-246
TL;DR: This paper presents a detailed description of the optimization process used to increase the kicking skills of humanoid players, an evolution of skill parameters guided by an evaluation function that aims to send the ball as far as possible.
Abstract: This paper presents a detailed description of the optimization process used to increase the kicking skills of humanoid players. The kicking movement consists of making a step forward to put the support foot next to the ball to kick, then executing the kicking motion by rocking the leg that must strike the ball. The rocking motion of the leg passes through three positions: the first position is the leg being raised backward, the second is the leg in position of strike, and the third is the leg forward to sweep the ball. The optimization process is an evolution of skill parameters guided by an evaluation function that aims to send the ball as far as possible. To optimize the kick, a rocking motion of the torso is also introduced. The parameters used for each one of the three kicking positions are the angle of the torso, the angle of the foot sole, and the longitudinal and vertical coordinates of the foot toe. Thanks to this technique it is possible to increase the distance of the kicked ball by 34%.

Summary (3 min read)

Introduction

  • While competitions involve multiple humanoids, individual agents specialization and optimization are an issue for the enhancement of team play.
  • It can enable higher scoring, but it is also a way to pass the ball between players, which can produce lively and interesting game strategies.
  • The planning of the ZMP trajectory inside the support polygon was employed for biped gaits, while applying a preview controller for increased stability [8], [9].
  • Their aim is not to design accurate nor flexible kicking movements, but to kick and send the ball as far as possible from a fixed position.

II. DESCRIPTION OF THE KICKING MOVEMENT

  • A parametrised kick was designed to enable robots to kick the ball far away.
  • This provides the kicking foot with more kinetic energy at the time of striking the ball since the distance between the hip and the foot is increased.
  • The projection of the center of mass remains at the same position.
  • Put the kicking foot forward to accompany the kick movement, while rocking the trunk backward.
  • The final position of the leg is called the forward position.

TIME DECOMPOSITION OF KICKING MOVEMENT

  • Actually the kicking trajectory is defined thanks to a set of three leg configurations defined in the Cartesian space, which makes 12 parameters in total.
  • The inverse kinematics is the same as the one used for the locomotion gaits [16] by the French team L3M-SIM for its participations in the 3D-SSL competitions.
  • The backward-to-strike swing trajectory and the strike-to-forward swing trajectories of foot toe and foot rotation are interpolated linearly, and the time of execution was reduced as much as possible to generate the highest acceleration at the time of hitting the ball.
  • The authors did not implement any COM or ZMP based stabilizer, since it does not matter if the robot falls after striking the ball away.

III. EVOLUTION PROCESS

  • The evolution process to find stronger kicks is based on the Confident Local Optimization technique (CLOP) [15].
  • As usual, the process aims to make a set of bounded parameters evolve through a fitness function.
  • L defines input parameters with their bounds and H defines the history set where previous results are collected.
  • Since the objective consists of finding stable moves, the authors believe that the smooth optimization of expert parameters is a promising policy.
  • The SUCCESS RATE allows keeping the process evolution close to stable regions (no fall).

VALUES OF PARAMETERS RESULTING FROM MANUAL TUNING

  • The pickOut function uses the three subsets s, m and σ from ν (respectively s′, m′ and σ′ from ν′), which identify the number of successes s, the average distance m between the final ball position and the kicker’s position, and its standard deviation σ.
  • In order to stay close to real robot motion, the authors limit the falls thanks to the SUCCESS RATE threshold.
  • Nevertheless, kicks that produce a 0.25 fall rate or less can be selected if they generate more powerful moves.

IV. EXPERIMENTS

  • The humanoid robot used is the NAO robot of the RoboCup 3D Soccer Simulation League (Fig. 5).
  • Table II gives the parameter values obtained through manual tuning.
  • The distance between ankles in the initial position is 0.1[m].
  • Before initiating the kicking motion, the robot bends its torso 17[deg] laterally on the side of the support foot.
  • The authors did not take into account the x-z parameters of the backward and forward positions because they are assumed to have less influence on the result, and because a reduced set of parameters is better to speed up the optimization process.

V. RESULTS

  • Tests are carried out on a single computer that runs the simulation server rcssserver3d [18], their parametrised player agent rcssagent3d-l3m [19] and a coach that is responsible for starting each trial.
  • Figure 3 shows the results for 500 iterations of C2.
  • It shows the evolution turning from unstable (at the beginning) to stable (at the end) while kicking results appear to be better than previous ones over the successive iterations.
  • After the evolution process, resulting parameters are tested carefully.

PARAMETERS SETS CLASSIFICATION AND EVOLUTIONS RESULTS

  • For each experiment, the initial position of the robot on the field is considered as the origin of the global coordinate system.
  • Corresponding averages and standard deviations are detailed in table IV.
  • It shows that C1 resulting parameters produce a stronger kick that sends the ball 1.34 times farther than expert parameters.
  • Similar results are obtained for the C3 parameters.
  • By using C2 parameters, the lateral deviation of the ball can be reduced, as the related standard deviation is half that of C1 and 4.7 times smaller than that of C3.

VI. DISCUSSION

  • This study shows that the proposed evolution process based on the CLOP technique is useful to increase the kicking range of the ball.
  • Each position was defined by two angles, torso and foot sole, and two coordinates of the foot toe, namely x and z.
  • The positions of the toe in the backward and the forward positions were not used as evolving parameters in the optimization process.
  • These position parameters could be added to the process to change the curvature of the swing trajectory, which may have an influence on the kick strength.

VII. CONCLUSION

  • This paper presented an optimization process designed to increase the kicking range of humanoid players in the 3D-soccer simulation league.
  • The proposed anytime process improved on the manually tuned values, gaining 34% in ball kicking range.
  • The optimization process was divided into a two-class sequential process that produces more accurate motion.
  • More flexible motion remains to be developed, to produce adaptable kicks capable of sending the ball at various distances.
  • This could be achieved by varying parameters such as the foot sole angle.


Optimization of Parametrised Kicking Motion for Humanoid Soccer Player
Nicolas Jouandeau (a) and Vincent Hugel (b)
Abstract This paper presents a detailed description of the
optimization process used to increase the kicking skills of
humanoid players. The kicking movement consists of making a
step forward to put the support foot next to the ball to kick, then
to execute the kicking motion by rocking the leg that must strike
the ball. The rocking motion of the leg passes through three
positions; the first position is the leg being raised backward,
the second is the leg in position of strike, and the third is the
leg forward to sweep the ball. The optimization process is an
evolution of skill parameters guided by an evaluation function
that aims to send the ball as far as possible. To optimize the kick
a rocking motion of the torso is also introduced. The parameters
used for each one of the three kicking positions are the angle
of the torso, the angle of the foot sole, the longitudinal and the
vertical coordinates of the foot toe. Thanks to this technique it
is possible to increase the distance of the kicked ball by 34%.
I. INTRODUCTION
While competitions involve multiple humanoids, the specialization
and optimization of individual agents are an issue for
the enhancement of team play. As most RoboCup leagues
are based on soccer games to develop robotics, walking and
kicking represent two essential skills that playing robots need
to master on the field. Walking can be enough to score,
because of the possibility of dribbling the ball towards the
opponent’s goal. Kicking the ball is a very important action
in competitive collective soccer games. It can enable higher
scoring, but it is also a way to pass the ball between players,
which can produce lively and interesting game strategies.
The easiest way to design kicking movements consists
of defining fixed keyframes using interpolations techniques
(SPL team B-Human [1], SPL team HTWK [2]). Instead
of conducting the interpolation inside the joint space, the
trajectories can be determined inside the Cartesian space
(SPL team Nao Devils [3]), which enables flexibility in the
design.
Some methods deal with building omnidirectional kicks while keeping quasi-static balance during the motion, i.e. keeping the projection of the COM within the support polygon. For example, Xu et al. [4] used a geometric calculation in the horizontal plane of the farthest-backward reachable point from the hitting point according to the desired kicking direction. Other RoboCup participants like Ferreira et al. from the FC Portugal team [5] built flexible trajectories thanks to path planning based on Bézier curves for the 3D simulation league. Müller et al. from B-Human [6] also made use of pieces of Bézier curves in the Cartesian space to ensure smooth trajectories in the desired direction.
(a) N. Jouandeau is with the Advanced Computer Science Lab (LIASD), Université Paris 8, 2 rue de la Liberté, 93526 Saint-Denis Cedex 02, France, n at ai.univ-paris8.fr
(b) V. Hugel is with the Engineering System Lab (LISV), Université de Versailles, 10/12 av. Europe, 78140 Vélizy - France, hugel at lisv.uvsq.fr
More recent developments use the Zero Moment
Point (ZMP) [7] to keep the dynamic balance of robots
executing a kick. The planning of the ZMP trajectory inside
the support polygon was employed for biped gaits, while
applying a preview controller for increased stability [8],
[9]. Yi et al. designed a walk-kick technique using the
ZMP preview controller, and applied it on different robotic
platforms [10]. Wenk et al. implemented inverse dynamics
on the NAO robot to plan the ZMP trajectory, and compared
both ZMP planning methods, namely the preview controller
and the Linear Quadratic Regulation (LQR) method [11].
In his master thesis, Buckley [12] presented an interesting
literature review of human kicking and animated kicking
movements. Thanks to parametrised cubic spline trajectories,
he proposed to model the swing kick as a pendulum to benefit
from the potential energy due to gravity by generating high
linear acceleration at the end of the rotated part of the swing
movement.
Other developments include the design of a controlled-kicking
engine that can adapt to a variety of distances and angles
through a decision method that selects from among a
large set of possible kicks [13], and reinforcement learning
techniques to deal with penalty kick scenarios [14].
In this work, our aim is not to design accurate nor flexible
kicking movements, but to kick and send the ball as far
as possible from a fixed position. To achieve this goal, we
parametrised the kick with eight parameters, including the
rotation angle of the torso and the rotation angle of the
foot sole. The rocking of the torso and the rocking of the
kicking foot sole must help to transmit more kinetic energy
to the ball, and send it even farther away. To find optimized
values for the kicking trajectory parameters, we defined
an automatized evolution process that is adapted from the
Confident Local Optimization techniques (CLOP) [15].
The paper is organized as follows. Section II details the
decomposition of the strong kicking movement we developed
for the 3D-SSL league using the NAO humanoid model.
Section III presents the optimization process designed to
make the kick stronger than the manual kick. Section IV
describes the experiments and Section V details the
experimental results. Section VI discusses the results
before the conclusion.
II. DESCRIPTION OF THE KICKING MOVEMENT
A parametrised kick was designed to enable robots to kick
the ball far away. Strong kick capabilities offer a serious
advantage to the kick-off team that can send the ball deep
into the opponent field.
Fig. 1. Decomposition of kicking movement. Initial standing position. End of preparation step. Lift off of leg and body. Backward leg position. Striking leg position. Forward leg position.
The parametrised kick consists of the following phases (Fig. 1):
  • Sway the hips to transfer the load above the kicking foot, then lift, swing, and put down the supporting foot. This places the supporting foot next to the ball. Then transfer the load to the supporting foot while tilting the trunk laterally and externally.
  • Lift the kicking foot while raising the body. The support leg is extended to move the hips higher. This provides the kicking foot with more kinetic energy at the time of striking the ball since the distance between the hip and the foot is increased. The projection of the center of mass remains at the same position.
  • Put the kicking foot backward while rocking the trunk forward. The final position of the leg is called the backward position.
  • Put the kicking foot in the position of kick, i.e. the foot toes in contact with the ball, while reaching a fixed inclination of the trunk.
  • Put the kicking foot forward to accompany the kick movement, while rocking the trunk backward. The final position of the leg is called the forward position.
The rocking of the trunk is made possible by the actuation of the hip of the supporting foot. This increases the velocity of the kicking foot at the time of hitting the ball, and therefore transmits a larger amount of kinetic energy to the ball, which sends the ball farther away. The kicking movement is parametrised insofar as the positions of the kicking leg in the backward position, in the kicking position and in the forward position can be tuned to optimize the distance covered by the ball after the strike.
TABLE I
TIME DECOMPOSITION OF KICKING MOVEMENT
Kick phase                                                 Time [sec]
sway hips to transfer load to kicking foot                 0.22
lift, swing, put down the other foot                       0.30
tilt the body sideways/transfer load above support foot    0.60
raise the body and the kicking foot                        0.40
put the kicking foot backward                              0.30
sweep the kicking foot until the striking position         0.11
sweep the kicking foot forward                             0.11
Full movement                                              2.04
The parameters for each position are:
1) the longitudinal offset of the foot toes, relative to the
position of the support foot,
2) the vertical offset of the foot toes, relative to the
position of the support foot,
3) the rotation angle of the torso in the sagittal plane,
4) the rotation angle of the kicking foot sole in the sagittal
plane.
The kicking trajectory is thus defined by a set of three leg configurations in the Cartesian space, each with four parameters, which makes 12 parameters in total. The inverse kinematics is the same as the one used for the locomotion gaits [16] by the French team L3M-SIM for its participation in the 3D-SSL competitions. The backward-to-strike and strike-to-forward swing trajectories of the foot toe and foot rotation are interpolated linearly, and the time of execution was reduced as much as possible to generate the highest acceleration at the time of hitting the ball.
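The linear interpolation between the three leg configurations can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `LegPose`, `lerp_pose` and `swing_pose` are hypothetical, the pose values are placeholders rather than tuned parameters, and the two 0.11 s sweep durations are taken from Table I. The inverse kinematics that would turn each pose into joint angles is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LegPose:
    """One leg configuration: the four parameters listed for each position."""
    x_toe: float      # longitudinal toe offset [m]
    z_toe: float      # vertical toe offset [m]
    torso_deg: float  # torso angle in the sagittal plane [deg]
    sole_deg: float   # foot sole angle in the sagittal plane [deg]


def lerp_pose(a: LegPose, b: LegPose, u: float) -> LegPose:
    """Blend two poses linearly; u is clamped to [0, 1]."""
    u = min(max(u, 0.0), 1.0)
    return LegPose(
        (1.0 - u) * a.x_toe + u * b.x_toe,
        (1.0 - u) * a.z_toe + u * b.z_toe,
        (1.0 - u) * a.torso_deg + u * b.torso_deg,
        (1.0 - u) * a.sole_deg + u * b.sole_deg,
    )


def swing_pose(backward, strike, forward, t, t_strike=0.11, t_forward=0.11):
    """Pose at time t [s] after leaving the backward position."""
    if t <= t_strike:
        # First segment: backward position to strike position.
        return lerp_pose(backward, strike, t / t_strike)
    # Second segment: strike position to forward position.
    return lerp_pose(strike, forward, (t - t_strike) / t_forward)


# Placeholder poses for illustration only (assumed values, not tuned ones).
BACKWARD = LegPose(-0.10, 0.05, 0.0, 0.0)
STRIKE = LegPose(0.20, 0.01, 10.0, 40.0)
FORWARD = LegPose(0.30, 0.20, 20.0, 60.0)
```

Shortening `t_strike` in such a sketch steepens the first segment, which is the lever the text describes for raising the foot speed at impact.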
We assume that the robot’s feet are placed adequately close to the ball before triggering the kicking action. We did not implement any COM or ZMP based stabilizer, since it does not matter if the robot falls after striking the ball away. We assume that the manual tuning of the trajectory before the strike is enough to prevent the robot from falling too early.
III. EVOLUTION PROCESS
The evolution process to find stronger kicks is based on the Confident Local Optimization technique (CLOP) [15]. This technique was already used with success by Jouandeau et al. for the simultaneous evolution of morphological and walking parameters to design morphed players adapted for efficient walk [17] in the framework of the RoboCup 3D Simulation Soccer League.
As usual, the process aims to make a set of bounded parameters evolve through a fitness function. The evolution process repeats the Generate-And-Test algorithm presented in Alg. 1. At each step, the generation of new parameter values is defined with a black box optimizer that takes into account results history (thus trying to maximize future Test results). The evolution process is mainly built on the filling of L and H sets over tests. L defines input parameters with their bounds and H defines the history set where previous results are collected. At each step of the evolution, the optimization process computes an accurate score over all previous steps that are estimated less confident than the mean of all steps. This iterative maximum-optimization process has been experimentally shown to be less time consuming than classical regression methods for smooth problem optimization; it does not need any reference execution as samples are iteratively selected according to their average win rate. Since the objective consists of finding stable moves, we believe that the smooth optimization of expert parameters is a promising policy. One evolution step corresponds to k trials. At the end of each step, the decision function pickOut states if the result is better than, equivalent to or worse than the best known result. Initiated with expert knowledge, ν′ stores the best known values. The evolution process is implemented as an anytime interruptible algorithm. When the time elapsed is considered as sufficient to produce an interesting solution, the evolution process is instantaneously interrupted and the resulting input parameters are returned.
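The Generate-And-Test loop described above can be sketched as follows. This is only an illustration under stated assumptions: uniform random sampling inside the bounds stands in for the CLOP-based generator (it is not CLOP), a fixed step budget `max_steps` stands in for the anytime interruption, and the names `evolve`, `run_trials` and `pick_out` are hypothetical callables supplied by the caller.

```python
import random


def evolve(expert_params, bounds, run_trials, pick_out, k, max_steps, seed=0):
    """Generate-and-test loop with a random-search stand-in for the optimizer.

    expert_params: dict of manually tuned parameter values (starting point)
    bounds:        dict mapping parameter name to (min, max)
    run_trials:    callable (params, k) -> result of k trials
    pick_out:      callable (result, best_result) -> "BEST" | "EQUAL" | "WORST"
    """
    rng = random.Random(seed)
    best_params = dict(expert_params)
    best_result = run_trials(best_params, k)  # assumed: expert result seeds the best
    history = []                              # plays the role of the history set H

    for _ in range(max_steps):                # stands in for "while not interrupted"
        # Generate: a real optimizer would exploit `history` here.
        candidate = {name: rng.uniform(lo, hi) for name, (lo, hi) in bounds.items()}
        # Test: run k trials and compare with the best known result.
        result = run_trials(candidate, k)
        verdict = pick_out(result, best_result)
        history.append((candidate, result, verdict))
        if verdict == "BEST":
            best_params, best_result = candidate, result

    return best_params, history
```

Because the best parameters are updated only on a BEST verdict, the loop can be stopped after any step and still return a usable parameter set, which is the property the anytime formulation relies on.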
Applied to kick optimization, the pickOut function (Alg. 2) differentiates the strongest kick according to some constant values. All these evaluation functions make use of three constant values called α, β and γ (respectively equal to 3, 1 and 0.7):
  • α is the nearness factor from the best known result,
  • β is the similarity factor from the nearness factor test with a different threshold,
  • γ is the stability factor, that is narrowed by the previous best known result.
The SUCCESS RATE threshold keeps the evolution process close to stable regions (no fall). It was fixed to 0.75. This success rate prevents m from taking the value of 0. The average and standard deviations are computed between successful trials. In the strong kick evolution, the goal is to kick as far as possible. The parametrised type T is not used here but allows other types of evolution to be addressed.
Inside the Test function, the pickOut function compares ν with ν′ and returns three possible values: BEST (which implies ν′ ← ν), EQUAL and WORST. H is updated at each evolution iteration.
Algorithm 1 evolution<T>(k, L, pickOut)
  ν′ ← expertKnowledge
  H ← ∅
  while not-interrupted do
    p ← Generate<T>(H, L)
    ν ← multipleTrials<T>(p, k)
    (ν′, H) ← Test<T>(ν, ν′, p, H, pickOut)
  end while
  return paramsFrom<T>(ν′)
TABLE II
VALUES OF PARAMETERS RESULTING FROM MANUAL TUNING
Parameter                   Value        Range min   Range max
Backward position
longitudinal toe offset     0.140[m]     0.240       0.040
vertical toe offset         0.025[m]     0.001       0.300
angle of torso              50.0[deg]    80.0        0.0
angle of foot sole          73.0[deg]    0.0         110.0
Position at strike
longitudinal toe offset     0.220[m]     0.010       0.300
vertical toe offset         0.007[m]     0.001       0.100
angle of torso              15.0[deg]    45.0        45.0
angle of foot sole          40.0[deg]    0.0         110.0
Forward Position (end of sweeping motion)
longitudinal toe offset     0.300[m]     0.100       0.400
vertical toe offset         0.210[m]     0.050       0.300
angle of torso              45.0[deg]    0.0         80.0
angle of foot sole          65.0[deg]    110.0       110.0
Algorithm 2 pickOut<T>((s, m, σ), (s′, m′, σ′))
  if s < SUCCESS RATE then return WORST;
  else if m − m′ + ασ′ < 0 then return WORST;
  else if m − m′ + βσ′ < 0 then return EQUAL;
  else if σ − γσ′ < 0 then return BEST;
  else if m − m′ < 0 then return EQUAL;
  else return BEST;
The pickOut function (Alg. 2) uses the three subsets s, m and σ from ν (respectively s′, m′ and σ′ from ν′), which identify the number of successes s, the average distance m between the final ball position and the kicker’s position, and its standard deviation σ. In the test steps of Alg. 2, s and m are considered independently. In order to stay close to real robot motion, we limit the falls thanks to the SUCCESS RATE threshold. Nevertheless, kicks that produce a 0.25 fall rate or less can be selected if they generate more powerful moves.
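The decision cascade of Alg. 2 can be sketched in a few lines. The sketch rests on two assumptions that should be checked against the original paper: the comparisons are read as differences (m − m′ + ασ′, m − m′ + βσ′, σ − γσ′, m − m′), and s is treated as the fraction of successful (no-fall) trials so that it can be compared with the 0.75 threshold. The name `pick_out` and the tuple layout are illustrative.

```python
ALPHA = 3.0          # nearness factor
BETA = 1.0           # similarity factor
GAMMA = 0.7          # stability factor
SUCCESS_RATE = 0.75  # minimum fraction of trials without a fall


def pick_out(new, best):
    """Classify a trial batch against the best known one.

    new, best: tuples (s, m, sigma) with
      s     fraction of successful trials (assumed interpretation),
      m     mean kick distance over successful trials [m],
      sigma standard deviation of that distance [m].
    """
    s, m, sigma = new
    _, m_best, sigma_best = best

    if s < SUCCESS_RATE:
        return "WORST"                      # too many falls
    if m - m_best + ALPHA * sigma_best < 0:
        return "WORST"                      # far below the best mean
    if m - m_best + BETA * sigma_best < 0:
        return "EQUAL"                      # below, but within the near band
    if sigma - GAMMA * sigma_best < 0:
        return "BEST"                       # comparable range, clearly more stable
    if m - m_best < 0:
        return "EQUAL"                      # slightly shorter, not more stable
    return "BEST"                           # at least as far as the best
```

For example, against a best known result of `(1.0, 7.5, 0.5)`, a batch `(1.0, 7.2, 0.2)` is classified as BEST under this reading because its spread drops below 0.7 times the reference spread even though its mean is slightly shorter.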
IV. EXPERIMENTS
The humanoid robot used is the NAO robot of the
RoboCup 3D Soccer Simulation League (Fig. 5).
Table II gives the parameter values obtained through
manual tuning. These parameters allow executing a kick that
can send the ball up to 7.5[m]. The duration of the kick is
2.04[sec]. The distance between ankles in the initial position
is 0.1[m]. Before initiating the kicking motion, the robot
bends its torso 17[deg] laterally on the side of the support
foot.

Fig. 2. A kicking motion with simspark.
We tried 3 evolutions named C1, C2, and C3:
  • C1 is composed of all 6 angular parameters, i.e. the angles of torso and foot sole for the backward, strike and forward positions,
  • C2 is composed of the x and z toe offsets at strike only,
  • C3 is composed of all the 8 kicking parameters defined for C1 and C2.
C1 and C3 are applied over the standard parameters. C2 is
complementary to C1. After C1, we optimized C2 to produce
a more precise kick in a reduced parametrised space. We did
not take into account the x-z parameters of the backward
and forward positions because they are assumed to have
less influence on the result, and because a reduced set of
parameters is better to speed up the optimization process.
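The bookkeeping behind the three classes can be made explicit with a small sketch. The tuple-based naming below is hypothetical and only serves to count parameters: 12 in total, 6 in C1, 2 in C2, 8 in C3, and 4 left at their manually tuned values.

```python
POSITIONS = ("backward", "strike", "forward")
ANGLES = ("torso_angle", "foot_sole_angle")
TOE_OFFSETS = ("x_toe", "z_toe")

# Full parametrisation: 3 positions x 4 parameters.
ALL_PARAMS = {(pos, name) for pos in POSITIONS for name in ANGLES + TOE_OFFSETS}

# C1: every angular parameter of every position.
C1 = {(pos, name) for pos in POSITIONS for name in ANGLES}
# C2: toe offsets of the strike position only.
C2 = {("strike", name) for name in TOE_OFFSETS}
# C3: union of C1 and C2, evolved in a single run.
C3 = C1 | C2

# Toe offsets of the backward and forward positions stay at their tuned values.
FIXED = ALL_PARAMS - C3
```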
V. RESULTS
Tests are carried out on a single computer that runs the
simulation server rcssserver3d [18], our parametrised player
agent rcssagent3d-l3m [19] and a coach that is responsible
for starting each trial. To compare the results, all evolution
processes were stopped after 48 hours, which makes
approximately one thousand iterations.
Table III gives the parameter values obtained for each
evolution. Figure 3 shows the results for 500 iterations of C2.
The best results are represented in green, equivalent results in
blue and worst results in red. Figure 3 shows the distribution
of values in the sub-space (xToe, zToe) and the nature
of the results. Concentration values close to xToe= 0.2 at
the bottom show the mutual dependence between these two
parameters in characterizing nice kicking moves. Figure 4
shows the distribution during the first 100 iterations, and
Fig. 5 during the last 200 iterations. It shows the evolution
turning from unstable (at the beginning) to stable (at the end)
while kicking results appear to be better than previous ones
over the successive iterations.
xToe varies between 0.01 and 0.3[m] on the horizontal
x-axis. zToe varies between 0.001 and 0.1[m] on the
vertical y-axis. The figures show that the optimization process leads
to a confined area around [0.2 ; 0.01].
After the evolution process, resulting parameters are tested carefully. Figure 6 shows the final ball coordinates over 50 runs in the global coordinate system for each class C1, C2, and C3, and for expert parameters. For each experiment, the initial position of the robot on the field is considered as the origin of the global coordinate system.
TABLE III
PARAMETERS SETS CLASSIFICATION AND EVOLUTIONS RESULTS
Classes   Parameter description      Parameter   Position
C3, C1    angle of torso             angT0       Backward
C3, C1    angle of foot sole         angFS0      Backward
C3, C1    angle of torso             angT2       Forward
C3, C1    angle of foot sole         angFS2      Forward
C3, C1    angle of torso             angT1       At strike (middle of sweeping motion)
C3, C1    angle of foot sole         angFS1      At strike (middle of sweeping motion)
C3, C2    longitudinal toe offset    xToe        At strike (middle of sweeping motion)
C3, C2    vertical toe offset        zToe        At strike (middle of sweeping motion)

Parameter   C1           C2          C3
angT0       43.5[deg]    -           58.6[deg]
angFS0      98.1[deg]    -           96.5[deg]
angT1       28.5[deg]    -           14.2[deg]
angFS1      31.2[deg]    -           92.1[deg]
xToe        -            0.212[m]    0.266[m]
zToe        -            0.004[m]    0.084[m]
angT2       62.5[deg]    -           70.4[deg]
angFS2      56.6[deg]    -           104.1[deg]
Corresponding averages and standard deviations are detailed in Table IV. It shows that the parameters resulting from C1 produce a stronger kick that sends the ball 1.34 times farther than the expert parameters. Similar results are obtained for the C3 parameters. By using C2 parameters, the lateral deviation of the ball can be reduced, as the related standard deviation is half that of C1 and 4.7 times smaller than that of C3. It shows that dividing the C3 evolution into C1 followed by C2 produces a more accurate motion, considering a comparable optimization process with an equivalent number of iterations.
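The statistics compared here (average kick distance and spread of the lateral deviation) could be computed from the final ball coordinates along the following lines. This is a sketch under assumptions: the kicker stands at the origin, x is taken as the longitudinal axis and y as the lateral one, and the helper names `kick_stats` and `range_gain` are hypothetical. No data from the paper is reproduced.

```python
import math
import statistics


def kick_stats(final_positions):
    """Summarize a batch of kicks from final ball positions (x, y) in metres.

    The kicker is assumed to stand at the origin, with x as the
    longitudinal axis and y as the lateral axis.
    """
    distances = [math.hypot(x, y) for x, y in final_positions]
    laterals = [y for _, y in final_positions]
    return {
        "mean_distance": statistics.mean(distances),
        "std_distance": statistics.stdev(distances),
        "std_lateral": statistics.stdev(laterals),
    }


def range_gain(evolved_positions, expert_positions):
    """Ratio of mean kick distances, evolved over expert."""
    evolved = kick_stats(evolved_positions)["mean_distance"]
    expert = kick_stats(expert_positions)["mean_distance"]
    return evolved / expert
```

A `range_gain` of 1.34 would correspond to the 34% improvement reported for C1, and the `std_lateral` entries are the quantities compared between C1, C2 and C3.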
VI. DISCUSSION
This study shows that the proposed evolution process
based on the CLOP technique is useful to increase the
kicking range of the ball. The kick swing was decomposed
into three positions. Each position was defined by two angles,
torso and foot sole, and two coordinates of the foot toe,
namely x and z. The evolution process took into account
eight parameters, including the angle parameters of the three
positions, and the x,z parameters of the strike position. The
positions of the toe in the backward and the forward positions
were not used as evolving parameters in the optimization
process. These position parameters could be added to the
process to change the curvature of the swing trajectory, which
may have an influence on the kick strength.
In addition, cubic spline trajectories instead of linear
interpolation for the swing motion could be useful to get
a higher velocity at the moment of hitting the ball.
The process assumed that the robot accurately placed its feet in a fixed position with respect to the ball. In particular, the lateral position of the foot toe remains fixed. The variation of the lateral coordinate of the foot toe was not taken into account in the evolution process, which explains the deviation of the ball from the longitudinal axis of the robot. However, the lateral position of the toe can be adjusted to modify the kicking direction so as to hit the ball in the center, which can provide more flexibility to the kicking motion.
The swing motion includes a rocking motion of the torso and a rotation of the foot sole with the aim to increase the velocity of the foot when striking the ball. The rocking motions come from the activation of the hip pitch joint and the ankle pitch joint. The hip of the NAO robot also features a yaw joint at 45[deg]. The lateral component of this joint could be used in conjunction with the pitch joint to increase the velocity of the swing motion at the time of strike. Then the swing motion will not be limited to the sagittal plane but will follow a 3D curved trajectory. This is a future improvement that requires a specific calculation to synchronize both the movements of the kicking leg and the torso because the yaw joints of the NAO are coupled.
Fig. 3. Evolution of xToe and zToe values [m] of Position at strike over 500 iterations of C2.
Fig. 4. Evolution of xToe and zToe values [m] of Position at strike during the first 100 iterations of C2.
Fig. 5. Evolution of xToe and zToe values [m] of Position at strike during the last 200 iterations of C2.
Fig. 6. Horizontal ball position coordinates [m] after a kick with expert, C1, C2 and C3 evolutions.
VII. CONCLUSION
This paper presented an optimization process designed to increase the kicking range of humanoid players in the 3D-soccer simulation league. The proposed anytime process improved on the manually tuned values, gaining 34% in ball kicking range. The optimization process was divided into a two-class sequential process that produces more accurate motion. More flexible motion remains to be developed, to produce adaptable kicks capable of sending the ball at various distances. This could be achieved by varying parameters such as the foot sole angle.

Citations
More filters
Proceedings ArticleDOI
01 Aug 2022
TL;DR: A hierarchical framework is proposed that leverages deep reinforcement learning to train a robust motion control policy that can track arbitrary motions and a planning policy to decide the desired kicking motion to shoot a soccer ball to a target.
Abstract: We address the problem of enabling quadrupedal robots to perform precise shooting skills in the real world using reinforcement learning. Developing algorithms to enable a legged robot to shoot a soccer ball to a given target is a challenging problem that combines robot motion control and planning into one task. To solve this problem, we need to consider the dynamics limitation and motion stability during the control of a dynamic legged robot. Moreover, we need to consider motion planning to shoot the hard-to-model deformable ball rolling on the ground with uncertain friction to a desired location. In this paper, we propose a hierarchical framework that leverages deep reinforcement learning to train (a) a robust motion control policy that can track arbitrary motions and (b) a planning policy to decide the desired kicking motion to shoot a soccer ball to a target. We deploy the proposed framework on an A1 quadrupedal robot and enable it to accurately shoot the ball to random targets in the real world.

12 citations

Book ChapterDOI
27 Jul 2017
TL;DR: It is shown that a model-free approach to learn behaviors in joint space can be successfully used to utilize toes of a humanoid robot to learn different kick behaviors on simulated Nao robots with toes in the RoboCup 3D soccer simulator.
Abstract: In this paper we show that a model-free approach to learn behaviors in joint space can be successfully used to utilize toes of a humanoid robot. Keeping the approach model-free makes it applicable to any kind of humanoid robot, or robot in general. Here we focus on the benefit on robots with toes which is otherwise more difficult to exploit. The task has been to learn different kick behaviors on simulated Nao robots with toes in the RoboCup 3D soccer simulator. As a result, the robot learned to step on its toe for a kick that performs 30% better than learning the same kick without toes.

11 citations

Journal ArticleDOI
TL;DR: The proposed method presents a kicking strategy during walking for humanoid soccer robots that is a model-free and based on dynamic programming and has significantly improved the team overall performance and robots ability to kick.
Abstract: Nowadays, humanoid soccer serves as a benchmark for artificial intelligence and robotic problems. The factors such as the kicking speed and the number of kicks by robot soccer players are the most significant aims that the participating teams are pursued in the RoboCup 3D Soccer Simulation League. The proposed method presents a kicking strategy during walking for humanoid soccer robots. Achieving an accurate and powerful kicking while robots are moving requires a dynamic optimization of the speed and motion parameters of the robot. In this paper, a curved motion path has been designed based on the robot position relative to the ball and the goal. Ultimately, the robot will be able to kick at the goal by walking along this curve path. The speed and angle of the walking robot are set towards the ball with regard to the robots curved motion path. After the final step of the robot, the accurate and effective adjustment of these two parameters ensures that the robot is located in the ideal position to perform the perfect kick. Due to the noise and walking condition of the robot, it is essential that the speed and angle of motion to be measured more accurately. For this purpose, we use a reinforcement learning model to adjust the robots step size and so does achieve the optimal value of two abovementioned parameters. Using reinforcement learning, robot would learn to pursue an optimal policy to correctly kick towards designated points. Therefore, the proposed method is a model-free and based on dynamic programming. The experiments reveal that the proposed method has significantly improved the team overall performance and robots ability to kick. Our proposed method has been 9.32% successful on average and outperformed the UTAustinVilla agent in terms of goal-scoring time in a non-opponent simulator.

8 citations

01 Jan 2017

6 citations

Dissertation
01 Jun 2013
TL;DR: A transition gait that morphs the walk directly into the kick back swing pose is developed and the act of kicking itself is explored both analytically and empirically, and solutions are provided that are versatile and powerful.
Abstract: Striker speed and accuracy in the RoboCup (SPL) international robot soccer league is becoming increasingly important as the level of play rises. Competition around the ball is now decided in a matter of seconds. Therefore, eliminating any wasted actions or motions is crucial when attempting to kick the ball. It is common to see a discontinuity between walking and kicking where a robot will return to an initial pose in preparation for the kick action. In this thesis we explore the removal of this behaviour by developing a transition gait that morphs the walk directly into the kick back swing pose. The solution presented here is targeted towards the use of the Aldebaran walk for the Nao robot. The solution we develop involves the design of a central pattern generator to allow for controlled steps with realtime accuracy, and a phase locked loop method to synchronise with the Aldebaran walk so that precise step length control can be activated when required. An open loop trajectory mapping approach is taken to the walk that is stabilized statically through the use of a phase varying joint holding torque technique. We also examine the basic principles of open loop walking, focussing on the commonly overlooked frontal plane motion. The act of kicking itself is explored both analytically and empirically, and solutions are provided that are versatile and powerful. Included as an appendix, the broader matter of striker behaviour (process of goal scoring) is reviewed and we present a velocity control algorithm that is very accurate and efficient in terms of speed of execution.

5 citations

References
Proceedings ArticleDOI
10 Nov 2003
TL;DR: A new method of a biped walking pattern generation by using a preview control of the zero-moment point (ZMP) is introduced and a preview controller can be used to compensate the ZMP error caused by the difference between a simple model and the precise multibody model.
Abstract: We introduce a new method of a biped walking pattern generation by using a preview control of the zero-moment point (ZMP). First, the dynamics of a biped robot is modeled as a running cart on a table which gives a convenient representation to treat ZMP. After reviewing conventional methods of ZMP based pattern generation, we formalize the problem as the design of a ZMP tracking servo controller. It is shown that we can realize such controller by adopting the preview control theory that uses the future reference. It is also shown that a preview controller can be used to compensate the ZMP error caused by the difference between a simple model and the precise multibody model. The effectiveness of the proposed method is demonstrated by a simulation of walking on spiral stairs.

2,090 citations


"Optimization of parametrised kickin..." refers methods in this paper

  • ...The planning of the ZMP trajectory inside the support polygon was employed for biped gaits, while applying a preview controller for increased stability [8], [9]....

Journal ArticleDOI
TL;DR: The paper gives an in-depth discussion of source results concerning ZMP, paying particular attention to some delicate issues that may lead to confusion if this method is applied in a mechanistic manner onto irregular cases of artificial gait, i.e. in the case of loss of dynamic balance of a humanoid robot.
Abstract: This paper is devoted to the permanence of the concept of Zero-Moment Point, widely known by the acronym ZMP. Thirty-five years have elapsed since its implicit presentation (actually before being named ZMP) to the scientific community and thirty-three years since it was explicitly introduced and clearly elaborated, initially in the leading journals published in English. Its first practical demonstration took place in Japan in 1984, at Waseda University, Laboratory of Ichiro Kato, in the first dynamically balanced robot WL-10RD of the robotic family WABOT. The paper gives an in-depth discussion of source results concerning ZMP, paying particular attention to some delicate issues that may lead to confusion if this method is applied in a mechanistic manner onto irregular cases of artificial gait, i.e. in the case of loss of dynamic balance of a humanoid robot. After a short survey of the history of the origin of ZMP a very detailed elaboration of ZMP notion is given, with a special review concerning “boundary cases” when the ZMP is close to the edge of the support polygon and “fictitious cases” when the ZMP should be outside the support polygon. In addition, the difference between ZMP and the center of pressure is pointed out. Finally, some unresolved or insufficiently treated phenomena that may yield a significant improvement in robot performance are considered.

2,011 citations


"Optimization of parametrised kickin..." refers methods in this paper

  • ...More recent work developments use the Zero Moment Point (ZMP) [7] to keep the dynamic balance of the robot executing a kick....

  • ...The planning of the ZMP trajectory inside the support polygon was employed for biped gaits, while applying a preview controller for increased stability [8], [9]....

  • ...Yi et al. designed a walk-kick technique using the ZMP preview controller, and applied it on different robotic platforms [10]....

  • ...Wenk et al. implemented inverse dynamics on the NAO robot to plan the ZMP trajectory, and compared both ZMP planning methods, namely the preview controller and the Linear Quadratic Regulation (LQR) method [11]....

  • ...We did not implement any COM or ZMP based stabilizer, since it does not matter if the robot falls after striking the ball away....

Proceedings ArticleDOI
03 May 2010
TL;DR: An algorithm, Reinforcement Learning with Decision Trees (RL-DT), that uses decision trees to learn the model by generalizing the relative effect of actions across states, and which is effective on an Aldebaran Nao humanoid robot scoring goals in a penalty kick scenario.
Abstract: Reinforcement learning (RL) algorithms have long been promising methods for enabling an autonomous robot to improve its behavior on sequential decision-making tasks. The obvious enticement is that the robot should be able to improve its own behavior without the need for detailed step-by-step programming. However, for RL to reach its full potential, the algorithms must be sample efficient: they must learn competent behavior from very few real-world trials. From this perspective, model-based methods, which use experiential data more efficiently than model-free approaches, are appealing. But they often require exhaustive exploration to learn an accurate model of the domain. In this paper, we present an algorithm, Reinforcement Learning with Decision Trees (RL-DT), that uses decision trees to learn the model by generalizing the relative effect of actions across states. The agent explores the environment until it believes it has a reasonable policy. The combination of the learning approach with the targeted exploration policy enables fast learning of the model. We compare RL-DT against standard model-free and model-based learning methods, and demonstrate its effectiveness on an Aldebaran Nao humanoid robot scoring goals in a penalty kick scenario.

96 citations


"Optimization of parametrised kickin..." refers background in this paper

  • ...Other developments include the design of a controlled kicking engine that can adapt to a variety of distances and angles through a decision method that can select from among a large set of possible kicks [13], and reinforcement learning techniques to deal with penalty kick scenarios [14]....

Book ChapterDOI
29 Sep 2004
TL;DR: A new multi-agent simulation system, called Spark, for physical agents in three-dimensional environments, which implemented a flexible application framework and exhausted the idea of replaceable components in the resulting system.
Abstract: In this paper we describe a new multi-agent simulation system, called Spark, for physical agents in three-dimensional environments. Our goal in creating Spark was to provide a great amount of flexibility for creating new types of agents and simulations. To achieve this, we implemented a flexible application framework and exhausted the idea of replaceable components in the resulting system. In comparison to specialized simulators, users can effortlessly create new simulations by using a scene description language. Spark is a powerful and flexible tool to state different multi-agent research questions. It is used as official simulator for the first three-dimensional RoboCup Simulation League competition. We present the concepts we used to achieve the flexibility in our system and show how we seamlessly integrated the different subsystems into one user-friendly framework.

52 citations


"Optimization of parametrised kickin..." refers methods in this paper

  • ...Tests are carried out on a single computer that runs the simulation server rcssserver3d [18], our parametrised player agent rcssagent3d-l3m [19] and a coach that is responsible for starting each trial....

Book ChapterDOI
01 Jan 2011
TL;DR: A motion engine that translates motions into joint angles by using trajectories is presented, defined as a set of Bezier curves that can be changed online to allow adjusting, for example, a kicking motion precisely to the actual position of the ball.
Abstract: Complex motions like kicking a ball into the goal are becoming more important in RoboCup leagues such as the Standard Platform League. Thus, there is a need for motion sequences that can be parameterized and changed dynamically. This paper presents a motion engine that translates motions into joint angles by using trajectories. These motions are defined as a set of Bezier curves that can be changed online to allow adjusting, for example, a kicking motion precisely to the actual position of the ball. During the execution, motions are stabilized by the combination of center of mass balancing and a gyro feedback-based closed-loop PID controller.

41 citations

Frequently Asked Questions (12)
Q1. What could be used to increase the velocity of the ball?

In addition, using cubic spline trajectories instead of linear interpolation for the swing motion could help to reach the highest velocity at the moment of hitting the ball.
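
To illustrate why a cubic trajectory can help, the sketch below compares a linear segment with a single cubic Hermite segment between two joint positions. It is only a minimal illustration, not the authors' implementation: the function names, the joint values, and the choice of a Hermite segment (rather than a full spline through all three kick positions) are assumptions. With linear interpolation the velocity is fixed at (p1 - p0) / T, whereas the cubic segment lets the velocity at the end point (the strike) be chosen freely.

```python
def linear_velocity(p0, p1, T):
    """Constant velocity of a linear interpolation from p0 to p1 in time T."""
    return (p1 - p0) / T


def hermite(p0, p1, v0, v1, T, t):
    """Position at time t on a cubic Hermite segment.

    The segment starts at p0 with velocity v0 and ends at p1 with velocity v1.
    """
    s = t / T
    h00 = 2 * s**3 - 3 * s**2 + 1
    h10 = s**3 - 2 * s**2 + s
    h01 = -2 * s**3 + 3 * s**2
    h11 = s**3 - s**2
    return h00 * p0 + h10 * T * v0 + h01 * p1 + h11 * T * v1


def hermite_velocity(p0, p1, v0, v1, T, t):
    """Time derivative of the cubic Hermite segment at time t."""
    s = t / T
    dh00 = 6 * s**2 - 6 * s
    dh10 = 3 * s**2 - 4 * s + 1
    dh01 = -6 * s**2 + 6 * s
    dh11 = 3 * s**2 - 2 * s
    return (dh00 * p0 + dh01 * p1) / T + dh10 * v0 + dh11 * v1


if __name__ == "__main__":
    # Hypothetical swing: joint angle from -0.5 rad to 0.3 rad in 0.2 s.
    p0, p1, T = -0.5, 0.3, 0.2
    v_lin = linear_velocity(p0, p1, T)
    # Start at rest and request twice the linear velocity at the strike.
    v_hit = hermite_velocity(p0, p1, 0.0, 2 * v_lin, T, T)
    print(f"linear: {v_lin:.2f} rad/s, cubic at strike: {v_hit:.2f} rad/s")
```

Starting at rest and requesting twice the average velocity at the end point gives a purely quadratic profile in normalised time, so the joint still moves monotonically towards the strike position.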

Currently, the kicking trajectory is defined by a set of three leg configurations expressed in Cartesian space, which makes 12 parameters in total.

The authors did not take into account the x-z parameters of the backward and forward positions because they are assumed to have less influence on the result, and because a reduced set of parameters is better to speed up the optimization process. 
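
The parameter count can be made concrete with a small data-structure sketch. The class and field names below are hypothetical; only the four quantities per position (torso angle, foot sole angle, longitudinal and vertical toe coordinates) and the three positions come from the paper. The `evolving` method reflects one reading of the statement above, namely that the toe x-z coordinates of the backward and forward positions are held fixed, which would leave 8 values for the optimizer.

```python
from dataclasses import dataclass, astuple


@dataclass
class LegPose:
    torso_angle: float  # angle of the torso
    sole_angle: float   # angle of the foot sole
    toe_x: float        # longitudinal coordinate of the foot toe
    toe_z: float        # vertical coordinate of the foot toe


@dataclass
class KickParams:
    backward: LegPose  # leg raised backward
    strike: LegPose    # leg in position of strike
    forward: LegPose   # leg forward to sweep the ball

    def flatten(self):
        """All 3 x 4 = 12 parameters as a flat list."""
        poses = (self.backward, self.strike, self.forward)
        return [value for pose in poses for value in astuple(pose)]

    def evolving(self):
        """Parameters left to the optimizer under the assumption stated above."""
        b, s, f = self.backward, self.strike, self.forward
        return [b.torso_angle, b.sole_angle,
                s.torso_angle, s.sole_angle, s.toe_x, s.toe_z,
                f.torso_angle, f.sole_angle]


if __name__ == "__main__":
    # Placeholder values, not taken from the paper.
    kick = KickParams(LegPose(0.0, 0.0, -0.10, 0.05),
                      LegPose(0.1, 0.0, 0.00, 0.02),
                      LegPose(0.2, 0.1, 0.12, 0.06))
    print(len(kick.flatten()), len(kick.evolving()))
```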

Since the objective consists of finding stable moves, the authors believe that the smooth optimization of expert parameters is a promising policy. 

However, the lateral position of the toe can be adjusted to modify the kicking direction so as to hit the ball in the center, which can provide more flexibility to the kicking motion.

Inside the Test function, the pickOut function compares ν with ν′ and returns one of three possible values: BEST (which implies ν′ ← ν), EQUAL and WORST.

Algorithm 1: evolution<T>(k, L, pickOut)
    ν′ ← expertKnowledge
    H ← ∅
    while not-interrupted do
        p ← Generate<T>(H, L)
        ν ← multipleTrials<T>(p, k)
        (ν′, H) ← Test<T>(ν, ν′, p, H, pickOut)
    end while
    return paramsFrom<T>(ν′)
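
The loop of Algorithm 1 can be sketched in Python as follows. This is a hedged illustration rather than the authors' code: the excerpt does not say how Generate, multipleTrials or Test work internally, so `generate`, `multiple_trials`, `update_best`, the `(score, params)` representation of ν, the iteration `budget` standing in for the interruption, and the `toy_trial` objective replacing the simulator are all assumptions.

```python
import random
from statistics import mean

BEST, EQUAL, WORST = "BEST", "EQUAL", "WORST"


def pick_out(nu, nu_best):
    """Compare two (score, params) evaluations; a higher mean score wins."""
    if nu[0] > nu_best[0]:
        return BEST
    return EQUAL if nu[0] == nu_best[0] else WORST


def generate(history, limits, rng, step=0.1):
    """Hypothetical Generate: perturb the best candidate in H, clamped to L."""
    if not history:
        return [rng.uniform(lo, hi) for lo, hi in limits]
    _, base = max(history, key=lambda entry: entry[0])
    return [min(hi, max(lo, x + rng.gauss(0.0, step * (hi - lo))))
            for x, (lo, hi) in zip(base, limits)]


def multiple_trials(params, k, trial):
    """Run k trials and summarise them as (mean score, params)."""
    return (mean([trial(params) for _ in range(k)]), params)


def update_best(nu, nu_best, history, pick_out):
    """Stand-in for Test: record nu in H and replace the best on BEST."""
    history = history + [nu]
    if pick_out(nu, nu_best) == BEST:
        nu_best = nu
    return nu_best, history


def evolution(k, limits, pick_out, trial, expert_params, budget, rng):
    nu_best = multiple_trials(expert_params, k, trial)  # expert knowledge
    history = []                                        # H starts empty
    for _ in range(budget):                             # "while not interrupted"
        p = generate(history, limits, rng)
        nu = multiple_trials(p, k, trial)
        nu_best, history = update_best(nu, nu_best, history, pick_out)
    return list(nu_best[1])                             # paramsFrom(best)


def toy_trial(params):
    """Toy objective replacing the simulated kick distance (assumption)."""
    return -sum((x - 0.3) ** 2 for x in params)


if __name__ == "__main__":
    limits = [(-1.0, 1.0)] * 3
    best = evolution(5, limits, pick_out, toy_trial, [0.0, 0.0, 0.0],
                     200, random.Random(0))
    print(best, toy_trial(best))
```

Because the best evaluation is only replaced when `pick_out` returns BEST, the stored score never decreases, which matches the description of BEST implying ν′ ← ν.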

This increases the velocity of the kicking foot at the time of hitting the ball, and therefore transmits a larger amount of kinetic energy to the ball, which sends the ball farther away.

By using the C2 parameters, the lateral deviation of the ball can be reduced: the related standard deviation is half that obtained with C1, and 4.7 times smaller than that obtained with C3.

The parametrised kick consists of the following phases (Fig. 1):

  • sway hips to transfer the load above the kicking foot, then lift, swing, and put down the supporting foot.

When the elapsed time is considered sufficient to produce an interesting solution, the evolution process is interrupted immediately and the resulting input parameters are returned.

The positions of the toe in the backward and the forward positions were not used as evolving parameters in the optimization process. 

This iterative maximum-optimization process has been experimentally shown to be less time consuming than classical regression methods for smooth problem optimization; it does not need any reference execution as samples are iteratively selected according to their average win rate.