
Journal ArticleDOI

A Cerebellar Internal Models Control Architecture for Online Sensorimotor Adaptation of a Humanoid Robot Acting in a Dynamic Environment

01 Jan 2020-Vol. 5, Iss: 1, pp 80-87

TL;DR: A novel methodology to artificially replicate these learning and adaptive principles into a robotic feedback controller that combines machine learning, artificial neural network, and computational neuroscience techniques to deal with all the nonlinearities and complexities that modern robotic systems could present.

Abstract-Humanoid robots are often supposed to operate in non-deterministic human environments, and as a consequence, the robust and gentle rejection of the external perturbations is extremely crucial. In this scenario, stable and accurate behavior is mostly solved through adaptive control mechanisms that learn an internal model to predict the consequences of the outgoing control signals. Evidences show that brain-based biological systems resolve this control issue by updating an appropriate internal model that is then used to direct the muscles activities. Inspired by the biological cerebellar internal models theory, that couples forward and inverse internal models into the biological motor control scheme, we propose a novel methodology to artificially replicate these learning and adaptive principles into a robotic feedback controller. The proposed cerebellar-like network combines machine learning, artificial neural network, and computational neuroscience techniques to deal with all the nonlinearities and complexities that modern robotic systems could present. Although the architecture is tested on the simulated humanoid iCub, it can be applied to different robotic systems without excessive customization, thanks to its neural network-based nature. During the experiments, the robot is requested to follow repeatedly a movement while it is interacting with two external systems. Four different internal model architectures are compared and tested under different conditions. The comparison of the performances confirmed the theories about internal models combinatory action. The combination of models together with the structural and learning features of the network, resulted in a benefit to the adaptation mechanism, but also the system response to nonlinearities, noise and external forces.

Topics: Humanoid robot (62%), iCub (56%), Internal model (55%), Adaptive control (55%), Neurorobotics (55%)

Summary (2 min read)

Introduction

  • The introduction of artificial neural networks (ANNs) into the adaptive control of nonlinear dynamical systems was advantageous for reducing the effects of nonlinearities and uncertainties, and for handling high-dimensional and continuous state space systems [11], [12], [13], [8], [14].
  • The structure of the paper is as follows: in section II the authors describe the overall control architecture, giving special focus to the cerebellar-like component; in section III, the experimental set up and results are presented.

A. Robot Plant

  • The humanoid iCub is a 53 degree of freedom (dof) robotic system equipped with several types of sensors, such as encoders, accelerometers, gyroscopes, F/T sensors, and digital cameras.
  • For the sake of simplicity, the overall system actuates seven motors of the right arm: four motors are kept constant to keep the arm upwards (i.e. elbow, shoulder roll, shoulder yaw and shoulder pitch), and N = 3 motors are controlled by the proposed controller (wrist pronosupination, wrist yaw and wrist pitch).
  • The n-th actual motor state is read by the encoders and saved in the process variables $q_n \in Q_{N\times 1}$ (angular position) and $\dot{q}_n \in \dot{Q}_{N\times 1}$ (angular velocity).

C. Controller

  • Once it has received the actual robot states $Q$, $\dot{Q}$, the Controller computes the torque command $\tau_n^{tot} \in \tau^{tot}_{N\times 1}$ needed to move each actuator to the desired state $q_n^r$, $\dot{q}_n^r$.
  • This subsystem consists of a static module based on classical control methods, and of two decentralized cerebellar-like neural networks (section II-D): inverse and forward models (blue boxes in Fig.2.b).
  • The forward model corrective term is narrowed to the angular velocity, which is the feedback controller input.
  • The feedback error (3) is corrected by the forward cerebellar-like module, which predicts the consequence of the outgoing motor command and adds the contribution $\Delta\dot{q}_n^c$ to minimize the feedback error $e_n^{fb}$.

D. Cerebellar-like Network

  • The cerebellum consists of several micro-zones that plausibly correspond to the minimal unit learning machine, ulm (Fig.3) [63].
  • The axons transmit information to two main groups of cells: the granule cells, Gr, which in Marr's opinion encode combinations of mossy fiber inputs [20]; and the Purkinje cells, pc (in green in Fig.3), which are modulated by the inferior olive axon and excited by the parallel fibers, pf (in violet), projecting from the granule cells, and which influence the activity of the deep cerebellar nuclei, dcn (in blue).
  • These models are employed by the algorithm to make the local predictions $\hat{\tau}^{gr}_{n,g}$ and $\hat{\dot{q}}^{gr}_{n,g}$ of the control input (inverse MCC) and angular velocity (forward MCC), respectively.

III. RESULTS

  • Four architectures that differ in terms of internal model contributions are compared: (I) feedback controller; (II) feedback controller combined with inverse cerebellar-like network; (III) feedback controller combined with forward cerebellar-like network; (IV) feedback controller combined with inverse and forward cerebellar-like networks.
  • Due to the stochastic nature of the experiments, the recorded data are expressed as µ mean value and σ standard deviation of the 20 tests.
  • In particular, thanks to the forward model action, architectures III and IV robustly reduce the effect of noise as suggested by [51].
  • It is worthwhile to mention that the feedback controller of the first joint is highly affected by the table weight, which slowly leads the joint towards the correct reference.

IV. CONCLUSIONS

  • Thus far, the authors have presented, tested, and compared four control architectures based on a versatile and real-time modeling structure that replicates the cerebellar internal models individual and combinatorial theories.
  • The experiments confirmed the theories about the internal model independent and combinatorial contribution.
  • In the proposed model, the learning rules that iteratively update the network weights are based on synaptic plasticities derived from computational neuroscience studies [42], [59].
  • At the current state the cerebellar network can not generalize all the possible conditions.


A Cerebellar Internal Models Control Architecture for Online Sensorimotor Adaptation
of a Humanoid Robot Acting in a Dynamic Environment
Capolei, Marie Claire; Andersen, Nils Axel; Lund, Henrik Hautop ; Falotico, Egidio; Tolu, Silvia
Published in:
IEEE Robotics and Automation Letters
Link to article, DOI:
10.1109/LRA.2019.2943818
Publication date:
2020
Document Version
Peer reviewed version
Link back to DTU Orbit
Citation (APA):
Capolei, M. C., Andersen, N. A., Lund, H. H., Falotico, E., & Tolu, S. (2020). A Cerebellar Internal Models
Control Architecture for Online Sensorimotor Adaptation of a Humanoid Robot Acting in a Dynamic Environment.
IEEE Robotics and Automation Letters, 5(1), 80-87. https://doi.org/10.1109/LRA.2019.2943818

2377-3766 (c) 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/LRA.2019.2943818, IEEE Robotics
and Automation Letters
IEEE ROBOTICS AND AUTOMATION LETTERS. PREPRINT VERSION. ACCEPTED SEPTEMBER, 2019 1
A Cerebellar Internal Models Control Architecture for Online Sensorimotor Adaptation of a Humanoid Robot Acting in a Dynamic Environment
Marie Claire Capolei¹, Nils Axel Andersen¹, Henrik Hautop Lund¹, Egidio Falotico², and Silvia Tolu¹
Abstract—Humanoid robots are often supposed to operate in
non-deterministic human environments, and as a consequence,
the robust and gentle rejection of the external perturbations
is extremely crucial. In this scenario, stable and accurate be-
havior is mostly solved through adaptive control mechanisms
that learn an internal model to predict the consequences of
the outgoing control signals. Evidences show that brain-based
biological systems resolve this control issue by updating an
appropriate internal model that is then used to direct the muscles
activities. Inspired by the biological cerebellar internal models
theory, that couples forward and inverse internal models into the
biological motor control scheme, we propose a novel methodology
to artificially replicate these learning and adaptive principles into
a robotic feedback controller. The proposed cerebellar-like net-
work combines machine learning, artificial neural network, and
computational neuroscience techniques to deal with all the non-
linearities and complexities that modern robotic systems could
present. Although the architecture is tested on the simulated
humanoid iCub, it can be applied to different robotic systems
without excessive customization, thanks to its neural network-
based nature. During the experiments, the robot is requested to
follow repeatedly a movement while it is interacting with two
external systems. Four different internal model architectures are
compared and tested under different conditions. The comparison
of the performances confirmed the theories about internal models
combinatory action. The combination of models together with
the structural and learning features of the network, resulted in a
benefit to the adaptation mechanism, but also the system response
to nonlinearities, noise and external forces.
Index Terms—Biomimetics, Neurorobotics, Model Learning for
Control, Learning and Adaptive Systems, Control Architectures
and Programming.
I. INTRODUCTION
MODERN robots are often mechanically complex, and are embedded in unstructured non-deterministic environments [1]. The accurate and stable motor control of such systems is often challenging due to the unreliability of the hand engineered modeling strategies, which are too strict to describe all the complexities and nonlinearities.
Manuscript received: June 24, 2019; Revised July 26, 2019; Accepted September 10, 2019.
This paper was recommended for publication by Editor Youngjin Choi upon evaluation of the Associate Editor and Reviewers' comments. This work has received funding from the EU-H2020 Framework Program for Research and Innovation under the specific grant agreement No. 785907 (Human Brain Project SGA2), and from the Marie Curie project n. 705100 (Biomodular).
¹ Marie Claire Capolei, Nils A. Andersen, Henrik Hautop Lund and Silvia Tolu are with the Automation and control group, Department of Electrical Engineering, Technical University of Denmark, Kgs. Lyngby, Denmark {macca,naa,hhl,stolu}@elektro.dtu.dk
² Egidio Falotico is with the BioRobotics Institute, Scuola Superiore Sant'Anna, Pontedera, Pisa, Italy egidio.falotico@santannapisa.it
Digital Object Identifier (DOI): see top of this page.
In this manuscript, we propose an online learning and
control algorithm to dynamically adapt the movements of a
robotic system acting in an uncertain non-deterministic envi-
ronment. In the design process, we assumed that: the Jacobian
poorly describes the actual robotic condition; one or more
unmodeled external objects interfere with the movement; the
state space system is multivariable and not fully observable;
the action/state space is continuous and high-dimensional. In
this view, the controller should improve the tracking accuracy
of each actuator, and minimize the effects of noise through
force-based control input.
Traditionally, uncertain systems were learned by estimat-
ing open parameters of structured mathematical models [2].
Although this approach has been used for several years in
system identification and adaptive control, fitting the parame-
ters of fixed structure with training data can lead to different
drawbacks, such as: physical inconsistency [3]; unmodeled
behavior; persistent excitation issues [4]; and unstable reaction
to high estimation error.
In the last decades, due to the advancement in artificial intelligence, a large number of non-parametric approaches have been proposed to solve the aforementioned problems [5], [6], [7], [8], [9], [10]. For instance, the introduction of artificial neural networks (ANNs) into the adaptive control of nonlinear dynamical systems was advantageous for reducing the effects of nonlinearities and uncertainties, and for handling high-dimensional and continuous state space systems [11], [12], [13], [8], [14]. Despite the structural versatility that distinguishes ANNs, the continuous interaction between the robotic system and the non-deterministic environment can be constrained by the off-line training of the neural network.
The Autonomous Mental Development (AMD) theorists
claim that robots should learn and evolve their processing
through real-time interaction with the environment [15], [16].
In this view, model learning is not seen anymore as a summa-
tion of off-line learned experiences but as an online develop-
ment of the current knowledge of the system [17], [18]. These
theories have their foundation in studies of biological systems,
such as humans, especially infants. The advanced mechanisms
exploited by biological systems to explore their relation with
the surroundings, and control their own movements, motivated
several scientists towards a better understanding of the biolog-
ical motor control.
James S. Albus was the first person to propose a robotic control architecture enhanced by an artificial neural network based on evidence of the central nervous system (CNS): the "cerebellar model articulation controller" (CMAC) [19]. The CMAC module was mainly inspired by David Marr's theory [20], which depicts the cerebellum, a neural structure located at the back of the brain, as a "language translator between data in the cerebrum, and command sequences needed by the muscles" [21].
In the last decades, several scientists have been attracted
by the fascinating mechanisms and functional roles of the
cerebellum in motor and cognitive tasks [22], [23], [24], [25],
[26], [27]. Among all the hypotheses, the scientific community
is highly supporting the involvement of the cerebellum in the
acquisition and maintenance of the internal models, mapping
the correlation within the body and the environment [28],
[29], [30], [31], i.e., forward and inverse models [32], [33]. If
confirmed, these assumptions would explain several complex
mechanisms underlying the neural control of movements [34].
The inverse model elaborates the motor command that leads the system from the current state to a desired one [35]. Its contribution enables fast and coordinated limb movements that are not achievable with pure feedback control, due to the biological system dynamics [32]. Evidence shows that some of the motor deficits caused by cerebellar dysfunction, e.g., quick ballistic limb movements and impaired muscle coordination [36], are due to the lack of feed-forward contribution in motor control, or rather the neural control loop is affected by slow reaction time and sensory delay [34]. Although it is controversial [37], [38], scientists argued that integrating the efference copy signal of the delayed sensory feedback could overcome these CNS transmission problems [39]. Different prototypes of cerebellar control architecture based on the inverse model theory have been proposed, such as: adaptive filter models [40], [41]; spiking neural networks [42], [43]; and combinations of parametric adaptive control and machine learning techniques [44], [45].
The forward model describes the causal relationship between the outgoing motor command and system state. This model is beneficial for predicting those state transitions that are not directly accessible [46]. Electrophysiological studies [47], [48], computational theories [28], [29], and imaging and lesion data [49], [50] suggest that the forward model could explain pivotal cerebellar functions, such as error correction and learning. Moreover, robotics experiments proved that the forward model could play an important role in action prediction, sensory discrepancy minimization, and noise cancellation [51], [52].
Inspired by the theory of coupled internal models [53], [54], [55], [56], [57], [58], we propose a novel methodology to artificially replicate and exploit the learning and corrective action of the cerebellar internal models. In particular, we designed a neural network that, through the combination of machine learning, artificial neural network, and computational neuroscience techniques, replicates the functionality, learning, modularity, and morphology of the cerebellar circuit. This bio-mimetic network is embedded in a feedback robotic control architecture, and is intended to minimize modeling errors and to constrain the effects of noise, uncertainties, and external disturbances. The network weights are defined by non-linear and multidimensional learning functions that mimic the cerebellar synaptic plasticities, as proposed by [59], [42]. The manuscript presents the comparison of four adaptive control architectures based on the cerebellar internal models theories. The control system is tested on the virtual humanoid robot iCub [60] embedded in the Neurorobotics Platform (Fig.1.a) [61], [62]. The performance of the architectures is evaluated under different noise and external perturbation conditions. The study confirmed that the coupling of forward and inverse internal models shows improved performance with respect to the independent action of each internal model. Moreover, the biologically plausible weighting kernel, together with the layered structure of the cerebellar networks, proved beneficial in constraining the effects of external perturbations and nonlinearities.
Fig. 1: Robotic plant: a) the humanoid iCub holding the table-ball system in the Neurorobotics Platform; b) the three controlled wrist joints: 1 pronosupination, 2 yaw, 3 pitch.
The structure of the paper is as follows: in section II we
describe the overall control architecture, giving special focus to
the cerebellar-like component; in section III, the experimental
set up and results are presented. The manuscript concludes
with the discussion of the main findings in comparison with
the literature and future directions.
II. MATERIALS AND METHODS
The robotic system, or rather Agent (Fig.2.a), consists of: a Planner, which generates the reference motor angular positions and velocities $Q^r_{N\times 1}$, $\dot{Q}^r_{N\times 1}$ (where $N$ is the number of controlled joints) that are sent to the controller; the Controller, which elaborates the torque commands $\tau^{tot}_{N\times 1}$ needed to move the actuators to the desired states $Q^r_{N\times 1}$, $\dot{Q}^r_{N\times 1}$; and the Robotic Plant, which includes the actuators and the proprioceptive sensors employed to read the actual angular positions $Q$ and velocities $\dot{Q}$. The Agent interacts with two external systems, which in this manuscript are represented by a table and a rolling ball (Fig.1.a).
A. Robot Plant
The humanoid iCub is a 53 degree of freedom (dof) robotic system equipped with several types of sensors, such as encoders, accelerometers, gyroscopes, F/T sensors, and digital cameras. For the sake of simplicity, the overall system actuates seven motors of the right arm: four motors are kept constant to keep the arm upwards (i.e. elbow, shoulder roll, shoulder yaw and shoulder pitch), and N = 3 motors are controlled by the proposed controller (namely wrist pronosupination, wrist yaw and wrist pitch, Fig.1.b). The n-th actual motor state is read by the encoders and saved in the process variables $q_n \in Q_{N\times 1}$ (angular position) and $\dot{q}_n \in \dot{Q}_{N\times 1}$ (angular velocity).
B. Planner
The Planner plans the reference trajectories $q_n^r \in Q^r_{N\times 1}$, $\dot{q}_n^r \in \dot{Q}^r_{N\times 1}$, or rather it generates oscillatory movements,
$$ q_n^r = A_n \cdot \sin(2\pi f t + \varphi_n), \tag{1} $$
$$ \dot{q}_n^r = 2\pi f A_n \cdot \cos(2\pi f t + \varphi_n), \tag{2} $$
with fixed temporal frequency $f = 0.25$ Hz, amplitude $A_n$ and phase $\varphi_n$,
$$ A_{1\times N} = [A_1, A_2, A_3] = [0.1727, 0.1363, 0.0345]\ \text{rad} $$
$$ \varphi_{1\times N} = [\varphi_1, \varphi_2, \varphi_3] = [0.5\pi, 0.5\pi, 0.0]\ \text{rad}. $$
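As an illustration only, Eqs. (1) and (2) with the amplitudes and phases listed above can be evaluated as in the following sketch. The function name `reference_state` and the constant names are hypothetical and are not taken from the paper's implementation.

```python
import math

# Values quoted in the Planner section; the names are illustrative assumptions.
F_HZ = 0.25                                   # temporal frequency f [Hz]
AMPLITUDES = (0.1727, 0.1363, 0.0345)         # A_1..A_3 [rad]
PHASES = (0.5 * math.pi, 0.5 * math.pi, 0.0)  # phi_1..phi_3 [rad]


def reference_state(t):
    """Return (q_ref, dq_ref) for the three wrist joints at time t in seconds."""
    omega = 2.0 * math.pi * F_HZ
    # Eq. (1): reference angular position of each joint.
    q_ref = [a * math.sin(omega * t + p) for a, p in zip(AMPLITUDES, PHASES)]
    # Eq. (2): reference angular velocity, the time derivative of Eq. (1).
    dq_ref = [omega * a * math.cos(omega * t + p)
              for a, p in zip(AMPLITUDES, PHASES)]
    return q_ref, dq_ref
```

At t = 0 the first two joints sit at their amplitude with zero velocity (phase 0.5π), while the third joint starts at zero position with its maximum velocity.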
C. Controller
Once it has received the actual robot states $Q$, $\dot{Q}$, the Controller computes the torque command $\tau_n^{tot} \in \tau^{tot}_{N\times 1}$ needed to move each actuator to the desired state $q_n^r$, $\dot{q}_n^r$. This subsystem consists of a static module based on classical control methods, and of two decentralized cerebellar-like neural networks (section II-D): inverse and forward models (blue boxes in Fig.2.b). The inverse cerebellar-like module adds the feed-forward corrective torque command $\Delta\tau_n^c \in \Delta\tau^c_{N\times 1}$ to the feedback controller motor input $\tau_n^{fb} \in \tau^{fb}_{N\times 1}$ [63], [64], while the forward module applies the state-specific adjustment $\Delta\dot{q}_n^c \in \Delta\dot{q}^c$ to the feedback loop [65], [66], [58]. In this initial design, the forward model corrective term is narrowed to the angular velocity, which is the feedback controller input.
Fig. 2: Control architecture scheme for N actuated joints: a) main components communication, and b) controller block.
In the details of Fig.2.b, the closed loop computes the feedback angular velocity error $e_n^{fb} \in e^{fb}_{N\times 1}$ of the n-th motor,
$$ e_n^{fb} = \dot{q}_n^r - \dot{q}_n. \tag{3} $$
This quantity is corrected by the forward cerebellar-like module, which predicts the consequence of the outgoing motor command and adds the contribution $\Delta\dot{q}_n^c$ to minimize the feedback error $e_n^{fb}$. The total error $e^{tot}$,
$$ e_n^{tot} = e_n^{fb} + \Delta\dot{q}_n^c, \tag{4} $$
is then employed both by the feedback controller to compute the feedback torque command $\tau_n^{fb}$, according to the proportional-integral-derivative (PID) independent joint control law, and by the inverse cerebellar-like model to compute the corrective torque $\Delta\tau_n^c \in \Delta\tau^c_{N\times 1}$ that minimizes both $e^{tot}$ and the angular position error $\epsilon_n$,
$$ \epsilon_n = q_n^r - q_n. \tag{5} $$
The total control input sent to the motors is the result of a feed-forward compensation [40],
$$ \tau^{tot} = \tau^{fb} + \Delta\tau^c. \tag{6} $$
On a final note, the PID regulator gains $K$ are tuned to operate weakly in linearized conditions, which exclude the disturbance of the ball and sensory noise,
$$ K_P = [K_{P1}, K_{P2}, K_{P3}] = [2.9000, 2.3000, 2.3500] $$
$$ K_I = [K_{I1}, K_{I2}, K_{I3}] = [1.9400, 1.9000, 1.9000] $$
$$ K_D = [K_{D1}, K_{D2}, K_{D3}] = [0.0050, 0.0001, 0.0004]. $$
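The signal flow of Eqs. (3) to (6) for a single joint can be sketched as below. This is a minimal illustration under stated assumptions: the text only says that the feedback torque follows a PID independent joint control law, so the textbook discrete-time PID form, the time step `dt`, and the names `JointPID` and `control_step` are assumptions rather than the authors' implementation. The cerebellar corrections are passed in as plain numbers.

```python
# PID gains quoted in the text for the three wrist joints.
KP = (2.9000, 2.3000, 2.3500)
KI = (1.9400, 1.9000, 1.9000)
KD = (0.0050, 0.0001, 0.0004)


class JointPID:
    """Textbook discrete-time PID for one joint (assumed discretization)."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def step(self, error, dt):
        self.integral += error * dt
        # No derivative term on the first call, when no previous error exists.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def control_step(pid, q_ref, dq_ref, q, dq, delta_dq_c, delta_tau_c, dt):
    """One control cycle for joint n; the cerebellar terms are given inputs."""
    e_fb = dq_ref - dq              # Eq. (3): feedback velocity error
    e_tot = e_fb + delta_dq_c       # Eq. (4): forward-model correction added
    eps = q_ref - q                 # Eq. (5): angular position error
    tau_fb = pid.step(e_tot, dt)    # PID acting on the total error
    tau_tot = tau_fb + delta_tau_c  # Eq. (6): feed-forward compensation
    return {"e_fb": e_fb, "e_tot": e_tot, "eps": eps,
            "tau_fb": tau_fb, "tau_tot": tau_tot}
```

With both corrections set to zero the function reduces to architecture (I), the plain feedback controller.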
D. Cerebellar-like Network
The cerebellum consists of several micro-zones that plausibly correspond to the minimal unit learning machine, ulm (Fig.3) [63]. Each ulm presents similar internal micro-circuitry, but it differs from the others in terms of external connectivity. There are two main types of axons that connect each ulm to the outside: the mossy fibers, mf (in magenta in Fig.3), which project signals regarding the position, velocity and direction of the limb movements [68]; and the climbing fibers (in red), which project the signal encoding the error from the inferior olive nucleus, io [47], [69]. These axons transmit the information to two main groups of cells: the granule cells, Gr, which in Marr's opinion encode combinations of mossy fiber inputs [20]; and the Purkinje cells, pc (in green in Fig.3), which are modulated by the inferior olive axon and excited by the parallel fibers, pf (in violet), projecting from the granule cells, and which influence the activity of the deep cerebellar nuclei, dcn (in blue). The dcn is inhibited by the pc and excited by both the io and mf, and it is responsible for the final processing of the signal that is sent outside the cerebellar circuit.
Fig. 3: Canonical cerebellar circuit in analogy with [67].
Fig. 4: Cerebellar-like neural network scheme: (a) structural modular partition of the inverse and forward module; (b) details of the networks. Panel stages: Inputs; Dimensionality reduction and mapping; Prediction Adjustment; Output Activation. Legend: Granule Cell, Inferior Olive, Purkinje Cell, Mossy Fiber, Deep Cerebellar Nuclei, Efference copy, Synaptic Weight.
In the proposed model (Fig.4.a), each ulm (light blue box) processes the information of the n-th controlled object (where n=1,...,N). Accordingly, the dcn of the n-th ulm outputs the cerebellar corrections $\Delta\dot{q}_n^c$ and $\Delta\tau_n^c$. Each ulm is divided into M sub-modules representing the canonical cerebellar microcircuit, ccm (yellow boxes in Fig.4.a). Each ccm encodes kinematic and/or dynamic features of the n-th controlled object, such as angular position and velocity. Together, the N ulm compose the Modular Cerebellar Circuit, MCC, which maps the inverse and forward models of the robotic system (green boxes in Fig.4.a).
Hereafter, for the sake of simplicity, the variable $x$ generically denotes the signals $\dot{q}_n$ and $\tau_n$ propagating inside the two separate networks, and $w$ generically denotes the specific network weight. The mossy fibers of the inverse MCC transmit information about the actual and reference angular velocity of all the controlled joints,
$$ MF^{inv}_{2N\times 1} = [mf^{inv}_1, \dots, mf^{inv}_{2N}]^T = [\dot{q}^r_1, \dots, \dot{q}^r_N, \dot{q}_1, \dots, \dot{q}_N]^T, \tag{7} $$
while the mossy fibers of the forward MCC project the signal encoding the reference angular velocities and the latest control inputs (6),
$$ MF^{frw}_{2N\times 1} = [mf^{frw}_1, \dots, mf^{frw}_{2N}]^T = [\dot{q}^r_1, \dots, \dot{q}^r_N, \tau^{tot}_1(t-1), \dots, \tau^{tot}_N(t-1)]^T. \tag{8} $$
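The two input vectors of Eqs. (7) and (8) are plain concatenations, which the following sketch makes explicit. The function names are hypothetical, and plain Python lists stand in for whatever vector type an actual implementation would use.

```python
def mossy_fibers_inverse(dq_ref, dq):
    """Eq. (7): reference velocities followed by actual velocities (length 2N)."""
    if len(dq_ref) != len(dq):
        raise ValueError("dq_ref and dq must both have length N")
    return list(dq_ref) + list(dq)


def mossy_fibers_forward(dq_ref, tau_tot_prev):
    """Eq. (8): reference velocities followed by the torques sent at t-1."""
    if len(dq_ref) != len(tau_tot_prev):
        raise ValueError("dq_ref and tau_tot_prev must both have length N")
    return list(dq_ref) + list(tau_tot_prev)
```

For the N = 3 wrist joints used in the paper, each vector therefore has six entries.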
The mossy fiber signals are then mapped and exploited to predict the control input $\tau^{tot}$ (inverse MCC) and the system state $\dot{q}$ (forward MCC). As proposed by [44], the granule layer is represented by the Locally Weighted Projection Regression algorithm (LWPR) [70]. The LWPR is a fast on-line nonlinear function approximation algorithm suitable for the reduction of high-dimensional state space systems. To replicate the efference copy theory [39], [71], the LWPR uses a copy of the outgoing $\tau^{tot}$ (inverse MCC) and the actual $\dot{q}$ (forward MCC) as modulatory signals (in cyan in Fig.4) to create and train on-line $G$ local linear models, or rather granule cells $Gr_g$ (where g=1,...,G). These models are employed by the algorithm to make the local predictions $\hat{\tau}^{gr}_{n,g}$ and $\hat{\dot{q}}^{gr}_{n,g}$ of the control input (inverse MCC) and angular velocity (forward MCC), respectively. The final output of the granular-parallel fibers layer (in violet in Fig.4.b) is the weighted mean of all the linear models (refer to [70] for the complete set of formulas),
$$ \hat{x}^{pf}_n = \frac{\sum_{g=1}^{G} w^{gr}_{n,g} \cdot \hat{x}^{gr}_{n,g}}{\sum_{g=1}^{G} w^{gr}_{n,g}}. \tag{9} $$
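Eq. (9) on its own is a normalized weighted average, sketched below. This is not an LWPR implementation: the local predictions and their weights are taken as given inputs, whereas in LWPR they come from the local linear models and their receptive fields [70]. The function name is hypothetical.

```python
def parallel_fiber_output(weights, predictions):
    """Eq. (9): weighted mean of the G local granule-cell predictions."""
    if len(weights) != len(predictions):
        raise ValueError("one weight per local prediction is required")
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one local model must have non-zero weight")
    return sum(w * x for w, x in zip(weights, predictions)) / total
```

For example, two local models predicting 2.0 and 4.0 with weights 1 and 3 give (1·2 + 3·4) / 4 = 3.5, so the output is pulled toward the more strongly activated model.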
The synaptic strength $w^{pf-pc}$ [42] of the parallel fiber-Purkinje cell (pf-pc) connections (Table I) is modulated by the inferior olive, io, transmitting the error signals (3,4,5) (in red in Fig.4.b),
$$ io^{inv}_n = [io^{inv}_{n,1}, io^{inv}_{n,2}]^T = [\epsilon_n, e^{tot}_n]^T, \tag{10} $$
$$ io^{frw}_n = [io^{frw}_{n,1}, io^{frw}_{n,2}]^T = [\epsilon_n, e^{fb}_n]^T. \tag{11} $$
The Purkinje cell output signal (in green in Fig.4.b) is the result of the modulated LWPR prediction $\hat{x}^{pf}$ (9),
$$ x^{pc}_{n,m} = w^{pf-pc}_{n,m}(t, io_{n,m}) \cdot \hat{x}^{pf}_n. \tag{12} $$
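The structure of Eqs. (10) to (12) can be sketched as follows. The learning rule that updates $w^{pf-pc}$ is given in Table I of the paper and is not reproduced here, so the weights are simply passed in as already-computed numbers; the function names are hypothetical.

```python
def inferior_olive_signals(eps, e_fb, e_tot):
    """Eqs. (10)-(11): error pairs carried by the io of joint n."""
    return {
        "inv": (eps, e_tot),  # inverse MCC: position error and total error
        "frw": (eps, e_fb),   # forward MCC: position error and feedback error
    }


def purkinje_outputs(w_pf_pc, x_pf):
    """Eq. (12): one Purkinje output per ccm m, scaling the same pf prediction."""
    return [w * x_pf for w in w_pf_pc]
```

Each ccm m thus sees the same parallel-fiber prediction but scales it with its own weight, which is driven by its own io component.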
With respect to [44], [52], both the mossy fiber input vectors $MF$ and the Purkinje cell signals $x^{pc}$ are reformulated: $x^{pf}$ is represented by the final LWPR prediction and not by the linear combination of the network weights; $x^{pc}$ is the result of a biologically plausible learning rule that is a function of the error (10,11), instead of the direct proportion of the error

Citations

Journal ArticleDOI
Qiang Bai1, Shaobo Li1, Jing Yang1, Qisong Song1, Zhiang Li1, Xingxing Zhang1 
TL;DR: According to the inherent defects of vision, this paper summarizes the research achievements of tactile feedback in the fields of target recognition and robot grasping and finds that the combination of vision and tactile feedback can improve the success rate and robustness of robot grasping.
Abstract: With the rapid development of machine learning, its powerful function in the machine vision field is increasingly reflected. The combination of machine vision and robotics to achieve the same precise and fast grasping as that of humans requires high-precision target detection and recognition, location and reasonable grasp strategy generation, which is the ultimate goal of global researchers and one of the prerequisites for the large-scale application of robots. Traditional machine learning has a long history and good achievements in the field of image processing and robot control. The CNN (convolutional neural network) algorithm realizes training of large-scale image datasets, solves the disadvantages of traditional machine learning in large datasets, and greatly improves accuracy, thereby positioning CNNs as a global research hotspot. However, the increasing difficulty of labeled data acquisition limits their development. Therefore, unsupervised learning, self-supervised learning and reinforcement learning, which are less dependent on labeled data, have also undergone rapid development and achieved good performance in the fields of image processing and robot capture. According to the inherent defects of vision, this paper summarizes the research achievements of tactile feedback in the fields of target recognition and robot grasping and finds that the combination of vision and tactile feedback can improve the success rate and robustness of robot grasping. This paper provides a systematic summary and analysis of the research status of machine vision and tactile feedback in the field of robot grasping and establishes a reasonable reference for future research.

16 citations


Journal ArticleDOI
11 Sep 2020
TL;DR: A control framework that ensures natural movements in articulated soft robots, implementing specific functionalities of the human central nervous system, i.e., learning by repetition, after-effect on known and unknown trajectories, anticipatory behavior, its reactive re-planning, and state covariation in precise task execution is introduced.
Abstract: Human beings can achieve a high level of motor performance that is still unmatched in robotic systems. These capabilities can be ascribed to two main enabling factors: (i) the physical properties of the human musculoskeletal system, and (ii) the effectiveness of the control operated by the central nervous system. Regarding point (i), the introduction of compliant elements in the robotic structure can be regarded as an attempt to bridge the gap between the animal body and the robot one. Soft articulated robots aim at replicating the musculoskeletal characteristics of vertebrates. Yet, substantial advancements are still needed from a control point of view to fully exploit the new possibilities provided by soft robotic bodies. This paper introduces a control framework that ensures natural movements in articulated soft robots, implementing specific functionalities of the human central nervous system, i.e., learning by repetition, after-effect on known and unknown trajectories, anticipatory behavior, its reactive re-planning, and state covariation in precise task execution. The control architecture we propose has a hierarchical structure composed of two levels. The low level deals with dynamic inversion and focuses on trajectory tracking problems. The high level manages the degree of freedom redundancy and allows the system to be controlled through a reduced set of variables. The building blocks of this novel control architecture are well-rooted in control theory, which can furnish an established vocabulary to describe the functional mechanisms underlying the motor control system. The proposed control architecture is validated through simulations and experiments on a bio-mimetic articulated soft robot.

3 citations


Posted Content
TL;DR: This work proposes a novel fully spiking neural system that relies on forward predictive learning by means of a cellular cerebellar model and predicts sensory corrections in input to a differential mapping spiking neural network during a visual servoing task of a robot arm manipulator.
Abstract: The cerebellum plays a distinctive role within our motor control system to achieve fine and coordinated motions. While cerebellar lesions do not lead to a complete loss of motor functions, both action and perception are severely impacted. Hence, it is assumed that the cerebellum uses an internal forward model to provide anticipatory signals by learning from the error in sensory states. In some studies, it was demonstrated that the learning process relies on the joint-space error. However, this may not exist. This work proposes a novel fully spiking neural system that relies on forward predictive learning by means of a cellular cerebellar model. The forward model is learnt thanks to the sensory feedback in task-space and it acts as a Smith predictor. The latter predicts sensory corrections in input to a differential mapping spiking neural network during a visual servoing task of a robot arm manipulator. In this paper, we promote the developed control system to achieve more accurate target reaching actions and reduce the motion execution time for the robotic reaching tasks thanks to the cerebellar predictive capabilities.

2 citations


Cites background or methods from "A Cerebellar Internal Models Contro..."

  • ...Most of the studies consider the cerebellar learning relying on error in joint space [7]–[10]....

    [...]

  • ...In [10], the introduced model is based on the adaptive filter theory....

    [...]


Journal ArticleDOI
TL;DR: A robust control method based on a neural network disturbance observer is proposed to mitigate the effect of time-varying system parameters and external disturbances on control system performance, providing a reference for multi-DOF robots to achieve high-precision tracking in complex and changeable environments.
Abstract: A multi-degree-of-freedom (DOF) robot is a complex and variable nonlinear system, and its control performance is affected by the inherent parameters of the model itself, friction, external disturbance, and other factors. A robust control method based on a neural network disturbance observer was proposed in this study to mitigate the effect of time-varying system parameters and external disturbances on the control system performance. A new dynamic model of robot error was constructed by analyzing the characteristics of the robot system model. The total disturbance of the system was observed and compensated online on the basis of the neural network observer, and the effectiveness of the control method was verified through simulation. Results demonstrate that the robust adaptive control method with neural network disturbance observer reduces the maximum angular displacement error by 2.7 times and the maximum angular velocity tracking error by 2.14 times compared with the control method without observer when model parameter perturbation and external disturbance are present in the system. The maximum angular velocity error is 4 and 88.6 times lower than proportional derivative (PD) compensation control and traditional sliding mode control, respectively. The neural network disturbance observer can accurately track the total disturbance of the system. The input torque of the proposed control method has a small peak torque, which is 1/8 and 1/2 times lower than the sliding mode control and PD compensation control, respectively, and the control curve of the proposed control method is relatively smooth. The proposed method provides a reference for multi-DOF robots to achieve high-precision tracking in complex and changeable environments.

1 citation


Cites methods from "A Cerebellar Internal Models Contro..."

  • ...Scholars had applied fuzzy control, neural network control and so on to the robot, and introduced an intelligent control method of the robot [19-21]....

    [...]


Journal ArticleDOI
Abstract: The cerebellum is a neural structure essential for learning, which is connected via multiple zones to many different regions of the brain, and is thought to improve human performance in a large range of sensory, motor and even cognitive processing tasks. An intriguing possibility for the control of complex robotic systems would be to develop an artificial cerebellar chip with multiple zones that could be similarly connected to a variety of subsystems to optimize performance. The novel aim of this paper, therefore, is to propose and investigate a multizone cerebellar chip applied to a range of tasks in robot adaptive control and sensorimotor processing. The multizone cerebellar chip was evaluated using a custom robotic platform consisting of an array of tactile sensors driven by dielectric electroactive polymers mounted upon a standard industrial robot arm. The results demonstrate that the performance in each task was improved by the concurrent, stable learning in each cerebellar zone. This paper, therefore, provides the first empirical demonstration that a synthetic, multizone, cerebellar chip could be embodied within existing robotic systems to improve performance in a diverse range of tasks, much like the cerebellum in a biological system.

1 citation


References

Proceedings Article
01 Jan 2009
TL;DR: This paper discusses how ROS relates to existing robot software frameworks, and briefly overview some of the available application software which uses ROS.
Abstract: This paper gives an overview of ROS, an open-source robot operating system. ROS is not an operating system in the traditional sense of process management and scheduling; rather, it provides a structured communications layer above the host operating systems of a heterogeneous compute cluster. In this paper, we discuss how ROS relates to existing robot software frameworks, and briefly overview some of the available application software which uses ROS.

7,367 citations


"A Cerebellar Internal Models Contro..." refers methods in this paper

  • ...The software is based on the ROS messaging framework [74]...

    [...]


Journal ArticleDOI
TL;DR: A detailed theory of cerebellar cortex is proposed whose consequence is that the cerebellum learns to perform motor skills, and two forms of input-output relation are described, both consistent with the cortical theory.
Abstract: 1. A detailed theory of cerebellar cortex is proposed whose consequence is that the cerebellum learns to perform motor skills. Two forms of input-output relation are described, both consistent with the cortical theory. One is suitable for learning movements (actions), and the other for learning to maintain posture and balance (maintenance reflexes). 2. It is known that the cells of the inferior olive and the cerebellar Purkinje cells have a special one-to-one relationship induced by the climbing fibre input. For learning actions, it is assumed that: (a) each olivary cell responds to a cerebral instruction for an elemental movement. Any action has a defining representation in terms of elemental movements, and this representation has a neural expression as a sequence of firing patterns in the inferior olive; and (b) in the correct state of the nervous system, a Purkinje cell can initiate the elemental movement to which its corresponding olivary cell responds. 3. Whenever an olivary cell fires, it sends an impulse (via the climbing fibre input) to its corresponding Purkinje cell. This Purkinje cell is also exposed (via the mossy fibre input) to information about the context in which its olivary cell fired; and it is shown how, during rehearsal of an action, each Purkinje cell can learn to recognize such contexts. Later, when the action has been learnt, occurrence of the context alone is enough to fire the Purkinje cell, which then causes the next elemental movement. The action thus progresses as it did during rehearsal. 4. It is shown that an interpretation of cerebellar cortex as a structure which allows each Purkinje cell to learn a number of contexts is consistent both with the distributions of the various types of cell, and with their known excitatory or inhibitory natures. It is demonstrated that the mossy fibre-granule cell arrangement provides the required pattern discrimination capability. 5. The following predictions are made. 
(a) The synapses from parallel fibres to Purkinje cells are facilitated by the conjunction of presynaptic and climbing fibre (or post-synaptic) activity. (b) No other cerebellar synapses are modifiable. (c) Golgi cells are driven by the greater of the inputs from their upper and lower dendritic fields. 6. For learning maintenance reflexes, 2(a) and 2(b) are replaced by 2’. Each olivary cell is stimulated by one or more receptors, all of whose activities are usually reduced by the results of stimulating the corresponding Purkinje cell. 7. It is shown that if (2’) is satisfied, the circuit receptor → olivary cell → Purkinje cell → effector may be regarded as a stabilizing reflex circuit which is activated by learned mossy fibre inputs. This type of reflex has been called a learned conditional reflex, and it is shown how such reflexes can solve problems of maintaining posture and balance. 8. 5(a), and either (2) or (2’) are essential to the theory: 5(b) and 5(c) are not absolutely essential, and parts of the theory could survive the disproof of either. Reprinted with permission of The Physiological Society, Oxford, England.

2,993 citations


"A Cerebellar Internal Models Contro..." refers background or methods in this paper

  • ...encode combinations of mossy fibers inputs [20]; the pc Purkinje cells (in green Fig....

    [...]

  • ...The CMAC module was mainly inspired by David Marr’s theory [20] that depicts the cerebellum, a neural structure located at the back of...

    [...]


Journal ArticleDOI
TL;DR: This work proposes an alternative theory of motor coordination based on stochastic optimal feedback control and shows that the optimal strategy in the face of uncertainty is to allow variability in redundant (task-irrelevant) dimensions.
Abstract: A central problem in motor control is understanding how the many biomechanical degrees of freedom are coordinated to achieve a common goal. An especially puzzling aspect of coordination is that behavioral goals are achieved reliably and repeatedly with movements rarely reproducible in their detail. Existing theoretical frameworks emphasize either goal achievement or the richness of motor variability, but fail to reconcile the two. Here we propose an alternative theory based on stochastic optimal feedback control. We show that the optimal strategy in the face of uncertainty is to allow variability in redundant (task-irrelevant) dimensions. This strategy does not enforce a desired trajectory, but uses feedback more intelligently, correcting only those deviations that interfere with task goals. From this framework, task-constrained variability, goal-directed corrections, motor synergies, controlled parameters, simplifying rules and discrete coordination modes emerge naturally. We present experimental results from a range of motor tasks to support this theory.

2,531 citations


"A Cerebellar Internal Models Contro..." refers background or methods in this paper

  • ...Although it is controversial [37], [38], scientists argued that integrating the efference copy signal of the delayed sensory feedback could overcome these CNS transmission problems [39]....

    [...]

  • ...To replicate the efference copy theory [39], [71], the LWPR uses a copy of the outgoing τ_tot (inverse MCC) and actual q̇ (forward MCC) as modulatory signals (in cyan Fig....

    [...]


Book
01 Jun 1984

2,507 citations


Journal ArticleDOI
TL;DR: It is demonstrated that, in order for the learning process to be stable, pattern storage must be accomplished principally by weakening synaptic weights rather than by strengthening them.
Abstract: A comprehensive theory of cerebellar function is presented, which ties together the known anatomy and physiology of the cerebellum into a pattern-recognition data processing system. The cerebellum is postulated to be functionally and structurally equivalent to a modification of the classical Perceptron pattern-classification device. It is suggested that the mossy fiber → granule cell → Golgi cell input network performs an expansion recoding that enhances the pattern-discrimination capacity and learning speed of the cerebellar Purkinje response cells. Parallel fiber synapses of the dendritic spines of Purkinje cells, basket cells, and stellate cells are all postulated to be specifically variable in response to climbing fiber activity. It is argued that this variability is the mechanism of pattern storage. It is demonstrated that, in order for the learning process to be stable, pattern storage must be accomplished principally by weakening synaptic weights rather than by strengthening them.

2,317 citations


"A Cerebellar Internal Models Contro..." refers background in this paper

  • ...the brain, as “language translator between data in the cerebrum, and command sequences needed by the muscles” [21]....

    [...]


Frequently Asked Questions (1)
Q1. What contributions have the authors mentioned in the paper "A Cerebellar Internal Models Control Architecture for Online Sensorimotor Adaptation of a Humanoid Robot Acting in a Dynamic Environment"?

Inspired by the biological cerebellar internal models theory, which couples forward and inverse internal models in the biological motor control scheme, the authors propose a novel methodology to artificially replicate these learning and adaptive principles in a robotic feedback controller. During the experiments, the robot is asked to repeatedly follow a movement while interacting with two external systems.