
A Cerebellar Internal Models Control Architecture for Online Sensorimotor Adaptation of a Humanoid Robot Acting in a Dynamic Environment

01 Jan 2020-Vol. 5, Iss: 1, pp 80-87
TL;DR: A novel methodology to artificially replicate these learning and adaptive principles into a robotic feedback controller that combines machine learning, artificial neural network, and computational neuroscience techniques to deal with all the nonlinearities and complexities that modern robotic systems could present.
Abstract: Humanoid robots are often expected to operate in non-deterministic human environments, and as a consequence, the robust and gentle rejection of external perturbations is crucial. In this scenario, stable and accurate behavior is mostly achieved through adaptive control mechanisms that learn an internal model to predict the consequences of the outgoing control signals. Evidence shows that brain-based biological systems resolve this control issue by updating an appropriate internal model that is then used to direct muscle activity. Inspired by the biological cerebellar internal models theory, which couples forward and inverse internal models in the biological motor control scheme, we propose a novel methodology to artificially replicate these learning and adaptive principles in a robotic feedback controller. The proposed cerebellar-like network combines machine learning, artificial neural network, and computational neuroscience techniques to deal with the nonlinearities and complexities that modern robotic systems can present. Although the architecture is tested on the simulated humanoid iCub, it can be applied to different robotic systems without excessive customization, thanks to its neural network-based nature. During the experiments, the robot is requested to repeatedly follow a movement while interacting with two external systems. Four different internal model architectures are compared and tested under different conditions. The comparison of the performances confirmed the theories about the combinatory action of internal models. The combination of models, together with the structural and learning features of the network, benefited the adaptation mechanism as well as the system response to nonlinearities, noise and external forces.

Summary

Introduction

  • The introduction of artificial neural networks (ANNs) into the adaptive control of nonlinear dynamical systems was advantageous for reducing the effects of nonlinearities and uncertainties, and for handling high dimensional and continuous state space systems [11], [12], [13], [8], [14].
  • The structure of the paper is as follows: in section II the authors describe the overall control architecture, giving special focus to the cerebellar-like component; in section III, the experimental set up and results are presented.

A. Robot Plant

  • The humanoid iCub is a 53 degree of freedom (dof) robotic system equipped with several types of sensors, such as encoders, accelerometers, gyroscopes, F/T sensors, and digital cameras.
  • For the sake of simplicity, the overall system actuates seven motors of the right arm: four motors are kept constant to keep the arm upwards (i.e. elbow, shoulder roll, shoulder yaw and shoulder pitch), and N = 3 motors are controlled by the proposed controller (namely wrist pronosupination, wrist yaw and wrist pitch).
  • The n-th actual motor state is read by the encoders and saved in the $q_n \in Q_{N\times 1}$ angular position and $\dot{q}_n \in \dot{Q}_{N\times 1}$ angular velocity process variables.

C. Controller

  • The Controller, once it has received the $Q$, $\dot{Q}$ actual robot states, computes the $\tau^{tot}_n \in \tau^{tot}_{N\times 1}$ torque command to move each actuator to the $q^r_n$, $\dot{q}^r_n$ desired state.
  • This subsystem is constituted by a static module based on classical control methods, and by two decentralized cerebellar-like neural networks (section II-D): inverse and forward models (blue boxes Fig.2.b).
  • The forward model corrective term is narrowed to the angular velocity, which is the feedback controller input.
  • The feedback angular velocity error is corrected by the forward cerebellar-like module, which predicts the consequence of the outgoing motor command and adds the $\Delta\dot{q}^c_n$ contribution to minimize the $e^{fb}_n$ feedback error.

D. Cerebellar-like Network

  • The cerebellum is constituted of several micro-zones that plausibly correspond to the minimal unit learning machine, ulm (Fig.3) [63].
  • The axons entering each ulm transmit information to two main groups of cells: the Gr granule cells, which in Marr's opinion encode combinations of mossy fiber inputs [20]; and the pc Purkinje cells (in green in Fig.3), which, modulated by the inferior olive axon and excited by the pf parallel fibers (in violet) projecting from the granule cells, influence the activity of the dcn deep cerebellar nuclei (in blue).
  • These models are employed by the algorithm to make $\hat{\tau}^{gr}_{n,g}$, $\hat{\dot{q}}^{gr}_{n,g}$ local predictions of the control input (inverse MCC) and angular velocity (forward MCC) respectively.

III. RESULTS

  • Four architectures that differ in terms of internal model contributions are compared: (I) feedback controller; (II) feedback controller combined with inverse cerebellar-like network; (III) feedback controller combined with forward cerebellar-like network; (IV) feedback controller combined with inverse and forward cerebellar-like networks.
  • Due to the stochastic nature of the experiments, the recorded data are expressed as µ mean value and σ standard deviation of the 20 tests.
  • In particular, thanks to the forward model action, architectures III and IV robustly reduce the effect of noise as suggested by [51].
  • It is worthwhile to mention that the feedback controller of the first joint is highly affected by the table weight, which slowly leads the joint towards the correct reference.

IV. CONCLUSIONS

  • Thus far, the authors have presented, tested, and compared four control architectures based on a versatile and real-time modeling structure that replicates the cerebellar internal models individual and combinatorial theories.
  • The experiments confirmed the theories about the internal model independent and combinatorial contribution.
  • In the proposed model, the learning rules that iteratively update the network weights are based on synaptic plasticities derived from computational neuroscience studies [42], [59].
  • In its current state, the cerebellar network cannot generalize to all possible conditions.

A Cerebellar Internal Models Control Architecture for Online Sensorimotor Adaptation
of a Humanoid Robot Acting in a Dynamic Environment
Capolei, Marie Claire; Andersen, Nils Axel; Lund, Henrik Hautop ; Falotico, Egidio; Tolu, Silvia
Published in:
IEEE Robotics and Automation Letters
Link to article, DOI:
10.1109/LRA.2019.2943818
Publication date:
2020
Document Version
Peer reviewed version
Link back to DTU Orbit
Citation (APA):
Capolei, M. C., Andersen, N. A., Lund, H. H., Falotico, E., & Tolu, S. (2020). A Cerebellar Internal Models
Control Architecture for Online Sensorimotor Adaptation of a Humanoid Robot Acting in a Dynamic Environment.
IEEE Robotics and Automation Letters, 5(1), 80-87. https://doi.org/10.1109/LRA.2019.2943818

2377-3766 (c) 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/LRA.2019.2943818, IEEE Robotics
and Automation Letters
IEEE ROBOTICS AND AUTOMATION LETTERS. PREPRINT VERSION. ACCEPTED SEPTEMBER, 2019 1
A Cerebellar Internal Models Control Architecture
for Online Sensorimotor Adaptation of a Humanoid
Robot Acting in a Dynamic Environment
Marie Claire Capolei¹, Nils Axel Andersen¹, Henrik Hautop Lund¹, Egidio Falotico², and Silvia Tolu¹
Abstract—Humanoid robots are often expected to operate in non-deterministic human environments, and as a consequence, the robust and gentle rejection of external perturbations is crucial. In this scenario, stable and accurate behavior is mostly achieved through adaptive control mechanisms that learn an internal model to predict the consequences of the outgoing control signals. Evidence shows that brain-based biological systems resolve this control issue by updating an appropriate internal model that is then used to direct muscle activity. Inspired by the biological cerebellar internal models theory, which couples forward and inverse internal models in the biological motor control scheme, we propose a novel methodology to artificially replicate these learning and adaptive principles in a robotic feedback controller. The proposed cerebellar-like network combines machine learning, artificial neural network, and computational neuroscience techniques to deal with the nonlinearities and complexities that modern robotic systems can present. Although the architecture is tested on the simulated humanoid iCub, it can be applied to different robotic systems without excessive customization, thanks to its neural network-based nature. During the experiments, the robot is requested to repeatedly follow a movement while interacting with two external systems. Four different internal model architectures are compared and tested under different conditions. The comparison of the performances confirmed the theories about the combinatory action of internal models. The combination of models, together with the structural and learning features of the network, benefited the adaptation mechanism as well as the system response to nonlinearities, noise and external forces.
Index Terms—Biomimetics, Neurorobotics, Model Learning for
Control, Learning and Adaptive Systems, Control Architectures
and Programming.
I. INTRODUCTION
MODERN robots are often mechanically complex, and are embedded in unstructured non-deterministic environments [1]. The accurate and stable motor control of such systems is often challenging due to the unreliability of the hand engineered modeling strategies, which are too strict to describe all the complexities and nonlinearities.
Manuscript received: June, 24, 2019; Revised July, 26, 2019; Accepted September, 10, 2019.
This paper was recommended for publication by Editor Youngjin Choi upon evaluation of the Associate Editor and Reviewers' comments. This work has received funding from the EU-H2020 Framework Program for Research and Innovation under the specific grant agreement No. 785907 (Human Brain Project SGA2), and from the Marie Curie project n. 705100 (Biomodular).
¹ Marie Claire Capolei, Nils A. Andersen, Henrik Hautop Lund and Silvia Tolu are with the Automation and control group, Department of Electrical Engineering, Technical University of Denmark, Kgs. Lyngby, Denmark {macca,naa,hhl,stolu}@elektro.dtu.dk
² Egidio Falotico is with the BioRobotics Institute, Scuola Superiore Sant'Anna, Pontedera, Pisa, Italy egidio.falotico@santannapisa.it
Digital Object Identifier (DOI): see top of this page.
In this manuscript, we propose an online learning and control algorithm to dynamically adapt the movements of a robotic system acting in an uncertain non-deterministic environment. In the design process, we assumed that: the Jacobian poorly describes the actual robotic condition; one or more unmodeled external objects interfere with the movement; the state space system is multivariable and not fully observable; the action/state space is continuous and high-dimensional. In this view, the controller should improve the tracking accuracy of each actuator, and minimize the effects of noise through force-based control input.
Traditionally, uncertain systems were learned by estimating open parameters of structured mathematical models [2]. Although this approach has been used for several years in system identification and adaptive control, fitting the parameters of a fixed structure with training data can lead to different drawbacks, such as: physical inconsistency [3]; unmodeled behavior; persistent excitation issues [4]; and unstable reaction to high estimation error.
In the last decades, due to the advancement in artificial intelligence, a large number of non-parametric approaches have been proposed to solve the aforementioned problems [5], [6], [7], [8], [9], [10]. For instance, the introduction of artificial neural networks (ANNs) into the adaptive control of nonlinear dynamical systems was advantageous for reducing the effects of nonlinearities and uncertainties, and for handling high dimensional and continuous state space systems [11], [12], [13], [8], [14]. Despite the structural versatility that distinguishes ANNs, the continuous interaction between the robotic system and the non-deterministic environment can be constrained by the off-line training of the neural network.
The Autonomous Mental Development (AMD) theorists claim that robots should learn and evolve their processing through real-time interaction with the environment [15], [16]. In this view, model learning is no longer seen as a summation of off-line learned experiences but as an online development of the current knowledge of the system [17], [18]. These theories have their foundation in studies of biological systems, such as humans, especially infants. The advanced mechanisms exploited by biological systems to explore their relation with the surroundings, and control their own movements, motivated several scientists towards a better understanding of biological motor control.
James S. Albus was the first person to propose a robotic
control architecture enhanced by an artificial neural network based on evidence from the central nervous system (CNS): the "cerebellar model articulation controller" (CMAC) [19]. The CMAC module was mainly inspired by David Marr's theory [20], which depicts the cerebellum, a neural structure located at the back of the brain, as a "language translator between data in the cerebrum, and command sequences needed by the muscles" [21].
In the last decades, several scientists have been attracted by the fascinating mechanisms and functional roles of the cerebellum in motor and cognitive tasks [22], [23], [24], [25], [26], [27]. Among all the hypotheses, the scientific community strongly supports the involvement of the cerebellum in the acquisition and maintenance of internal models that map the correlation between the body and the environment [28], [29], [30], [31], i.e., forward and inverse models [32], [33]. If confirmed, these assumptions would explain several complex mechanisms underlying the neural control of movements [34].
The inverse model elaborates the motor command that leads the system from the current state to a desired one [35]. Its contribution enables fast and coordinated limb movements that are not achievable with pure feedback control, due to the biological system dynamics [32]. Evidence shows that some of the motor deficits caused by cerebellar dysfunction, e.g., quick ballistic limb movements and impaired muscle coordination [36], are due to the lack of feed-forward contribution in motor control, or rather the neural control loop is affected by slow reaction time and sensory delay [34]. Although it is controversial [37], [38], scientists argued that integrating the efference copy signal of the delayed sensory feedback could overcome these CNS transmission problems [39]. Different prototypes of cerebellar control architecture based on the inverse model theory have been proposed, such as: adaptive filter models [40], [41]; spiking neural networks [42], [43]; combination of parametric adaptive control and machine learning techniques [44], [45].
The forward model describes the causal relationship between the outgoing motor command and the system state. This model is beneficial for predicting those state transitions that are not directly accessible [46]. Electrophysiological studies [47], [48], computational theories [28], [29], imaging and lesion data [49], [50] suggest that the forward model could explain pivotal cerebellar functions, such as error correction and learning. Moreover, robotics experiments showed that the forward model could play an important role in action prediction, sensory discrepancy minimization, and noise cancellation [51], [52].
Inspired by the theory of coupled internal models [53], [54], [55], [56], [57], [58], we propose a novel methodology to replicate and exploit artificially the cerebellar internal models learning and corrective action. In particular, we designed a neural network that, through the combination of machine learning, artificial neural network, and computational neuroscience techniques, replicates the functionality, learning, modularity, and morphology of the cerebellar circuit. This biomimetic network is embedded in a feedback robotic control architecture, and is intended to minimize modeling errors and to constrain the effects of noise, uncertainties, and external disturbances. The network weights are defined by non-linear and multidimensional learning functions that mimic the cerebellar synaptic plasticities, as proposed by [59], [42]. The manuscript presents the comparison of four adaptive control architectures based on the cerebellar internal models theories. The control system is tested on the virtual humanoid robot iCub [60] embedded in the Neurorobotics Platform (Fig.1.a) [61], [62]. The performance of the architectures is evaluated under different noise and external perturbation conditions. The study confirmed that coupling the forward and inverse internal models improves performance with respect to the independent action of each internal model. Moreover, the biologically plausible weighting kernel, together with the layered structure of the cerebellar networks, proved beneficial for constraining the effects of external perturbations and nonlinearities.
Fig. 1: Robotic plant: a) the humanoid iCub holding the table-ball system in the Neurorobotics Platform; b) the three controlled wrist joints: 1 pronosupination, 2 yaw, 3 pitch.
The structure of the paper is as follows: in section II we
describe the overall control architecture, giving special focus to
the cerebellar-like component; in section III, the experimental
set up and results are presented. The manuscript concludes
with the discussion of the main findings in comparison with
the literature and future directions.
II. MATERIALS AND METHODS
The robotic system, or rather Agent (Fig.2.a), consists of: a Planner, which generates the $Q^r_{N\times 1}$, $\dot{Q}^r_{N\times 1}$ reference motor angular positions and velocities (where N is the number of controlled joints), which are sent to the controller; the Controller, which elaborates the $\tau^{tot}_{N\times 1}$ torque commands needed to move the actuators to the $Q^r_{N\times 1}$, $\dot{Q}^r_{N\times 1}$ desired states; the Robotic Plant, which includes the actuators and the proprioceptive sensors employed to read the $Q$ and $\dot{Q}$ actual angular positions and velocities respectively. The Agent interacts with two external systems, which in this manuscript are represented by a table and a rolling ball (Fig.1.a).
A. Robot Plant
The humanoid iCub is a 53 degree of freedom (dof) robotic system equipped with several types of sensors, such as encoders, accelerometers, gyroscopes, F/T sensors, and digital cameras. For the sake of simplicity, the overall system actuates seven motors of the right arm: four motors are kept constant to keep the arm upwards (i.e. elbow, shoulder roll, shoulder yaw and shoulder pitch), and N = 3 motors are controlled by the proposed controller (namely wrist pronosupination, wrist yaw and wrist pitch, Fig.1.b). The n-th actual motor state is read by the encoders and saved in the $q_n \in Q_{N\times 1}$ angular position and $\dot{q}_n \in \dot{Q}_{N\times 1}$ angular velocity process variables.
B. Planner
The Planner plans the $q^r_n \in Q^r_{N\times 1}$, $\dot{q}^r_n \in \dot{Q}^r_{N\times 1}$ reference trajectories, or rather it generates oscillator movements,
$$q^r_n = A_n \cdot \sin(2\pi f t + \varphi_n), \tag{1}$$
$$\dot{q}^r_n = 2\pi f A_n \cdot \cos(2\pi f t + \varphi_n), \tag{2}$$
with fixed temporal frequency $f = 0.25$ Hz, $A_n$ amplitude and $\varphi_n$ phase,
$$A_{1\times N} = [A_1, A_2, A_3] = [0.1727, 0.1363, 0.0345] \ \text{rad},$$
$$\varphi_{1\times N} = [\varphi_1, \varphi_2, \varphi_3] = [0.5\pi, 0.5\pi, 0.0] \ \text{rad}.$$
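As an illustration of the Planner equations (1) and (2), the following minimal Python sketch evaluates the reference position and velocity of the three wrist joints at a given time. The frequency, amplitudes and phases are the values stated in the text; the function name `reference_state` and the use of plain lists are assumptions made for this example and are not taken from the authors' implementation.

```python
import math

F_HZ = 0.25                                    # temporal frequency f (Hz), from the text
AMPLITUDES = (0.1727, 0.1363, 0.0345)          # A_n (rad), from the text
PHASES = (0.5 * math.pi, 0.5 * math.pi, 0.0)   # phi_n (rad), from the text


def reference_state(t):
    """Return (q_ref, dq_ref) for the three joints at time t in seconds.

    Hypothetical helper: implements eq. (1) for the position and
    eq. (2), its time derivative, for the velocity.
    """
    omega = 2.0 * math.pi * F_HZ
    q_ref = [a * math.sin(omega * t + p) for a, p in zip(AMPLITUDES, PHASES)]
    dq_ref = [omega * a * math.cos(omega * t + p) for a, p in zip(AMPLITUDES, PHASES)]
    return q_ref, dq_ref
```

With f = 0.25 Hz the references repeat every 4 s, so calling `reference_state(t)` and `reference_state(t + 4.0)` should give the same values up to floating-point error.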
C. Controller
The Controller, once it has received the $Q$, $\dot{Q}$ actual robot states, computes the $\tau^{tot}_n \in \tau^{tot}_{N\times 1}$ torque command to move each actuator to the $q^r_n$, $\dot{q}^r_n$ desired state. This subsystem is constituted by a static module based on classical control methods, and by two decentralized cerebellar-like neural networks (section II-D): inverse and forward models (blue boxes in Fig.2.b). The inverse cerebellar-like module adds the $\Delta\tau^c_n \in \Delta\tau^c_{N\times 1}$ feed-forward corrective torque command to the $\tau^{fb}_n \in \tau^{fb}_{N\times 1}$ feedback controller motor input [63], [64], while the forward module applies the $\Delta\dot{q}^c_n \in \Delta\dot{q}^c$ state-specific adjustment to the feedback loop [65], [66], [58]. In this initial design, the forward model corrective term is narrowed to the angular velocity, which is the feedback controller input.
Fig. 2: Control architecture scheme for N actuated joints: a) main components communication, and b) controller block.
In the details of Fig.2.b, the closed loop computes the $e^{fb}_n \in e^{fb}_{N\times 1}$ feedback angular velocity error of the n-th motor,
$$e^{fb}_n = \dot{q}^r_n - \dot{q}_n. \tag{3}$$
This quantity is corrected by the forward cerebellar-like module, which predicts the consequence of the outgoing motor command and adds the $\Delta\dot{q}^c_n$ contribution to minimize the $e^{fb}_n$ feedback error. The $e^{tot}$ total error,
$$e^{tot}_n = e^{fb}_n + \Delta\dot{q}^c_n, \tag{4}$$
is then employed both by the feedback controller to compute the feedback torque command $\tau^{fb}_n$, according to the proportional-integral-derivative (PID) independent joint control law, and by the inverse cerebellar-like model to compute the corrective torque $\Delta\tau^c_n \in \Delta\tau^c_{N\times 1}$, which minimizes both the $e^{tot}$ and the $\epsilon_n$ angular position error,
$$\epsilon_n = q^r_n - q_n. \tag{5}$$
The total control input sent to the motors is the result of a feed-forward compensation [40],
$$\tau^{tot} = \tau^{fb} + \Delta\tau^c. \tag{6}$$
On a final note, the PID regulator K gains are tuned to weakly operate in linearized conditions which exclude the disturbance of the ball and sensory noise,
$$K_P = [K_{P1}, K_{P2}, K_{P3}] = [2.9000, 2.3000, 2.3500]$$
$$K_I = [K_{I1}, K_{I2}, K_{I3}] = [1.9400, 1.9000, 1.9000]$$
$$K_D = [K_{D1}, K_{D2}, K_{D3}] = [0.0050, 0.0001, 0.0004].$$
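The control law of this section can be sketched in a few lines of Python. The sketch below combines the velocity error of eq. (3), the forward-model correction of eq. (4), an independent-joint PID using the gains listed above, and the feed-forward compensation of eq. (6). The discretization (rectangular integration, backward-difference derivative), the time step `dt`, and the names `JointPID` and `control_step` are assumptions for illustration only; the text does not specify how the PID is discretized, and the cerebellar corrections are simply passed in as numbers here.

```python
from dataclasses import dataclass

KP = (2.9000, 2.3000, 2.3500)   # proportional gains, from the text
KI = (1.9400, 1.9000, 1.9000)   # integral gains, from the text
KD = (0.0050, 0.0001, 0.0004)   # derivative gains, from the text


@dataclass
class JointPID:
    """Hypothetical discrete PID for one joint (discretization is assumed)."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def torque(self, error, dt):
        self.integral += error * dt                    # rectangular integration
        derivative = (error - self.prev_error) / dt    # backward difference
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def control_step(pids, dq_ref, dq, dq_corr, tau_corr, dt):
    """Return the total torque per joint for one control cycle.

    dq_corr and tau_corr stand in for the forward and inverse cerebellar
    corrections; passing zeros reproduces the pure feedback architecture (I).
    """
    taus = []
    for pid, r, v, dqc, tc in zip(pids, dq_ref, dq, dq_corr, tau_corr):
        e_fb = r - v                     # eq. (3)
        e_tot = e_fb + dqc               # eq. (4)
        tau_fb = pid.torque(e_tot, dt)   # PID on the corrected velocity error
        taus.append(tau_fb + tc)         # eq. (6)
    return taus
```

A controller for the three wrist joints could be built with `pids = [JointPID(p, i, d) for p, i, d in zip(KP, KI, KD)]` and stepped once per control cycle. Keeping the corrections as plain arguments makes it easy to switch between the four compared architectures by zeroing one or both of them.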
D. Cerebellar-like Network
The cerebellum is constituted of several micro-zones that plausibly correspond to the minimal ulm unit learning machine (Fig.3) [63]. Each ulm presents similar internal micro-circuitry, but it differs from the others in terms of external connectivity. There are two main types of axons that connect each ulm to the outside: the mf mossy fibers (in magenta in Fig.3), which project signals regarding the position, velocity and direction of the limb movements [68]; and the climbing fibers (in red), which project from the io inferior olive nucleus the signal encoding the error [47], [69]. These axons transmit the information to two main groups of cells: the Gr granule cells, which in Marr's opinion encode combinations of mossy fiber inputs [20]; and the pc Purkinje cells (in green in Fig.3), which, modulated by the inferior olive axon and excited by the pf parallel fibers (in violet) projecting from the granule cells, influence the activity of the dcn deep cerebellar nuclei (in blue). The dcn is inhibited by the pc and excited by both the io and mf, and it is responsible for the final processing of the signal that is sent outside the cerebellar circuit.
Fig. 3: Canonical cerebellar circuit in analogy with [67].
Fig. 4: Cerebellar-like neural network scheme: (a) structural modular partition of the inverse and forward module; (b) details of the networks.
In the proposed model (Fig.4.a), each ulm (light blue box) processes the information of the n-th controlled object (where n=1,...,N). Accordingly, the dcn of the n-th ulm outputs the $\Delta\dot{q}^c_n$ and $\Delta\tau^c_n$ cerebellar corrections. Each ulm is divided into M sub-modules representing the ccm canonical cerebellar microcircuit (yellow boxes in Fig.4.a). Each ccm encodes kinematic and/or dynamic features of the n-th controlled object, such as angular position and velocity. The N ulm together compose the MCC Modular Cerebellar Circuit mapping the inverse and forward models of the robotic system (green boxes in Fig.4.a).
Hereafter, for the sake of simplicity, the variable x generally recalls the signals $\dot{q}_n$ and $\tau_n$ propagating inside the two separated networks, and w generally recalls the specific network weight. The mossy fibers of the inverse MCC transmit information about the actual and reference angular velocity of all the controlled joints,
$$MF^{inv}_{2N\times 1} = [mf^{inv}_1, \dots, mf^{inv}_{2N}]^T = [\dot{q}^r_1, \dots, \dot{q}^r_N, \dot{q}_1, \dots, \dot{q}_N]^T, \tag{7}$$
while the mossy fibers of the forward MCC project the signal encoding the reference angular velocities and the latest control inputs (6),
$$MF^{frw}_{2N\times 1} = [mf^{frw}_1, \dots, mf^{frw}_{2N}]^T = [\dot{q}^r_1, \dots, \dot{q}^r_N, \tau^{tot}_1(t-1), \dots, \tau^{tot}_N(t-1)]^T. \tag{8}$$
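The two input vectors of eqs. (7) and (8) are plain concatenations, which the following Python sketch makes explicit. The function names and the list representation are assumptions for illustration; the only content taken from the text is the ordering of the 2N entries.

```python
def mossy_fibers_inverse(dq_ref, dq):
    """Eq. (7): reference velocities followed by actual velocities (length 2N)."""
    return list(dq_ref) + list(dq)


def mossy_fibers_forward(dq_ref, tau_tot_prev):
    """Eq. (8): reference velocities followed by the previous total torques (length 2N).

    tau_tot_prev is the control input of eq. (6) from the previous time step.
    """
    return list(dq_ref) + list(tau_tot_prev)
```

For N = 3 controlled joints both functions return six-element vectors, which is the input dimension seen by each cerebellar-like network.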
The mossy fiber signals are then mapped and exploited to predict the $\tau^{tot}$ control input (inverse MCC) and the $\dot{q}$ system state (forward MCC). As proposed by [44], the granule layer is represented by the Locally Weighted Projection Regression algorithm (LWPR) [70]. The LWPR is a fast on-line nonlinear function approximation algorithm suitable for the reduction of high dimensional state space systems. To replicate the efference copy theory [39], [71], the LWPR uses a copy of the outgoing $\tau^{tot}$ (inverse MCC) and actual $\dot{q}$ (forward MCC) as modulatory signals (in cyan in Fig.4) to create and train on-line G local linear models, or rather $Gr_g$ granule cells (where g=1,...,G). These models are employed by the algorithm to make $\hat{\tau}^{gr}_{n,g}$, $\hat{\dot{q}}^{gr}_{n,g}$ local predictions of the control input (inverse MCC) and angular velocity (forward MCC) respectively. The final output of the granular-parallel fibers layer (in violet in Fig.4.b) is the weighted mean of all the linear models (refer to [70] for the complete set of formulas),
$$\hat{x}^{pf}_n = \frac{\sum_{g=1}^{G} w^{gr}_{n,g} \cdot \hat{x}^{gr}_{n,g}}{\sum_{g=1}^{G} w^{gr}_{n,g}}. \tag{9}$$
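To make eq. (9) concrete, the sketch below computes the weighted mean of a few local linear models in Python. It is a deliberately simplified stand-in, not the LWPR algorithm of [70]: there is no on-line training, no projection step, and no model creation. The Gaussian receptive-field weight with a single scalar distance parameter, the dictionary layout of each model, and the zero fallback when all weights vanish are assumptions chosen for this example.

```python
import math


def local_weight(x, center, dist):
    """Gaussian receptive-field activation of one local model (assumed isotropic)."""
    sq = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-0.5 * dist * sq)


def local_prediction(x, center, offset, slope):
    """Prediction of one local linear model around its center."""
    return offset + sum(s * (xi - ci) for s, xi, ci in zip(slope, x, center))


def parallel_fiber_output(x, models):
    """Eq. (9): weighted mean of the local predictions.

    Each model is a dict with keys "center", "dist", "offset" and "slope"
    (a hypothetical layout used only in this sketch).
    """
    num = 0.0
    den = 0.0
    for m in models:
        w = local_weight(x, m["center"], m["dist"])
        num += w * local_prediction(x, m["center"], m["offset"], m["slope"])
        den += w
    # Assumed fallback: return 0.0 if no model is activated at all.
    return num / den if den > 0.0 else 0.0
```

When the query lies midway between two models with equal distance parameters, both weights coincide and the output is the plain average of the two local predictions, which is a quick sanity check on the normalization in eq. (9).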
The $w^{pfpc}$ [42] synaptic strengths of the pf-pc parallel fibers-Purkinje cells connections (Table I) are modulated by the io inferior olive transmitting the error signals (3,4,5) (in red in Fig.4.b),
$$io^{inv}_n = [io^{inv}_{n,1}, io^{inv}_{n,2}]^T = [\epsilon_n, e^{tot}_n]^T, \tag{10}$$
$$io^{frw}_n = [io^{frw}_{n,1}, io^{frw}_{n,2}]^T = [\epsilon_n, e^{fb}_n]^T. \tag{11}$$
The Purkinje cell output signal (in green in Fig.4.b) is the result of the $\hat{x}^{pf}$ modulated LWPR prediction (9),
$$x^{pc}_{n,m} = w^{pfpc}_{n,m}(t, io_{n,m}) \cdot \hat{x}^{pf}_n. \tag{12}$$
With respect to [44], [52], both the MF mossy fibers input vectors and the $x^{pc}$ Purkinje cells signals are reformulated: the $x^{pf}$ is represented by the final LWPR prediction and not by the linear combination of the network weights; the $x^{pc}$ is the result of a biologically plausible learning rule function of the error (10,11), instead of the direct proportion of the error

Citations
Qiang Bai1, Shaobo Li1, Jing Yang1, Qisong Song1, Zhiang Li1, Xingxing Zhang1 
TL;DR: According to the inherent defects of vision, this paper summarizes the research achievements of tactile feedback in the fields of target recognition and robot grasping and finds that the combination of vision and tactile feedback can improve the success rate and robustness of robot grasping.
Abstract: With the rapid development of machine learning, its powerful function in the machine vision field is increasingly reflected. The combination of machine vision and robotics to achieve the same precise and fast grasping as that of humans requires high-precision target detection and recognition, location and reasonable grasp strategy generation, which is the ultimate goal of global researchers and one of the prerequisites for the large-scale application of robots. Traditional machine learning has a long history and good achievements in the field of image processing and robot control. The CNN (convolutional neural network) algorithm realizes training of large-scale image datasets, solves the disadvantages of traditional machine learning in large datasets, and greatly improves accuracy, thereby positioning CNNs as a global research hotspot. However, the increasing difficulty of labeled data acquisition limits their development. Therefore, unsupervised learning, self-supervised learning and reinforcement learning, which are less dependent on labeled data, have also undergone rapid development and achieved good performance in the fields of image processing and robot capture. According to the inherent defects of vision, this paper summarizes the research achievements of tactile feedback in the fields of target recognition and robot grasping and finds that the combination of vision and tactile feedback can improve the success rate and robustness of robot grasping. This paper provides a systematic summary and analysis of the research status of machine vision and tactile feedback in the field of robot grasping and establishes a reasonable reference for future research.

54 citations

Journal ArticleDOI
11 Sep 2020
TL;DR: A control framework that ensures natural movements in articulated soft robots, implementing specific functionalities of the human central nervous system, i.e., learning by repetition, after-effect on known and unknown trajectories, anticipatory behavior, its reactive re-planning, and state covariation in precise task execution is introduced.
Abstract: Human beings can achieve a high level of motor performance that is still unmatched in robotic systems. These capabilities can be ascribed to two main enabling factors: (i) the physical properties of the human musculoskeletal system, and (ii) the effectiveness of the control operated by the central nervous system. Regarding point (i), the introduction of compliant elements in the robotic structure can be regarded as an attempt to bridge the gap between the animal body and the robot one. Soft articulated robots aim at replicating the musculoskeletal characteristics of vertebrates. Yet, substantial advancements are still needed from a control point of view to fully exploit the new possibilities provided by soft robotic bodies. This paper introduces a control framework that ensures natural movements in articulated soft robots, implementing specific functionalities of the human central nervous system, i.e., learning by repetition, after-effect on known and unknown trajectories, anticipatory behavior, its reactive re-planning, and state covariation in precise task execution. The control architecture we propose has a hierarchical structure composed of two levels. The low level deals with dynamic inversion and focuses on trajectory tracking problems. The high level manages the degree of freedom redundancy, and it allows the system to be controlled through a reduced set of variables. The building blocks of this novel control architecture are well-rooted in control theory, which can furnish an established vocabulary to describe the functional mechanisms underlying the motor control system. The proposed control architecture is validated through simulations and experiments on a bio-mimetic articulated soft robot.

9 citations

Journal ArticleDOI
TL;DR: This article proposes a multizone cerebellar chip whose zones can each be connected to a different subsystem to optimize performance, and evaluates it on a custom robotic platform consisting of an array of tactile sensors driven by dielectric electroactive polymers mounted upon a standard industrial robot arm.
Abstract: The cerebellum is a neural structure essential for learning, which is connected via multiple zones to many different regions of the brain, and is thought to improve human performance in a large range of sensory, motor and even cognitive processing tasks. An intriguing possibility for the control of complex robotic systems would be to develop an artificial cerebellar chip with multiple zones that could be similarly connected to a variety of subsystems to optimize performance. The novel aim of this paper, therefore, is to propose and investigate a multizone cerebellar chip applied to a range of tasks in robot adaptive control and sensorimotor processing. The multizone cerebellar chip was evaluated using a custom robotic platform consisting of an array of tactile sensors driven by dielectric electroactive polymers mounted upon a standard industrial robot arm. The results demonstrate that the performance in each task was improved by the concurrent, stable learning in each cerebellar zone. This paper, therefore, provides the first empirical demonstration that a synthetic, multizone, cerebellar chip could be embodied within existing robotic systems to improve performance in a diverse range of tasks, much like the cerebellum in a biological system.

3 citations

Proceedings ArticleDOI
18 Jul 2022
TL;DR: In this paper, a co-optimization algorithm of muscle arrangement and activation is proposed to construct a constraint force field in the workspace of a musculoskeletal robot with human-mimetic muscles.
Abstract: Robots capable of high-precision motion and operation are of great practical significance. Drawing on the biomechanical structure and neural control mechanisms of the human motor system, research on musculoskeletal robot systems with rigid-flexible coupling characteristics is an important way to improve the operational flexibility and control robustness of robots. Inspired by the equilibrium point hypothesis proposed in neuroscience, this paper proposes a co-optimization algorithm of muscle arrangement and activation to construct a constraint force field in the workspace of a musculoskeletal robot. When the muscle arrangement is rough due to the insufficient precision of the mechanical structure, the musculoskeletal robot can maintain accurate motion with the help of the constraint force field by adopting the optimized constant activation. Experiments are carried out on a musculoskeletal robot model with human-mimetic muscles to demonstrate the effectiveness of the proposed algorithm in movement accuracy, noise robustness and generalization. This work may be of great significance for the further introduction of constraint force fields into the hardware systems of musculoskeletal robots.

3 citations

References
Journal ArticleDOI
TL;DR: This paper presents a Lyapunov analysis suggesting that the condition of strictly positive realness (SPR) associated with the tracking error dynamics is a sufficient condition for asymptotic stability of the closed-loop dynamics of FEL.

141 citations


"A Cerebellar Internal Models Contro..." refers background in this paper

  • ...In the last decades, due to the advancement in artificial intelligence, a large number of non-parametric approaches have been proposed to solve the aforementioned problems [5]–[10]....


Journal ArticleDOI
TL;DR: A simplified model of the cerebellum was developed to explore its potential for adaptive, predictive control based on delayed feedback information and uses a temporally asymmetric form of plasticity for the parallel fiber synapses on Purkinje cells.
Abstract: A simplified model of the cerebellum was developed to explore its potential for adaptive, predictive control based on delayed feedback information. An abstract representation of a single Purkinje cell with multistable properties was interfaced, using a formalized premotor network, with a simulated single degree-of-freedom limb. The limb actuator was a nonlinear spring-mass system based on the nonlinear velocity dependence of the stretch reflex. By including realistic mossy fiber signals, as well as realistic conduction delays in afferent and efferent pathways, the model allowed the investigation of timing and predictive processes relevant to cerebellar involvement in the control of movement. The model regulates movement by learning to react in an anticipatory fashion to sensory feedback. Learning depends on training information generated from corrective movements and uses a temporally asymmetric form of plasticity for the parallel fiber synapses on Purkinje cells.

132 citations


"A Cerebellar Internal Models Contro..." refers methods in this paper

  • ...Different prototypes of cerebellar control architecture based on the inverse model theory have been proposed, such as: adaptive filter models [40], [41]; spiking neural networks [42], [43]; combination of parametric adaptive control and machine learning techniques [44], [45]....


Journal ArticleDOI
TL;DR: A robust adaptive controller to NN learning errors is proposed, using a sign or saturation switching function in the control law, which leads to global asymptotic stability and zero convergence of control errors.
Abstract: Presents an approach and a systematic design methodology to adaptive motion control based on neural networks (NNs) for high-performance robot manipulators, for which stability conditions and performance evaluation are given. The neurocontroller includes a linear combination of a set of off-line trained NNs, and an update law of the linear combination coefficients to adjust robot dynamics and payload uncertain parameters. A procedure is presented to select the learning conditions for each NN in the bank. The proposed scheme, based on fixed NNs, is computationally more efficient than the case of using the learning capabilities of the neural network to be adapted, as that used in feedback architectures that need to propagate back control errors through the model to adjust the neurocontroller. A practical stability result for the neurocontrol system is given. That is, we prove that the control error converges asymptotically to a neighborhood of zero, whose size is evaluated and depends on the approximation error of the NN bank and the design parameters of the controller. In addition, a robust adaptive controller to NN learning errors is proposed, using a sign or saturation switching function in the control law, which leads to global asymptotic stability and zero convergence of control errors. Simulation results showing the practical feasibility and performance of the proposed approach to robotics are given.

129 citations


"A Cerebellar Internal Models Contro..." refers background in this paper

  • ...In the last decades, due to the advancement in artificial intelligence, a large number of non-parametric approaches have been proposed to solve the aforementioned problems [5], [6], [7], [8], [9], [10]....


  • ...For instance, the introduction of artificial neural networks (ANNs) into the adaptive control of nonlinear dynamical systems was advantageous for reducing the effects of nonlinearities and uncertainties, and for handling high dimensional and continuous state space systems [11], [12], [13], [8], [14]....


Journal ArticleDOI
TL;DR: The equilibrium-point hypothesis suggests that action and perception are accomplished in a common spatial frame of reference selected by the brain from a set of available frames, and this approach is extended to sense of effort, kinesthetic illusions, phantom limb and phantom body phenomena.
Abstract: According to a view that has dominated the field for over a century, the brain programs muscle commands and uses a copy of these commands [efference copy (EC)] to adjust not only resulting motor action but also ongoing perception. This view was helpful in formulating several classical problems of action and perception: (1) the posture-movement problem of how movements away from a stable posture can be made without evoking resistance of posture-stabilizing mechanisms resulting from intrinsic muscle and reflex properties; (2) the problem of kinesthesia or why our sense of limb position is good despite ambiguous positional information delivered by proprioceptive and cutaneous signals; (3) the problem of visual space constancy or why the world is perceived as stable while its retinal image shifts following changes in gaze. On closer inspection, the EC theory actually does not solve these problems in a physiologically feasible way. Here solutions to these problems are proposed based on the advanced formulation of the equilibrium-point hypothesis that suggests that action and perception are accomplished in a common spatial frame of reference selected by the brain from a set of available frames. Experimental data suggest that the brain is also able to translate and/or rotate the selected frame of reference by modifying its major attributes (the origin, metrics and orientation) and thus substantially influence action and perception. Because of this ability, such frames are called physical to distinguish them from symbolic or mathematical frames that are used to describe system behavior without influencing this behavior. Experimental data also imply that once a frame of reference is chosen, its attributes are modified in a feedforward way, thus enabling the brain to act in an anticipatory and predictive manner. This approach is extended to sense of effort, kinesthetic illusions, phantom limb and phantom body phenomena. It also addresses the question of why retinal images of objects are sensed as objects located in the external, physical world, rather than in internal representations of the brain.

114 citations


"A Cerebellar Internal Models Contro..." refers background in this paper

  • ...Although it is controversial [37], [38], scientists argued that integrating the efference copy signal of the delayed sensory feedback could overcome these CNS transmission problems [39]....


Journal ArticleDOI
TL;DR: Abnormalities of the triphasic pattern and kinematic parameters are consistent with a disturbed cerebellar timing function in essential tremor, and could indicate a basic pathophysiological mechanism underlying this disorder.
Abstract: Background: Clinical characteristics reminiscent of cerebellar tremor occur in patients with advanced essential tremor. Ballistic movements are known to be abnormal in cerebellar disease. The hypothesis was proposed that ballistic movements are abnormal in essential tremor, reflecting cerebellar dysfunction. Objective: To elucidate the role of the cerebellum in the pathophysiology of essential tremor. Methods: Kinematic parameters and the triphasic electromyographic (EMG) components of ballistic flexion elbow movements were analysed in patients assigned to the following groups: healthy controls (n = 14), pure essential postural tremor (ETPT; n = 17), and essential tremor with an additional intention tremor component (ETIT; n = 15). Results: The main findings were a delayed second agonist burst (AG2) and a relatively shortened deceleration phase compared with acceleration in both the essential tremor groups. These abnormalities were most pronounced in the ETIT group, which had additional prolongation of the first agonist burst (AG1) and a delayed antagonist burst (ANT). Conclusions: Abnormalities of the triphasic pattern and kinematic parameters are consistent with a disturbed cerebellar timing function in essential tremor. These abnormalities were most pronounced in the ETIT group. The cerebellar dysfunction in essential tremor could indicate a basic pathophysiological mechanism underlying this disorder. ETPT and ETIT may represent two expressions within a continuous spectrum of cerebellar dysfunction in relation to the timing of muscle activation during voluntary movements.

110 citations


"A Cerebellar Internal Models Contro..." refers background in this paper

  • ..., quick ballistic limb movements and impaired muscle coordination [36], are due to the lack of feedforward contribution in motor control, or rather the neural control loop is affected by slow reaction time and sensory delay [34]....


Frequently Asked Questions (1)
Q1. What contributions do the authors describe in the paper "A cerebellar internal models control architecture for online sensorimotor adaptation of a humanoid robot acting in a dynamic environment"?

Inspired by the biological cerebellar internal models theory, which couples forward and inverse internal models into the biological motor control scheme, the authors propose a novel methodology to artificially replicate these learning and adaptive principles in a robotic feedback controller. During the experiments, the robot is asked to repeatedly follow a movement while it interacts with two external systems.