Showing papers presented at "Simulation of Adaptive Behavior in 2008"


Book ChapterDOI
07 Jul 2008
TL;DR: A new incremental scheme based on multi-objective evolution is proposed that automatically switches between sub-task resolutions and does not require the sub-tasks to be ordered.
Abstract: Evolutionary algorithms have been successfully used to create controllers for many animats. However, intuitive fitness functions, such as the survival time of the animat, often do not lead to interesting results because of the bootstrap problem, arguably one of the main challenges in evolutionary robotics: if all the individuals perform equally poorly, the evolutionary process cannot start. To overcome this problem, many authors defined ordered sub-tasks to bootstrap the process, leading to an incremental evolution scheme. Published methods require a deep knowledge of the underlying structure of the analyzed task, which is often not available to the experimenter. In this paper, we propose a new incremental scheme based on multi-objective evolution. This process is able to automatically switch between each sub-task resolution and does not require the sub-tasks to be ordered. The proposed method has been successfully tested on the evolution of a neuro-controller for a complex light-seeking simulated robot, involving 8 sub-tasks.

60 citations
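
The idea of treating each sub-task as a separate objective can be illustrated with Pareto dominance. The following is a minimal sketch, not the authors' implementation: the function names and the example scores are hypothetical, and it only shows why no ordering of sub-tasks is needed, since an individual that is best on any one sub-task stays on the non-dominated front.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if `a` is at least as good as `b` on every objective and
    strictly better on at least one (all objectives are maximised)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Return the indices of individuals that no other individual dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical scores, one column per sub-task (higher is better).
population = [
    (1.0, 0.0, 0.0),  # solves only sub-task 0
    (0.0, 1.0, 0.0),  # solves only sub-task 1
    (0.5, 0.5, 0.0),  # partial progress on both
    (0.4, 0.4, 0.0),  # dominated by the previous individual
]
front = pareto_front(population)
print(front)
```

Specialists on different sub-tasks survive side by side, so selection pressure exists even when no individual solves the full task.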


Book ChapterDOI
07 Jul 2008
TL;DR: The iLQG framework is combined with learning the forward dynamics for a simulated arm with two limbs and six antagonistic muscles, and it is demonstrated how the approach can compensate for complex dynamic perturbations in an online fashion.
Abstract: Optimal feedback control has been proposed as an attractive movement generation strategy in goal reaching tasks for anthropomorphic manipulator systems. Recent developments, such as the iterative Linear Quadratic Gaussian (iLQG) algorithm, have focused on the case of non-linear, but still analytically available, dynamics. For realistic control systems, however, the dynamics may often be unknown, difficult to estimate, or subject to frequent systematic changes. In this paper, we combine the iLQG framework with learning the forward dynamics for a simulated arm with two limbs and six antagonistic muscles, and we demonstrate how our approach can compensate for complex dynamic perturbations in an online fashion.

52 citations


Book ChapterDOI
07 Jul 2008
TL;DR: An adaptive strategy for a group of robots engaged in the localization of multiple targets is presented, inspired by chemotaxis behavior in bacteria, and the algorithmic parameters are updated using a distributed implementation of the Particle Swarm Optimization technique.
Abstract: We present an adaptive strategy for a group of robots engaged in the localization of multiple targets. The robotic search algorithm is inspired by chemotaxis behavior in bacteria, and the algorithmic parameters are updated using a distributed implementation of the Particle Swarm Optimization technique. We explore the efficacy of the adaptation, the impact of using local fitness measurements to improve global fitness, and the effect of different particle neighborhood sizes on performance. The robustness of the approach in non-static environments is tested in a time-varying scenario.

41 citations
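
For readers unfamiliar with Particle Swarm Optimization, the sketch below shows the standard velocity and position update together with a ring neighbourhood of adjustable size. It is an illustration under assumptions: the coefficient values, the ring topology and all names are hypothetical, and the abstract does not state which PSO variant or neighbourhood structure the authors used.

```python
from typing import List, Tuple


def ring_neighbourhood(index: int, swarm_size: int, radius: int) -> List[int]:
    """Indices of the particles within `radius` positions on a ring."""
    return [(index + offset) % swarm_size for offset in range(-radius, radius + 1)]


def pso_step(
    position: List[float],
    velocity: List[float],
    personal_best: List[float],
    neighbourhood_best: List[float],
    r1: float,
    r2: float,
    inertia: float = 0.7,      # assumed value
    c_personal: float = 1.5,   # assumed value
    c_social: float = 1.5,     # assumed value
) -> Tuple[List[float], List[float]]:
    """One standard PSO update; r1 and r2 are random numbers in [0, 1]
    supplied by the caller so the step itself is deterministic."""
    new_velocity = [
        inertia * v + c_personal * r1 * (pb - x) + c_social * r2 * (nb - x)
        for x, v, pb, nb in zip(position, velocity, personal_best, neighbourhood_best)
    ]
    new_position = [x + v for x, v in zip(position, new_velocity)]
    return new_position, new_velocity


new_pos, new_vel = pso_step([0.0], [1.0], [1.0], [2.0], r1=0.5, r2=0.5)
print(new_pos, new_vel, ring_neighbourhood(0, 5, 1))
```

In a distributed setting each robot would hold one particle (a vector of search parameters) and compute `neighbourhood_best` only from the robots in its neighbourhood.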


Book ChapterDOI
07 Jul 2008
TL;DR: Results show that the gregarious behavior of the animals must be increased during stress for control by undirected stimuli to be successful, and that the presence of social amplification of stress allows for robust, low-stress control by controlling only a fraction of the herd.
Abstract: We study social control of a cow herd in which some of the animals are controlled by a sensing and actuation device mounted on the cow. The control is social in that it aims at exploiting the existing gregarious behavior of the animals, rather than controlling each individual directly. As a case study we consider the open-loop control of the herd's position using location-dependent stimuli. We propose a hybrid dynamical model for capturing the dynamics of the animals during periods of grazing and periods of stress. We assume that stress can either be induced by the sensing and actuation device or by social amplification due to observing/overhearing nearby stressed congeners. The dynamics of the grazing part of the proposed model have been calibrated using experimental data from 10 free-ranging cows, and various assumptions on the animal behavior under stress are investigated by a parameter sweep on the hybrid model. Results show that the gregarious behavior of the animals must be increased during stress for control by undirected stimuli to be successful. We also show that the presence of social amplification of stress allows for robust, low-stress control by controlling only a fraction of the herd.

39 citations


Book ChapterDOI
07 Jul 2008
TL;DR: It is suggested that stability of coordination is a general property of a certain class of interactively coupled dynamical systems, and that psychological explanations of an individual's sensitivity to social contingency need to take into account the role of the interaction process.
Abstract: We used an evolutionary robotics methodology to generate pairs of simulated agents capable of reliably establishing and maintaining a coordination pattern under noisy conditions. Unlike previous related work, agents were only evolved for this ability and not for their capacity to discriminate social contingency (i.e., a live responsive partner) from non-contingent engagements (i.e., a recording). However, when they were made to interact with a recording of their partner made during a successful previous interaction, the coordination pattern could not be established. An analysis of the system's underlying dynamics revealed (i) that stability of the coordination pattern requires ongoing mutuality of interaction, and (ii) that the interaction process is not only constituted by, but also constitutive of, individual behavior. We suggest that this stability of coordination is a general property of a certain class of interactively coupled dynamical systems, and conclude that psychological explanations of an individual's sensitivity to social contingency need to take into account the role of the interaction process.

38 citations


Book ChapterDOI
07 Jul 2008
TL;DR: The study shows that teamwork requires neither individual recognition nor inter-individual differences, and as such might contribute to the ongoing debate on the role of such characteristics for the division of labour in social insects.
Abstract: In social insect colonies, many tasks are performed by higher-order entities, such as groups and teams whose task solving capacities transcend those of the individual participants. In this paper, we investigate the emergence of such higher-order entities using a colony of up to 12 physical robots. We report on an experimental study in which the robots engage in a range of different activities, including exploration, path formation, recruitment, self-assembly and group transport. Once the robots start interacting with each other and with their environment, they self-organise into teams in which distinct roles are performed concurrently. The system displays a dynamical hierarchy of teamwork, the cooperating elements of which comprise higher-order entities. The study shows that teamwork requires neither individual recognition nor inter-individual differences, and as such might contribute to the ongoing debate on the role of such characteristics for the division of labour in social insects.

34 citations


Book ChapterDOI
07 Jul 2008
TL;DR: A novel computational model is presented that incorporates a biologically plausible hypothesis on the functions that the main nuclei of the amygdala might play in first and second order classical conditioning tasks.
Abstract: The mechanisms underlying learning in classical conditioning experiments play a key role in many learning processes of real organisms. This paper presents a novel computational model that incorporates a biologically plausible hypothesis on the functions that the main nuclei of the amygdala might play in first and second order classical conditioning tasks. The model proposes that in these experiments the first and second order conditioned stimuli (CS) are associated both (a) with the unconditioned stimuli (US) within the basolateral amygdala (BLA), and (b) directly with the unconditioned responses (UR) through the connections linking the lateral amygdala (LA) to the central nucleus of amygdala (CeA). The model, embodied in a simulated robotic rat, is validated by reproducing the results of first and second order conditioning experiments of both sham-lesioned and BLA-lesioned real rats.

27 citations


Book ChapterDOI
07 Jul 2008
TL;DR: The extended homeostatic networks designed to exploit the internal dynamics of a neural network in the absence of sensory input perform much better and are more adaptive to morphological disruptions that have never been experienced before by the agents.
Abstract: This study presents an extended model of homeostatic adaptation designed to exploit the internal dynamics of a neural network in the absence of sensory input. In order to avoid typical convergence to asymptotic states under these conditions, plastic changes in the network are induced in evolved neurocontrollers, leading to a renewal of dynamics that may favour sensorimotor adaptation. Other measures are taken to avoid loss of internal variability (as caused, for instance, by synaptic strength saturation). The method allows the generation of reliable adaptation to morphological disruptions in a simple simulated vehicle using a homeostatic neurocontroller that has been selected to behave homeostatically while performing the desired behaviour but non-homeostatically in other circumstances. The performance is compared with simple homeostatic neural controllers that have only been selected for a positive link between internal and behavioural stability. The extended homeostatic networks perform much better and are more adaptive to morphological disruptions that have never been experienced before by the agents.

26 citations


Book ChapterDOI
07 Jul 2008
TL;DR: This work employs non-linear dimensionality reduction to extract a canonical latent space that captures some of the essential topology of the unobserved task space and identifies suitable parametrisation of movements with control policies such that they are easily modulated to generate novel movements from the same class.
Abstract: We propose a novel methodology for learning and synthesising whole classes of high dimensional movements from a limited set of demonstrated examples that satisfy some underlying 'latent' low dimensional task constraints. We employ non-linear dimensionality reduction to extract a canonical latent space that captures some of the essential topology of the unobserved task space. In this latent space, we identify suitable parametrisation of movements with control policies such that they are easily modulated to generate novel movements from the same class and are robust to perturbations. We evaluate our method on controlled simulation experiments with simple robots (reaching and periodic movement tasks) as well as on a data set of very high-dimensional human (punching) movements. We verify that we can generate a continuum of new movements from the demonstrated class from only a few examples in both robotic and human data.

22 citations


Book ChapterDOI
07 Jul 2008
TL;DR: A simple, but flexible control method inspired by a biological adaptation mechanism is proposed that can be applied well to the control of a robot with multi-DOF.
Abstract: Controlling a highly dynamic and unknown system with existing control methods is difficult because of its complexity. Recent biological studies reveal that animals utilize biological fluctuations to achieve adaptability to the environment and high flexibility. In this paper, we propose a simple but flexible control method inspired by a biological adaptation mechanism. The proposed method is then applied to control robotic arms. The simulation results indicate that the proposed method can be applied well to the control of a robot with multiple degrees of freedom (multi-DOF).

20 citations


Book ChapterDOI
07 Jul 2008
TL;DR: This work proposes a novel bio-inspired integrated neural-network architecture that on one side uses attention to guide and furnish the parameters to action, and on the other side uses the effects of action to train the task-oriented top-down attention components of the system.
Abstract: The active vision and attention-for-action frameworks propose that in organisms attention and perception are closely integrated with action and learning. This work proposes a novel bio-inspired integrated neural-network architecture that on one side uses attention to guide and furnish the parameters to action, and on the other side uses the effects of action to train the task-oriented top-down attention components of the system. The architecture is tested both with a simulated and a real camera-arm robot engaged in a reaching task. The results highlight the computational opportunities and difficulties deriving from a close integration of attention, action and learning.

Book ChapterDOI
07 Jul 2008
TL;DR: It is argued that there is only one meaningful unsupervised learning process that can be applied to a vast data stream: adaptive compression.
Abstract: The purpose of this paper is to outline a new formulation of statistical learning that will be more useful and relevant to the field of robotics. The primary motivation for this new perspective is the mismatch between the form of data assumed by current statistical learning algorithms, and the form of data that is actually generated by robotic systems. Specifically, robotic systems generate a vast unlabeled data stream, while most current algorithms are designed to handle limited numbers of discrete, labeled, independent and identically distributed samples. We argue that there is only one meaningful unsupervised learning process that can be applied to a vast data stream: adaptive compression. The compression rate can be used to compare different techniques, and statistical models obtained through adaptive compression should also be useful for other tasks.
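
The claim that the compression rate can serve as a yardstick can be made concrete with a small sketch. This is an illustration only: it uses the general-purpose `zlib` compressor from the Python standard library as a stand-in, not the adaptive compression methods the paper argues for, and the two data streams are hypothetical.

```python
import random
import zlib


def compression_rate(stream: bytes) -> float:
    """Compressed size divided by raw size; lower means the compressor
    found more regularity in the stream."""
    return len(zlib.compress(stream, 9)) / len(stream)


rng = random.Random(0)
# A stream with strong regularity (a repeating 16-value pattern).
structured = bytes(i % 16 for i in range(10_000))
# A stream with no regularity for the compressor to exploit.
noise = bytes(rng.getrandbits(8) for _ in range(10_000))

print(compression_rate(structured), compression_rate(noise))
```

Two candidate models of the same sensor stream could be compared in the same way: the one that yields the shorter encoding has captured more of the stream's structure.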

Book ChapterDOI
07 Jul 2008
TL;DR: It is demonstrated that it is possible to evolve globally stable neurocontrollers containing a single basin of attraction, which nevertheless sustain multiple modes of behaviour, and results are presented suggesting that this globally stable regime may constitute an evolvable and dynamically rich subset of recurrent neural network configurations, especially in larger networks.
Abstract: Recent artificial neural networks for machine learning have exploited transient dynamics around globally stable attractors, inspired by the properties of cortical microcolumns. Here we explore whether similarly constrained neural network controllers can be exploited for embodied, situated adaptive behaviour. We demonstrate that it is possible to evolve globally stable neurocontrollers containing a single basin of attraction, which nevertheless sustain multiple modes of behaviour. This is achieved by exploiting interaction between environmental input and transient dynamics. We present results that suggest that this globally stable regime may constitute an evolvable and dynamically rich subset of recurrent neural network configurations, especially in larger networks. We discuss the issue of scalability and the possibility that there may be alternative adaptive behaviour tasks that are more 'attractor hungry'.

Book ChapterDOI
07 Jul 2008
TL;DR: A heuristic based on self-organizing swarm principles is presented for the NP-hard Euclidean Steiner tree problem, the problem of connecting objects in a plane efficiently; its performance is suboptimal, but the system is adaptive and robust.
Abstract: It is becoming state-of-the-art to form large-scale multi-agent systems or artificial swarms showing adaptive behavior by constructing high numbers of cooperating, embodied, mobile agents (robots). For the sake of space- and cost-efficiency such robots are typically miniaturized and equipped with only a few sensors and actuators, resulting in rather simple devices. In order to overcome these constraints, bio-inspired concepts of self-organization and emergent properties are applied. Thus, accuracy is usually not a trait of such systems, but robustness and fault tolerance are. It turns out that they are applicable even to hard problems and reliably deliver approximate solutions. Based on these principles we present a heuristic for the Euclidean Steiner tree problem, which is NP-hard. Basically, it is the problem of connecting objects in a plane efficiently. The proposed system is investigated from two different viewpoints: computationally and behaviorally. While the performance is, as expected, clearly suboptimal but still reasonably good, the system is adaptive and robust.

Book ChapterDOI
07 Jul 2008
TL;DR: The results showed that although a certain segregation between the lower sensory-motor level and the higher cognitive level enhances task performance, meta-level cognition is significantly supported by the embodiment and the lower-level sensory-motor properties.
Abstract: The current paper studies possible neuronal mechanisms for meta-level cognition of rule switching. In contrast to the conventional approach of hand-designing the cognitive functions, our study employs evolutionary processes to search for neuronal mechanisms accounting for meta-level cognitive functions required in the investigated robotic tasks. Our repeated simulation experiments showed that the different rules are embedded in separate self-organized attractors, while rule switching is enabled by the transitions among attractors. Furthermore, the results showed that although a certain segregation between the lower sensory-motor level and the higher cognitive level enhances task performance, meta-level cognition is significantly supported by the embodiment and the lower-level sensory-motor properties.

Book ChapterDOI
07 Jul 2008
TL;DR: The results show that the location of the robot is well predictable from the activity of a population of model place cells, thus the model is suitable to be used as a basic building block of location-based navigation strategies.
Abstract: A computer model of learning and representing spatial locations is studied. The model builds on biological constraints and assumptions drawn from the anatomy and physiology of the hippocampal formation of the rat. The emphasis of the presented research is on the usability of a computer model originally proposed to describe episodic memory capabilities of the hippocampus in a spatial task. In the present model two modalities, vision and path integration, contribute to the recognition of a given place. We study how place cell activity emerges due to Hebbian learning in the model hippocampus as a result of random exploration of the environment. The model is implemented in the Webots mobile robotics simulation software. Our results show that the location of the robot is well predictable from the activity of a population of model place cells, thus the model is suitable to be used as a basic building block of location-based navigation strategies. However, some properties of the stored memories strongly resemble those of episodic memories, which do not match the specific requirements of spatial tasks.

Book ChapterDOI
07 Jul 2008
TL;DR: Extensions to the method, which can improve its robustness to severe motor noise and to major disruptions such as being displaced along its route, are investigated.
Abstract: This paper presents an investigation into the robustness to motor noise of an insect-inspired visual navigation method that links together local view-based navigation in a series of visual locales automatically defined by the method. The method is tested in the real world using specialist robotic equipment that allows a controllable level of motor noise to be used. Extensions to the method, which can improve its robustness to severe motor noise and to major disruptions such as being displaced along its route, are investigated.

Book ChapterDOI
07 Jul 2008
TL;DR: New computational descriptions of the tasks performed by CIP as a fundamental relay station between the visual cortex and the visuomotor areas downstream are offered.
Abstract: The information flow along the dorsal visual stream of the primate brain is being thoroughly studied in neuroscience, and this research is being used in artificial intelligence applications. The knowledge regarding one of its most critical stages though, the posterior intraparietal area CIP, remains relatively undeveloped. This paper offers new computational descriptions of the tasks performed by CIP as a fundamental relay station between the visual cortex and the visuomotor areas downstream. Analytical expressions of the transfer functions realized by surface and axes orientation selective neurons (SOS and AOS) of CIP are derived and discussed.

Book ChapterDOI
07 Jul 2008
TL;DR: In this study, the robustness of the extended BRL is investigated through further experiments and shows higher robustness and relearning ability against an environmental change as compared to the standard BRL.
Abstract: We have developed a new reinforcement learning (RL) technique called Bayesian-discrimination-function-based reinforcement learning (BRL). BRL is unique, in that it does not have state and action spaces designed by a human designer, but adaptively segments them through the learning process. Compared to other standard RL algorithms, BRL has been proven to be more effective in handling problems encountered by multi-robot systems (MRS), which operate in a learning environment that is naturally dynamic. Furthermore, we have developed an extended form of BRL in order to improve the learning efficiency. Instead of generating a random action when a robot functioning within the framework of the standard BRL encounters an unknown situation, the extended BRL generates an action determined by linear interpolation among the rules that have high similarity to the current sensory input. In this study, we investigate the robustness of the extended BRL through further experiments. In both physical experiments and computer simulations, the extended BRL shows higher robustness and relearning ability against an environmental change as compared to the standard BRL.
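
The interpolation step described for the extended BRL can be sketched as a similarity-weighted average over stored rules. This is a loose illustration, not the published algorithm: the Gaussian similarity measure, the threshold, and every name below are assumptions, whereas BRL itself relies on Bayesian discrimination functions that the abstract does not spell out.

```python
import math
from typing import List, Optional, Sequence, Tuple

Rule = Tuple[Sequence[float], Sequence[float]]  # (prototype input, action)


def interpolate_action(
    sensor_input: Sequence[float],
    rules: List[Rule],
    width: float = 1.0,           # assumed kernel width
    min_similarity: float = 0.1,  # assumed threshold for "high similarity"
) -> Optional[List[float]]:
    """Blend the actions of sufficiently similar rules, weighted by similarity.
    Returns None when no rule is similar enough, in which case a caller
    could fall back to another strategy such as a random action."""
    weighted = []
    for prototype, action in rules:
        sq_dist = sum((s - p) ** 2 for s, p in zip(sensor_input, prototype))
        similarity = math.exp(-sq_dist / (2 * width ** 2))
        if similarity >= min_similarity:
            weighted.append((similarity, action))
    if not weighted:
        return None
    total = sum(w for w, _ in weighted)
    dims = len(weighted[0][1])
    return [sum(w * a[d] for w, a in weighted) / total for d in range(dims)]


rules = [((0.0,), (0.0,)), ((2.0,), (1.0,))]
midway = interpolate_action((1.0,), rules)   # equally similar to both rules
far_away = interpolate_action((100.0,), rules)
print(midway, far_away)
```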

Book ChapterDOI
07 Jul 2008
TL;DR: A scheme is proposed for designing a robot to which a partner can teach interaction rules through interaction, so that the robot acquires a rule adaptively without explicit teaching.
Abstract: We aim to realize human-robot social game interaction as a kind of communication. Based on developmental psychology, we propose a hypothetical account, from a mechanistic standpoint, of how social game interaction develops between an infant and a caregiver. Social games have rules, that is, specific relationships between action and response. Applying the hypothesis, we also propose a scheme for designing a robot to which a partner can teach interaction rules through interaction. To investigate the proposed scheme, we built a dynamic model that realizes imitation and rule-based interaction and switches between them by observing the partner's response. In the experiment, the partner could teach, and the robot could adaptively acquire, a rule through interaction without explicit teaching; a second rule was subsequently acquired in the same way without a reset.

Book ChapterDOI
07 Jul 2008
TL;DR: Results show that inclusion of internal drive levels in valence system input significantly improves performance and a valence system based purely on internal drives outperforms a system that is additionally based on perceptual input.
Abstract: We compare the performance of drive- versus perception-based motivational systems in an unstable environment. We investigate the hypothesis that valence systems (systems that evaluate positive and negative nature of events) that are based on internal physiology will have an advantage over systems that are based purely on external sensory input. Results show that inclusion of internal drive levels in valence system input significantly improves performance. Furthermore, a valence system based purely on internal drives outperforms a system that is additionally based on perceptual input. We provide arguments for why this is so and relate our architecture to brain areas involved in animal learning.

Book ChapterDOI
07 Jul 2008
TL;DR: Controllers evolved with neural noise are more robust and may still function in the absence of noise, and a general hypothesis is proposed according to which evolution implicitly selects neural systems that operate in noise-resistant landscapes which are hard to bifurcate and/or bifurcate while retaining functionality.
Abstract: Continuous-time recurrent neural networks affected by random additive noise are evolved to produce phototactic behaviour in simulated mobile agents. The resulting neurocontrollers are evaluated after evolution against perturbations and for different levels of neural noise. Controllers evolved with neural noise are more robust and may still function in the absence of noise. Evidence from behavioural tests indicates that robust controllers do not undergo noise-induced bifurcations or, if they do, the transient dynamics remain functional. A general hypothesis is proposed according to which evolution implicitly selects neural systems that operate in noise-resistant landscapes which are hard to bifurcate and/or bifurcate while retaining functionality.
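
As background, a continuous-time recurrent neural network with additive noise can be integrated with a simple Euler step. The sketch below uses the common CTRNN form (leaky state, sigmoid output, weighted recurrent input); how the noise enters and how it is scaled are assumptions, since the abstract does not give the exact noise model, and all names are hypothetical.

```python
import math
import random
from typing import List


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def ctrnn_step(
    y: List[float],
    weights: List[List[float]],  # weights[j][i]: connection from neuron j to neuron i
    biases: List[float],
    taus: List[float],
    inputs: List[float],
    dt: float,
    noise_std: float = 0.0,      # assumed: Gaussian noise added to each neuron's input
    rng=random,
) -> List[float]:
    """One Euler step of tau_i * dy_i/dt = -y_i + sum_j w_ji * sigma(y_j + b_j) + I_i + noise."""
    outputs = [sigmoid(y_j + b_j) for y_j, b_j in zip(y, biases)]
    new_y = []
    for i in range(len(y)):
        synaptic = sum(weights[j][i] * outputs[j] for j in range(len(y)))
        noise = rng.gauss(0.0, noise_std) if noise_std > 0 else 0.0
        dy = (-y[i] + synaptic + inputs[i] + noise) / taus[i]
        new_y.append(y[i] + dt * dy)
    return new_y


# A single unconnected neuron driven by a constant input, without noise.
clean = ctrnn_step([0.0], [[0.0]], [0.0], [1.0], [1.0], dt=0.1)
# The same step with neural noise switched on.
noisy = ctrnn_step([0.0], [[0.0]], [0.0], [1.0], [1.0], dt=0.1,
                   noise_std=0.5, rng=random.Random(1))
print(clean, noisy)
```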

Proceedings Article
01 Jan 2008
TL;DR: Comparing three types of evolved agents with radically different embodiment (a simulated arm, a two-wheeled robot and an agent generating a velocity vector in Euclidean space) identifies differences in evolved behaviours and structural invariants of the task across embodiments.
Abstract: We present the results from an evolutionary robotics simulation model of a recent unpublished experiment on human perceptual crossing in a minimal virtual two-dimensional environment. These experiments demonstrate that the participants reliably engage in rhythmic interaction with each other, moving along a line. Comparing three types of evolved agents with radically different embodiment (a simulated arm, a two-wheeled robot and an agent generating a velocity vector in Euclidean space), we identify differences in evolved behaviours and structural invariants of the task across embodiments. The simulation results open an interesting perspective on the experimental study and generate hypotheses about the role of arm morphology for the behaviour observed.

Book ChapterDOI
07 Jul 2008
TL;DR: Homeotaxis subsumes the homeokinetic principle, extending it both in terms of scope (multi-agent self-organisation) and the state-space, and allows the best adaptive strategy to be selected for the considered system.
Abstract: We present a novel approach to self-organisation of coordinated behaviour among multiple resource-sharing agents. We consider a hierarchical multi-agent system comprising multiple energy-dependent agents split into local neighbourhoods, each with a dedicated controller, and a centralised coordinator dealing only with the controllers. The coordinated behaviour is required in order to achieve a balance between the overall resource consumption by the multi-agent collective and the stress on the community. Minimising the resource consumption increases the stress, while reducing the stress may lead to unrestricted and highly unpredictable demand, harming the individual agents in the long run. We identify underlying forces in the system's dynamics, suggest a number of quantitative measures used to contrast different strategies, and introduce a novel strategy based on persistent sensorimotor time-loops: homeotaxis. Homeotaxis subsumes the homeokinetic principle, extending it both in terms of scope (multi-agent self-organisation) and the state-space, and allows the best adaptive strategy to be selected for the considered system.

Book ChapterDOI
07 Jul 2008
TL;DR: This work gives new experimental evidence for previously unobserved short-term adaptiveness in ant foraging and develops Ito diffusion models that explain the newly discovered behavior qualitatively and quantitatively.
Abstract: Ant foraging is a paradigmatic example of self-organized behavior. We give new experimental evidence for previously unobserved short-term adaptiveness in ant foraging and show that current mathematical foraging models cannot predict this behavior. As a true extension, we develop Ito diffusion models that explain the newly discovered behavior qualitatively and quantitatively. The theoretical analysis is supported by individual-based simulations. Our work shows that randomness is a key factor in allowing self-organizing systems to be adaptive. Implications for technical applications of Swarm Intelligence are discussed.
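
An Ito diffusion of the general form dX = a(X) dt + b(X) dW can be simulated with the Euler-Maruyama scheme. The sketch below is generic: the drift and diffusion functions shown are placeholders chosen for illustration, not the foraging models developed in the paper.

```python
import math
import random
from typing import Callable, List


def euler_maruyama(
    drift: Callable[[float], float],
    diffusion: Callable[[float], float],
    x0: float,
    dt: float,
    steps: int,
    rng: random.Random,
) -> List[float]:
    """Simulate dX = drift(X) dt + diffusion(X) dW and return the sampled path."""
    x = x0
    path = [x]
    for _ in range(steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Wiener increment with variance dt
        x = x + drift(x) * dt + diffusion(x) * dw
        path.append(x)
    return path


rng = random.Random(42)
# Placeholder dynamics: relaxation toward zero, first without and then with noise.
deterministic = euler_maruyama(lambda x: -x, lambda x: 0.0, 1.0, 0.1, 1, rng)
stochastic = euler_maruyama(lambda x: -x, lambda x: 0.3, 1.0, 0.01, 500, rng)
print(deterministic[-1], len(stochastic))
```

Setting the diffusion term to zero recovers the deterministic model, which makes it easy to compare behaviour with and without randomness.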

Book ChapterDOI
07 Jul 2008
TL;DR: It is suggested that animals and humans are fruitfully understood as representing their world as a set of chained predictions and proposed that generalization in artificial agents may benefit from a similar approach.
Abstract: Learning when and how to generalize knowledge from past experience to novel circumstances is a challenging problem many agents face. In animals, this generalization can be caused by mediated conditioning, in which two stimuli gain a relationship through the mediation of a third stimulus. For example, in sensory preconditioning, if a light is always followed by a tone, and that tone is later paired with a shock, the light will come to elicit a fear reaction, even though the light was never directly paired with shock. In this paper, we present a computational model of mediated conditioning based on reinforcement learning with predictive representations. In the model, animals learn to predict future observations through the temporal-difference algorithm. These predictions are generated using both current observations and other predictions. The model was successfully applied to a range of animal learning phenomena, including sensory preconditioning, acquired equivalence, and mediated aversion. We suggest that animals and humans are fruitfully understood as representing their world as a set of chained predictions and propose that generalization in artificial agents may benefit from a similar approach.
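
The sensory preconditioning example can be reproduced in a toy form with two chained one-step predictions. This sketch is a deliberate simplification and not the paper's model: it uses a plain delta rule (a one-step prediction error with no bootstrapping term) in place of the temporal-difference algorithm, and the function names, learning rate and trial counts are hypothetical.

```python
def train(weight: float, cue: float, outcome: float, trials: int,
          learning_rate: float = 0.2) -> float:
    """Move the linear prediction `weight * cue` toward `outcome` by a delta rule."""
    for _ in range(trials):
        error = outcome - weight * cue
        weight += learning_rate * error * cue
    return weight


def predicted_shock(light: float, tone: float,
                    w_light_tone: float, w_tone_shock: float) -> float:
    """Chain the predictions: when the tone is not observed, the shock
    predictor is driven by the *predicted* tone instead."""
    tone_signal = max(tone, w_light_tone * light)
    return w_tone_shock * tone_signal


# Phase 1: light is followed by tone. Phase 2: tone is followed by shock.
w_light_tone = train(0.0, cue=1.0, outcome=1.0, trials=50)
w_tone_shock = train(0.0, cue=1.0, outcome=1.0, trials=50)

# Test: present the light alone.
preconditioned = predicted_shock(1.0, 0.0, w_light_tone, w_tone_shock)
# Control: skip phase 1, so the light predicts nothing.
control = predicted_shock(1.0, 0.0, 0.0, w_tone_shock)
print(preconditioned, control)
```

The light elicits a shock prediction only because one prediction feeds another, even though light and shock were never paired in training.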

Book ChapterDOI
07 Jul 2008
TL;DR: This paper investigates the feasibility and utility of an artificial model of anger and fear based on Interruption Theory of Emotions and shows that the model improves the adaptability of a group of agents by simultaneously optimizing multiple performance criteria.
Abstract: Emotions play several important roles in the cognition of human beings and other life forms, and are therefore a legitimate inspiration to provide adaptability and autonomy to situated agents. However, there is no unified theory of emotions and many discoveries are yet to be made in the applicability of emotions to situated agents. This paper investigates the feasibility and utility of an artificial model of anger and fear based on Interruption Theory of Emotions. This model detects and highlights situations for which an agent's decision-making mechanism is no longer pertinent. These situations are detected by analyzing discrepancies between the agent's actions and its intentions, making this model independent from the agent's environment and tasks. Collective foraging simulations are used to characterize the influence of the model. Results show that the model improves the adaptability of a group of agents by simultaneously optimizing multiple performance criteria.

Book ChapterDOI
07 Jul 2008
TL;DR: It is argued that off-line simulations allow an agent to coordinate not only with the present but also with the future, and to act in a goal-directed manner.
Abstract: In a simulated guards-and-thieves scenario we study how the behavioral system of an autonomous agent, which consists of multiple perceptual and motor schemas endowed with anticipatory mechanisms, self-organizes to satisfy its drives. Furthermore, we study how schemas acquired for navigation can be re-used off-line, 'in simulation', for forecasting future dangers and planning trajectories leading to goal locations. We argue that off-line simulations allow an agent to coordinate not only with the present but also with the future, and to act in a goal-directed manner.

Book ChapterDOI
07 Jul 2008
TL;DR: A sub-symbolic connectionist model in which a functionally compositional system self-organizes by learning a provided set of goal-directed actions is proposed, potentially explaining a possible continuous process underlying the transitions from rote knowledge to systematized knowledge.
Abstract: We propose a sub-symbolic connectionist model in which a functionally compositional system self-organizes by learning a provided set of goal-directed actions. This approach is compatible with an idea taken from usage-based accounts of the developmental learning of language, especially one theory of infants' acquisition process of symbols. The presented model potentially explains a possible continuous process underlying the transitions from rote knowledge to systematized knowledge by drawing an analogy to the formation process of a geometric regular arrangement of points. Based on the experimental results, the essential underlying process is discussed.

Book ChapterDOI
07 Jul 2008
TL;DR: It is shown that effective reinforcement learning is indeed possible, but only when stimuli are gated so as to occur as near-synchronous patterns of neural activity and when neuroanatomical constraints are imposed which predispose agents to exploratory behaviours.
Abstract: It has been shown recently that dopamine-signalled modulation of spike timing-dependent synaptic plasticity (DA-STDP) can enable reinforcement learning of delayed stimulus-reward associations when both stimulus and reward are delivered at precisely timed intervals. Here, we test whether a similar model can support learning in an embodied context, in which timing of both sensory input and delivery of reward depend on the agent's behaviour. We show that effective reinforcement learning is indeed possible, but only when stimuli are gated so as to occur as near-synchronous patterns of neural activity and when neuroanatomical constraints are imposed which predispose agents to exploratory behaviours. Extinction of learned responses in this model is subsequently shown to result from agent-environment interactions and not directly from any specific neural mechanism.