
Showing papers in "IEEE Transactions on Autonomous Mental Development in 2009"


Journal ArticleDOI
TL;DR: Cognitive developmental robotics aims to provide a new understanding of how humans' higher cognitive functions develop by means of a synthetic approach that developmentally constructs cognitive functions through interactions with the environment, including other agents.
Abstract: Cognitive developmental robotics (CDR) aims to provide a new understanding of how humans' higher cognitive functions develop by means of a synthetic approach that developmentally constructs cognitive functions. The core idea of CDR is "physical embodiment," which enables information structuring through interactions with the environment, including other agents. The idea is shaped by a hypothesized model of the development of human cognitive functions, from body representation to social behavior. Along with the model, studies of CDR and related work are introduced, and the model and future issues are discussed.

519 citations


Journal ArticleDOI
TL;DR: This paper introduces a novel formulation of IAC, called robust intelligent adaptive curiosity (R-IAC), and shows that its performance as an intrinsically motivated active learning algorithm is far superior to that of IAC in a complex sensorimotor space where only a small subspace is neither unlearnable nor trivial.
Abstract: Intelligent adaptive curiosity (IAC) was initially introduced as a developmental mechanism allowing a robot to self-organize developmental trajectories of increasing complexity without preprogramming the particular developmental stages. In this paper, we argue that IAC and other intrinsically motivated learning heuristics can be viewed as active learning algorithms that are particularly suited for learning forward models in unprepared sensorimotor spaces with large unlearnable subspaces. Then, we introduce a novel formulation of IAC, called robust intelligent adaptive curiosity (R-IAC), and show that its performance as an intrinsically motivated active learning algorithm is far superior to that of IAC in a complex sensorimotor space where only a small subspace is neither unlearnable nor trivial. We also show results in which the learnt forward model is reused in a control scheme. Finally, accompanying open-source software containing these algorithms, as well as tools to reproduce all the experiments presented in this paper, is made publicly available.

147 citations
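To make the active learning idea concrete, here is a minimal Python sketch of the learning-progress heuristic that IAC-style algorithms build on (an illustration under our own simplifications, not the authors' released software; the `Region` class and `choose_region` helper are hypothetical names): each region of sensorimotor space keeps a history of prediction errors, progress is the recent drop in mean error, and the learner preferentially samples where progress is highest.

```python
import random

class Region:
    """One region of sensorimotor space with its prediction-error history."""
    def __init__(self):
        self.errors = []

    def learning_progress(self, window=10):
        # Progress = drop in mean prediction error between the older
        # and the newer half of a sliding window (zero until enough data).
        if len(self.errors) < 2 * window:
            return 0.0
        old = self.errors[-2 * window:-window]
        new = self.errors[-window:]
        return sum(old) / window - sum(new) / window

def choose_region(regions, epsilon=0.2):
    """Epsilon-greedy choice of the region whose model is improving fastest."""
    if random.random() < epsilon:
        return random.choice(regions)
    return max(regions, key=lambda r: r.learning_progress())
```

Trivial regions show flat near-zero error and unlearnable regions show persistently high but flat error, so both yield low progress and are sampled rarely; this is the sense in which such heuristics cope with large unlearnable subspaces.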


Journal ArticleDOI
TL;DR: This paper formulates five basic principles of developmental robotics, based on recurring themes in the developmental learning literature and in the author's own research, and shows how they can be applied to the problem of autonomous tool use in robots.
Abstract: This paper formulates five basic principles of developmental robotics. These principles are formulated based on some of the recurring themes in the developmental learning literature and in the author's own research. The five principles follow logically from the verification principle (postulated by Richard Sutton) which is assumed to be self-evident. This paper also gives an example of how these principles can be applied to the problem of autonomous tool use in robots.

113 citations


Journal ArticleDOI
TL;DR: It is argued that in-place learning algorithms will be crucial for real-world large-size developmental applications due to their simplicity, low computational complexity, and generality.
Abstract: Development imposes great challenges. Internal "cortical" representations must be autonomously generated from interactive experiences. The eventual quality of these developed representations is of course important. Additionally, learning must be as fast as possible, to quickly derive better representations from limited experiences. Those who achieve both will have competitive advantages. We present a cortex-inspired theory called lobe component analysis (LCA), guided by these dual criteria. A lobe component represents a high concentration of probability density in the neuronal input space. We explain, through mathematical analysis, how lobe components can achieve dual spatiotemporal ("best" and "fastest") optimality, describing how lobe component plasticity can be temporally scheduled to take the history of observations into account in the best possible way. This contrasts with using only the last observation, as in gradient-based adaptive learning algorithms. Since they are based on two cell-centered mechanisms, Hebbian learning and lateral inhibition, lobe components develop in place, meaning every networked neuron is individually responsible for learning its signal-processing characteristics within its connected network environment. There is no need for a separate learning network. We argue that in-place learning algorithms will be crucial for real-world, large-scale developmental applications due to their simplicity, low computational complexity, and generality. Our experimental results show that the learning speed of the LCA algorithm is drastically faster than that of other Hebbian-based updating methods and independent component analysis algorithms, thanks to its dual optimality, and that it does not need any second- or higher-order statistics. We also introduce the new principle of fast learning from stable representation.

97 citations
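A minimal sketch of an LCA-style in-place update, following the published amnesic-mean schedule as we understand it (illustrative code, not the authors' implementation): lateral inhibition is reduced to winner-take-all, and the winner moves toward the input by a Hebbian term whose learning rate is scheduled by the neuron's own firing age.

```python
import numpy as np

def lca_update(V, x, ages, t1=20, t2=500, c=2.0, r=1000.0):
    """One in-place LCA step on the lobe component matrix V (rows = neurons)."""
    norms = np.linalg.norm(V, axis=1) + 1e-9
    responses = (V @ x) / norms          # pre-response of each lobe component
    j = int(np.argmax(responses))        # lateral inhibition: winner takes all
    ages[j] += 1
    n = ages[j]
    # Amnesic mean: mu grows with firing age so that recent observations
    # keep a nonzero weight even after long experience.
    if n <= t1:
        mu = 0.0
    elif n <= t2:
        mu = c * (n - t1) / (t2 - t1)
    else:
        mu = c + (n - t2) / r
    w_old = (n - 1 - mu) / n             # retention of the old estimate
    w_new = (1 + mu) / n                 # weight of the new Hebbian term y * x
    V[j] = w_old * V[j] + w_new * responses[j] * x
    return j
```

Because each neuron updates itself from its own input and age alone, no separate learning network is needed, which is the in-place property the paper argues for.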


Journal ArticleDOI
TL;DR: First, the approach is shown to be more efficient than evolving a single central controller for all agents, and second, cooperation is found to be most efficient through stigmergy, i.e., through role-based responses to the environment, rather than communication between the agents.
Abstract: In tasks such as pursuit and evasion, multiple agents need to coordinate their behavior to achieve a common goal. An interesting question is, how can such behavior be best evolved? A powerful approach is to control the agents with neural networks, coevolve them in separate subpopulations, and test them together in the common task. In this paper, such a method, called multiagent enforced subpopulations (multiagent ESP), is proposed and demonstrated in a prey-capture task. First, the approach is shown to be more efficient than evolving a single central controller for all agents. Second, cooperation is found to be most efficient through stigmergy, i.e., through role-based responses to the environment, rather than communication between the agents. Together these results suggest that role-based cooperation is an effective strategy in certain multiagent tasks.

87 citations
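For illustration, here is a heavily simplified cooperative-coevolution loop in the spirit of multiagent ESP (ESP proper evolves neuron-level subpopulations within each agent's network; this sketch keeps one genome population per agent, with genomes as flat weight lists, and the `evaluate` callback returning the team's shared task reward is an assumed interface):

```python
import random

def coevolve(subpops, evaluate, generations=50, trials_per_gen=200, sigma=0.1):
    """Coevolve one subpopulation per agent; every team member
    inherits the shared fitness of the teams it participated in."""
    for _ in range(generations):
        scores = [[[] for _ in pop] for pop in subpops]
        for _ in range(trials_per_gen):
            picks = [random.randrange(len(pop)) for pop in subpops]
            team = [pop[i] for pop, i in zip(subpops, picks)]
            f = evaluate(team)                      # shared task reward
            for a, i in enumerate(picks):
                scores[a][i].append(f)
        for a, pop in enumerate(subpops):
            avg = [sum(s) / len(s) if s else 0.0 for s in scores[a]]
            order = sorted(range(len(pop)), key=avg.__getitem__, reverse=True)
            elite = [pop[i] for i in order[:len(pop) // 2]]
            # Refill the bottom half with mutated copies of the elite.
            pop[:] = elite + [
                [g + random.gauss(0.0, sigma) for g in random.choice(elite)]
                for _ in range(len(pop) - len(elite))
            ]
    return subpops
```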


Journal ArticleDOI
TL;DR: New findings are reported using a novel method that describes the visual learning environment from a young child's point of view and measures the visual information a child perceives during real-time toy play with a parent; the findings have broad implications for how one studies and thinks about human and artificial learning systems.
Abstract: An important goal in studying both human intelligence and artificial intelligence is to understand how a natural or an artificial learning system deals with the uncertainty and ambiguity of the real world. For a natural intelligence system such as a human toddler, the relevant aspects of a learning environment are only those that make contact with the learner's sensory system. In real-world interactions, what the child perceives critically depends on his own actions, as these actions bring information into and out of the learner's sensory field. The present analyses indicate how, in the case of a toddler playing with toys, these perception-action loops may simplify the learning environment by selecting relevant information and filtering irrelevant information. This paper reports new findings using a novel method that seeks to describe the visual learning environment from a young child's point of view and measures the visual information that a child perceives in real-time toy play with a parent. The main results are: 1) what the child perceives depends primarily on his own actions, but also on his social partner's actions; 2) manual actions, in particular, play a critical role in creating visual experiences in which one object dominates; 3) this selecting and filtering of visual objects through the child's actions provides more constrained and cleaner input that seems likely to facilitate cognitive learning processes. These findings have broad implications for how one studies and thinks about human and artificial learning systems.

80 citations


Journal ArticleDOI
TL;DR: This analysis employing a bottom-up attention model revealed that motionese has the effects of highlighting the initial and final states of the action, indicating significant state changes in it, and underlining the properties of objects used in the action.
Abstract: A difficulty in robot action learning is that robots do not know where to attend when observing an action demonstration. Inspired by human parent-infant interaction, we suggest that parental action demonstration to infants, called motionese, can scaffold robot learning as it does infants'. Since infants' knowledge about the context is limited, which is comparable to robots', parents are expected to properly guide their attention by emphasizing the important aspects of the action. Our analysis employing a bottom-up attention model revealed that motionese has the effects of highlighting the initial and final states of the action, indicating significant state changes within it, and underlining the properties of objects used in the action. Suppression and addition of parents' body movement and their frequent social signals to infants produced these effects. We discuss our findings with a view to designing robots that can take advantage of parental teaching.

74 citations


Journal ArticleDOI
TL;DR: A model of bottom-up attention by multimodal signal-level synchrony informed by recent adult-infant interaction studies is proposed and it is demonstrated that the model is receptive to parental cues during child-directed tutoring.
Abstract: Infants learning about their environment are confronted with many stimuli of different modalities. A crucial problem is therefore how to discover which stimuli are related, for instance, in learning words. In making these multimodal "bindings," infants depend on social interaction with a caregiver to guide their attention towards relevant stimuli. The caregiver might, for example, visually highlight an object by shaking it while vocalizing the object's name. These cues are known to help structure the continuous stream of stimuli. To detect and exploit them, we propose a model of bottom-up attention based on multimodal signal-level synchrony. We focus on the guidance of visual attention by audio-visual synchrony, informed by recent adult-infant interaction studies. We demonstrate that our model is receptive to parental cues during child-directed tutoring. The findings discussed in this paper are consistent with recent results from developmental psychology but are obtained, for the first time, with an objective computational model. The presence of "multimodal motherese" is verified directly on the audio-visual signal. Lastly, we hypothesize how our computational model facilitates tutoring interaction and discuss its application in interactive learning scenarios, enabling social robots to benefit from adult-like tutoring.

48 citations
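One simple way to operationalize signal-level audio-visual synchrony, sketched here under our own assumptions rather than as the paper's exact model, is a sliding-window correlation between an audio energy envelope and a visual motion-energy trace; peaks would mark moments such as an object being shaken while it is named.

```python
import numpy as np

def av_synchrony(audio_energy, motion_energy, window=25):
    """Sliding-window correlation between an audio envelope and a
    visual motion-energy signal, both sampled at the same frame rate.
    High values suggest synchronous audio-visual events."""
    n = min(len(audio_energy), len(motion_energy))
    scores = np.zeros(n)
    for t in range(window, n):
        a = np.asarray(audio_energy[t - window:t])
        v = np.asarray(motion_energy[t - window:t])
        if a.std() > 0 and v.std() > 0:   # correlation undefined on flat signals
            scores[t] = np.corrcoef(a, v)[0, 1]
    return scores
```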


Journal ArticleDOI
TL;DR: It is proposed that the biological mechanism of spike timing-dependent plasticity, which synchronizes neural dynamics almost everywhere in the central nervous system, constitutes the perfect algorithm for detecting contingency in sensorimotor networks.
Abstract: Agency is the sense that I am the cause or author of a movement. Babies develop this feeling early by perceiving the contingency between afferent (sensory) and efferent (motor) information. A comparator model, hypothesized to be associated with many brain regions, monitors and simulates the concordance between self-produced actions and their consequences. In this paper, we propose that the biological mechanism of spike timing-dependent plasticity, which synchronizes neural dynamics almost everywhere in the central nervous system, constitutes the perfect algorithm for detecting contingency in sensorimotor networks. The coherence or dissonance in the sensorimotor information flow then determines the level of agency. In a head-neck-eyes robot, we replicate three developmental experiments illustrating how particular perceptual experiences can modulate the overall level of agency inside the system: (1) adding a delay between proprioceptive and visual feedback, (2) facing a mirror, and (3) facing a person. We show that the system learns to discriminate animated objects (its self-image and other persons) from other types of stimuli. This suggests a basic stage of representing the self in relation to others arising from low-level sensorimotor processes. We then discuss the relevance of our findings to neurobiological evidence and observations from developmental psychology, and their implications for developmental robots.

33 citations
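The asymmetric STDP window the paper builds on can be written in a few lines (the amplitudes and time constant below are conventional illustrative values, not the paper's):

```python
import math

def stdp_dw(dt, a_plus=0.01, a_minus=0.012, tau=20.0):
    """Weight change for one spike pair, dt = t_post - t_pre in ms.
    Pre-before-post (dt > 0) potentiates; post-before-pre depresses."""
    if dt > 0:
        return a_plus * math.exp(-dt / tau)
    return -a_minus * math.exp(dt / tau)
```

Under this rule, a reliably short motor-to-sensory delay keeps dt in the potentiation branch, so contingent, self-caused feedback strengthens its sensorimotor pathway while uncorrelated stimuli do not, which is how timing-based plasticity can act as a contingency detector.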


Journal ArticleDOI
TL;DR: A computational model of the multimodal interplay of action and language in tutoring situations is presented and first evaluation results show that acoustic packaging can provide a meaningful segmentation of action demonstration within tutoring behavior.
Abstract: In order to learn and interact with humans, robots need to understand actions and make use of language in social interactions. The use of language for the learning of actions has been emphasized by Hirsh-Pasek and Golinkoff (MIT Press, 1996), who introduced the idea of acoustic packaging. Accordingly, it has been suggested that acoustic information, typically in the form of narration, overlaps with action sequences and provides infants with a bottom-up guide to attend to relevant parts and to find structure within them. In this article, we present a computational model of the multimodal interplay of action and language in tutoring situations. For our purpose, we understand events as temporal intervals, which have to be segmented in both the visual and the acoustic modalities. Our acoustic packaging algorithm merges the segments from both modalities based on temporal overlap. First evaluation results show that acoustic packaging can provide a meaningful segmentation of action demonstrations within tutoring behavior. We discuss our findings with regard to meaningful action segmentation. Based on our vision for acoustic packaging, we outline a roadmap for its further development and the interactive scenarios in which it will be employed.

33 citations
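The core merging step, as described, pairs temporally overlapping segments across the two modalities. A minimal sketch (hypothetical function names; segments as (start, end) pairs in seconds):

```python
def overlaps(a, b):
    """Two (start, end) intervals overlap iff neither ends before the other starts."""
    return a[0] < b[1] and b[0] < a[1]

def acoustic_packages(speech_segments, action_segments):
    """Bundle each speech interval with the visually segmented
    action units it temporally overlaps."""
    packages = []
    for s in speech_segments:
        bundle = [a for a in action_segments if overlaps(s, a)]
        if bundle:
            packages.append({"speech": s, "actions": bundle})
    return packages
```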


Journal ArticleDOI
Ming Song, Yong Liu, Yuan Zhou, Kun Wang, Chunshui Yu, Tianzi Jiang
TL;DR: It is found that the strength of some functional connectivities and the global efficiency of the default network differ significantly between the superior intelligence group and the average intelligence group, indicating that the functional integration of the default network might be related to individual intelligence.
Abstract: In the last few years, many studies in cognitive and systems neuroscience have found that a consistent network of brain regions, referred to as the default network, shows high levels of activity when no explicit task is performed. Some scientists believe that this resting-state activity might reflect neural functions that consolidate the past, stabilize brain ensembles, and prepare us for the future. Here, we modeled the default network as an undirected weighted graph, and then used graph theory to investigate its topological properties in two groups of people with different intelligence levels. We found that, in both groups, the posterior cingulate cortex showed the greatest degree of all brain regions in the default network, and that the medial temporal lobes and cerebellar tonsils were topologically separated from the other brain regions. More importantly, we found that the strength of some functional connectivities and the global efficiency of the default network were significantly different between the superior intelligence group and the average intelligence group, which indicates that the functional integration of the default network might be related to individual intelligence.
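Global efficiency, the graph measure compared between the groups here, is the mean inverse shortest-path length over all node pairs. A compact sketch for a weighted adjacency matrix (our own minimal implementation, taking edge length as the reciprocal of connection weight, so stronger connections are "closer"):

```python
import numpy as np

def global_efficiency(W):
    """Global efficiency of an undirected weighted graph given its
    weighted adjacency matrix W (zeros mean no edge)."""
    n = len(W)
    D = np.full((n, n), np.inf)
    D[W > 0] = 1.0 / W[W > 0]            # edge length = 1 / weight
    np.fill_diagonal(D, 0.0)
    for k in range(n):                   # Floyd-Warshall shortest paths
        D = np.minimum(D, D[:, k:k + 1] + D[k:k + 1, :])
    off = ~np.eye(n, dtype=bool)
    return float(np.mean(1.0 / D[off]))  # 1/inf -> 0 for disconnected pairs
```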

Journal ArticleDOI
TL;DR: This work is the first neuromorphic, end-to-end model of laminar cortex that integrates temporal context to develop internal representation, and generates accurate motor actions in the challenging problem of detecting disparity in binocular natural images.
Abstract: How our brains develop disparity-tuned V1 and V2 cells and then integrate binocular disparity into 3-D perception of the visual world is still largely a mystery. Moreover, computational models that take into account the role of the 6-layer architecture of the laminar cortex and the temporal aspects of visual stimuli are elusive for stereo. In this paper, we present cortex-inspired computational models that simulate the development of stereo receptive fields and use the developed disparity-sensitive neurons to estimate binocular disparity. Not only do the results show that the use of top-down signals in the form of supervision or temporal context greatly improves the performance of the networks, but they also yield biologically compatible cortical maps: the representation of disparity selectivity is grouped and changes gradually along the cortex. To our knowledge, this work is the first neuromorphic, end-to-end model of laminar cortex that integrates temporal context to develop internal representations and generates accurate motor actions in the challenging problem of detecting disparity in binocular natural images. The networks reach subpixel average error in regression and a 0.90 success rate in classification, given limited resources.

Journal ArticleDOI
TL;DR: Computer simulation results of caregiver-infant interaction show that the sensorimotor magnets help form small clusters and that the automirroring bias shapes these clusters into clearer vowels in association with the sensorimotor magnets.
Abstract: The mechanism of infant vowel development is a fundamental issue of human cognitive development that includes perceptual and behavioral development. This paper models the mechanism of imitation underlying caregiver-infant interaction by focusing on potential roles of the caregiver's imitation in guiding infant vowel development. The proposed imitation mechanism is constructed with two kinds of possible caregiver biases in mind. The first is what we call "sensorimotor magnets," by which a caregiver perceives and imitates infant vocalizations as more prototypical mother-tongue vowels. The second is what we call the "automirroring bias," by which the heard vowel is much closer to the expected vowel because of the anticipation of being imitated. Computer simulation results of caregiver-infant interaction show that the sensorimotor magnets help form small clusters and that the automirroring bias shapes these clusters into clearer vowels in association with the sensorimotor magnets.

Journal ArticleDOI
TL;DR: A model is presented that enables a robot to acquire face representation in a neuron by utilizing the proprioception of arm posture together with self-organizing map and Hebbian learning methods.
Abstract: Both body and visuo-spatial representations are supposed to be gradually acquired during the developmental process, as described in cognitive and brain sciences. A typical example is face representation in a neuron (found in the ventral intraparietal (VIP) area), whose function is not only to code for the location of visual stimuli in the head-centered reference frame, but also to connect visual sensation with tactile sensation. This paper presents a model that enables a robot to acquire such representation. The proprioception of arm posture is utilized as reference data through "hand regard behavior," that is, the robot moving its hand in front of its face, and self-organizing map (SOM) and Hebbian learning methods are applied. Simulation results are shown, and the limitations of the current model and future issues are discussed.
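A minimal sketch of the two learning ingredients named here, a SOM update plus a Hebbian cross-modal association (illustrative simplifications of the standard methods, not the paper's exact model):

```python
import numpy as np

def som_step(weights, grid, x, t, n_iters, lr0=0.5):
    """One SOM update: the best-matching unit and its grid neighbours
    move toward input x; rate and neighbourhood radius decay over time."""
    bmu = int(np.argmin(np.linalg.norm(weights - x, axis=1)))
    frac = 1.0 - t / n_iters
    lr = lr0 * frac
    sigma = max(1.0, grid.max() * frac)          # neighbourhood radius
    d = np.linalg.norm(grid - grid[bmu], axis=1)
    h = np.exp(-d ** 2 / (2 * sigma ** 2))       # Gaussian neighbourhood
    weights += lr * h[:, None] * (x - weights)

def hebbian_associate(W, act_visual, act_proprio, eta=0.01):
    """Hebbian link between the two maps' activities (outer-product rule),
    associating what the robot sees with its felt arm posture."""
    W += eta * np.outer(act_visual, act_proprio)
```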

Journal ArticleDOI
TL;DR: This work accounts for several results: acquired distinctiveness between categories and acquired similarity within categories, a faster increase in discrimination for more acoustically dissimilar vowels, and gradual unsupervised learning of category structure in simple visual stimuli.
Abstract: During the learning of speech sounds and other perceptual categories, category labels are not provided, the number of categories is unknown, and the stimuli are encountered sequentially. These constraints pose a challenge for models, but they have recently been addressed in the online mixture estimation model of unsupervised vowel category learning (see Vallabha in the reference section). The model treats categories as Gaussian distributions, proposing both the number and the parameters of the categories. While the model has been shown to successfully learn vowel categories, it has not previously been evaluated as a model of the learning process. We account for several results: acquired distinctiveness between categories and acquired similarity within categories, a faster increase in discrimination for more acoustically dissimilar vowels, and gradual unsupervised learning of category structure in simple visual stimuli.
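A one-dimensional sketch of online mixture estimation in this spirit (a simplification for illustration; the actual model's update equations and thresholds differ): each incoming stimulus nudges category means and variances in proportion to posterior responsibility, and a stimulus that no category explains well seeds a new category.

```python
import numpy as np

def ome_step(cats, x, eta=0.01, new_thresh=1e-4):
    """One online step. Each category is a dict with mean mu,
    variance var, and mixing weight w (scalars; 1-D stimuli)."""
    if not cats:
        cats.append({"mu": x, "var": 1.0, "w": 1.0})
        return cats
    like = np.array([
        c["w"] * np.exp(-(x - c["mu"]) ** 2 / (2 * c["var"]))
              / np.sqrt(2 * np.pi * c["var"])
        for c in cats
    ])
    if like.sum() < new_thresh:               # poorly explained: new category
        cats.append({"mu": x, "var": 1.0, "w": min(c["w"] for c in cats)})
        return cats
    resp = like / like.sum()                  # posterior responsibilities
    for r, c in zip(resp, cats):
        err = x - c["mu"]
        c["mu"] += eta * r * err
        c["var"] = max(c["var"] + eta * r * (err ** 2 - c["var"]), 1e-3)
        c["w"] += eta * (r - c["w"])
    return cats
```

As categories tighten, stimuli within a category come to evoke increasingly similar responses (acquired similarity) while stimuli straddling a boundary diverge (acquired distinctiveness).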

Journal ArticleDOI
TL;DR: Principal component analysis revealed the characteristic regions in the parameter space that correspond to secure, anxious, and avoidant attachment typology.
Abstract: Attachment, the emotional tie between an infant and its primary caregiver, has been modeled as a homeostatic process by Bowlby (Attachment and Loss, 1969; Anxiety and Depression, 1973; Loss: Sadness and Depression, 1980). Evidence from neurophysiology has grounded this mechanism of infant attachment in the dynamic interplay between an opioid-based proximity-seeking mechanism and an NE-based arousal system, both regulated by external stimuli (interaction with the primary caregiver and the environment). Here, we model this attachment mechanism and its dynamic regulation by a coupled system of ordinary differential equations. We simulated the characteristic patterns of infant behavior in the strange situation procedure, a common instrument for assessing the quality of attachment outcomes ("types") in infants at about one year of age. We also manipulated the parameters of our model to account for neurochemical adaptation, and to allow caregiver style (such as responsiveness and other factors) and temperamental factors (such as reactivity and readiness in self-regulation) to be incorporated into the homeostatic regulation model of attachment dynamics. Principal component analysis revealed the characteristic regions in the parameter space that correspond to the secure, anxious, and avoidant attachment typologies. Implications of this kind of approach are discussed.
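To make the modeling approach concrete, here is a toy two-variable version of such a homeostatic loop, integrated with the Euler method. The structure (arousal driven up by separation stress and damped by comfort; proximity seeking driven by arousal) follows the paper's description, but every rate constant below is an illustrative placeholder, not a fitted parameter.

```python
def simulate_attachment(T=20.0, dt=0.01, separation=(8.0, 14.0)):
    """Euler integration of a toy arousal (a) / proximity-seeking (p)
    system across a caregiver separation-and-reunion episode."""
    a, p = 0.1, 0.0
    trace = []
    for i in range(int(T / dt)):
        t = i * dt
        separated = separation[0] <= t < separation[1]
        stress = 1.0 if separated else 0.0
        comfort = 0.0 if separated else p   # proximity pays off only together
        da = -0.5 * a + 1.2 * stress - 0.8 * comfort
        dp = -0.4 * p + 1.0 * a
        a += da * dt
        p += dp * dt
        trace.append((t, a, p))
    return trace
```

Sweeping such parameters (for example, caregiver responsiveness as the comfort gain, or temperamental reactivity as the stress gain) is what allows regions of parameter space to be mapped onto attachment types.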

Journal ArticleDOI
TL;DR: The authors compared computational models and human performance on learning to solve a high-level, planning-intensive problem, and found that humans who learned by imitation and instruction performed more complex solution steps than those trained by reinforcement.
Abstract: We compared computational models and human performance on learning to solve a high-level, planning-intensive problem. Humans and models were subjected to three learning regimes: reinforcement, imitation, and instruction. We modeled learning by reinforcement (rewards) using SARSA, a softmax selection criterion and a neural network function approximator; learning by imitation using supervised learning in a neural network; and learning by instructions using a knowledge-based neural network. We had previously found that human participants who were told if their answers were correct or not (a reinforcement group) were less accurate than participants who watched demonstrations of successful solutions of the task (an imitation group) and participants who read instructions explaining how to solve the task. Furthermore, we had found that humans who learn by imitation and instructions performed more complex solution steps than those trained by reinforcement. Our models reproduced this pattern of results.
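The reinforcement condition combines SARSA with softmax (Boltzmann) action selection. A tabular sketch of that combination (the paper used a neural-network function approximator instead of a table, and the `env` object with `reset()` and `step()` is an assumed interface):

```python
import numpy as np

def softmax_action(q_row, temp=0.5):
    """Boltzmann selection over one state's Q-values."""
    z = np.exp((q_row - q_row.max()) / temp)
    return int(np.random.choice(len(q_row), p=z / z.sum()))

def sarsa(env, n_states, n_actions, episodes=500,
          alpha=0.1, gamma=0.95, temp=0.5):
    """On-policy SARSA: update toward the action actually taken next."""
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        s = env.reset()                       # assumed: returns a state index
        a = softmax_action(Q[s], temp)
        done = False
        while not done:
            s2, r, done = env.step(a)         # assumed: (state, reward, done)
            a2 = softmax_action(Q[s2], temp)
            target = r if done else r + gamma * Q[s2, a2]
            Q[s, a] += alpha * (target - Q[s, a])
            s, a = s2, a2
    return Q
```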

Journal ArticleDOI
TL;DR: Grammar can be deconstructed to derive underlying primitive mechanisms, including serial processing, segmentation, categorization, compositionality, and forward planning, which are necessary preparatory steps toward reconstructing a working syntactic/semantic/pragmatic processor that can handle real language.
Abstract: A robot that can communicate with humans using natural language will have to acquire a grammatical framework. This paper analyses some crucial underlying mechanisms that are needed in the construction of such a framework. The work is inspired by language acquisition in infants, but it also draws on the emergence of language in evolutionary time and in ontogenic (developmental) time. It focuses on issues arising from the use of real language with all its evolutionary baggage, in contrast to an artificial communication system, and describes approaches to addressing these issues. We can deconstruct grammar to derive underlying primitive mechanisms, including serial processing, segmentation, categorization, compositionality, and forward planning. Implementing these mechanisms is a necessary preparatory step toward reconstructing a working syntactic/semantic/pragmatic processor that can handle real language. An overview is given of our own initial experiments in which a robot acquires some basic linguistic capacity by interacting with a human.

Journal ArticleDOI
TL;DR: A hierarchical neural network model is used to learn, without supervision, sensory-sensory coordinate transformations like those believed to be encoded in the dorsal pathway of the cerebral cortex, which suggests that the same mechanisms of learning and development operate across multiple cortical hierarchies.
Abstract: A hierarchical neural network model is used to learn, without supervision, sensory-sensory coordinate transformations like those believed to be encoded in the dorsal pathway of the cerebral cortex. The resulting representations of visual space are invariant to eye orientation, neck orientation, or posture in general. These posture-invariant spatial representations are learned using the same mechanisms that have previously been proposed to operate in the cortical ventral pathway to learn object representations that are invariant to translation, scale, orientation, or viewpoint in general. This model thus suggests that the same mechanisms of learning and development operate across multiple cortical hierarchies.

Journal ArticleDOI
TL;DR: In recognition of the gains made in this field and to support its further development, the IEEE has approved this new IEEE TRANSACTIONS on Autonomous Mental Development (TAMD).
Abstract: Although some baby animals can get up and walk within hours after birth, what a human child learns during the first two years of life easily exceeds what those animals learn in their entire lifetime. Furthermore, besides the explosive growth that occurs during this period, it is now well documented that a human brain continues its life-long development and learning [1]. The human brain is one of the most complex systems we know of in the world, composed of about 100 billion strongly interconnected neurons. A single neuron may have more than 10,000 connections to other neurons. For thousands of years, the mind has been the center of myths, and human beings have endeavored to understand their own brain and the mind arising from it.