
Showing papers presented at "Simulation of Adaptive Behavior in 1993"



Journal ArticleDOI
01 Apr 1993
TL;DR: A class of strategies designed to enhance the learning and planning power of Dyna systems by increasing their computational efficiency is examined.
Abstract: The Dyna class of reinforcement learning architectures enables the creation of integrated learning, planning and reacting systems. A class of strategies designed to enhance the learning and planning power of Dyna systems by increasing their computational efficiency is examined. The benefit of using these strategies is demonstrated on some simple abstract learning tasks. It is proposed that the backups to be performed in Dyna be prioritized in order to improve its efficiency. It is demonstrated with simple tasks that the use of some specific prioritizing schemes can lead to significant reductions in computational effort and corresponding improvements in learning performance.

241 citations
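
The prioritization idea in the abstract above can be illustrated with a small sketch. It is not the authors' algorithm: it is a generic priority-queue scheme in which the planning backup expected to change a value estimate the most is performed first. The toy chain world, the deterministic model dictionary, the constants `GAMMA` and `THETA`, and all function names are assumptions made for illustration.

```python
import heapq
from collections import defaultdict

GAMMA = 0.9    # discount factor (assumed value)
THETA = 1e-6   # skip backups whose predicted change is below this (assumed)
ACTIONS = ("left", "right")

def chain_model(n_states):
    """Hypothetical deterministic world: states 0..n-1, the last one terminal."""
    goal = n_states - 1
    model = {}
    for s in range(goal):
        model[(s, "right")] = (1.0 if s + 1 == goal else 0.0, s + 1)
        model[(s, "left")] = (0.0, max(s - 1, 0))
    return model

def state_value(q, model, s):
    """Greedy value of s; a state with no modelled actions is terminal (0)."""
    return max((q[(s, a)] for a in ACTIONS if (s, a) in model), default=0.0)

def backup_error(q, model, s, a):
    """Change that a one-step backup of (s, a) would make to q."""
    reward, nxt = model[(s, a)]
    return reward + GAMMA * state_value(q, model, nxt) - q[(s, a)]

def prioritized_planning(model, q, seed, max_backups):
    """Run up to max_backups planning backups, largest predicted change first."""
    predecessors = defaultdict(set)
    for (s, a), (_, nxt) in model.items():
        predecessors[nxt].add((s, a))

    queue = []  # heapq is a min-heap, so priorities are stored negated

    def push(s, a):
        priority = abs(backup_error(q, model, s, a))
        if priority > THETA:
            heapq.heappush(queue, (-priority, s, a))

    push(*seed)
    done = 0
    while queue and done < max_backups:
        _, s, a = heapq.heappop(queue)
        error = backup_error(q, model, s, a)  # recompute: the entry may be stale
        if abs(error) <= THETA:
            continue
        q[(s, a)] += error  # full backup, valid because the model is deterministic
        done += 1
        for ps, pa in predecessors[s]:  # values leading into s may now be out of date
            push(ps, pa)
    return done

q = defaultdict(float)
backups = prioritized_planning(chain_model(5), q, seed=("right" and (3, "right")), max_backups=100)
print(backups, q[(0, "right")])
```

Seeding the queue with the transition that first reaches the reward lets value information spread backwards through predecessors, so no backup is spent on a state-action pair whose estimate cannot yet change.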


Proceedings Article
09 Aug 1993
TL;DR: It is proposed that, sooner rather than later, visual processing will be required in order for robots to engage in non-trivial navigation behaviours, and time constraints suggest that initial architecture evaluations should be largely done in simulation.
Abstract: In this paper we propose and justify a methodology for the development of the control systems, or 'cognitive architectures', of autonomous mobile robots. We argue that the design by hand of such control systems becomes prohibitively difficult as complexity increases. We discuss an alternative approach, involving artificial evolution, where the basic building blocks for cognitive architectures are adaptive noise-tolerant dynamical neural networks, rather than programs. These networks may be recurrent, and should operate in real time. Evolution should be incremental, using an extended and modified version of genetic algorithms. We finally propose that, sooner rather than later, visual processing will be required in order for robots to engage in non-trivial navigation behaviours. Time constraints suggest that initial architecture evaluations should be largely done in simulation. The pitfalls of simulations compared with reality are discussed, together with the importance of incorporating noise. To support our claims and proposals, we present results from some preliminary experiments where robots which roam office-like environments are evolved.

198 citations


Proceedings Article
09 Aug 1993
TL;DR: Three types of "Tom Thumb robots", whose behavior is based on the foraging behaviors of ants, are proposed and their results are critically examined, showing that only a few changes in the robots' behavior may greatly improve the efficiency of the population.
Abstract: In this paper, we evaluate, in terms of efficiency, different implementations of the "explorer robots application". Three types of "Tom Thumb robots", whose behavior is based on the foraging behaviors of ants, are proposed and their results are critically examined. We then introduce chain-making robots (the "dockers"), governed by local perceptions and interactions. This helps us to show that only a few changes in the robots' behavior may greatly improve the efficiency of the population.

179 citations


Proceedings Article
09 Aug 1993
TL;DR: Coordinated motion in a group of simulated critters can evolve under selection pressure from an appropriate fitness criterion under the Genetic Programming paradigm.
Abstract: Coordinated motion in a group of simulated critters can evolve under selection pressure from an appropriate fitness criterion. Evolution is modeled with the Genetic Programming paradigm. The simulated environment consists of a group of critters, some static obstacles, and a predator. In order to survive, the critters must avoid collisions (with obstacles as well as with each other) and must avoid predation. They must steer a safe path through the dynamic environment using only information received through their visual sensors. The arrangement of visual sensors, as well as the mapping from sensor data to motor action, is determined by the evolved controller program. The motor model assumes an innate constant forward velocity and limited steering. The predator preferentially targets isolated “stragglers” and so encourages aggregation. Fitness is based on the sum of all critter lifetimes.

167 citations



Proceedings Article
09 Aug 1993
TL;DR: Mobile robots engaged in a cooperative task that requires communication are described; initially given a fixed but uninterpreted vocabulary, the robots learn a private communication language while attempting to perform their task.
Abstract: We describe mobile robots engaged in a cooperative task that requires communication. The robots are initially given a fixed but uninterpreted vocabulary for communication. In attempting to perform their task, the robots learn a private communication language. Different meanings for vocabulary elements are learned in different runs of the experiment. As circumstances change, the robots adapt their language to allow continued success at their task.

130 citations


Proceedings Article
09 Aug 1993

126 citations


Proceedings Article
09 Aug 1993

112 citations


Journal ArticleDOI
09 Aug 1993
TL;DR: It is argued that the problem of action selection is, by nature, intrinsically hierarchical, and so Rosenblatt and Payton-like hierarchies (familiarity, ease of access, and combination of evidence) are supported.
Abstract: Several researchers of animal behavior, such as Tinbergen and Baerends, have proposed hierarchical mechanisms for action selection. Maes, among others, has argued against mechanisms of this type be...

111 citations


Proceedings Article
09 Aug 1993
TL;DR: Results are presented which demonstrate that neural-network control architectures can be evolved for an accurate simulation model of a visually guided robot, and it is demonstrated that robust visually-guided control systems evolve from evaluation functions which do not explicitly involve monitoring visual input.
Abstract: We have developed a methodology grounded in two beliefs: that autonomous agents need visual processing capabilities, and that the approach of hand-designing control architectures for autonomous agents is likely to be superseded by methods involving the artificial evolution of comparable architectures. In this paper we present results which demonstrate that neural-network control architectures can be evolved for an accurate simulation model of a visually guided robot. The simulation system involves detailed models of the physics of a real robot built at Sussex; and the simulated vision involves ray-tracing computer graphics, using models of optical systems which could readily be constructed from discrete components. The control-network architecture is entirely under genetic control, as are parameters governing the optical system. Significantly, we demonstrate that robust visually-guided control systems evolve from evaluation functions which do not explicitly involve monitoring visual input. The latter part of the paper discusses work now under development, which allows us to engage in long-term fundamental experiments aimed at thoroughly exploring the possibilities of concurrently evolving control networks and visual sensors for navigational tasks. This involves the construction of specialised visual-robotic equipment which eliminates the need for simulated sensing.


Proceedings Article
09 Aug 1993
TL;DR: An interesting evolutionary pathway is seen to this herding, from aggregation, to staying nearby other animals for mating opportunities, to using herding for safety and food finding.
Abstract: We have created a simulated world ("BioLand") designed to support experiments on the evolution of cooperation, competition, and communication. In this particular experiment we have simulated the evolution of herding behavior in prey animals. We placed a population of simulated prey animals into an environment with a population of their predators. The behavior of each of the animals is controlled by a neural network architecture specified by its individual genome. We have allowed these populations to evolve through interaction over time and have observed the evolution of neural networks that produce herding behavior. The prey animals evolve to congregate in herds, for the protection it provides from predators, as well as the help it provides in finding food and mates. An interesting evolutionary pathway is seen to this herding, from aggregation, to staying nearby other animals for mating opportunities, to using herding for safety and food finding.




Proceedings Article
09 Aug 1993
TL;DR: It is shown by means of a simulation study that the mappings between sensing and acting the system acquires through its interaction with the environment are topology preserving and on the basis of these results it is shown that these mappings implement action related prototypes.
Abstract: Since the world is partly an unpredictable place the agents that have to function in it have to rely on learning to adjust to it. To understand the adaptive properties of autonomous agents, that are related to their learning capacities, it is necessary to explore what they exactly learn. In order to do this we will further analyse an autonomous agent designed according to the methodology of distributed adaptive control. It is shown by means of a simulation study that the mappings between sensing and acting the system acquires through its interaction with the environment are topology preserving. Moreover, on the basis of these results it is shown that these mappings implement action related prototypes. This is demonstrated by translating the mapping back into world coordinates. Based on these results an extension of the model is proposed that illustrates another aspect of our methodology; stretching a model. To overcome some limitations of Hebbian learning, which is crucial for the self-organizing properties of the model, an expectancy mechanism is included in the control architecture. This allows the development of mechanisms that influence the categorization process independently of immediate sensory states. This can be seen as a necessary next step to come to a closer definition of the concept of representation in the context of autonomous agents.

Proceedings Article
09 Aug 1993

Proceedings Article
09 Aug 1993
TL;DR: A framework for exploring the evolution of adaptive behaviors in response to different physical environment structures is described; the system is simple and well-defined enough to allow complete specification of the range of possible action types and their effects on the energy levels of the creature and the environment.
Abstract: We describe a framework for exploring the evolution of adaptive behaviors in response to different physical environment structures. We focus here on the evolving behavior-generating mechanisms of individual creatures, and briefly mention some approaches to characterizing different environments in which various behaviors may prove adaptive. The environments are described initially as simple two-dimensional grids containing food arranged in some layout. The creatures in these worlds can have evolved sensors, internal states, and actions and action-triggering conditions. By allowing all three of these components to evolve, rather than prespecifying any of them, we can explore a wide range of behavior types, including “blind” and memoryless behaviors. Our system is simple and well-defined enough to allow complete specification of the range of possible action types (including moving, eating, and reproducing) and their effects on the energy levels of the creature and the environment (the bioenergetics of the world). Useful and meaningful ways of characterizing the structures of environments in which different behaviors will emerge remain to be developed.


Proceedings Article
09 Aug 1993
TL;DR: This work considers the problem of reaching a given goal state from a given start state by letting an 'animat' produce a sequence of actions in an environment with multiple obstacles, using gradient-based algorithms for learning without a teacher.
Abstract: We consider the problem of reaching a given goal state from a given start state by letting an 'animat' produce a sequence of actions in an environment with multiple obstacles. Simple trajectory planning tasks are solved with the help of 'neural' gradient-based algorithms for learning without a teacher to generate sequences of appropriate subgoals in response to novel start/goal combinations.

Proceedings Article
09 Aug 1993


Proceedings Article
09 Aug 1993
TL;DR: The paper discusses the special cases when either h = 0 or β = 1 in detail, describes some theoretical bounds on h and β, and re-explores a well-known reinforcement learning environment with this new notation.
Abstract: This paper proposes a categorization of reinforcement learning environments based on the optimization of a reinforcement signal over time. Environments are classified by the simplest agent that can possibly achieve optimal reinforcement. Two parameters, h and β, abstractly characterize the complexity of an agent: the ideal (h,β)-agent uses the input information provided by the environment and at most h bits of local storage to choose an action that maximizes the discounted sum of the next β reinforcements. In an (h,β)-environment, an ideal (h,β)-agent achieves the maximum possible expected reinforcement for that environment. The paper discusses the special cases when either h = 0 or β = 1 in detail, describes some theoretical bounds on h and β, and re-explores a well-known reinforcement learning environment with this new notation.
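
The objective named in the abstract above, a discounted sum over a bounded number of upcoming reinforcements, can be sketched as follows. The function name and the parameter names `discount` and `horizon` are assumptions for illustration; the abstract does not give a formula or a discount value.

```python
def discounted_sum(reinforcements, discount, horizon):
    """Discounted sum of the first `horizon` reinforcements in the sequence.

    The reinforcement received t steps ahead is weighted by discount**t,
    and anything beyond the horizon is ignored.
    """
    return sum(discount ** t * r
               for t, r in enumerate(reinforcements[:horizon]))

# With a horizon of 1 only the immediate reinforcement counts.
print(discounted_sum([1.0, 1.0, 1.0, 1.0], discount=0.5, horizon=1))
print(discounted_sum([1.0, 1.0, 1.0, 1.0], discount=0.5, horizon=3))
```

An agent limited to a short horizon ranks actions only by near-term reinforcement, which is why the horizon serves as one measure of how demanding an environment is.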

Proceedings Article
09 Aug 1993
TL;DR: This paper shows that directional mate preferences can cause populations to wander capriciously through phenotype space, under a strange form of runaway sexual selection, with or without the influence of natural selection pressures, and presents a framework for simulating a wide range of directional and non-directional mate preferences.
Abstract: In the pantheon of evolutionary forces, the optimizing Apollonian powers of natural selection are generally assumed to dominate the dark Dionysian dynamics of sexual selection. But this need not be the case, particularly with a class of selective mating mechanisms called ‘directional mate preferences’ (Kirkpatrick, 1987). In previous simulation research, we showed that nondirectional assortative mating preferences could cause populations to spontaneously split apart into separate species (Todd & Miller, 1991). In this paper, we show that directional mate preferences can cause populations to wander capriciously through phenotype space, under a strange form of runaway sexual selection, with or without the influence of natural selection pressures. When directional mate preferences are free to evolve, they do not always evolve to point in the direction of natural-selective peaks. Sexual selection can thus take on a life of its own, such that mate preferences within a species become a distinct and important part of the environment to which the species’ phenotypes adapt. These results suggest a broader conception of ‘adaptive behavior’, in which attracting potential mates becomes as important as finding food and avoiding predators. We present a framework for simulating a wide range of directional and non-directional mate preferences, and discuss some practical and scientific applications of simulating sexual selection.



Proceedings Article
09 Aug 1993
TL;DR: A model of the dolphin cochlea is developed and used to produce the representations used by a neural network to model the delayed matching-to-sample performance of a bottlenosed dolphin.
Abstract: The effectiveness of artificial neural network models depends strongly on the way in which the information to be learned is presented to the network. Use of biologically relevant mechanisms is likely to yield effective syntactic systems as well as understanding the performance of biological systems. We developed a model of the dolphin cochlea and used this model to produce the representations used by a neural network to model the delayed matching-to-sample performance of a bottlenosed dolphin. The model yielded psychophysical functions and matching choice accuracy similar to those obtained from the dolphin. Keywords: Artificial neural networks (ANN), Echolocation, Gateway-integrator neural network (GIN)


Proceedings Article
09 Aug 1993
TL;DR: The polymerization of vinyl esters and their homologues and esters of acrylic acid, is effected with these substances in emulsion with soap-like emulsifying agents with suitable dispersion media and emulsification agents.
Abstract: Ester condensation products.--The polymerization of vinyl esters and their homologues and esters of acrylic acid and their homologues and derivatives, is effected with these substances in emulsion with soap-like emulsifying agents. The derivatives of acrylic acid esters are defined as having a double bond in the a -b position to the esterified carboxylic group. Suitable catalysts may be added, and this process may be controlled when the reaction is exothermic, by adding further cold dispersion medium or by reduction of the pressure. Preferably closed vessels are employed. The polymerization may be assisted by ultra-violet light. The dispersion medium may be water or aqueous solutions, and chlorinated hydrocarbons may be added. As emulsifying agents may be used naphthene sulphonic acids, or their soaps, sulphonated castor oil or its soaps or other known soap-like emulsifying agents. Organic or inorganic peroxides, gaseous or vaporous substances such as air, oxygen, nitrogen oxides, and organic acid anhydrides, acetals, aldehydes, and the like may be used as catalysts. In the examples (1) acrylic acid methyl ester emulsified with aqueous sulphonated castor oil or its soaps is polymerized with a peroxide in a closed vessel, the heat being absorbed by addition of cold water or by evaporation; (2) a mixture of acrylic acid methyl ester and vinyl acetate is emulsified as in the first example and polymerized by air or oxygen; (3) a similar emulsion of vinyl acetate is polymerized with a peroxide. The Specification as open to inspection under Sect. 91 (3) (a) includes the polymerization of other organic compounds, especially unsaturated compounds with any suitable dispersion media and emulsifying agents. 
The dispersion media mentioned are saturated hydrocarbons, chlorinated hydrocarbons, benzol, "saturated heavy benzol," petroleum fractions "or alcohol and hydrocarbons or chlorinated hydrocarbons which dissolve the esters may be employed as dispersion accelerators." Emulsifying agents mentioned are known substances for decomposing fats, glycol ethers, amino compounds of the polyglycols, and aliphatic amines. Glacial acetic acid is mentioned as a catalyst. This subject-matter does not appear in the Specification as accepted.