Showing papers on "Active learning (machine learning) published in 1992"

PDF

Open Access

Optimization, Learning and Natural Algorithms

[...]

Marco Dorigo

01 Jan 1992

3,402 citations

Journal Article•DOI•

Self-Improving Reactive Agents Based on Reinforcement Learning, Planning and Teaching

[...]

Long-Ji Lin¹•Institutions (1)

Carnegie Mellon University¹

01 May 1992-Machine Learning

TL;DR: This paper compares eight reinforcement learning frameworks: Adaptive heuristic critic (AHC) learning due to Sutton, Q-learning due to Watkins, and three extensions to both basic methods for speeding up learning and two extensions are experience replay, learning action models for planning, and teaching.

...read moreread less

Abstract: To date, reinforcement learning has mostly been studied solving simple learning tasks. Reinforcement learning methods that have been studied so far typically converge slowly. The purpose of this work is thus two-fold: 1) to investigate the utility of reinforcement learning in solving much more complicated learning tasks than previously studied, and 2) to investigate methods that will speed up reinforcement learning. This paper compares eight reinforcement learning frameworks: adaptive heuristic critic (AHC) learning due to Sutton, Q-learning due to Watkins, and three extensions to both basic methods for speeding up learning. The three extensions are experience replay, learning action models for planning, and teaching. The frameworks were investigated using connectionism as an approach to generalization. To evaluate the performance of different frameworks, a dynamic environment was used as a testbed. The environment is moderately complex and nondeterministic. This paper describes these frameworks and algorithms in detail and presents empirical evaluation of the frameworks.

...read moreread less

1,691 citations

Reinforcement learning for robots using neural networks

[...]

Long-Ji Lin

01 Jan 1992

TL;DR: This dissertation concludes that it is possible to build artificial agents than can acquire complex control policies effectively by reinforcement learning and enable its applications to complex robot-learning problems.

...read moreread less

Abstract: Reinforcement learning agents are adaptive, reactive, and self-supervised. The aim of this dissertation is to extend the state of the art of reinforcement learning and enable its applications to complex robot-learning problems. In particular, it focuses on two issues. First, learning from sparse and delayed reinforcement signals is hard and in general a slow process. Techniques for reducing learning time must be devised. Second, most existing reinforcement learning methods assume that the world is a Markov decision process. This assumption is too strong for many robot tasks of interest. This dissertation demonstrates how we can possibly overcome the slow learning problem and tackle non-Markovian environments, making reinforcement learning more practical for realistic robot tasks: (1) Reinforcement learning can be naturally integrated with artificial neural networks to obtain high-quality generalization, resulting in a significant learning speedup. Neural networks are used in this dissertation, and they generalize effectively even in the presence of noise and a large of binary and real-valued inputs. (2) Reinforcement learning agents can save many learning trials by using an action model, which can be learned on-line. With a model, an agent can mentally experience the effects of its actions without actually executing them. Experience replay is a simple technique that implements this idea, and is shown to be effective in reducing the number of action executions required. (3) Reinforcement learning agents can take advantage of instructive training instances provided by human teachers, resulting in a significant learning speedup. Teaching can also help learning agents avoid local optima during the search for optimal control. Simulation experiments indicate that even a small amount of teaching can save agents many learning trials. (4) Reinforcement learning agents can significantly reduce learning time by hierarchical learning--they first solve elementary learning problems and then combine solutions to the elementary problems to solve a complex problem. Simulation experiments indicate that a robot with hierarchical learning can solve a complex problem, which otherwise is hardly solvable within a reasonable time. (5) Reinforcement learning agents can deal with a wide range of non-Markovian environments by having a memory of their past. Three memory architectures are discussed. They work reasonably well for a variety of simple problems. One of them is also successfully applied to a nontrivial non-Markovian robot task. The results of this dissertation rely on computer simulation, including (1) an agent operating in a dynamic and hostile environment and (2) a mobile robot operating in a noisy and non-Markovian environment. The robot simulator is physically realistic. This dissertation concludes that it is possible to build artificial agents than can acquire complex control policies effectively by reinforcement learning.

...read moreread less

911 citations

Book•

Iterative Learning Control for Deterministic Systems

[...]

Kevin L. Moore, Mary Ann Johnson, Michael J. Grimble

31 Dec 1992

TL;DR: The material presented in this book addresses the analysis and design of learning control systems using a system-theoretic approach, and the application of artificial neural networks to the learning control problem.

...read moreread less

Abstract: The material presented in this book addresses the analysis and design of learning control systems. It begins with an introduction to the concept of learning control, including a comprehensive literature review. The text follows with a complete and unifying analysis of the learning control problem for linear LTI systems using a system-theoretic approach which offers insight into the nature of the solution of the learning control problem. Additionally, several design methods are given for LTI learning control, incorporating a technique based on parameter estimation and a one-step learning control algorithm for finite-horizon problems. Further chapters focus unpon learning control for deterministic nonlinear systems, and a time-varying learning controller is presented which can be applied to a class of nonlinear systems, including the models of typical robotic manipulators. The book concludes with the application of artificial neural networks to the learning control problem. Three specific ways to use neural nets for this purpose are discussed, including two methods which use backpropagation training and reinforcement learning.

...read moreread less

771 citations

Journal Article•DOI•

Automatic programming of behavior-based robots using reinforcement learning

[...]

Sridhar Mahadevan¹, Jonathan H. Connell¹•Institutions (1)

IBM¹

01 Jun 1992-Artificial Intelligence

TL;DR: In this article, two algorithms for behavior learning are described that combine Q learning, a well-known scheme for propagating reinforcement values temporally across actions, with statistical clustering and Hamming distance.

...read moreread less

632 citations

Efficient Exploration In Reinforcement Learning

[...]

Sebastian Thrun

01 Jan 1992

TL;DR: It is proved that for all finite deterministic domains, reinforcement learning using a directed technique can always be performed in polynomial time, demonstrating the important role of exploration in reinforcement learning.

...read moreread less

Abstract: Exploration plays a fundamental role in any active learning system. This study evaluates the role of exploration in active learning and describes several local techniques for exploration in finite, discrete domains, embedded in a reinforcement learning framework (delayed reinforcement). This paper distinguishes between two families of exploration schemes: undirected and directed exploration. While the former family is closely related to random walk exploration, directed exploration techniques memorize exploration-specific knowledge which is used for guiding the exploration search. In many finite deterministic domains, any learning technique based on undirected exploration is inefficient in terms of learning time, i.e., learning time is expected to scale exponentially with the size of the state space. We prove that for all these domains, reinforcement learning using a directed technique can always be performed in polynomial time, demonstrating the important role of exploration in reinforcement learning. (The proof is given for one specific directed exploration technique named counter-based exploration.) Subsequently, several exploration techniques found in recent reinforcement learning and connectionist adaptive control literature are described. In order to trade off efficiently between exploration and exploitation --- a trade-off which characterizes many real-world active learning tasks --- combination methods are described which explore and avoid costs simultaneously. This includes a selective attention mechanism, which allows smooth switching between exploration and exploitation. All techniques are evaluated and compared on a discrete reinforcement learning task (robot navigation). The empirical evaluation is followed by an extensive discussion of benefits and limitations of this work.

...read moreread less

311 citations

Proceedings Article•

Adapting bias by gradient descent: an incremental version of delta-bar-delta

[...]

Richard S. Sutton

12 Jul 1992

TL;DR: A new algorithm, the Incremental Delta-Bar-Delta (IDBD) algorithm, for the learning of appropriate biases based on previous learning experience, and a novel interpretation of the IDBD algorithm as an incremental form of hold-one-out cross validation.

...read moreread less

Abstract: Appropriate bias is widely viewed as the key to efficient learning and generalization. I present a new algorithm, the Incremental Delta-Bar-Delta (IDBD) algorithm, for the learning of appropriate biases based on previous learning experience. The IDBD algorithm is developed for the case of a simple, linear learning system--the LMS or delta rule with a separate learning-rate parameter for each input. The IDBD algorithm adjusts the learning-rate parameters, which are an important form of bias for this system. Because bias in this approach is adapted based on previous learning experience, the appropriate test beds are drifting or non-stationary learning tasks. For particular tasks of this type, I show that the IDBD algorithm performs better than ordinary LMS and in fact finds the optimal learning rates. The IDBD algorithm extends and improves over prior work by Jacobs and by me in that it is fully incremental and has only a single free parameter. This paper also extends previous work by presenting a derivation of the IDBD algorithm as gradient descent in the space of learning-rate parameters. Finally, I offer a novel interpretation of the IDBD algorithm as an incremental form of hold-one-out cross validation.

...read moreread less

236 citations

Journal Article•DOI•

Iterative learning control: A survey and new results

[...]

Kevin L. Moore¹, Mohammed Dahleh², Shankar P. Bhattacharyya³•Institutions (3)

Idaho State University¹, University of California, Santa Barbara², Texas A&M University³

01 Jul 1992-Journal of Robotic Systems

TL;DR: This analysis offers insight into the nature of the solution of the learning control problem by deriving sufficient convergence conditions; an approach to learning control for linear systems based on parameter estimation; and an analysis that shows that for finite-horizon problems it is possible to design a learning control algorithm that converges, with memory, in one step.

...read moreread less

Abstract: Learning control is an iterative approach to the problem of improving transient behavior for processes that are repetitive in nature. Some results on iterative learning control are presented. A complete review of the literature is given first. Then, a general formulation of the problem is given. Next, a complete analysis of the learning control problem for the case of linear, time-invariant plants and controllers is presented. This analysis offers: insight into the nature of the solution of the learning control problem by deriving sufficient convergence conditions; an approach to learning control for linear systems based on parameter estimation; and an analysis that shows that for finite-horizon problems it is possible to design a learning control algorithm that converges, with memory, in one step. Finally, a time-varying learning controller is given for controlling the trajectory of a nonlinear robot manipulator. A brief simulation example is presented to illustrate the effectiveness of this scheme. 56 refs.

...read moreread less

217 citations

Journal Article•DOI•

Comparison of four neural net learning methods for dynamic system identification

[...]

S.-Z. Qin¹, H.-T. Su¹, Thomas J. McAvoy¹•Institutions (1)

University of Maryland, College Park¹

01 Jan 1992-IEEE Transactions on Neural Networks

TL;DR: It is shown that the feedforward network (FFN) pattern learning rule is a first-order approximation of the FFN-batch learning rule, and is valid for nonlinear activation networks provided the learning rate is small.

...read moreread less

Abstract: Four types of neural net learning rules are discussed for dynamic system identification. It is shown that the feedforward network (FFN) pattern learning rule is a first-order approximation of the FFN-batch learning rule. As a result, pattern learning is valid for nonlinear activation networks provided the learning rate is small. For recurrent types of networks (RecNs), RecN-pattern learning is different from RecN-batch learning. However, the difference can be controlled by using small learning rates. While RecN-batch learning is strict in a mathematical sense, RecN-pattern learning is simple to implement and can be implemented in a real-time manner. Simulation results agree very well with the theorems derived. It is shown by simulation that for system identification problems, recurrent networks are less sensitive to noise. >

...read moreread less

202 citations

Journal Article•DOI•

An iterative learning control theory for a class of nonlinear dynamic systems

[...]

Tae-Yong Kuc, Jin S. Lee, Kwanghee Nam

01 Nov 1992-Automatica

TL;DR: An iterative learning control scheme is presented for a class of nonlinear dynamic systems which includes holonomic systems as its subset and neither uses derivative terms of feedback errors nor assumes external input perturbations as a prerequisite.

...read moreread less

189 citations

Proceedings Article•

A Fast Stochastic Error-Descent Algorithm for Supervised Learning and Optimization

[...]

Gert Cauwenberghs¹•Institutions (1)

California Institute of Technology¹

30 Nov 1992

TL;DR: A parallel stochastic algorithm is investigated for error-descent learning and optimization in deterministic networks of arbitrary topology based on the model-free distributed learning mechanism of Dembo and Kailath and supported by a modified parameter update rule.

...read moreread less

Abstract: A parallel stochastic algorithm is investigated for error-descent learning and optimization in deterministic networks of arbitrary topology. No explicit information about internal network structure is needed. The method is based on the model-free distributed learning mechanism of Dembo and Kailath. A modified parameter update rule is proposed by which each individual parameter vector perturbation contributes a decrease in error. A substantially faster learning speed is hence allowed. Furthermore, the modified algorithm supports learning time-varying features in dynamical networks. We analyze the convergence and scaling properties of the algorithm, and present simulation results for dynamic trajectory learning in recurrent networks.

...read moreread less

Book Chapter•DOI•

A Teaching Method for Reinforcement Learning

[...]

Jeffrey A. Clouse¹, Paul E. Utgoff¹•Institutions (1)

University of Massachusetts Amherst¹

07 Jul 1992

TL;DR: A method that allows a human expert to interact in real-time with a reinforcement learning algorithm is shown to accelerate the learning process.

...read moreread less

Abstract: This paper presents a method for accelerating the learning rates of reinforcement learning algorithms. Reinforcement learning algorithms are known for their slow learning rates, and researchers have focused recently on increasing those rates. In this paper, a method that allows a human expert to interact in real-time with a reinforcement learning algorithm is shown to accelerate the learning process. Two experiments, each with a different domain and a different reinforcement learning algorithm, illustrate that the unobtrusive method accelerates learning by more than an order of magnitude

...read moreread less

Journal Article•DOI•

Learning Two-Tiered Descriptions of Flexible Concepts: The POSEIDON System

[...]

Francesco Bergadano¹, Stan Matwin¹, Ryszard S. Michalski¹, J. Zhang¹•Institutions (1)

Artificial Intelligence Center¹

03 Jan 1992-Machine Learning

TL;DR: The proposed method has been implemented in the POSEIDON system, and experimentally tested on two real-world problems: learning the concept of an acceptable union contract, and learning voting patterns of Republicans and Democrats in the U.S. Congress.

...read moreread less

Abstract: This paper describes a method for learning flexible concepts, by which are meant concepts that lack precise definition and are context-dependent. To describe such concepts, the method employs a two-tiered representation, in which the first tier captures explicitly basic concept properties, and the second tier characterizes allowable concept's modifications and context dependency. In the proposed method, the first tier, called Base Concept Representation (BCR), is created in two phases. In phase 1, the AQ-15 rule learning program is applied to induce a complete and consistent concept description from supplied examples. In phase 2, this description is optimized according to a domain-dependent quality criterion. The second tier, called the inferential concept interpretation (ICI), consists of a procedure for flexible matching, and a set of inference rules. The proposed method has been implemented in the POSEIDON system, and experimentally tested on two real-world problems: learning the concept of an acceptable union contract, and learning voting patterns of Republicans and Democrats in the U.S. Congress. For comparison, a few other learning methods were also applied to the same problems. These methods included simple variants of exemplar-based learning, and an ID-3-type decision tree learning, implemented in the ASSISTANT program. In the experiments, POSEIDON generated concept descriptions that were both, more accurate and also substantially simpler than those produced by the other methods.

...read moreread less

Book Chapter•DOI•

An Approach to Anytime Learning

[...]

John J. Grefenstette¹, Connie Loggia Ramsey¹•Institutions (1)

United States Naval Research Laboratory¹

01 Jul 1992

TL;DR: Anytime learning is a general approach to continuous learning in a changing environment that continuously tests new strategies against a simulation model of the task environment, and dynamically updates the knowledge base used by the agent on the basis of the results.

...read moreread less

Abstract: Anytime learning is a general approach to continuous learning in a changing environment. The agent's learning module continuously tests new strategies against a simulation model of the task environment, and dynamically updates the knowledge base used by the agent on the basis of the results. The execution module controls the agent's interaction with the environment, and includes a monitor that can dynamically modify the simulation model based on its observations of the environment. When the simulation model is modified, the learning process is restarted on the modified model. The learning system is assumed to operate indefinitely, and the execution system uses the results of learning as they become available. An experimental study tests one of the key aspects of this design using a two-agent cat-and-mouse game as the task environment.

...read moreread less

Book•

The design and analysis of efficient learning algorithms

[...]

Robert E. Schapire¹•Institutions (1)

Massachusetts Institute of Technology¹

29 Sep 1992

TL;DR: It is shown that any "weak" learning algorithm that performs just slightly better than random guessing can be converted into one whose error can be made arbitrarily small, and a technique for converting any PAC-learning algorithm into one that is highly space efficient is explored.

...read moreread less

Abstract: Many of the results in this thesis are concerned with the so-called distribution-free or probably approximately correct (PAC) model of learning proposed by Valiant. In this model, the learner tries to identify an unknown concept based on randomly chosen examples of the concept. Examples are chosen according to an unchanging but unknown and arbitrary distribution on the space of instances. Following a brief introduction, this thesis begins in Chapter 2 with a study of the problem of improving the accuracy of a hypothesis output by a learning algorithm in this model. In particular, it is shown that any "weak" learning algorithm that performs just slightly better than random guessing can be converted into one whose error can be made arbitrarily small. Among the many consequences of this result is a technique for converting any PAC-learning algorithm into one that is highly space efficient. In Chapter 3, we next explore in detail a simple but seemingly powerful technique for discovering the structure of an unknown read-once formula from random examples. The method is based on sampling of the target formula's statistical behavior under various perturbations of the underlying instance-space distribution. An especially nice feature of this technique is its powerful resistance to noise. We next consider in Chapter 4 a realistic extension of the PAC model to concepts that may exhibit uncertain or probabilistic behavior. While building on the recent results of Haussler on the sample complexity of learning in probabilistic settings, this chapter focuses primarily on the design of efficient algorithms for learning probabilistic concepts. This work also extends many of the results in the standard PAC model to the new probabilistic model. In the last chapter, we present new algorithms for inferring an unknown finite-state automaton from its input-output behavior. This problem is motivated by the problem faced by a robot in unfamiliar surroundings who must, through experimentation, discover the "structure" of its environment. Some of our algorithms are based on Angluin's algorithm for learning finite-state automata. We also present superior algorithms for the special class of permutation automata. (Copies available exclusively from MIT Libraries, Rm. 14-0551, Cambridge, MA 02139-4307. Ph. 617-253-5668; Fax 617-253-1690.) (Abstract shortened with permission of school.)

...read moreread less

Proceedings Article•DOI•

Learning reactive admittance control

[...]

Vijaykumar Gullapalli, Roderic A. Grupen, Andrew G. Barto

12 May 1992

TL;DR: The results indicated that direct reinforcement learning can be used to learn a reactive control strategy that works well even in the presence of a high degree of noise and uncertainty.

...read moreread less

Abstract: A peg-in-hole insertion task is used as an example to illustrate the utility of direct associative reinforcement learning methods for learning control under real-world conditions of uncertainty and noise. An associative reinforcement learning system has to learn appropriate actions in various situations through a search guided by evaluative performance feedback The authors used such a learning system, implemented as a connectionist network, to learn active compliant control for peg-in-hole insertion. The results indicated that direct reinforcement learning can be used to learn a reactive control strategy that works well even in the presence of a high degree of noise and uncertainty. >

...read moreread less

Reinforcement learning and its application to control

[...]

Vijaykumar Gullapalli¹•Institutions (1)

University of Massachusetts Boston¹

01 Jan 1992

TL;DR: It is argued that for certain types of problems the latter approach, of which reinforcement learning is an example, can yield faster, more reliable learning, while the former approach is relatively inefficient.

...read moreread less

Abstract: Learning control involves modifying a controller's behavior to improve its performance as measured by some predefined index of performance (IP). If control actions that improve performance as measured by the IP are known, supervised learning methods, or methods for learning from examples, can be used to train the controller. But when such control actions are not known a priori, appropriate control behavior has to be inferred from observations of the IP. One can distinguish between two classes of methods for training controllers under such circumstances. Indirect methods involve constructing a model of the problem's IP and using the model to obtain training information for the controller. On the other hand, direct, or model-free, methods obtain the requisite training information by observing the effects of perturbing the controlled process on the IP. Despite its reputation for inefficiency, we argue that for certain types of problems the latter approach, of which reinforcement learning is an example, can yield faster, more reliable learning. Using several control problems as examples, we illustrate how the complexity of model construction can often exceed that of solving the original control problem using direct reinforcement learning methods, making indirect methods relatively inefficient. These results indicate the importance of considering direct reinforcement learning methods as tools for learning to solve control problems. We also present several techniques for augmenting the power of reinforcement learning methods. These include (1) the use of local models to guide assigning credit to the components of a reinforcement learning system, (2) implementing a procedure from experimental psychology called "shaping" to improve the efficiency of learning, thereby making more complex problems amenable to solution, and (3) implementing a multi-level learning architecture designed for exploiting task decomposability by using previously-learned behaviors as primitives for learning more complex tasks.

...read moreread less

Machine Learning: A Theoretical Approach

[...]

Translator-IEEE Expert staff

01 Aug 1992

Journal Article•DOI•

Constrained supervised learning

[...]

Michael I. Jordan¹•Institutions (1)

Massachusetts Institute of Technology¹

01 Sep 1992-Journal of Mathematical Psychology

TL;DR: The current paper considers how to specify targets by sets of constraints, rather than as particular vectors, to allow supervised learning algorithms to make use of flexibility in training.

...read moreread less

Book Chapter•DOI•

Compiling prior knowledge into an explicit basis

[...]

William W. Cohen¹•Institutions (1)

Bell Labs¹

07 Jul 1992

TL;DR: This work describes a new system in which theory-guided learning is separated into two phases, and introduces antecedent description grammars as a language for explicitly representing this bias in an inductive learning system.

...read moreread less

Abstract: Current theory-guided learning systems are inflexible, in that they are committed to performing one particular class of theory corrections; this is problematic because in some cases special-purpose theory-guided learning systems can dramatically outperform general-purpose ones. To address this problem, we describe a new system in which theory-guided learning is separated into two phases. The first phase is “theory intrepretation”, in which the initial theory is translated into an explicit description of the bias for an inductive learning system; we introduce antecedent description grammars as a language for explicitly representing this bias. The second phase is “grammatically biased learning”, in which this bias is used to search for a hypothesis. We demonstrate empirically that this approach leads to a flexible learning system which can, by use of suitable translators, emulate several useful learning systems; we also argue that this architecture makes it easier for a user unfamiliar with the details of the learning system to predict the impact that an initial theory will have on learning.

...read moreread less

A Framework for Combining Symbolic and Neural Learning

[...]

Jude W. Shavlik¹•Institutions (1)

University of Wisconsin-Madison¹

01 Jan 1992

TL;DR: An approach to combining symbolic and connectionist approaches to machine learning is described, with a three-stage framework and the research of several groups is reviewed with respect to this framework.

...read moreread less

Abstract: This article describes an approach to combining symbolic and connectionist approaches to machine learning A three-stage framework is presented and the research of several groups is reviewed with respect to this framework The first stage involves the insertion of symbolic knowledge into neural networks, the second addresses the refinement of this prior knowledge in its neural representation, while the third concerns the extraction of the refined symbolic knowledge Experimental results and open research issues are discussed

...read moreread less

Journal Article•DOI•

Machine fault classification: a neural network approach

[...]

Gerald M. Knapp¹, Hsu-Pin (Ben) Wang¹•Institutions (1)

University of Iowa¹

01 Apr 1992-International Journal of Production Research

TL;DR: A neural network-based machine fault diagnosis model is developed using the back propagation (BP) learning paradigm and network training efficiency is studied by varying the learning rate and learning momentum of the activation function.

...read moreread less

Abstract: This paper presents a neural network approach for machine fault diagnosis. Specifically, two tasks are explained and discussed: (1) a neural network-based machine fault diagnosis model is developed using the back propagation (BP) learning paradigm; (2) network training efficiency is studied by varying the learning rate and learning momentum of the activation function. The results are presented and discussed.

...read moreread less

Proceedings Article•DOI•

Genetic and learning automata algorithms for adaptive digital filters

[...]

R. Nambiar¹, C.K.K. Tang, P. Mars•Institutions (1)

Durham University¹

23 Mar 1992

TL;DR: Two different approaches to adaptive digital filtering based on learning algorithms are presented in detail and the use of improved learning schemes published elsewhere is detailed.

...read moreread less

Abstract: Two different approaches to adaptive digital filtering based on learning algorithms are presented in detail. The first approach is based on stochastic learning automata where the discretized values of a parameter(s) form the actions of a learning automata which then obtains the optimal parameter setting using a suitably defined error function as the feedback from the environment. The authors detail the use of improved learning schemes published elsewhere and also point out the basic shortcoming of this approach. The second approach is based on genetic algorithms (GAs). GAs have been used in the context of multiparameter optimization. Simulation results are presented to show how this approach is able to tackle the problems of dimensionality when adapting high-order filters. The effect of the differential parameters of a GA on the learning process is also demonstrated. Comparative results between a pure random search algorithm and the GA are also presented. >

...read moreread less

Proceedings Article•DOI•

Shaping as a method for accelerating reinforcement learning

[...]

Vijaykumar Gullapalli, Andrew G. Barto

11 Aug 1992

TL;DR: Shaping a reinforcement learning controller's behavior over time by gradually increasing the complexity of the control task as the controller learns makes it possible to scale reinforcement learning methods to more complex tasks.

...read moreread less

Abstract: Learning complex control behavior by building some initial control knowledge into the learning controller through shaping is addressed. The principle underlying shaping is that learning to solve complex problems can be facilitated by first learning to solve related simpler problems. The authors present experimental results illustrating the utility of shaping in training controllers by means of reinforcement learning methods. Shaping a reinforcement learning controller's behavior over time by gradually increasing the complexity of the control task as the controller learns makes it possible to scale reinforcement learning methods to more complex tasks. This is illustrated by an example. >

...read moreread less

Book•

Modular Learning in Neural Networks: A Modularized Approach to Neural Network Classification

[...]

Tomas Hrycej

25 Sep 1992

TL;DR: This important new work recognizes the advanced nature of today's artificial neural networks, uniquely emphasizing a modular approach to neural network learning, and covers the full range of conceivable approaches to the modularization of learning.

...read moreread less

Abstract: From the Publisher: This important new work recognizes the advanced nature of today's artificial neural networks, uniquely emphasizing a modular approach to neural network learning. By breaking down the learning task into relatively independent parts of lower complexity, Modular Learning in Neural Networks demonstrates how neural network learning can be made more powerful and efficient. The book's modular approach, unlike the monolithic viewpoint, admits intermediary solution stages whose success can be independently verified, as in other engineering fields. Each stage can be evaluated before moving on to the subsequent one, and the reason for possible failures can be analyzed, ultimately leading to the improved development and engineering of applications. The modular approach also takes into account growing network complexity, reducing the difficulty of such inevitable problems as scaling and convergence. Modular Learning in Neural Networks' modular approach is also fully in step with important psychological and neurobiological research. Studies in developmental psychology demonstrate the incremental nature of human learning, in which the success of each stage is conditioned by the successful accomplishment of the previous stage, while neurobiology has depicted the human brain as a complex structure of cooperating modules. Modular Learning in Neural Networks covers the full range of conceivable approaches to the modularization of learning, including decomposition of learning into modules using supervised and unsupervised learning types; decomposition of the function to be mapped into linear and nonlinear parts; decomposition of the neural network to minimize harmful interferences between a large number of network parameters during learning; decomposition of the application task into subtasks that are learned separately; decomposition into a knowledge-based part and a learning part. The book attempts to show that modular learning based on these approaches is helpful in improving t

...read moreread less

Proceedings Article•

Learning flexible concepts from streams of examples: FLORA2

[...]

Gerhard Widmer, Miroslav Kubat

30 Aug 1992

TL;DR: FLORA2 is a program for supervised learning of concepts that are subject to concept drift that keeps in memory not only valid descriptions of the concepts as they are derived from the objects currently present in the window, but alsòcandidate descriptions' that may turn into valid descriptions in the future.

...read moreread less

Abstract: FLORA2 is a program for supervised learning of concepts that are subject to concept drift. The learning process is incremental in that the examples are processed one by one. A special feature of our program consists in keeping in memory a subset of examples { a window. In time, new examples are being added to the window while other ones are considered outdated and are forgotten. In order to track the concept drift, the system keeps in memory not only valid descriptions of the concepts as they are derived from the objects currently present in the window, but alsòcandidate descriptions' that may turn into valid descriptions in the future.

...read moreread less

Proceedings Article•DOI•

Bounds on the sample complexity of Bayesian learning using information theory and the VC dimension

[...]

Michael Kearns¹•Institutions (1)

Bell Labs¹

07 Jun 1992

TL;DR: An understanding of the sample complexity of learning in several existing models is provided and a systematic investigation and comparison of two fundamental quantities in learning and information theory is undertaken.

...read moreread less

Abstract: Summary form only given. A Bayesian or average-case model of concept learning is given. The model provides more precise characterizations of the learning curve (sample complexity) behaviour that depends on properties of both the prior distribution over concepts and the sequence of instances seen by the learner. It unites in a common framework statistical physics and VC dimension theories of learning curves. A systematic investigation and comparison of two fundamental quantities in learning and information theory is undertaken. These are the probability of an incorrect prediction for an optimal learning algorithm, and the Shannon information gain. This paper provides an understanding of the sample complexity of learning in several existing models. >

...read moreread less

Proceedings Article•DOI•

Supervised and unsupervised learning with fuzzy similarity for neural-network-based fuzzy logic control systems

[...]

C.S.G. Lee¹, Chin-Teng Lin•Institutions (1)

Purdue University¹

18 Oct 1992

TL;DR: An online supervised structure/parameter learning algorithm is proposed which can find proper fuzzy logic rules, membership functions, and the size of output fuzzy partitions simultaneously and performs well if sets of training data are available offline.

...read moreread less

Abstract: A feedforward multilayered connectionist network that has distributed learning abilities is proposed to realize the basic elements and functions of a traditional fuzzy logic controller Two complementary structure/parameter learning algorithms are proposed for setting up the proposed neural-network-based fuzzy logic control system (NN-FLCS) First, a two-phase hybrid learning algorithm is proposed which combines unsupervised and supervised learning procedures to build the rule nodes and train the membership functions The two-phase hybrid learning algorithm performs well if sets of training data are available offline The authors then propose an online supervised structure/parameter learning algorithm for constructing the NN-FLCS dynamically This algorithm combines the backpropagation learning scheme for the parameter learning and a fuzzy similarity measure for the structure learning The proposed online structure/parameter learning algorithm can find proper fuzzy logic rules, membership functions, and the size of output fuzzy partitions simultaneously Computer simulation examples are presented to illustrate the performance of the learning algorithms >

...read moreread less

Proceedings Article•DOI•

Genetic cascade learning for neural networks

[...]

N. Karunanithi¹, R. Das¹, Darrell Whitley¹•Institutions (1)

Colorado State University¹

06 Jun 1992

TL;DR: This step-wise constructive algorithm exhibits more scalability than existing genetic algorithms and is free of the competing conventions problem which results from the fact that functionally equivalent networks may have different assignments of functionality to individual hidden units.

...read moreread less

Abstract: Genetic cascade learning is a new constructive algorithm for connectionist learning which combines genetic algorithms and the architectural feature of the cascade-correlation learning algorithm. Like the cascade-correlation learning architecture, this new algorithm also starts with a minimal network and dynamically builds a suitable cascade structure by training and installing one hidden unit at a time until the problem is successfully learned. This step-wise constructive algorithm exhibits more scalability than existing genetic algorithms and is free of the competing conventions problem which results from the fact that functionally equivalent networks may have different assignments of functionality to individual hidden units. Initial tests of genetic cascade learning are carried out on a difficult supervised learning problem as well as a reinforcement learning control problem. >

...read moreread less

Proceedings Article•DOI•

A smart algorithm for incremental learning

[...]

E.H.-C. Wang¹, A. Kuh•Institutions (1)

Stanford University¹

07 Jun 1992

TL;DR: A new perceptron learning algorithm is proposed, the smart algorithm, to find the near-optimal set of weights at each node, to improve the performance of the incremental learning algorithms.

...read moreread less

Abstract: Incremental learning algorithms not only adjust interconnection weights, but also change the network architecture by adding hidden nodes at the network. The capabilities of these incremental learning algorithms are examined. Four different incremental learning algorithms have been simulated for a variety of learning tasks. To improve the performance of the incremental learning algorithms, a new perceptron learning algorithm is proposed, the smart algorithm, to find the near-optimal set of weights at each node. The simulation results show that the smart algorithm improves the performance of these incremental learning algorithms. Among the four algorithms examined, the global algorithm performed the best. >

...read moreread less