
Showing papers on "Algorithmic learning theory" published in 2007


Proceedings Article
03 Dec 2007
TL;DR: This contribution develops a theoretical framework that takes into account the effect of approximate optimization on learning algorithms and shows distinct tradeoffs for the case of small-scale and large-scale learning problems.
Abstract: This contribution develops a theoretical framework that takes into account the effect of approximate optimization on learning algorithms. The analysis shows distinct tradeoffs for the case of small-scale and large-scale learning problems. Small-scale learning problems are subject to the usual approximation-estimation tradeoff. Large-scale learning problems are subject to a qualitatively different tradeoff involving the computational complexity of the underlying optimization algorithms in non-trivial ways.
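
As a rough illustration of the tradeoff the abstract refers to, the excess error of the learned predictor is usually decomposed into three terms (a standard formulation in this line of work; the paper's own notation may differ):

```latex
% E is expected risk, f^* the overall risk minimizer, f^*_F the best predictor in the
% chosen family F, f_n the empirical risk minimizer on n examples, and \tilde{f}_n the
% approximate solution actually returned by the optimizer.
\mathcal{E}
  = \underbrace{\mathbb{E}\big[E(f^*_{\mathcal{F}}) - E(f^*)\big]}_{\text{approximation}}
  + \underbrace{\mathbb{E}\big[E(f_n) - E(f^*_{\mathcal{F}})\big]}_{\text{estimation}}
  + \underbrace{\mathbb{E}\big[E(\tilde{f}_n) - E(f_n)\big]}_{\text{optimization}}
```

In small-scale learning the optimization term can be driven to essentially zero, leaving the usual approximation-estimation tradeoff; in large-scale learning a fixed time budget forces a three-way tradeoff, in which cheaper but less accurate optimizers can win by processing more data.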

1,599 citations


Proceedings ArticleDOI
20 Jun 2007
TL;DR: A series of experiments indicate that these models with deep architectures show promise in solving harder learning problems that exhibit many factors of variation.
Abstract: Recently, several learning algorithms relying on models with deep architectures have been proposed. Though they have demonstrated impressive performance, to date, they have only been evaluated on relatively simple problems such as digit recognition in a controlled environment, for which many machine learning algorithms already report reasonable results. Here, we present a series of experiments which indicate that these models show promise in solving harder learning problems that exhibit many factors of variation. These models are compared with well-established algorithms such as Support Vector Machines and single hidden-layer feed-forward neural networks.

1,122 citations



Book
01 Jan 2007
TL;DR: Algorithmic Learning in a Random World describes recent theoretical and experimental developments in building computable approximations to Kolmogorov's algorithmic notion of randomness and describes how several important machine learning problems cannot be solved if the only assumption is randomness.
Abstract: Algorithmic Learning in a Random World describes recent theoretical and experimental developments in building computable approximations to Kolmogorov's algorithmic notion of randomness. Based on these approximations, a new set of machine learning algorithms have been developed that can be used to make predictions and to estimate their confidence and credibility in high-dimensional spaces under the usual assumption that the data are independent and identically distributed (assumption of randomness). Another aim of this unique monograph is to outline some limits of predictions: The approach based on algorithmic theory of randomness allows for the proof of impossibility of prediction in certain situations. The book describes how several important machine learning problems, such as density estimation in high-dimensional spaces, cannot be solved if the only assumption is randomness.
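
A minimal sketch of the conformal-prediction idea the book develops, under the same i.i.d. assumption, using a simple nearest-neighbour nonconformity measure (the measure and all names here are illustrative, not the book's):

```python
import numpy as np

def nn_nonconformity(X, y, i):
    """Nonconformity of example i: distance to its nearest neighbour with the
    same label divided by distance to its nearest neighbour with another label."""
    d = np.linalg.norm(X - X[i], axis=1)
    d[i] = np.inf                                    # ignore the example itself
    same = d[y == y[i]].min()
    diff = d[y != y[i]].min() if np.any(y != y[i]) else np.inf
    return same / diff

def conformal_p_values(X_train, y_train, x_new, labels):
    """Transductive conformal prediction: the p-value of a candidate label is the
    fraction of examples that are at least as nonconforming as the new one."""
    p = {}
    for lab in labels:
        X = np.vstack([X_train, x_new])
        y = np.append(y_train, lab)
        scores = np.array([nn_nonconformity(X, y, i) for i in range(len(y))])
        p[lab] = np.mean(scores >= scores[-1])
    return p

# Labels whose p-value exceeds a chosen significance level form the prediction set;
# the confidence and credibility estimates mentioned above are derived from these p-values.
```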

636 citations


Book
01 Aug 2007

610 citations


01 Jan 2007
TL;DR: This dissertation describes a novel framework for the design and analysis of online learning algorithms and proposes a new perspective on regret bounds which is based on the notion of duality in convex optimization.
Abstract: Online learning is the process of answering a sequence of questions given knowledge of the correct answers to previous questions and possibly additional available information. Answering questions in an intelligent fashion and being able to make rational decisions as a result is a basic feature of everyday life. Will it rain today (so should I take an umbrella)? Should I fight the wild animal that is after me, or should I run away? Should I open an attachment in an email message or is it a virus? The study of online learning algorithms is thus an important domain in machine learning, and one that has interesting theoretical properties and practical applications. This dissertation describes a novel framework for the design and analysis of online learning algorithms. We show that various online learning algorithms can all be derived as special cases of our algorithmic framework. This unified view explains the properties of existing algorithms and also enables us to derive several new interesting algorithms. Online learning is performed in a sequence of consecutive rounds, where at each round the learner is given a question and is required to provide an answer to this question. After predicting an answer, the correct answer is revealed and the learner suffers a loss if there is a discrepancy between his answer and the correct one. The algorithmic framework for online learning we propose in this dissertation stems from a connection that we make between the notions of regret in online learning and weak duality in convex optimization. Regret bounds are the common thread in the analysis of online learning algorithms. A regret bound measures the performance of an online algorithm relative to the performance of a competing prediction mechanism, called a competing hypothesis. The competing hypothesis can be chosen in hindsight from a class of hypotheses, after observing the entire sequence of question-answer pairs. Over the years, competitive analysis techniques have been refined and extended to numerous prediction problems by employing complex and varied notions of progress toward a good competing hypothesis. We propose a new perspective on regret bounds which is based on the notion of duality in convex optimization. Regret bounds are universal in the sense that they hold for any possible fixed hypothesis in a given hypothesis class. We therefore cast the universal bound as a lower bound
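
The regret notion described above can be stated compactly (a standard formulation; the dissertation's own notation may differ):

```latex
% Loss of the learner's predictions w_t over T rounds, compared with the best fixed
% competing hypothesis h chosen in hindsight from the class H.
\mathrm{Regret}_T \;=\; \sum_{t=1}^{T} \ell_t(w_t) \;-\; \min_{h \in \mathcal{H}} \sum_{t=1}^{T} \ell_t(h)
```

Because the bound must hold simultaneously for every h in H, it is universal, which is what allows it to be recast, via weak duality, as a lower bound in a convex optimization problem.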

359 citations


Book ChapterDOI
Jon Kleinberg1
01 Sep 2007
TL;DR: A collection of probabilistic and game-theoretic models for such phenomena proposed in the mathematical social sciences, as well as recent algorithmic work on the problem by computer scientists are considered.
Abstract: The flow of information or influence through a large social network can be thought of as unfolding with the dynamics of an epidemic: as individuals become aware of new ideas, technologies, fads, rumors, or gossip, they have the potential to pass them on to their friends and colleagues, causing the resulting behavior to cascade through the network. We consider a collection of probabilistic and game-theoretic models for such phenomena proposed in the mathematical social sciences, as well as recent algorithmic work on the problem by computer scientists. Building on this, we discuss the implications of cascading behavior in a number of on-line settings, including word-of-mouth effects (also known as “viral marketing”) in the success of new products, and the influence of social networks in the growth of on-line
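
One of the standard probabilistic models in this literature is the independent cascade model; a minimal simulation sketch follows (the uniform activation probability and the names below are illustrative simplifications):

```python
import random

def independent_cascade(neighbors, seeds, p=0.1, rng=None):
    """One run of the independent cascade model: each newly activated node gets a
    single chance to activate each still-inactive neighbour, succeeding with probability p."""
    rng = rng or random.Random(0)
    active, frontier = set(seeds), list(seeds)
    while frontier:
        newly_active = []
        for u in frontier:
            for v in neighbors.get(u, []):
                if v not in active and rng.random() < p:
                    active.add(v)
                    newly_active.append(v)
        frontier = newly_active
    return active

# Example: a word-of-mouth cascade over a toy friendship graph.
graph = {"a": ["b", "c"], "b": ["c", "d"], "c": ["d"], "d": []}
print(independent_cascade(graph, seeds={"a"}, p=0.5))
```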

302 citations


Journal ArticleDOI
TL;DR: This article is a brief guide to the field of algorithmic information theory and its underlying philosophy; the major subfields, applications, history, and a map of the field are presented.
Abstract: This article is a brief guide to the field of algorithmic information theory (AIT), its underlying philosophy, and the most important concepts. AIT arises by mixing information theory and computation theory to obtain an objective and absolute notion of information in an individual object, and in so doing gives rise to an objective and robust notion of randomness of individual objects. This is in contrast to classical information theory that is based on random variables and communication, and has no bearing on information and randomness of individual objects. After a brief overview, the major subfields, applications, history, and a map of the field are presented.
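
The central quantity of AIT can be written in one line (standard definitions, included here for orientation rather than taken from the article):

```latex
% Kolmogorov complexity of a string x relative to a universal machine U: the length of
% the shortest program that makes U output x. A string is (roughly) algorithmically
% random when no program noticeably shorter than the string itself can produce it.
K_U(x) = \min\{\, |p| \;:\; U(p) = x \,\}, \qquad
x \text{ is } c\text{-incompressible if } K_U(x) \ge |x| - c
```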

266 citations


Book
01 Jun 2007
TL;DR: Contents: Introduction; Learning and intelligence; Machine learning basics; Knowledge representation; Learning as search; Attribute quality measures; Data pre-processing; Constructive induction; Symbolic learning; Statistical learning
Abstract: Contents: Introduction; Learning and intelligence; Machine learning basics; Knowledge representation; Learning as search; Attribute quality measures; Data pre-processing; Constructive induction; Symbolic learning; Statistical learning; Artificial neural networks; Cluster analysis; Learning theory; Computational learning theory; Definitions; References and index.

266 citations


Proceedings Article
01 Dec 2007
TL;DR: The experimental results presented in the information extraction domain demonstrate that applying constraints helps the model to generate better feedback during learning, and hence the framework allows for high performance learning with significantly less training data than was possible before on these tasks.
Abstract: Over the last few years, two of the main research directions in machine learning of natural language processing have been the study of semi-supervised learning algorithms as a way to train classifiers when the labeled data is scarce, and the study of ways to exploit knowledge and global information in structured learning tasks. In this paper, we suggest a method for incorporating domain knowledge in semi-supervised learning algorithms. Our novel framework unifies and can exploit several kinds of task-specific constraints. The experimental results presented in the information extraction domain demonstrate that applying constraints helps the model to generate better feedback during learning, and hence the framework allows for high performance learning with significantly less training data than was possible before on these tasks.

228 citations


Journal ArticleDOI
TL;DR: It is argued that statistical and social cues can be seamlessly integrated to facilitate early word learning and a unified model is presented that is able to make use of different kinds of embodied social cues in the statistical learning framework.

Book
17 Sep 2007
TL;DR: Taking an intuitive approach to the material, this lucid book facilitates the application of semisupervised learning methods to natural language processing and provides the framework and motivation for a more systematic study of machine learning.
Abstract: The rapid advancement in the theoretical understanding of statistical and machine learning methods for semisupervised learning has made it difficult for nonspecialists to keep up to date in the field. Providing a broad, accessible treatment of the theory as well as linguistic applications, Semisupervised Learning for Computational Linguistics offers self-contained coverage of semisupervised methods that includes background material on supervised and unsupervised learning. The book presents a brief history of semisupervised learning and its place in the spectrum of learning methods before moving on to discuss well-known natural language processing methods, such as self-training and co-training. It then centers on machine learning techniques, including the boundary-oriented methods of perceptrons, boosting, support vector machines (SVMs), and the null-category noise model. In addition, the book covers clustering, the expectation-maximization (EM) algorithm, related generative methods, and agreement methods. It concludes with the graph-based method of label propagation as well as a detailed discussion of spectral methods. Taking an intuitive approach to the material, this lucid book facilitates the application of semisupervised learning methods to natural language processing and provides the framework and motivation for a more systematic study of machine learning.
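
As an illustration of the graph-based label propagation method mentioned at the end of the abstract, here is a minimal "propagate, then clamp" sketch (matrix and variable names are ours, not the book's):

```python
import numpy as np

def label_propagation(W, y, labeled_mask, n_classes, iters=100):
    """Graph-based label propagation: repeatedly average label distributions over
    graph neighbours while clamping the labeled nodes.
    W: symmetric affinity matrix; y: integer labels (only labeled entries are used);
    labeled_mask: boolean array marking nodes whose label is known."""
    P = W / (W.sum(axis=1, keepdims=True) + 1e-12)   # row-stochastic transition matrix
    F = np.zeros((W.shape[0], n_classes))
    F[labeled_mask, y[labeled_mask]] = 1.0           # one-hot seed labels
    seeds = F.copy()
    for _ in range(iters):
        F = P @ F                                    # propagate to neighbours
        F[labeled_mask] = seeds[labeled_mask]        # clamp the labeled nodes
    return F.argmax(axis=1)                          # predicted labels for all nodes
```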

Proceedings ArticleDOI
20 Jun 2007
TL;DR: The MissSVM algorithm is proposed which addresses multi- instance learning using a special semi-supervised support vector machine and is competitive with state-of-the-art multi-instance learning algorithms.
Abstract: Multi-instance learning and semi-supervised learning are different branches of machine learning. The former attempts to learn from a training set consisting of labeled bags, each containing many unlabeled instances; the latter tries to exploit abundant unlabeled instances when learning with a small number of labeled examples. In this paper, we establish a bridge between these two branches by showing that multi-instance learning can be viewed as a special case of semi-supervised learning. Based on this observation, we propose the MissSVM algorithm, which addresses multi-instance learning using a special semi-supervised support vector machine. Experiments show that solving multi-instance problems from the view of semi-supervised learning is feasible, and that the MissSVM algorithm is competitive with state-of-the-art multi-instance learning algorithms.
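
The bridge the abstract describes can be made concrete: under the standard multi-instance assumption, every instance in a negative bag is negative, while instances in positive bags are unlabeled apart from a bag-level "at least one is positive" constraint. A small sketch of this recasting (illustrative only, not the MissSVM implementation):

```python
def mil_as_semisupervised(bags, bag_labels):
    """Recast a multi-instance dataset as a semi-supervised one.
    bags: list of lists of instances; bag_labels: 0 for negative bags, 1 for positive."""
    labeled, unlabeled, constraints = [], [], []
    for bag, label in zip(bags, bag_labels):
        if label == 0:
            labeled.extend((x, 0) for x in bag)        # negative-bag instances are negative
        else:
            idx = list(range(len(unlabeled), len(unlabeled) + len(bag)))
            unlabeled.extend(bag)                      # positive-bag instances are unlabeled...
            constraints.append(idx)                    # ...but at least one of them is positive
    return labeled, unlabeled, constraints
```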

Journal ArticleDOI
TL;DR: This finding provides support for Reber's hypothesis that implicit learning, in contrast to explicit learning, is independent of intelligence, thereby confirming the distinction between the 2 modes of learning.
Abstract: The hypothesis that performance on implicit learning tasks is unrelated to psychometric intelligence was examined in a sample of 605 German pupils. Performance in artificial grammar learning, process control, and serial learning did not correlate with various measures of intelligence when participants were given standard implicit instructions. Under an explicit rule discovery instruction, however, a significant relationship between performance on the learning tasks and intelligence appeared. This finding provides support for Reber's hypothesis that implicit learning, in contrast to explicit learning, is independent of intelligence, thereby confirming the distinction between the 2 modes of learning. However, because there were virtually no correlations among the 3 learning tasks, the assumption of a unitary ability of implicit learning was not supported.

Journal ArticleDOI
TL;DR: The concept of experiential learning is used in a wide range of connections and situations with a different meaning and content as discussed by the authors, and it is the aim of this article to try to find a common definition o...
Abstract: The concept of “experiential learning” is used in a wide range of connections and situations with a different meaning and content. It is the aim of this article to try to find a common definition o...


Book ChapterDOI
03 Jun 2007
TL;DR: A learning architecture within which a universal adaptation mechanism unifies a rich set of traditionally distinct learning paradigms, including learning by matching, learning by association, learning by instruction, and learning by reinforcement, is presented.
Abstract: Machine learning, a cornerstone of intelligent systems, has typically been studied in the context of specific tasks, including clustering (unsupervised learning), classification (supervised learning), and control (reinforcement learning). This paper presents a learning architecture within which a universal adaptation mechanism unifies a rich set of traditionally distinct learning paradigms, including learning by matching, learning by association, learning by instruction, and learning by reinforcement. In accordance with the notion of embodied intelligence, such a learning theory provides a computational account of how an autonomous agent may acquire the knowledge of its environment in a real-time, incremental, and continuous manner. Through a case study on a minefield navigation domain, we illustrate the efficacy of the proposed model, the learning paradigms encompassed, and the various types of knowledge learned.

Book ChapterDOI
01 Jan 2007
TL;DR: This chapter is an overview of the design and analysis of reputation systems for strategic users and considers three specific strategic threats to reputation systems: the possibility of users with poor reputations starting afresh (whitewashing); lack of effort or honesty in providing feedback; and sybil attacks, in which users create phantom feedback from fake identities to manipulate their own reputation.
Abstract: This chapter is an overview of the design and analysis of reputation systems for strategic users. We consider three specific strategic threats to reputation systems: the possibility of users with poor reputations starting afresh (whitewashing); lack of effort or honesty in providing feedback; and sybil attacks, in which users create phantom feedback from fake identities to manipulate their own reputation. In each case, we present a simple analytical model that captures the essence of the strategy, and describe approaches to solving the strategic problem in the context of this model. We conclude with a discussion of open questions in this research area.

Journal ArticleDOI
TL;DR: Computational studies that emphasize meta-parameters, hierarchy, modularity and supervised learning to resolve difficult theoretical issues are reviewed here, together with the related experimental data.

Book ChapterDOI
17 Sep 2007
TL;DR: This work presents a new incremental relational regression tree algorithm that is capable of dealing with concept drift through tree restructuring and shows that it enables a Q-learner to transfer knowledge from one task to another by recycling those parts of the generalized Q-function that still hold interesting information for the new task.
Abstract: We investigate the relation between transfer learning in reinforcement learning with function approximation and supervised learning with concept drift. We present a new incremental relational regression tree algorithm that is capable of dealing with concept drift through tree restructuring and show that it enables a Q-learner to transfer knowledge from one task to another by recycling those parts of the generalized Q-function that still hold interesting information for the new task. We illustrate the performance of the algorithm in several experiments.

Book ChapterDOI
01 Jan 2007
TL;DR: In this article, an overview of the current research on computational aspects of prediction markets is given, including the computational complexity of operating markets for combinatorial events, the design of automated market makers, and the analysis of the computational power and speed of a market as an aggregation tool.
Abstract: Prediction markets (also known as information markets) are markets established to aggregate knowledge and opinions about the likelihood of future events. This chapter is intended to give an overview of the current research on computational aspects of these markets. We begin with a brief survey of prediction market research, and then give a more detailed description of models and results in three areas: the computational complexity of operating markets for combinatorial events; the design of automated market makers; and the analysis of the computational power and speed of a market as an aggregation tool. We conclude with a discussion of open problems and directions for future research.
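
For the automated market maker thread, the logarithmic market scoring rule (LMSR) is the canonical design in this literature; a minimal sketch follows (parameter names are ours, and this is only one of the mechanisms such a survey would cover):

```python
import math

def lmsr_cost(q, b=100.0):
    """LMSR cost function C(q) = b * ln(sum_i exp(q_i / b)).
    q: outstanding shares per outcome; b: liquidity parameter."""
    return b * math.log(sum(math.exp(qi / b) for qi in q))

def lmsr_price(q, i, b=100.0):
    """Instantaneous price of outcome i, interpretable as its implied probability."""
    z = sum(math.exp(qj / b) for qj in q)
    return math.exp(q[i] / b) / z

def lmsr_trade_cost(q, i, shares, b=100.0):
    """Amount a trader pays to buy `shares` shares of outcome i at the current state q."""
    q_after = list(q)
    q_after[i] += shares
    return lmsr_cost(q_after, b) - lmsr_cost(q, b)

# Example: two-outcome market, buying 10 shares of outcome 0.
q = [0.0, 0.0]
print(lmsr_price(q, 0), lmsr_trade_cost(q, 0, 10.0))
```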

Journal ArticleDOI
TL;DR: A tutorial introduction to a popular class of machine learning tools, called kernel methods, is given, with basic explanations of some key concepts (the so-called kernel trick, the representer theorem, and regularization) that may open up the possibility that insights from machine learning can feed back into psychology.
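
To make the kernel trick and the representer theorem concrete, here is a minimal kernel ridge regression sketch in the spirit of such a tutorial (not code from the article):

```python
import numpy as np

def rbf_kernel(X1, X2, gamma=1.0):
    """Gaussian (RBF) kernel matrix: k(x, x') = exp(-gamma * ||x - x'||^2)."""
    sq_dists = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)

def kernel_ridge_fit(X, y, lam=1e-2, gamma=1.0):
    """By the representer theorem the regularized solution is f(x) = sum_i alpha_i k(x_i, x),
    with alpha = (K + lam * I)^{-1} y; lam is the regularization strength."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def kernel_ridge_predict(X_train, alpha, X_new, gamma=1.0):
    """Predict using only kernel evaluations (the kernel trick); no explicit feature map is needed."""
    return rbf_kernel(X_new, X_train, gamma) @ alpha
```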

Proceedings Article
01 Jun 2007
TL;DR: It is argued that the previously proposed approach of training a HumanLikeness classifier does not correlate as well with human judgments of translation quality, and that regression-based learning produces more reliable metrics.
Abstract: Recent studies suggest that machine learning can be applied to develop good automatic evaluation metrics for machine translated sentences. This paper further analyzes aspects of learning that impact performance. We argue that the previously proposed approach of training a HumanLikeness classifier does not correlate as well with human judgments of translation quality, and that regression-based learning produces more reliable metrics. We demonstrate the feasibility of regression-based metrics through empirical analysis of learning curves and generalization studies and show that they can achieve higher correlations with human judgments than standard automatic metrics.
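
A toy sketch of the regression-based alternative (the feature set and the linear least-squares learner here are illustrative; the paper's own features and learner are richer):

```python
import numpy as np

def fit_regression_metric(features, human_scores):
    """Fit a linear evaluation metric from sentence-level features
    (e.g. n-gram precisions, length ratio) to human judgments of quality."""
    X = np.hstack([features, np.ones((len(features), 1))])   # append a bias column
    w, *_ = np.linalg.lstsq(X, human_scores, rcond=None)
    return w

def metric_score(features, w):
    """Score new candidate translations with the learned metric."""
    return np.hstack([features, np.ones((len(features), 1))]) @ w
```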

Proceedings ArticleDOI
06 Aug 2007
TL;DR: The cognitive processes of learning are formally described using real-time process algebra (RTPA) and can be applied to both human and machine learning systems.
Abstract: Learning is a fundamental cognitive process of human intelligence. According to cognitive informatics, learning as a collective term can be classified into the categories of transitive, objective, and complex learning. This paper presents a theoretical framework of learning and explains its cognitive processes. The neural informatics foundations of learning, particularly the hierarchical neural cluster (HNC) model and the object-attribute-relation (OAR) model, are explored. The taxonomy and theory of learning are described based on concept algebra. The mathematical models of learning are systematically established for the categories of the transitive, objective, and complex learning. On the basis of the fundamental theories of learning, the cognitive processes of learning are formally described using real-time process algebra (RTPA). The theoretical framework established in this work can be applied to both human and machine learning systems.

Journal ArticleDOI
TL;DR: This work advocates an integrated model of skill learning that takes into account both implicit and explicit learning processes and uniquely embodies a bottom-up (implicit-to-explicit) learning approach in addition to other types of learning.

Journal ArticleDOI
TL;DR: A modular approach for achieving effective agent-centric learning in multi-agent systems that consists of a number of basic algorithmic building blocks, which can be instantiated and composed differently depending on the environment setting as well as the target class of opponents.
Abstract: We offer a new formal criterion for agent-centric learning in multi-agent systems, that is, learning that maximizes one's rewards in the presence of other agents who might also be learning (using the same or other learning algorithms). This new criterion takes in as a parameter the class of opponents. We then provide a modular approach for achieving effective agent-centric learning; the approach consists of a number of basic algorithmic building blocks, which can be instantiated and composed differently depending on the environment setting (for example, 2- versus n-player games) as well as the target class of opponents. We then provide several specific instances of the approach: an algorithm for stationary opponents, and two algorithms for adaptive opponents with bounded memory, one algorithm for the n-player case and another optimized for the 2-player case. We prove our algorithms correct with respect to the formal criterion, and furthermore show the algorithms to be experimentally effective via comprehensive computer testing.

Journal ArticleDOI
TL;DR: The authors explore the possibility that machine learning approaches to natural language processing (NLP) being developed in engineering-oriented computational linguistics (CL) may be able to provide specific scientific insights into the nature of human language.
Abstract: In this paper, we explore the possibility that machine learning approaches to natural language processing (NLP) being developed in engineering-oriented computational linguistics (CL) may be able to provide specific scientific insights into the nature of human language. We argue that, in principle, machine learning (ML) results could inform basic debates about language, in one area at least, and that in practice, existing results may offer initial tentative support for this prospect. Further, results from computational learning theory can inform arguments carried on within linguistic theory as well.

Proceedings Article
08 Jun 2007
TL;DR: The results suggest that the problem-posing refines problem schema, but it was compared only against a no-instruction control condition, so the learning needs to be compared against other forms of instruction to examine its characteristics.
Abstract: In this paper, “problem-posing as sentence-integration” is proposed as a new design for computer-based learning. We also introduce an interactive learning environment for this kind of problem-posing and report on its experimental use in elementary schools. We had already developed a computer-based learning environment for problem-posing as concept combination, but found that it was difficult for students in the lower grades. Problem-posing by sentence-integration is a framework for realizing learning by problem-posing in the lower grades: sentence-integration is a simpler method than problem-posing by concept combination, yet it is expected to retain the learning effect of refining problem schema. The learning environment was used by 132 second-grade students at two elementary schools to examine whether lower-grade students can pose problems in it. The effect of refining problem schema through the problem-posing was also analyzed with a pre-test and a post-test containing excessive-information problems. Through questionnaires and additional comments, the students and the teachers who observed the experiment accepted the learning environment as a useful tool for realizing learning by problem-posing. Although the results suggest that the problem-posing refines problem schema, it was compared only against a no-instruction control condition; as future work, it is therefore necessary to compare the learning against other forms of instruction to examine its characteristics.

Book ChapterDOI
19 Jun 2007
TL;DR: An extension of the options framework to the relational setting is introduced and it is shown how one can learn skills that can be transferred across similar, but different domains.
Abstract: In reinforcement learning problems, an agent has the task of learning a good or optimal strategy from interaction with its environment. At the start of the learning task, the agent usually has very little information. Therefore, when faced with complex problems that have a large state space, learning a good strategy might be infeasible or too slow to work in practice. One way to overcome this problem is to supply the agent with guidance in the form of traces of “reasonable policies”. However, in many cases it will be hard for the user to supply such a policy. In this paper, we investigate the use of transfer learning in Relational Reinforcement Learning. The goal of transfer learning is to accelerate learning on a target task after training on a different, but related, source task. More specifically, we introduce an extension of the options framework to the relational setting and show how one can learn skills that can be transferred across similar, but different, domains. We present experiments showing the possible benefits of using relational options for transfer learning.
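
A minimal sketch of a single option in the (propositional) options framework that the paper extends to the relational setting; the environment interface is an assumed gym-style API, not the paper's code:

```python
import random
from dataclasses import dataclass
from typing import Any, Callable, Set

@dataclass
class Option:
    """An option: an initiation set, an intra-option policy, and a termination condition."""
    initiation: Set[Any]                  # states in which the option may be invoked
    policy: Callable[[Any], Any]          # maps a state to a primitive action
    termination: Callable[[Any], float]   # probability of terminating in a state

def run_option(env, state, option, rng=None):
    """Execute an option until it terminates; returns (accumulated reward, final state).
    Assumes env.step(action) -> (next_state, reward, done)."""
    rng = rng or random.Random(0)
    total, s = 0.0, state
    while True:
        s, r, done = env.step(option.policy(s))
        total += r
        if done or rng.random() < option.termination(s):
            return total, s
```

Transfer then amounts to reusing such learned options (skills) when learning a related target task.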

Proceedings Article
06 Jan 2007
TL;DR: A learning architecture which transfers control knowledge in the form of behavioral skills and corresponding representation concepts from one task to subsequent learning tasks and can significantly outperform learning on a flat state space representation and the MAXQ method for hierarchical reinforcement learning.
Abstract: Learning capabilities of computer systems still lag far behind biological systems. One of the reasons can be seen in the inefficient re-use of control knowledge acquired over the lifetime of the artificial learning system. To address this deficiency, this paper presents a learning architecture which transfers control knowledge in the form of behavioral skills and corresponding representation concepts from one task to subsequent learning tasks. The presented system uses this knowledge to construct a more compact state space representation for learning while assuring bounded optimality of the learned task policy by utilizing a representation hierarchy. Experimental results show that the presented method can significantly outperform learning on a flat state space representation and the MAXQ method for hierarchical reinforcement learning.