Author

Gregory Kuhlmann

Bio: Gregory Kuhlmann is an academic researcher from the University of Texas at Austin. The author has contributed to research on topics including general game playing and simulations and games in economics education. The author has an h-index of 12 and has co-authored 16 publications receiving 1088 citations.

Papers

Journal Article•DOI•

Reinforcement learning for RoboCup soccer keepaway

Peter Stone¹, Richard S. Sutton², Gregory Kuhlmann¹•Institutions (2)

University of Texas at Austin¹, University of Alberta²

01 Sep 2005-Adaptive Behavior

TL;DR: The application of episodic SMDP Sarsa(λ) with linear tile-coding function approximation and variable λ to learning higher-level decisions in a keepaway subtask of RoboCup soccer results in agents that significantly outperform a range of benchmark policies.

Abstract: RoboCup simulated soccer presents many challenges to reinforcement learning methods, including a large state space, hidden and uncertain state, multiple independent agents learning simultaneously, and long and variable delays in the effects of actions. We describe our application of episodic SMDP Sarsa(λ) with linear tile-coding function approximation and variable λ to learning higher-level decisions in a keepaway subtask of RoboCup soccer. In keepaway, one team, “the keepers,” tries to keep control of the ball for as long as possible despite the efforts of “the takers.” The keepers learn individually when to hold the ball and when to pass to a teammate. Our agents learned policies that significantly outperform a range of benchmark policies. We demonstrate the generality of our approach by applying it to a number of task variations including different field sizes and different numbers of players on each team.

430 citations

Book Chapter•DOI•

Keepaway soccer: from machine learning testbed to benchmark

Peter Stone¹, Gregory Kuhlmann¹, Matthew D. Taylor¹, Yaxin Liu¹•Institutions (1)

University of Texas at Austin¹

01 Jan 2006

TL;DR: A set of programs, tools, and resources is presented that makes the keepaway soccer domain easily usable for experimentation without any prior knowledge of RoboCup or the Soccer Server.

Abstract: Keepaway soccer has been previously put forth as a testbed for machine learning. Although multiple researchers have used it successfully for machine learning experiments, doing so has required a good deal of domain expertise. This paper introduces a set of programs, tools, and resources designed to make the domain easily usable for experimentation without any prior knowledge of RoboCup or the Soccer Server. In addition, we report on new experiments in the Keepaway domain, along with performance results designed to be directly comparable with future experimental results. Combined, the new infrastructure and our concrete demonstration of its use in comparative experiments elevate the domain to a machine learning benchmark, suitable for use by researchers across the field.

156 citations

Proceedings Article•DOI•

Autonomous transfer for reinforcement learning

Matthew D. Taylor¹, Gregory Kuhlmann¹, Peter Stone¹•Institutions (1)

University of Texas at Austin¹

12 May 2008

TL;DR: This paper introduces Modeling Approximate State Transitions by Exploiting Regression (MASTER), a method for automatically learning a mapping from one task to another through an agent's experience and demonstrates that such learned relationships can significantly improve the speed of a reinforcement learning algorithm in a series of Mountain Car tasks.

Abstract: Recent work in transfer learning has succeeded in making reinforcement learning algorithms more efficient by incorporating knowledge from previous tasks. However, such methods typically must be provided either a full model of the tasks or an explicit relation mapping one task into the other. An autonomous agent may not have access to such high-level information, but would be able to analyze its experience to find similarities between tasks. In this paper we introduce Modeling Approximate State Transitions by Exploiting Regression (MASTER), a method for automatically learning a mapping from one task to another through an agent's experience. We empirically demonstrate that such learned relationships can significantly improve the speed of a reinforcement learning algorithm in a series of Mountain Car tasks. Additionally, we demonstrate that our method may also assist with the difficult problem of task selection for transfer.

135 citations

Proceedings Article•

Automatic heuristic construction in a complete general game player

Gregory Kuhlmann¹, Kurt Dresner¹, Peter Stone¹•Institutions (1)

University of Texas at Austin¹

16 Jul 2006

TL;DR: The main feature of the approach is a novel method for automatically constructing effective search heuristics based on the formal game description; the agent is fully implemented and tested in a range of different games.

Abstract: Computer game players are typically designed to play a single game: today's best chess-playing programs cannot play checkers, or even tic-tac-toe. General Game Playing is the problem of designing an agent capable of playing many different previously unseen games. The first AAAI General Game Playing Competition was held at AAAI 2005 in order to promote research in this area. In this article, we survey some of the issues involved in creating a general game playing system and introduce our entry to that event. The main feature of our approach is a novel method for automatically constructing effective search heuristics based on the formal game description. Our agent is fully implemented and tested in a range of different games.

74 citations

Book Chapter•DOI•

Graph-Based Domain Mapping for Transfer Learning in General Games

Gregory Kuhlmann¹, Peter Stone¹•Institutions (1)

University of Texas at Austin¹

17 Sep 2007

TL;DR: This work introduces a graph-based method for identifying previously encountered games and proves its robustness formally, and describes how the same basic approach can be used to identify similar but non-identical games.

Abstract: A general game player is an agent capable of taking as input a description of a game's rules in a formal language and proceeding to play without any subsequent human input. To do well, an agent should learn from experience with past games and transfer the learned knowledge to new problems. We introduce a graph-based method for identifying previously encountered games and prove its robustness formally. We then describe how the same basic approach can be used to identify similar but non-identical games. We apply this technique to automate domain mapping for value function transfer and speed up reinforcement learning on variants of previously played games. Our approach is fully implemented with empirical results in the general game playing system.

72 citations

Cited by

Journal Article•DOI•

A Survey on Transfer Learning

Sinno Jialin Pan¹, Qiang Yang¹•Institutions (1)

Hong Kong University of Science and Technology¹

01 Oct 2010-IEEE Transactions on Knowledge and Data Engineering

TL;DR: The relationship between transfer learning and other related machine learning techniques, such as domain adaptation, multitask learning, sample selection bias, and covariate shift, is discussed.

Abstract: A major assumption in many machine learning and data mining algorithms is that the training and future data must be in the same feature space and have the same distribution. However, in many real-world applications, this assumption may not hold. For example, we sometimes have a classification task in one domain of interest, but we only have sufficient training data in another domain of interest, where the latter data may be in a different feature space or follow a different data distribution. In such cases, knowledge transfer, if done successfully, would greatly improve the performance of learning by avoiding much expensive data-labeling efforts. In recent years, transfer learning has emerged as a new learning framework to address this problem. This survey focuses on categorizing and reviewing the current progress on transfer learning for classification, regression, and clustering problems. In this survey, we discuss the relationship between transfer learning and other related machine learning techniques such as domain adaptation, multitask learning and sample selection bias, as well as covariate shift. We also explore some potential future issues in transfer learning research.

18,616 citations

Journal Article•DOI•

Transfer Learning for Reinforcement Learning Domains: A Survey

Matthew D. Taylor¹, Peter Stone•Institutions (1)

University of Southern California¹

01 Dec 2009-Journal of Machine Learning Research

TL;DR: This article presents a framework that classifies transfer learning methods in terms of their capabilities and goals, and then uses it to survey the existing literature, as well as to suggest future directions for transfer learning work.

Abstract: The reinforcement learning paradigm is a popular way to address problems that have only limited environmental feedback, rather than correctly labeled examples, as is common in other machine learning contexts. While significant progress has been made to improve learning in a single task, the idea of transfer learning has only recently been applied to reinforcement learning tasks. The core idea of transfer is that experience gained in learning to perform one task can help improve learning performance in a related, but different, task. In this article we present a framework that classifies transfer learning methods in terms of their capabilities and goals, and then use it to survey the existing literature, as well as to suggest future directions for transfer learning work.

1,634 citations

Proceedings Article•

Benchmarking deep reinforcement learning for continuous control

Yan Duan¹, Xi Chen¹, Rein Houthooft¹, John Schulman¹, Pieter Abbeel¹•Institutions (1)

University of California, Berkeley¹

19 Jun 2016

TL;DR: In this paper, the authors present a benchmark suite of continuous control tasks, including classic tasks like cart-pole swing-up, tasks with high state and action dimensionality such as 3D humanoid locomotion, and tasks with partial observations.

Abstract: Recently, researchers have made significant progress combining the advances in deep learning for learning feature representations with reinforcement learning. Some notable examples include training agents to play Atari games based on raw pixel data and to acquire advanced manipulation skills using raw sensory inputs. However, it has been difficult to quantify progress in the domain of continuous control due to the lack of a commonly adopted benchmark. In this work, we present a benchmark suite of continuous control tasks, including classic tasks like cart-pole swing-up, tasks with very high state and action dimensionality such as 3D humanoid locomotion, tasks with partial observations, and tasks with hierarchical structure. We report novel findings based on the systematic evaluation of a range of implemented reinforcement learning algorithms. Both the benchmark and reference implementations are released at https://github.com/rllab/rllab in order to facilitate experimental reproducibility and to encourage adoption by other researchers.

1,038 citations

Posted Content•

QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning

Tabish Rashid¹, Mikayel Samvelyan, Christian Schroeder de Witt¹, Gregory Farquhar¹, Jakob Foerster², Shimon Whiteson¹•Institutions (2)

University of Oxford¹, Facebook²

30 Mar 2018-arXiv: Learning

TL;DR: In this article, the authors propose a value-based method that can train decentralised policies in a centralised end-to-end fashion in simulated or laboratory settings, where global state information is available and communication constraints are lifted.

Abstract: In many real-world settings, a team of agents must coordinate their behaviour while acting in a decentralised way. At the same time, it is often possible to train the agents in a centralised fashion in a simulated or laboratory setting, where global state information is available and communication constraints are lifted. Learning joint action-values conditioned on extra state information is an attractive way to exploit centralised learning, but the best strategy for then extracting decentralised policies is unclear. Our solution is QMIX, a novel value-based method that can train decentralised policies in a centralised end-to-end fashion. QMIX employs a network that estimates joint action-values as a complex non-linear combination of per-agent values that condition only on local observations. We structurally enforce that the joint-action value is monotonic in the per-agent values, which allows tractable maximisation of the joint action-value in off-policy learning, and guarantees consistency between the centralised and decentralised policies. We evaluate QMIX on a challenging set of StarCraft II micromanagement tasks, and show that QMIX significantly outperforms existing value-based multi-agent reinforcement learning methods.

693 citations

Journal Article•DOI•

Imitation Learning: A Survey of Learning Methods

Ahmed Hussein¹, Mohamed Medhat Gaber², Eyad Elyan¹, Chrisina Jayne¹•Institutions (2)

Robert Gordon University¹, Birmingham City University²

06 Apr 2017-ACM Computing Surveys

TL;DR: This article surveys imitation learning methods and presents design options in different steps of the learning process, and extensively discusses combining imitation learning approaches using different sources and methods, as well as incorporating other motion learning methods to enhance imitation.

Abstract: Imitation learning techniques aim to mimic human behavior in a given task. An agent (a learning machine) is trained to perform a task from demonstrations by learning a mapping between observations and actions. The idea of teaching by imitation has been around for many years; however, the field is gaining attention recently due to advances in computing and sensing as well as rising demand for intelligent applications. The paradigm of learning by imitation is gaining popularity because it facilitates teaching complex tasks with minimal expert knowledge of the tasks. Generic imitation learning methods could potentially reduce the problem of teaching a task to that of providing demonstrations, without the need for explicit programming or designing reward functions specific to the task. Modern sensors are able to collect and transmit high volumes of data rapidly, and processors with high computational power allow fast processing that maps the sensory data to actions in a timely manner. This opens the door for many potential AI applications that require real-time perception and reaction such as humanoid robots, self-driving vehicles, human computer interaction, and computer games, to name a few. However, specialized algorithms are needed to effectively and robustly learn models as learning by imitation poses its own set of challenges. In this article, we survey imitation learning methods and present design options in different steps of the learning process. We introduce a background and motivation for the field as well as highlight challenges specific to the imitation problem. Methods for designing and evaluating imitation learning tasks are categorized and reviewed. Special attention is given to learning methods in robotics and games as these domains are the most popular in the literature and provide a wide array of problems and methodologies. 
We extensively discuss combining imitation learning approaches using different sources and methods, as well as incorporating other motion learning methods to enhance imitation. We also discuss the potential impact on industry, present major applications, and highlight current and future research directions.

535 citations
