
A Survey of Real-Time Strategy Game AI Research and Competition in StarCraft

TL;DR: An overview of the existing work on AI for real-time strategy (RTS) games focuses on the work around the game StarCraft, which has emerged in the past few years as the unified test bed for this research.
Abstract: This paper presents an overview of the existing work on AI for real-time strategy (RTS) games. Specifically, we focus on the work around the game StarCraft, which has emerged in the past few years as the unified test bed for this research. We describe the specific AI challenges posed by RTS games, and overview the solutions that have been explored to address them. Additionally, we also present a summary of the results of the recent StarCraft AI competitions, describing the architectures used by the participants. Finally, we conclude with a discussion emphasizing which problems in the context of RTS game AI have been solved, and which remain open.

Summary (5 min read)

Introduction

  • Road traffic, finance, or weather forecasts are examples of such large, complex, real-life dynamic environments.
  • This paper aims to provide a one-stop guide on what is the state of the art in RTS AI, with a particular emphasis on the work done in StarCraft.

II. REAL-TIME STRATEGY GAMES

  • Real-time Strategy (RTS) is a sub-genre of strategy games where players need to build an economy (gathering resources and building a base) and military power (training units and researching technologies) in order to defeat their opponents (destroying their army and base).
  • They are simultaneous move games, where more than one player can issue actions at the same time.
  • In comparison, the state space of StarCraft in a typical map is estimated to be many orders of magnitude larger than any of those, as discussed in the next section.
  • Interestingly enough, humans seem to be able to deal with the complexity of RTS games, and are still vastly superior to computers in these types of games [2].

B. Challenges in RTS Game AI

  • On the other hand, large datasets of replays have been created [4], [5], from which researchers have tried to learn strategies, trends, or plans.
  • The StarCraft community typically talks about two tasks: micro, the ability to control units individually (roughly corresponding to Reactive Control above, and part of Tactics), and macro, the ability to produce units and expand at the appropriate times.
  • The reader can find a good presentation of task decomposition for AIs playing RTS in [6].

III. EXISTING WORK ON RTS GAME AI

  • Systems that play RTS games need to address most, if not all, the aforementioned problems together.
  • Therefore, it is hard to classify existing work on RTS AI as addressing the different problems above.
  • Figure 2 graphically illustrates how strategy, tactics and reactive control are three points on a continuum, where strategy corresponds to decision-making processes that affect long spans of time (several minutes in the case of StarCraft), reactive control corresponds to low-level second-by-second decisions, and tactics sit in the middle.
  • Following this idea, the authors consider strategy to be everything related to the technology trees, build-order, upgrades, and army composition (a minimal build-order sketch follows below).
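
To make the build-order notion concrete, here is a minimal sketch (in Python) of a build order represented as an ordered list of production items, with a check that each item's tech-tree prerequisites have already been produced. The item names and prerequisite table are simplified assumptions for illustration, not data used by any of the surveyed systems.

```python
# Minimal sketch of a build-order representation with tech-tree
# prerequisite checking. Names and prerequisites are illustrative only.
PREREQUISITES = {
    "Gateway": ["Nexus", "Pylon"],
    "Cybernetics Core": ["Gateway"],
    "Zealot": ["Gateway"],
    "Dragoon": ["Gateway", "Cybernetics Core"],
}

def is_legal(build_order):
    """Return True if every item is produced only after its prerequisites."""
    owned = {"Nexus"}          # assumed starting building
    for item in build_order:
        missing = [p for p in PREREQUISITES.get(item, []) if p not in owned]
        if missing:
            return False       # a prerequisite has not been built yet
        owned.add(item)
    return True

print(is_legal(["Pylon", "Gateway", "Zealot", "Cybernetics Core", "Dragoon"]))  # True
print(is_legal(["Gateway", "Pylon"]))  # False: the Pylon is needed first
```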

A. Strategy

  • Strategic decision making in real-time domains is still an open problem.
  • Commercial approaches also include hierarchical FSMs, in which finite-state machines are composed hierarchically (see the sketch after this list).
  • For example, Ontañón et al. [9] explored the use of real-time case-based planning (CBP) in the domain of Wargus (a Warcraft II clone).
  • Concerning machine learning-based approaches, Weber and Mateas [4] proposed a data mining approach to strategy prediction and performed supervised learning on labeled StarCraft replays.
  • One final consideration concerning strategy is that RTS games are typically partially observable.
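
The hierarchical FSMs mentioned in the list above can be illustrated with a small sketch: a top-level FSM switches between game phases, while a nested FSM handles the decisions inside one phase. All states, transitions and observation fields below are invented for illustration; this is not the design of any particular commercial game or bot.

```python
# Sketch of a hierarchical finite-state machine for strategy selection.
# States, transitions and observation keys are invented for illustration.
class FSM:
    def __init__(self, transitions, start):
        self.transitions = transitions   # state -> list of (predicate, next_state)
        self.state = start

    def step(self, obs):
        for predicate, nxt in self.transitions.get(self.state, []):
            if predicate(obs):
                self.state = nxt
                break
        return self.state

# Low-level FSM controlling the opening phase only.
opening = FSM({
    "build_workers": [(lambda o: o["workers"] >= 12, "build_army")],
    "build_army":    [(lambda o: o["army"] >= 10, "attack")],
}, start="build_workers")

# Top-level FSM: switches phase, and delegates to the nested FSM while in OPENING.
strategy = FSM({
    "OPENING": [(lambda o: o["minute"] > 8, "MIDGAME")],
    "MIDGAME": [(lambda o: o["minute"] > 20, "LATEGAME")],
}, start="OPENING")

obs = {"workers": 14, "army": 4, "minute": 5}
phase = strategy.step(obs)
if phase == "OPENING":
    print(phase, "->", opening.step(obs))   # OPENING -> build_army
```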

B. Tactics

  • Tactical reasoning involves reasoning about the different abilities of the units in a group and about the environment and positions of the different groups of units in order to gain military advantage in battles.
  • The authors will divide the work on tactical reasoning in two parts: terrain analysis and decision making.
  • This kind of defensive building placement, creating a wall against early invasions, is used by human StarCraft players to survive early aggression and earn time to train more units.
  • Concerning tactical decision making, many different approaches have been explored such as machine learning or game tree search.
  • One approach builds IMTrees for each strategic decision in the game that involves spatial reasoning, by combining a set of basic influence maps (a minimal influence-map sketch follows below).
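
The influence maps combined by approaches such as IMTrees can be sketched in a few lines: each map spreads a value outward from unit positions, and several maps are then combined with weights. The decay function, weights and grid size below are arbitrary assumptions for illustration.

```python
import numpy as np

# Sketch of a basic influence map: each unit spreads influence that decays
# with distance; several maps are then combined with weights (as in IMTrees).
def influence_map(shape, units, decay=0.2):
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    total = np.zeros(shape)
    for (uy, ux, strength) in units:
        dist = np.sqrt((ys - uy) ** 2 + (xs - ux) ** 2)
        total += strength * np.exp(-decay * dist)
    return total

friendly = influence_map((64, 64), [(10, 10, 5.0), (12, 14, 3.0)])
enemy    = influence_map((64, 64), [(50, 40, 6.0)])

# A simple weighted combination: positive where we are strong, negative where
# the enemy is strong. A tactical module could, e.g., retreat units away from
# cells with very negative values.
combined = 1.0 * friendly - 1.5 * enemy
print(np.unravel_index(np.argmax(combined), combined.shape))  # safest cell
```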

C. Reactive Control

  • Reactive control aims at maximizing the effectiveness of units, including simultaneous control of units of different types in complex battles on heterogeneous terrain.
  • Danielsiek et al. [47] used influence maps to achieve intelligent squad movement to flank the opponent in a RTS game.
  • A drawback of potential field-based techniques is the large number of parameters that have to be tuned in order to achieve the desired behavior.
  • Additionally, there have been some interesting uses of reinforcement learning (RL) [51]: Wender and Watson [52] evaluated the major RL algorithms for micro-management, all of which performed comparably.
  • Even if specialized algorithms such as D*-Lite [60] exist, it is most common to use A* combined with a map simplification technique that generates a simpler navigation graph to be used for pathfinding (a sketch of this follows below).
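
The A*-plus-simplification approach mentioned in the last item can be sketched as follows: the fine walk-tile grid is collapsed into coarse cells to obtain a much smaller navigation graph, and A* is run on that graph. The grid representation and the 4-tiles-per-cell granularity are assumptions made only for this illustration.

```python
import heapq

# Sketch: A* over a simplified navigation graph. The fine walk-tile grid is
# collapsed into coarse cells (here 4x4 walk tiles per cell); a cell is
# walkable only if all of its tiles are walkable. Granularity is arbitrary.
def simplify(grid, factor=4):
    h, w = len(grid) // factor, len(grid[0]) // factor
    return [[all(grid[y * factor + dy][x * factor + dx]
                 for dy in range(factor) for dx in range(factor))
             for x in range(w)] for y in range(h)]

def astar(walkable, start, goal):
    def h(n):  # Manhattan-distance heuristic
        return abs(n[0] - goal[0]) + abs(n[1] - goal[1])
    frontier, came, cost = [(h(start), start)], {start: None}, {start: 0}
    while frontier:
        _, cur = heapq.heappop(frontier)
        if cur == goal:
            path = []
            while cur is not None:
                path.append(cur)
                cur = came[cur]
            return path[::-1]
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = cur[0] + dy, cur[1] + dx
            if 0 <= ny < len(walkable) and 0 <= nx < len(walkable[0]) and walkable[ny][nx]:
                new_cost = cost[cur] + 1
                if (ny, nx) not in cost or new_cost < cost[(ny, nx)]:
                    cost[(ny, nx)], came[(ny, nx)] = new_cost, cur
                    heapq.heappush(frontier, (new_cost + h((ny, nx)), (ny, nx)))
    return None

fine_grid = [[True] * 32 for _ in range(32)]   # toy 32x32 walk-tile map
coarse = simplify(fine_grid)                   # 8x8 navigation graph
print(astar(coarse, (0, 0), (7, 7)))           # path expressed in coarse cells
```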

D. Holistic Approaches

  • Holistic approaches to address RTS AI attempt to address the whole problem using a single unified method.
  • To the best of their knowledge, with a few exceptions, such as the Darmok system [65] (which uses a combination of case-based reasoning and learning from demonstration) or ALisp [53], there has not been much work in this direction.
  • The main reason is that the complexity of RTS games is too large, and approaches that decompose the problem into smaller, separate, problems, achieve better results in practice.
  • A related problem is that of integrating reasoning at multiple levels of abstraction.
  • Reactive planning [24], a decompositional planning approach similar to hierarchical task networks [11], allows plans to be changed at different granularity levels, and thus supports integrating multi-scale goals with low-level control.

IV. STATE OF THE ART BOTS FOR STARCRAFT

  • Thanks to the recent organization of international game AI competitions focused around the popular StarCraft game (see Section V), several groups have been working on integrating many of the techniques described in the previous section into complete “bots”, capable of playing complete StarCraft games.
  • Looking back at Figure 3, the authors can see the following use of abstraction and divide-and-conquer in the bots: BroodwarBotQ uses abstraction for combat, and divide-and-conquer for economy and intelligence gathering.
  • But it also uses it in economy: as can be seen, the production manager sends commands to the building manager, which is in charge of producing the buildings (a simplified skeleton of such a manager hierarchy follows after this list).
  • The authors can see a high level module that issues commands to a series of tactics modules.
  • One interesting aspect of the seven bots described above is that, while all of them (except AIUR) are reactive at the lower level (reactive control), most, if not all, of them are scripted at the highest level of abstraction.
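
As a rough illustration of the manager hierarchies used by these bots (a strategy module issuing commands to a production manager, which in turn issues commands to a building manager), the skeleton below shows one possible decomposition. The module names, thresholds and responsibilities are simplified assumptions and do not reproduce the actual code of BroodwarBotQ or any other participant.

```python
# Skeleton of a divide-and-conquer bot architecture: a strategy module issues
# abstract goals, which a production manager turns into build requests, which a
# building manager turns into concrete placement commands. Illustrative only.
class BuildingManager:
    def build(self, building):
        print(f"placing {building} and assigning a worker to construct it")

class ProductionManager:
    def __init__(self, building_manager):
        self.building_manager = building_manager
        self.queue = []

    def request(self, item):
        self.queue.append(item)

    def on_frame(self, minerals):
        # Greedy sketch: start whatever is affordable, in request order.
        if self.queue and minerals >= 150:
            self.building_manager.build(self.queue.pop(0))

class StrategyManager:
    def on_frame(self, game_state, production):
        if game_state["supply_used"] >= game_state["supply_max"] - 2:
            production.request("Supply Depot")

# One game frame of the pipeline:
building = BuildingManager()
production = ProductionManager(building)
strategy = StrategyManager()
strategy.on_frame({"supply_used": 18, "supply_max": 19}, production)
production.on_frame(minerals=200)
```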

V. RECENT STARCRAFT AI COMPETITIONS

  • This section reviews the results of the recent international competitions on AI for StarCraft.
  • These competitions, typically co-located with scientific conferences, have been possible thanks to the existence of the Brood War Application Programming Interface (http://code.google.com/p/bwapi/), which enables replacing the human player interface with C++ code.
  • The following subsections summarize the results of all the StarCraft AI competitions held at the AIIDE (Artificial Intelligence for Interactive Digital Entertainment) and CIG (Computational Intelligence in Games) conferences during the past years.
  • Additionally the authors analyze the statistics from the StarCraft Bot Ladder, where the best bots play against each other continuously over time.

A. AIIDE

  • Started in 2010, the AIIDE StarCraft AI Competition is the most well known and longest running StarCraft AI competition in the world.
  • Tournament 4 was the complete game of StarCraft: Brood War with fog-of-war enforced.
  • After the competition, many bot programmers (including the Overmind team) realized that their 2010 strategy was quite easily defeated by early game rushing strategies, and so they submitted a Terran bot instead, called Undermind, which finished in 7th.
  • Skynet’s initial attack destroyed Oriol’s early defenses and nearly won the game in the first few minutes; however, it then proceeded to send zealots to attack one at a time rather than grouping its units before moving in, which allowed Oriol to catch up.
  • The effect of this strategy selection process can be seen in Figure 4, which shows bot win percentages over time.

B. CIG

  • An initial attempt to run a StarCraft tournament at the Computational Intelligence in Games conference (CIG 2010) suffered from technical problems.
  • Xelnaga is a modification of the AIUR bot that chooses the Dark Templar opening in order to destroy the enemy base before defenses against invisible units are available.
  • In each bracket of 5 bots, a round-robin tournament was held with 10 repetitions per pairing, resulting in 40 games per bot.
  • The 5 maps chosen for the first round were selected from the pool of well-known league play maps found on the Internet: (2)MatchPoint 1.3, (4)Fighting Spirit 1.3, iCCup Destination 1.1, iCCup Gaia, and iCCup Great Barrier Reef.
  • It also seems that most of the current bots are not very good at adapting their strategy to that of their opponent during a game, or even (via reading and writing stored game information) across a series of games.

C. StarCraft Bot Ladder

  • The StarCraft Bot Ladder is a website where bot versus bot matches are automated and run all the time.
  • In the Elo system, each player has a numerical rating that is increased or decreased after each game depending on the result (a minimal update sketch follows after this list).
  • Figure 7 shows the Elo ratings of the bots in the ladder during the first half of 2012.
  • The authors note that these effects appear even more markedly in Figure 8, which covers the second half of 2012.
  • Both KillerBot and Ximp use hard-coded strategies without any kind of adaptation capabilities.
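
The Elo update used by such a ladder can be written in a few lines. The K-factor and the example ratings below are generic defaults chosen for illustration; the exact parameters used by the StarCraft Bot Ladder are not specified here.

```python
# Sketch of an Elo rating update after one game.
# K-factor and example ratings are generic assumptions, not the ladder's settings.
def expected_score(rating_a, rating_b):
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def elo_update(rating_a, rating_b, score_a, k=32):
    """score_a is 1.0 if bot A won, 0.0 if it lost, 0.5 for a draw."""
    ea = expected_score(rating_a, rating_b)
    change = k * (score_a - ea)
    return rating_a + change, rating_b - change

# Example: a 1600-rated bot beats a 2000-rated bot and gains most of the K points.
print(elo_update(1600, 2000, 1.0))   # roughly (1629, 1971)
```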

D. Adaptation Analysis

  • One of the major conclusions from the results of the StarCraft competitions is the lack of adaptation of bots.
  • The authors analyzed the replays from the 2011 and 2012 AIIDE competitions (shown in Figures 9 and 10 respectively).
  • Speedzeal: train Zealots and research attack and speed upgrades as soon as possible, providing some early-game power and transitioning into late-game tech.
  • Corsair: produce Corsairs (air-to-air flying units) as soon as possible and then transition into training Dark Templars (safe from Zerg’s flying detectors thanks to the Corsairs), Reavers (ground artillery units) or Dragoons.
  • As the figures show, the top three ranked bots in the competition (Skynet, Aiur and UalbertaBot) do not change their strategy at all depending on their opponent.

VI. OPEN QUESTIONS IN RTS GAME AI

  • As illustrated in this paper, there is a set of problems in RTS game AI that could be considered mostly solved, or for which the authors have very good solutions.
  • Examples of such problems are pathfinding (mostly solved) and small-scale micro-management (for which good solutions exist).
  • There has been some work in this direction [65], but it is very far from being mature.
  • – Adversarial planning with resources: similarly, even if there exist planning algorithms that handle resources (like GRT-R [73]), they cannot scale up to the size of problems needed for RTS games like StarCraft.
  • The authors know how to incorporate some aspects of domain knowledge (e.g. build orders) into RTS game playing agents.

VII. CONCLUSIONS

  • As the list in the previous section indicates, real-time strategy games are an excellent testbed for AI techniques, posing a very large list of open problems.
  • One of the main goals of this paper is to provide a centralized and unified overview to the research being done in the area of RTS game AI.
  • Additionally, the authors have presented an analysis of the results of the different StarCraft AI competitions, highlighting strengths and weaknesses of each of the bots.
  • Optimizing assembly line operations in factories is akin to performing build-order optimizations.


HAL Id: hal-00871001
https://hal.archives-ouvertes.fr/hal-00871001
Submitted on 8 Oct 2013
A Survey of Real-Time Strategy Game AI Research and
Competition in StarCraft
Santiago Ontañón, Gabriel Synnaeve, Alberto Uriarte, Florian Richoux, David Churchill, Mike Preuss

To cite this version:
Santiago Ontañón, Gabriel Synnaeve, Alberto Uriarte, Florian Richoux, David Churchill, et al. A Survey of Real-Time Strategy Game AI Research and Competition in StarCraft. IEEE Transactions on Computational Intelligence and AI in Games, IEEE Computational Intelligence Society, 2013, 5 (4), pp. 1-19. ⟨hal-00871001⟩

A Survey of Real-Time Strategy Game AI Research and Competition in StarCraft
Santiago Ontañón, Gabriel Synnaeve, Alberto Uriarte, Florian Richoux, David Churchill, Mike Preuss
Abstract—This paper presents an overview of the existing work
on AI for real-time strategy (RTS) games. Specifically, we focus
on the work around the game StarCraft, which has emerged in
the past few years as the unified test-bed for this research. We
describe the specific AI challenges posed by RTS games, and
overview the solutions that have been explored to address them.
Additionally, we also present a summary of the results of the
recent StarCraft AI competitions, describing the architectures
used by the participants. Finally, we conclude with a discussion
emphasizing which problems in the context of RTS game AI have
been solved, and which remain open.
Index Terms—Game AI, Real-Time Strategy, StarCraft, Review
I. INTRODUCTION
THE field of real-time strategy (RTS) game AI has advanced significantly since Michael Buro’s call for research in this area [1]. In particular, competitions like the “ORTS RTS Game AI Competition” (held from 2006 to 2009), the “AIIDE StarCraft AI Competition” (held since 2010), and
the “CIG StarCraft RTS AI Competition” (held since 2011)
have motivated the exploration of many AI approaches in the
context of RTS AI. We will list and classify these approaches,
explain their strengths and weaknesses and conclude on what
is left to achieve human-level RTS AI.
Complex dynamic environments, where neither perfect nor
complete information about the current state or about the
dynamics of the environment are available, pose significant
challenges for artificial intelligence. Road traffic, finance, or
weather forecasts are examples of such large, complex, real-
life dynamic environments. RTS games can be seen as a
simplification of one such real-life environment, with simpler
dynamics in a finite and smaller world, although still complex
enough to study some of the key interesting problems like
decision making under uncertainty or real-time adversarial
planning. Finding efficient techniques for tackling these prob-
lems on RTS games can thus benefit other AI disciplines
and application domains, and also have concrete and direct
applications in the ever growing industry of video games.
Santiago Ontañón is with the Computer Science Department at Drexel University, Philadelphia, PA, USA.
Gabriel Synnaeve is with the Laboratory of Cognitive Science and Psycholinguistics (LSCP) of ENS Ulm in Paris, France.
Alberto Uriarte is with the Computer Science Department at Drexel University, Philadelphia, PA, USA.
Florian Richoux is with the Nantes Atlantic Computer Science Laboratory (LINA) of the Université de Nantes, France.
David Churchill is with the Computing Science Department of the University of Alberta, Edmonton, Canada.
Mike Preuss is with the Department of Computer Science of Technische Universität Dortmund, Germany.
This paper aims to provide a one-stop guide on what is
the state of the art in RTS AI, with a particular emphasis
on the work done in StarCraft. It is organized as follows:
Section II introduces RTS games, in particular the game
StarCraft, and their main AI challenges. Section III reviews
the existing work on tackling these challenges in RTS games.
Section IV analyzes several current state of the art RTS game
playing agents (called bots), selected from the participants to
annual StarCraft AI competitions. Section V presents results
of the recent annual competitions held at the AIIDE and CIG
conferences and a StarCraft bot game ladder¹. Section VI
compiles open questions in RTS game AI. Finally, the paper
concludes on discussions and perspectives.
II. REAL-TIME STRATEGY GAMES
Real-time Strategy (RTS) is a sub-genre of strategy games
where players need to build an economy (gathering resources
and building a base) and military power (training units and
researching technologies) in order to defeat their opponents
(destroying their army and base). From a theoretical point of
view, the main differences between RTS games and traditional
board games such as Chess are:
They are simultaneous move games, where more than one
player can issue actions at the same time. Additionally,
these actions are durative, i.e. actions are not instanta-
neous, but take some amount of time to complete.
RTS games are “real-time”, which actually means that each player has a very small amount of time to decide the
next move. Compared to Chess, where players may have
several minutes to decide the next action, in StarCraft, the
game executes at 24 frames per second, which means that
players can act as fast as every 42ms, before the game
state changes.
Most RTS games are partially observable: players can
only see the part of the map that has been explored. This
is referred to as the fog-of-war.
Most RTS games are non-deterministic. Some actions
have a chance of success.
And finally, the complexity of these games, both in
terms of state space size and in terms of number of
actions available at each decision cycle is very large. For
example, the state space of Chess is typically estimated to be around 10^50, heads up no-limit Texas holdem poker around 10^80, and Go around 10^170. In comparison, the state space of StarCraft in a typical map is estimated to

¹An extended tournament, which can potentially go on indefinitely.

be many orders of magnitude larger than any of those, as
discussed in the next section.
For those reasons, standard techniques used for playing
classic board games, such as game tree search, cannot be
directly applied to solve RTS games without the definition
of some level of abstraction, or some other simplification.
Interestingly enough, humans seem to be able to deal with
the complexity of RTS games, and are still vastly superior to
computers in these types of games [2]. For those reasons, a
large spectrum of techniques have been attempted to deal with
this domain, as we will describe below. The remainder of this
section is devoted to describing StarCraft as a research testbed, and to detailing the open challenges in RTS game AI.
A. StarCraft
StarCraft: Brood War is an immensely popular RTS game
released in 1998 by Blizzard Entertainment. StarCraft is set in
a science-fiction based universe where the player must choose
one of the three races: Terran, Protoss or Zerg. One of the
most remarkable aspects of StarCraft is that the three races
are extremely well balanced:
Terrans provide units that are versatile and flexible giving
a balanced option between Protoss and Zergs.
Protoss units have lengthy and expensive manufacturing
processes, but they are strong and resistant. These con-
ditions make players follow a strategy of quality over
quantity.
Zergs, the insectoid race, have units that are cheap and weak. They
can be produced fast, encouraging players to overwhelm
their opponents with sheer numbers.
Figure 1 shows a screenshot of StarCraft showing a player
playing the Terran race. In order to win a StarCraft game,
players must first gather resources (minerals and Vespene gas).
As resources become available, players need to allocate them
for creating more buildings (which reinforce the economy, and
allow players to create units or unlock stronger units), research
new technologies (in order to use new unit abilities or improve
the units) and train attack units. Units must be distributed to
accomplish different tasks such as reconnaissance, defense and
attack. While performing all of those tasks, players also need
to strategically understand the geometry of the map at hand,
in order to decide where to place new buildings (concentrate
in a single area, or expand to different areas) or where to
set defensive outposts. Finally, when offensive units of two
players meet, each player must quickly maneuver each of
the units in order to fight a battle, which requires quick and
reactive control of each of the units.
A typical StarCraft map is defined as a rectangular grid,
where the width × height of the map is measured in the
number of 32 × 32 squares of pixels, also known as build
tiles. However, the resolution of walkable areas is in squares of
8 × 8 pixels, also known as walk tiles. The typical dimensions
for maps range from 64 × 64 to 256 × 256 build tiles. Each
player can control up to 200 units (plus an unlimited number
of buildings). Moreover, each different race contains between
30 to 35 different types of units and buildings, most of them
with a significant number of special abilities.

Fig. 1. A screenshot of StarCraft: Brood War.

All these factors together make StarCraft a significant challenge, in which
humans are still much better than computers. For instance, in the game ladder iCCup², where users are ranked by their current point totals (E being the lowest possible rank, and A+ and Olympic being the second highest and highest ranks, respectively), the best StarCraft AI bots are ranked between D and D+, while average amateur players are ranked between C+ and B. For comparison, StarCraft professional players are usually ranked between A and A+.
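
Returning to the map representation described above, the relation between pixels, walk tiles (8 × 8 pixels) and build tiles (32 × 32 pixels) is a fixed-ratio conversion. The small helper below illustrates it; the coordinate conventions are assumptions made only for this sketch.

```python
# Sketch of coordinate conversions between the three StarCraft map resolutions
# described above: pixels, walk tiles (8x8 pixels) and build tiles (32x32 pixels).
WALK_TILE = 8     # pixels per walk tile
BUILD_TILE = 32   # pixels per build tile

def pixel_to_walk(px, py):
    return px // WALK_TILE, py // WALK_TILE

def pixel_to_build(px, py):
    return px // BUILD_TILE, py // BUILD_TILE

def build_to_walk(bx, by):
    # each build tile spans 4x4 walk tiles
    return bx * (BUILD_TILE // WALK_TILE), by * (BUILD_TILE // WALK_TILE)

# A 128x128 build-tile map therefore has 512x512 walk tiles and 4096x4096 pixels.
print(pixel_to_walk(100, 250))    # (12, 31)
print(pixel_to_build(100, 250))   # (3, 7)
print(build_to_walk(3, 7))        # (12, 28)
```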
From a theoretical point of view, the state space of a
StarCraft game for a given map is enormous. For example,
consider a 128 × 128 map. At any given moment there might
be between 50 to 400 units in the map, each of which might
have a complex internal state (remaining energy and hit-
points, action being executed, etc.). This quickly leads to
an immense number of possible states (way beyond the size
of smaller games, such as Chess or Go). For example, just
considering the location of each unit (with 128 × 128 possible
positions per unit), and 400 units, gives us an initial number
of 16384^400 ≈ 10^1685. If we add the other factors playing a
role in the game, we obtain even larger numbers.
Another way to measure the complexity of the game is by looking at the branching factor, b, and the depth of the game, d, as proposed in [3], with a total game complexity of b^d. In Chess, b ≈ 35 and d ≈ 80. In more complex games, like Go, b ≈ 30 to 300, and d ≈ 150 to 200. In order to determine the branching factor in StarCraft when an AI plays it, we must have in mind that the AI can issue actions simultaneously to as many units in the game as desired. Thus, considering that, in a typical game, a player controls between 50 to 200 units, the branching factor would be between u^50 and u^200, where u is the average number of actions each unit can execute. Estimating the value of u is not easy, since the number of actions a unit can execute is highly dependent on the context. Let us make the following assumptions: 1) at most 16 enemy units will be in range of a friendly unit (larger values are possible, but unlikely), 2) when an AI plays StarCraft, it only makes sense to consider movement in the 8 cardinal directions per unit (instead of assuming that the player can issue a “move” command to anywhere in the map at any point in time), 3) for “build” actions, we consider that SCVs (Terran worker units) only build in their current location (otherwise, if they need to move, we consider that as first issuing a “move” action, and then a “build”), and 4) let’s consider only the Terran race. With those assumptions, units in StarCraft can execute between 1 (units like “Supply Depots”, whose only action is to be “idle”) and 43 actions (Terran “Ghosts”), with typical values around 20 to 30. Now, if we have in mind that actions have cool-down times, and thus not all units can execute all of the actions at every frame, we can take a conservative estimation of about 10 possible actions per unit per game frame. This results in a conservative estimate for the branching factor between b ∈ [10^50, 10^200], only considering units (ignoring the actions buildings can execute). Now, to compute d, we simply consider the fact that typical games last for about 25 minutes, which results in d ≈ 36000 (25 minutes × 60 seconds × 24 frames per second).

²http://www.iccup.com/StarCraft/
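
The back-of-the-envelope estimates above are easy to reproduce; the short script below recomputes them directly from the assumptions stated in the text (it adds no new measurements).

```python
import math

# Reproducing the paper's back-of-the-envelope complexity estimates.
# State-space lower bound: 400 units, each at one of 128*128 positions.
positions = 128 * 128                       # 16384 possible positions per unit
units = 400
state_space_digits = units * math.log10(positions)
print(int(state_space_digits))              # 1685, i.e. 16384^400 ~= 10^1685

# Branching factor: ~10 feasible actions per unit, 50 to 200 units under control.
for controlled in (50, 200):
    print(f"b ~= 10^{controlled} for {controlled} units")

# Depth: a 25-minute game at 24 frames per second.
d = 25 * 60 * 24
print(d)                                    # 36000 decision cycles
```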
B. Challenges in RTS Game AI
Early research in AI for RTS games [1] identified the
following six challenges:
Resource management
Decision making under uncertainty
Spatial and temporal reasoning
Collaboration (between multiple AIs)
Opponent modeling and learning
Adversarial real-time planning
While there has been significant work on many of these challenges, others have remained almost untouched (e.g. collaboration). Moreover, recent
research in this area has identified several additional research
challenges, such as how to exploit the massive amounts of
existing domain knowledge (strategies, build-orders, replays,
and so on). Below, we describe current challenges in RTS
Game AI, grouped in six main different areas.
1) Planning: As mentioned above, the size of the state
space in RTS games is much larger than that of traditional
board games such as Chess or Go. Additionally, the number
of actions that can be executed at a given instant of time is also
much larger. Thus, standard adversarial planning approaches,
such as game tree search are not directly applicable. As we
elaborate later, planning in RTS games can be seen as having
multiple levels of abstraction: at a higher level, players need
long-term planning capabilities, in order to develop a strong
economy in the game; at a low level, individual units need to
be moved in coordination to fight battles taking into account
the terrain and the opponent. Techniques that can address these
large planning problems by either sampling, or hierarchical
decomposition do not yet exist.
2) Learning: Given the difficulties in playing RTS games
by directly using adversarial planning techniques, many re-
search groups have turned attention to learning techniques.
We can distinguish three types of learning problems in RTS
games:
Prior learning: How can we exploit available data, such
as existing replays, or information about specific maps for
learning appropriate strategies beforehand? A significant
amount of work has gone in this direction.
In-game learning: How can bots deploy online learning
techniques that allow them to improve their game play
while playing a game? These techniques might include
reinforcement learning techniques, but also opponent
modeling. The main problem again is the fact that the
state space is too large and the fact that RTS games are
partially observable.
Inter-game learning: What can be learned from one game
that can be used to increase the chances of victory in the
next game? Some work has used simple game-theoretical
solutions to select amongst a pool of predefined strategies,
but the general problem remains unsolved.
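
A concrete instance of such a game-theoretical selection among predefined strategies is to treat the strategy pool as a multi-armed bandit and choose with UCB1 across a series of games. The sketch below is a generic illustration of that idea; the strategy names and win rates are made up, and it is not the code of any particular bot.

```python
import math
import random

# Sketch: selecting among a pool of predefined strategies across a series of
# games with UCB1. Strategy names and win probabilities are invented.
class UCB1StrategySelector:
    def __init__(self, strategies):
        self.strategies = strategies
        self.plays = {s: 0 for s in strategies}
        self.wins = {s: 0 for s in strategies}
        self.total = 0

    def select(self):
        # Play every strategy once before applying the UCB1 formula.
        for s in self.strategies:
            if self.plays[s] == 0:
                return s
        def ucb(s):
            mean = self.wins[s] / self.plays[s]
            return mean + math.sqrt(2 * math.log(self.total) / self.plays[s])
        return max(self.strategies, key=ucb)

    def record(self, strategy, won):
        self.plays[strategy] += 1
        self.wins[strategy] += int(won)
        self.total += 1

selector = UCB1StrategySelector(["rush", "expand", "tech"])
true_win_rate = {"rush": 0.3, "expand": 0.6, "tech": 0.5}   # hidden from the bot
for _ in range(200):                                        # 200 games vs. one opponent
    s = selector.select()
    selector.record(s, random.random() < true_win_rate[s])
print(selector.plays)   # "expand" should accumulate the most plays
```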
3) Uncertainty: Adversarial planning under uncertainty in
domains of the size of RTS games is still an unsolved chal-
lenge. In RTS games, there are two main kinds of uncertainty.
First, the game is partially observable: players cannot observe the whole game map (as they can in Chess), but need to scout in order to see what the opponent is doing. This type of
uncertainty can be lowered by good scouting, and knowledge
representation (to infer what is possible given what has been
seen). Second, there is also uncertainty arising from the fact
that the games are adversarial, and a player cannot predict
the actions that the opponent(s) will execute. For this type
of uncertainty, the AI, like a human player, can only build a sensible model of what the opponent is likely to do.
4) Spatial and Temporal Reasoning: Spatial reasoning is
related to each aspect of terrain exploitation. It is involved
in tasks such as building placement or base expansion. In
the former, the player needs to carefully consider building
positioning into its own bases to both protect them by creating
a wall against invasions and to avoid bad configurations where
large units could be stuck. In base expansion, the player has to
choose good available locations to build a new base, regarding
its own position and opponent’s bases. Finally, spatial reason-
ing is key to tactical reasoning: players need to decide where
to place units for battle, favoring, for instance, engagements
when the opponent’s units are led into a bottleneck.
Another example of spatial reasoning in StarCraft is that
it is always an advantage to have own units on high ground
while the enemy is on low ground, since units on low ground
have no vision onto the high ground.
Analogously, temporal reasoning is key in tactical or strate-
gic reasoning. For example, timing attacks and retreats to gain
an advantage. At a higher strategic level, players need to rea-
son about when to perform long-term impact economic actions
such as upgrades, building construction, strategy switching,
etc. all taking into account that the effects of these actions are
not immediate, but longer term.
5) Domain Knowledge Exploitation: In traditional board
games such as Chess, researchers have exploited the large
amounts of existing domain knowledge to create good evalu-
ation functions to be used by alpha-beta search algorithms,
extensive opening books, or end-game tables. In the case
of RTS games, it is still unclear how the significantly large
amount of domain knowledge (in the form of strategy guides,
replays, etc.) can be exploited by bots. Most work in this
area has focused on two main directions: on the one hand,
researchers are finding ways in which to hard-code existing

TCIAIG VOL. X, NO. Y, MONTH YEAR 4
strategies into bots, so that bots only need to decide which
strategies to deploy, instead of having to solve the complete
problem of deciding which actions to execute by each individ-
ual unit at each time step. On the other hand, large datasets of replays have been created [4], [5], from which researchers have tried to learn strategies, trends, or plans. However, StarCraft
games are quite complex, and how to automatically learn from
such datasets is still an open problem.
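
As a rough illustration of this replay-mining direction, the sketch below reduces each replay to the times at which key buildings first appear and trains an off-the-shelf classifier to predict a strategy label, in the spirit of the supervised approaches cited above. The features, labels, replay data and the use of scikit-learn are assumptions made purely for illustration.

```python
from sklearn.tree import DecisionTreeClassifier

# Sketch of supervised strategy prediction from labeled replays. Each replay is
# reduced to the game time (in seconds) at which key buildings first appear;
# 0 means "never built". Replays, features and labels are fabricated examples.
FEATURES = ["gateway", "cybernetics_core", "stargate", "robotics_facility"]

replays = [
    ({"gateway": 80,  "cybernetics_core": 150, "stargate": 0,   "robotics_facility": 0},   "zealot_rush"),
    ({"gateway": 110, "cybernetics_core": 170, "stargate": 300, "robotics_facility": 0},   "corsair"),
    ({"gateway": 120, "cybernetics_core": 180, "stargate": 0,   "robotics_facility": 320}, "reaver_drop"),
    ({"gateway": 85,  "cybernetics_core": 160, "stargate": 0,   "robotics_facility": 0},   "zealot_rush"),
]

X = [[r[f] for f in FEATURES] for r, _ in replays]
y = [label for _, label in replays]

model = DecisionTreeClassifier().fit(X, y)

# Predict the strategy of a new, partially observed game.
observed = {"gateway": 90, "cybernetics_core": 155, "stargate": 0, "robotics_facility": 0}
print(model.predict([[observed[f] for f in FEATURES]])[0])
```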
6) Task Decomposition: For all the previous reasons, most
existing approaches to playing games such as StarCraft work by
decomposing the problem of playing an RTS game into a
collection of smaller problems, to be solved independently.
Specifically, a common subdivision is:
Strategy: corresponds to the high-level decision making
process. This is the highest level of abstraction for the
game comprehension. Finding an efficient strategy or
counter-strategy against a given opponent is key in RTS
games. It concerns the whole set of units and buildings
a player owns.
Tactics: are the implementation of the current strategy.
It implies army and building positioning, movements,
timing, and so on. Tactics concerns a group of units.
Reactive control: is the implementation of tactics. This
consists in moving, targeting, firing, fleeing, hit-and-
run techniques (also known as “kiting”) during battle.
Reactive control focuses on a specific unit.
Terrain analysis: consists in the analysis of regions com-
posing the map: choke-points, minerals and gas emplace-
ments, low and high walkable grounds, islands, etc.
Intelligence gathering: corresponds to information col-
lected about the opponent. Because of the fog-of-war,
players must regularly send scouts to localize and spy
enemy bases.
In comparison, when humans play StarCraft, they typically
divide their decision making in a very different way. The
StarCraft community typically talks about two tasks:
Micro: is the ability to control units individually (roughly
corresponding to Reactive Control above, and part of
Tactics). A good micro player usually keeps their units
alive over a longer period of time.
Macro: is the ability to produce units and to expand
at the appropriate times to keep your production of
units flowing (roughly corresponding to everything but
Reactive Control and part of Tactics above). A good
macro player usually has the larger army.
The reader can find a good presentation of task decom-
position for AIs playing RTS in [6]. Although the previous
task decomposition is common, a significant challenge is on
designing architectures so that the individual AI techniques
that address each of those tasks can communicate and effec-
tively work together, resolving conflicts, prioritizing resources
between them, etc. Section IV provides an overview of the
task decompositions that state-of-the-art bots use. Moreover,
we would like to point out that the task decomposition above
is not the only possible approach. Some systems, such as
IMAI [7], divide gameplay into much smaller tasks, which are
then assigned resources depending on the expected benefits of
achieving each task.

Fig. 2. RTS AI levels of abstraction and their properties: uncertainty (coming from partial observation and from not knowing the intentions of the opponent) is higher for higher abstraction levels. The timings on the right of the figure (roughly 1 sec for reactive control, 30 sec for tactics, and 3 min for strategy) correspond to an estimate of the duration of a behavior switch in StarCraft. Spatial and temporal reasoning are indicated for the levels at which greedy solutions are not enough.
III. EXISTING WORK ON RTS GAME AI
Systems that play RTS games need to address most, if
not all, the aforementioned problems together. Therefore, it
is hard to classify existing work on RTS AI as addressing the
different problems above. For that reason, we will divide it
according to three levels of abstraction: strategy (which loosely
corresponds to “macro”), tactics and reactive control (which
loosely corresponds to “micro”).
Figure 2 graphically illustrates how strategy, tactics and
reactive control are three points in a continuum scale where
strategy corresponds to decision-making processes that affect
long spans of time (several minutes in the case of StarCraft),
reactive control corresponds to low-level second-by-second
decisions, and tactics sit in the middle. Also, strategic deci-
sions reason about the whole game at once, whereas tactical
or reactive control decisions are localized, and affect only
specific groups of units. Typically, strategic decisions constrain
future tactical decisions, which in turn condition reactive
control. Moreover, information gathered while performing
reactive control, can cause reconsideration of the tactics being
employed; which could trigger further strategic reasoning.
Following this idea, we consider strategy to be everything
related to the technology trees, build-order³, upgrades, and
army composition. It is the most deliberative level, as a player
selects and performs a strategy with future stances (aggressive,
defensive, economy, technology) and tactics in mind. We
consider tactics to be everything related to confrontations
between groups of units. Tactical reasoning involves both
spatial (exploiting the terrain) and temporal (army movements)
reasoning, constrained on the possible types of attacks by the
army composition of the player and their opponent. Finally,
reactive control describes how the player controls individual
units to maximize their efficiency in real-time. The main
difference between tactics and reactive control is that tactical
³The build-order is the specific sequence in which buildings of different
types will be constructed at the beginning of a game, and completely
determines the long-term strategy of a player.


