
Measuring Generalization Performance in Co-evolutionary
Learning
Siang Y. Chong, Peter Tiño, and Xin Yao
The Centre of Excellence for Research in
Computational Intelligence and Applications (CERCIA)
School of Computer Science
The University of Birmingham, UK.
E-mails: S.Y.Chong@cs.bham.ac.uk, P.Tino@cs.bham.ac.uk, X.Yao@cs.bham.ac.uk
Abstract
Co-evolutionary learning involves a training process where training samples are instances of solutions that
interact strategically to guide the evolutionary (learning) process. One main research issue is with the
generalization performance, i.e., the search for solutions (e.g., input-output mappings) that best predict
the required output for any new input that has not been seen during the evolutionary process. However,
there is currently no such framework for determining the generalization performance in co-evolutionary
learning even though the notion of generalization is well-understood in machine learning. In this paper,
we introduce a theoretical framework to address this research issue. We present the framework in terms
of game-playing although our results are more general. Here, a strategy's generalization performance is
its average performance against all test strategies. Given that the true value may not be determined by
analytically solving a closed-form formula and that computing it exhaustively is prohibitive, we propose an estimation
procedure that computes the average performance against a small sample of random test strategies
instead. We perform a mathematical analysis to provide a statistical claim on the accuracy of our
estimation procedure, which can be further improved by performing a second estimation on the variance of
the random variable. For game-playing, it is well-known that one is more interested in the generalization
performance against a biased and diverse sample of “good” test strategies. We introduce a simple
approach to obtain such a test sample through the multiple partial enumerative search of the strategy
space that does not require human expertise and is generally applicable to a wide range of domains. We
illustrate the generalization framework on the co-evolutionary learning of the iterated prisoner's dilemma
(IPD) games. We investigate two definitions of generalization performance for the IPD game based on
different performance criteria, e.g., in terms of the number of wins based on individual outcomes and
in terms of average payoff. We show that a small sample of test strategies can be used to estimate
the generalization performance. We also show that the generalization performance using a biased and
diverse set of “good” test strategies is lower compared to the unbiased case for the IPD game. This
is the first time that generalization is defined and analyzed rigorously in co-evolutionary learning. The
framework allows the evaluation of the generalization performance of any co-evolutionary learning system
quantitatively.
Keywords: Evolutionary computation, co-evolutionary learning, generalization, Chebyshev's inequality, iterated prisoner's dilemma.
1 Introduction
Co-evolutionary learning refers to a broad class of population-based, stochastic search algorithms that
involve the simultaneous evolution of competing solutions with coupled fitness [1]. A co-evolutionary
learning system can be implemented using co-evolutionary algorithms [2, 3], which can be derived from
evolutionary algorithms (EAs) [4, 5]. That is, both co-evolutionary learning and EAs can be described
in terms of the framework whereby an adaptation process is carried out on the solutions in some form
of representation through a repeated process of variation and selection. The framework distinguishes
co-evolutionary learning and EAs in general from classical approaches (e.g., steepest-descent-based algorithms)
through two specific features, i.e., they are population-based and incorporate information exchange
mechanisms between populations in successive generations (iterative steps) to guide the search process
[6].
Despite their similarity in framework, co-evolutionary learning and EAs are fundamentally different
in how the fitness of a solution is assigned, leading to significantly different outcomes when applied
to similar problems (e.g., different search behaviors on the space of solutions [7, 8]). EAs are often
viewed and constructed in terms of an optimization context [4, 5], whereby an absolute fitness function
is required to assign the fitness value to a solution that is objective (the fitness value for a solution is
always the same regardless of the context). For co-evolutionary learning, the fitness of a solution is
obtained through its interactions with other competing solutions in the current population, and as such,
is subjective (the fitness value for a solution depends on the context, e.g., the population, and as such is relative
and dynamic). Here, co-evolutionary learning operates to find solutions guided by strategic interactions
among the competing solutions from one generation to the next that result (hopefully) in an arms race
of increasingly innovative solutions [9, 10, 11].
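As an illustration only, the following minimal Python sketch (with a placeholder bit-string representation and a placeholder game, neither taken from any particular study) shows this shared loop of variation and selection, with subjective fitness obtained from interactions within the current population rather than from an absolute fitness function:

    import random

    POP_SIZE = 20
    GENERATIONS = 50

    def random_strategy():
        # Placeholder representation: a bit string of length 8.
        return [random.randint(0, 1) for _ in range(8)]

    def game(i, j):
        # Placeholder pairwise game outcome G_i(j).
        return sum(a & b for a, b in zip(i, j))

    def mutate(s):
        # Variation: flip one randomly chosen bit.
        t = list(s)
        k = random.randrange(len(t))
        t[k] ^= 1
        return t

    population = [random_strategy() for _ in range(POP_SIZE)]
    for gen in range(GENERATIONS):
        # Subjective fitness: total outcome against the rest of the population.
        fitness = [sum(game(i, j) for j in population if j is not i)
                   for i in population]
        # Selection: keep the better half (truncation selection).
        ranked = sorted(zip(fitness, population), key=lambda p: p[0], reverse=True)
        parents = [s for _, s in ranked[:POP_SIZE // 2]]
        # Variation: refill the population with mutated offspring of the parents.
        population = parents + [mutate(random.choice(parents))
                                for _ in range(POP_SIZE - len(parents))]

Note that the fitness of the same strategy changes from one generation to the next as the population changes, which is exactly the subjective, relative character described above.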
One of the early motivations for using co-evolutionary learning is its potential application for solving
problems that cannot be framed in the context of optimization (e.g., using EAs) because it is not possible
or very difficult to formulate an absolute fitness function that reflects the underlying properties of the
problem. For such problems, continued use of an inappropriate fitness function will often bias the search
to solutions that do not reflect the underlying properties of the problem, leading to suboptimal solutions
[12]. Even if a fitness function can be formulated, it may not be able to evaluate and differentiate
between individual solutions to provide some gradient to direct the search when using EAs [8, 13]. One
such problem that is difficult to solve using EAs, but can be naturally framed in co-evolutionary learning,
is the problem of game-playing [2, 3, 14, 15, 16, 17].
However, despite the success of co-evolutionary learning in solving the problem of games [2, 3, 14, 15, 16,
17] (and other problems in the context of optimization [18] and classification [19, 20]), the approach is
not trouble-free for all problems. In particular, co-evolutionary learning is now recognized to suffer from
problems (collectively called co-evolutionary pathologies) that affect the performance of a co-evolutionary
learning system [11, 12, 13, 21, 22, 23]. For example, overspecialization to a single solution can occur
in the population [21], which can be a result of earlier population disengagement, e.g., some solutions
are favored over others due to large gaps in competence level [13]. When intransitivity exists in the
relationship between solutions (as in rock-paper-scissors, where each strategy defeats one opponent and loses to another), cyclic dynamics may occur during co-evolution, whereby at some point
in the process, the population overspecializes to a solution that is vulnerable to another solution that
exploits it [11, 12]. Furthermore, when a solution is driven to extinction but at a later point is adaptively
found again, the co-evolution is said to exhibit forgetting [22, 23].
Co-evolutionary pathologies are usually attributed to the use of relative fitness in the selection process
of a co-evolutionary learning system [8, 24]. The design of the co-evolutionary learning system also has
broader implications for its performance, which depends on the co-evolutionary search dynamics
(i.e., representation [25], variation [21, 25, 26], and selection [27]). This necessitates a more in-depth
study of co-evolutionary learning search dynamics. In particular, one approach that has been given much
attention in the past is the monitoring of the progress of the co-evolution in its search for more innovative
and sophisticated solutions, e.g., arms race dynamics [28, 29, 30, 31].
Another approach is to consider a global view of increasing performance in co-evolutionary learning.
However, for this particular investigation, previous studies are restricted to simple problems or problems
where the global view is known in advance [8, 11]. Tools introduced for monitoring the progress of arms race
dynamics are inappropriate, and may not be suitably adapted, for the global-view analysis because they
only provide relative performance information of solutions between different generations. That is,
solutions in the current generation being better than those in previous generations does not necessarily
imply that they perform better globally when compared with new or all possible solutions.
In machine learning, there exists a powerful framework, generalization, that provides a global view of
performance for learning systems. Generalization refers to the ability of the learning system to find the
solution, which can be viewed in the context of input-output mappings, that best predicts the required
output for any new input that has not been seen during the training process. Within the context of
generalization, one is interested in how the learning system can realize the underlying properties of the
problem from a small sample of training data (e.g., the input-output set) to produce the solution. For
example, in the case of neural network training, one is not interested in learning the “exact representation
of the training data itself, but rather to build a statistical model of the process which generates the data”
(page 332 of [32]). Furthermore, it should be noted that learning is not necessarily the same as optimizing
because a solution with optimal fitness does not necessarily have the best generalization (unless
an accurate metric to measure generalization is used in the fitness function) [33].
Co-evolutionary learning, like other learning systems, also uses a small training sample during the evolutionary
(training) process to produce a solution to the problem. However, co-evolutionary learning is
different from other learning systems in that the training samples are not fixed, but instead, are instances
of solutions that are changing (evolving) and are interacting with each other strategically to guide the
evolutionary process (i.e., learning). Despite this difference, the generalization framework can be applied
to co-evolutionary learning systems to provide a global view of performance. Although the notion of
generalization is well-understood in machine learning, there is a lack of theoretical framework to justify
how the generalization performance of a co-evolutionary learning system is determined with respect to
the problem. For example, past studies such as [33, 34] have used a large sample of randomly obtained
test cases to estimate the generalization performance. It is not known how accurate such an estimation
is, i.e., how close the estimated generalization performance (using test samples randomly obtained from
the search space) is to the true generalization performance (using the entire search space).
In this paper, we introduce a theoretical framework that addresses the problem of determining the
generalization performance of co-evolutionary learning in general. We present the framework in terms
of game-playing, i.e., learning game strategies that generalize well to the game (e.g., defeat a large
number of strategies that exist). However, our theoretical framework is general, i.e., the problem can
be put in the context of test-based evaluations (e.g., comparisons between solutions) where some tests
can reflect the underlying properties (objectives) of the problem that are unknown (game-playing is one
such problem) [12]. We first define the generalization performance of a strategy as its average performance
against all test strategies. With this definition, it follows that the best generalization performance for a
co-evolutionary learning system is the one that produces evolved strategies with the maximum average
performance against all strategies.
Although this definition is simple, the generalization performance can be difficult to determine for
two reasons. First, the analytical function for game outcomes can be unknown, and as such, we
cannot determine the generalization performance by analytically solving a closed-form formula. Second,
the strategy space can be very large (although finite), thus making exhaustive computation prohibitive. To
address this problem, we propose the alternative of estimating the generalization performance by taking
the average performance of the evolved strategy against a sample of test strategies that are randomly
drawn from the strategy space.
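A minimal sketch of this estimation procedure (in Python; the callable names are illustrative assumptions): the estimate is simply the mean game outcome over a random test sample.

    def estimate_generalization(strategy, draw_test_strategy, game, n_tests=2000):
        # Monte Carlo estimate of the generalization performance G_i:
        # draw_test_strategy() returns a random opponent j drawn from the
        # strategy space, and game(i, j) returns the outcome G_i(j).
        total = 0.0
        for _ in range(n_tests):
            total += game(strategy, draw_test_strategy())
        return total / n_tests

The analysis below relates the sample size n_tests to the confidence one can claim for such an estimate.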
We show through a mathematical analysis using Chebyshev's Theorem that the probability that the
absolute difference between the estimated and true values exceeds a given error (precision value) is
bounded by a value that depends reciprocally only on the square of the error and the size of the random
test sample. However, this probability bound assumes the worst case of having maximum variance for the
distribution of the random variable over a bounded interval. In general, the true variance is smaller than
the maximum value. As such, we perform a mathematical analysis and show how a second estimation
of the variance can be used to obtain a tighter bound. In addition, we also show that for some games,
the true variance of a strategy's performance with respect to the strategy space is smaller, and as such,
requires a smaller sample size to make the same statistical claim.
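In symbols (our own notation for this sketch: $\hat{G}_i$ denotes the estimate computed from $N$ random test strategies, $\epsilon$ the error, and $R$ the width of the interval on which game outcomes are bounded), Chebyshev's inequality applied to the sample mean gives

\[
\Pr\left( \left| \hat{G}_i - G_i \right| \geq \epsilon \right)
\;\leq\; \frac{\sigma^2}{N \epsilon^2}
\;\leq\; \frac{\sigma^2_{\mathrm{MAX}}}{N \epsilon^2}
\;=\; \frac{R^2}{4 N \epsilon^2},
\]

since the worst case for a random variable bounded on an interval of width $R$ places half of the probability mass at each endpoint, so that $\sigma^2_{\mathrm{MAX}} = R^2/4$. The second estimation replaces $\sigma^2_{\mathrm{MAX}}$ with an estimate of the true variance $\sigma^2$, tightening the bound whenever $\sigma^2 < R^2/4$.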
With this framework, it is now shown that a sample of randomly obtained test strategies that is much
smaller in size compared with the total number of strategies in the space is sufficient to estimate the
generalization performance of co-evolutionary learning, and that this estimation is close enough to the
true value. We apply the framework to the co-evolutionary learning of the IPD games to illustrate
both the advantage of the framework and how it can be used in general. In particular, we investigate
two different definitions of generalization performance for the IPD game based on different performance
criteria, e.g., in terms of the number of wins based on individual outcomes and in terms of average payoff.
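The following minimal Python sketch illustrates how these two criteria could be computed for deterministic memory-one IPD strategies (the encoding, the round count, and the conventional payoff values used below are illustrative assumptions, not necessarily the exact experimental settings):

    # Conventional IPD payoffs (row player, column player), assumed here.
    PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
              ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

    def play_ipd(s1, s2, rounds=150):
        # A strategy is a dict: a first move and a deterministic memory-one
        # rule mapping (own previous move, opponent's previous move) to a move.
        m1, m2 = s1['first'], s2['first']
        p1 = p2 = 0
        for _ in range(rounds):
            r1, r2 = PAYOFF[(m1, m2)]
            p1, p2 = p1 + r1, p2 + r2
            m1, m2 = s1['rule'][(m1, m2)], s2['rule'][(m2, m1)]
        return p1 / rounds, p2 / rounds

    def outcome_by_wins(i, j):
        # G_i(j) based on individual outcomes: 1 for a win, 0 otherwise.
        pi, pj = play_ipd(i, j)
        return 1 if pi > pj else 0

    def outcome_by_payoff(i, j):
        # G_i(j) based on the average payoff per round to strategy i.
        return play_ipd(i, j)[0]

    # Example: tit-for-tat repeats the opponent's previous move.
    tit_for_tat = {'first': 'C',
                   'rule': {(m, o): o for m in 'CD' for o in 'CD'}}

Either outcome function can be plugged into the estimation procedure sketched earlier, giving the win-based and the payoff-based generalization measures, respectively.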
It is well-known that rather than considering the average performance for all cases, one may be more
interested in the average performance for specific cases that are more common or that arise naturally.
In the context of game-playing, one is more interested in the generalization performance of the co-evolutionary
learning system against “good” strategies, and not the average performance against all
strategies, since it is possible that a large proportion of strategies in the space are poor or mediocre. To
determine generalization performance against this biased sample of “good” test strategies, we introduce
a simple approach to obtain such a test sample through the multiple partial enumerative search of the
strategy space. Each partial enumerative search uses a population size that is much larger than the total
number of possible unique strategies that can be searched during co-evolution to produce a best strategy.
This approach does not require human expertise (as in generating some arbitrarily designed strategies)
and is more comprehensive than the previous single partial enumerative search that we introduced
earlier in [35, 36, 37]. We show that for the co-evolutionary learning of IPD games, the generalization
performance for the case of a biased and diverse sample is lower compared to the case of an unbiased
sample.
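A minimal Python sketch of one way such a procedure could work (the random subset selection and round-robin scoring here are our assumptions; outcome is any pairwise outcome function such as those sketched above, strategy_space is the finite set of unique strategies, and ps is chosen to exceed the number of unique strategies a co-evolutionary run can visit):

    import random

    def partial_enumerative_search(strategy_space, ps, outcome):
        # One partial enumerative search: round-robin a random subset of
        # size ps and return the strategy with the best total outcome.
        pool = random.sample(strategy_space, ps)
        totals = [sum(outcome(i, j) for j in pool if j is not i) for i in pool]
        return pool[max(range(ps), key=lambda k: totals[k])]

    def biased_test_sample(strategy_space, ps, n_searches, outcome):
        # Multiple partial enumerative searches: a biased and diverse sample
        # of "good" test strategies, with no human-designed opponents.
        return [partial_enumerative_search(strategy_space, ps, outcome)
                for _ in range(n_searches)]

Repeating the search over independently drawn subsets is what yields diversity among the "good" strategies retained.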
This paper is organized as follows. Section 2 presents our theoretical framework of the generalization
performance of co-evolutionary learning. Section 3 illustrates how the framework can be applied to the
IPD game. Section 4 investigates the application of the framework to estimate the generalization performance
of co-evolutionary learning of simple IPD games where the true generalization performance
can be determined. Section 5 presents an empirical study on estimating the generalization performance of
co-evolutionary learning of slightly more complex games where the true generalization performance cannot
be determined. The section also compares the results of estimates based on using an unbiased test
sample and those of using a biased and diverse sample of “good” test strategies. Section 6 discusses the
implications of the framework and concludes the paper.
2 Generalization Framework for Co-evolutionary Learning
2.1 A Need for a Consistent and General Approach to Estimate the Generalization Performance of Co-evolutionary Learning
Darwen and Yao [33, 34, 35] were among the first to explicitly investigate co-evolutionary learning through
the framework of generalization from machine learning (others include [38]). However, they [33, 34, 35]
studied the issue through an empirical approach only. In particular, they investigated the utility of
using some random sample of test cases to estimate the generalization performance of a co-evolutionary
learning system.
There are other studies that can be related to generalization in the context of co-evolutionary learning.
Wiegand and Potter [39] studied the notion of robustness (of individual components) in a cooperative
co-evolution setting. Ficici and Pollack [23] studied the notion of solution concepts, e.g., a partitioning
of the search space into solutions (that are wanted) and non-solutions for a problem, from measuring some
properties and establishing some criteria that the searched point is a solution. Bowling [40] studied the
notion of regret, i.e., a measure of the performance difference between a learning algorithm and the best static
strategy during training. Powers and Shoham [41] studied the estimation of best-response through the
use of random samples. Studies in [42, 43, 44] investigated formalisms of monotonic improvement in
co-evolutionary learning and developed algorithmic frameworks that guarantee monotonic improvements
of co-evolving solutions based on archives of test cases.
Here, our main motivation is to develop a framework for a rigorous quantitative analysis of performance
in co-evolutionary learning using the notion of generalization from machine learning. We are motivated to
address the need for a principled approach to estimate the generalization performance of co-evolutionary
learning. The framework aims to allow one to estimate the generalization performance for any problem
that co-evolutionary learning is used to solve, and at any point in the evolutionary process at which
generalization performance is measured.
There are two reasons why measuring the generalization performance of co-evolutionary learning is necessary
and important. First, it provides an absolute quality measure of how well a co-evolutionary
learning system is performing with respect to the problem, i.e., how well the co-evolutionary learning
generalizes. Second, it can be used as a means to compare the generalization performance of different
co-evolutionary learning systems with respect to the problem.
We first introduce a theoretical framework that defines explicitly the generalization performance for co-evolutionary
learning, and how it can be determined, i.e., measured. However, it is noted that obtaining
the true generalization performance for a co-evolutionary learning system is very difficult. As such,
through the theoretical framework, we provide the alternative of a consistent and general procedure to
estimate the generalization performance. We show a mathematical analysis of how the generalization
performance can be estimated using a random sample of test cases. We demonstrate the utility of the
estimation procedure by determining the statistical claim that one can make about how confident one is
in the accuracy of the estimate, compared to the true generalization performance, for a random test
sample of a given size.
2.2 Estimating Generalization Performance
In co-evolutionary learning, the quality of a solution is determined relative to other solutions. This is
achieved through comparisons, i.e., interactions between solutions. These interactions can be framed in
terms of game-playing, i.e., an interaction is a game played between two strategies (solutions). The game
outcome of a strategy i against the opponent strategy j is $G_i(j)$, and conversely, the game outcome of
strategy j against strategy i is denoted $G_j(i)$. Strategy i is said to solve the test provided by strategy j
if $G_i(j) \geq G_j(i)$.¹
Here, the absolute quality (generalization performance) of a strategy i is measured with respect to its
expected performance in solving tests provided by strategies j. The goal of co-evolutionary learning
for the problem of game-playing is to learn a strategy i with the best generalization performance. In
the following, we present a theoretical framework for estimating the generalization performance of co-evolutionary
learning. It should be noted that the framework is presented in the context of the complete
solution. As such, the framework is directly applicable for the estimation of the generalization performance
of a complete solution obtained either by a competitive or a cooperative co-evolutionary learning system.
2.2.1 True Generalization Performance
The generalization performance of co-evolutionary learning is determined with respect to the evolved
strategy that is produced. The true generalization performance of co-evolutionary learning is defined as
the expected performance of strategy i that is produced after a learning process (co-evolution) against
all strategies j in the strategy space S. The true generalization performance of strategy i, $G_i$, can be
written as follows:

\[
G_i = E_{P_1(j)}[G_i(j)] = \int_S G_i(j) \, P_1(j) \, dj, \qquad (1)
\]

where $G_i$ is the expectation of strategy i's performance against j, $G_i(j)$, with respect to the distribution
$P_1(j)$ over the strategy space S (i.e., the distribution with which opponent strategies j are drawn). Note that
this definition of generalization does not imply having to compare with all strategies that exist. Instead,
$P_1(j)$, which can be specified by the strategy representation (e.g., neural networks), can be induced on
the strategy space S. As such, some strategies j that exist may not be included in determining $G_i$ because
$P_1(j) = 0$.
There are two difficulties in applying Equation 1 directly. First, the analytical form for $G_i(j)$ is not known
or is difficult to obtain (even for simple games). Second, the distribution $P_1(j)$ over S may be unknown
(at best, we can only sample from S through the strategy representation that is used).
However, it is possible for these games to have a finite number of possible unique strategies, e.g., S is
discrete and finite, or we can consider a subset of S that is discrete and finite by inducing some
strategy distribution $P_1(j)$ on S. For the purpose of presentation and simplicity, we assume a uniform
strategy distribution in S.
With this, one can compute the true generalization performance of co-evolutionary learning through a
strategy i, which is simply its average performance against all strategies j, and can be written as an average over S.
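Under the uniform strategy distribution over a finite S assumed above, this average presumably takes the simple form (a sketch in the paper's notation):

\[
G_i = \frac{1}{|S|} \sum_{j \in S} G_i(j).
\]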
¹ For example, in a zero-sum game such as chess, one can say that strategy i solves the test provided by strategy j if i defeats j, i.e., $G_i(j) > G_j(i)$.
References
C. M. Bishop, Neural Networks for Pattern Recognition. Oxford University Press, 1995.
R. Axelrod, The Evolution of Cooperation. Basic Books, 1984.
“Neural Networks for Pattern Recognition,” book chapter.
R. Axelrod and W. D. Hamilton, “The evolution of cooperation,” Science, vol. 211, pp. 1390–1396, 1981.
D. B. Fogel, “An introduction to simulated evolutionary optimization,” IEEE Transactions on Neural Networks, vol. 5, no. 1, pp. 3–14, 1994.