
Measuring Generalization Performance in Co-evolutionary
Learning
Siang Y. Chong, Peter Tiño, and Xin Yao
The Centre of Excellence for Research in
Computational Intelligence and Applications (CERCIA)
School of Computer Science
The University of Birmingham, UK.
E-mails: S.Y.Chong@cs.bham.ac.uk, P.Tino@cs.bham.ac.uk, X.Yao@cs.bham.ac.uk
Abstract
Co-evolutionary learning involves a training process where training samples are instances of solutions that
interact strategically to guide the evolutionary (learning) process. One main research issue is with the
generalization performance, i.e., the search for solutions (e.g., input-output mappings) that best predict
the required output for any new input that has not been seen during the evolutionary process. However,
there is currently no such framework for determining the generalization performance in co-evolutionary
learning even though the notion of generalization is well-understood in machine learning. In this paper,
we introduce a theoretical framework to address this research issue. We present the framework in terms
of game-playing although our results are more general. Here, a strategy's generalization performance is
its average performance against all test strategies. Given that the true value may not be determined by
analytically solving a closed-form formula and that computing it exhaustively is prohibitive, we propose an estimation
procedure that computes the average performance against a small sample of random test strategies
instead. We perform a mathematical analysis to provide a statistical claim on the accuracy of our
estimation procedure, which can be further improved by performing a second estimation on the variance of
the random variable. For game-playing, it is well-known that one is more interested in the generalization
performance against a biased and diverse sample of “good” test strategies. We introduce a simple
approach to obtain such a test sample through the multiple partial enumerative search of the strategy
space that does not require human expertise and is generally applicable to a wide range of domains. We
illustrate the generalization framework on the co-evolutionary learning of the iterated prisoner's dilemma
(IPD) games. We investigate two definitions of generalization performance for the IPD game based on
different performance criteria, e.g., in terms of the number of wins based on individual outcomes and
in terms of average payoff. We show that a small sample of test strategies can be used to estimate
the generalization performance. We also show that the generalization performance using a biased and
diverse set of “good” test strategies is lower compared to the unbiased case for the IPD game. This
is the first time that generalization is defined and analyzed rigorously in co-evolutionary learning. The
framework allows the evaluation of the generalization performance of any co-evolutionary learning system
quantitatively.
Keywords: Evolutionary computation, co-evolutionary learning, generalization, Chebyshev's inequality, iterated prisoner's dilemma.
1 Introduction
Co-evolutionary learning refers to a broad class of population-based, stochastic search algorithms that
involve the simultaneous evolution of competing solutions with coupled fitness [1]. A co-evolutionary
learning system can be implemented using co-evolutionary algorithms [2, 3], which can be derived from
evolutionary algorithms (EAs) [4, 5]. That is, both co-evolutionary learning and EAs can be described
in terms of the framework whereby an adaptation process is carried out on the solutions in some form
of representation through a repeated process of variation and selection. The framework distinguishes
co-evolutionary learning and EAs in general from classical approaches (e.g., steepest-descent-based algorithms)
through two specific features, i.e., they are population-based and incorporate information exchange
mechanisms between populations in successive generations (iterative steps) to guide the search process
[6].
Despite their similarity in framework, co-evolutionary learning and EAs are fundamentally different
in how the fitness of a solution is assigned, leading to significantly different outcomes when applied
to similar problems (e.g., different search behaviors on the space of solutions [7, 8]). EAs are often
viewed and constructed in terms of an optimization context [4, 5], whereby an absolute fitness function
is required to assign the fitness value to a solution that is objective (the fitness value for a solution is
always the same regardless of the context). For co-evolutionary learning, the fitness of a solution is
obtained through its interactions with other competing solutions in the current population, and as such,
is subjective (the fitness value for a solution depends on the context, e.g., the population, and as such is relative
and dynamic). Here, co-evolutionary learning operates to find solutions guided by strategic interactions
among the competing solutions from one generation to the next that result (hopefully) in an arms race
of increasingly innovative solutions [9, 10, 11].
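As an illustration only, the following minimal Python sketch (with a placeholder bit-string representation and a placeholder game, neither taken from any particular study) shows this shared loop of variation and selection, with subjective fitness obtained from interactions within the current population rather than from an absolute fitness function:

    import random

    POP_SIZE = 20
    GENERATIONS = 50

    def random_strategy():
        # Placeholder representation: a bit string of length 8.
        return [random.randint(0, 1) for _ in range(8)]

    def game(i, j):
        # Placeholder pairwise game outcome G_i(j).
        return sum(a & b for a, b in zip(i, j))

    def mutate(s):
        # Variation: flip one randomly chosen bit.
        t = list(s)
        k = random.randrange(len(t))
        t[k] ^= 1
        return t

    population = [random_strategy() for _ in range(POP_SIZE)]
    for gen in range(GENERATIONS):
        # Subjective fitness: total outcome against the rest of the population.
        fitness = [sum(game(i, j) for j in population if j is not i)
                   for i in population]
        # Selection: keep the better half (truncation selection).
        ranked = sorted(zip(fitness, population), key=lambda p: p[0], reverse=True)
        parents = [s for _, s in ranked[:POP_SIZE // 2]]
        # Variation: refill the population with mutated offspring of the parents.
        population = parents + [mutate(random.choice(parents))
                                for _ in range(POP_SIZE - len(parents))]

Note that the fitness of the same strategy changes from one generation to the next as the population changes, which is exactly the subjective, relative character described above.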
One of the early motivations for using co-evolutionary learning is its potential application for solving
problems that cannot be framed in the context of optimization (e.g., using EAs) because it is not possible
or very difficult to formulate an absolute fitness function that reflects the underlying properties of the
problem. For such problems, continued use of an inappropriate fitness function will often bias the search
to solutions that do not reflect the underlying properties of the problem, leading to suboptimal solutions
[12]. Even if a fitness function can be formulated, it may not be able to evaluate and differentiate
between individual solutions to provide some gradient to direct the search when using EAs [8, 13]. One
such problem that is difficult to solve using EAs, but can be naturally framed in co-evolutionary learning,
is the problem of game-playing [2, 3, 14, 15, 16, 17].
However, despite the success of co-evolutionary learning in solving the problem of games [2, 3, 14, 15, 16,
17] (and other problems in the context of optimization [18] and classification [19, 20]), the approach is
not trouble-free for all problems. In particular, co-evolutionary learning is now recognized to suffer from
problems (collectively called co-evolutionary pathologies) that affect the performance of a co-evolutionary
learning system [11, 12, 13, 21, 22, 23]. For example, overspecialization to a single solution can occur
in the population [21], which can be a result of earlier population disengagement, e.g., some solutions
are favored over others due to large gaps in competence level [13]. When intransitivity exists in the
relationship between solutions (as in rock-paper-scissors, where each strategy defeats one opponent and loses to another), cyclic dynamics may occur during co-evolution, whereby at some point
in the process, the population overspecializes to a solution that is vulnerable to another solution that
exploits it [11, 12]. Furthermore, when a solution is driven to extinction but at a later point is adaptively
found again, the co-evolution is said to exhibit forgetting [22, 23].
Co-evolutionary pathologies are usually attributed to the use of relative fitness in the selection process
of a co-evolutionary learning system [8, 24]. The design of the co-evolutionary learning system also has
broader implications for its performance, which depends on the co-evolutionary search dynamics
(i.e., representation [25], variation [21, 25, 26], and selection [27]). This necessitates a more in-depth
study of co-evolutionary learning search dynamics. In particular, one approach that has been given much
attention in the past is the monitoring of the progress of the co-evolution in its search for more innovative
and sophisticated solutions, e.g., arms race dynamics [28, 29, 30, 31].
Another approach is to consider a global view of increasing performance in co-evolutionary learning.
However, for this particular investigation, previous studies are restricted to simple problems or problems
where the global view is known in advance [8, 11]. Tools introduced for monitoring the progress of arms race
dynamics are inappropriate, and may not be suitably adapted, for the global-view analysis because they
only provide relative performance information of solutions between different generations. That is,
solutions in the current generation being better than those in previous generations does not necessarily
imply that they perform better globally when compared with new or all possible solutions.
In machine learning, there exists a powerful framework, generalization, that provides a global view of
performance for learning systems. Generalization refers to the ability of the learning system to find the
solution, which can be viewed in the context of input-output mappings, that best predicts the required
output for any new input that has not been seen during the training process. Within the context of
generalization, one is interested in how the learning system can realize the underlying properties of the
problem from a small sample of training data (e.g., the input-output set) to produce the solution. For
example, in the case of neural network training, one is not interested in learning the “exact representation
of the training data itself, but rather to build a statistical model of the process which generates the data”
(page 332 of [32]). Furthermore, it should be noted that learning is not necessarily the same as optimizing
because a solution with optimal fitness does not necessarily have the best generalization (unless
an accurate metric to measure generalization is used in the fitness function) [33].
Co-evolutionary learning, like other learning systems, also uses a small training sample during the evolutionary
(training) process to produce a solution to the problem. However, co-evolutionary learning is
different from other learning systems in that the training samples are not fixed, but instead, are instances
of solutions that are changing (evolving) and are interacting with each other strategically to guide the
evolutionary process (i.e., learning). Despite this difference, the generalization framework can be applied
to co-evolutionary learning systems to provide a global view of performance. Although the notion of
generalization is well-understood in machine learning, there is a lack of theoretical framework to justify
how the generalization performance of a co-evolutionary learning system is determined with respect to
the problem. For example, past studies such as [33, 34] have used a large sample of randomly obtained
test cases to estimate the generalization performance. It is not known how accurate such an estimation
is, i.e., how close the estimated generalization performance (using test samples randomly obtained from
the search space) is to the true generalization performance (using the entire search space).
In this paper, we introduce a theoretical framework that addresses the problem of determining the
generalization performance of co-evolutionary learning in general. We present the framework in terms
of game-playing, i.e., learning game strategies that generalize well to the game (e.g., defeat a large
number of strategies that exist). However, our theoretical framework is general, i.e., the problem can
be put in the context of test-based evaluations (e.g., comparisons between solutions) where some tests
can reflect the underlying properties (objectives) of the problem that are unknown (game-playing is one
such problem) [12]. We first define the generalization performance of a strategy as its average performance
against all test strategies. With this definition, it follows that the best generalization performance for a
co-evolutionary learning system is the one that produces evolved strategies with the maximum average
performance against all strategies.
Although this definition is simple, the generalization performance can be difficult to determine for
two reasons. First, the analytical function for game outcomes can be unknown, and as such, we
cannot determine the generalization performance by analytically solving a closed-form formula. Second,
the strategy space can be very large (although finite), thus making exhaustive computation prohibitive. To
address this problem, we propose the alternative of estimating the generalization performance by taking
the average performance of the evolved strategy against a sample of test strategies that are randomly
drawn from the strategy space.
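A minimal sketch of this estimation procedure (in Python; the callable names are illustrative assumptions): the estimate is simply the mean game outcome over a random test sample.

    def estimate_generalization(strategy, draw_test_strategy, game, n_tests=2000):
        # Monte Carlo estimate of the generalization performance G_i:
        # draw_test_strategy() returns a random opponent j drawn from the
        # strategy space, and game(i, j) returns the outcome G_i(j).
        total = 0.0
        for _ in range(n_tests):
            total += game(strategy, draw_test_strategy())
        return total / n_tests

The analysis below relates the sample size n_tests to the confidence one can claim for such an estimate.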
We show through a mathematical analysis using Chebyshev's Theorem that the probability that the
absolute difference between the estimated and true values exceeds a given error (precision value) is
bounded by a value that depends reciprocally only on the square of the error and the size of the random
test sample. However, this probability bound assumes the worst case of having maximum variance for the
distribution of the random variable over a bounded interval. In general, the true variance is smaller than
the maximum value. As such, we perform a mathematical analysis and show how a second estimation
of the variance can be used to obtain a tighter bound. In addition, we also show that for some games,
the true variance of a strategy's performance with respect to the strategy space is smaller, and as such,
requires a smaller sample size to make the same statistical claim.
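In symbols (our own notation for this sketch: $\hat{G}_i$ denotes the estimate computed from $N$ random test strategies, $\epsilon$ the error, and $R$ the width of the interval on which game outcomes are bounded), Chebyshev's inequality applied to the sample mean gives

\[
\Pr\left( \left| \hat{G}_i - G_i \right| \geq \epsilon \right)
\;\leq\; \frac{\sigma^2}{N \epsilon^2}
\;\leq\; \frac{\sigma^2_{\mathrm{MAX}}}{N \epsilon^2}
\;=\; \frac{R^2}{4 N \epsilon^2},
\]

since the worst case for a random variable bounded on an interval of width $R$ places half of the probability mass at each endpoint, so that $\sigma^2_{\mathrm{MAX}} = R^2/4$. The second estimation replaces $\sigma^2_{\mathrm{MAX}}$ with an estimate of the true variance $\sigma^2$, tightening the bound whenever $\sigma^2 < R^2/4$.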
With this framework, it is now shown that a sample of randomly obtained test strategies that is much
smaller in size compared with the total number of strategies in the space is sufficient to estimate the
generalization performance of co-evolutionary learning, and that this estimation is close enough to the
true value. We apply the framework to the co-evolutionary learning of the IPD games to illustrate
both the advantage of the framework and how it can be used in general. In particular, we investigate
two different definitions of generalization performance for the IPD game based on different performance
criteria, e.g., in terms of the number of wins based on individual outcomes and in terms of average payoff.
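The following minimal Python sketch illustrates how these two criteria could be computed for deterministic memory-one IPD strategies (the encoding, the round count, and the conventional payoff values used below are illustrative assumptions, not necessarily the exact experimental settings):

    # Conventional IPD payoffs (row player, column player), assumed here.
    PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
              ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

    def play_ipd(s1, s2, rounds=150):
        # A strategy is a dict: a first move and a deterministic memory-one
        # rule mapping (own previous move, opponent's previous move) to a move.
        m1, m2 = s1['first'], s2['first']
        p1 = p2 = 0
        for _ in range(rounds):
            r1, r2 = PAYOFF[(m1, m2)]
            p1, p2 = p1 + r1, p2 + r2
            m1, m2 = s1['rule'][(m1, m2)], s2['rule'][(m2, m1)]
        return p1 / rounds, p2 / rounds

    def outcome_by_wins(i, j):
        # G_i(j) based on individual outcomes: 1 for a win, 0 otherwise.
        pi, pj = play_ipd(i, j)
        return 1 if pi > pj else 0

    def outcome_by_payoff(i, j):
        # G_i(j) based on the average payoff per round to strategy i.
        return play_ipd(i, j)[0]

    # Example: tit-for-tat repeats the opponent's previous move.
    tit_for_tat = {'first': 'C',
                   'rule': {(m, o): o for m in 'CD' for o in 'CD'}}

Either outcome function can be plugged into the estimation procedure sketched earlier, giving the win-based and the payoff-based generalization measures, respectively.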
It is well-known that rather than considering the average performance for all cases, one may be more
interested in the average performance for specific cases that are more common or that arise naturally.
In the context of game-playing, one is more interested in the generalization performance of the co-evolutionary
learning system against “good” strategies, and not the average performance against all
strategies, since it is possible that a large proportion of strategies in the space are poor or mediocre. To
determine generalization performance against this biased sample of “good” test strategies, we introduce
a simple approach to obtain such a test sample through the multiple partial enumerative search of the
strategy space. Each partial enumerative search uses a population size that is much larger than the total
number of possible unique strategies that can be searched during co-evolution to produce a best strategy.
This approach does not require human expertise (as in generating some arbitrarily designed strategies)
and is more comprehensive than the previous single partial enumerative search that we introduced
earlier in [35, 36, 37]. We show that for the co-evolutionary learning of IPD games, the generalization
performance for the case of a biased and diverse sample is lower compared to the case of an unbiased
sample.
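A minimal Python sketch of one way such a procedure could work (the random subset selection and round-robin scoring here are our assumptions; outcome is any pairwise outcome function such as those sketched above, strategy_space is the finite set of unique strategies, and ps is chosen to exceed the number of unique strategies a co-evolutionary run can visit):

    import random

    def partial_enumerative_search(strategy_space, ps, outcome):
        # One partial enumerative search: round-robin a random subset of
        # size ps and return the strategy with the best total outcome.
        pool = random.sample(strategy_space, ps)
        totals = [sum(outcome(i, j) for j in pool if j is not i) for i in pool]
        return pool[max(range(ps), key=lambda k: totals[k])]

    def biased_test_sample(strategy_space, ps, n_searches, outcome):
        # Multiple partial enumerative searches: a biased and diverse sample
        # of "good" test strategies, with no human-designed opponents.
        return [partial_enumerative_search(strategy_space, ps, outcome)
                for _ in range(n_searches)]

Repeating the search over independently drawn subsets is what yields diversity among the "good" strategies retained.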
This paper is organized as follows. Section 2 presents our theoretical framework of the generalization
performance of co-evolutionary learning. Section 3 illustrates how the framework can be applied to the
IPD game. Section 4 investigates the application of the framework to estimate the generalization performance
of co-evolutionary learning of simple IPD games where the true generalization performance
can be determined. Section 5 presents an empirical study on estimating the generalization performance of
co-evolutionary learning of slightly more complex games where the true generalization performance cannot
be determined. The section also compares the results of estimates based on using an unbiased test
sample and those of using a biased and diverse sample of “good” test strategies. Section 6 discusses the
implications of the framework and concludes the paper.
2 Generalization Framework for Co-evolutionary Learning
2.1 A Need for a Consistent and General Approach to Estimate the Generalization Performance of Co-evolutionary Learning
Darwen and Yao [33, 34, 35] were among the first to explicitly investigate co-evolutionary learning through
the framework of generalization from machine learning (others include [38]). However, they [33, 34, 35]
studied the issue through an empirical approach only. In particular, they investigated the utility of
using some random sample of test cases to estimate the generalization performance of a co-evolutionary
learning system.
There are other studies that can be related to generalization in the context of co-evolutionary learning.
Wiegand and Potter [39] studied the notion of robustness (of individual components) in a cooperative
co-evolution setting. Ficici and Pollack [23] studied the notion of solution concepts, e.g., a partitioning
of the search space into solutions (that are wanted) and non-solutions for a problem, from measuring some
properties and establishing some criteria that the searched point is a solution. Bowling [40] studied the
notion of regret, i.e., a measure of the performance difference between a learning algorithm and the best static
strategy during training. Powers and Shoham [41] studied the estimation of best-response through the
use of random samples. Studies in [42, 43, 44] investigated formalisms of monotonic improvement in
co-evolutionary learning and developed algorithmic frameworks that guarantee monotonic improvements
of co-evolving solutions based on archives of test cases.
Here, our main motivation is to develop a framework for a rigorous quantitative analysis of performance
in co-evolutionary learning using the notion of generalization from machine learning. We are motivated to
address the need for a principled approach to estimate the generalization performance of co-evolutionary
learning. The framework aims to allow one to estimate the generalization performance for any problem
that co-evolutionary learning is used to solve, and at any point in the evolutionary process at which
generalization performance is measured.
There are two reasons why measuring the generalization performance of co-evolutionary learning is necessary
and important. First, it provides an absolute quality measure of how well a co-evolutionary
learning system is performing with respect to the problem, i.e., how well the co-evolutionary learning
generalizes. Second, it can be used as a means to compare the generalization performance of different
co-evolutionary learning systems with respect to the problem.
We first introduce a theoretical framework that defines explicitly the generalization performance for co-evolutionary
learning, and how it can be determined, i.e., measured. However, it is noted that obtaining
the true generalization performance for a co-evolutionary learning system is very difficult. As such,
through the theoretical framework, we provide the alternative of a consistent and general procedure to
estimate the generalization performance. We show a mathematical analysis of how the generalization
performance can be estimated using a random sample of test cases. We demonstrate the utility of the
estimation procedure by determining the statistical claim that one can make about how confident one is
in the accuracy of the estimate, compared to the true generalization performance, for a random test
sample of a given size.
2.2 Estimating Generalization Performance
In co-evolutionary learning, the quality of a solution is determined relative to other solutions. This is
achieved through comparisons, i.e., interactions between solutions. These interactions can be framed in
terms of game-playing, i.e., an interaction is a game played between two strategies (solutions). The game
outcome of a strategy i against the opponent strategy j is $G_i(j)$, and conversely, the game outcome of
strategy j against strategy i is denoted $G_j(i)$. Strategy i is said to solve the test provided by strategy j
if $G_i(j) \geq G_j(i)$.¹
Here, the absolute quality (generalization performance) of a strategy i is measured with respect to its
expected performance in solving tests provided by strategies j. The goal of co-evolutionary learning
for the problem of game-playing is to learn a strategy i with the best generalization performance. In
the following, we present a theoretical framework for estimating the generalization performance of co-evolutionary
learning. It should be noted that the framework is presented in the context of the complete
solution. As such, the framework is directly applicable for the estimation of the generalization performance
of a complete solution obtained either by a competitive or a cooperative co-evolutionary learning system.
2.2.1 True Generalization Performance
The generalization performance of co-evolutionary learning is determined with respect to the evolved
strategy that is produced. The true generalization performance of co-evolutionary learning is defined as
the expected performance of strategy i that is produced after a learning process (co-evolution) against
all strategies j in the strategy space S. The true generalization performance of strategy i, $G_i$, can be
written as follows:

\[
G_i = E_{P_1(j)}[G_i(j)] = \int_S G_i(j) \, P_1(j) \, dj, \qquad (1)
\]

where $G_i$ is the expectation of strategy i's performance against j, $G_i(j)$, with respect to the distribution
$P_1(j)$ over the strategy space S (i.e., the distribution with which opponent strategies j are drawn). Note that
this definition of generalization does not imply having to compare with all strategies that exist. Instead,
$P_1(j)$, which can be specified by the strategy representation (e.g., neural networks), can be induced on
the strategy space S. As such, some strategies j that exist may not be included in determining $G_i$ because
$P_1(j) = 0$.
There are two difficulties in applying Equation 1 directly. First, the analytical form for $G_i(j)$ is not known
or is difficult to obtain (even for simple games). Second, the distribution $P_1(j)$ over S may be unknown
(at best, we can only sample from S through the strategy representation that is used).
However, it is possible for these games to have a finite number of possible unique strategies, e.g., S is
discrete and finite, or we can consider a subset of S that is discrete and finite by inducing some
strategy distribution $P_1(j)$ on S. For the purpose of presentation and simplicity, we assume a uniform
strategy distribution in S.
With this, one can compute the true generalization performance of co-evolutionary learning through a
strategy i, which is simply its average performance against all strategies j, and can be written as an average over S.
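Under the uniform strategy distribution over a finite S assumed above, this average presumably takes the simple form (a sketch in the paper's notation):

\[
G_i = \frac{1}{|S|} \sum_{j \in S} G_i(j).
\]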
¹ For example, in a zero-sum game such as chess, one can say that strategy i solves the test provided by strategy j if i defeats j, i.e., $G_i(j) > G_j(i)$.
References
C. M. Bishop, Neural Networks for Pattern Recognition. Oxford University Press, 1995.
R. Axelrod, The Evolution of Cooperation. Basic Books, 1984.
“Neural Networks for Pattern Recognition,” book chapter.
R. Axelrod and W. D. Hamilton, “The evolution of cooperation,” Science, vol. 211, pp. 1390–1396, 1981.
D. B. Fogel, “An introduction to simulated evolutionary optimization,” IEEE Transactions on Neural Networks, vol. 5, no. 1, pp. 3–14, 1994.