Domain of Competence of XCS Classifier System in Complexity Measurement Space

Ester Bernadó-Mansilla and Tin Kam Ho
Journal article, 01 Feb 2005, Vol. 9, Iss. 1, pp. 82-104
Domain of Competence of XCS
Classifier System
in Complexity Measurement Space
Ester Bernadó-Mansilla
and Tin Kam Ho
Computer Engineering Department
Enginyeria i Arquitectura La Salle, Ramon Llull University
Quatre Camins, 2. 08022 Barcelona, Spain
E-mail: esterb@salleurl.edu Tel: +34 932 902 433
Computing Sciences Research Center, Bell Laboratories, Lucent Technologies
700 Mountain Avenue, 2C-425
Murray Hill, NJ 07974-0636 USA
E-mail: tkh@research.bell-labs.com Tel: +1 908 582 5989
Abstract
The XCS classifier system has recently shown a high degree of competence on a variety of data mining
problems. But to what kind of problems XCS is well and poorly suited is seldom understood, especially
for real-world classification problems. The major inconvenience has been attributed to the difficulty of
determining the intrinsic characteristics of real-world classification problems. This paper investigates the
domain of competence of XCS by means of a methodology that characterizes the complexity of a classi-
fication problem by a set of geometrical descriptors. In a study of 392 classification problems along with
their complexity characterization, we are able to identify difficult and easy domains for XCS. We focus
on XCS with hyperrectangle codification, which has been predominantly used for real-attributed domains.
The results show high correlations between XCS’s performance and measures of length of class boundaries,
compactness of classes and non-linearities of decision boundaries. We also compare the relative performance
of XCS with other traditional classifier schemes. Besides confirming the high degree of competence of XCS
in these problems, we are able to relate the behavior of the different classifier schemes to the geometrical
complexity of the problem. Moreover, the results highlight certain regions of the complexity measurement
space where a classifier scheme excels, establishing a first step towards determining the best classifier
scheme for a given classification problem.

Index Terms
Learning classifier systems, geometrical complexity, genetic algorithms, pattern recognition, machine
learning, classification.
I. INTRODUCTION
XCS [1], [2] is a classifier system that combines reinforcement learning [3] and genetic algorithms
(GA) [4], [5] to evolve a set of rules representing the target concept. XCS descends from the lineage of
learning classifier systems (LCS), which were first introduced by Holland [4], [6], [7]. Its success and
robust performance in a variety of domains establish XCS as one of the major developments of learning
classifier systems.
Recent investigations of XCS have been focused on a variety of aspects, with the common goal of
improving our understanding of the system. The result has often been better performance and wider
applicability in several domains. Some of these studies investigate XCS from an algorithmic point of
view. Some remarkable efforts in this direction are [8], [9]. They investigate the dynamics of the different
components of XCS, and how these components interact to evolve optimal rules that represent the target
concept in an accurate and compact way. Other studies focus on certain components of the algorithm,
such as the deletion algorithms [10], the definition of fitness [11], or the knowledge representation [12]–
[14]. Most of these studies are restricted to artificially designed problems, because they are easy to analyze
and allow control of their degree of complexity to some extent.
Other types of investigations focus on the applicability of XCS to real-world domains. In [15], XCS is
applied to a data mining problem with a high degree of performance, both in terms of accuracy rate
and explanatory capabilities. This study is extended in [16] with a varied set of data mining problems,
where XCS appears to perform competitively with respect to other classifier techniques, such as nearest
neighbors and decision trees. These studies provide a comparison of accuracy rates between XCS and
other classifier schemes¹ and show that XCS is competitive. However, this information is of limited use,
as there is a lack of a deeper understanding of the general behavior of XCS in real-world problems,
i.e., what types of problems XCS is well or poorly suited to, and the reasons for the success or failure
of XCS compared to other classifier schemes. Moreover, it is well known that no classifier
scheme globally dominates in every domain. Instead, there might be some types of problems where
a particular classifier scheme excels [17]. Therefore, a major issue of current research is to identify the
domains of competence of a particular classifier scheme. One of the major obstacles in this kind of
¹ In the context of XCS, a classifier is a rule and a set of associated parameters. XCS evolves a set of such classifiers. The term
classifier is also used by the pattern recognition community to refer to the whole system that classifies (e.g., nearest neighbor
classifier, linear classifier, etc.). In Section II we use this term following the terminology used in XCS. In the rest of the paper the
term classifier is used in the sense of the whole system, unless properly indicated.

investigation is the difficulty of characterizing the differences between various real-world problems, and
relating the classifier’s behavior to such differences.
This paper is in line with recent efforts in the pattern recognition community to characterize the behavior
of a classifier system related to the features of the problem. We identify the features of a problem that
are most relevant to classification accuracy as measurements of the problem’s geometrical complexity.
The paper analyzes the domain of applicability of XCS in such a measurement space. In particular, the
paper addresses the following issues:
1) What is the complexity of a real-world classification problem? We will analyze the sources of difficulty
in a classification problem. We will focus on the description of the geometrical complexity of the
problem, such as the degree of clustering of the points of the same class, the proximity between the
classes, and other factors that are critical for classification accuracy. We emphasize that we consider
only those descriptors that can be extracted directly from the dataset. Therefore the analysis can be
applied to arbitrary real-world data without reliance on an assumption of the generating model.
2) Given a real-world classification problem characterized by a set of complexity descriptors, is XCS well suited
for that problem? That is, given a certain classification problem, is XCS applicable? Will XCS be
able to extract an accurate knowledge representation? This issue has special relevance when we
apply XCS to a real-world classification problem, where the intrinsic difficulties of the problem are
often unknown, and are difficult to separate from the inadequacy of the classification algorithm.
With the characterization of a real-world classification problem by a set of complexity descriptors
that are not directly dependent on the classifier, we are able to investigate the relation between
XCS’s performance and problem complexity. This study will identify which kind of problems XCS
is particularly well suited and poorly suited to. In order to infer conclusions which show the general
tendency of XCS in real-world problems, we will use an extended set of 392 problems. This gives
better coverage on the variety of problems than some previous studies on XCS’s performance [16]
that relied on a small set of problems (about 20 typically) or only artificial ones [18].
3) Given a classification problem, what is the best suited classifier scheme? As observed previously, there
is not an outstanding classifier scheme that dominates in all sorts of classification problems. There
are certain types of classification problems for which particular kinds of classifiers are best suited.
Thus, a central issue is to identify the good matches between classifier schemes and problems. We
will analyze XCS’s performance in comparison to other learning algorithms, and relate the results to
the complexity of the problem. Previous studies could hardly explain why XCS was better or worse
than a particular classifier, and in which cases this happened. Relating the respective performances
to the complexity characterization of each problem will give us a handle in understanding what
kind of problems are best suited for a particular classifier and, even more important, why.
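As a concrete illustration of the kind of geometrical descriptor such a methodology relies on, the sketch below computes a leave-one-out nearest-neighbor error. This is not one of the paper's exact measures, but it is closely related to descriptors of boundary length and class compactness: when many points have a nearest neighbor of a different class, the class boundary is long or interleaved. The function name and the toy data are our own.

```python
import math

def nn_error_fraction(points, labels):
    """Leave-one-out 1-NN error: the fraction of points whose nearest
    neighbor (Euclidean distance) carries a different class label.
    High values suggest long or complex class boundaries."""
    n = len(points)
    errors = 0
    for i in range(n):
        best_d, best_j = float("inf"), -1
        for j in range(n):
            if i == j:
                continue
            d = math.dist(points[i], points[j])
            if d < best_d:
                best_d, best_j = d, j
        if labels[best_j] != labels[i]:
            errors += 1
    return errors / n

# Two well-separated clusters: no point's nearest neighbor crosses classes.
pts = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
labs = [0, 0, 1, 1]
print(nn_error_fraction(pts, labs))  # -> 0.0
```

A descriptor like this can be computed directly from any labeled dataset, which is exactly the property the paper requires: no assumption about the generating model.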
This study is aimed at a better understanding of XCS’s behavior in real-world classification problems.

Detecting where XCS has difficulties may lead to improvement of the method. The study will also lay
the basis for characterizing problems in a space of complexity metrics, and for identifying
what kinds of classifiers are more appropriate for certain regions of the measurement space.
The rest of this paper is structured as follows. First, we present a brief overview of XCS, describing
how the different components are designed to achieve the learning goals. Next, since XCS’s performance
depends both on the algorithmic components and the knowledge representation, we analyze the knowl-
edge representation used in domains with real valued attributes. We investigate some cases where XCS
encounters different levels of difficulty due to the geometry of the problem. Section IV analyzes the
sources of difficulty of real-world classification problems, and proposes several measures that represent
different aspects of the problem complexity. Next, we characterize the behavior of XCS with respect to the
complexity of the problem, and identify regions in the measurement space where the easiest problems and
the most difficult problems for XCS are located. Section VI compares XCS with other classifier schemes,
trying to determine the domains of competence of each classifier scheme. Finally, we present our main
conclusions and discuss future work.
II. DESCRIPTION OF XCS
XCS represents the knowledge extracted from the problem in a set of rules. This ruleset is incrementally
evaluated by means of interacting with the environment, through a reinforcement learning scheme, and is
improved by a search mechanism based on a genetic algorithm. The following is a brief description of
XCS. Although XCS is applicable to single-step and multi-step problems, we restrict our description of
XCS to single-step tasks like classification problems within the scope of this paper. For more details, the
reader is referred to [1] and [2] for an introduction of XCS, and [19] for an algorithmic description.
A. Representation
XCS evolves a population [P] of classifiers where each classifier has a rule and a set of associated
parameters estimating the quality of the rule. Each rule consists of a condition part and an action part:
condition → action. The condition specifies the set of input states where the classifier can be applied.
For binary inputs, the condition is usually represented in the ternary alphabet {0, 1, #}^ℓ, where ℓ is the
length of the input string. In this case, a condition c = c₁c₂⋯cℓ matches an input example x = x₁x₂⋯xℓ
if and only if, for every position i, cᵢ = # or cᵢ = xᵢ. The symbol #, called don't care, allows the formation of generalizations
in the rule's condition. The action part of the rule specifies the action or class that the classifier proposes
when its condition is satisfied. It is coded as an integer.
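The matching rule above is mechanical enough to state in a few lines of code. This is a minimal sketch, not the paper's implementation; the function name is ours.

```python
def matches(condition, example):
    """A ternary condition matches a binary input string iff every
    position is either the don't-care symbol '#' or equal to the
    corresponding input bit."""
    return all(c == '#' or c == x for c, x in zip(condition, example))

print(matches("1#0", "110"))  # True: '#' covers the middle bit
print(matches("1#0", "011"))  # False: first and last bits differ
```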
Three main parameters estimate the quality of each classifier: a) the payoff prediction p, an estimate
of the payoff that the classifier will receive if its condition matches the input and its action is selected;
b) the prediction error ε, which estimates the average error between the classifier's prediction and the
received payoff; and c) the fitness F, an estimate of the accuracy of the payoff prediction. There are other
parameters qualifying each classifier, such as: the experience of the classifier (denoted as exp), the average
size of the action sets where the classifier has participated (as), the time step of the last application of
the genetic algorithm (ts), and the number of actual micro-classifiers this macroclassifier² represents, called
numerosity (num). These are described in the following.
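The parameters listed above can be collected in a single record per classifier. The sketch below is illustrative only: the field names follow the paper's notation, but the initial values are assumptions (XCS initializes them from user-set constants).

```python
from dataclasses import dataclass

@dataclass
class Classifier:
    condition: str          # ternary string over {0, 1, '#'}
    action: int             # proposed class, coded as an integer
    p: float = 10.0         # payoff prediction (initial value illustrative)
    epsilon: float = 0.0    # prediction error
    F: float = 0.01         # fitness: accuracy of the payoff prediction
    exp: int = 0            # experience (number of parameter updates)
    as_size: float = 1.0    # average size of action sets it joined
    ts: int = 0             # time step of last GA application
    num: int = 1            # numerosity: micro-classifiers represented

cl = Classifier(condition="01#", action=1)
```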
B. Performance Component
At each time step, an input x is presented to the system. Given x, the system builds a match set [M],
which is formed by all the classifiers in [P] whose conditions are satisfied by the input example. If the
number of actions represented in [M] is less than a threshold θ_mna, then covering is triggered. Covering
creates new classifiers with a condition matching the current input and an action selected randomly
from those not present in [M]. From the resulting match set, an action must be selected and sent to
the environment. For this purpose, a payoff prediction P(a) is computed for each action a in [M]. P(a)
estimates the payoff that the system will receive if action a is chosen. It is computed as a fitness-weighted
average of the predictions of all classifiers proposing that action. The system chooses the winning action
based on these prediction values. The chosen action determines the action set [A], which consists of all
the classifiers in [M] advocating this action.
In classification, the winning action is usually selected using either pure explore mode or pure exploit
mode. In pure explore mode, the action is selected randomly. This makes sense during training, i.e., when
the system is learning the consequences of all possible actions for a given input. In pure exploit mode,
the action is selected deterministically according to the highest prediction. This is used in test, that is,
when the system classifies new unseen instances based on the knowledge it has acquired.
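The performance cycle just described, match set, fitness-weighted prediction array, and explore/exploit action selection, can be sketched as follows. This is a simplified illustration (covering is omitted, and the tiny `_C` stand-in class and demo values are ours), not the full algorithm.

```python
import random

def form_match_set(population, x, matches):
    """[M]: all classifiers in [P] whose condition matches input x."""
    return [cl for cl in population if matches(cl.condition, x)]

def prediction_array(match_set):
    """P(a): fitness-weighted average of the predictions of all
    classifiers in [M] proposing action a."""
    num, den = {}, {}
    for cl in match_set:
        num[cl.action] = num.get(cl.action, 0.0) + cl.p * cl.F
        den[cl.action] = den.get(cl.action, 0.0) + cl.F
    return {a: num[a] / den[a] for a in num}

def select_action(pa, explore):
    """Pure explore: random action (training).
    Pure exploit: action with the highest prediction (test)."""
    if explore:
        return random.choice(list(pa))
    return max(pa, key=pa.get)

class _C:  # minimal stand-in with only the fields used above
    def __init__(self, condition, action, p, F):
        self.condition, self.action, self.p, self.F = condition, action, p, F

pop = [_C("1#", 0, 100.0, 0.9), _C("#1", 1, 40.0, 0.5), _C("11", 0, 80.0, 0.3)]
M = form_match_set(pop, "11",
                   lambda c, x: all(ci in ('#', xi) for ci, xi in zip(c, x)))
pa = prediction_array(M)          # P(0) ≈ 95.0, P(1) = 40.0
best = select_action(pa, explore=False)  # exploit picks action 0
```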
C. Reinforcement Component
Once the action is sent to the environment, the environment returns a reward R, which is used to update
the parameters of the classifiers in [A]. First, the prediction of each classifier is adjusted as follows:

p ← p + β (R − p)    (1)

where β (0 < β ≤ 1) is the learning rate. Next, the prediction error is updated:

ε ← ε + β (|R − p| − ε)    (2)

Then, the classifier's accuracy is computed as an inverse function of the classifier's error:

κ = 1                  if ε < ε₀
κ = α (ε / ε₀)^(−ν)    otherwise    (3)
² Classifiers in XCS are macroclassifiers, i.e., each classifier represents num micro-classifiers having identical conditions and actions
[19].
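A minimal sketch of the update rules in Eqs. (1)-(3). The constants β, α, ε₀, and ν are user-set parameters of XCS; the values below are illustrative only. Descriptions differ on whether the error update uses the pre- or post-update prediction; this sketch uses the pre-update value.

```python
BETA, ALPHA, EPS0, NU = 0.2, 0.1, 10.0, 5.0  # illustrative parameter values

def update_classifier(p, eps, reward):
    """Widrow-Hoff style updates from Eqs. (1)-(3):
    p   <- p   + beta * (R - p)                        (1)
    eps <- eps + beta * (|R - p| - eps)                (2)
    kappa = 1 if eps < eps0 else alpha*(eps/eps0)**-nu (3)
    """
    eps_new = eps + BETA * (abs(reward - p) - eps)  # uses pre-update p
    p_new = p + BETA * (reward - p)
    kappa = 1.0 if eps_new < EPS0 else ALPHA * (eps_new / EPS0) ** -NU
    return p_new, eps_new, kappa

# One update step: a classifier predicting 50 receives a reward of 100.
p, eps, kappa = update_classifier(50.0, 5.0, 100.0)
```

With these numbers the prediction moves toward the reward (p = 60), the error grows (ε = 14 > ε₀), and the accuracy κ drops below 1, which in turn lowers the fitness the GA sees.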

References
D. E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning. Addison-Wesley, 1989.
R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. MIT Press, 1998.
J. H. Holland, Adaptation in Natural and Artificial Systems. University of Michigan Press, 1975.
L. Breiman, "Bagging predictors," Machine Learning, 1996.