
(Open Access) Evolving rule-based models: A tool for intelligent adaptation (2001) | Plamen Angelov

This item was submitted to Loughborough’s Institutional Repository

(https://dspace.lboro.ac.uk/) by the author and is made available under the

following Creative Commons Licence conditions.

For the full text of this licence, please go to:

http://creativecommons.org/licenses/by-nc-nd/2.5/

Evolving Rule-based Models: A Tool for Intelligent Adaptation

Plamen Angelov

Dept of Civil and Building Engineering, Loughborough University

Loughborough, Leicestershire, LE11 3TU, UK

e-mail: P.P.Angelov@Lboro.ac.UK

Richard Buswell

Dept of Civil and Building Engineering, Loughborough University

Loughborough, Leicestershire, LE11 3TU, UK

e-mail: R.A.Buswell@Lboro.ac.UK

corresponding author

Abstract

An on-line approach for rule-base evolution by recursive adaptation of rule structure and parameters is described in the paper. An integral part of the procedure is to maximise the model transparency by simplifying the fuzzy linguistic descriptions of the input variables. The rule base evolves over time by utilising direct calculation approaches, hence minimising the reliance on computationally expensive techniques such as genetic algorithms. An on-line version of subtractive clustering recently introduced by the authors [1] is used to determine the antecedent part of the fuzzy rules. Recursive least squares estimation [2]-[3] is employed to determine the parameters of the consequent part of each rule. The use of these efficient non-iterative techniques is the key to the low computational demands of the algorithm. The application of similarity measures improves the interpretability and compactness of the resulting eR model, with no significant detriment to the model precision. A time series prediction problem on data from a real indoor climate control (ICC) system has been considered to test and validate the proposed model simplification method.

1. Introduction

Fuzzy rule-based (FRB) models, and especially

Takagi-Sugeno (TS) models [4], are widely used to

represent complex non-linear systems. These models

are also (relatively) easy to identify and their structure

can be readily analysed [5]. Effective identification

techniques treating the antecedent and consequent

parts of the model [6]-[8] and methods for analysis of

the stability of controllers based on these models [9]

have been developed.

Alternative techniques treat the identification of both model structure and parameters as what it is in principle, a non-linear optimisation problem; they include direct use of genetic algorithms [10]-[12] or gradient-based back-propagation [13]. The advantage of these techniques is the higher precision that is gained by solving the parameter and structure identification simultaneously. They cover identification of the antecedent and the consequent part of the fuzzy rules and their parameters [10]-[11]. The former approach, which treats the antecedent and consequent parts separately ([1], [6] and [8]), is superior in terms of computational requirements. This is particularly evident when non-iterative clustering approaches (mountain [8] or subtractive [6] clustering instead of fuzzy C-means [14]) are used. For this reason, both approaches are sometimes combined [1], [7].

All these methods could be classified as data-driven

rule/knowledge extraction. Expert knowledge plays a minor role, if any. This tendency in fuzzy model identification is typical of recent research, particularly over the past few years. One reason for the growing interest in these techniques may be the ease with which data can be gathered and distributed. At present,

the real issue in many industries and organisations is

how to effectively cope with the information in

exponentially growing data-bases. This is especially so

where the information is qualitative and imprecise.

One important aspect of real problems is the necessity

to adapt models and systems in accordance with the

changing environmental conditions. Current techniques

do not accommodate this requirement [15]. Linear

models and linear control theory have been developed

to the point of effective solutions for these problems

[16]. Complex, non-linear and linguistic models and

systems have not. In [1] the authors originally introduced

an effective approach for recursive on-line

identification of TS models. In this paper this approach

is developed further by the application of a model

simplification methodology.

A brief summary of existing approaches for (off-line)

identification of TS models is given with emphasis on

non-iterative combination of clustering and linear least

squares. Evolving Rule-based (eR) models are

considered as a tool for intelligent adaptation of

complex systems description. The basic mechanism for

rule-base evolution is presented followed by the model

simplification procedure. In section 4 a time-series

prediction problem is considered for testing and

validation of the simplification methodology. After the

analysis of the results conclusions are given.

1. (Off-line) identification of TS models

Takagi-Sugeno model [4] could be represented as:

$$R_i:\ \text{IF } (x_1 \text{ is } LV_{i1}) \text{ AND } \dots \text{ AND } (x_n \text{ is } LV_{in}) \text{ THEN } (y_i = p_{i1}x_1 + \dots + p_{in}x_n + q_i); \quad i = 1, \dots, NR \tag{1}$$

where $R_i$ denotes the i-th fuzzy rule, $NR$ is the number of rules, $x$ is the input vector $x = [x_1, x_2, \dots, x_n]$ and $LV_{ij}$ denotes the j-th linguistic variable of the antecedent part for the i-th fuzzy rule (j=1,2,…,n). $y_i$ is the output of the i-th rule and $p_{ij}$ and $q_i$ are parameters of the consequence.

The model output is calculated by weighted averaging

of the individual rule contributions using the centre of

area de-fuzzification operator.
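The weighted averaging described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the Gaussian membership functions with a single shared spread, the product AND operator and all names (`ts_output`, `centres`, `sigma`, `P`, `q`) are chosen for the example and are not specified in this section.

```python
import numpy as np

def ts_output(x, centres, sigma, P, q):
    """Evaluate a Takagi-Sugeno model with linear consequents at input x.

    centres: (NR, n) membership-function centres, one row per rule
    sigma:   spread shared by all membership functions (assumption)
    P:       (NR, n) consequent coefficients p_ij
    q:       (NR,)   consequent offsets q_i
    """
    x = np.asarray(x, dtype=float)
    centres = np.asarray(centres, dtype=float)
    P = np.asarray(P, dtype=float)
    q = np.asarray(q, dtype=float)
    # Firing strength of each rule: the product of Gaussian memberships
    # equals a Gaussian of the squared distance to the rule centre.
    sq_dist = np.sum((centres - x) ** 2, axis=1)
    w = np.exp(-sq_dist / (2.0 * sigma ** 2))
    # Local linear outputs y_i = p_i1*x_1 + ... + p_in*x_n + q_i
    y_local = P @ x + q
    # Weighted average of the individual rule contributions
    return float(np.sum(w * y_local) / np.sum(w))

# Two rules on one input; at x = 0.5 both rules fire equally.
y = ts_output([0.5], centres=[[0.0], [1.0]], sigma=0.3,
              P=[[1.0], [-1.0]], q=[0.0, 2.0])
```

At the midpoint between the two centres the rules fire with equal strength, so the result is the mean of the two local outputs.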

TS models are quasi-linear in nature [17]; they result in a smooth transition between linear sub-models, each of which is responsible for a separate sub-space of states. This property allows the identification problem to be separated into two sub-problems:

• appropriate partitioning of the state space of interest by clustering;

• parameter identification of the consequent part.

As the output functions $y_i$ are normally linear or singletons (constants), the second sub-problem is easily solved by applying the least squares technique [2]-[3].

The first sub-problem uses clustering since it is more

efficient than grid partitioning. Intuitively grid

partitioning is closer to the linguistic concept of fuzzy

variables, but it is impractical for larger dimensions,

due to the so-called curse of dimensionality [10]-[11].

Fuzzy C-means has been used [7], but it requires iterations. Mountain clustering [8] and its modification, subtractive clustering [6], are therefore preferred [1].

Subtractive clustering is based on the notion of

potential of a data point to be a centre of a cluster. The

following formula is used to express the potential as a

sum of contributions of Euclidean distances between a

given point and all other data points [6]:

$$P_i = \sum_{j=1}^{N} D_{ij} \tag{2a}$$

$$D_{ij} = e^{-\alpha \|z_i - z_j\|^2}, \quad i = 1, 2, \dots, N \tag{2b}$$

where $P_i$ is the potential of the data point $z_i = [x_i; y_i]$ to be a cluster centre, $D_{ij}$ denotes the contribution of every single distance, $N$ is the number of training data samples and $\alpha = 4/r^2$; $r$ is the cluster radius.

Inspection of Equations 2a and 2b shows that the potential of a data point to be a cluster centre is higher when more data points lie close to that candidate. The highest potential is called the reference potential [1].
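As a small illustration of Equations 2a and 2b, the following Python sketch computes the potential of every point in a data set and picks out the reference potential. The function name `potentials` and the toy data are assumptions made for the example.

```python
import numpy as np

def potentials(Z, r):
    """Potential P_i of every row z_i of Z, following (2a)-(2b)."""
    Z = np.asarray(Z, dtype=float)
    alpha = 4.0 / r ** 2
    # Pairwise squared Euclidean distances ||z_i - z_j||^2
    diff = Z[:, None, :] - Z[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=2)
    D = np.exp(-alpha * sq_dist)   # contributions D_ij
    return D.sum(axis=1)           # P_i = sum over j of D_ij

# Three points close together and one isolated point (toy data)
Z = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [1.0, 1.0]])
P = potentials(Z, r=0.5)
reference = P.max()   # the highest potential is the reference potential
```

The point surrounded by close neighbours receives the highest potential, while the isolated point keeps little more than its own contribution.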

The procedure called subtractive clustering [6] is based on successive determination of the point with the highest potential. The potentials of all other points are then reduced by an amount that is proportional to the potential of the chosen point and depends on the distance to this point:

$$P_i^{new} = P_i^{old} - P_k^* D_{ik}^*, \quad i = 1, 2, \dots, N \tag{3}$$

where $P_k^*$ denotes the potential of the k-th centre and $D_{ik}^*$ is the modified contribution, which differs from $D_{ij}$ by the parameter $\alpha$ [1], [6].

When a data point is selected as a new cluster centre, its coordinates become the centres of new membership functions. The point is accepted as a centre if its potential is higher than a certain threshold, which is determined as a function of the reference potential. If the potential is less than a lower threshold (also a function of the reference potential) the point is rejected. If the potential falls between these limits, the point is accepted only if it is sufficiently far away from the current centres; otherwise it is rejected. The distance criterion is based on the shortest of the distances ($d_{min}$) between the new candidate cluster centre and all previously found cluster centres. The following inequality expresses the trade-off between the potential value and the closeness to the previous centres [1], [6]:

$$\frac{d_{min}}{r} + \frac{P_k^*}{P^{ref}} \ge 1 \tag{4}$$

where $P_k^*$ is the potential of the candidate and $P^{ref}$ is the reference potential.
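The complete off-line procedure (potential calculation, selection of the point with the highest potential, reduction of the remaining potentials and the acceptance test) might be sketched in Python as below. The acceptance and rejection ratios (0.5 and 0.15), the factor 1.5 used for the radius of the modified contribution and all function and variable names are assumptions for illustration; the text only states that the thresholds are functions of the reference potential.

```python
import numpy as np

def subtractive_clustering(Z, r=0.5, rb_factor=1.5, accept=0.5, reject=0.15):
    """Return the cluster centres found in the rows of Z (a sketch)."""
    Z = np.asarray(Z, dtype=float)
    alpha = 4.0 / r ** 2
    alpha_mod = 4.0 / (rb_factor * r) ** 2   # assumed modified parameter
    sq_dist = np.sum((Z[:, None, :] - Z[None, :, :]) ** 2, axis=2)
    P = np.exp(-alpha * sq_dist).sum(axis=1)   # potentials of all points
    P_ref = P.max()                            # reference potential
    centres = []
    while True:
        k = int(np.argmax(P))
        P_k = P[k]
        if P_k < reject * P_ref:
            break                              # below lower threshold: stop
        if P_k <= accept * P_ref:
            # Between the thresholds: trade off potential against the
            # distance to the closest existing centre.
            d_min = min(np.linalg.norm(Z[k] - Z[c]) for c in centres)
            if d_min / r + P_k / P_ref < 1.0:
                P[k] = 0.0                     # reject this candidate
                continue
        centres.append(k)
        # Reduce all potentials around the newly accepted centre
        P = P - P_k * np.exp(-alpha_mod * sq_dist[:, k])
    return Z[centres]

# Two well separated groups of five points each (toy data)
blob = np.array([[0.0, 0.0], [0.05, 0.0], [0.0, 0.05],
                 [-0.05, 0.0], [0.0, -0.05]])
found = subtractive_clustering(np.vstack([blob, blob + 1.0]), r=0.5)
```

The first point selected always has the reference potential and is therefore accepted, so the list of centres is never empty when the distance test is reached.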

The second sub-problem (estimation of the parameters of the consequent part) is easily solved by applying the linear least squares technique [1]-[3], [6]. It should be mentioned that the parameters of the antecedent part can be further simplified and optimised, but only through the application of iterative non-linear approaches like GA [7], [10]-[12] or gradient-based techniques [13]. In this way it is possible to improve precision by up to a factor of two and to further simplify and optimise the model structure. The disadvantage, however, is the computational expense.
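A minimal sketch of the least squares step is given below, assuming that the firing strengths of the rules have already been obtained from the antecedent part. The global formulation (all rules estimated in one regression), the function name `estimate_consequents` and the argument layout are assumptions for illustration.

```python
import numpy as np

def estimate_consequents(X, y, W):
    """Least squares estimate of [p_i1, ..., p_in, q_i] for every rule.

    X: (N, n) inputs, y: (N,) outputs,
    W: (N, NR) firing strength of each rule for each sample.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    W = np.asarray(W, dtype=float)
    N, n = X.shape
    NR = W.shape[1]
    lam = W / W.sum(axis=1, keepdims=True)   # normalised firing strengths
    Xe = np.hstack([X, np.ones((N, 1))])     # extended input [x, 1]
    # Regressor row for one sample: [lam_1 * xe, ..., lam_NR * xe]
    Phi = (lam[:, :, None] * Xe[:, None, :]).reshape(N, NR * (n + 1))
    theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return theta.reshape(NR, n + 1)          # one row per rule
```

Because the model output is linear in the consequent parameters once the firing strengths are fixed, a single call to a linear solver is sufficient and no iteration is needed.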

2. Intelligent adaptation of rule-based models

In real-life problems a non-linear model which adapts

to the changes in the environment and adapts to the

object of modelling/control could be the basis for

building intelligent systems that are able to learn more

effectively. eR models (rule-based TS models evolving

in structure and parameters) are seen as a promising

candidate for this purpose [1]. A procedure for on-line

recursive identification of TS models has been

developed in [1]. Basically it consists of:

• calculation of the potential of new data points:

[equation (5) is not recoverable from the extracted text]; k = 0, 1, 2, … (5)

where k denotes the on-line time sampling;

• on-line recursive up-date of the potential of existing cluster (membership function) centres:

[equation (6) and the definition of its terms are not recoverable from the extracted text]; l = 1, 2, …, R (6)

• on-line recursive up-date of the reference potential:

[equation (7) is not recoverable from the extracted text] (7)

• on-line recursive estimation of parameters of the
consequent part.
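The last step of the procedure, recursive estimation of the consequent parameters, can be illustrated with a standard recursive least squares (RLS) update. This is a generic textbook form, not necessarily the exact variant used in [1]; the class name, the large initial covariance `omega` and the absence of a forgetting factor are assumptions.

```python
import numpy as np

class RecursiveLeastSquares:
    """Standard RLS estimator for y = phi . theta (no forgetting factor)."""

    def __init__(self, n_params, omega=1e6):
        self.theta = np.zeros(n_params)
        # Large initial covariance expresses low confidence in theta = 0
        self.C = omega * np.eye(n_params)

    def update(self, phi, y):
        phi = np.asarray(phi, dtype=float)
        C_phi = self.C @ phi
        gain = C_phi / (1.0 + phi @ C_phi)
        # Correct the estimate by the gain times the prediction error
        self.theta = self.theta + gain * (y - phi @ self.theta)
        # Covariance update: C - gain * phi^T C (C is symmetric)
        self.C = self.C - np.outer(gain, C_phi)
        return self.theta

# Identify y = 2*x + 1 one sample at a time; the regressor is [x, 1]
rls = RecursiveLeastSquares(n_params=2)
for x in np.linspace(0.0, 1.0, 50):
    rls.update([x, 1.0], 2.0 * x + 1.0)
```

Each update costs only a few matrix-vector products, which is what makes the consequent estimation suitable for on-line use.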

In order to avoid overloading of memory a moving

window has been introduced [1]. This is critical only

for calculation of the potential of the new data point

(5).
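One possible way to bound the memory with a moving window is sketched below. It assumes that the potential of a new data point is the sum of the contributions of Equation 2b over the stored past points only; the class name `WindowedPotential` and the window length are hypothetical.

```python
from collections import deque
import numpy as np

class WindowedPotential:
    """Potential of each new point w.r.t. the most recent `window` points."""

    def __init__(self, r, window=200):
        self.alpha = 4.0 / r ** 2
        self.buffer = deque(maxlen=window)   # oldest points drop out

    def add(self, z):
        z = np.asarray(z, dtype=float)
        if self.buffer:
            past = np.array(list(self.buffer))
            sq_dist = np.sum((past - z) ** 2, axis=1)
            p = float(np.exp(-self.alpha * sq_dist).sum())
        else:
            p = 0.0                          # no past data yet
        self.buffer.append(z)
        return p

wp = WindowedPotential(r=0.5, window=2)
values = [wp.add([0.0, 0.0]) for _ in range(4)]
```

With a window of two points, repeated identical samples can contribute at most two unit terms, so the potential stops growing once the window is full.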

The fuzzy rule-based model depicted in Equation 1 is

generated automatically, on-line. It is used as a good

initial estimate of the non-linear mapping between

inputs and the output(s). Its optimality could be

guaranteed by using a non-linear numerical

optimisation algorithm, such as a GA [7], [10]-[11].

FRB model simplification

It is desirable, especially in fault diagnosis, to have

transparent models that are as simple as possible while

maintaining a desired level of precision. In order to

maximise the transparency, which also minimises the

memory cost, it is necessary to minimise the number of

membership functions describing each input variable.

This procedure is depicted in Figure 1.

Figure 1.

With the on-line TS model identification, both the

rules and parameters are considered [1]. The number

of rules influences the precision of the model and is

determined by the potential of data samples. The

model simplification process seeks to minimise the

number of membership functions associated with each

input variable.

The approach allows for the optimisation of the

membership function parameters, if the process is

considered to be beneficial to the model. Often,

however, this process will lead to over-fitting, with no

real gain in the model representation of the process.

After the on-line search has yielded a new centre, the membership functions of the new centre are compared for similarity with those of the existing model. Since the spreads of the membership functions are the same, the similarity can be judged on the values of the centre parameters alone. Two centres are deemed to be similar if the distance between them is less than a given threshold (a predetermined percentage of the variable range; 10% to 15% of the whole range seem to be reasonable values to use). Each membership function centre parameter in each input variable is sequentially checked against the new membership function. If the new membership function is similar to an existing one, the new rule is rewritten to use the existing, similar membership function. If no similar membership function exists, the new one is added to the model. It should be noted that the distance threshold in the simplification criterion should not be too high, or the model precision will suffer. Conversely, it should not be too low, otherwise there will not be sufficient simplification of the model.
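The similarity check for one input variable can be sketched as follows. The function name `merge_similar_centres`, the sequential first-match policy and the returned mapping are assumptions for illustration; only the threshold expressed as a fraction of the variable range is taken from the text.

```python
def merge_similar_centres(centres, lo, hi, threshold=0.10):
    """Reuse an existing membership-function centre when a new one is
    closer to it than threshold * (hi - lo).

    centres: centre values of one input variable, one per rule, in the
             order in which the rules were created
    lo, hi:  range of the input variable
    Returns the retained centres and, for every rule, the index of the
    retained centre that the rule uses.
    """
    limit = threshold * (hi - lo)
    kept = []
    mapping = []
    for c in centres:
        for idx, existing in enumerate(kept):
            if abs(c - existing) < limit:
                mapping.append(idx)   # rule rewritten to the existing function
                break
        else:
            kept.append(float(c))     # no similar function: add a new one
            mapping.append(len(kept) - 1)
    return kept, mapping

kept, mapping = merge_similar_centres([0.10, 0.15, 0.50, 0.52, 0.90],
                                      lo=0.0, hi=1.0, threshold=0.10)
```

In this toy case five centres collapse to three, while every rule keeps a reference to the membership function it now shares.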

3. Results and Discussion

To demonstrate the reduction in the number of model

parameters, through similarity, the approach was

applied to a time series prediction problem. Figure 2

shows the training data used in the problem. The data

was collected from a real system.

Figure 2

The plot shows the control signal (top plot) to a valve

that controls the mass flow rate of water through a heat

exchanger. The heat exchanger cools the warm air that

flows on to the coil. The cool air is used to maintain

comfortable conditions in an occupied space. One of the principal loads on the coil is generated by the supply of ambient air, which is required to maintain a minimum standard of indoor air quality. The test system is shown in Figure 3.

Figure 3

The ambient air (T ) and supply air (T ) temperatures (shown in the bottom plot) are sampled at the current and previous time intervals, as is the control signal.

These are the model inputs. The model is then used to

predict the control signal 20 samples ahead of the

current sample. The sample interval is 1 minute. Data

from the same system, but from a different day was

used to validate the models.
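The construction of the six model inputs and the 20-step-ahead target can be sketched as below. The function name `build_dataset`, the argument names and the column order are assumptions for illustration.

```python
import numpy as np

def build_dataset(t_amb, t_sup, u, horizon=20):
    """Build lagged inputs and the target `horizon` samples ahead.

    t_amb, t_sup: ambient and supply air temperature series
    u:            control signal series
    """
    t_amb, t_sup, u = (np.asarray(a, dtype=float) for a in (t_amb, t_sup, u))
    n = len(u)
    # Sample k needs the previous value (k-1) and the future value (k+horizon)
    k = np.arange(1, n - horizon)
    X = np.column_stack([t_amb[k], t_amb[k - 1],
                         t_sup[k], t_sup[k - 1],
                         u[k], u[k - 1]])
    y = u[k + horizon]   # control signal 20 samples ahead by default
    return X, y

series = np.arange(30, dtype=float)   # toy series standing in for real data
X, y = build_dataset(series, series, series, horizon=20)
```

With a one-minute sample interval, the default horizon corresponds to predicting the control signal 20 minutes ahead.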

Using a batch estimation approach on the training data,

the subtractive clustering generated a model with 7

rules and 7 membership functions describing each

input variable. Figure 4 shows these functions for each

input. The number of membership functions in the test

case is 7x6=42. Applying the similarity simplification

(with a threshold value of 10%), this number is

reduced to 12, as shown in Figure 5.

Figure 6 shows the correlation between the model

predictions and the data for the training and validation

cases. The correlation coefficients are noted on each

plot. Figure 7 demonstrates the predictions compared

to the measured data. The loss of precision of the

model that this simplification results in is negligible.

The root mean squared prediction error of the initial

and simplified models on the training data was 0.030

and 0.031 respectively (no units for control signal).

Application to the validation data set yields errors of


References

System Identification: Theory for the User

Fuzzy identification of systems and its applications to modeling and control

ANFIS: adaptive-network-based fuzzy inference system

Computer-Controlled Systems: Theory and Design

Fuzzy Model Identification Based on Cluster Estimation
