
Statistical Evaluation of Rough Set Dependency Analysis

Ivo Düntsch¹
School of Information and Software Engineering
University of Ulster
Newtownabbey, BT 37 0QB, N. Ireland
I.Duentsch@ulst.ac.uk

Günther Gediga¹
FB Psychologie / Methodenlehre
Universität Osnabrück
49069 Osnabrück, Germany
gg@Luce.Psycho.Uni-Osnabrueck.DE
and
Institut für semantische Informationsverarbeitung
Universität Osnabrück

December 12, 1996

¹ Equal authorship implied

Summary
Rough set data analysis (RSDA) has recently become a frequently studied symbolic method in data mining. Among other things, it is used for the extraction of rules from databases; it is, however, not clear from within the methods of rough set analysis whether the extracted rules are valid.

In this paper, we suggest enhancing RSDA by two simple statistical procedures, both based on randomization techniques, to evaluate the validity of predictions based on the approximation quality of attributes in rough set dependency analysis. The first procedure tests the casualness of a prediction, to ensure that the prediction is not based on only a few (casual) observations. The second procedure tests the conditional casualness of an attribute within a prediction rule.

The procedures are applied to three data sets originally published in the context of rough set analysis. We argue that several claims of these analyses need to be modified because they lack validity, and that other, possibly significant, results were overlooked.
Keywords: Rough sets, dependency analysis, statistical evaluation, validation, randomization test

1 Introduction
Rough set analysis, an emerging technology in artificial intelligence (Pawlak et al. (1995)), has been compared with statistical models, see for example Wong et al. (1986), Krusińska et al. (1992a) or Krusińska et al. (1992b). One area of application of rough set theory is the extraction of rules from databases; these rules are then sometimes claimed to be useful for future decision making or prediction of events. However, if such a rule is based on only a few observations, its usefulness for prediction is arguable (see also Krusińska et al. (1992a), p. 253 in this context).

The aim of this paper is to employ statistical methods which are compatible with the rough set philosophy to evaluate the “prediction quality” of rough set dependency analysis. The methods will be applied to three different data sets:

- The first set was published in Pawlak et al. (1986) and Słowiński & Słowiński (1990). It utilizes rough set analysis to describe patients after highly selective vagotomy (HSV) for duodenal ulcer. The statistical validity of the conclusions will be discussed.
- The second example is the discussion of earthquake data published by Teghem & Charlet (1992). The main reason why we use this example is that it demonstrates the applicability of our approach in a situation where the prediction success is perfect in terms of rough set analysis.
- The third example is used by Teghem & Benjelloun (1992) to compare statistical and rough set methods. We show how statistical methods within rough set analysis highlight some of their results in a different way.
2 Rough set data analysis
A major area of application of rough set theory is the study of dependencies among attributes of information systems. An information system S = ⟨U, Ω, (V_q)_{q∈Ω}, f⟩ consists of

1. A set U of objects,
2. A finite set Ω of attributes,
3. For each q ∈ Ω a set V_q of attribute values,
4. An information function f : U × Ω → V, where V := ⋃_{q∈Ω} V_q and f(x, q) ∈ V_q for all x ∈ U, q ∈ Ω.

We think of the descriptor f(x, q) as the value which object x takes at attribute q.
With each Q ⊆ Ω we associate an equivalence relation θ_Q on U by

x ≡ y (θ_Q) :⇐⇒ f(x, q) = f(y, q) for all q ∈ Q.

If x ∈ U, then θ_Q x is the equivalence class of θ_Q containing x.
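
To make these definitions concrete, the following Python sketch models a small information system as a column-indexed table and computes the classes of θ_Q by grouping objects on their Q-values. The data and all identifiers are illustrative inventions, not taken from the paper.

    from collections import defaultdict

    # Toy information system: U is the set of objects; each attribute is a
    # column mapping objects to attribute values (all values invented).
    U = ["a", "b", "c", "d"]
    table = {
        "colour": {"a": "red", "b": "red", "c": "blue", "d": "blue"},
        "size":   {"a": "big", "b": "big", "c": "big",  "d": "small"},
    }

    def theta_classes(table, U, Q):
        """Equivalence classes of theta_Q: x and y share a class
        iff f(x, q) = f(y, q) for every q in Q."""
        groups = defaultdict(set)
        for x in U:
            groups[tuple(table[q][x] for q in sorted(Q))].add(x)
        return list(groups.values())

    print(theta_classes(table, U, {"colour"}))  # two classes: {a, b} and {c, d}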

Intuitively, x ≡ y (θ_Q) if the objects x and y are indiscernible with respect to the values of their attributes from Q. If X ⊆ U, then the lower approximation of X by Q,

X_{θ_Q} = ⋃ { θ_Q x : θ_Q x ⊆ X },

is the set of all correctly classified elements of X with respect to θ_Q, i.e. with the information available from the attributes given in Q.
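
The lower approximation is thus the union of those θ_Q-classes that fit entirely inside X. A minimal sketch, reusing the theta_classes helper from the previous snippet:

    def lower_approximation(table, U, Q, X):
        """Union of all theta_Q classes contained in X: the elements of X
        that are classified correctly using only the attributes in Q."""
        X = set(X)
        return set().union(*[c for c in theta_classes(table, U, Q) if c <= X])

    # For the toy table, only the class {d} of theta_{size} lies inside X:
    print(lower_approximation(table, U, {"size"}, {"a", "b", "d"}))  # {'d'}
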
Suppose that P, Q ⊆ Ω. We say that P is dependent on Q, written Q → P, if every class of θ_P is a union of classes of θ_Q. In other words, the classification of U induced by θ_P can be expressed by the classification induced by θ_Q.

In order to simplify notation, we shall in the sequel usually write Q → p instead of Q → {p}, and θ_p instead of θ_{p}.
Each dependency Q → P leads to a set of rules as follows: Suppose that Q := {q_0, …, q_n} and P := {p_0, …, p_k}. For each set {t_0, …, t_n} where t_i ∈ V_{q_i} there is a uniquely determined set {s_0, …, s_k} with s_i ∈ V_{p_i} such that

(∀x ∈ U)[(f(x, q_0) = t_0 ∧ ⋯ ∧ f(x, q_n) = t_n) → (f(x, p_0) = s_0 ∧ ⋯ ∧ f(x, p_k) = s_k)].   (2.1)
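
In programming terms, when the dependency Q → P holds, the rules (2.1) form a lookup table from Q-value combinations to P-value combinations. The following sketch (same illustrative setup as above) extracts this mapping and flags violations of determinism:

    def extract_rules(table, U, Q, P):
        """Map each Q-value combination occurring in U to the P-value
        combination it determines; fails if some Q-combination does not
        determine the P-values uniquely."""
        rules = {}
        for x in U:
            key = tuple(table[q][x] for q in sorted(Q))
            val = tuple(table[p][x] for p in sorted(P))
            if rules.setdefault(key, val) != val:
                raise ValueError(f"Q-combination {key} does not determine P")
        return rules
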
Of particular interest in rough set dependency theory are those sets Q which use the least number of attributes and still have Q → P. A set with this property is called a minimal determining set for P. In other words, a set Q is minimal determining for P if Q → P and R ↛ P for all R ⊊ Q.

If such a Q is a subset of P, we call Q a reduct of P. It is not hard to see that each P has a reduct, though this need not be unique. The intersection of all reducts of P is called the core of P. Unless P has only one reduct, the core of P is not itself a reduct.
For each R ⊆ Ω let 𝒫_R be the partition of U induced by θ_R. Define

γ_Q(P) = ( Σ_{X ∈ 𝒫_P} |X_{θ_Q}| ) / |U|.   (2.2)

γ_Q(P) is the relative frequency of correctly Q-classified elements with respect to the partition induced by P. It is usually interpreted in rough set analysis as a measure of the prediction success of a set of inference rules based on value combinations of Q and value combinations of P of the form given in (2.1). The prediction success is perfect if γ_Q(P) = 1; in this case, Q → P.
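
Equation (2.2) translates directly into code: sum the sizes of the lower approximations of the θ_P-classes and divide by |U|. A sketch built on the hypothetical helpers above:

    def gamma(table, U, Q, P):
        """Approximation quality gamma_Q(P) of (2.2): the fraction of objects
        whose theta_P class is predicted correctly from the attributes in Q."""
        correct = sum(len(lower_approximation(table, U, Q, X))
                      for X in theta_classes(table, U, P))
        return correct / len(U)

    print(gamma(table, U, {"colour"}, {"size"}))  # 0.5 for the toy table
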
Suppose that Q is a reduct of P, so that Q → P and Q \ {q} ↛ P for any q ∈ Q. In rough set theory, the impact of attribute q on the fact that Q → P is usually measured by the drop of the approximation function γ from 1 to γ_{Q\{q}}(P): the larger the difference, the more important the contribution of q is regarded to be. We shall show below that this interpretation needs to be taken with care in some cases, and that additional statistical evidence may be needed.
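
This traditional importance measure is simply the drop 1 − γ_{Q\{q}}(P) for each q ∈ Q; for illustration, with the hypothetical gamma helper from above:

    def gamma_drops(table, U, Q, P):
        """Drop in approximation quality when a single attribute q is removed
        from the reduct Q; large drops are traditionally read as importance."""
        full = gamma(table, U, Q, P)
        return {q: full - gamma(table, U, set(Q) - {q}, P) for q in Q}
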

3 Casual rules and randomization analysis
3.1 Casual dependencies
In the sequel we consider the case that a rule Q → P was given before performing the data analysis, and was not obtained by optimizing the quality of approximation. The latter needs additional treatment and will be discussed briefly in Section 3.5.
Suppose that θ_Q is the identity relation id_U on U. Then θ_Q ⊆ θ_P for all P ⊆ Ω, i.e. Q → P for all P ⊆ Ω. Furthermore, each class of θ_Q consists of exactly one element, and therefore any rule Q → P is based on exactly one observation. We call such a rule deterministic casual.
If a rule is not deterministic casual, it may nevertheless be based on only a few observations, and thus its prediction quality could be limited; such rules may be called casual. Therefore, the need arises for a statistical procedure which tests the casualness of a rule based on the mechanisms of rough set analysis.

Assume that the information system is the realization of a random process in which the attribute values of Q and P are realized independently of each other. If no additional information is present, it may be assumed that the attribute value combinations within Q and P are fixed and that the matching of the Q, P combinations is drawn at random.
Let σ be a permutation of U, and Q ⊆ Ω. We define a new information function f_{σ(Q)} by

f_{σ(Q)}(x, r) := f(σ(x), r) if r ∈ Q, and f_{σ(Q)}(x, r) := f(x, r) otherwise,

and let γ_{σ(Q)}(P) be the approximation quality of the prediction of P by Q in the new information system. Note that the structure of the equivalence relation θ_{σ(Q)} determined by Q in the revised system is the same as that of the original θ_Q. In other words, there is a bijective mapping

τ : {θ_{σ(Q)} x : x ∈ U} → {θ_Q x : x ∈ U}

which preserves the cardinality of the classes. In particular, if θ_Q is the identity on U, so is θ_{σ(Q)}. It follows that for a rule Q → p with θ_Q = id_U, we have γ_{σ(Q)}(p) = 1 as well, for all permutations σ of U.
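
Computationally, f_{σ(Q)} merely reassigns which object carries which combination of Q-values, leaving the class structure of θ_Q intact. A sketch, drawing σ uniformly at random and reusing the toy setup from above:

    import random

    def permute_Q(table, U, Q, rng=random):
        """Table for f_sigma(Q): the Q-columns are rearranged by a random
        permutation sigma of U, so theta_sigma(Q) keeps the class sizes of
        theta_Q while its matching with the remaining attributes changes."""
        sigma = dict(zip(U, rng.sample(U, len(U))))
        new_table = {a: dict(col) for a, col in table.items()}
        for q in Q:
            for x in U:
                new_table[q][x] = table[q][sigma[x]]
        return new_table
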
The distribution of the prediction success is given by the set

R_{P,Q} := { γ_{σ(Q)}(P) : σ a permutation of U }.
Let H be the null hypothesis; we have to estimate the position of the observed approximation quality γ_obs := γ_Q(P) in the set R_{P,Q}, i.e. to estimate the probability p(γ_R ≥ γ_obs | H). Standard randomization techniques (see for example Manly (1991), Chapter 1) can now be applied to estimate this probability.
If p(γ_R ≥ γ_obs | H) is low (conventionally, when γ_obs lies in the upper 5% region), the assumption of randomness can be rejected; otherwise, if

p(γ_R ≥ γ_obs | H) > 0.05,

we call the rule (random) casual.
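
Putting the pieces together, p(γ_R ≥ γ_obs | H) can be estimated by Monte Carlo sampling over random permutations, in the spirit of Manly (1991). A sketch using the hypothetical helpers from the previous snippets; the add-one correction counts the observed system itself among the trials:

    def casualness_test(table, U, Q, P, trials=1000, alpha=0.05, rng=random):
        """Estimate p(gamma >= gamma_obs | H) under random matching of the
        Q-value rows to the remaining attributes; the rule Q -> P is called
        (random) casual when this estimate exceeds alpha."""
        gamma_obs = gamma(table, U, Q, P)
        hits = sum(gamma(permute_Q(table, U, Q, rng), U, Q, P) >= gamma_obs
                   for _ in range(trials))
        p_value = (hits + 1) / (trials + 1)
        return p_value, p_value > alpha  # (estimate, is the rule casual?)
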

References

Edgington, E. S. Randomization Tests. New York: Marcel Dekker.

Efron, B. & Tibshirani, R. J. (1993). An Introduction to the Bootstrap. New York: Chapman & Hall.

Manly, B. F. J. (1991). Randomization and Monte Carlo Methods in Biology. London: Chapman & Hall.

Pawlak, Z., Grzymała-Busse, J., Słowiński, R. & Ziarko, W. (1995). Rough sets. Communications of the ACM, 38(11), 88–95.