
Statistical Evaluation of Rough Set Dependency Analysis

Ivo Düntsch¹
School of Information and Software Engineering
University of Ulster
Newtownabbey, BT 37 0QB, N. Ireland
I.Duentsch@ulst.ac.uk

Günther Gediga¹
FB Psychologie / Methodenlehre
Universität Osnabrück
49069 Osnabrück, Germany
gg@Luce.Psycho.Uni-Osnabrueck.DE
and
Institut für semantische Informationsverarbeitung
Universität Osnabrück

December 12, 1996

¹ Equal authorship implied

Summary
Rough set data analysis (RSDA) has recently become a frequently studied symbolic method in data mining. Among other things, it is being used for the extraction of rules from databases; it is, however, not clear from within the methods of rough set analysis whether the extracted rules are valid.
In this paper, we propose to enhance RSDA by two simple statistical procedures, both based on randomization techniques, to evaluate the validity of prediction based on the approximation quality of attributes of rough set dependency analysis. The first procedure tests the casualness of a prediction to ensure that the prediction is not based on only a few (casual) observations. The second procedure tests the conditional casualness of an attribute within a prediction rule.
The procedures are applied to three data sets, originally published in the context of rough set analysis. We argue that several claims of these analyses need to be modified for lack of validity, and that other possibly significant results were overlooked.
Keywords: Rough sets, dependency analysis, statistical evaluation, validation, randomization test

1 Introduction
Rough set analysis, an emerging technology in artificial intelligence (Pawlak et al. (1995)), has been compared with statistical models, see for example Wong et al. (1986), Krusińska et al. (1992a) or Krusińska et al. (1992b). One area of application of rough set theory is the extraction of rules from databases; these rules are then sometimes claimed to be useful for future decision making or prediction of events. However, if such a rule is based on only a few observations, its usefulness for prediction is arguable (see also Krusińska et al. (1992a), p. 253 in this context).
The aim of this paper is to employ statistical methods which are compatible with the rough set philosophy to evaluate the “prediction quality” of rough set dependency analysis. The methods will be applied to three different data sets:
• The first set was published in Pawlak et al. (1986) and Słowiński & Słowiński (1990). It utilizes rough set analysis to describe patients after highly selective vagotomy (HSV) for duodenal ulcer. The statistical validity of the conclusions will be discussed.
• The second example is the discussion of earthquake data published by Teghem & Charlet (1992). The main reason why we use this example is that it demonstrates the applicability of our approach in the situation when the prediction success is perfect in terms of rough set analysis.
• The third example is used by Teghem & Benjelloun (1992) to compare statistical and rough set methods. We show how statistical methods within rough set analysis highlight some of their results in a different way.

2 Rough set data analysis
A major area of application of rough set theory is the study of dependencies among attributes of information systems. An information system $\mathcal{S} = \langle U, \Omega, V_q, f \rangle_{q \in \Omega}$ consists of
1. A set $U$ of objects,
2. A finite set $\Omega$ of attributes,
3. For each $q \in \Omega$ a set $V_q$ of attribute values,
4. An information function $f : U \times \Omega \to V \overset{\mathrm{def}}{=} \bigcup_{q \in \Omega} V_q$ with $f(x, q) \in V_q$ for all $x \in U$, $q \in \Omega$.
We think of the descriptor $f(x, q)$ as the value which object $x$ takes at attribute $q$.
With each $Q \subseteq \Omega$ we associate an equivalence relation $\theta_Q$ on $U$ by

$$x \equiv y \ (\theta_Q) \overset{\mathrm{def}}{\iff} f(x, q) = f(y, q) \text{ for all } q \in Q.$$

If $x \in U$, then $\theta_Q x$ is the equivalence class of $\theta_Q$ containing $x$.

Intuitively, $x \equiv y \ (\theta_Q)$ if the objects $x$ and $y$ are indiscernible with respect to the values of their attributes from $Q$. If $X \subseteq U$, then the lower approximation of $X$ by $Q$

$$\underline{X}_{\theta_Q} = \bigcup \{ \theta_Q x : \theta_Q x \subseteq X \}$$

is the set of all correctly classified elements of $X$ with respect to $\theta_Q$, i.e. with the information available from the attributes given in $Q$.
Suppose that $P, Q \subseteq \Omega$. We say that $P$ is dependent on $Q$, written as $Q \to P$, if every class of $\theta_P$ is a union of classes of $\theta_Q$. In other words, the classification of $U$ induced by $\theta_P$ can be expressed by the classification induced by $\theta_Q$.
In order to simplify notation we shall in the sequel usually write $Q \to p$ instead of $Q \to \{p\}$ and $\theta_p$ instead of $\theta_{\{p\}}$.
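To make these definitions concrete, the following minimal Python sketch computes the classes of $\theta_Q$ and the lower approximation of a set $X$. The universe, the attributes a, b, d and all attribute values are hypothetical, chosen only for illustration; they are not taken from the data sets analysed later.

```python
from collections import defaultdict

# A hypothetical information system: f maps (object, attribute) to f(x, q).
U = ["x1", "x2", "x3", "x4", "x5"]
f = {
    ("x1", "a"): 0, ("x1", "b"): 1, ("x1", "d"): 1,
    ("x2", "a"): 0, ("x2", "b"): 1, ("x2", "d"): 1,
    ("x3", "a"): 1, ("x3", "b"): 0, ("x3", "d"): 0,
    ("x4", "a"): 1, ("x4", "b"): 1, ("x4", "d"): 1,
    ("x5", "a"): 1, ("x5", "b"): 0, ("x5", "d"): 0,
}

def classes(f, U, Q):
    """Classes of theta_Q: x and y share a class iff f(x, q) = f(y, q) for all q in Q."""
    blocks = defaultdict(set)
    for x in U:
        blocks[tuple(f[(x, q)] for q in sorted(Q))].add(x)
    return list(blocks.values())

def lower_approximation(f, U, Q, X):
    """Lower approximation of X by Q: union of the theta_Q classes contained in X."""
    return {x for c in classes(f, U, Q) if c <= X for x in c}

print(classes(f, U, {"a"}))                                  # [{'x1','x2'}, {'x3','x4','x5'}]
print(lower_approximation(f, U, {"a"}, {"x1", "x2", "x3"}))  # {'x1', 'x2'}
```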
Each dependency $Q \to P$ leads to a set of rules as follows: Suppose that $Q \overset{\mathrm{def}}{=} \{q_0, \ldots, q_n\}$ and $P \overset{\mathrm{def}}{=} \{p_0, \ldots, p_k\}$. For each set $\{t_0, \ldots, t_n\}$ where $t_i \in V_{q_i}$ there is a uniquely determined set $\{s_0, \ldots, s_k\}$ with $s_i \in V_{p_i}$ such that

$$(\forall x \in U)\,[(f(x, q_0) = t_0 \wedge \cdots \wedge f(x, q_n) = t_n) \to (f(x, p_0) = s_0 \wedge \cdots \wedge f(x, p_k) = s_k)]. \tag{2.1}$$
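Extracting the rules (2.1) from a dependency amounts to tabulating, for each $Q$-value combination occurring in $U$, the corresponding $P$-value combination; if some combination meets two different $P$-value combinations, $Q \to P$ fails. A sketch, reusing the hypothetical system above:

```python
def rules(f, U, Q, P):
    """Tabulate the rules (2.1): each Q-value combination occurring in U
    determines exactly one P-value combination, provided Q -> P holds."""
    table = {}
    for x in U:
        t = tuple(f[(x, q)] for q in sorted(Q))
        s = tuple(f[(x, p)] for p in sorted(P))
        if table.setdefault(t, s) != s:
            raise ValueError(f"no unique P-values for Q-values {t}")
    return table

print(rules(f, U, {"a", "b"}, {"d"}))  # {(0, 1): (1,), (1, 0): (0,), (1, 1): (1,)}
```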
Of particular interest in rough set dependency theory are those sets $Q$ which use the least number of attributes and still have $Q \to P$. A set with this property is called a minimal determining set for $P$. In other words, a set $Q$ is minimal determining for $P$ if $Q \to P$ and $R \not\to P$ for all $R \subsetneq Q$.
If such a $Q$ is a subset of $P$, we call $Q$ a reduct of $P$. It is not hard to see that each $P$ has a reduct, though this need not be unique. The intersection of all reducts of $P$ is called the core of $P$. Unless $P$ has only one reduct, the core of $P$ is not itself a reduct.
For each $R \subseteq \Omega$ let $\mathcal{P}_R$ be the partition of $U$ induced by $\theta_R$. Define

$$\gamma_Q(P) = \sum_{X \in \mathcal{P}_P} \frac{|\underline{X}_{\theta_Q}|}{|U|}. \tag{2.2}$$

$\gamma_Q(P)$ is the relative frequency of correctly $Q$-classified elements with respect to the partition induced by $P$. It is usually interpreted in rough set analysis as a measurement of the prediction success of a set of inference rules based on value combinations of $Q$ and value combinations of $P$ of the form given in (2.1). The prediction success is perfect if $\gamma_Q(P) = 1$; in this case, $Q \to P$.
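With the helper functions above, (2.2) is a one-liner. In the hypothetical system, $\{a, b\}$ determines $d$ perfectly, while $a$ alone classifies only two of the five objects correctly:

```python
def gamma(f, U, Q, P):
    """Approximation quality gamma_Q(P) of (2.2): the fraction of U whose
    theta_Q class lies entirely within a single class of theta_P."""
    return sum(len(lower_approximation(f, U, Q, X))
               for X in classes(f, U, P)) / len(U)

print(gamma(f, U, {"a", "b"}, {"d"}))  # 1.0, i.e. {a, b} -> {d}
print(gamma(f, U, {"a"}, {"d"}))       # 0.4
```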
Suppose that $Q$ is a reduct of $P$, so that $Q \to P$ and $Q \setminus \{q\} \not\to P$ for any $q \in Q$. In rough set theory, the impact of attribute $q$ on the fact that $Q \to P$ is usually measured by the drop of the approximation function $\gamma$ from 1 to $\gamma_{Q \setminus \{q\}}(P)$: the larger the difference, the more important one regards the contribution of $q$. We shall show below that this interpretation needs to be taken with care in some cases, and additional statistical evidence may be needed.

3 Casual rules and randomization analysis
3.1 Casual dependencies
In the sequel we consider the case that a rule $Q \to P$ was given before performing the data analysis, and not obtained by optimizing the quality of approximation. The latter needs additional treatment and will be discussed briefly in Section 3.5.
Suppose that $\theta_Q$ is the identity relation $\mathrm{id}_U$ on $U$. Then $\theta_Q \subseteq \theta_P$ for all $P \subseteq \Omega$, i.e. $Q \to P$ for all $P \subseteq \Omega$. Furthermore, each class of $\theta_Q$ consists of exactly one element, and therefore any rule $Q \to P$ is based on exactly one observation. We call such a rule deterministic casual.
If a rule is not deterministic casual, it nevertheless may be based on a few observations only, and thus its prediction quality could be limited; such rules may be called casual. Therefore, the need arises for a statistical procedure which tests the casualness of a rule based on mechanisms of rough set analysis.
Assume that the information system is the realization of a random process in which the attribute values of $Q$ and $P$ are realized independently of each other. If no additional information is present, it may be assumed that the attribute value combinations within $Q$ and $P$ are fixed and the matching of the $Q$, $P$ combinations is drawn at random.
Let $\sigma$ be a permutation of $U$, and $Q \subseteq \Omega$. We define a new information function $f_{\sigma(Q)}$ by

$$f_{\sigma(Q)}(x, r) \overset{\mathrm{def}}{=} \begin{cases} f(\sigma(x), r), & \text{if } r \in Q, \\ f(x, r), & \text{otherwise,} \end{cases}$$

and let $\gamma_{\sigma(Q)}(P)$ be the approximation quality of the prediction of $P$ by $Q$ in the new information system.
Note that the structure of the equivalence relation $\theta_{\sigma(Q)}$ determined by $Q$ in the revised system is the same as that of the original $\theta_Q$. In other words, there is a bijective mapping

$$\tau : \{\theta_{\sigma(Q)} x : x \in U\} \to \{\theta_Q x : x \in U\}$$

which preserves the cardinality of the classes. In particular, if $\theta_Q$ is the identity on $U$, so is $\theta_{\sigma(Q)}$. It follows that for a rule $Q \to p$ with $\theta_Q = \mathrm{id}_U$, we have $\gamma_{\sigma(Q)}(p) = 1$ as well for all permutations $\sigma$ of $U$.
The distribution of the prediction success is given by the set

$$R_{P,Q} \overset{\mathrm{def}}{=} \{ \gamma_{\sigma(Q)}(P) : \sigma \text{ a permutation of } U \}.$$
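A sketch of this construction, continuing the hypothetical example: only the attachment of $Q$-value rows to objects changes, so the class structure of $\theta_{\sigma(Q)}$ mirrors that of $\theta_Q$, as noted above.

```python
import random

def permuted_f(f, U, Q, sigma):
    """The information function f_{sigma(Q)}: Q-values are read off the
    permuted object sigma(x); all other attribute values stay unchanged."""
    return {(x, r): f[(sigma[x], r)] if r in Q else f[(x, r)]
            for (x, r) in f}

sigma = dict(zip(U, random.sample(U, len(U))))  # one random permutation of U
print(gamma(permuted_f(f, U, {"a"}, sigma), U, {"a"}, {"d"}))  # one element of R_{P,Q}
```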
Let $H$ be the null hypothesis; we have to estimate the position of the observed approximation quality $\gamma_{obs} \overset{\mathrm{def}}{=} \gamma_Q(P)$ in the set $R_{P,Q}$, i.e. to estimate the probability $p(\gamma_R \geq \gamma_{obs} \mid H)$. Standard randomization techniques, for example Manly (1991), Chapter 1, can now be applied to estimate this probability.
If $p(\gamma_R \geq \gamma_{obs} \mid H)$ is low, conventionally in the upper 5% region, the assumption of randomness can be rejected; otherwise, if

$$p(\gamma_R \geq \gamma_{obs} \mid H) > 0.05,$$

we call the rule (random) casual.
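Since $R_{P,Q}$ ranges over all $|U|!$ permutations, the probability is in practice estimated from a random sample of permutations. The following sketch follows the usual Monte Carlo convention of counting the observed value among the samples; the sample size n_perm is a free parameter of this illustration, not prescribed by the method.

```python
def casualness_test(f, U, Q, P, n_perm=1000, seed=0):
    """Estimate p(gamma_R >= gamma_obs | H) by sampling permutations of U;
    a rule whose estimate exceeds 0.05 is called (random) casual."""
    rng = random.Random(seed)
    gamma_obs = gamma(f, U, Q, P)
    hits = 1  # the identity permutation reproduces gamma_obs itself
    for _ in range(n_perm):
        sigma = dict(zip(U, rng.sample(U, len(U))))
        if gamma(permuted_f(f, U, Q, sigma), U, Q, P) >= gamma_obs:
            hits += 1
    return hits / (n_perm + 1)

print(casualness_test(f, U, {"a", "b"}, {"d"}))
```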

References
Manly, B. F. J. (1991). Randomization and Monte Carlo Methods in Biology. London: Chapman & Hall.
Mehta, C. R., & Hilton, J. F. (1993). Exact power of conditional and unconditional tests: Going beyond the 2 × 2 contingency table. The American Statistician.
Pawlak, Z., Słowiński, K., & Słowiński, R. (1986). Rough classification of patients after highly selective vagotomy for duodenal ulcer. International Journal of Man-Machine Studies.
Słowiński, R. (Ed.) (1992). Intelligent Decision Support: Handbook of Applications and Advances of the Rough Sets Theory. Dordrecht: Kluwer Academic Publishers.
Wong, S. K. M., Ziarko, W., & Ye, R. L. (1986). Comparison of rough-set and statistical methods in inductive learning. International Journal of Man-Machine Studies.