Statistical evaluation of rough set dependency analysis
Summary
1 Introduction
- The methods are applied to three different data sets.
- It utilizes rough set analysis to describe patients after highly selective vagotomy (HSV) for duodenal ulcer.
- The authors show how statistical methods within rough set analysis highlight some of their results in a different way.
2 Rough set data analysis
- Of particular interest in rough set dependency theory are those sets Q which use the least number of attributes and still have Q → P.
- The intersection of all reducts of P is called the core of P.
- For each R ⊆ Ω let 𝒫_R be the partition of U induced by θ_R. Define γ_Q(P) = Σ_{X ∈ 𝒫_P} |X_{θ_Q}| / |U| (2.2), where X_{θ_Q} is the lower approximation of X with respect to θ_Q. γ_Q(P) is the relative frequency of the number of correctly Q-classified elements with respect to the partition induced by P.
- The larger the difference, the more important one regards the contribution ofq.
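The approximation quality of equation (2.2) can be computed directly from an information table. The sketch below is a hypothetical illustration, not the authors' implementation: it groups objects by their Q-attribute values and counts those whose Q-class lies entirely within a single P-class.

```python
from collections import defaultdict

def gamma(table, Q, P):
    """Approximation quality gamma_Q(P): the fraction of objects whose
    Q-equivalence class is contained in a single P-equivalence class."""
    # Group object indices by their tuple of Q-attribute values.
    q_classes = defaultdict(list)
    for i, row in enumerate(table):
        q_classes[tuple(row[a] for a in Q)].append(i)
    correct = 0
    for members in q_classes.values():
        # A Q-class belongs to the lower approximation of some P-class
        # iff all its members agree on the P attributes.
        p_values = {tuple(table[i][a] for a in P) for i in members}
        if len(p_values) == 1:
            correct += len(members)
    return correct / len(table)

# Toy information system: columns 0, 1 are condition attributes, 2 is the decision.
data = [
    (0, 1, "a"),
    (0, 1, "a"),   # same Q-class as row 0, same decision -> counted
    (1, 0, "b"),
    (1, 1, "a"),
    (1, 1, "b"),   # clashes with row 3 -> that Q-class is not counted
]
print(gamma(data, Q=(0, 1), P=(2,)))  # → 0.6
```

Dropping an attribute from Q and comparing the two γ values gives the "decline of approximation quality" used later to judge an attribute's contribution.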
3.1 Casual dependencies
- In the sequel the authors consider the case that a rule Q → P was given before performing the data analysis, and not obtained by optimizing the quality of approximation.
- The latter needs additional treatment and will be discussed briefly in Section 3.5.
- The reference model is based on permutations σ : U → U which preserve the cardinality of the classes.
- Standard randomization techniques – for example Manly (1991), Chapter 1 – can now be applied to estimate this probability.
- To decide whether the given rule is casual under the statistical assumption, the authors have to consider all 720 possible rules {σ(p), σ(q)} → d and their approximation qualities.
3.2 How the randomization procedure works
- The proposed randomization test procedure is one way to model errors in terms of a statistical approach.
- Their approach is aimed at testing the casualness of a rule system; the assumption of representativeness, by contrast, is a problem of any analysis in most real-life data bases, not of the randomization test in particular.
- Any observation within the other six classes ofθQ was randomly assigned to one of the three classes ofθP .
- The true value of the approximation quality γ is varied over 0.0, 0.1, 0.2 and 0.3. Figure 1 shows the problem of granularity: given N = 10 observations and a true value of γ = 0.0, the expectation of γ̂ is about 0.32; the granularity overshoot vanishes at about N = 40.
- The power curves of an effectγ > 0.0 show that the randomization test has a reasonable power – at least in the chosen situation.
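The randomization test described in Sections 3.1–3.2 can be sketched as follows. This is a hypothetical illustration under the stated assumptions, not the authors' code: the Q-values are shuffled jointly across objects (one cardinality-preserving permutation σ per trial), γ is recomputed, and the estimated p-value is the fraction of trials reaching at least the observed γ.

```python
import random
from collections import defaultdict

def gamma(rows, Q, P):
    """Approximation quality gamma_Q(P), as in Section 2."""
    q_classes = defaultdict(list)
    for i, r in enumerate(rows):
        q_classes[tuple(r[a] for a in Q)].append(i)
    correct = sum(len(m) for m in q_classes.values()
                  if len({tuple(rows[i][a] for a in P) for i in m}) == 1)
    return correct / len(rows)

def randomization_test(rows, Q, P, trials=1000, seed=0):
    """Estimate the probability that a gamma at least as large as the
    observed one arises when the Q-values are randomly reassigned to
    objects, i.e. that the rule Q -> P is 'casual' (due to chance)."""
    rng = random.Random(seed)
    observed = gamma(rows, Q, P)
    q_cols = [[r[a] for a in Q] for r in rows]  # shuffle Q jointly, which
    hits = 0                                    # preserves class cardinalities
    for _ in range(trials):
        rng.shuffle(q_cols)
        shuffled = [tuple(qc) + tuple(r[a] for a in P)
                    for qc, r in zip(q_cols, rows)]
        if gamma(shuffled, range(len(Q)),
                 range(len(Q), len(Q) + len(P))) >= observed:
            hits += 1
    return observed, hits / trials
```

With `trials=1000` the cost is 1000 evaluations of γ, which matches the complexity estimate given in Section 3.3 below.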
3.3 Computational considerations
- It is well known that randomization is a rather expensive procedure, and one might have objections against this technique because of its cost in real life applications.
- If f(N) is the time complexity for performing the computation of γ, the time complexity of the simulation-based randomization procedure is 1000·f(N).
- If randomization is too costly for a data set, then RSDA itself will not be applicable to it either.
- Some simple short cuts, such as a check whether the entropy of the Q-partition is near log₂(N), may avoid superfluous computation.
- For their re-analysis of the published data sets below it was not necessary to speed up the computations.
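The entropy short cut can be made concrete as below (a hypothetical sketch; the function names and the tolerance are my own, not the paper's). When the entropy of the Q-partition is close to log₂(N), nearly every object sits in its own Q-class, so γ is trivially (near) 1 and running the randomization test would be superfluous.

```python
import math
from collections import Counter

def q_partition_entropy(rows, Q):
    """Entropy (in bits) of the partition of the objects induced by Q."""
    counts = Counter(tuple(r[a] for a in Q) for r in rows)
    n = len(rows)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def randomization_superfluous(rows, Q, tol=0.05):
    # If H(Q-partition) is within tol of its maximum log2(N), almost every
    # Q-class is a singleton: gamma is trivially high and the randomization
    # test cannot add information, so the computation can be skipped.
    return q_partition_entropy(rows, Q) >= (1 - tol) * math.log2(len(rows))
```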
3.4 Conditional casual attributes
- In rough set analysis, the decline of the approximation quality when omitting one attribute is usually used to determine whether an attribute within a minimal determining set is of high value for the prediction.
- This approach does not take into account that the decline of approximation quality may be due to chance.
- Assume that an additional attribute r is conceptualized in three different ways, among them a fine-grained measure r1 using 8 categories and a medium-grained description r2 using 4 categories.
- Therefore the authors cannot trust the rules derived from the description {q, r1} → p, because the attribute r1 is exchangeable with any randomly generated attribute s = σ(r1).
- Whereas the statistical evaluation of the additional predictive power of the three chosen attributes differs, the analysis of the decline of the approximation quality tells us nothing about these differences.
3.5 Cross validation of learned dependencies
- If rough set analysis is used to learn the best subset ofΩ to determineP , a simple randomization procedure is not sufficient, because it does not reflect the optimization of the learning procedure.
- Within the test subset the same procedure can be used to validate the chosen attributes.
- If the test procedure does not show a significant result, there are too few rules which can be used to predict the decision attributes from the learned attributes.
- Note that these rules need not be the same as those in the learning subset!
- If the additional attribute is conditional casual, the hypothesis that the rules in both sets of objects are identical should be kept.
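The cross-validation scheme of Section 3.5 can be outlined as follows (a hypothetical sketch under the stated assumptions; `validate` stands in for the randomization test and the attribute search is not shown, since the paper's selection procedure is not reproduced here):

```python
import random

def split_and_validate(rows, validate, seed=1):
    """Split the objects into a learning and a test subset of (nearly)
    equal size. Attribute selection would run on the learning half;
    `validate` (e.g. a randomization test) is applied to the test half,
    so the optimization of the learning step does not bias the test."""
    rng = random.Random(seed)
    idx = list(range(len(rows)))
    rng.shuffle(idx)
    half = len(rows) // 2
    learn = [rows[i] for i in idx[:half]]
    test = [rows[i] for i in idx[half:]]
    # ... run the reduct / attribute search on `learn` here (not shown) ...
    return learn, validate(test)
```

This mirrors the split used in Section 4.1, where the duodenal ulcer data set is divided into two subsets of 61 cases each.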
4.1 Duodenal ulcer data
- All data used in this paper are obtainable from ftp://luce.psycho.uni-osnabrueck.de/.
- Pawlak et al. (1986) obtained, using rough set analysis, that the attribute set R, consisting of attribute 3 (duration of disease), 4 (complication), 5 (basic HCl concentration), 6 (basic volume of gastric juice), 9 (stimulated HCl concentration) and 10 (stimulated volume of gastric juice), suffices to predict attribute 12 ("Visick grading").
- The attribute set discussed in Pawlak et al. (1986) was based on a reduct searching procedure.
- In order to discuss the cross validation procedure, the authors split the data set into 2 subsets containing 61 cases each.
- Furthermore, the result suggests a reduction of the number of attributes withinR, because all attributes are conditional casual.
4.2 Earthquake data
- In Teghem & Benjelloun (1992), the authors search for premonitory factors for earthquakes by emphasizing gas geochemistry.
- The partition attribute (attribute 16) was the seismic activity on 155 days measured on the Richter scale.
- The other attributes were radon concentration measured at 8 different locations (attributes 1-8) and 7 measures of climatic factors (attributes 9-15).
- A problem with the information system was that it has an empty core with respect to attribute 16, and that an evaluation of some reducts turned out to be difficult.
- The statistical evaluation of some of the information systems proposed by Teghem & Benjelloun (1992) gives us additional insights (Tab. 6).
5 Conclusion
- Gathering evidence in procedures of Artificial Intelligence should not be based upon casual observations.
- The authors' approach shows how, in principle, a system using rough set dependency analysis can defend itself against randomness.
- The reanalysis of three published data sets shows that there is an urgent need for such a technique: parts of the claimed results for the first two data sets are invalidated, some promising dependencies were overlooked, and, as the authors show using the data of Study 1, their proposed cross-validation technique offers a new horizon for interpretation.
- Concerning Study 3, the conclusions of the authors are validated.