
Showing papers on "Rough set published in 2001"


Journal ArticleDOI
TL;DR: The original rough set approach proved very useful in dealing with inconsistency problems following from information granulation, but it fails when preference orders of attribute domains (criteria) have to be taken into account, and it cannot handle inconsistencies following from violation of the dominance principle.

1,544 citations


Journal ArticleDOI
Yiyu Yao1
TL;DR: The granulation structures used by standard rough set theory and the corresponding approximation structures are reviewed and the notion of neighborhood systems is also explored.
Abstract: Information granulation and concept approximation are some of the fundamental issues of granular computing. Granulation of a universe involves grouping of similar elements into granules to form coarse-grained views of the universe. Approximation of concepts, represented by subsets of the universe, deals with the descriptions of concepts using granules. In the context of rough set theory, this paper examines the two related issues. The granulation structures used by standard rough set theory and the corresponding approximation structures are reviewed. Hierarchical granulation and approximation structures are studied, which results in stratified rough set approximations. A nested sequence of granulations induced by a set of nested equivalence relations leads to a nested sequence of rough set approximations. A multi-level granulation, characterized by a special class of equivalence relations, leads to a more general approximation structure. The notion of neighborhood systems is also explored.

515 citations
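As a concrete illustration of the approximation structures reviewed in the paper above, here is a minimal sketch (toy universe, attribute descriptions and concept invented for illustration, not taken from the paper) of standard rough set approximations computed from the partition induced by object descriptions:

```python
# Minimal sketch with an invented universe: granules are equivalence classes of
# objects sharing the same description; a concept X is approximated from below
# (granules certainly inside X) and from above (granules possibly inside X).
from collections import defaultdict

def granules(universe, description):
    """Group objects with identical descriptions into equivalence classes."""
    blocks = defaultdict(set)
    for x in universe:
        blocks[description[x]].add(x)
    return blocks.values()

def approximations(universe, description, X):
    lower, upper = set(), set()
    for block in granules(universe, description):
        if block <= X:        # granule entirely contained in the concept
            lower |= block
        if block & X:         # granule overlapping the concept
            upper |= block
    return lower, upper

description = {1: ("a", 0), 2: ("a", 0), 3: ("b", 1), 4: ("b", 1), 5: ("c", 0)}
X = {1, 3, 5}
print(approximations(description, description, X))
# ({5}, {1, 2, 3, 4, 5}): only object 5 is certainly in X, all five possibly are
```

Dropping the second attribute from the descriptions would give a coarser, nested granulation of the kind the paper stratifies into hierarchical approximation structures.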


Journal ArticleDOI
01 Aug 2001
TL;DR: This paper introduces two generalisations of rough set theory: one uses a non-symmetric similarity relation to formalise the absent-value semantics, the other uses valued tolerance relations; it shows that the valued tolerance approach makes it possible to obtain more informative approximations and decision rules.
Abstract: The rough set theory, based on the original definition of the indiscernibility relation, is not useful for analysing incomplete information tables where some values of attributes are unknown. In this paper we distinguish two different semantics for incomplete information: the "missing value" semantics and the "absent value" semantics. The already known approaches, e.g. based on the tolerance relations, deal with the missing value case. We introduce two generalisations of the rough sets theory to handle these situations. The first generalisation introduces the use of a non-symmetric similarity relation in order to formalise the idea of absent value semantics. The second proposal is based on the use of valued tolerance relations. A logical analysis and the computational experiments show that for the valued tolerance approach it is possible to obtain more informative approximations and decision rules than using the approach based on the simple tolerance relation.

354 citations
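To make the two semantics concrete, the following hedged sketch (rows and values invented here, not the paper's data) contrasts the symmetric tolerance relation commonly used for the "missing value" semantics with a non-symmetric similarity relation of the kind proposed for the "absent value" semantics:

```python
# Illustrative sketch with an invented incomplete table; '*' marks an unknown value.
MISSING = "*"

def tolerant(x, y):
    """Symmetric tolerance: values must agree wherever both are known."""
    return all(a == b or MISSING in (a, b) for a, b in zip(x, y))

def similar(x, y):
    """Non-symmetric similarity: every known value of x must be repeated in y."""
    return all(a == MISSING or a == b for a, b in zip(x, y))

table = {
    "o1": ("high", MISSING, "yes"),
    "o2": ("high", "low",   "yes"),
}

print(tolerant(table["o1"], table["o2"]))   # True: '*' is compatible with anything
print(similar(table["o1"], table["o2"]))    # True: o1's known values occur in o2
print(similar(table["o2"], table["o1"]))    # False: o2's "low" is not confirmed by o1
```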


Journal ArticleDOI
05 Oct 2001
TL;DR: An algorithm that uses rough set theory with greedy heuristics for feature selection, selecting the features that do not damage the performance of induction, is proposed.
Abstract: Practical machine learning algorithms are known to degrade in performance (prediction accuracy) when faced with many features (sometimes the term attribute is used instead of feature) that are not necessary for rule discovery. To cope with this problem, many methods for selecting a subset of features have been proposed. Among such methods, two typical ones are the filter approach, which selects a feature subset in a preprocessing step, and the wrapper approach, which selects an optimal feature subset from the space of possible subsets using the induction algorithm itself as part of the evaluation function. Although the filter approach is faster, it is somewhat blind in that the performance of induction is not considered. On the other hand, optimal feature subsets can be obtained with the wrapper approach, but it is not easy to use because of its time and space complexity. In this paper, we propose an algorithm that uses rough set theory with greedy heuristics for feature selection. Selecting features is similar to the filter approach, but the evaluation criterion is related to the performance of induction. That is, we select the features that do not damage the performance of induction.

295 citations
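The abstract above does not spell out the heuristic, so the sketch below should be read only as one plausible reading of "rough set theory with greedy heuristics", not the authors' exact algorithm: attributes are added one at a time to maximise the rough set dependency degree (size of the positive region relative to the universe); the data and attribute names are invented.

```python
# Hedged sketch of greedy, dependency-driven feature selection (toy data).
from collections import defaultdict

def positive_region(rows, attrs, decision):
    """Objects whose equivalence class w.r.t. attrs is consistent on the decision."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    pos = set()
    for members in blocks.values():
        if len({rows[i][decision] for i in members}) == 1:
            pos.update(members)
    return pos

def gamma(rows, attrs, decision):
    return len(positive_region(rows, attrs, decision)) / len(rows)

def greedy_selection(rows, candidates, decision):
    chosen, best = [], 0.0
    remaining = list(candidates)
    while remaining:
        score, attr = max((gamma(rows, chosen + [a], decision), a) for a in remaining)
        if score <= best:            # no remaining attribute improves consistency
            break
        chosen.append(attr)
        remaining.remove(attr)
        best = score
    return chosen

rows = [
    {"a": 1, "b": 0, "c": 1, "d": "yes"},
    {"a": 1, "b": 1, "c": 1, "d": "no"},
    {"a": 0, "b": 1, "c": 0, "d": "no"},
    {"a": 0, "b": 0, "c": 0, "d": "yes"},
]
print(greedy_selection(rows, ["a", "b", "c"], "d"))   # ['b'] for this toy table
```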


01 Jan 2001
TL;DR: An additional condition is suggested for finding β-reducts which assures a more general level of knowledge equivalent to that of the full set of attributes in the variable precision rough sets model.
Abstract: One fundamental aspect of the variable precision rough sets (VPRS) model involves a search for subsets of condition attributes which provide the same information for classification purposes as the full set of available attributes. Such subsets are labelled 'approximate reducts' or 'β-reducts', being defined for a specified classification error denoted by β. This paper undertakes a further investigation of the criteria for a β-reduct within VPRS. Certain anomalies and interesting implications are identified. An additional condition is suggested for finding β-reducts which assures a more general level of knowledge equivalent to that of the full set of attributes.

289 citations
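For readers unfamiliar with VPRS, the sketch below (invented blocks and concept, not the paper's analysis of β-reduct anomalies) shows the relative classification error and the β-positive region on which the β-reduct criterion is built; a β-reduct is then a minimal attribute subset whose β-positive regions give the same quality of classification as the full attribute set.

```python
# Hedged VPRS sketch: an equivalence class joins the beta-positive region of a
# concept X when its relative classification error does not exceed beta.
def classification_error(block, X):
    """Ziarko-style relative error: 1 - |block & X| / |block|."""
    return 1.0 - len(block & X) / len(block)

def beta_positive_region(blocks, X, beta):
    pos = set()
    for block in blocks:
        if classification_error(block, X) <= beta:
            pos |= block
    return pos

blocks = [frozenset({1, 2, 3, 4}), frozenset({5, 6}), frozenset({7, 8, 9, 10})]
X = {1, 2, 3, 7, 8, 9, 10}

print(beta_positive_region(blocks, X, beta=0.0))    # {7, 8, 9, 10}: strict inclusion only
print(beta_positive_region(blocks, X, beta=0.25))   # adds {1, 2, 3, 4}: 25% error allowed
```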


Journal ArticleDOI
TL;DR: In this article, the authors investigated the criteria for a β-reduct within variable precision rough sets (VPRS) and suggested an additional condition for finding β-reducts which assures a more general level of knowledge equivalent to that of the full set of attributes.

278 citations


Journal ArticleDOI
TL;DR: This article investigates the applicability of RS theory to the IF/IR application domain and compares this applicability with respect to various existing TC techniques, and investigates the ability of the approach to generalize, given a minimum of training data.
Abstract: The volume of electronically stored information increases exponentially as the state of the art progresses. Automated information filtering (IF) and information retrieval (IR) systems are therefore acquiring rapidly increasing prominence. However, such systems sacrifice efficiency to boost effectiveness. They typically have to cope with sets of vectors of many tens of thousands of dimensions. Rough set (RS) theory can be applied to reducing the dimensionality of data used in IF/IR tasks, by providing a measure of the information content of datasets with respect to a given classification. This can aid IF/IR systems that rely on the acquisition of large numbers of term weights or other measures of relevance. This article investigates the applicability of RS theory to the IF/IR application domain and compares it with various existing TC techniques. The ability of the approach to generalize, given a minimum of training data, is also addressed. The background of RS theory is p...

273 citations


Journal ArticleDOI
TL;DR: It is concluded that VPRS is a promising addition to existing methods in that it is a practical tool, which generates explicit probabilistic rules from a given information system, with the rules offering the decision maker informative insights into classification problems.
Abstract: Since the seminal work of Pawlak (International Journal of Information and Computer Science, 11 (1982) 341-356) rough set theory (RST) has evolved into a rule-based decision-making technique. To date, however, relatively little empirical research has been conducted on the efficacy of the rough set approach in the context of business and finance applications. This paper extends previous research by employing a development of RST, namely the variable precision rough sets (VPRS) model, in an experiment to discriminate between failed and non-failed UK companies. It also utilizes the FUSINTER discretisation method, which negates the influence of an 'expert' opinion. The results of the VPRS analysis are compared to those generated by classical logit and multivariate discriminant analysis, together with more closely related non-parametric decision tree methods. It is concluded that VPRS is a promising addition to existing methods in that it is a practical tool which generates explicit probabilistic rules from a given information system, with the rules offering the decision maker informative insights into classification problems.

254 citations


Journal ArticleDOI
TL;DR: The completeness of the algorithms for the Pawlak reduct and their uniqueness for a given order of the attributes are proved; the proposed quasi-discernibility-matrix algorithms are incomplete for the Pawlak reduct, but their optimal paradigms ensure completeness as long as some conditions are satisfied.
Abstract: In this paper, we present reduction algorithms based on the principle of Skowron's discernibility matrix: the ordered attributes method. The completeness of the algorithms for the Pawlak reduct and their uniqueness for a given order of the attributes are proved. Since a discernibility matrix requires memory of size |U|², where U is the universe of objects, it would be impossible to apply these algorithms directly to a massive object set. In order to solve this problem, a so-called quasi-discernibility matrix and two reduction algorithms are proposed. Although the proposed algorithms are incomplete for the Pawlak reduct, their optimal paradigms ensure completeness as long as they satisfy some conditions. Finally, we consider the problem of the reduction of distributive object sets.

216 citations
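A small sketch of the discernibility matrix on which these algorithms are based (toy decision table invented here; the ordered attributes method and the quasi-discernibility matrix themselves are not reproduced): each entry records the condition attributes that distinguish a pair of objects with different decisions.

```python
# Illustrative discernibility matrix for a three-object toy table.
from itertools import combinations

rows = [
    {"a": 1, "b": 0, "c": 1, "d": "yes"},
    {"a": 1, "b": 1, "c": 0, "d": "no"},
    {"a": 0, "b": 0, "c": 0, "d": "yes"},
]
conditions, decision = ["a", "b", "c"], "d"

matrix = {}
for i, j in combinations(range(len(rows)), 2):
    if rows[i][decision] != rows[j][decision]:          # only discern across decisions
        matrix[(i, j)] = {a for a in conditions if rows[i][a] != rows[j][a]}

print(matrix)   # {(0, 1): {'b', 'c'}, (1, 2): {'a', 'b'}} (set order may vary)
```

The |U|² memory cost mentioned in the abstract is visible here: the matrix has one entry per pair of objects.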


Journal ArticleDOI
Andrew Kusiak1
TL;DR: Rough set theory offers a viable approach for the extraction of decision rules from data sets that can be used for making predictions in the semiconductor industry and other applications, and a new rule-structuring algorithm is proposed.
Abstract: The growing volume of information poses interesting challenges and calls for tools that discover properties of data. Data mining has emerged as a discipline that contributes tools for data analysis, discovery of new knowledge, and autonomous decision-making. In this paper, the basic concepts of rough set theory and other aspects of data mining are introduced. The rough set theory offers a viable approach for extraction of decision rules from data sets. The extracted rules can be used for making predictions in the semiconductor industry and other applications. This contrasts with other approaches, such as regression analysis and neural networks, where a single model is built. One of the goals of data mining is to extract meaningful knowledge. The power, generality, accuracy, and longevity of decision rules can be increased by the application of concepts from systems engineering and evolutionary computation introduced in this paper. A new rule-structuring algorithm is proposed. The concepts presented in the paper are illustrated with examples.

197 citations


Book
01 Jan 2001
TL;DR: A book whose chapters cover interval arithmetic and interval analysis, interval and ellipsoidal uncertainty models, nonlinear bounded-error parameter estimation using interval computation, rough sets and Boolean reasoning, and other topics in granular computing.
Abstract: Interval Arithmetic and Interval Analysis: An Introduction.- Interval and Ellipsoidal Uncertainty Models.- Nonlinear Bounded-Error Parameter Estimation Using Interval Computation.- Random Sets: Theory and Applications.- Rough Sets and Boolean Reasoning.- Granulation and Nearest Neighborhoods: Rough Set Approach.- An Inquiry into the Theory of Defuzzification.- Fuzzy Partitioning Methods.- A Coding Method to Handle Linguistic Variables.- A Formal Theory of Fuzzy Natural Language Quantification and its Role in Granular Computing.- Granularity and Specificity in Fuzzy Rule-Based Systems.- Granular Computing in Neural Networks.- Fuzzy Clustering for Multiple-Model Approaches in System Identification and Control.- Information Granulation in Automated Modeling.- Optical Music Recognition: the Case of Granular Computing.- Modeling MPEG VBR Video Traffic Using Type-2 Fuzzy Logic Systems.- Induction of Rules about Complications with the Use of Rough Sets.

Journal Article
TL;DR: Rough set theory is introduced as a new and effective soft computing method that can be combined organically with fuzzy sets, parallel algorithms and expert systems, and the conditions for such combinations are described.
Abstract: Rough set theory is introduced as a new and effective soft computing method. Its basic theoretical scheme is explained and its typical applications are discussed. Rough sets can be combined organically with fuzzy sets, parallel algorithms and expert systems, and the conditions for such combinations are described. Furthermore, the configurations of some intelligent processing systems are presented.

Journal ArticleDOI
TL;DR: A new framework for vocabulary mining that derives from the combination of rough sets and fuzzy sets is investigated, which supports the systematic study and application of different vocabulary views in information retrieval.
Abstract: Vocabulary mining in information retrieval refers to the utilization of the domain vocabulary towards improving the user’s query. Most often queries posed to information retrieval systems are not optimal for retrieval purposes. Vocabulary mining allows one to generalize, specialize or perform other kinds of vocabulary-based transformations on the query in order to improve retrieval performance. This paper investigates a new framework for vocabulary mining that derives from the combination of rough sets and fuzzy sets. The framework allows one to use rough set-based approximations even when the documents and queries are described using weighted, i.e., fuzzy representations. The paper also explores the application of generalized rough sets and the variable precision models. The problem of coordination between multiple vocabulary views is also examined. Finally, a preliminary analysis of issues that arise when applying the proposed vocabulary mining framework to the Unified Medical Language System (a state-of-the-art vocabulary system) is presented. The proposed framework supports the systematic study and application of different vocabulary views in information retrieval.
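One building block of such a rough-fuzzy combination, sketched below with invented term weights (this is the standard rough approximation of a fuzzy set over crisp granules, not the paper's full vocabulary-mining framework): the lower approximation takes the minimum membership inside each granule and the upper approximation takes the maximum.

```python
# Hedged sketch: rough approximation of a fuzzy (weighted) term set over crisp
# vocabulary granules; granule names and weights are invented for illustration.
def fuzzy_approximations(granules, membership):
    lower = {name: min(membership[t] for t in terms) for name, terms in granules.items()}
    upper = {name: max(membership[t] for t in terms) for name, terms in granules.items()}
    return lower, upper

granules = {"g1": {"t1", "t2"}, "g2": {"t3", "t4"}}        # e.g. related-term groups
membership = {"t1": 0.9, "t2": 0.4, "t3": 0.2, "t4": 0.0}  # fuzzy query/term weights

print(fuzzy_approximations(granules, membership))
# ({'g1': 0.4, 'g2': 0.0}, {'g1': 0.9, 'g2': 0.2})
```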

Journal Article
TL;DR: The paper presents numerical results of face recognition experiments using the learning vector quantization neural network, with feature selection based on the proposed principal components analysis and rough sets methods.
Abstract: The paper presents an application of rough sets and statistical methods to feature reduction and pattern recognition. The presented description of rough sets theory emphasizes the role of rough sets reducts in feature selection and data reduction in pattern recognition. The overview of methods of feature selection emphasizes feature selection criteria, including rough set-based methods. The paper also contains a description of the algorithm for feature selection and reduction based on the rough sets method proposed jointly with Principal Component Analysis. Finally, the paper presents numerical results of face recognition experiments using the learning vector quantization neural network, with feature selection based on the proposed principal components analysis and rough sets methods.

Book
14 Nov 2001
TL;DR: A textbook covering neural networks, genetic algorithms, fuzzy systems, rough sets, and chaos.
Abstract: Introduction. Neural Networks. Neural Networks: Other Models. Genetic Algorithms. Fuzzy Systems. Rough Sets. Chaos.

Journal ArticleDOI
TL;DR: The numerical experiments show the ability of rough sets to select a reduced set of pattern features (minimizing the pattern size), while providing better generalization of neural-network texture classifiers.

Journal Article
TL;DR: This paper presents two different approaches to concept approximation, one based on rough set theory and the other on a similarity measure, and presents algorithms for both approaches.
Abstract: Formal concept analysis gives a mathematical definition of a formal concept. However, in many real-life applications, the problem under investigation cannot be described by formal concepts. Such concepts are called non-definable concepts (Saquer and Deogun, 2000a). The process of finding formal concepts that best describe non-definable concepts is called concept approximation. In this paper, we present two different approaches to concept approximation. The first approach is based on rough set theory while the other is based on a similarity measure. We present algorithms for the two approaches.

Journal Article
TL;DR: A rule discovery process that is based on rough set theory is discussed, using a slope-collapse database as an example showing how rules can be discovered from a large, real-life database.
Abstract: The knowledge discovery from real-life databases is a multi-phase process consisting of numerous steps, including attribute selection, discretization of real-valued attributes, and rule induction. In the paper, we discuss a rule discovery process that is based on rough set theory. The core of the process is a soft hybrid induction system called the Generalized Distribution Table and Rough Set System (GDT-RS) for discovering classification rules from databases with uncertain and incomplete data. The system is based on a combination of the Generalization Distribution Table (GDT) and Rough Set methodologies. In preprocessing, two modules, i.e. Rough Sets with Heuristics (RSH) and Rough Sets with Boolean Reasoning (RSBR), are used for attribute selection and discretization of real-valued attributes, respectively. We use a slope-collapse database as an example showing how rules can be discovered from a large, real-life database.

Journal ArticleDOI
01 Aug 2001
TL;DR: This work presents applications of Rough Mereology to the important theoretical idea put forth by Lotfi Zadeh, i.e., Granularity of Knowledge, and defines granules of knowledge by means of the operator of mereological class and extends the idea of a granule over complex objects like decision rules as well as decision algorithms.
Abstract: Rough Mereology is a paradigm allowing for a synthesis of main ideas of two potent paradigms for reasoning under uncertainty: Fuzzy Set Theory and Rough Set Theory. Approximate reasoning is based in this paradigm on the predicate of being a part to a degree. We present applications of Rough Mereology to the important theoretical idea put forth by Lotfi Zadeh (1996, 1997), i.e., Granularity of Knowledge: We define granules of knowledge by means of the operator of mereological class and we extend the idea of a granule over complex objects like decision rules as well as decision algorithms. We apply these notions and methods in the distributed environment discussing complex problems of knowledge and granule fusion. We express the mechanism of complex granule formation by means of a formal grammar called Synthesis Grammar defined over granules of knowledge, granules of classifying rules, or over granules of classifying algorithms. We finally propose hybrid rough-neural schemes bridging rough and neural computations.

Journal ArticleDOI
05 Oct 2001
TL;DR: This paper describes how genetic algorithms can be used to develop rough sets; the proposed rough set theoretic genetic encoding will be especially useful in unsupervised learning.
Abstract: The rough set is a useful notion for the classification of objects when the available information is not adequate to represent classes using precise sets. Rough sets have been successfully used in information systems for learning rules from an expert. This paper describes how genetic algorithms can be used to develop rough sets. The proposed rough set theoretic genetic encoding will be especially useful in unsupervised learning. A rough set genome consists of upper and lower bounds for sets in a partition. The partition may be as simple as the conventional expert class and its complement, or a more general classification scheme. The paper provides a complete description of the design and implementation of rough set genomes. The proposed design and implementation is used to provide an unsupervised rough set classification of highway sections.
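A hedged sketch of the genome encoding described above (the representation idea, upper and lower bounds for sets in a partition, is the paper's; the concrete encoding, operators and data below are only an assumed illustration): each gene places an object outside the set, in the boundary (upper approximation only), or in the lower approximation, so that the lower bound is always contained in the upper bound.

```python
# Illustrative rough set genome: 0 = outside, 1 = boundary only, 2 = lower (and upper).
import random

UNIVERSE = list(range(8))

def random_genome():
    return [random.choice([0, 1, 2]) for _ in UNIVERSE]

def decode(genome):
    lower = {x for x, g in zip(UNIVERSE, genome) if g == 2}
    upper = {x for x, g in zip(UNIVERSE, genome) if g >= 1}
    return lower, upper          # lower is a subset of upper by construction

random.seed(0)
genome = random_genome()
print(genome)
print(decode(genome))
```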

Journal Article
TL;DR: It is proved that the problem of finding an optimal set of classifying agents based on approximate reducts is NP-hard; a genetic algorithm is applied to find a suboptimal set.
Abstract: The problem of improving rough set based expert systems by modifying the notion of reduct is discussed. The notion of approximate reduct is introduced, as well as some proposals of quality measures for such reducts. A complete classifying system based on approximate reducts is presented and discussed. It is proved that the problem of finding an optimal set of classifying agents based on approximate reducts is NP-hard; a genetic algorithm is applied to find a suboptimal set. Experimental results show that the classifying system is effective and relatively fast.

Journal ArticleDOI
TL;DR: An extensive experimental study on several well-known data sets was performed in which two different approaches were compared: the popular rough set based rule induction algorithm LEM2, which generates classification rules, and the authors' own algorithm Explore, designed specifically for the discovery perspective.
Abstract: This paper discusses induction of decision rules from data tables representing information about a set of objects described by a set of attributes. If the input data contain inconsistencies, rough set theory can be used to handle them. The most popular perspectives of rule induction are classification and knowledge discovery. The evaluation of decision rules is quite different depending on the perspective. Criteria for evaluating the quality of a set of rules are presented and discussed. The degree of conflict and the possibility of achieving a satisfying compromise between criteria relevant to classification and criteria relevant to discovery are then analyzed. For this purpose, we performed an extensive experimental study on several well-known data sets where we compared two different approaches: (1) the popular rough set based rule induction algorithm LEM2 generating classification rules, (2) our own algorithm Explore, designed specifically for the discovery perspective.

Proceedings ArticleDOI
29 Nov 2001
TL;DR: A novel approach to constructing a good ensemble of classifiers using rough set theory and database operations, where each reduct is a minimum subset of attributes with the same classification ability as the entire set of attributes.
Abstract: The article presents a novel approach to constructing a good ensemble of classifiers using rough set theory and database operations. Ensembles of classifiers are formulated precisely within the framework of rough set theory and constructed very efficiently by using set-oriented database operations. Our method first computes a set of reducts which include all the indispensable attributes required for the decision categories. For each reduct, a reduct table is generated by removing those attributes which are not in the reduct. Next, a novel rule induction algorithm is used to compute the maximal generalized rules for each reduct table, and a set of reduct classifiers is formed based on the corresponding reducts. The distinctive features of our method compared to other methods of constructing ensembles of classifiers are: (1) it presents a theoretical model to explain the mechanism of constructing an ensemble of classifiers; (2) each reduct is a minimum subset of attributes with the same classification ability as the entire set of attributes; (3) each reduct classifier constructed from the corresponding reduct has a minimal set of classification rules, and is as accurate and complete as possible while at the same time being as diverse as possible from the other classifiers; (4) tests indicate that the number of classifiers needed to improve the accuracy is much smaller than in other methods.
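The sketch below conveys only the basic reduct-ensemble idea in a hedged form (toy table, assumed reducts, majority voting; the paper's set-oriented database operations and maximal generalized rule induction are not reproduced): each reduct yields one simple classifier over its own projection of the table, and predictions are combined by voting.

```python
# Hedged sketch of an ensemble of reduct classifiers with invented data.
from collections import Counter, defaultdict

def train_on_reduct(rows, reduct, decision):
    """Most frequent decision for each combination of reduct attribute values."""
    votes = defaultdict(Counter)
    for r in rows:
        votes[tuple(r[a] for a in reduct)][r[decision]] += 1
    return {key: c.most_common(1)[0][0] for key, c in votes.items()}

def ensemble_predict(models, reducts, obj):
    ballots = Counter()
    for model, reduct in zip(models, reducts):
        key = tuple(obj[a] for a in reduct)
        if key in model:
            ballots[model[key]] += 1
    return ballots.most_common(1)[0][0] if ballots else None

rows = [
    {"a": 1, "b": 0, "c": 1, "d": "yes"},
    {"a": 1, "b": 1, "c": 0, "d": "no"},
    {"a": 0, "b": 0, "c": 1, "d": "yes"},
    {"a": 0, "b": 1, "c": 0, "d": "no"},
]
reducts = [["b"], ["c"]]                      # assumed reducts, for illustration only
models = [train_on_reduct(rows, r, "d") for r in reducts]
print(ensemble_predict(models, reducts, {"a": 1, "b": 0, "c": 1}))   # 'yes'
```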

Journal ArticleDOI
TL;DR: This paper re-interprets the classical rough set approximation measures in terms of a classic measure based on sets, the Marczewski-Steinhaus metric, and also in terms of "proportional reduction of errors" (PRE) measures.

Journal ArticleDOI
TL;DR: The validity of the proposed classification method is tested by applying it to the IRIS data classification and its classification performance and processing time are compared with those of other classification methods such as BPNN, OFUNN, and FCM.

Journal ArticleDOI
01 Aug 2001
TL;DR: The focus of the paper is on the extraction of decision table‐based predictive models from data, their relationship to conjunctive rules and probabilistic assessment of decision confidence with such rules.
Abstract: The Variable Precision Rough Set Model (VPRS) is an extension of the original rough set model. This extension is directed towards deriving decision table-based predictive models from data with parametrically adjustable degrees of accuracy. The imprecise nature of such models leads to quite significant modification of the classical notion of a decision table. This is accomplished by introducing the idea of an approximation region-based, or probabilistic, decision table, which is a tabular specification of three, in general uncertain, disjunctive decision rules corresponding to the rough approximation regions: the positive, boundary and negative regions. The focus of the paper is on the extraction of such decision tables from data, their relationship to conjunctive rules, and probabilistic assessment of decision confidence with such rules.
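To illustrate what an approximation region-based decision table can look like, here is a hedged sketch (toy rows and thresholds invented here, not the paper's formulation or parameters): each combination of condition values becomes a row labelled positive, boundary or negative according to the estimated probability of the target decision within it.

```python
# Illustrative probabilistic decision table built from a toy data set.
from collections import defaultdict

def probabilistic_decision_table(rows, conditions, decision, target, l, u):
    groups = defaultdict(list)
    for r in rows:
        groups[tuple(r[c] for c in conditions)].append(r)
    table = []
    for values, members in groups.items():
        p = sum(r[decision] == target for r in members) / len(members)
        region = "positive" if p >= u else "negative" if p <= l else "boundary"
        table.append((values, round(p, 2), region))
    return table

rows = [
    {"a": 1, "b": 0, "d": "good"}, {"a": 1, "b": 0, "d": "good"},
    {"a": 1, "b": 1, "d": "good"}, {"a": 1, "b": 1, "d": "bad"},
    {"a": 0, "b": 1, "d": "bad"},  {"a": 0, "b": 1, "d": "bad"},
]
for row in probabilistic_decision_table(rows, ["a", "b"], "d", "good", l=0.25, u=0.75):
    print(row)
# ((1, 0), 1.0, 'positive'), ((1, 1), 0.5, 'boundary'), ((0, 1), 0.0, 'negative')
```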

Proceedings ArticleDOI
22 May 2001
TL;DR: The paper continues earlier axiomatic work by applying the classical diamond and box operators to fuzzy sets, i.e. using the concept of rough fuzzy sets.
Abstract: In two previous papers we developed axiomatic characterizations of approximation operators defined, on the one hand, by the classical diamond and box operators of modal logic and, on the other hand, by the "fuzzified" diamond and box operators applied to crisp sets, i.e. by using the concept of fuzzy rough sets. The present paper is a continuation of the first of those papers, applying the classical diamond and box operators to fuzzy sets, i.e. by using the concept of rough fuzzy sets.

Book ChapterDOI
01 Jan 2001
TL;DR: Applications of rough set methods for feature selection, feature extraction, and discovery of patterns, their use for decomposition of large data tables, and the relationship of rough sets with association rules are presented.
Abstract: In recent years we witness a rapid growth of interest in rough set theory and its applications, worldwide. The theory has been followed by the development of several software systems that implement rough set operations, in particular for solving knowledge discovery and data mining tasks. Rough sets are applied in domains such as, for instance, medicine, finance, telecommunication, vibration analysis, conflict resolution, intelligent agents, pattern recognition, control theory, signal analysis, process industry, marketing, etc. We introduce basic notions, discuss methodologies for analyzing data, and survey some applications. In particular we present applications of rough set methods for feature selection, feature extraction, discovery of patterns and their applications for decomposition of large data tables, as well as the relationship of rough sets with association rules. Boolean reasoning is crucial for all the discussed methods. We also present an overview of some extensions of the classical rough set approach. Among them is rough mereology, developed as a tool for synthesis of objects satisfying a given specification to a satisfactory degree. Applications of rough mereology in areas like granular computing, spatial reasoning and data mining in distributed environments are outlined.

Journal ArticleDOI
TL;DR: This paper presents an integrated approach that combines rough set theory, genetic algorithms and Boolean algebra, for inductive learning, and develops a prototype system (RClass-Plus) that discovers rules from inconsistent empirical data.

Journal ArticleDOI
01 Aug 2001
TL;DR: The output of a rough membership function neuron results from the computation performed by a rough membership function in determining the degree of overlap between an upper approximation set representing approximate knowledge about inputs and a set of measurements representing certain knowledge about a particular class of objects.
Abstract: This paper introduces an application of a particular form of rough neural computing in signal analysis. The form of rough neural network used in this study is based on rough sets, rough membership functions, and decision rules. Two forms of neurons are found in such a network: rough membership function neurons and decider neurons. Each rough membership function neuron constructs upper and lower approximation equivalence classes in response to input signals as an aid to classifying inputs. In this paper, the output of a rough membership function neuron results from the computation performed by a rough membership function in determining degree of overlap between an upper approximation set representing approximate knowledge about inputs and a set of measurements representing certain knowledge about a particular class of objects. Decider neurons implement granules derived from decision rules extracted from data sets using rough set theory. A decider neuron instantiates approximate reasoning in assessing rough membership function values gleaned from input data. An introduction to the basic concepts underlying rough membership neural networks is briefly given. An application of rough neural computing in classifying the power system faults is considered.
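As a final illustration, the sketch below shows the rough membership function that such a neuron evaluates, μ_X(x) = |[x] ∩ X| / |[x]|; the granulation and the fault labels are invented here, and the network wiring and decider neurons from the paper are not reproduced.

```python
# Hedged sketch of a rough membership computation over an invented granulation.
def rough_membership(block, X):
    """Fraction of the object's equivalence class that lies inside the set X."""
    return len(block & X) / len(block)

# Toy granulation: objects grouped by identical (invented) measurement signatures.
blocks = {"sig1": {1, 2, 3}, "sig2": {4, 5}}
assignment = {1: "sig1", 2: "sig1", 3: "sig1", 4: "sig2", 5: "sig2"}
faulty = {2, 3, 4}                         # objects known to belong to the fault class

for obj, sig in sorted(assignment.items()):
    print(obj, round(rough_membership(blocks[sig], faulty), 2))
# 1 0.67, 2 0.67, 3 0.67, 4 0.5, 5 0.5
```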