
Showing papers on "Rough set" published in 2000


Journal ArticleDOI
TL;DR: New definitions of lower and upper approximations are proposed, which are basic concepts of the rough set theory and are shown to be more general, in the sense that they are the only ones which can be used for any type of indiscernibility or similarity relation.
Abstract: This paper proposes new definitions of lower and upper approximations, which are basic concepts of the rough set theory. These definitions follow naturally from the concept of ambiguity introduced in this paper. The new definitions are compared to the classical definitions and are shown to be more general, in the sense that they are the only ones which can be used for any type of indiscernibility or similarity relation.
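
For readers unfamiliar with the classical definitions being generalized here, the following is a minimal Python sketch of the standard lower and upper approximations built from indiscernibility (equivalence) classes; the paper's own generalized definitions for arbitrary similarity relations are not reproduced, and the toy attributes and objects are invented.

```python
# Minimal sketch of the classical lower and upper approximations that this
# paper generalizes.  Attribute names and objects are invented for the example.

def indiscernibility_classes(objects, attrs):
    """Group objects by their values on the chosen condition attributes."""
    classes = {}
    for name, values in objects.items():
        key = tuple(values[a] for a in attrs)
        classes.setdefault(key, set()).add(name)
    return list(classes.values())

def lower_upper(classes, target):
    """Classical rough approximations of a target set of objects."""
    lower, upper = set(), set()
    for c in classes:
        if c <= target:          # class entirely inside the target set
            lower |= c
        if c & target:           # class overlaps the target set
            upper |= c
    return lower, upper

objects = {
    "o1": {"colour": "red",  "size": "small"},
    "o2": {"colour": "red",  "size": "small"},
    "o3": {"colour": "blue", "size": "large"},
    "o4": {"colour": "blue", "size": "small"},
}
classes = indiscernibility_classes(objects, ["colour", "size"])
print(lower_upper(classes, {"o1", "o3"}))
# lower = {'o3'}, upper = {'o1', 'o2', 'o3'} (set display order may vary)
```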

963 citations


Journal ArticleDOI
TL;DR: This article proposes to bring the various neuro-fuzzy models used for rule generation under a unified soft computing framework, and includes both rule extraction and rule refinement in the broader perspective of rule generation.
Abstract: The present article is a novel attempt at providing an exhaustive survey of neuro-fuzzy rule generation algorithms. Rule generation from artificial neural networks is gaining in popularity in recent times due to its capability of providing some insight to the user about the symbolic knowledge embedded within the network. Fuzzy sets are an aid in providing this information in a more human-comprehensible or natural form, and can handle uncertainties at various levels. The neuro-fuzzy approach, symbiotically combining the merits of connectionist and fuzzy approaches, constitutes a key component of soft computing at this stage. To date, there has been no detailed and integrated categorization of the various neuro-fuzzy models used for rule generation. We propose to bring these together under a unified soft computing framework. Moreover, we include both rule extraction and rule refinement in the broader perspective of rule generation. Rules learned and generated for fuzzy reasoning and fuzzy control are also considered from this wider viewpoint. Models are grouped on the basis of their level of neuro-fuzzy synthesis. The use of other soft computing tools like genetic algorithms and rough sets is emphasized. Rule generation from fuzzy knowledge-based networks, which initially encode some crude domain knowledge, is found to result in more refined rules. Finally, a real-life application to medical diagnosis is provided.

726 citations


Book ChapterDOI
01 Dec 2000
TL;DR: Some algorithms based on rough set theory that can be used for classifying new cases are presented, along with several methods for computing decision rules based on reducts and for real-value attribute discretization.
Abstract: We present some algorithms, based on rough set theory, that can be used for the problem of new case classification. Most of the algorithms were implemented and included in the Rosetta system [43]. We present several methods for computation of decision rules based on reducts. We discuss the problem of real value attribute discretization for increasing the performance of algorithms and the quality of decision rules. Finally, we deal with the problem of resolving conflicts between decision rules classifying a new case to different categories (classes). Keywords: knowledge discovery, rough sets, classification algorithms, reducts, decision rules, real value attribute discretization
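
As a rough illustration of how decision rules can be read off a decision table, the sketch below extracts one certain rule per consistent indiscernibility class. This is not the Rosetta implementation cited above; the attribute names and data are invented.

```python
# Illustrative sketch of reading certain (exact) decision rules off the
# consistent indiscernibility classes of a small decision table.

def certain_rules(table, cond_attrs, dec_attr):
    """Return one rule per indiscernibility class with a unique decision."""
    groups = {}
    for row in table:
        key = tuple((a, row[a]) for a in cond_attrs)
        groups.setdefault(key, set()).add(row[dec_attr])
    return [(dict(conds), decisions.pop())
            for conds, decisions in groups.items()
            if len(decisions) == 1]      # consistent class -> certain rule

table = [
    {"temp": "high", "headache": "yes", "flu": "yes"},
    {"temp": "high", "headache": "no",  "flu": "yes"},
    {"temp": "low",  "headache": "no",  "flu": "no"},
    {"temp": "low",  "headache": "no",  "flu": "yes"},   # conflicts with the row above
]
for conds, decision in certain_rules(table, ["temp", "headache"], "flu"):
    print(conds, "=>", decision)
# {'temp': 'high', 'headache': 'yes'} => yes
# {'temp': 'high', 'headache': 'no'} => yes
```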

446 citations


Journal ArticleDOI
TL;DR: A hybrid intelligent system that predicts the failure of firms based on past financial performance data, combining the rough set approach and neural networks, is proposed, in which the number of evaluation criteria such as financial ratios and qualitative variables is reduced with no information loss through the rough set approach.
Abstract: This paper proposes a hybrid intelligent system that predicts the failure of firms based on past financial performance data, combining the rough set approach and a neural network. We can get a reduced information table, which implies that the number of evaluation criteria such as financial ratios and qualitative variables is reduced with no information loss through the rough set approach. This reduced information is then used to develop classification rules and to train a neural network to infer appropriate parameters. The rules developed by rough set analysis show the best prediction accuracy if a case matches any of the rules. The rationale of our hybrid system is to use the rules developed by rough sets for an object that matches any of the rules, and the neural network for one that does not match any of them. The effectiveness of our methodology was verified by experiments comparing traditional discriminant analysis and the neural network approach with our hybrid approach. For the experiment, the financial data of 2,400 Korean firms during the period 1994–1997 were selected, and for validation, k-fold validation was used.
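
The "reduced information table with no information loss" step can be pictured as dropping condition attributes only when the rough set positive region (and hence the quality of classification) is preserved. The sketch below is a generic backward-elimination illustration, not the authors' exact procedure; the financial-ratio names and table are invented.

```python
# Sketch of rough set attribute reduction by backward elimination: drop an
# attribute only when the positive region (objects classified without
# conflict) stays the same.  Generic illustration, not the paper's procedure.

def positive_region(table, cond_attrs, dec_attr):
    groups = {}
    for i, row in enumerate(table):
        key = tuple(row[a] for a in cond_attrs)
        groups.setdefault(key, []).append(i)
    pos = set()
    for members in groups.values():
        if len({table[i][dec_attr] for i in members}) == 1:
            pos.update(members)          # consistent class -> in positive region
    return pos

def reduce_attributes(table, cond_attrs, dec_attr):
    kept = list(cond_attrs)
    full_pos = positive_region(table, kept, dec_attr)
    for a in list(cond_attrs):
        trial = [x for x in kept if x != a]
        if trial and positive_region(table, trial, dec_attr) == full_pos:
            kept = trial                 # attribute a is dispensable
    return kept

table = [
    {"ratio1": "low",  "ratio2": "bad",  "ratio3": "x", "failed": "yes"},
    {"ratio1": "low",  "ratio2": "good", "ratio3": "x", "failed": "no"},
    {"ratio1": "high", "ratio2": "good", "ratio3": "x", "failed": "no"},
]
print(reduce_attributes(table, ["ratio1", "ratio2", "ratio3"], "failed"))
# ['ratio2'] -- ratio1 and ratio3 carry no extra discriminating power here
```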

308 citations


Dissertation
01 Jan 2000
TL;DR: This thesis examines how discernibility-based methods can be equipped to possess several qualities that are needed for analyzing tabular medical data, and how these models can be evaluated according to these qualities.
Abstract: This thesis examines how discernibility-based methods can be equipped to possess several qualities that are needed for analyzing tabular medical data, and how these models can be evaluated according ...

281 citations


Book
01 Dec 2000
TL;DR: This book presents rough set methods and applications for knowledge discovery in information systems, concluding with a KDD perspective on rough sets and rough logic.
Abstract: 1. Introduction.- Introducing the Book.- 1. A Rough Set Perspective on Knowledge Discovery in Information Systems: An Essay on the Topic of the Book.- 2. Methods and Applications: Reducts, Similarity, Mereology.- 2. Rough Set Algorithms in Classification Problem.- 3. Rough Mereology in Information Systems. A Case Study: Qualitative Spatial Reasoning.- 4. Knowledge Discovery by Application of Rough Set Models.- 5. Various Approaches to Reasoning with Frequency Based Decision Reducts: A Survey.- 3. Methods and Applications: Regular Pattern Extraction, Concurrency.- 6. Regularity Analysis and its Applications in Data Mining.- 7. Rough Set Methods for the Synthesis and Analysis of Concurrent Processes.- 4. Methods and Applications: Algebraic and Statistical Aspects, Conflicts, Incompleteness.- 8. Conflict Analysis.- 9. Logical and Algebraic Techniques for Rough Set Data Analysis.- 10. Statistical Techniques for Rough Set Data Analysis.- 11. Data Mining in Incomplete Information Systems from Rough Set Perspective.- 5. Afterword.- 12. Rough Sets and Rough Logic: A KDD Perspective.- Appendix: Selected Bibliography on Rough Sets.

272 citations



Journal ArticleDOI
TL;DR: In this paper, a method based on rough set theory is used to diagnose valve faults in a multi-cylinder diesel engine, and it is shown that this method is effective for valve fault diagnosis.

183 citations


Book ChapterDOI
TL;DR: The relaxation introduced in this paper to the DRSA model admits some inconsistent objects to the lower approximations; the range of this relaxation is controlled by an index called consistency level, and the resulting model is called variable-consistency model (VC-DRSA).
Abstract: Consideration of preference-orders requires the use of an extended rough set model called Dominance-based Rough Set Approach (DRSA). The rough approximations defined within DRSA are based on consistency in the sense of dominance principle. It requires that objects having not-worse evaluation with respect to a set of considered criteria than a referent object cannot be assigned to a worse class than the referent object. However, some inconsistencies may decrease the cardinality of lower approximations to such an extent that it is impossible to discover strong patterns in the data, particularly when data sets are large. Thus, a relaxation of the strict dominance principle is worthwhile. The relaxation introduced in this paper to the DRSA model admits some inconsistent objects to the lower approximations; the range of this relaxation is controlled by an index called consistency level. The resulting model is called variable-consistency model (VC-DRSA). We concentrate on the new definitions of rough approximations and their properties, and we propose a new syntax of decision rules characterized by a confidence degree not less than the consistency level. The use of VC-DRSA is illustrated by an example of customer satisfaction analysis referring to an airline company.
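
A minimal sketch of the variable-consistency idea, under the common reading that an object enters the lower approximation of an upward union of classes when at least a fraction `level` of the objects dominating it belongs to that union; criteria values, class labels, and the gain-type assumption are illustrative, not taken from the paper's example.

```python
# Sketch of a variable-consistency lower approximation of an upward union of
# classes (VC-DRSA style).  Gain-type criteria assumed; data is invented.

def dominates(a, b, criteria):
    """a dominates b: at least as good on every (gain-type) criterion."""
    return all(a[c] >= b[c] for c in criteria)

def vc_lower_upward(objects, criteria, cls, t, level):
    """Objects of class >= t whose dominating set is consistent at `level`."""
    union = {name for name in objects if cls[name] >= t}
    lower = set()
    for name, o in objects.items():
        if name not in union:
            continue
        dom = {m for m, p in objects.items() if dominates(p, o, criteria)}
        if len(dom & union) / len(dom) >= level:   # dom always contains `name`
            lower.add(name)
    return lower

objects = {
    "a": {"price": 3, "comfort": 3},
    "b": {"price": 3, "comfort": 2},
    "c": {"price": 2, "comfort": 2},
    "d": {"price": 1, "comfort": 1},
}
cls = {"a": 3, "b": 2, "c": 3, "d": 1}     # preference-ordered decision classes
print(vc_lower_upward(objects, ["price", "comfort"], cls, t=3, level=0.5))
# {'a', 'c'}: at the strict level 1.0, c would drop out because b dominates it
# while belonging to a worse class.
```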

168 citations


Book ChapterDOI
TL;DR: Rough Set Exploration System - a set of software tools featuring a library of methods and a graphical user interface is presented.
Abstract: Rough Set Exploration System - a set of software tools featuring a library of methods and a graphical user interface is presented. Methods, features and abilities of the implemented software are discussed and illustrated with a case study in data analysis.

168 citations


Book ChapterDOI
TL;DR: An algorithm called DOMLEM is introduced that induces a minimal set of generalized decision rules consistent with the dominance principle; an extension of this algorithm for a variable-consistency model of the dominance-based rough set approach is also presented.
Abstract: Induction of decision rules within the dominance-based rough set approach to the multiple-criteria sorting decision problem is discussed in this paper. We introduce an algorithm called DOMLEM that induces a minimal set of generalized decision rules consistent with the dominance principle. An extension of this algorithm for a variable-consistency model of the dominance-based rough set approach is also presented.

Journal ArticleDOI
TL;DR: A new approach is presented that incorporates the rough set theory to model the relations that exist among a set of mixed numeric and non-numeric tourism shopping data and shows significant difference between the forecasted values and their actual counterparts.

Journal ArticleDOI
TL;DR: The paper presents a transition from the crisp rough set theory to a fuzzy one, called Alpha Rough Set Theory or, in short, α-RST, which leads naturally to the new concept of alpha rough sets, representing sets with fuzzy non-empty boundaries.

Journal ArticleDOI
TL;DR: In this article, a bankruptcy prediction model based on rough sets theory is proposed for dealing with this type of prediction problem; the model was shown to be 93% accurate in predicting bankruptcy on a 100-company developmental sample and 88% accurate on the separate 100-company holdout sample.
Abstract: The high individual and social costs encountered in corporate bankruptcies make this decision problem very important to parties such as auditors, management, government policy makers, and investors. Bankruptcy is a worldwide problem and the number of bankruptcies can be considered an index of the robustness of individual country economies. The costs associated with this problem have led to special disclosure responsibilities for both management and auditors. Bankruptcy prediction is a problematic issue for all parties associated with corporate reporting since the development of a cause–effect relationship between the many attributes that may cause or be related to bankruptcy and the actual occurrence of bankruptcy is difficult. An approach that has been proposed for dealing with this type of prediction problem is rough sets theory. Rough sets theory involves a calculus of partitions. A rough sets theory based model has the following advantages: (1) the rough sets data analysis process results in the information contained in a large number of cases being reduced to a model containing a generalized description of knowledge, (2) the model is a set of easily understandable decision rules which do not normally need interpretation, (3) each decision rule is supported by a set of real examples, (4) additional information like probabilities in statistics or grade of membership in fuzzy set theory is not required. In keeping with the philosophy of building on prior research, variables identified in prior recursive partitioning research were used to develop a rough sets bankruptcy prediction model. The model was 93% accurate in predicting bankruptcy on a 100-company developmental sample and 88% accurate on the overall separate 100-company holdout sample. This was superior to the original recursive partitioning model which was only 65% accurate on the same data set. The current research findings are also compared, both in terms of predictive results and variables identified, to three prior rough sets empirical bankruptcy prediction studies. The model produced by the current research had a significantly higher prediction accuracy on its validation sample and employed fewer variables. This research significantly extends prior rough sets bankruptcy prediction research by using a larger sample size and data from U.S. public companies. Implications for both bankruptcy prediction and future research are explored. Copyright © 2000 John Wiley & Sons, Ltd.

Journal ArticleDOI
TL;DR: This work defines an r-approximate hitting set as a set that intersects at least a fraction r of the sets in C, extends this to the case where C is a weighted multiset, and explores properties of r with respect to simplification of C by absorption of supersets.
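
The basic notion is easy to state directly: a candidate set H is an r-approximate hitting set for a collection C when it intersects at least a fraction r of the members of C. The sketch below only checks this condition on invented data; the weighted-multiset and superset-absorption aspects of the paper are not reproduced.

```python
# Minimal check of the r-approximate hitting set notion: H must intersect at
# least a fraction r of the sets in the collection C.  Data is illustrative.

def is_r_approximate_hitting_set(H, C, r):
    hit = sum(1 for s in C if H & s)
    return hit / len(C) >= r

C = [{1, 2}, {2, 3}, {4, 5}, {5, 6}]
print(is_r_approximate_hitting_set({2, 5}, C, r=1.0))   # True: hits every set
print(is_r_approximate_hitting_set({2}, C, r=0.5))      # True: hits 2 of 4 sets
print(is_r_approximate_hitting_set({2}, C, r=0.75))     # False
```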

Journal ArticleDOI
TL;DR: A measure of fuzziness in rough sets is introduced and some characterizations of this measure are made with examples.

Journal ArticleDOI
TL;DR: A model for information filtering on the Web with rough set decision theory is proposed, and it shows that the rough set based model can provide an efficient approach to solve the information overload problem.
Abstract: Machine-learning techniques play an important role in information filtering. The main objective of machine learning here is to obtain users' profiles. To decrease the burden of on-line learning, it is important to seek suitable structures to represent user information needs. This paper proposes a model for information filtering on the Web. The user information need is described at two levels in this model: profiles at the category level, and Boolean queries at the document level. To efficiently estimate the relevance between the user information need and documents, the user information need is treated as a rough set on the space of documents. Rough set decision theory is used to classify new documents according to the user information need. As a result, the new documents are divided into three parts: positive region, boundary region, and negative region. An experimental system, JobAgent, is also presented to verify this model, and it shows that the rough set based model can provide an efficient approach to solving the information overload problem.
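
One way to picture the three-region split is as a decision-theoretic, three-way filter: documents whose estimated relevance (for example, a rough membership value) clears an upper threshold go to the positive region, those below a lower threshold to the negative region, and the rest to the boundary. The sketch below is a generic illustration with invented thresholds and scores, not the JobAgent implementation.

```python
# Generic sketch of a rough set style three-way split of new documents into
# positive, boundary and negative regions using two thresholds on a relevance
# score (e.g. a rough membership value).  Thresholds and scores are invented.

def three_way_filter(scores, alpha=0.7, beta=0.3):
    regions = {"positive": [], "boundary": [], "negative": []}
    for doc, score in scores.items():
        if score >= alpha:
            regions["positive"].append(doc)      # accept
        elif score <= beta:
            regions["negative"].append(doc)      # reject
        else:
            regions["boundary"].append(doc)      # defer / ask the user
    return regions

scores = {"doc1": 0.9, "doc2": 0.5, "doc3": 0.1, "doc4": 0.75}
print(three_way_filter(scores))
# {'positive': ['doc1', 'doc4'], 'boundary': ['doc2'], 'negative': ['doc3']}
```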

Journal ArticleDOI
01 Dec 2000
TL;DR: A novel approach for autonomous decision-making is developed based on the rough set theory of data mining based on a medical data set for patients with lung abnormalities referred to as solitary pulmonary nodules (SPNs).
Abstract: The researchers and practitioners of today create models, algorithms, functions, and other constructs defined in abstract spaces. The research of the future will likely be data driven. Symbolic and numeric data that are becoming available in large volumes will define the need for new data analysis techniques and tools. Data mining is an emerging area of computational intelligence that offers new theories, techniques, and tools for analysis of large data sets. In this paper, a novel approach for autonomous decision-making is developed based on the rough set theory of data mining. The approach has been tested on a medical data set for patients with lung abnormalities referred to as solitary pulmonary nodules (SPNs). The two independent algorithms developed in this paper either generate an accurate diagnosis or make no decision. The methodology discussed in the paper departs from the developments in data mining as well as current medical literature, thus creating a variable approach for autonomous decision-making.

Journal ArticleDOI
TL;DR: A highly modular framework for data-driven fuzzy ruleset induction incorporating a dimensionality-reduction step based on rough set theory is proposed, which removes redundant and information-poor attributes from the data, thereby significantly increasing the speed of the induction algorithm.

Journal ArticleDOI
TL;DR: It is suggested that this approach to rough classification should be further investigated as it can be used in a range of applications within geographic information science from data acquisition and analysis to metadata organization.
Abstract: In search for methods to handle imprecision in geographical information, this paper explores the use of rough classification to represent uncertainty. Rough classification is based on rough set theory, where an uncertain set is specified by giving an upper and a lower approximation. Novel measures are presented to assess a single rough classification, to compare a rough classification to a crisp one, and to compare two rough classifications. An extension to the error matrix paradigm is also presented, both for the rough-crisp and the rough-rough cases. An experiment on vegetation and soil data demonstrates the viability of rough classification, comparing two incompatible vegetation classifications covering the same area. The potential uses of rough sets and rough classification are discussed and it is suggested that this approach should be further investigated as it can be used in a range of applications within geographic information science, from data acquisition and analysis to metadata organization.
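
The paper's novel measures are not reproduced here, but the flavor of comparing a rough classification to a crisp one can be sketched with simple containment checks: for each object, is its crisp class certain (object in that class's lower approximation) or merely possible (in the upper approximation)? The classes and data below are invented.

```python
# Illustrative comparison of a rough classification with a crisp one: for each
# object we check whether its crisp class is certain (object in the lower
# approximation of that class) or merely possible (in the upper approximation).
# These are generic containment checks, not the paper's novel measures.

def compare_rough_to_crisp(rough, crisp):
    certain = sum(1 for o, c in crisp.items() if o in rough[c]["lower"])
    possible = sum(1 for o, c in crisp.items() if o in rough[c]["upper"])
    n = len(crisp)
    return certain / n, possible / n

# Rough classification: lower/upper approximation per class (toy land-cover map).
rough = {
    "forest": {"lower": {"p1"}, "upper": {"p1", "p2"}},
    "meadow": {"lower": {"p3"}, "upper": {"p2", "p3", "p4"}},
}
crisp = {"p1": "forest", "p2": "forest", "p3": "meadow", "p4": "forest"}
print(compare_rough_to_crisp(rough, crisp))   # (0.5, 0.75)
```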

Journal ArticleDOI
01 Aug 2000 - Infor
TL;DR: The paper concentrates on describing the rough set approach to the multicriteria sorting problem, illustrated by a case study of airline company financial ratings.
Abstract: The original version of the rough sets theory has proved to be particularly useful in the analysis of multiattribute classification problems under inconsistency following from information granulation, i.e. objects having the same description but belonging to different classes. It fails, however, when attributes with preference-ordered domains (criteria) have to be taken into account. In order to deal with problems of multicriteria decision analysis (MCDA), such as sorting, choice or ranking, the authors have extended the original rough sets theory in a number of directions. The main extension is the substitution of the indiscernibility relation by a dominance relation which permits approximation of ordered decision classes in multicriteria sorting. Second extension was necessary to approximate preference relations in multicriteria choice and ranking problems; it requires substitution of the data table by a pairwise comparison table, where each row corresponds to a pair of actions described by bina...

Book ChapterDOI
01 Dec 2000
TL;DR: In this article, the authors discuss selected rough set based solutions to two main knowledge discovery problems, namely the description problem and the classification (prediction) problem, which are the subject of the field of knowledge discovery in databases.
Abstract: The amount of electronic data available is growing very fast and this explosive growth in databases has generated a need for new techniques and tools that can intelligently and automatically extract implicit, previously unknown, hidden and potentially useful information and knowledge from these data. These tools and techniques are the subject of the field of Knowledge Discovery in Databases. In this chapter we discuss selected rough set based solutions to two main knowledge discovery problems, namely the description problem and the classification (prediction) problem.

Book ChapterDOI
01 Jan 2000
TL;DR: An extension of the rough set methodology to the analysis of incomplete data tables is proposed; the adapted relations of indiscernibility or dominance between a pair of objects are treated as directional statements in which a subject object is compared to a referent object.
Abstract: Rough sets methodology is a useful tool for analysis of decision problems concerning a set of objects described in a data table by a set of condition attributes and by a set of decision attributes. In practical applications, however, the data table is often not complete because some data are missing. To deal with this case, we propose an extension of the rough set methodology to the analysis of incomplete data tables. The adaptation concerns both the classical rough set approach based on the use of indiscernibility relations and the new rough set approach based on the use of dominance relations. While the first approach deals with the multi-attribute classification problem, the second approach deals with the multi-criteria sorting problem. In the latter, condition attributes have preference-ordered scales, and thus are called criteria, and the classes defined by the decision attributes are also preference-ordered. The adapted relations of indiscernibility or dominance between a pair of objects are considered as directional statements where a subject is compared to a referent object. We require that the referent object has no missing data. The two adapted rough set approaches boil down to the original approaches when there are no missing data. The rules induced from the newly defined rough approximations are either exact or approximate, depending on whether or not they are supported by consistent objects, and they are robust in the sense that each rule is supported by at least one object with no missing data on the condition attributes or criteria represented in the rule.
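
One plausible reading of the directional comparison for the classification case is sketched below: the referent object must be complete on the considered attributes, and the subject must agree with it wherever the subject's value is known. This is an illustrative reading with invented attributes, not necessarily the exact relation defined in the chapter.

```python
# Sketch of a directional indiscernibility check tolerating missing values
# ('*'): the referent must be complete on the chosen attributes, and the
# subject must agree with it wherever the subject's value is known.  This is
# one plausible reading of the chapter's relation, not its exact definition.

MISSING = "*"

def directionally_indiscernible(subject, referent, attrs):
    if any(referent[a] == MISSING for a in attrs):
        return False                      # the referent is required to be complete
    return all(subject[a] == MISSING or subject[a] == referent[a] for a in attrs)

referent = {"temp": "high", "cough": "yes"}
subject1 = {"temp": "high", "cough": MISSING}    # agrees wherever its value is known
subject2 = {"temp": "low",  "cough": "yes"}      # disagrees on temp
print(directionally_indiscernible(subject1, referent, ["temp", "cough"]))  # True
print(directionally_indiscernible(subject2, referent, ["temp", "cough"]))  # False
```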

Journal ArticleDOI
TL;DR: The proposed classification method is applied to the handwritten numeral character classification problem and its classification performance and learning time are compared with those of the feedforward neural network's backpropagation algorithm.
Abstract: Proposes a data classification method based on the tolerant rough set that extends the existing equivalent rough set. A similarity measure between two data items is described by a distance function over all constituent attributes, and two items are defined to be tolerant when their similarity measure exceeds a similarity threshold value. The determination of the optimal similarity threshold value is very important for accurate classification, so we determine it optimally by using a genetic algorithm (GA), where the goal of evolution is to balance two requirements: 1) as many tolerant objects as possible should be included in the same class; and 2) objects in the same class should be tolerant of each other as much as possible. After finding the optimal similarity threshold value, a tolerant set of each object is obtained and the data set is grouped into the lower and upper approximation sets depending on the coincidence of their classes. We propose a two-stage classification method such that all data are classified by using the lower approximation at the first stage, and the data left unclassified at the first stage are then classified again by using the rough membership functions obtained from the upper approximation set. We apply the proposed classification method to the handwritten numeral character classification problem and compare its classification performance and learning time with those of the feedforward neural network's backpropagation algorithm.
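
The tolerance test itself is simple and can be sketched as below for a distance-based similarity on normalized attributes; the particular similarity form and threshold are only one plausible choice, and the GA search for the optimal threshold and the two-stage classifier are not reproduced.

```python
# Sketch of a tolerance relation from a distance-based similarity measure:
# two objects are tolerant when their similarity exceeds a threshold.  The
# similarity form and threshold are illustrative; the paper tunes the
# threshold with a genetic algorithm, which is not reproduced here.

def similarity(x, y):
    """Mean per-attribute similarity, assuming attributes scaled to [0, 1]."""
    return sum(1.0 - abs(a - b) for a, b in zip(x, y)) / len(x)

def tolerance_class(data, idx, threshold):
    """Indices of all objects tolerant with object `idx` (always includes idx)."""
    return {j for j, y in enumerate(data) if similarity(data[idx], y) >= threshold}

data = [(0.1, 0.2), (0.15, 0.25), (0.9, 0.8)]
print(tolerance_class(data, 0, threshold=0.9))   # {0, 1}: object 2 is too dissimilar
```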

Journal ArticleDOI
TL;DR: Describes a way of designing a hybrid decision support system in soft computing paradigm for detecting the different stages of cervical cancer using rough set theory and the Interactive Dichotomizer 3 (ID3) algorithm.
Abstract: Describes a way of designing a hybrid decision support system in the soft computing paradigm for detecting the different stages of cervical cancer. Hybridization includes the evolution of knowledge-based subnetwork modules with genetic algorithms (GAs) using rough set theory and the Interactive Dichotomizer 3 (ID3) algorithm. Crude subnetworks obtained via rough set theory and the ID3 algorithm are evolved using GAs. The evolution uses a restricted mutation operator which utilizes the knowledge of the modular structure, already generated, for faster convergence. The GA tunes the network weights and structure simultaneously. The aforesaid integration enhances the performance in terms of classification score, network size and training time, as compared to the conventional multilayer perceptron. This methodology also helps in imposing a structure on the weights, which results in a network more suitable for extraction of logical rules and human interpretation of the inferencing procedure.

BookDOI
01 Jan 2000
TL;DR: This book discusses Multi-Objective Fuzzy Linear Programming, an Extension of the Axioms of Utility Theory Based on Fuzzy Rationality Measures, and Monotone Functions on Finite Lattices, an Ordinal Approach to Capacities, Belief and Necessity Functions.
Abstract: (P,Q,I,J) - Preference Structures.- Multi-Objective Fuzzy Linear Programming: The MOFAC Method.- An Extension of the Axioms of Utility Theory Based on Fuzzy Rationality Measures.- Hybrid Probabilistic-Possibilistic Mixtures and Utility Functions.- Additive Recursive Rules.- Maximizing the Information Obtained from Data Fusion.- Social Choice under Fuzziness: A Perspective.- Fuzzy Extension of the Rough Set Approach to Multicriteria and Multiattribute Sorting.- Behavioral Analysis of Aggregation in Multicriteria Decision Aid.- To be Symmetric or Asymmetric? A Dilemma in Decision Making.- Monotone Functions on Finite Lattices: An Ordinal Approach to Capacities, Belief and Necessity Functions.

Journal ArticleDOI
TL;DR: A new theoretical concept, strong compressibility, is defined, and the mathematical foundation for an efficient algorithm, the Expansion Algorithm, is presented, for generation of all reducts of an information system.
Abstract: When data sets are analyzed, statistical pattern recognition is often used to find the information hidden in the data. Another approach to information discovery is data mining. Data mining is concerned with finding previously undiscovered relationships in data sets. Rough set theory provides a theoretical basis from which to find these undiscovered relationships. We define a new theoretical concept, strong compressibility, and present the mathematical foundation for an efficient algorithm, the Expansion Algorithm, for generation of all reducts of an information system. The process of finding reducts has been proven to be NP-hard. Using the elimination method, problems of size 13 could be solved in reasonable times. Using our Expansion Algorithm, the size of problems that can be solved has grown to 40. Further, by using the strong compressibility property in the Expansion Algorithm, additional savings of up to 50% can be achieved. This paper presents this algorithm and the simulation results obtained from randomly generated information systems.
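
To make the object of the Expansion Algorithm concrete, the sketch below enumerates all reducts of a tiny decision table by brute force: the minimal attribute subsets that preserve the positive region of the full attribute set. The exponential search over subsets is exactly what makes the problem hard; the paper's Expansion Algorithm and strong compressibility optimization are not reproduced, and the table is invented.

```python
# Brute-force enumeration of all reducts of a tiny decision table: minimal
# attribute subsets preserving the positive region of the full attribute set.

from itertools import combinations

def positive_region(table, attrs, dec):
    """Objects whose condition-attribute class has a single decision value."""
    groups = {}
    for i, row in enumerate(table):
        groups.setdefault(tuple(row[a] for a in attrs), []).append(i)
    pos = set()
    for members in groups.values():
        if len({table[i][dec] for i in members}) == 1:
            pos.update(members)
    return pos

def all_reducts(table, cond_attrs, dec):
    """All minimal attribute subsets preserving the full positive region."""
    full = positive_region(table, cond_attrs, dec)
    preserving = [set(s)
                  for k in range(1, len(cond_attrs) + 1)
                  for s in combinations(cond_attrs, k)
                  if positive_region(table, list(s), dec) == full]
    # Keep only minimal preserving subsets: those with no proper preserving subset.
    return [s for s in preserving if not any(t < s for t in preserving)]

table = [
    {"a": 0, "b": 0, "c": 1, "d": "yes"},
    {"a": 0, "b": 1, "c": 0, "d": "no"},
    {"a": 1, "b": 1, "c": 1, "d": "yes"},
]
print(all_reducts(table, ["a", "b", "c"], "d"))   # [{'c'}, {'a', 'b'}]
```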

Journal ArticleDOI
TL;DR: The paper solves the problem of determining under which additional assumptions a given closure operator can be represented by an upper approximation within rough set theory.


Book ChapterDOI
01 Jan 2000
TL;DR: This work approximates sets using a similarity relation which is only reflexive, thereby relaxing the properties of symmetry and transitivity that characterize the indiscernibility relation on which the rough sets theory was originally founded.
Abstract: The rough sets theory has proved to be a very useful tool for analysis of information tables describing objects by means of disjoint subsets of condition and decision attributes. The key idea of rough sets is approximation of knowledge expressed by decision attributes using knowledge expressed by condition attributes. From a formal point of view, the rough sets theory was originally founded on the idea of approximating a given set represented by objects having the same description in terms of decision attributes, by means of an indiscernibility binary relation linking pairs of objects having the same description by condition attributes. The indiscernibility relation is an equivalence binary relation (reflexive, symmetric and transitive) and implies an impossibility to distinguish two objects having the same description in terms of the condition attributes. It produces crisp granules of knowledge that are used to build approximations. In reality, due to vagueness of the available information about objects, small differences are not considered significant. This situation may be formally modelled by similarity or tolerance relations instead of the indiscernibility relation. We are using a similarity relation which is only reflexive, relaxing therefore the properties of symmetry and transitivity.
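
Once the relation is no longer symmetric or transitive, the equivalence class is commonly replaced by the similarity class R(x) of objects similar to x. The sketch below uses one common generalization (lower approximation: objects whose similarity class is included in the target; upper approximation: union of similarity classes of objects in the target); the exact definitions adopted in this chapter may use the relation in the other direction, and the numeric example is invented.

```python
# Sketch of rough approximations built from a reflexive (but not necessarily
# symmetric or transitive) similarity relation: R(x) collects the objects
# similar to x.  One common generalization; the chapter's exact definitions
# may differ in which direction of the relation they use.

def similarity_class(x, universe, similar):
    return {y for y in universe if similar(x, y)}

def approximations(universe, target, similar):
    lower = {x for x in universe if similarity_class(x, universe, similar) <= target}
    upper = set().union(*(similarity_class(x, universe, similar) for x in target))
    return lower, upper

def similar(x, y):
    """y is similar to x when it lies within 1 above x (reflexive, not symmetric)."""
    return 0 <= y - x <= 1

universe = {0, 1, 2, 3, 5}
print(approximations(universe, {0, 1, 5}, similar))
# lower = {0, 5}, upper = {0, 1, 2, 5}
```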