Journal ArticleDOI

Rough set theory: a data mining tool for semiconductor manufacturing

Andrew Kusiak1
01 Jan 2001-IEEE Transactions on Electronics Packaging Manufacturing (IEEE)-Vol. 24, Iss: 1, pp 44-50
TL;DR: Rough set theory offers a viable approach for extracting decision rules from data sets; the extracted rules can be used for making predictions in the semiconductor industry and other applications. A new rule-structuring algorithm is also proposed.
Abstract: The growing volume of information poses interesting challenges and calls for tools that discover properties of data. Data mining has emerged as a discipline that contributes tools for data analysis, discovery of new knowledge, and autonomous decision making. In this paper, the basic concepts of rough set theory and other aspects of data mining are introduced. The rough set theory offers a viable approach for extraction of decision rules from data sets. The extracted rules can be used for making predictions in the semiconductor industry and other applications. This contrasts with other approaches, such as regression analysis and neural networks, where a single model is built. One of the goals of data mining is to extract meaningful knowledge. The power, generality, accuracy, and longevity of decision rules can be increased by the application of concepts from systems engineering and evolutionary computation introduced in this paper. A new rule-structuring algorithm is proposed. The concepts presented in the paper are illustrated with examples.
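The indiscernibility and approximation machinery that underlies rough-set rule extraction can be sketched in a few lines. The toy decision table, feature names, and values below are illustrative assumptions, not data from the paper:

```python
# Toy decision table: each object is (features, decision).
# Feature names/values are hypothetical, chosen only for illustration.
objects = [
    ({"temp": "high", "pressure": "low"},  "scrap"),
    ({"temp": "high", "pressure": "low"},  "good"),
    ({"temp": "low",  "pressure": "low"},  "good"),
    ({"temp": "low",  "pressure": "high"}, "good"),
]

def indiscernibility_classes(objs, features):
    """Group objects that share identical values on the given features."""
    classes = {}
    for i, (feats, _) in enumerate(objs):
        key = tuple(feats[f] for f in features)
        classes.setdefault(key, set()).add(i)
    return list(classes.values())

def approximations(objs, features, decision_value):
    """Lower/upper approximation of the set of objects with a given decision."""
    target = {i for i, (_, d) in enumerate(objs) if d == decision_value}
    lower, upper = set(), set()
    for cls in indiscernibility_classes(objs, features):
        if cls <= target:   # class lies entirely inside the concept
            lower |= cls
        if cls & target:    # class overlaps the concept
            upper |= cls
    return lower, upper

lower, upper = approximations(objects, ["temp", "pressure"], "good")
print(lower, upper)  # objects 2 and 3 are certainly "good"; 0 and 1 are boundary
```

Certain decision rules are read off the lower approximation; the boundary region (upper minus lower) yields only possible rules, which is where rough sets differ from single-model approaches.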
Citations
Journal ArticleDOI
TL;DR: In this article, a review of data mining applications in manufacturing engineering is presented, in particular production processes, operations, fault detection, maintenance, decision support, and product quality improvement.
Abstract: The paper reviews applications of data mining in manufacturing engineering, in particular production processes, operations, fault detection, maintenance, decision support, and product quality improvement. Customer relationship management, information integration aspects, and standardization are also briefly discussed. This review is focused on demonstrating the relevancy of data mining to manufacturing industry, rather than discussing the data mining domain in general. The volume of general data mining literature makes it difficult to gain a precise view of a target area such as manufacturing engineering, which has its own particular needs and requirements for mining applications. This review reveals progressive applications in addition to existing gaps and less considered areas such as manufacturing planning and shop floor control.

499 citations

Journal ArticleDOI
TL;DR: Application of data mining to manufacturing processes and enterprises has grown rapidly over the last three years; a review of the literature reveals the progressive applications as well as existing gaps.
Abstract: In modern manufacturing environments, vast amounts of data are collected in database management systems and data warehouses from all involved areas, including product and process design, assembly, materials planning, quality control, scheduling, maintenance, fault detection etc. Data mining has emerged as an important tool for knowledge acquisition from the manufacturing databases. This paper reviews the literature dealing with knowledge discovery and data mining applications in the broad domain of manufacturing with a special emphasis on the type of functions to be performed on the data. The major data mining functions to be performed include characterization and description, association, classification, prediction, clustering and evolution analysis. The papers reviewed have therefore been categorized in these five categories. It has been shown that there is a rapid growth in the application of data mining in the context of manufacturing processes and enterprises in the last 3 years. This review reveals the progressive applications and existing gaps identified in the context of data mining in manufacturing. A novel text mining approach has also been used on the abstracts and keywords of 150 papers to identify the research gaps and find the linkages between knowledge area, knowledge type and the applied data mining tools and techniques.

450 citations


Cites background from "Rough set theory: a data mining too..."

  • ...Kusiak [119] discussed the basic concept of RST as a prediction model....


Journal ArticleDOI
TL;DR: A data mining framework based on decision trees and association rules is developed to generate useful rules for personnel selection; it provides decision rules relating personnel information to work performance and retention.
Abstract: The quality of human capital is crucial for high-tech companies to maintain competitive advantages in knowledge economy era. However, high-technology companies suffering from high turnover rates often find it hard to recruit the right talents. In addition to conventional human resource management approaches, there is an urgent need to develop effective personnel selection mechanism to find the talents who are the most suitable to their own organizations. This study aims to fill the gap by developing a data mining framework based on decision tree and association rules to generate useful rules for personnel selection. The results can provide decision rules relating personnel information with work performance and retention. An empirical study was conducted in a semiconductor company to support their hiring decision for indirect labors including engineers and managers with different job functions. The results demonstrated the practical viability of this approach. Moreover, based on discussions among domain experts and data miner, specific recruitment and human resource management strategies were created from the results.

411 citations


Cites background from "Rough set theory: a data mining too..."

  • ...However, the applications of data mining in the semiconductor industry are mostly related to engineering data analysis and yield enhancement (Braha & Shmilovici, 2002; Kusiak, 2001; Chien, Hsiao, & Wang, 2004; Chien, Wang, & Cheng, 2007)....


Journal ArticleDOI
TL;DR: Attribute generalization and its relation to feature selection and feature extraction are discussed, and a new approach for incrementally updating approximations of a concept is presented under characteristic relation-based rough sets.
Abstract: Any attribute set in an information system may evolve over time as new information arrives, so approximations of a concept obtained by rough set theory need updating for data mining and related tasks. Methods for incrementally updating approximations using the tolerance relation and the similarity relation have been studied previously in the literature; the characteristic relation-based rough sets approach provides more informative results than the tolerance- and similarity-relation-based approaches. In this paper, attribute generalization and its relation to feature selection and feature extraction are first discussed. A new approach for incrementally updating approximations of a concept is then presented under characteristic relation-based rough sets. Finally, direct computation of rough set approximations and the proposed dynamic maintenance of rough set approximations are compared. An extensive experimental evaluation on a large soybean database from MLC shows that the proposed approach effectively handles dynamic attribute generalization in data mining.

277 citations

Journal ArticleDOI
01 Dec 2007
TL;DR: This research proposes a new algorithm for clustering categorical data, termed Min-Min-Roughness (MMR), based on Rough Set Theory (RST), which has the ability to handle the uncertainty in the clustering process.
Abstract: A variety of cluster analysis techniques exist to group objects having similar characteristics. However, the implementation of many of these techniques is challenging due to the fact that much of the data contained in today's databases is categorical in nature. While there have been recent advances in algorithms for clustering categorical data, some are unable to handle uncertainty in the clustering process while others have stability issues. This research proposes a new algorithm for clustering categorical data, termed Min-Min-Roughness (MMR), based on Rough Set Theory (RST), which has the ability to handle the uncertainty in the clustering process.
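The roughness measure at the core of MMR can be illustrated with a small sketch. The table, attribute names, and values below are invented for illustration, and the full MMR algorithm additionally takes a minimum of such roughness values across attribute pairs to pick the next clustering attribute:

```python
def partition(rows, attr):
    """Equivalence classes (sets of row indices) induced by a categorical attribute."""
    classes = {}
    for i, row in enumerate(rows):
        classes.setdefault(row[attr], set()).add(i)
    return list(classes.values())

def mean_roughness(rows, attr, wrt):
    """Mean roughness of `attr` with respect to `wrt`: 0 means every value set of
    `attr` is exactly a union of `wrt` classes (crisp); 1 means maximally rough."""
    wrt_classes = partition(rows, wrt)
    values = []
    for X in partition(rows, attr):
        lower = sum(len(c) for c in wrt_classes if c <= X)   # classes inside X
        upper = sum(len(c) for c in wrt_classes if c & X)    # classes touching X
        values.append(1 - lower / upper)
    return sum(values) / len(values)

# Hypothetical categorical data, not from the cited study.
rows = [
    {"color": "red",  "size": "small"},
    {"color": "red",  "size": "small"},
    {"color": "blue", "size": "large"},
    {"color": "blue", "size": "small"},
]
print(mean_roughness(rows, "color", "size"))  # 0.875
print(mean_roughness(rows, "size", "color"))  # 0.75
```

Here "size" is less rough with respect to "color" than vice versa, so an MMR-style procedure would prefer it as the splitting attribute; this is how the algorithm handles uncertainty without requiring numeric distances.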

163 citations


Cites background from "Rough set theory: a data mining too..."

  • ...Examples include semiconductor manufacturing [20,35], the automobile industry [21], business failure predictions [3], customer retention [19], intelligent image filtering [42], clinical databases [36], classification of highway sections [22], and web mining [15] just to name a few....


References
Book
31 Oct 1991
TL;DR: Theoretical Foundations.
Abstract: I. Theoretical Foundations.- 1. Knowledge.- 1.1. Introduction.- 1.2. Knowledge and Classification.- 1.3. Knowledge Base.- 1.4. Equivalence, Generalization and Specialization of Knowledge.- Summary.- Exercises.- References.- 2. Imprecise Categories, Approximations and Rough Sets.- 2.1. Introduction.- 2.2. Rough Sets.- 2.3. Approximations of Set.- 2.4. Properties of Approximations.- 2.5. Approximations and Membership Relation.- 2.6. Numerical Characterization of Imprecision.- 2.7. Topological Characterization of Imprecision.- 2.8. Approximation of Classifications.- 2.9. Rough Equality of Sets.- 2.10. Rough Inclusion of Sets.- Summary.- Exercises.- References.- 3. Reduction of Knowledge.- 3.1. Introduction.- 3.2. Reduct and Core of Knowledge.- 3.3. Relative Reduct and Relative Core of Knowledge.- 3.4. Reduction of Categories.- 3.5. Relative Reduct and Core of Categories.- Summary.- Exercises.- References.- 4. Dependencies in Knowledge Base.- 4.1. Introduction.- 4.2. Dependency of Knowledge.- 4.3. Partial Dependency of Knowledge.- Summary.- Exercises.- References.- 5. Knowledge Representation.- 5.1. Introduction.- 5.2. Examples.- 5.3. Formal Definition.- 5.4. Significance of Attributes.- 5.5. Discernibility Matrix.- Summary.- Exercises.- References.- 6. Decision Tables.- 6.1. Introduction.- 6.2. Formal Definition and Some Properties.- 6.3. Simplification of Decision Tables.- Summary.- Exercises.- References.- 7. Reasoning about Knowledge.- 7.1. Introduction.- 7.2. Language of Decision Logic.- 7.3. Semantics of Decision Logic Language.- 7.4. Deduction in Decision Logic.- 7.5. Normal Forms.- 7.6. Decision Rules and Decision Algorithms.- 7.7. Truth and Indiscernibility.- 7.8. Dependency of Attributes.- 7.9. Reduction of Consistent Algorithms.- 7.10. Reduction of Inconsistent Algorithms.- 7.11. Reduction of Decision Rules.- 7.12. Minimization of Decision Algorithms.- Summary.- Exercises.- References.- II. Applications.- 8. Decision Making.- 8.1. Introduction.- 8.2. Optician's Decisions Table.- 8.3. Simplification of Decision Table.- 8.4. Decision Algorithm.- 8.5. The Case of Incomplete Information.- Summary.- Exercises.- References.- 9. Data Analysis.- 9.1. Introduction.- 9.2. Decision Table as Protocol of Observations.- 9.3. Derivation of Control Algorithms from Observation.- 9.4. Another Approach.- 9.5. The Case of Inconsistent Data.- Summary.- Exercises.- References.- 10. Dissimilarity Analysis.- 10.1. Introduction.- 10.2. The Middle East Situation.- 10.3. Beauty Contest.- 10.4. Pattern Recognition.- 10.5. Buying a Car.- Summary.- Exercises.- References.- 11. Switching Circuits.- 11.1. Introduction.- 11.2. Minimization of Partially Defined Switching Functions.- 11.3. Multiple-Output Switching Functions.- Summary.- Exercises.- References.- 12. Machine Learning.- 12.1. Introduction.- 12.2. Learning From Examples.- 12.3. The Case of an Imperfect Teacher.- 12.4. Inductive Learning.- Summary.- Exercises.- References.

7,826 citations


"Rough set theory: a data mining too..." refers background in this paper

  • ...A reduct is a minimal sufficient subset of features RED ⊆ A such that (Shan et al. [22] after Pawlak [21]): a) IND(RED) = IND(A), i.e., RED produces the same classification of objects as the collection A of all features; b) for any feature f ∈ RED, IND(RED − {f}) ≠ IND(A), i.e., a reduct is a minimal subset with respect to the property a);...


  • ...discourse [21]. Objects described by the same properly selected information (referred to in this paper as features) are indiscernible....

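Under that definition, the reducts of a small table can be found by brute force. The table, feature names, and values below are illustrative assumptions, and IND(·) is checked by comparing the partitions each feature subset induces:

```python
from itertools import combinations

# Hypothetical feature table; a reduct is a minimal feature subset that
# induces the same partition of objects as the full feature set.
table = [
    {"f1": 0, "f2": 1, "f3": 0},
    {"f1": 0, "f2": 0, "f3": 1},
    {"f1": 1, "f2": 1, "f3": 0},
    {"f1": 1, "f2": 0, "f3": 1},
]

def partition(rows, features):
    """Equivalence classes of row indices induced by a feature subset."""
    classes = {}
    for i, row in enumerate(rows):
        classes.setdefault(tuple(row[f] for f in features), set()).add(i)
    return sorted(map(frozenset, classes.values()), key=min)

def reducts(rows):
    """Enumerate subsets by size; keep those matching the full partition
    that do not contain an already-found (smaller) reduct."""
    all_feats = sorted(rows[0])
    full = partition(rows, all_feats)
    found = []
    for r in range(1, len(all_feats) + 1):
        for subset in combinations(all_feats, r):
            if partition(rows, list(subset)) == full and \
               not any(set(f) <= set(subset) for f in found):
                found.append(subset)
    return found

print(reducts(table))  # -> [('f1', 'f2'), ('f1', 'f3')]
```

Exhaustive search is exponential in the number of features; practical rough-set tools use discernibility-matrix or heuristic methods instead, but the minimality property checked here is the same.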

Journal ArticleDOI
TL;DR: In this article, a generalized form of the cross-validation criterion is applied to the choice and assessment of prediction using the data-analytic concept of a prescription, and examples used to illustrate the application are drawn from the problem areas of univariate estimation, linear regression and analysis of variance.
Abstract: SUMMARY A generalized form of the cross-validation criterion is applied to the choice and assessment of prediction using the data-analytic concept of a prescription. The examples used to illustrate the application are drawn from the problem areas of univariate estimation, linear regression and analysis of variance.

7,385 citations


"Rough set theory: a data mining too..." refers methods in this paper

  • ...The cross-validation method discussed in [20] suggests dividing the set of all objects into disjoint groups, usually of equal size....

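The cross-validation protocol described in the snippet can be sketched as follows; the fold construction and the dummy scorer are illustrative, not the paper's own implementation:

```python
import random

def k_fold_indices(n, k, seed=0):
    """Randomly split n object indices into k disjoint groups of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validate(n_objects, k, train_and_score):
    """Hold out each group in turn, train on the rest, and average the scores."""
    folds = k_fold_indices(n_objects, k)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        scores.append(train_and_score(train_idx, test_idx))
    return sum(scores) / k

# Dummy scorer just to show the protocol: score = held-out fraction.
avg = cross_validate(10, 5, lambda train, test: len(test) / 10)
print(avg)  # each fold holds out 2 of 10 objects -> 0.2
```

In the rule-extraction setting, `train_and_score` would induce decision rules from the training objects and report their classification accuracy on the held-out group.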

Journal ArticleDOI
TL;DR: Bernard Rosner's FUNDAMENTALS OF BIOSTATISTICS is a practical introduction to the methods, techniques, and computation of statistics with human subjects that prepares students for their future courses and careers.
Abstract: Bernard Rosner's FUNDAMENTALS OF BIOSTATISTICS is a practical introduction to the methods, techniques, and computation of statistics with human subjects. It prepares students for their future courses and careers by introducing the statistical methods most often used in medical literature. Rosner minimizes the amount of mathematical formulation (algebra-based) while still giving complete explanations of all the important concepts. As in previous editions, a major strength of this book is that every new concept is developed systematically through completely worked out examples from current medical research problems.

4,624 citations


"Rough set theory: a data mining too..." refers background in this paper

  • ...Accuracy is defined as the total number of true positives added to the total number of true negatives divided by the total number of patients studied [25], i....


  • ...16 the following metrics are defined in addition to accuracy [25]:...

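The accuracy definition quoted above, together with the companion metrics, amounts to simple arithmetic on confusion-matrix counts; the function name and example counts below are illustrative:

```python
def classification_metrics(tp, tn, fp, fn):
    """Standard metrics from confusion-matrix counts, as used when
    validating extracted decision rules against known outcomes."""
    total = tp + tn + fp + fn
    return {
        "accuracy":    (tp + tn) / total,  # correct over all cases
        "sensitivity": tp / (tp + fn),     # true-positive rate
        "specificity": tn / (tn + fp),     # true-negative rate
    }

# Hypothetical counts for 100 cases.
m = classification_metrics(tp=40, tn=45, fp=5, fn=10)
print(m)  # accuracy 0.85, sensitivity 0.8, specificity 0.9
```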

Book
01 Jan 1982
TL;DR: Bernard Rosner's "Fundamentals of BIOSTATISTICS" as mentioned in this paper is a practical introduction to the methods, techniques, and computation of statistics with human subjects.
Abstract: Bernard Rosner's FUNDAMENTALS OF BIOSTATISTICS is a practical introduction to the methods, techniques, and computation of statistics with human subjects. It prepares students for their future courses and careers by introducing the statistical methods most often used in medical literature. Rosner minimizes the amount of mathematical formulation (algebra-based) while still giving complete explanations of all the important concepts. As in previous editions, a major strength of this book is that every new concept is developed systematically through completely worked out examples from current medical research problems.

4,438 citations

Book
01 Oct 1998

2,830 citations