scispace - formally typeset
Search or ask a question
Author

Chieh-Jen Wang

Bio: Chieh-Jen Wang is an academic researcher from Huafan University. The author has contributed to research in topics: Support vector machine & Feature selection. The author has an hindex of 2, co-authored 2 publications receiving 1884 citations.

Papers
More filters
Journal ArticleDOI
TL;DR: This research presents a genetic algorithm approach for feature selection and parameters optimization to solve the problem of optimizing parameters and feature subset without degrading the SVM classification accuracy.
Abstract: Support Vector Machines, one of the new techniques for pattern classification, have been widely used in many application areas. The kernel parameters setting for SVM in a training process impacts on the classification accuracy. Feature selection is another factor that impacts classification accuracy. The objective of this research is to simultaneously optimize the parameters and feature subset without degrading the SVM classification accuracy. We present a genetic algorithm approach for feature selection and parameters optimization to solve this kind of problem. We tried several real-world datasets using the proposed GA-based approach and the Grid algorithm, a traditional method of performing parameters searching. Compared with the Grid algorithm, our proposed GA-based approach significantly improves the classification accuracy and has fewer input features for support vector machines. q 2005 Elsevier Ltd. All rights reserved.

1,316 citations

Journal ArticleDOI
TL;DR: Experimental results show that SVM is a promising addition to the existing data mining methods and three strategies to construct the hybrid SVM-based credit scoring models are used.
Abstract: The credit card industry has been growing rapidly recently, and thus huge numbers of consumers' credit data are collected by the credit department of the bank. The credit scoring manager often evaluates the consumer's credit with intuitive experience. However, with the support of the credit classification model, the manager can accurately evaluate the applicant's credit score. Support Vector Machine (SVM) classification is currently an active research area and successfully solves classification problems in many domains. This study used three strategies to construct the hybrid SVM-based credit scoring models to evaluate the applicant's credit score from the applicant's input features. Two credit datasets in UCI database are selected as the experimental data to demonstrate the accuracy of the SVM classifier. Compared with neural networks, genetic programming, and decision tree classifiers, the SVM classifier achieved an identical classificatory accuracy with relatively few input features. Additionally, combining genetic algorithms with SVM classifier, the proposed hybrid GA-SVM strategy can simultaneously perform feature selection task and model parameters optimization. Experimental results show that SVM is a promising addition to the existing data mining methods.

766 citations


Cited by
More filters
Journal ArticleDOI
08 Aug 2019
TL;DR: A comprehensive overview and analysis of the most recent research in machine learning principles, algorithms, descriptors, and databases in materials science, and proposes solutions and future research paths for various challenges in computational materials science.
Abstract: One of the most exciting tools that have entered the material science toolbox in recent years is machine learning. This collection of statistical methods has already proved to be capable of considerably speeding up both fundamental and applied research. At present, we are witnessing an explosion of works that develop and apply machine learning to solid-state systems. We provide a comprehensive overview and analysis of the most recent research in this topic. As a starting point, we introduce machine learning principles, algorithms, descriptors, and databases in materials science. We continue with the description of different machine learning approaches for the discovery of stable materials and the prediction of their crystal structure. Then we discuss research in numerous quantitative structure–property relationships and various approaches for the replacement of first-principle methods by machine learning. We review how active learning and surrogate-based optimization can be applied to improve the rational design process and related examples of applications. Two major questions are always the interpretability of and the physical understanding gained from machine learning models. We consider therefore the different facets of interpretability and their importance in materials science. Finally, we propose solutions and future research paths for various challenges in computational materials science.

1,301 citations

Journal ArticleDOI
TL;DR: The authors define interpretability in the context of machine learning and introduce the predictive, descriptive, relevant (PDR) framework for discussing interpretations, with three overarching desiderata for evaluation: predictive accuracy, descriptive accuracy, and relevancy, with relevance judged relative to a human audience.
Abstract: Machine-learning models have demonstrated great success in learning complex patterns that enable them to make predictions about unobserved data. In addition to using models for prediction, the ability to interpret what a model has learned is receiving an increasing amount of attention. However, this increased focus has led to considerable confusion about the notion of interpretability. In particular, it is unclear how the wide array of proposed interpretation methods are related and what common concepts can be used to evaluate them. We aim to address these concerns by defining interpretability in the context of machine learning and introducing the predictive, descriptive, relevant (PDR) framework for discussing interpretations. The PDR framework provides 3 overarching desiderata for evaluation: predictive accuracy, descriptive accuracy, and relevancy, with relevancy judged relative to a human audience. Moreover, to help manage the deluge of interpretation methods, we introduce a categorization of existing techniques into model-based and post hoc categories, with subgroups including sparsity, modularity, and simulatability. To demonstrate how practitioners can use the PDR framework to evaluate and understand interpretations, we provide numerous real-world examples. These examples highlight the often underappreciated role played by human audiences in discussions of interpretability. Finally, based on our framework, we discuss limitations of existing methods and directions for future work. We hope that this work will provide a common vocabulary that will make it easier for both practitioners and researchers to discuss and choose from the full range of interpretation methods.

851 citations

Journal ArticleDOI
TL;DR: Experimental results demonstrate that the classification accuracy rates of the developed approach surpass those of grid search and many other approaches, and that the developed PSO+SVM approach has a similar result to GA+S VM, Therefore, the PSO + SVM approach is valuable for parameter determination and feature selection in an SVM.
Abstract: Support vector machine (SVM) is a popular pattern classification method with many diverse applications. Kernel parameter setting in the SVM training procedure, along with the feature selection, significantly influences the classification accuracy. This study simultaneously determines the parameter values while discovering a subset of features, without reducing SVM classification accuracy. A particle swarm optimization (PSO) based approach for parameter determination and feature selection of the SVM, termed PSO+SVM, is developed. Several public datasets are employed to calculate the classification accuracy rate in order to evaluate the developed PSO+SVM approach. The developed approach was compared with grid search, which is a conventional method of searching parameter values, and other approaches. Experimental results demonstrate that the classification accuracy rates of the developed approach surpass those of grid search and many other approaches, and that the developed PSO+SVM approach has a similar result to GA+SVM. Therefore, the PSO+SVM approach is valuable for parameter determination and feature selection in an SVM.

802 citations

Journal ArticleDOI
TL;DR: The study of Baesens et al. (2003) is updated and several novel classification algorithms to the state-of-the-art in credit scoring are compared, providing an independent assessment of recent scoring methods and offering a new baseline to which future approaches can be compared.

692 citations

Journal ArticleDOI
01 Sep 2008
TL;DR: Experimental results showed the proposed PSO-SVM model can correctly select the discriminating input features and also achieve high classification accuracy.
Abstract: This study proposed a novel PSO-SVM model that hybridized the particle swarm optimization (PSO) and support vector machines (SVM) to improve the classification accuracy with a small and appropriate feature subset. This optimization mechanism combined the discrete PSO with the continuous-valued PSO to simultaneously optimize the input feature subset selection and the SVM kernel parameter setting. The hybrid PSO-SVM data mining system was implemented via a distributed architecture using the web service technology to reduce the computational time. In a heterogeneous computing environment, the PSO optimization was performed on the application server and the SVM model was trained on the client (agent) computer. The experimental results showed the proposed approach can correctly select the discriminating input features and also achieve high classification accuracy.

499 citations