Journal Article•DOI•

A GA-based feature selection and parameters optimization for support vector machines

TL;DR: This research presents a genetic algorithm approach that simultaneously optimizes the SVM parameters and the feature subset without degrading classification accuracy.
Abstract: Support Vector Machines, one of the new techniques for pattern classification, have been widely used in many application areas. The kernel parameters setting for SVM in a training process impacts on the classification accuracy. Feature selection is another factor that impacts classification accuracy. The objective of this research is to simultaneously optimize the parameters and feature subset without degrading the SVM classification accuracy. We present a genetic algorithm approach for feature selection and parameters optimization to solve this kind of problem. We tried several real-world datasets using the proposed GA-based approach and the Grid algorithm, a traditional method of performing parameters searching. Compared with the Grid algorithm, our proposed GA-based approach significantly improves the classification accuracy and has fewer input features for support vector machines. © 2005 Elsevier Ltd. All rights reserved.
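As a rough illustration of the paper's idea, the sketch below (not the authors' implementation; the dataset, population size, and gene ranges are illustrative assumptions) evolves a binary feature mask together with log-scaled C and gamma genes, using cross-validated SVM accuracy as the GA fitness:

```python
# Minimal GA sketch: jointly evolve an SVM feature mask and RBF parameters
# C and gamma, scored by cross-validated accuracy (illustrative settings only).
import random
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

random.seed(0)
X, y = load_breast_cancer(return_X_y=True)
n_features = X.shape[1]

def random_chromosome():
    # Binary mask over features plus log10(C) and log10(gamma) genes.
    mask = [random.randint(0, 1) for _ in range(n_features)]
    return mask, random.uniform(-2, 4), random.uniform(-6, 0)

def fitness(chrom):
    mask, log_c, log_g = chrom
    cols = [i for i, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # empty feature subsets are infeasible
    clf = SVC(C=10 ** log_c, gamma=10 ** log_g)
    return cross_val_score(clf, X[:, cols], y, cv=3).mean()

def crossover(a, b):
    cut = random.randrange(1, n_features)  # one-point crossover on the mask
    mask = a[0][:cut] + b[0][cut:]
    return mask, random.choice([a[1], b[1]]), random.choice([a[2], b[2]])

def mutate(chrom, rate=0.05):
    mask = [bit ^ (random.random() < rate) for bit in chrom[0]]
    return mask, chrom[1] + random.gauss(0, 0.3), chrom[2] + random.gauss(0, 0.3)

pop = [random_chromosome() for _ in range(10)]
for _ in range(5):  # a few generations; real runs use far more
    pop.sort(key=fitness, reverse=True)
    elite = pop[:4]
    pop = elite + [mutate(crossover(*random.sample(elite, 2))) for _ in range(6)]

best = max(pop, key=fitness)
print("best CV accuracy:", round(fitness(best), 3))
```

The chromosome encoding (mask bits plus two real-valued parameter genes) mirrors the simultaneous optimization the abstract describes; everything else is a generic elitist GA.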
Citations
Journal Article•DOI•
TL;DR: Experimental results demonstrate that the classification accuracy rates of the developed approach surpass those of grid search and many other approaches, and that the developed PSO+SVM approach yields results similar to GA+SVM. Therefore, the PSO+SVM approach is valuable for parameter determination and feature selection in an SVM.
Abstract: Support vector machine (SVM) is a popular pattern classification method with many diverse applications. Kernel parameter setting in the SVM training procedure, along with the feature selection, significantly influences the classification accuracy. This study simultaneously determines the parameter values while discovering a subset of features, without reducing SVM classification accuracy. A particle swarm optimization (PSO) based approach for parameter determination and feature selection of the SVM, termed PSO+SVM, is developed. Several public datasets are employed to calculate the classification accuracy rate in order to evaluate the developed PSO+SVM approach. The developed approach was compared with grid search, which is a conventional method of searching parameter values, and other approaches. Experimental results demonstrate that the classification accuracy rates of the developed approach surpass those of grid search and many other approaches, and that the developed PSO+SVM approach has a similar result to GA+SVM. Therefore, the PSO+SVM approach is valuable for parameter determination and feature selection in an SVM.
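The PSO counterpart can be sketched along the same lines; the toy example below (again an illustrative assumption, not the paper's code) searches only the continuous (log10 C, log10 gamma) plane using the standard inertia-weight velocity update:

```python
# Minimal continuous-PSO sketch for SVM parameter determination:
# particles move in the (log10 C, log10 gamma) plane, scored by CV accuracy.
import random
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import scale
from sklearn.svm import SVC

random.seed(0)
X, y = load_wine(return_X_y=True)
X = scale(X)  # SVMs are sensitive to feature scaling

def fitness(pos):
    log_c, log_g = pos
    clf = SVC(C=10 ** log_c, gamma=10 ** log_g)
    return cross_val_score(clf, X, y, cv=3).mean()

# Small swarm and few iterations for brevity; real runs use more of both.
swarm = [[random.uniform(-2, 4), random.uniform(-6, 0)] for _ in range(8)]
vel = [[0.0, 0.0] for _ in swarm]
pbest = [p[:] for p in swarm]          # personal bests
gbest = max(pbest, key=fitness)        # global best

w, c1, c2 = 0.7, 1.5, 1.5  # inertia and acceleration coefficients
for _ in range(10):
    for i, p in enumerate(swarm):
        for d in range(2):
            # v <- w*v + c1*r1*(pbest - x) + c2*r2*(gbest - x)
            vel[i][d] = (w * vel[i][d]
                         + c1 * random.random() * (pbest[i][d] - p[d])
                         + c2 * random.random() * (gbest[d] - p[d]))
            p[d] += vel[i][d]
        if fitness(p) > fitness(pbest[i]):
            pbest[i] = p[:]
    gbest = max(pbest, key=fitness)

print("best CV accuracy:", round(fitness(gbest), 3))
```

Extending this to feature selection, as the paper does, would add a discrete (binary) position component per feature alongside the two continuous parameter dimensions.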

802 citations


Cites methods from "A GA-based feature selection and pa..."

  • ...Moreover, Huang et al. (2006) utilized the GA-based feature selection and parameter optimization for credit scoring....


  • ...The results obtained by the developed PSO + SVM approach with/without feature selection were compared with those of GA + SVM developed by Huang et al. (2006). The classification accuracy rates are cited from their original papers....


Journal Article•DOI•
01 Sep 2008
TL;DR: Experimental results showed the proposed PSO-SVM model can correctly select the discriminating input features and also achieve high classification accuracy.
Abstract: This study proposed a novel PSO-SVM model that hybridized the particle swarm optimization (PSO) and support vector machines (SVM) to improve the classification accuracy with a small and appropriate feature subset. This optimization mechanism combined the discrete PSO with the continuous-valued PSO to simultaneously optimize the input feature subset selection and the SVM kernel parameter setting. The hybrid PSO-SVM data mining system was implemented via a distributed architecture using the web service technology to reduce the computational time. In a heterogeneous computing environment, the PSO optimization was performed on the application server and the SVM model was trained on the client (agent) computer. The experimental results showed the proposed approach can correctly select the discriminating input features and also achieve high classification accuracy.

499 citations

Journal Article•DOI•
TL;DR: This study investigates the impact of fourteen data normalization methods on classification performance, considering the full feature set, feature selection, and feature weighting, and suggests sets of the best and worst normalization methods based on an empirical analysis of the results.

469 citations

Journal Article•DOI•
TL;DR: The empirical results demonstrate that the proposed FOA-SVM method can obtain much more appropriate model parameters as well as significantly reduce the computational time, which generates a high classification accuracy.
Abstract: In this paper, a new support vector machines (SVM) parameter tuning scheme that uses the fruit fly optimization algorithm (FOA) is proposed. Termed FOA-SVM, the scheme is successfully applied to medical diagnosis. In the proposed FOA-SVM, the FOA technique effectively and efficiently tunes the SVM parameters. Additionally, the effectiveness and efficiency of FOA-SVM is rigorously evaluated against four well-known medical datasets, including the Wisconsin breast cancer dataset, the Pima Indians diabetes dataset, the Parkinson dataset, and the thyroid disease dataset, in terms of classification accuracy, sensitivity, specificity, AUC (the area under the receiver operating characteristic (ROC) curve) criterion, and processing time. Four competitive counterparts are employed for comparison purposes, including the particle swarm optimization algorithm-based SVM (PSO-SVM), genetic algorithm-based SVM (GA-SVM), bacterial foraging optimization-based SVM (BFO-SVM), and grid search technique-based SVM (Grid-SVM). The empirical results demonstrate that the proposed FOA-SVM method can obtain much more appropriate model parameters as well as significantly reduce the computational time, which generates a high classification accuracy. Promisingly, the proposed method can be regarded as a useful clinical tool for medical decision making.

456 citations


Cites background from "A GA-based feature selection and pa..."

  • ...Recently, biologically-inspired metaheuristics (such as the genetic algorithm [17], particle swarm optimization (PSO) [18], and bacterial foraging optimization (BFO) [19]) have been shown to be more likely to determine the global optimum solution than the traditional aforementioned methods....


Journal Article•DOI•
TL;DR: A novel PSO-SVM model has been proposed that hybridized the particle swarm optimization (PSO) and SVM to improve the EMG signal classification accuracy and validate the superiority of the SVM method compared to conventional machine learning methods.

422 citations

References
Book•
01 Sep 1988
TL;DR: In this article, the authors present the computer techniques, mathematical tools, and research results that will enable both students and practitioners to apply genetic algorithms to problems in many fields, including computer programming and mathematics.
Abstract: From the Publisher: This book brings together - in an informal and tutorial fashion - the computer techniques, mathematical tools, and research results that will enable both students and practitioners to apply genetic algorithms to problems in many fields. Major concepts are illustrated with running examples, and major algorithms are illustrated by Pascal computer programs. No prior knowledge of GAs or genetics is assumed, and only a minimum of computer programming and mathematics background is required.

52,797 citations

Journal Article•DOI•
TL;DR: Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.
Abstract: LIBSVM is a library for Support Vector Machines (SVMs). We have been actively developing this package since the year 2000. The goal is to help users to easily apply SVM to their applications. LIBSVM has gained wide popularity in machine learning and many other areas. In this article, we present all implementation details of LIBSVM. Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.
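Since scikit-learn's SVC wraps LIBSVM, the cross-validated grid search over C and gamma that the main paper compares against can be sketched as follows (the dataset and grid values here are illustrative assumptions):

```python
# Grid-search parameter selection for an RBF-kernel SVM, sketched with
# scikit-learn's GridSearchCV (SVC is backed by LIBSVM).
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Exponentially spaced grid, the conventional choice for C and gamma.
grid = {"C": [2 ** k for k in range(-5, 16, 4)],
        "gamma": [2 ** k for k in range(-15, 4, 4)]}

search = GridSearchCV(SVC(kernel="rbf"), grid, cv=5)
search.fit(X, y)

print("best params:", search.best_params_)
print("best CV accuracy:", round(search.best_score_, 3))
```

Unlike the GA-based approach, this search tunes only the two kernel parameters and uses all input features; that difference is exactly what the main paper's comparison exploits.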

40,826 citations

Book•
Vladimir Vapnik1•
01 Jan 1995
TL;DR: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?
Abstract: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?

40,147 citations


"A GA-based feature selection and pa..." refers methods in this paper

  • ...Support vector machines (SVM) were first suggested by Vapnik (1995) and have recently been used in a range of problems including pattern recognition (Pontil and Verri, 1998), bioinformatics (Yu, Ostrouchov, Geist, & Samatova, 1999), and text categorization (Joachims, 1998)....


Journal Article•DOI•
Christopher John Burges1•
TL;DR: Several arguments supporting the observed high accuracy of SVMs are reviewed, and numerous examples and proofs of most of the key theorems are given.
Abstract: The tutorial starts with an overview of the concepts of VC dimension and structural risk minimization. We then describe linear Support Vector Machines (SVMs) for separable and non-separable data, working through a non-trivial example in detail. We describe a mechanical analogy, and discuss when SVM solutions are unique and when they are global. We describe how support vector training can be practically implemented, and discuss in detail the kernel mapping technique which is used to construct SVM solutions which are nonlinear in the data. We show how Support Vector machines can have very large (even infinite) VC dimension by computing the VC dimension for homogeneous polynomial and Gaussian radial basis function kernels. While very high VC dimension would normally bode ill for generalization performance, and while at present there exists no theory which shows that good generalization performance is guaranteed for SVMs, there are several arguments which support the observed high accuracy of SVMs, which we review. Results of some experiments which were inspired by these arguments are also presented. We give numerous examples and proofs of most of the key theorems. There is new material, and I hope that the reader will find that even old material is cast in a fresh light.

15,696 citations

01 Jan 1998

12,940 citations