Topic
Selection (genetic algorithm)
About: Selection (genetic algorithm) is a research topic. Over the lifetime, 72443 publications have been published within this topic receiving 1327417 citations.
Papers published on a yearly basis
Papers
More filters
••
TL;DR: It is shown that random forest has comparable performance to other classification methods, including DLDA, KNN, and SVM, and that the new gene selection procedure yields very small sets of genes (often smaller than alternative methods) while preserving predictive accuracy.
Abstract: Selection of relevant genes for sample classification is a common task in most gene expression studies, where researchers try to identify the smallest possible set of genes that can still achieve good predictive performance (for instance, for future use with diagnostic purposes in clinical practice). Many gene selection approaches use univariate (gene-by-gene) rankings of gene relevance and arbitrary thresholds to select the number of genes, can only be applied to two-class problems, and use gene selection ranking criteria unrelated to the classification algorithm. In contrast, random forest is a classification algorithm well suited for microarray data: it shows excellent performance even when most predictive variables are noise, can be used when the number of variables is much larger than the number of observations and in problems involving more than two classes, and returns measures of variable importance. Thus, it is important to understand the performance of random forest with microarray data and its possible use for gene selection. We investigate the use of random forest for classification of microarray data (including multi-class problems) and propose a new method of gene selection in classification problems based on random forest. Using simulated and nine microarray data sets we show that random forest has comparable performance to other classification methods, including DLDA, KNN, and SVM, and that the new gene selection procedure yields very small sets of genes (often smaller than alternative methods) while preserving predictive accuracy. Because of its performance and features, random forest and gene selection using random forest should probably become part of the "standard tool-box" of methods for class prediction and gene selection with microarray data.
2,610 citations
••
TL;DR: An algorithm which automates the purposeful selection of covariates within which an analyst makes a variable selection decision at each step of the modeling process and has the capability of retaining important confounding variables, resulting potentially in a slightly richer model.
Abstract: Background
The main problem in many model-building situations is to choose from a large set of covariates those that should be included in the "best" model. A decision to keep a variable in the model might be based on the clinical or statistical significance. There are several variable selection algorithms in existence. Those methods are mechanical and as such carry some limitations. Hosmer and Lemeshow describe a purposeful selection of covariates within which an analyst makes a variable selection decision at each step of the modeling process.
2,577 citations
••
TL;DR: The particle swarm optimization algorithm is analyzed using standard results from the dynamic system theory and graphical parameter selection guidelines are derived, resulting in results superior to previously published results.
2,554 citations
••
01 Jan 1991TL;DR: A number of selection schemes commonly used in modern genetic algorithms are compared on the basis of solutions to deterministic difference or differential equations, verified through computer simulations to provide convenient approximate or exact solutions and useful convergence time and growth ratio estimates.
Abstract: This paper considers a number of selection schemes commonly used in modern genetic algorithms. Specifically, proportionate reproduction, ranking selection, tournament selection, and Genitor (or “steady state”) selection are compared on the basis of solutions to deterministic difference or differential equations, which are verified through computer simulations. The analysis provides convenient approximate or exact solutions as well as useful convergence time and growth ratio estimates. The paper recommends practical application of the analyses and suggests a number of paths for more detailed analytical investigation of selection techniques.
2,531 citations
•
01 Jan 2004
2,526 citations