Topic

Selection (genetic algorithm)

About: Selection (genetic algorithm) is a research topic. Over its lifetime, 72,443 publications have been published within this topic, receiving 1,327,417 citations.


Papers
Journal Article
TL;DR: It is shown that random forest has comparable performance to other classification methods, including DLDA, KNN, and SVM, and that the new gene selection procedure yields very small sets of genes (often smaller than alternative methods) while preserving predictive accuracy.
Abstract: Selection of relevant genes for sample classification is a common task in most gene expression studies, where researchers try to identify the smallest possible set of genes that can still achieve good predictive performance (for instance, for future use with diagnostic purposes in clinical practice). Many gene selection approaches use univariate (gene-by-gene) rankings of gene relevance and arbitrary thresholds to select the number of genes, can only be applied to two-class problems, and use gene selection ranking criteria unrelated to the classification algorithm. In contrast, random forest is a classification algorithm well suited for microarray data: it shows excellent performance even when most predictive variables are noise, can be used when the number of variables is much larger than the number of observations and in problems involving more than two classes, and returns measures of variable importance. Thus, it is important to understand the performance of random forest with microarray data and its possible use for gene selection. We investigate the use of random forest for classification of microarray data (including multi-class problems) and propose a new method of gene selection in classification problems based on random forest. Using simulated and nine microarray data sets we show that random forest has comparable performance to other classification methods, including DLDA, KNN, and SVM, and that the new gene selection procedure yields very small sets of genes (often smaller than alternative methods) while preserving predictive accuracy. Because of its performance and features, random forest and gene selection using random forest should probably become part of the "standard tool-box" of methods for class prediction and gene selection with microarray data.

2,610 citations

Journal Article
TL;DR: An algorithm that automates purposeful selection of covariates, a procedure in which an analyst makes a variable selection decision at each step of the modeling process. The algorithm can retain important confounding variables, potentially resulting in a slightly richer model.
Abstract: Background: The main problem in many model-building situations is to choose from a large set of covariates those that should be included in the "best" model. A decision to keep a variable in the model might be based on the clinical or statistical significance. There are several variable selection algorithms in existence. Those methods are mechanical and as such carry some limitations. Hosmer and Lemeshow describe a purposeful selection of covariates within which an analyst makes a variable selection decision at each step of the modeling process.

2,577 citations

Journal Article
TL;DR: The particle swarm optimization algorithm is analyzed using standard results from dynamic system theory, and graphical parameter selection guidelines are derived, giving results superior to those previously published.

2,554 citations
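
The particle swarm optimization entry above gives no update rule or parameter values, so the following is only a minimal sketch of the widely used inertia-weight form of the algorithm. The function name `pso_minimize`, the test function `sphere`, and the default values `inertia=0.729` and `accel=1.494` are assumptions chosen for illustration; they are commonly quoted settings and are not taken from the paper's guidelines.

```python
import random


def pso_minimize(objective, dim, bounds, n_particles=20, n_iters=200,
                 inertia=0.729, accel=1.494, seed=0):
    """Minimize `objective` with a basic inertia-weight particle swarm.

    All parameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions inside the bounds, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its own best position; the swarm shares a global best.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: momentum + pull toward personal best + pull toward global best.
                vel[i][d] = (inertia * vel[i][d]
                             + accel * r1 * (pbest[i][d] - pos[i][d])
                             + accel * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Simple convex test function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


if __name__ == "__main__":
    best, value = pso_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
    print(best, value)
```

The inertia and acceleration coefficients are the parameters whose choice governs whether the swarm converges, oscillates, or diverges, which is why parameter selection guidelines matter for this algorithm.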

Book Chapter
01 Jan 1991
TL;DR: A number of selection schemes commonly used in modern genetic algorithms are compared on the basis of solutions to deterministic difference or differential equations, which are verified through computer simulations. The analysis provides convenient approximate or exact solutions and useful convergence time and growth ratio estimates.
Abstract: This paper considers a number of selection schemes commonly used in modern genetic algorithms. Specifically, proportionate reproduction, ranking selection, tournament selection, and Genitor (or “steady state”) selection are compared on the basis of solutions to deterministic difference or differential equations, which are verified through computer simulations. The analysis provides convenient approximate or exact solutions as well as useful convergence time and growth ratio estimates. The paper recommends practical application of the analyses and suggests a number of paths for more detailed analytical investigation of selection techniques.

2,531 citations
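
Three of the selection schemes named in the abstract above (proportionate reproduction, ranking selection, and tournament selection) can be sketched in a few lines. This is an illustrative sketch only: the function names, the linear rank weights, and the default tournament size of 2 are assumptions, not details from the paper, and Genitor ("steady state") selection is not covered.

```python
import random


def proportionate_select(population, fitness, rng):
    """Pick one individual with probability proportional to its fitness.

    Assumes all fitness values are non-negative and at least one is positive.
    """
    return rng.choices(population, weights=fitness, k=1)[0]


def ranking_select(population, fitness, rng):
    """Pick one individual with probability proportional to its fitness rank.

    Linear rank weights (worst = 1, best = n) are an assumption made here;
    other rank-to-probability mappings are possible.
    """
    order = sorted(range(len(population)), key=lambda i: fitness[i])
    weights = [0] * len(population)
    for rank, i in enumerate(order, start=1):
        weights[i] = rank
    return rng.choices(population, weights=weights, k=1)[0]


def tournament_select(population, fitness, rng, size=2):
    """Draw `size` distinct individuals at random and return the fittest."""
    contenders = rng.sample(range(len(population)), size)
    winner = max(contenders, key=lambda i: fitness[i])
    return population[winner]


if __name__ == "__main__":
    rng = random.Random(0)
    population = ["a", "b", "c", "d"]
    fitness = [1.0, 2.0, 3.0, 4.0]
    print(proportionate_select(population, fitness, rng))
    print(ranking_select(population, fitness, rng))
    print(tournament_select(population, fitness, rng, size=2))
```

Ranking and tournament selection depend only on the ordering of fitness values, whereas proportionate selection depends on their magnitudes, so rescaling the fitness function changes the selection pressure only in the proportionate case.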


Network Information
Related Topics (5)
Inference: 36.8K papers, 1.3M citations, 69% related
Sampling (statistics): 65.3K papers, 1.2M citations, 67% related
Locus (genetics): 42.7K papers, 2M citations, 67% related
Cluster analysis: 146.5K papers, 2.9M citations, 67% related
Markov chain: 51.9K papers, 1.3M citations, 66% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2025    1
2024    16
2023    6,495
2022    13,752
2021    3,391
2020    3,543