
Showing papers on "Selection (genetic algorithm) published in 2006"


Journal ArticleDOI
TL;DR: It is shown that neighborhood selection with the Lasso is a computationally attractive alternative to standard covariance selection for sparse high-dimensional graphs; because it estimates the conditional independence restrictions separately for each node, it is equivalent to variable selection for Gaussian linear models.
Abstract: The pattern of zero entries in the inverse covariance matrix of a multivariate normal distribution corresponds to conditional independence restrictions between variables. Covariance selection aims at estimating those structural zeros from data. We show that neighborhood selection with the Lasso is a computationally attractive alternative to standard covariance selection for sparse high-dimensional graphs. Neighborhood selection estimates the conditional independence restrictions separately for each node in the graph and is hence equivalent to variable selection for Gaussian linear models. We show that the proposed neighborhood selection scheme is consistent for sparse high-dimensional graphs. Consistency hinges on the choice of the penalty parameter. The oracle value for optimal prediction does not lead to a consistent neighborhood estimate. Controlling instead the probability of falsely joining some distinct connectivity components of the graph, consistent estimation for sparse graphs is achieved (with exponential rates), even when the number of variables grows as the number of observations raised to an arbitrary power.
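The node-wise regression at the heart of the method can be sketched in a few lines. This is a toy illustration (three variables, an arbitrary penalty level) rather than the paper's tuned procedure, using scikit-learn's Lasso:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n = 500

# Toy Gaussian graphical model: X3 depends on X1; X2 is independent.
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + 0.3 * rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

def neighborhood(X, node, alpha=0.1):
    """Regress one node on all others with the Lasso; nonzero
    coefficients estimate that node's neighbors in the graph."""
    others = [j for j in range(X.shape[1]) if j != node]
    fit = Lasso(alpha=alpha).fit(X[:, others], X[:, node])
    return {j: c for j, c in zip(others, fit.coef_)}

nb3 = neighborhood(X, node=2)
# The edge between nodes 0 and 2 should be detected; no edge from node 1.
```

Repeating this for every node and combining the estimated neighborhoods (by union or intersection) yields the graph estimate; the consistency result in the abstract concerns how `alpha` must be chosen as dimension grows.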

3,793 citations


Journal ArticleDOI
TL;DR: It is shown that random forest has comparable performance to other classification methods, including DLDA, KNN, and SVM, and that the new gene selection procedure yields very small sets of genes (often smaller than alternative methods) while preserving predictive accuracy.
Abstract: Selection of relevant genes for sample classification is a common task in most gene expression studies, where researchers try to identify the smallest possible set of genes that can still achieve good predictive performance (for instance, for future use with diagnostic purposes in clinical practice). Many gene selection approaches use univariate (gene-by-gene) rankings of gene relevance and arbitrary thresholds to select the number of genes, can only be applied to two-class problems, and use gene selection ranking criteria unrelated to the classification algorithm. In contrast, random forest is a classification algorithm well suited for microarray data: it shows excellent performance even when most predictive variables are noise, can be used when the number of variables is much larger than the number of observations and in problems involving more than two classes, and returns measures of variable importance. Thus, it is important to understand the performance of random forest with microarray data and its possible use for gene selection. We investigate the use of random forest for classification of microarray data (including multi-class problems) and propose a new method of gene selection in classification problems based on random forest. Using simulated and nine microarray data sets we show that random forest has comparable performance to other classification methods, including DLDA, KNN, and SVM, and that the new gene selection procedure yields very small sets of genes (often smaller than alternative methods) while preserving predictive accuracy. Because of its performance and features, random forest and gene selection using random forest should probably become part of the "standard tool-box" of methods for class prediction and gene selection with microarray data.
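The core ranking step of importance-based gene selection can be sketched as follows. The simulated "expression matrix" and the choice to keep the top two genes are illustrative stand-ins for the paper's iterative backward-elimination procedure:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
n, p = 300, 20  # many genes, few informative (toy stand-in for microarray data)
X = rng.normal(size=(n, p))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # only genes 0 and 1 carry signal

rf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X, y)
imp = rf.feature_importances_

# Backward-elimination flavour: keep a small prefix of the importance
# ranking; here we simply take the two top-ranked genes.
ranking = np.argsort(imp)[::-1]
selected = set(ranking[:2])
```

The paper's actual method repeatedly refits the forest after dropping a fraction of the least-important genes, which guards against importances computed on a single fit.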

2,610 citations


Journal Article
TL;DR: The survival of the rapidly renewing tissues of long-lived animals like man requires that they be protected against the natural selection of fitter variant cells (that is, the spontaneous appearance of... as discussed by the authors.
Abstract: Survival of the rapidly renewing tissues of long-lived animals like man requires that they be protected against the natural selection of fitter variant cells (that is, the spontaneous appearance of...

1,440 citations


Journal ArticleDOI
TL;DR: This work identifies where new techniques can help estimate the relative roles of the various selection mechanisms that might work together in the evolution of mating preferences and attractive traits, and in sperm-egg interactions.
Abstract: The past two decades have seen extensive growth of sexual selection research. Theoretical and empirical work has clarified many components of pre- and postcopulatory sexual selection, such as aggressive competition, mate choice, sperm utilization and sexual conflict. Genetic mechanisms of mate choice evolution have been less amenable to empirical testing, but molecular genetic analyses can now be used for incisive experimentation. Here, we highlight some of the currently debated areas in pre- and postcopulatory sexual selection. We identify where new techniques can help estimate the relative roles of the various selection mechanisms that might work together in the evolution of mating preferences and attractive traits, and in sperm-egg interactions.

1,129 citations


Journal ArticleDOI
16 Jun 2006-Science
TL;DR: The authors review approaches to detect positive natural selection in humans, describe results from recent analyses of genome-wide data, and discuss the prospects and challenges ahead as we expand our understanding of the role of natural selection in shaping the human genome.
Abstract: Positive natural selection is the force that drives the increase in prevalence of advantageous traits, and it has played a central role in our development as a species. Until recently, the study of natural selection in humans has largely been restricted to comparing individual candidate genes to theoretical expectations. The advent of genome-wide sequence and polymorphism data brings fundamental new tools to the study of natural selection. It is now possible to identify new candidates for selection and to reevaluate previous claims by comparison with empirical distributions of DNA sequence variation across the human genome and among populations. The flood of data and analytical methods, however, raises many new challenges. Here, we review approaches to detect positive natural selection, describe results from recent analyses of genome-wide data, and discuss the prospects and challenges ahead as we expand our understanding of the role of natural selection in shaping the human genome.

1,088 citations


Zhao, Yi, Adve, Raviraj, Lim, Teng 
01 Jan 2006
TL;DR: It is shown that at reasonable power levels the selection AF scheme maintains full diversity order, and has significantly better outage behavior and average throughput than the conventional scheme or that with optimal power allocation.

1,057 citations


Journal ArticleDOI
TL;DR: In this model, higher-level selection emerges as a byproduct of individual reproduction and population structure and can be extended to more than two levels of selection and to include migration.
Abstract: We propose a minimalist stochastic model of multilevel (or group) selection. A population is subdivided into groups. Individuals interact with other members of the group in an evolutionary game that determines their fitness. Individuals reproduce, and offspring are added to the same group. If a group reaches a certain size, it can split into two. Faster reproducing individuals lead to larger groups that split more often. In our model, higher-level selection emerges as a byproduct of individual reproduction and population structure. We derive a fundamental condition for the evolution of cooperation by group selection: if b/c > 1 + n/m, then group selection favors cooperation. The parameters b and c denote the benefit and cost of the altruistic act, whereas n and m denote the maximum group size and the number of groups. The model can be extended to more than two levels of selection and to include migration.
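The paper's condition for cooperation translates directly into code (parameter values below are invented for illustration):

```python
def group_selection_favors_cooperation(b, c, n, m):
    """Condition from the model: cooperation is favored if b/c > 1 + n/m,
    where b and c are the benefit and cost of the altruistic act,
    n the maximum group size, and m the number of groups."""
    return b / c > 1 + n / m

# Benefit 5, cost 1, groups of up to 10 individuals, 5 groups: 5 > 3 holds.
many_groups = group_selection_favors_cooperation(b=5, c=1, n=10, m=5)
# Same payoffs but only 2 groups: 5 > 6 fails, so defection wins.
few_groups = group_selection_favors_cooperation(b=5, c=1, n=10, m=2)
```

The comparison makes the abstract's point concrete: many small groups favor cooperation, few large groups do not.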

838 citations


Journal ArticleDOI
TL;DR: Because genomic estimated breeding values can be calculated at birth with high accuracy, a strategy that utilizes these advantages was compared with a traditional progeny-testing strategy under a typical Canadian-like dairy cattle situation; the results suggest that genome-wide selection may become a popular tool for genetic improvement in livestock.
Abstract: Animals can be genotyped for thousands of single nucleotide polymorphisms (SNPs) at one time, where the SNPs are located at roughly 1-cM intervals throughout the genome. For each contiguous pair of SNPs there are four possible haplotypes that could be inherited from the sire. The effects of each interval on a trait can be estimated for all intervals simultaneously in a model where interval effects are random factors. Given the estimated effects of each haplotype for every interval in the genome, and given an animal's genotype, a 'genomic' estimated breeding value is obtained by summing the estimated effects for that genotype. The accuracy of that estimator of breeding values is around 80%. Because the genomic estimated breeding values can be calculated at birth, and because it has a high accuracy, a strategy that utilizes these advantages was compared with a traditional progeny testing strategy under a typical Canadian-like dairy cattle situation. Costs of proving bulls were reduced by 92% and genetic change was increased by a factor of 2. Genome-wide selection may become a popular tool for genetic improvement in livestock.
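The summation that produces a genomic estimated breeding value can be sketched like this; the interval effects and haplotype coding below are invented toy values, not estimates from real SNP data:

```python
import numpy as np

# Estimated effects for the four possible sire haplotypes of each SNP
# interval (rows: intervals, columns: haplotypes 00, 01, 10, 11) -- toy values.
interval_effects = np.array([
    [0.0,  0.2, -0.1, 0.3],
    [0.1,  0.0,  0.4, -0.2],
    [-0.3, 0.1,  0.0, 0.2],
])

# The animal's inherited haplotype at each interval, as a column index.
genotype = [3, 2, 1]

# Genomic estimated breeding value: sum the estimated effect of the
# inherited haplotype over all intervals.
gebv = sum(interval_effects[i, h] for i, h in enumerate(genotype))
# 0.3 + 0.4 + 0.1 = 0.8
```

In the paper the interval effects come from a random-effects model fitted to genotyped, phenotyped animals; the summation itself is this simple, which is why the GEBV is available at birth.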

785 citations


Journal ArticleDOI
TL;DR: A simulation approach was used to clarify the application of random effects under three common situations for telemetry studies; it was found that random intercepts accounted for unbalanced sample designs, and models with random intercepts and coefficients improved model fit given the variation in selection among individuals and functional responses in selection.
Abstract: 1. Resource selection estimated by logistic regression is used increasingly in studies to identify critical resources for animal populations and to predict species occurrence. 2. Most frequently, individual animals are monitored and pooled to estimate population-level effects without regard to group or individual-level variation. Pooling assumes that both observations and their errors are independent, and resource selection is constant given individual variation in resource availability. 3. Although researchers have identified ways to minimize autocorrelation, variation between individuals caused by differences in selection or available resources, including functional responses in resource selection, have not been well addressed. 4. Here we review random-effects models and their application to resource selection modelling to overcome these common limitations. We present a simple case study of an analysis of resource selection by grizzly bears in the foothills of the Canadian Rocky Mountains with and without random effects. 5. Both categorical and continuous variables in the grizzly bear model differed in interpretation, both in statistical significance and coefficient sign, depending on how a random effect was included. We used a simulation approach to clarify the application of random effects under three common situations for telemetry studies: (a) discrepancies in sample sizes among individuals; (b) differences among individuals in selection where availability is constant; and (c) differences in availability with and without a functional response in resource selection. 6. We found that random intercepts accounted for unbalanced sample designs, and models with random intercepts and coefficients improved model fit given the variation in selection among individuals and functional responses in selection. 
Our empirical example and simulations demonstrate how including random effects in resource selection models can aid interpretation and address difficult assumptions limiting their generality. This approach will allow researchers to appropriately estimate marginal (population) and conditional (individual) responses, and account for complex grouping, unbalanced sample designs and autocorrelation.
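A minimal simulation, in the spirit of the review's situation (a), of how pooling over individuals with unbalanced samples pulls a population-level estimate toward the heavily sampled animal (toy data; the paper's empirical example uses grizzly bear telemetry):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)

def simulate_animal(slope, n):
    """Used/available points for one animal whose selection strength is `slope`."""
    x = rng.normal(size=n)                 # resource covariate
    p = 1 / (1 + np.exp(-slope * x))
    y = rng.binomial(1, p)                 # 1 = used, 0 = available
    return x, y

# Two animals differing in selection strength; very unbalanced sample sizes.
x1, y1 = simulate_animal(slope=1.0, n=900)
x2, y2 = simulate_animal(slope=3.0, n=100)

def fit_slope(x, y):
    # Large C approximates unpenalized logistic regression.
    m = LogisticRegression(C=1e6, max_iter=1000).fit(x.reshape(-1, 1), y)
    return m.coef_[0, 0]

pooled = fit_slope(np.concatenate([x1, x2]), np.concatenate([y1, y2]))
per_animal_mean = (fit_slope(x1, y1) + fit_slope(x2, y2)) / 2
# Pooling is dominated by the heavily sampled animal, understating the
# average individual response -- the bias a random intercept/coefficient
# structure is meant to absorb.
```
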

718 citations


Journal ArticleDOI
TL;DR: It is illustrated that resource selection models are part of a broader collection of statistical models called weighted distributions, and some promising areas for future development are recommended.
Abstract: We review 87 articles published in the Journal of Wildlife Management from 2000 to 2004 to assess the current state of practice in the design and analysis of resource selection studies. Articles were classified into 4 study designs. In design 1, data are collected at the population level because individual animals are not identified. Individual animal selection may be assessed in designs 2 and 3. In design 2, use by each animal is recorded, but availability (or nonuse) is measured only at the population level. Use and availability (or nonuse) are measured for each animal in design 3. In design 4, resource use is measured multiple times for each animal, and availability (or nonuse) is measured for each use location. Thus, use and availability measures are paired for each use in design 4. The 4 study designs were used about equally in the articles reviewed. The most commonly used statistical analyses were logistic regression (40%) and compositional analysis (25%). We illustrate 4 problem areas in resource selection analyses: pooling of relocation data across animals with differing numbers of relocations, analyzing paired data as though they were independent, tests that do not control experiment-wise error rates, and modeling observations as if they were independent when temporal or spatial correlation occurs in the data. Statistical models that allow for variation in individual animal selection rather than pooling are recommended to improve error estimation in population-level selection. Some researchers did not select appropriate statistical analyses for paired data, or their analyses were not well described. Researchers using one-resource-at-a-time procedures often did not control the experiment-wise error rate, so simultaneous inference procedures and multivariate assessments of selection are suggested. 
The time interval between animal relocations was often relatively short, but existing analyses for temporally or spatially correlated data were not used. For studies that used logistic regression, we identified the data type employed: single sample, case control (used-unused), use-availability, or paired use-availability. It was not always clear whether studies intended to compare use to nonuse or use to availability. Despite the popularity of compositional analysis, we do not recommend it for multiple relocation data when use of one or more resources is low. We illustrate that resource selection models are part of a broader collection of statistical models called weighted distributions and recommend some promising areas for future development.

649 citations


Journal Article
TL;DR: A competitive model underlying memory formation is suggested, in which eligible neurons are selected to participate in a memory trace as a function of their relative CREB activity at the time of learning.

Posted Content
T. D. Stanley1
TL;DR: This study investigates the small-sample performance of meta-regression methods for detecting and estimating genuine empirical effects in research literatures tainted by publication selection, and finds these methods to be robust against publication selection.
Abstract: This study investigates the small-sample performance of meta-regression methods for detecting and estimating genuine empirical effects in research literatures tainted by publication selection. Publication selection exists when editors, reviewers or researchers have a preference for statistically significant results. Meta-regression methods are found to be robust against publication selection. Even if a literature is dominated by large and unknown misspecification biases, precision-effect testing and joint precision-effect/meta-significance testing can provide viable strategies for detecting genuine empirical effects. Publication biases are greatly reduced by combining two biased estimates, the estimated meta-regression coefficient on precision (1/Se) and the unadjusted average effect.
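The mechanics of precision-effect testing can be sketched with simulated studies. No publication selection is imposed here; the point is only that the meta-regression coefficient on precision recovers the genuine effect (study counts and standard errors are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
true_effect = 0.5
k = 200                                       # number of studies

se = rng.uniform(0.05, 0.5, size=k)           # reported standard errors
est = true_effect + se * rng.normal(size=k)   # study effect estimates

# Precision-effect test (PET): regress t-values on precision (1/Se);
# the slope on precision is the corrected estimate of the genuine effect.
t = est / se
precision = 1 / se
slope, intercept = np.polyfit(precision, t, 1)
```

Under publication selection the intercept absorbs the selection-driven component, which is why combining the precision coefficient with the unadjusted average reduces bias, as the abstract notes.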

Journal ArticleDOI
TL;DR: The traditional way of identifying targets of adaptive evolution has been to study a few loci that one hypothesizes a priori to have been under selection, but in principle, multilocus analyses can facilitate robust inferences of selection at individual loci.

Journal ArticleDOI
TL;DR: Tests for balancing selection in the current generation, the recent past, and the distant past provide a comprehensive approach for evaluating selective impacts and provide new ways to evaluate the long-term impact of selection on particular genes and the overall genome in natural populations.
Abstract: The selective mechanisms for maintaining polymorphism in natural populations have been the subject of theory, experiments, and review over the past half century. Advances in molecular genetic techniques have provided new insight into many examples of balancing selection. In addition, new theoretical developments demonstrate how diversifying selection over environments may maintain polymorphism. Tests for balancing selection in the current generation, the recent past, and the distant past provide a comprehensive approach for evaluating selective impacts. In particular, sequence-based tests provide new ways to evaluate the long-term impact of selection on particular genes and the overall genome in natural populations. Overall, there appear to be many loci exhibiting the signal of adaptive directional selection from genomic scans, but the present evidence suggests that the proportion of loci where polymorphism is maintained by environmental heterogeneity is low. However, as more molecular genetic details become available, more examples of polymorphism maintained by selection in heterogeneous environments may be found.

Journal ArticleDOI
01 Jun 2006-Genetics
TL;DR: It is argued that the relaxation of natural selection due to modern medicine and reduced variance in family size is not likely to lead to a rapid decline in genetic quality, but that it will be very difficult to locate most of the genes involved in complex genetic diseases.
Abstract: The distribution of fitness effects of new mutations is a fundamental parameter in genetics. Here we present a new method by which the distribution can be estimated. The method is fairly robust to changes in population size and admixture, and it can be corrected for any residual effects if a model of the demography is available. We apply the method to extensively sampled single-nucleotide polymorphism data from humans and estimate the distribution of fitness effects for amino acid changing mutations. We show that a gamma distribution with a shape parameter of 0.23 provides a good fit to the data and we estimate that >50% of mutations are likely to have mild effects, such that they reduce fitness by between one one-thousandth and one-tenth. We also infer that <15% of new mutations are likely to have strongly deleterious effects. We estimate that on average a nonsynonymous mutation reduces fitness by a few percent and that the average strength of selection acting against a nonsynonymous polymorphism is ~9 × 10-5. We argue that the relaxation of natural selection due to modern medicine and reduced variance in family size is not likely to lead to a rapid decline in genetic quality, but that it will be very difficult to locate most of the genes involved in complex genetic diseases.
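A quick Monte Carlo check of the ">50% mild effects" claim. The shape parameter is the paper's; the mean selection coefficient of 0.03 used to set the scale is an assumption based on the stated "few percent" average:

```python
import numpy as np

rng = np.random.default_rng(4)

shape = 0.23                 # gamma shape fitted in the paper
mean_s = 0.03                # assumed mean fitness reduction ("a few percent")
scale = mean_s / shape

# Sampled selection coefficients for new amino acid changing mutations.
s = rng.gamma(shape, scale, size=1_000_000)

# Fraction of mutations with mild effects: fitness reduced by between
# one one-thousandth and one-tenth.
mild = np.mean((s > 1e-3) & (s < 1e-1))
```

With these assumed values the mild fraction comes out a little over one half, consistent with the abstract's ">50%" figure.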

Journal ArticleDOI
01 Jan 2006
TL;DR: This paper introduces a new method for learning algorithm evaluation and selection, with empirical results based on classification; the empirical results are used to generate rules, via the rule-based learning algorithm C5.0, that describe which types of algorithms are suited to solving which types of classification problems.
Abstract: This paper introduces a new method for learning algorithm evaluation and selection, with empirical results based on classification. The empirical study has been conducted among 8 algorithms/classifiers with 100 different classification problems. We evaluate the algorithms' performance in terms of a variety of accuracy and complexity measures. Consistent with the No Free Lunch theorem, we do not expect to identify the single algorithm that performs best on all datasets. Rather, we aim to determine the characteristics of datasets that lend themselves to superior modelling by certain learning algorithms. Our empirical results are used to generate rules, using the rule-based learning algorithm C5.0, to describe which types of algorithms are suited to solving which types of classification problems. Most of the rules are generated with a high confidence rating.
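A sketch of the meta-learning idea: C5.0 is commercial, so a scikit-learn decision tree stands in as the rule learner here, and the meta-dataset below is fabricated for illustration:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy meta-dataset: one row per benchmark dataset.
# Meta-features: [n_samples, n_features]; label: best-performing algorithm.
meta_X = np.array([
    [100, 5], [150, 8], [200, 10],       # small datasets -> "knn"
    [5000, 5], [8000, 8], [10000, 10],   # large, low-dimensional -> "tree"
    [5000, 500], [8000, 800],            # large, high-dimensional -> "svm"
])
meta_y = ["knn", "knn", "knn", "tree", "tree", "tree", "svm", "svm"]

# A decision tree stands in for C5.0 to induce readable selection rules.
rule_learner = DecisionTreeClassifier(random_state=0).fit(meta_X, meta_y)
rules = export_text(rule_learner, feature_names=["n_samples", "n_features"])

# Recommend an algorithm for a new, small dataset.
pred = rule_learner.predict([[120, 6]])[0]
```

Printing `rules` shows human-readable if-then splits over the meta-features, which is the form of output the paper reports with high confidence ratings.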

Journal ArticleDOI
TL;DR: It is concluded that selection on synonymous codon use in E. coli is largely due to selection for translational accuracy, to reduce the costs of both missense and nonsense errors.
Abstract: In many organisms, selection acts on synonymous codons to improve translation. However, the precise basis of this selection remains unclear in the majority of species. Selection could be acting to maximize the speed of elongation, to minimize the costs of proofreading, or to maximize the accuracy of translation. Using several data sets, we find evidence that codon use in Escherichia coli is biased to reduce the costs of both missense and nonsense translational errors. Highly conserved sites and genes have higher codon bias than less conserved ones, and codon bias is positively correlated to gene length and production costs, both indicating selection against missense errors. Additionally, codon bias increases along the length of genes, indicating selection against nonsense errors. Doublet mutations or replacement substitutions do not explain our observations. The correlations remain when we control for expression level and for conflicting selection pressures at the start and end of genes. Considering each amino acid by itself confirms our results. We conclude that selection on synonymous codon use in E. coli is largely due to selection for translational accuracy, to reduce the costs of both missense and nonsense errors.

Journal ArticleDOI
TL;DR: In this paper, a number of experiments have been conducted on various metric and non-metric dissimilarity representations and prototype selection methods; it is found that systematic approaches lead to better results than random selection.

Journal ArticleDOI
TL;DR: In this article, a new evidence algorithm known as nested sampling is implemented, which combines accuracy, generality of application, and computational feasibility, and is applied to some cosmological data sets and models.
Abstract: The abundance of cosmological data becoming available means that a wider range of cosmological models are testable than ever before. However, an important distinction must be made between parameter fitting and model selection. While parameter fitting simply determines how well a model fits the data, model selection statistics, such as the Bayesian evidence, are now necessary to choose between these different models, and in particular to assess the need for new parameters. We implement a new evidence algorithm known as nested sampling, which combines accuracy, generality of application, and computational feasibility, and we apply it to some cosmological data sets and models. We find that a five-parameter model with a Harrison-Zel'dovich initial spectrum is currently preferred.
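A minimal one-dimensional nested sampling loop conveys the idea. Rejection sampling from the prior is only feasible in toy problems like this one (Gaussian likelihood, uniform prior, evidence known analytically for comparison):

```python
import math
import random

random.seed(0)

def loglike(x):
    # Gaussian likelihood peaked at 0.5 (sigma = 0.1); uniform prior on [0, 1].
    return -((x - 0.5) ** 2) / (2 * 0.1 ** 2)

N = 100                                    # number of live points
live = [random.random() for _ in range(N)]
logL = [loglike(x) for x in live]

Z, X_prev = 0.0, 1.0
for i in range(1, 801):
    worst = min(range(N), key=lambda j: logL[j])
    Lmin = logL[worst]
    X = math.exp(-i / N)                   # expected prior-mass shrinkage
    Z += math.exp(Lmin) * (X_prev - X)     # accumulate evidence
    X_prev = X
    # Replace the worst point: rejection-sample the prior above Lmin.
    while True:
        x = random.random()
        if loglike(x) > Lmin:
            live[worst], logL[worst] = x, loglike(x)
            break

Z += X_prev * sum(math.exp(l) for l in logL) / N   # remaining live points

true_Z = math.sqrt(2 * math.pi) * 0.1              # analytic integral (approx.)
```

Real implementations replace the rejection step with constrained MCMC or ellipsoidal sampling, which is what makes the method practical in many-parameter cosmological models.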

Journal ArticleDOI
David Posada1
TL;DR: The ModelTest server is a web-based application for the selection of models of nucleotide substitution using the program ModelTest, which takes as input a text file with likelihood scores for the set of candidate models.
Abstract: ModelTest server is a web-based application for the selection of models of nucleotide substitution using the program ModelTest. The server takes as input a text file with likelihood scores for the set of candidate models. Models can be selected with hierarchical likelihood ratio tests, or with the Akaike or Bayesian information criteria. The output includes several statistics for the assessment of model selection uncertainty, for model averaging or to estimate the relative importance of model parameters. The server can be accessed at http://darwin.uvigo.es/software/modeltest_server.html.
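Given likelihood scores like those in the server's input file, the information-criterion step reduces to a few lines. The scores and parameter counts below are hypothetical, not real ModelTest output:

```python
import math

# Hypothetical log-likelihood scores and free-parameter counts for
# candidate substitution models (values are illustrative only).
candidates = {
    "JC69":  {"lnL": -2510.4, "k": 1},
    "HKY85": {"lnL": -2440.9, "k": 5},
    "GTR":   {"lnL": -2438.2, "k": 9},
}
n_sites = 1000  # alignment length, used in BIC's sample-size term

def aic(lnL, k):
    return -2 * lnL + 2 * k

def bic(lnL, k, n):
    return -2 * lnL + k * math.log(n)

best_aic = min(candidates, key=lambda m: aic(**candidates[m]))
best_bic = min(candidates, key=lambda m: bic(candidates[m]["lnL"],
                                             candidates[m]["k"], n_sites))
```

Here GTR fits best in raw likelihood, but its extra parameters do not pay for themselves, so both criteria prefer HKY85; differences in these scores also feed the model-averaging statistics the server reports.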

Journal ArticleDOI
TL;DR: In this paper, the authors review processes that increase phenotypic variation in response to disruptive selection and discuss some of the possible outcomes, such as sympatric species pairs, sexual dimorphisms, phenotypic plasticity and altered community assemblages.
Abstract: Disruptive selection occurs when extreme phenotypes have a fitness advantage over more intermediate phenotypes. The phenomenon is particularly interesting when selection keeps a population in a disruptive regime. This can lead to increased phenotypic variation while disruptive selection itself is diminished or eliminated. Here, we review processes that increase phenotypic variation in response to disruptive selection and discuss some of the possible outcomes, such as sympatric species pairs, sexual dimorphisms, phenotypic plasticity and altered community assemblages. We also identify factors influencing the likelihoods of these different outcomes.

Journal ArticleDOI
TL;DR: The results suggest that the initial step in adaptive evolution—the production of novel beneficial mutants from which selection sorts—is very general, being characterized by an approximately exponential distribution with many mutations of small effect and few of large effect.
Abstract: The extent to which a population diverges from its ancestor through adaptive evolution depends on variation supplied by novel beneficial mutations. Extending earlier work [1,2], recent theory makes two predictions that seem to be robust to biological details: the distribution of fitness effects among beneficial mutations before selection should be (i) exponential and (ii) invariant, meaning it is always exponential regardless of the fitness rank of the wild-type allele [3,4]. Here we test these predictions by assaying the fitness of 665 independently derived single-step mutations in the bacterium Pseudomonas fluorescens across a range of environments. We show that the distribution of fitness effects among beneficial mutations is indistinguishable from an exponential despite marked variation in the fitness rank of the wild type across environments. These results suggest that the initial step in adaptive evolution—the production of novel beneficial mutants from which selection sorts—is very general, being characterized by an approximately exponential distribution with many mutations of small effect and few of large effect. We also document substantial variation in the pleiotropic costs of antibiotic resistance, a result that may have implications for strategies aimed at eliminating resistant pathogens in animal and human populations.
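The exponential-distribution claim can be checked on simulated data with a one-sample Kolmogorov-Smirnov statistic. The rate below is arbitrary; this illustrates the style of test, not the paper's measurements:

```python
import numpy as np

rng = np.random.default_rng(5)

# Simulated beneficial-effect sizes drawn from an exponential distribution
# (rate chosen arbitrarily for illustration).
rate = 30.0
effects = rng.exponential(1 / rate, size=5000)

# One-sample Kolmogorov-Smirnov statistic against the exponential CDF
# fitted by its mean -- a small D is consistent with exponentiality.
lam = 1 / effects.mean()
x = np.sort(effects)
cdf = 1 - np.exp(-lam * x)
n = len(x)
upper = np.arange(1, n + 1) / n          # ECDF just after each point
lower = np.arange(0, n) / n              # ECDF just before each point
D = max(np.max(upper - cdf), np.max(cdf - lower))
```

An exponential sample gives D well below the usual rejection thresholds; running the same statistic on, say, uniform draws would yield a much larger D.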


Journal ArticleDOI
01 Jun 2006-Nature
TL;DR: This study manipulated the frequencies of males with different colour patterns in three natural populations to estimate survival rates, and found that rare phenotypes had a highly significant survival advantage compared to common phenotypes.
Abstract: One of the trickiest problems in evolutionary biology is to explain how natural populations maintain an element of genetic diversity. Of all the proposed mechanisms, theory shows that frequency-dependent selection can be the most potent, yet there is only indirect evidence for its importance in natural populations. An experimental manipulation in natural populations of guppies now shows that there is a significant survival advantage for rare genotypes (exotic colouring in males) in natural populations of guppies. This is perhaps the best experimental evidence yet that frequency-dependent selection can be a potent mechanism maintaining genetic variation in natural populations. Results from an experimental manipulation showed a significant survival advantage for rare genotypes in natural populations of guppies, confirming that frequency-dependent selection can act as a potent mechanism in maintaining genetic variation in natural populations. The maintenance of genetic variation in traits under natural selection is a long-standing paradox in evolutionary biology [1,2,3]. Of the processes capable of maintaining variation, negative frequency-dependent selection (where rare types are favoured by selection) is the most powerful, at least in theory [1]; however, few experimental studies have confirmed that this process operates in nature. One of the most extreme, unexplained genetic polymorphisms is seen in the colour patterns of male guppies (Poecilia reticulata) [4,5]. Here we manipulated the frequencies of males with different colour patterns in three natural populations to estimate survival rates, and found that rare phenotypes had a highly significant survival advantage compared to common phenotypes. Evidence from humans [6,7] and other species [8,9] implicates frequency-dependent survival in the maintenance of molecular, morphological and health-related polymorphisms. 
As a controlled manipulation in nature, this study provides unequivocal support for frequency-dependent survival—an evolutionary process capable of maintaining extreme polymorphism.

Journal ArticleDOI
TL;DR: A genetic algorithm is proposed that uses a variable population size and periodic partial reinitialization of the population in the form of a saw-tooth function, aiming to enhance the overall performance of the algorithm through the dynamics of GA evolution and the synergy of the combined effects of population size variation and reinitialization.
Abstract: A genetic algorithm (GA) is proposed that uses a variable population size and periodic partial reinitialization of the population in the form of a saw-tooth function. The aim is to enhance the overall performance of the algorithm relying on the dynamics of evolution of the GA and the synergy of the combined effects of population size variation and reinitialization. Preliminary parametric studies to test the validity of these assertions are performed for two categories of problems, a multimodal function and a unimodal function with different features. The proposed scheme is compared with the conventional GA and micro GA (μGA) of equal computing cost and guidelines for the selection of effective values of the involved parameters are given, which facilitate the implementation of the algorithm. The proposed algorithm is tested for a variety of benchmark problems and a problem generator from which it becomes evident that the saw-tooth scheme enhances the overall performance of GAs.
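A saw-tooth population schedule can be sketched as a function of the generation index. The mean size, amplitude, and period below are illustrative, not the paper's tuned values:

```python
def sawtooth_pop_size(gen, n_mean=50, amplitude=20, period=10):
    """Saw-tooth schedule: the population shrinks linearly within each
    period, then jumps back to its maximum at the start of the next one
    (partial reinitialization would refill the added slots with random
    individuals)."""
    phase = gen % period
    return n_mean + amplitude - round(2 * amplitude * phase / (period - 1))

sizes = [sawtooth_pop_size(g) for g in range(20)]
# Within one period the size falls from n_mean + amplitude down to
# n_mean - amplitude, and the mean over a full period equals n_mean,
# matching the computing cost of a constant-size GA.
```
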

Journal ArticleDOI
TL;DR: This review assessed the efficacy and safety of nebulisers, pressurised metered-dose inhalers and dry powder inhalers as delivery systems for beta-agonists, anticholinergic agents and corticosteroids and concluded that the delivery devices can be equally effective.
Abstract: CRD summary This review assessed the efficacy and safety of nebulisers, pressurised metered-dose inhalers and dry powder inhalers as delivery systems for beta-agonists, anticholinergic agents and corticosteroids. The authors concluded that the delivery devices can be equally effective. Given the risk of studies being underpowered and the limited consideration of quality, the robustness of the authors' conclusions is somewhat unclear.

Journal ArticleDOI
TL;DR: A model to describe the association between markers and genes as conditional probabilities in synthetic populations under recurrent selection is proposed, which can be computed on the basis of assumptions related to the history of the population.
Abstract: Association analysis is a method potentially useful for detection of marker-trait associations based on linkage disequilibrium, but little information is available on the application of this technique to plant breeding populations. With appropriate statistical methods, valid association analysis can be done in plant breeding populations; however, the most significant marker may not be closest to the functional gene. Bias can arise from (i) covariance among markers and QTL, frequently related to population structure or intense selection and (ii) differences in initial frequencies of marker alleles in the population, such that exclusive alleles tend to be in higher association. The potentials and limitations of germplasm bank collections, synthetic populations, and elite germplasm are compared, as experimental materials for association analysis integrated with plant breeding practice. Synthetics offer a favorable balance of power and precision for association analysis and would allow mapping of quantitative traits with increasing resolution through cycles of intermating. A model to describe the association between markers and genes as conditional probabilities in synthetic populations under recurrent selection is proposed, which can be computed on the basis of assumptions related to the history of the population. This model is useful for predicting the potential of different populations for association analysis and forecasting the response to marker-assisted selection.

Journal ArticleDOI
TL;DR: A simulation study is conducted to determine the effect various factors have on the MPML estimation method, and a multi-stage procedure based on the MPML method is recommended for practical applications.
Abstract: In this article we study the approximately unbiased multi-level pseudo maximum likelihood (MPML) estimation method for general multi-level modeling with sampling weights. We conduct a simulation study to determine the effect various factors have on the estimation method. The factors we included in this study are scaling method, size of clusters, invariance of selection, informativeness of selection, intraclass correlation, and variability of standardized weights. The scaling method is an indicator of how the weights are normalized on each level. The invariance of the selection is an indicator of whether or not the same selection mechanism is applied across clusters. The informativeness of the selection is an indicator of how biased the selection is. We summarize our findings and recommend a multi-stage procedure based on the MPML method that can be used in practical applications.
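For concreteness, the "scaling method" factor concerns how raw within-cluster weights are normalized before estimation. Below is a small sketch of the two scalings commonly discussed in the multilevel-weighting literature (the function and its API are illustrative, not the authors' software): rescaling weights to sum to the cluster's sample size, or to its effective sample size.

```python
def scale_weights(weights, method="size"):
    """Rescale raw within-cluster weights w_ij by a cluster-level factor.

    method="size":      scaled weights sum to the cluster sample size n_j
    method="effective": scaled weights sum to the effective sample size,
                        (sum w)^2 / (sum w^2)
    """
    total = sum(weights)
    if method == "size":
        factor = len(weights) / total
    elif method == "effective":
        factor = total / sum(w * w for w in weights)
    else:
        raise ValueError("unknown scaling method: %r" % method)
    return [w * factor for w in weights]
```

When all weights within a cluster are equal, both scalings reduce every weight to 1; informative selection shows up as variability in the standardized weights, one of the factors varied in the simulation.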

Journal ArticleDOI
TL;DR: The 1/3 law is obtained: if A and B are strict Nash equilibria, then selection favors replacement of B by A if the unstable equilibrium occurs at a frequency of A of less than 1/3.
Abstract: Evolutionary game dynamics in finite populations can be described by a frequency-dependent, stochastic Wright-Fisher process. We consider a symmetric game between two strategies, A and B. There are discrete generations. In each generation, individuals produce offspring proportional to their payoff. The next generation is sampled randomly from this pool of offspring. The total population size is constant. The resulting Markov process has two absorbing states corresponding to homogeneous populations of all A or all B. We quantify frequency-dependent selection by comparing the absorption probabilities to the corresponding probabilities under random drift. We derive conditions for selection to favor one strategy or the other by using the concept of total positivity. In the limit of weak selection, we obtain the 1/3 law: if A and B are strict Nash equilibria, then selection favors replacement of B by A if the unstable equilibrium occurs at a frequency of A of less than 1/3.
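The comparison described here, absorption probabilities under frequency-dependent selection against the neutral drift benchmark of 1/N, can be estimated with a small Monte Carlo sketch of the process (payoff values, population size, and selection intensity below are illustrative assumptions, not the paper's analytical setup):

```python
import random

def fixation_prob(a, b, c, d, N=20, w=0.1, runs=5000, seed=1):
    # Estimate the probability that a single A mutant takes over a population
    # of B players under frequency-dependent Wright-Fisher dynamics.
    # Payoffs: A vs A -> a, A vs B -> b, B vs A -> c, B vs B -> d.
    rng = random.Random(seed)
    fixed = 0
    for _ in range(runs):
        i = 1  # current number of A individuals
        while 0 < i < N:
            # average payoff against a randomly met co-player (excluding self)
            pi_A = (a * (i - 1) + b * (N - i)) / (N - 1)
            pi_B = (c * i + d * (N - i - 1)) / (N - 1)
            fA = 1 - w + w * pi_A  # fitness = baseline + selection * payoff
            fB = 1 - w + w * pi_B
            p = i * fA / (i * fA + (N - i) * fB)
            # next generation: binomial sampling of N offspring
            i = sum(rng.random() < p for _ in range(N))
        fixed += (i == N)
    return fixed / runs
```

For a neutral game (a = b = c = d) the estimate recovers the drift value 1/N; a game in which A dominates B pushes it above 1/N, which is the paper's criterion for selection favoring A.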

Book ChapterDOI
02 Apr 2006
TL;DR: This work introduces a more general method that detects sequences that have either come under selection, or begun to drift, on any lineage. The method is based on a phylogenetic hidden Markov model (phylo-HMM), and efficient dynamic-programming algorithms are derived for computing the prior and posterior distributions of substitution counts used to assess significance, given a model of neutral evolution.
Abstract: So far, most methods for identifying sequences under selection based on comparative sequence data have either assumed selective pressures are the same across all branches of a phylogeny, or have focused on changes in specific lineages of interest. Here, we introduce a more general method that detects sequences that have either come under selection, or begun to drift, on any lineage. The method is based on a phylogenetic hidden Markov model (phylo-HMM), and does not require element boundaries to be determined a priori, making it particularly useful for identifying noncoding sequences. Insertions and deletions (indels) are incorporated into the phylo-HMM by a simple strategy that uses a separately reconstructed “indel history.” To evaluate the statistical significance of predictions, we introduce a novel method for computing P-values based on prior and posterior distributions of the number of substitutions that have occurred in the evolution of predicted elements. We derive efficient dynamic-programming algorithms for obtaining these distributions, given a model of neutral evolution. Our methods have been implemented as computer programs called DLESS (Detection of LinEage-Specific Selection) and phyloP (phylogenetic P-values). We discuss results obtained with these programs on both real and simulated data sets.
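As a much-simplified illustration of the P-value idea (not the exact distributions DLESS/phyloP compute from the phylogenetic model): if the neutral substitution count in a predicted element were approximated as Poisson with mean mu, a one-sided P-value for conservation would be the lower tail at the observed count.

```python
import math

def conservation_pvalue(observed, mu):
    # P(X <= observed) for X ~ Poisson(mu): a small value means far fewer
    # substitutions than expected under neutrality, i.e. conservation.
    return sum(math.exp(-mu) * mu ** k / math.factorial(k)
               for k in range(observed + 1))
```

phyloP's actual null accounts for the tree topology and branch lengths via dynamic programming; this Poisson stand-in only conveys the tail-probability logic.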