
Showing papers on "Selection (genetic algorithm)" published in 1997


Journal ArticleDOI
01 Oct 1997-Genetics
TL;DR: It is found that the polymorphic patterns in a DNA sample under logistic population growth and genetic hitchhiking are very similar and that one of the newly developed tests, Fs, is considerably more powerful than existing tests for rejecting the hypothesis of neutrality of mutations.
Abstract: The main purpose of this article is to present several new statistical tests of neutrality of mutations against a class of alternative models, under which DNA polymorphisms tend to exhibit excesses of rare alleles or young mutations. Another purpose is to study the powers of existing and newly developed tests and to examine the detailed pattern of polymorphisms under population growth, genetic hitchhiking and background selection. It is found that the polymorphic patterns in a DNA sample under logistic population growth and genetic hitchhiking are very similar and that one of the newly developed tests, Fs, is considerably more powerful than existing tests for rejecting the hypothesis of neutrality of mutations. Background selection gives rise to quite different polymorphic patterns than does logistic population growth or genetic hitchhiking, although all of them show excesses of rare alleles or young mutations. We show that Fu and Li's tests are among the most powerful tests against background selection. Implications of these results are discussed.

6,332 citations


Journal ArticleDOI
TL;DR: This paper decomposes the conventional measure of evaluation bias into several components and finds that bias due to selection on unobservables, commonly called selection bias in econometrics, is empirically less important than other components, although it is still a sizeable fraction of the estimated programme impact.
Abstract: This paper considers whether it is possible to devise a nonexperimental procedure for evaluating a prototypical job training programme. Using rich nonexperimental data, we examine the performance of a two-stage evaluation methodology that (a) estimates the probability that a person participates in a programme and (b) uses the estimated probability in extensions of the classical method of matching. We decompose the conventional measure of programme evaluation bias into several components and find that bias due to selection on unobservables, commonly called selection bias in econometrics, is empirically less important than other components, although it is still a sizeable fraction of the estimated programme impact. Matching methods applied to comparison groups located in the same labour markets as participants and administered the same questionnaire eliminate much of the bias as conventionally measured, but the remaining bias is a considerable fraction of experimentally-determined programme impact estimates. We test and reject the identifying assumptions that justify the classical method of matching. We present a nonparametric conditional difference-in-differences extension of the method of matching that is consistent with the classical index-sufficient sample selection model and is not rejected by our tests of identifying assumptions. This estimator is effective in eliminating bias, especially when it is due to temporally-invariant omitted variables.

5,069 citations


Journal ArticleDOI
TL;DR: The first systematic study on the influence of random fluctuations and sampling size on the reliability of transcript profiles generated routinely by partially sequencing thousands of randomly selected clones from relevant cDNA libraries is presented.
Abstract: Genes differentially expressed in different tissues, during development, or during specific pathologies are of foremost interest to both basic and pharmaceutical research. "Transcript profiles" or "digital Northerns" are generated routinely by partially sequencing thousands of randomly selected clones from relevant cDNA libraries. Differentially expressed genes can then be detected from variations in the counts of their cognate sequence tags. Here we present the first systematic study on the influence of random fluctuations and sampling size on the reliability of this kind of data. We establish a rigorous significance test and demonstrate its use on publicly available transcript profiles. The theory links the threshold of selection of putatively regulated genes (e.g., the number of pharmaceutical leads) to the fraction of false positive clones one is willing to risk. Our results delineate more precisely and extend the limits within which digital Northern data can be used.

2,660 citations


Proceedings Article
01 Jan 1997
TL;DR: Criteria to evaluate the utility of classifiers induced from such imbalanced training sets are discussed, an explanation of the poor behavior of some learners under these circumstances is given, and a simple technique called one-sided selection of examples is suggested.

2,271 citations


Journal ArticleDOI
TL;DR: This work studies the problem of choosing an optimal feature set for land use classification based on SAR satellite images using four different texture models and shows that pooling features derived from different texture models, followed by feature selection, results in a substantial improvement in classification accuracy.
Abstract: A large number of algorithms have been proposed for feature subset selection. Our experimental results show that the sequential forward floating selection algorithm, proposed by Pudil et al. (1994), dominates the other algorithms tested. We study the problem of choosing an optimal feature set for land use classification based on SAR satellite images using four different texture models. Pooling features derived from different texture models, followed by feature selection, results in a substantial improvement in the classification accuracy. We also illustrate the dangers of using feature selection in small sample size situations.

2,238 citations
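
The sequential forward floating selection idea named in the abstract above can be sketched roughly as follows. This is a simplified illustration, not the exact formulation of Pudil et al. (1994): the function name `sffs`, the `score` callback, and the toy weights are all assumptions made for the example, and the sketch assumes `target_size` does not exceed the number of features.

```python
def sffs(features, score, target_size):
    """Simplified sequential forward floating selection (illustrative sketch).

    features: list of hashable feature names
    score: callable taking a list of features; higher is better
    target_size: number of features to return (assumed <= len(features))
    """
    selected = []
    best = {}  # subset size -> (score, subset): best subset seen at each size

    def record(subset):
        """Store subset if it beats the best known subset of the same size."""
        s = score(subset)
        k = len(subset)
        if k not in best or s > best[k][0]:
            best[k] = (s, list(subset))
            return True
        return False

    while len(selected) < target_size:
        # Forward step: add the single feature that raises the score most.
        remaining = [f for f in features if f not in selected]
        add = max(remaining, key=lambda f: score(selected + [f]))
        selected = selected + [add]
        record(selected)

        # Floating step: keep dropping the least useful feature while doing so
        # strictly beats the best subset of that smaller size found so far.
        while len(selected) > 2:
            drop = max(selected,
                       key=lambda f: score([g for g in selected if g != f]))
            reduced = [g for g in selected if g != drop]
            if not record(reduced):
                break
            selected = reduced

    return best[target_size][1]


# Hypothetical additive score, used only to exercise the function.
weights = {"a": 3.0, "b": 1.0, "c": 2.0, "d": 0.5}

def toy_score(subset):
    return sum(weights[f] for f in subset)

chosen = sffs(list(weights), toy_score, 3)
```

In practice `score` would be something like cross-validated classification accuracy on the candidate subset; with few samples that estimate is noisy, which is the small-sample danger the abstract warns about.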


01 Jan 1997
TL;DR: For the problem of selecting a linear model to approximate the true unknown regression model, necessary and/or sufficient conditions are established for the asymptotic validity of various model selection procedures such as Akaike's AIC, Mallows' Cp, Shibata's FPEλ, Schwarz' BIC, generalized AIC, and cross-validation.
Abstract: In the problem of selecting a linear model to approximate the true unknown regression model, some necessary and/or sufficient conditions are established for the asymptotic validity of various model selection procedures such as Akaike's AIC, Mallows' Cp, Shibata's FPEλ, Schwarz' BIC, generalized AIC, cross-validation, and generalized cross-validation. It is found that these selection procedures can be classified into three classes according to their asymptotic behavior. Under some fairly weak conditions, the selection procedures in one class are asymptotically valid if there exist fixed-dimension correct models; the selection procedures in another class are asymptotically valid if no fixed-dimension correct model exists. The procedures in the third class are compromises of the procedures in the first two classes. Some empirical results are also presented.

595 citations
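
To make concrete how two of the criteria listed in the abstract above can disagree, here is a minimal sketch using the standard Gaussian-error forms of AIC and BIC for a linear model (up to an additive constant). The function names, the candidate models, and their residual sums of squares are hypothetical values chosen for illustration; nothing here reproduces the paper's asymptotic analysis.

```python
import math

def gaussian_aic(rss, n, k):
    """AIC for a linear model with Gaussian errors, up to an additive constant.

    rss: residual sum of squares, n: sample size, k: number of coefficients.
    """
    return n * math.log(rss / n) + 2 * k

def gaussian_bic(rss, n, k):
    """BIC under the same assumptions; the penalty grows with log(n)."""
    return n * math.log(rss / n) + k * math.log(n)

def select_model(candidates, n, criterion):
    """Return the name of the candidate with the smallest criterion value.

    candidates: dict mapping model name -> (rss, k)
    """
    return min(candidates,
               key=lambda name: criterion(candidates[name][0], n,
                                          candidates[name][1]))

# Hypothetical fits: the larger model lowers the RSS only modestly.
candidates = {"small": (120.0, 2), "large": (112.0, 5)}
n = 100

aic_choice = select_model(candidates, n, gaussian_aic)
bic_choice = select_model(candidates, n, gaussian_bic)
```

With these made-up numbers the lighter AIC penalty favours the larger model while the heavier BIC penalty favours the smaller one, which is the kind of behavioural difference that motivates sorting the procedures into classes.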


Journal ArticleDOI
TL;DR: A method was derived that maximizes the genetic level of selected animals while constraining their average coancestry to a predefined value and can be used to constrain the variance of response by restricting the average prediction error variance of the selected animals.
Abstract: A method was derived that maximizes the genetic level of selected animals while constraining their average coancestry to a predefined value. The average coancestry of the selected parents equals the inbreeding level in the next generation, so that rates of inbreeding were controlled. When this method was applied for several generations of selection, stable rates of genetic gain were attained, which indicates that the method could control the short- and long-term effects of selection on inbreeding. At equal rates of inbreeding, genetic gains were 21 to 60% greater than that with selection for BLUP-EBV, because of increased selection differentials. The difference was larger when the desirable rate of inbreeding was smallest. Selection with a constraint on inbreeding required only EBV of, and relationships between, the selection candidates and is therefore easy to apply in practice. The optimal solution is expressed in genetic contributions of selection candidates to the next generation, which is equivalent to numbers of offspring per candidate. These optimal numbers of offspring may be difficult to attain because of female reproductive limitations. The optimal method could be adapted to situations with additional reproductive constraints. The method can also be used to constrain the variance of response by restricting the average prediction error variance of the selected animals.

590 citations


Patent
20 Oct 1997
TL;DR: In this article, a system and method for the optimized storage and retrieval of video data at distributed sites calls for the deployment of "Smart Mirror" sites throughout a network, each of which maintains a copy of certain data managed by the system.
Abstract: A system and method for the optimized storage and retrieval of video data at distributed sites calls for the deployment of “Smart Mirror” sites throughout a network, each of which maintains a copy of certain data managed by the system. Every user is assigned to a specific delivery site based on an analysis of network performance with respect to each of the available delivery sites. Generalized network performance data is collected and stored to facilitate the selection of additional delivery sites and to ensure the preservation of improved performance in comparison to traditional networks.

556 citations


Journal ArticleDOI
TL;DR: In this paper, an exact response to selection (RS) equation is derived for proportionate selection given an infinite population in linkage equilibrium, where the genotype frequencies are the product of the univariate marginal frequencies.
Abstract: The Breeder Genetic Algorithm (BGA) was designed according to the theories and methods used in the science of livestock breeding. The prediction of a breeding experiment is based on the response to selection (RS) equation. This equation relates the change in a population's fitness to the standard deviation of its fitness, as well as to the parameters selection intensity and realized heritability. In this paper the exact RS equation is derived for proportionate selection given an infinite population in linkage equilibrium. In linkage equilibrium the genotype frequencies are the product of the univariate marginal frequencies. The equation contains Fisher's fundamental theorem of natural selection as an approximation. The theorem shows that the response is approximately equal to the quotient of a quantity called additive genetic variance, VA, and the average fitness. We compare Mendelian two-parent recombination with gene-pool recombination, which belongs to a special class of genetic algorithms that we call univariate marginal distribution (UMD) algorithms. UMD algorithms keep the genotypes in linkage equilibrium. For UMD algorithms, an exact RS equation is proven that can be used for long-term prediction. Empirical and theoretical evidence is provided that indicates that Mendelian two-parent recombination is also mainly exploiting the additive genetic variance. We compute an exact RS equation for binary tournament selection. It shows that the two classical methods for estimating realized heritability---the regression heritability and the heritability in the narrow sense---may give poor estimates. Furthermore, realized heritability for binary tournament selection can be very different from that of proportionate selection. The paper ends with a short survey about methods that extend standard genetic algorithms and UMD algorithms by detecting interacting variables in nonlinear fitness functions and using this information to sample new points.

521 citations
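
As a small numerical check of the quotient of variance and average fitness that the abstract above mentions, the sketch below applies one step of proportionate selection to an infinite population described by type frequencies. It covers only the selection step (the gain in mean fitness among the selected parents), where the gain equals the total fitness variance divided by the mean fitness; the paper's RS equation also accounts for recombination and additive genetic variance, which this sketch does not model. All names and the example frequencies are assumptions.

```python
from fractions import Fraction

def weighted_mean(freqs, values):
    """Mean of values under the given type frequencies."""
    return sum(p * v for p, v in zip(freqs, values))

def weighted_variance(freqs, values):
    """Variance of values under the given type frequencies."""
    m = weighted_mean(freqs, values)
    return sum(p * (v - m) ** 2 for p, v in zip(freqs, values))

def proportionate_selection(freqs, fitness):
    """Frequencies after selecting each type in proportion to its fitness."""
    mean_f = weighted_mean(freqs, fitness)
    return [p * f / mean_f for p, f in zip(freqs, fitness)]

# Hypothetical population: three types with frequencies 1/4, 1/2, 1/4.
freqs = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]
fitness = [Fraction(1), Fraction(2), Fraction(3)]

selected = proportionate_selection(freqs, fitness)
gain = weighted_mean(selected, fitness) - weighted_mean(freqs, fitness)
predicted = weighted_variance(freqs, fitness) / weighted_mean(freqs, fitness)
```

Using `Fraction` keeps the arithmetic exact, so `gain` and `predicted` can be compared for equality rather than within a tolerance.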



Journal ArticleDOI
TL;DR: A two-step estimation procedure is proposed for the regression equation of interest in a panel data sample selection model; the estimator is consistent and asymptotically normal, with a rate of convergence that can be made arbitrarily close to $n^{-1/2}$, depending on the strength of certain smoothness assumptions.
Abstract: We consider the problem of estimation in a panel data sample selection model, where both the selection and the regression equation of interest contain unobservable individual-specific effects. We propose a two-step estimation procedure, which differences out both the sample selection effect and the unobservable individual effect from the equation of interest. In the first step, the unknown coefficients of the selection equation are consistently estimated. The estimates are then used to estimate the regression equation of interest. The estimator proposed in this paper is consistent and asymptotically normal, with a rate of convergence that can be made arbitrarily close to $n^{-1/2}$, depending on the strength of certain smoothness assumptions. The finite sample properties of the estimator are investigated in a small Monte Carlo simulation.

Journal ArticleDOI
TL;DR: The near-optimality, speed and simplicity of heuristic algorithms suggests that they are acceptable alternatives for many reserve selection problems, especially when dealing with large data sets or complicated analyses.

01 Jan 1997
TL;DR: The paper discusses criteria to evaluate the utility of classifiers induced from such imbalanced training sets, explains the poor behavior of some learners under these circumstances, and suggests as a solution a simple technique called one-sided selection of examples.
Abstract: Adding examples of the majority class to the training set can have a detrimental effect on the learner's behavior: noisy or otherwise unreliable examples from the majority class can overwhelm the minority class. The paper discusses criteria to evaluate the utility of classifiers induced from such imbalanced training sets, gives explanation of the poor behavior of some learners under these circumstances, and suggests as a solution a simple technique called one-sided selection of examples.

Journal ArticleDOI
TL;DR: A hybrid algorithm is proposed by combining a learning method of linguistic classification rules with the multi-objective genetic algorithm for finding a set of non-dominated solutions of the rule selection problem.

Journal ArticleDOI
TL;DR: A quantitative expression for the force of indirect selection that applies to any female mating behavior, is relatively insensitive to the underlying genetics, and is based on measurable quantities suggests that the evolutionary force generated by indirect selection on preferences is weak in absolute terms.
Abstract: An important but controversial class of hypotheses concerning the evolution of female preferences for extreme male mating displays involves “indirect selection.” Even in the absence of direct fitness effects, preference for males with high overall fitness can spread via a genetic correlation that develops between preference alleles and high fitness genotypes. Here we develop a quantitative expression for the force of indirect selection that (i) applies to any female mating behavior, (ii) is relatively insensitive to the underlying genetics, and (iii) is based on measurable quantities. In conjunction with the limited data now available, it suggests that the evolutionary force generated by indirect selection on preferences is weak in absolute terms. This finding raises the possibility that direct selection on preference genes may often be more important than indirect selection, but more data on the quantities identified by our model and on direct selection are needed to decide the question.

Patent
17 Dec 1997
TL;DR: Methods for the evolution of proteins of industrial and pharmaceutical interest, including methods for effecting recombination and selection, are provided in this paper, and compositions produced by these methods are also disclosed.
Abstract: Methods are provided for the evolution of proteins of industrial and pharmaceutical interest, including methods for effecting recombination and selection. Compositions produced by these methods are also disclosed.


Journal ArticleDOI
TL;DR: The quality and speed of congenic strain construction are enhanced by marker-assisted selection protocol-based strategies, which produce congenic strains with the target gene contained on clearly defined donor-derived genomic intervals in less than half the number of generations required by the classic protocol.

Journal ArticleDOI
TL;DR: A structured messy genetic algorithm is developed, incorporating some of the principles of the messy genetic algorithm, such as strings that increase in length during the evolution of designs, and is shown to be an effective tool for the current optimization problem.
Abstract: The importance of water distribution network rehabilitation, replacement, and expansion is discussed. The problem of choosing the best possible set of network improvements to make with a limited budget is presented as a large optimization problem to which conventional optimization techniques are poorly suited. A multiobjective approach is described, using capital cost and benefit as dual objectives, enabling a range of noninferior solutions of varying cost to be derived. A structured messy genetic algorithm is developed, incorporating some of the principles of the messy genetic algorithm, such as strings that increase in length during the evolution of designs. The algorithm is shown to be an effective tool for the current optimization problem, being particularly suited both to the multiobjective approach and to problems that involve the selection of small sets of variables from large numbers of possibilities. Two examples are included that demonstrate the features of the method and show that the algorithm performs much better than a standard genetic algorithm for a large network.

Journal ArticleDOI
TL;DR: The main goal is to analyze the ancestral selection graph and to compare it to Kingman's coalescent process; it is found that the distribution of the time to the most recent common ancestor does not depend on the selection coefficient and hence is the same as in the neutral case.

Journal ArticleDOI
01 Feb 1997-Genetics
TL;DR: It is found that when the allele frequencies in the population are already in equilibrium, then the genealogy does not differ much from the neutral case, and this is supported by rigorous results.
Abstract: We introduce the genealogy of a random sample of genes taken from a large haploid population that evolves according to random reproduction with selection and mutation. Without selection, the genealogy is described by Kingman's well-known coalescent process. In the selective case, the genealogy of the sample is embedded in a graph with a coalescing and branching structure. We describe this graph, called the ancestral selection graph, and point out differences and similarities with Kingman's coalescent. We present simulations for a two-allele model with symmetric mutation in which one of the alleles has a selective advantage over the other. We find that when the allele frequencies in the population are already in equilibrium, then the genealogy does not differ much from the neutral case. This is supported by rigorous results. Furthermore, we describe the ancestral selection graph for other selective models with finitely many selection classes, such as the K-allele models, infinitely-many-alleles models, DNA sequence models, and infinitely-many-sites models, and briefly discuss the diploid case.

Journal ArticleDOI
TL;DR: Two models comparing the fitness outcomes of behavioural strategies based on conspecific reproductive success as a cue to assess local environmental quality before selecting a new breeding habitat show that prospecting breeding patches before recruiting is the best strategy if the environment is predictable and contains a low proportion of good patches.
Abstract: Classical models of breeding habitat selection rarely deal with the question of information gathering for patch quality assessment. In this paper, we present two models comparing the fitness outcomes of behavioural strategies based on conspecific reproductive success as a cue to assess local environmental quality before selecting a new breeding habitat. The models deal with two phases of the life-cycle of a territorial migratory species: recruitment to a breeding population (model 1) and breeding site fidelity of subsequent breeding attempts (model 2). The first model shows that prospecting breeding patches before recruiting is the best strategy if the environment is predictable and contains a low proportion of good patches, even if it implies losing a breeding opportunity. The second model shows that dispersing after a breeding attempt according to the patch’s breeding success rather than the individual’s own success is the best strategy if the environment is patchy. These results underline the importance of studying the spatio-temporal variations of factors affecting reproductive success when considering the importance of habitat selection strategies based on conspecifics. Moreover, they allow the understanding of individual behaviour patterns observed in natural populations and their potential consequences at the metapopulation level.

Patent
04 Sep 1997
TL;DR: A computer-implemented method and system utilizing a distributed network for the recommendation of goods and/or services to potential customers based on a potential customer's selection of goods and/or services and a database of previous customer purchasing history is described.
Abstract: A computer-implemented method and system utilizing a distributed network for the recommendation of goods and/or services to potential customers based on a potential customer's selection of goods and/or services and a database of previous customer purchasing history.

Journal ArticleDOI
TL;DR: This paper describes an approach for integrating a large number of context-dependent features into a semi-automated tool that provides a learning algorithm for selecting and combining groupings of the data, where groupings can be induced by highly specialized features.

Journal ArticleDOI
TL;DR: A meta-analysis of published estimates of the heritability of developmental stability, mainly the degree of individual fluctuating asymmetry in morphological characters, indicates that there is a significant additive genetic component to developmental stability.
Abstract: The existence of additive genetic variance in developmental stability has important implications for our understanding of morphological variation. The heritability of individual fluctuating asymmetry and other measures of developmental stability have frequently been estimated from parent-offspring regressions, sib analyses, or from selection experiments. Here we review by meta-analysis published estimates of the heritability of developmental stability, mainly the degree of individual fluctuating asymmetry in morphological characters. The overall mean effect size of heritabilities of individual fluctuating asymmetry was 0.19 from 34 studies of 17 species differing highly significantly from zero (P < 0.0001). The mean heritability for 14 species was 0.27. This indicates that there is a significant additive genetic component to developmental stability. Effect size was larger for selection experiments than for studies based on parent-offspring regression or sib analyses, implying that genetic estimates were unbiased by maternal or common environment effects. Additive genetic coefficients of variation for individual fluctuating asymmetry were considerably higher than those for character size per se. Developmental stability may be significantly heritable either because of strong directional selection, or fluctuating selection regimes which prevent populations from achieving a high degree of developmental stability to current environmental and genetic conditions.

Journal ArticleDOI
TL;DR: Maize breeding programs targeting low-N environments in the tropics should include low-N selection environments to maximize selection gains; selection under high N for performance under low N was predicted to be significantly less efficient than selection under low N when relative yield reduction due to N stress exceeded 43%.
Abstract: Most maize (Zea mays L.) in the tropics is grown under low-nitrogen (N) conditions, raising the need to assess efficient breeding strategies for such conditions. This study assesses the value of low-N vs. high-N selection environments for improving lowland tropical maize for low-N target environments. Fourteen replicated trials grown under low (no N applied) and high (200 kg N ha⁻¹ applied) N at CIMMYT, Mexico, between 1986 and 1995 were analyzed for broad-sense heritability of grain yield, genetic correlation between grain yields under low and high N, and predicted response of grain yield under low N to selection under either low or high N. Broad-sense heritabilities for grain yield under low N were on average 29% smaller than under high N because of lower genotypic variances under low N. Error variances were similar at low and high N. Genetic correlations between grain yields under low and high N were generally positive. They decreased with increasing relative yield reduction under low N, indicating that specific adaptation to either low or high N became more important the more low-N and high-N experiments differed in grain yield. Selection under high N for performance under low N was predicted to be significantly less efficient than selection under low N when relative yield reduction due to N stress exceeded 43%. Maize breeding programs targeting low-N environments in the tropics should include low-N selection environments to maximize selection gains.
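
The trade-off described in the abstract above, between higher heritability under high N and an imperfect genetic correlation with low-N performance, can be illustrated with the textbook expression for the efficiency of indirect relative to direct selection, r_g * sqrt(h2_select / h2_target), assuming equal selection intensity in both environments. The abstract does not state which formula the authors used, so this is an assumption; the function names and heritability values are hypothetical, with only the 29% average reduction taken from the abstract.

```python
import math

def relative_efficiency(r_g, h2_select, h2_target):
    """Predicted gain from indirect selection relative to direct selection.

    r_g: genetic correlation between the selection and target environments
    h2_select: heritability in the selection environment (e.g. high N)
    h2_target: heritability in the target environment (e.g. low N)
    Assumes the same selection intensity in both environments.
    """
    return r_g * math.sqrt(h2_select / h2_target)

def breakeven_correlation(h2_select, h2_target):
    """Genetic correlation above which indirect selection is predicted to win."""
    return math.sqrt(h2_target / h2_select)

# Heritability under low N taken as 29% smaller than under high N (reported
# average); the absolute level 0.60 is a made-up value for illustration.
h2_high = 0.60
h2_low = h2_high * (1 - 0.29)

threshold = breakeven_correlation(h2_high, h2_low)
strong = relative_efficiency(0.9, h2_high, h2_low)   # high genetic correlation
weak = relative_efficiency(0.5, h2_high, h2_low)     # low genetic correlation
```

Under these assumptions the genetic correlation would need to exceed roughly 0.84 for selection under high N to match direct selection under low N, which is consistent in direction with the abstract's finding that indirect selection loses out as the correlation falls with increasing N stress.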


Proceedings ArticleDOI
13 Apr 1997
TL;DR: A model for predicting the convergence quality of genetic algorithms is presented that incorporates previous knowledge about decision making in genetic algorithms and the initial supply of building blocks in a novel way and accurately predicts the quality of the solution found by a GA using a given population size.
Abstract: The paper presents a model for predicting the convergence quality of genetic algorithms. The model incorporates previous knowledge about decision making in genetic algorithms and the initial supply of building blocks in a novel way. The result is an equation that accurately predicts the quality of the solution found by a GA using a given population size. Adjustments for different selection intensities are considered and computational experiments demonstrate the effectiveness of the model.


Book
30 Sep 1997
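TL;DR: This book covers the detection, estimation and utilization of quantitative variation (genetic models, mating designs, the diallel cross, and selection with and without competition) and the interrelationships of genotype and environment (genotype-environment interactions, stability and adaptation, breeding for biotic and abiotic stress, and genetic resources).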
Abstract: Preface. 1. Genetic Foundations: The Historical Setting. Part One: Quantitative Variation: Its Detection, Estimation and Utilization. 2. Genetic Models and their Predictive Value. 3. Experimental Mating Designs: An Assessment of their Use and Efficiency in Breeding Programs. 4. The Diallel Cross: The Ultimate Mating Design? 5. Selection with and without Competition. Part Two: Genotype and Environment: Their Interrelationships. 6. Genotype-Environment Interactions: Analysis and Problems. 7. Stability, Adaptability and Adaptation. 8. Breeding for Biotic and Abiotic Stress. 9. Genetic Resources, Genetic Diversity and Ecogeographic Breeding. Index.