Journal Article

Statistical method for testing the neutral mutation hypothesis by DNA polymorphism.

Fumio Tajima
01 Nov 1989-Genetics (Genetics Society of America)-Vol. 123, Iss: 3, pp 585-595
TL;DR: The relationship between the two estimates of genetic variation at the DNA level, namely the number of segregating sites and the average number of nucleotide differences estimated from pairwise comparison, is investigated in this article.
Abstract: The relationship between the two estimates of genetic variation at the DNA level, namely the number of segregating sites and the average number of nucleotide differences estimated from pairwise comparison, is investigated. It is found that the correlation between these two estimates is large when the sample size is small, and decreases slowly as the sample size increases. Using the relationship obtained, a statistical method for testing the neutral mutation hypothesis is developed. This method needs only DNA polymorphism data, namely the genetic variation within a population at the DNA level. A simple computer simulation method, used to obtain the distribution of the new statistic, is also presented. Applying this statistical method to five regions of DNA sequence in Drosophila melanogaster, it is found that large insertions/deletions (greater than 100 bp) are deleterious. It is suggested that natural selection against large insertions/deletions is so weak that a large amount of variation is maintained in a population.
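The test statistic described in this abstract is commonly known as Tajima's D. The sketch below is a minimal illustration in its usual textbook form, not code from the paper: it compares the average number of pairwise differences with the number of segregating sites S scaled by a1 = 1 + 1/2 + ... + 1/(n-1). The function name `tajimas_d` and the input format (a list of equal-length aligned strings with no gaps or missing data) are assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Return (S, k_hat, D) for aligned sequences (hypothetical helper).

    S     : number of segregating sites
    k_hat : average number of pairwise nucleotide differences
    D     : Tajima's D, or None when S == 0 (statistic undefined)
    """
    n = len(seqs)
    # For n = 3 the variance term is identically zero, so D needs n >= 4.
    if n < 4:
        raise ValueError("need at least 4 sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")

    # A site is segregating if more than one state is observed in the sample.
    s = sum(1 for column in zip(*seqs) if len(set(column)) > 1)

    # Average number of differences over all n(n-1)/2 pairs of sequences.
    total = sum(
        sum(x != y for x, y in zip(a, b)) for a, b in combinations(seqs, 2)
    )
    k_hat = total / (n * (n - 1) / 2)
    if s == 0:
        return s, k_hat, None

    # Constants that depend only on the sample size n.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    # Difference of the two estimates, divided by its estimated std. deviation.
    d = (k_hat - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))
    return s, k_hat, d


if __name__ == "__main__":
    print(tajimas_d(["AAAA", "AAAT", "AATA", "ATAA"]))
```

A sample in which every variant is a singleton gives a negative D, while variants at intermediate frequency give a positive D. Judging significance requires the distribution of the statistic, which the abstract says was obtained by computer simulation; that step is not sketched here.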


Citations
Journal Article
TL;DR: Arlequin ver 3.0 is a software package integrating several basic and advanced methods for population genetics data analysis, like the computation of standard genetic diversity indices, the estimation of allele and haplotype frequencies, tests of departure from linkage equilibrium, departure from selective neutrality and demographic equilibrium, estimation of parameters from past population expansions, and thorough analyses of population subdivision under the AMOVA framework.
Abstract: Arlequin ver 3.0 is a software package integrating several basic and advanced methods for population genetics data analysis, like the computation of standard genetic diversity indices, the estimation of allele and haplotype frequencies, tests of departure from linkage equilibrium, departure from selective neutrality and demographic equilibrium, estimation of parameters from past population expansions, and thorough analyses of population subdivision under the AMOVA framework. Arlequin 3 introduces a completely new graphical interface written in C++, a more robust semantic analysis of input files, and two new methods: a Bayesian estimation of gametic phase from multi-locus genotypes, and an estimation of the parameters of an instantaneous spatial expansion from DNA sequence polymorphism. Arlequin can handle several data types like DNA sequences, microsatellite data, or standard multi-locus genotypes. A Windows version of the software is freely available on http://cmpg.unibe.ch/software/arlequin3.

14,271 citations

Journal Article
TL;DR: Version 5 implements a number of new features and analytical methods allowing extensive DNA polymorphism analyses on large datasets, including visualizing sliding window results integrated with available genome annotations in the UCSC browser.
Abstract: Motivation: DnaSP is a software package for a comprehensive analysis of DNA polymorphism data. Version 5 implements a number of new features and analytical methods allowing extensive DNA polymorphism analyses on large datasets. Among other features, the newly implemented methods allow for: (i) analyses on multiple data files; (ii) haplotype phasing; (iii) analyses on insertion/deletion polymorphism data; (iv) visualizing sliding window results integrated with available genome annotations in the UCSC browser. Availability: Freely available to academic users from: http://www.ub.edu/dnasp

13,511 citations

Journal Article
TL;DR: An overview of the statistical methods, computational tools, and visual exploration modules for data input and the results obtainable in MEGA is provided.
Abstract: With its theoretical basis firmly established in molecular evolutionary and population genetics, the comparative DNA and protein sequence analysis plays a central role in reconstructing the evolutionary histories of species and multigene families, estimating rates of molecular evolution, and inferring the nature and extent of selective forces shaping the evolution of genes and genomes. The scope of these investigations has now expanded greatly owing to the development of high-throughput sequencing techniques and novel statistical and computational methods. These methods require easy-to-use computer programs. One such effort has been to produce Molecular Evolutionary Genetics Analysis (MEGA) software, with its focus on facilitating the exploration and analysis of the DNA and protein sequence variation from an evolutionary perspective. Currently in its third major release, MEGA3 contains facilities for automatic and manual sequence alignment, web-based mining of databases, inference of the phylogenetic trees, estimation of evolutionary distances and testing evolutionary hypotheses. This paper provides an overview of the statistical methods, computational tools, and visual exploration modules for data input and the results obtainable in MEGA.

12,124 citations

Journal Article
01 Oct 1997-Genetics
TL;DR: It is found that the polymorphic patterns in a DNA sample under logistic population growth and genetic hitchhiking are very similar and that one of the newly developed tests, Fs, is considerably more powerful than existing tests for rejecting the hypothesis of neutrality of mutations.
Abstract: The main purpose of this article is to present several new statistical tests of neutrality of mutations against a class of alternative models, under which DNA polymorphisms tend to exhibit excesses of rare alleles or young mutations. Another purpose is to study the powers of existing and newly developed tests and to examine the detailed pattern of polymorphisms under population growth, genetic hitchhiking and background selection. It is found that the polymorphic patterns in a DNA sample under logistic population growth and genetic hitchhiking are very similar and that one of the newly developed tests, Fs, is considerably more powerful than existing tests for rejecting the hypothesis of neutrality of mutations. Background selection gives rise to quite different polymorphic patterns than does logistic population growth or genetic hitchhiking, although all of them show excesses of rare alleles or young mutations. We show that Fu and Li's tests are among the most powerful tests against background selection. Implications of these results are discussed.

6,332 citations

Journal Article
TL;DR: The present version of DnaSP introduces several new modules and features which, among other options, allow handling big data sets and conducting a large number of coalescent-based tests by Monte Carlo computer simulations.
Abstract: Summary: DnaSP is a software package for the analysis of DNA polymorphism data. Present version introduces several new modules and features which, among other options allow: (1) handling big data sets (∼5 Mb per sequence); (2) conducting a large number of coalescent-based tests by Monte Carlo computer simulations; (3) extensive analyses of the genetic differentiation and gene flow among populations; (4) analysing the evolutionary pattern of preferred and unpreferred codons; (5) generating graphical outputs for an easy visualization of results. Availability: The software package, including complete documentation and examples, is freely available to academic users from: http://www.ub.es/dnasp

6,100 citations

References
Book
01 Jan 1983
TL;DR: The neutral theory states that the great majority of evolutionary changes at the molecular level are caused not by Darwinian selection but by random drift of selectively neutral mutants; it has caused controversy ever since it was proposed.
Abstract: Motoo Kimura, as founder of the neutral theory, is uniquely placed to write this book. He first proposed the theory in 1968 to explain the unexpectedly high rate of evolutionary change and very large amount of intraspecific variability at the molecular level that had been uncovered by new techniques in molecular biology. The theory - which asserts that the great majority of evolutionary changes at the molecular level are caused not by Darwinian selection but by random drift of selectively neutral mutants - has caused controversy ever since. This book is the first comprehensive treatment of this subject and the author synthesises a wealth of material - ranging from a historical perspective, through recent molecular discoveries, to sophisticated mathematical arguments - all presented in a most lucid manner.

7,874 citations

Journal Article
TL;DR: The distribution is obtained for the number of segregating sites observed in a sample from a population which is subject to recurring, new, mutations but not subject to recombination, and applies approximately to three population models.

3,870 citations

Journal Article
17 Feb 1968-Nature
TL;DR: Calculating the rate of evolution in terms of nucleotide substitutions seems to give a value so high that many of the mutations involved must be neutral ones.

3,297 citations

Journal Article
01 Oct 1983-Genetics
TL;DR: These studies indicate that the estimates of the average number of nucleotide differences and nucleon diversity have a large variance, and a large part of this variance is due to stochastic factors.
Abstract: With the aim of analyzing and interpreting data on DNA polymorphism obtained by DNA sequencing or restriction enzyme technique, a mathematical theory on the expected evolutionary relationship among DNA sequences (nucleons) sampled is developed under the assumption that the evolutionary change of nucleons is determined solely by mutation and random genetic drift. The statistical property of the number of nucleotide differences between randomly chosen nucleons and that of heterozygosity or nucleon diversity is investigated using this theory. These studies indicate that the estimates of the average number of nucleotide differences and nucleon diversity have a large variance, and a large part of this variance is due to stochastic factors. Therefore, increasing sample size does not help reduce the variance significantly. The distribution of sample allele (nucleomorph) frequencies is also studied, and it is shown that a small number of samples are sufficient in order to know the distribution pattern.
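The claim that a larger sample does little to reduce the variance can be illustrated numerically. The sketch below assumes the variance formula usually attributed to this paper for the average number of pairwise differences under neutrality without recombination, Var = (n+1)/(3(n-1)) θ + 2(n²+n+3)/(9n(n-1)) θ², where θ = 4Nu. The formula is not stated in the abstract itself, and the function name `variance_of_k` is hypothetical.

```python
def variance_of_k(n, theta):
    """Variance of the average number of pairwise differences.

    Assumed form (neutral model, no recombination, theta = 4*N*u):
        Var = b1 * theta + b2 * theta**2
    with b1 and b2 depending only on the sample size n (n >= 2).
    """
    if n < 2:
        raise ValueError("need at least 2 sequences")
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    return b1 * theta + b2 * theta**2


if __name__ == "__main__":
    # The variance falls with n but levels off at theta/3 + 2*theta**2/9.
    for n in (2, 10, 100, 1000):
        print(n, variance_of_k(n, theta=1.0))
```

Under this formula the variance approaches θ/3 + 2θ²/9 rather than zero as n grows, which is the sense in which increasing sample size does not help much.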

3,038 citations

Journal Article
01 Apr 1964-Genetics
TL;DR: This article proposes to examine some of the population consequences of a system of different isoalleles whose frequency in the population is determined by the mutation rate and by random drift, and considers three possibilities: a system of selectively neutral isoalleles, a system of mutually heterotic alleles, and a mixture of heterotic and harmful mutants.
Abstract: It has sometimes been suggested that the wild-type allele is not a single entity, but rather a population of different isoalleles that are indistinguishable by any ordinary procedure. With hundreds of nucleotides, each presumably capable of base substitutions and with additional permutations possible through sequence rearrangements, gains, and losses, the number of possible gene states becomes astronomical. It is known that a single nucleotide substitution can have the most drastic consequences, but there are also mutations with very minute effects and there is the possibility that many are so small as to be undetectable. It is not the purpose of this article to discuss the plausibility of such a system of isoalleles, or the evidence for and against. Instead, we propose to examine some of the population consequences of such a system if it does exist. The probability seems great enough to warrant such an inquiry. If a large number of different states can arise by mutation, this doesn't necessarily mean that a large fraction of these would coexist in a single population. Some will be lost by random drift and others may be selectively disadvantageous. On the other hand, some may persist by being beneficial in heterozygous combinations. We shall consider three possibilities: (1) A system of selectively neutral isoalleles whose frequency in the population is determined by the mutation rate and by random drift. (2) A system of mutually heterotic alleles. (3) A mixture of heterotic and harmful mutants.

2,504 citations