Analysis of Gene Diversity in Subdivided Populations

doi:10.1073/PNAS.70.12.3321

Citations

Journal Article

Estimating F-statistics for the analysis of population structure.

Bruce S. Weir¹, C. Clark Cockerham¹

North Carolina State University¹

01 Nov 1984-Evolution

TL;DR: The purpose of this discussion is to offer some unity to various estimation formulae and to point out that correlations of genes in structured populations, with which F-statistics are concerned, are expressed very conveniently with a set of parameters treated by Cockerham (1969, 1973).

Abstract: This journal frequently contains papers that report values of F-statistics estimated from genetic data collected from several populations. These parameters, FST, FIT, and FIS, were introduced by Wright (1951), and offer a convenient means of summarizing population structure. While there is some disagreement about the interpretation of the quantities, there is considerably more disagreement on the method of evaluating them. Different authors make different assumptions about sample sizes or numbers of populations and handle the difficulties of multiple alleles and unequal sample sizes in different ways. Wright himself, for example, did not consider the effects of finite sample size. The purpose of this discussion is to offer some unity to various estimation formulae and to point out that correlations of genes in structured populations, with which F-statistics are concerned, are expressed very conveniently with a set of parameters treated by Cockerham (1969, 1973). We start with the parameters and construct appropriate estimators for them, rather than beginning the discussion with various data functions. The extension of Cockerham's work to multiple alleles and loci will be made explicit, and the use of jackknife procedures for estimating variances will be advocated. All of this may be regarded as an extension of a recent treatment of estimating the coancestry coefficient to serve as a mea-

17,890 citations

Cites background from "Analysis of Gene Diversity in Subdi..."

...Many papers do not give computational formulae, but generally refer to work by Wright (1943, 1951, 1965, 1973) or Nei (1973, 1977), and any assumptions made about sample sizes are not stated....

Journal Article

Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data.

Laurent Excoffier¹, Peter E. Smouse¹, Joseph M. Quattro¹

Rutgers University¹

01 Jun 1992-Genetics

TL;DR: In this article, a framework for the study of molecular variation within a single species is presented, where information on DNA haplotype divergence is incorporated into an analysis of variance format, derived from a matrix of squared-distances among all pairs of haplotypes.

Abstract: We present here a framework for the study of molecular variation within a single species. Information on DNA haplotype divergence is incorporated into an analysis of variance format, derived from a matrix of squared-distances among all pairs of haplotypes. This analysis of molecular variance (AMOVA) produces estimates of variance components and F-statistic analogs, designated here as phi-statistics, reflecting the correlation of haplotypic diversity at different levels of hierarchical subdivision. The method is flexible enough to accommodate several alternative input matrices, corresponding to different types of molecular data, as well as different types of evolutionary assumptions, without modifying the basic structure of the analysis. The significance of the variance components and phi-statistics is tested using a permutational approach, eliminating the normality assumption that is conventional for analysis of variance but inappropriate for molecular data. Application of AMOVA to human mitochondrial DNA haplotype data shows that population subdivisions are better resolved when some measure of molecular differences among haplotypes is introduced into the analysis. At the intraspecific level, however, the additional information provided by knowing the exact phylogenetic relations among haplotypes or by a nonlinear translation of restriction-site change into nucleotide diversity does not significantly modify the inferred population genetic structure. Monte Carlo studies show that site sampling does not fundamentally affect the significance of the molecular variance components. The AMOVA treatment is easily extended in several different directions and it constitutes a coherent and flexible framework for the statistical analysis of molecular data.

12,835 citations

Journal Article

Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data.

Peter E. Smouse, Laurent Excoffier, Joseph M. Quattro

30 May 1992-Genomics

12,252 citations

Journal Article

Discriminant analysis of principal components: a new method for the analysis of genetically structured populations

Thibaut Jombart¹, Sébastien Devillard², Francois Balloux¹

Imperial College London¹, University of Lyon²

15 Oct 2010-BMC Genetics

TL;DR: The Discriminant Analysis of Principal Components (DAPC) is introduced, a multivariate method designed to identify and describe clusters of genetically related individuals that performs generally better than STRUCTURE at characterizing population subdivision.

Abstract: The dramatic progress in sequencing technologies offers unprecedented prospects for deciphering the organization of natural populations in space and time. However, the size of the datasets generated also poses some daunting challenges. In particular, Bayesian clustering algorithms based on pre-defined population genetics models such as the STRUCTURE or BAPS software may not be able to cope with this unprecedented amount of data. Thus, there is a need for less computer-intensive approaches. Multivariate analyses seem particularly appealing as they are specifically devoted to extracting information from large datasets. Unfortunately, currently available multivariate methods still lack some essential features needed to study the genetic structure of natural populations. We introduce the Discriminant Analysis of Principal Components (DAPC), a multivariate method designed to identify and describe clusters of genetically related individuals. When group priors are lacking, DAPC uses sequential K-means and model selection to infer genetic clusters. Our approach allows extracting rich information from genetic data, providing assignment of individuals to groups, a visual assessment of between-population differentiation, and contribution of individual alleles to population structuring. We evaluate the performance of our method using simulated data, which were also analyzed using STRUCTURE as a benchmark. Additionally, we illustrate the method by analyzing microsatellite polymorphism in worldwide human populations and hemagglutinin gene sequence variation in seasonal influenza. Analysis of simulated data revealed that our approach performs generally better than STRUCTURE at characterizing population subdivision. The tools implemented in DAPC for the identification of clusters and graphical representation of between-group structures allow to unravel complex population structures. Our approach is also faster than Bayesian clustering algorithms by several orders of magnitude, and may be applicable to a wider range of datasets.

3,770 citations

Cites methods from "Analysis of Gene Diversity in Subdi..."

...FST refers to the mean pairwise FST computed using Nei’s estimator [62]....
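
For orientation, the gene-diversity measure of differentiation from the cited paper (Nei 1973) is commonly written as G_ST = (H_T - H_S) / H_T, where H_S is the mean within-subpopulation gene diversity and H_T is the gene diversity of the pooled allele frequencies. The Python sketch below illustrates that ratio for a single locus. It is not the code used by the citing paper: the function names are hypothetical, subpopulations are assumed to be equally weighted, the inputs are assumed to be known allele frequencies, and no correction for finite sample size is applied.

```python
from typing import Sequence


def gene_diversity(freqs: Sequence[float]) -> float:
    """Gene diversity (expected heterozygosity): 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def nei_gst(subpop_freqs: Sequence[Sequence[float]]) -> float:
    """G_ST = (H_T - H_S) / H_T for one locus.

    subpop_freqs[i][k] is the frequency of allele k in subpopulation i.
    Assumption: all subpopulations carry equal weight.
    """
    n_subpops = len(subpop_freqs)
    n_alleles = len(subpop_freqs[0])

    # H_S: average of the within-subpopulation gene diversities.
    h_s = sum(gene_diversity(f) for f in subpop_freqs) / n_subpops

    # H_T: gene diversity computed from the mean allele frequencies.
    mean_freqs = [
        sum(f[k] for f in subpop_freqs) / n_subpops for k in range(n_alleles)
    ]
    h_t = gene_diversity(mean_freqs)

    # A monomorphic locus has no diversity to partition; return 0 by convention here.
    if h_t == 0.0:
        return 0.0
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    # Two subpopulations, two alleles, with mirrored frequencies.
    print(nei_gst([[0.8, 0.2], [0.2, 0.8]]))
```

With mirrored frequencies of 0.8 and 0.2, H_S is 0.32 and H_T is 0.5, so the ratio is about 0.36. Averaging such values over all pairs of populations would give a mean pairwise figure of the kind the snippet above describes, although the citing paper's exact procedure is not shown here.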

Journal Article

A measure of population subdivision based on microsatellite allele frequencies.

Montgomery Slatkin¹

University of California, Berkeley¹

01 Jan 1995-Genetics

TL;DR: It was found that, under the generalized stepwise mutation model, R(ST) provides relatively unbiased estimates of migration rates and times of population divergence while F(ST) tends to show too much population similarity, particularly when migration rates are low or divergence times are long.

Abstract: A new measure of the extent of population subdivision as inferred from allele frequencies at microsatellite loci is proposed and tested with computer simulations. This measure, called R(ST), is analogous to Wright's F(ST) in representing the proportion of variation between populations. It differs in taking explicit account of the mutation process at microsatellite loci, for which a generalized stepwise mutation model appears appropriate. Simulations of subdivided populations were carried out to test the performance of R(ST) and F(ST). It was found that, under the generalized stepwise mutation model, R(ST) provides relatively unbiased estimates of migration rates and times of population divergence while F(ST) tends to show too much population similarity, particularly when migration rates are low or divergence times are long [corrected].

3,621 citations

References

Journal Article

Isolation by Distance.

Sewall Wright¹

University of Chicago¹

29 Mar 1943-Genetics

5,446 citations

Journal Article

Gene Differences between Caucasian, Negro, and Japanese Populations

Masatoshi Nei¹, Arun K. Roychoudhury¹

Brown University¹

04 Aug 1972-Science

TL;DR: The numbers of gene (codon) differences per locus between two randomly chosen genomes within and between Caucasian, Negro, and Japanese populations have been estimated from gene frequency data for protein loci.

Abstract: The numbers of gene (codon) differences per locus between two randomly chosen genomes within and between Caucasian, Negro, and Japanese populations have been estimated from gene frequency data for protein loci. The estimated number of gene differences between individuals from different populations is only slightly greater than the number between individuals from the same population.

115 citations
