Journal Article

Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data.

About: This article was published in Genomics on 1992-05-30 and is currently open access. It has received 12252 citations to date. The article focuses on the topics: Human mitochondrial genetics & DNA.
Citations
Journal ArticleDOI
TL;DR: Genalex is a user-friendly cross-platform package that runs within Microsoft Excel, enabling population genetic analyses of codominant, haploid and binary data.
Abstract: genalex is a user-friendly cross-platform package that runs within Microsoft Excel, enabling population genetic analyses of codominant, haploid and binary data. Allele frequency-based analyses include heterozygosity, F statistics, Nei's genetic distance, population assignment, probabilities of identity and pairwise relatedness. Distance-based calculations include amova, principal coordinates analysis (PCA), Mantel tests, multivariate and 2D spatial autocorrelation and twogener. More than 20 different graphs summarize data and aid exploration. Sequence and genotype data can be imported from automated sequencers, and exported to other software. Initially designed as a tool for teaching, genalex 6 now offers features for researchers as well. Documentation and the program are available at http://www.anu.edu.au/BoZo/GenAlEx/

15,786 citations

Journal ArticleDOI
TL;DR: Arlequin ver 3.0 is a software package integrating several basic and advanced methods for population genetics data analysis, such as the computation of standard genetic diversity indices, the estimation of allele and haplotype frequencies, tests of departure from linkage equilibrium, departure from selective neutrality and demographic equilibrium, estimation of parameters from past population expansions, and thorough analyses of population subdivision under the AMOVA framework.
Abstract: Arlequin ver 3.0 is a software package integrating several basic and advanced methods for population genetics data analysis, like the computation of standard genetic diversity indices, the estimation of allele and haplotype frequencies, tests of departure from linkage equilibrium, departure from selective neutrality and demographic equilibrium, estimation of parameters from past population expansions, and thorough analyses of population subdivision under the AMOVA framework. Arlequin 3 introduces a completely new graphical interface written in C++, a more robust semantic analysis of input files, and two new methods: a Bayesian estimation of gametic phase from multi-locus genotypes, and an estimation of the parameters of an instantaneous spatial expansion from DNA sequence polymorphism. Arlequin can handle several data types like DNA sequences, microsatellite data, or standard multi-locus genotypes. A Windows version of the software is freely available on http://cmpg.unibe.ch/software/arlequin3.

14,271 citations

Journal ArticleDOI
TL;DR: This article proposes a non-parametric method for multivariate analysis of variance, based on sums of squared distances, as an alternative to the traditional multivariate analogues of ANOVA, whose assumptions are too stringent for most ecological multivariate data sets.
Abstract: Hypothesis-testing methods for multivariate data are needed to make rigorous probability statements about the effects of factors and their interactions in experiments. Analysis of variance is particularly powerful for the analysis of univariate data. The traditional multivariate analogues, however, are too stringent in their assumptions for most ecological multivariate data sets. Non-parametric methods, based on permutation tests, are preferable. This paper describes a new non-parametric method for multivariate analysis of variance, after McArdle and Anderson (in press). It is given here, with several applications in ecology, to provide an alternative and perhaps more intuitive formulation for ANOVA (based on sums of squared distances) to complement the description provided by McArdle and Anderson (in press) for the analysis of any linear model. It is an improvement on previous non-parametric methods because it allows a direct additive partitioning of variation for complex models. It does this while maintaining the flexibility and lack of formal assumptions of other non-parametric methods. The test-statistic is a multivariate analogue to Fisher's F-ratio and is calculated directly from any symmetric distance or dissimilarity matrix. P-values are then obtained using permutations. Some examples of the method are given for tests involving several factors, including factorial and hierarchical (nested) designs and tests of interactions.

12,328 citations
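
The abstract above describes a test statistic computed directly from a distance matrix, with P-values obtained by permutation. A minimal sketch of that idea for the simplest one-way design follows. The function names `pseudo_f` and `permutation_p_value`, the use of NumPy, and the toy data are assumptions made for illustration; this is not the authors' implementation and it does not cover the factorial or nested designs mentioned in the abstract.

```python
import numpy as np

def pseudo_f(dist, groups):
    """One-way pseudo-F ratio from a symmetric distance matrix (illustrative)."""
    dist = np.asarray(dist, dtype=float)
    groups = np.asarray(groups)
    n = dist.shape[0]
    labels = np.unique(groups)
    a = len(labels)
    d2 = dist ** 2
    # Total sum of squares: squared distances over all pairs, divided by N.
    ss_total = d2[np.triu_indices(n, k=1)].sum() / n
    # Within-group sum of squares: same quantity computed inside each group.
    ss_within = 0.0
    for g in labels:
        idx = np.flatnonzero(groups == g)
        sub = d2[np.ix_(idx, idx)]
        ss_within += sub[np.triu_indices(len(idx), k=1)].sum() / len(idx)
    ss_among = ss_total - ss_within
    return (ss_among / (a - 1)) / (ss_within / (n - a))

def permutation_p_value(dist, groups, n_perm=999, seed=0):
    """Permute group labels to get a P-value for the observed pseudo-F."""
    rng = np.random.default_rng(seed)
    groups = np.asarray(groups)
    observed = pseudo_f(dist, groups)
    count = 0
    for _ in range(n_perm):
        if pseudo_f(dist, rng.permutation(groups)) >= observed:
            count += 1
    # The observed labelling counts as one of the permutations.
    return observed, (count + 1) / (n_perm + 1)

# Toy univariate data (hypothetical): two well-separated groups of three.
x = np.array([0.0, 1.0, 2.0, 10.0, 11.0, 12.0])
dist = np.abs(x[:, None] - x[None, :])
groups = np.array(["A", "A", "A", "B", "B", "B"])
f_obs, p_val = permutation_p_value(dist, groups)
```

With Euclidean distances on a single variable, as in the toy data, the pseudo-F reduces to the classical ANOVA F-ratio, which gives a convenient sanity check.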

Journal ArticleDOI
TL;DR: Popart is an integrated software package that provides a comprehensive implementation of haplotype network methods, phylogeographic visualisation tools and standard statistical tests, together with publication‐ready figure production.
Abstract: Summary Haplotype networks are an intuitive method for visualising relationships between individual genotypes at the population level. Here, we present popart, an integrated software package that provides a comprehensive implementation of haplotype network methods, phylogeographic visualisation tools and standard statistical tests, together with publication-ready figure production. popart also provides a platform for the implementation and distribution of new network-based methods – we describe one such new method, integer neighbour-joining. The software is open source and freely available for all major operating systems.

3,634 citations

References
Journal ArticleDOI
TL;DR: The purpose of this discussion is to offer some unity to various estimation formulae and to point out that correlations of genes in structured populations, with which F-statistics are concerned, are expressed very conveniently with a set of parameters treated by Cockerham (1969, 1973).
Abstract: This journal frequently contains papers that report values of F-statistics estimated from genetic data collected from several populations. These parameters, FST, FIT, and FIS, were introduced by Wright (1951), and offer a convenient means of summarizing population structure. While there is some disagreement about the interpretation of the quantities, there is considerably more disagreement on the method of evaluating them. Different authors make different assumptions about sample sizes or numbers of populations and handle the difficulties of multiple alleles and unequal sample sizes in different ways. Wright himself, for example, did not consider the effects of finite sample size. The purpose of this discussion is to offer some unity to various estimation formulae and to point out that correlations of genes in structured populations, with which F-statistics are concerned, are expressed very conveniently with a set of parameters treated by Cockerham (1969, 1973). We start with the parameters and construct appropriate estimators for them, rather than beginning the discussion with various data functions. The extension of Cockerham's work to multiple alleles and loci will be made explicit, and the use of jackknife procedures for estimating variances will be advocated. All of this may be regarded as an extension of a recent treatment of estimating the coancestry coefficient to serve as a mea-

17,890 citations

Journal Article
TL;DR: The technic to be given below for imparting statistical validity to the procedures already in vogue can be viewed as a generalized form of regression with possible useful application to problems arising in quite different contexts.
Abstract: The problem of identifying subtle time-space clustering of disease, as may be occurring in leukemia, is described and reviewed. Published approaches, generally associated with studies of leukemia, not dependent on knowledge of the underlying population for their validity, are directed towards identifying clustering by establishing a relationship between the temporal and the spatial separations for the n(n - 1)/2 possible pairs which can be formed from the n observed cases of disease. Here it is proposed that statistical power can be improved by applying a reciprocal transform to these separations. While a permutational approach can give valid probability levels for any observed association, for reasons of practicability, it is suggested that the observed association be tested relative to its permutational variance. Formulas and computational procedures for doing so are given. While the distance measures between points represent symmetric relationships subject to mathematical and geometric regularities, the variance formula developed is appropriate for arbitrary relationships. Simplified procedures are given for the case of symmetric and skew-symmetric relationships. The general procedure is indicated as being potentially useful in other situations as, for example, the study of interpersonal relationships. Viewing the procedure as a regression approach, the possibility for extending it to nonlinear and multivariate situations is suggested. Other aspects of the problem and of the procedure developed are discussed. Similarly, pure temporal clustering can be identified by a study of incidence rates in periods of widespread epidemics. In point of fact, many epidemics of communicable diseases are somewhat local in nature and so these do actually constitute temporal-spatial clusters.
For leukemia and similar diseases in which cases seem to arise substantially at random rather than as clear-cut epidemics, it is necessary to devise sensitive and efficient procedures for detecting any nonrandom component of disease occurrence. Various ingenious procedures which statisticians have developed for the detection of disease clustering are reviewed here. These procedures can be generalized so as to increase their statistical validity and efficiency. The technic to be given below for imparting statistical validity to the procedures already in vogue can be viewed as a generalized form of regression with possible useful application to problems arising in quite different contexts.

11,408 citations
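
The abstract above relates two sets of pairwise separations (temporal and spatial) across the n(n - 1)/2 pairs of cases and notes that a permutational approach gives valid probability levels. A minimal sketch of such a permutation test follows. It uses the correlation between the two sets of pairwise values as the association measure, which is a commonly used standardisation and an assumption here. It is not the permutational-variance formula the abstract recommends for practicability. The name `mantel_permutation_test` and the toy coordinates are hypothetical.

```python
import numpy as np

def mantel_permutation_test(x, y, n_perm=999, seed=0):
    """Permutation test of association between two square distance matrices."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.shape[0]
    upper = np.triu_indices(n, k=1)  # the n(n - 1)/2 distinct pairs

    def association(a, b):
        # Pearson correlation between matching pairwise entries.
        return np.corrcoef(a[upper], b[upper])[0, 1]

    observed = association(x, y)
    rng = np.random.default_rng(seed)
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(n)
        # Permute rows and columns of x together, keeping it a valid matrix.
        if association(x[np.ix_(perm, perm)], y) >= observed:
            count += 1
    return observed, (count + 1) / (n_perm + 1)

# Toy data (hypothetical): five cases whose spatial and temporal
# separations are identical, so the association is perfect.
position = np.array([0.0, 1.0, 3.0, 7.0, 12.0])
space = np.abs(position[:, None] - position[None, :])
time = space.copy()
r_obs, p_val = mantel_permutation_test(space, time)
```

Permuting rows and columns jointly, rather than shuffling individual entries, is what preserves the dependence among pairs that share a case.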

Journal ArticleDOI
09 Apr 1981
TL;DR: The complete sequence of the 16,569-base pair human mitochondrial genome is presented and shows extreme economy in that the genes have none or only a few noncoding bases between them, and in many cases the termination codons are not coded in the DNA but are created post-transcriptionally by polyadenylation of the mRNAs.
Abstract: The complete sequence of the 16,569-base pair human mitochondrial genome is presented. The genes for the 12S and 16S rRNAs, 22 tRNAs, cytochrome c oxidase subunits I, II and III, ATPase subunit 6, cytochrome b and eight other predicted protein coding genes have been located. The sequence shows extreme economy in that the genes have none or only a few noncoding bases between them, and in many cases the termination codons are not coded in the DNA but are created post-transcriptionally by polyadenylation of the mRNAs.

8,783 citations

Journal ArticleDOI
TL;DR: A method is presented by which the gene diversity (heterozygosity) of a subdivided population can be analyzed into its components, i.e., the gene diversities within and between subpopulations.
Abstract: A method is presented by which the gene diversity (heterozygosity) of a subdivided population can be analyzed into its components, i.e., the gene diversities within and between subpopulations. This method is applicable to any population without regard to the number of alleles per locus, the pattern of evolutionary forces such as mutation, selection, and migration, and the reproductive method of the organism used. Measures of the absolute and relative magnitudes of gene differentiation among subpopulations are also proposed.

8,465 citations
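
The partition described in the abstract above, total gene diversity split into within- and between-subpopulation components, can be sketched as follows. The symbols H_S, H_T, D_ST and G_ST are the notation usually associated with this method and are stated here from general knowledge, not from the abstract. The sketch assumes known allele frequencies at a single locus and equal subpopulation weights, with no sample-size correction. The function name `gene_diversity_components` and the example frequencies are hypothetical.

```python
import numpy as np

def gene_diversity_components(freqs):
    """Partition gene diversity at one locus (illustrative sketch).

    freqs: array of shape (subpopulations, alleles); each row sums to 1.
    Assumes equal subpopulation weights and no sampling correction.
    """
    p = np.asarray(freqs, dtype=float)
    # Within-subpopulation diversity: mean of 1 - sum of squared frequencies.
    h_s = np.mean(1.0 - np.sum(p ** 2, axis=1))
    # Total diversity: computed from the average allele frequencies.
    p_bar = p.mean(axis=0)
    h_t = 1.0 - np.sum(p_bar ** 2)
    # Absolute differentiation among subpopulations.
    d_st = h_t - h_s
    # Relative differentiation; undefined when there is no diversity at all.
    g_st = d_st / h_t if h_t > 0 else float("nan")
    return h_s, h_t, d_st, g_st

# Hypothetical frequencies: two subpopulations, two alleles.
h_s, h_t, d_st, g_st = gene_diversity_components([[0.8, 0.2], [0.2, 0.8]])
```

Subpopulations fixed for different alleles give a relative differentiation of 1, and identical subpopulations give 0, which bounds the measure in this simple setting.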

Book
01 Jan 1987
TL;DR: The book covers the jackknife, the bootstrap and related resampling topics, including the delta method and the influence function, cross-validation, balanced repeated replications (half-sampling), random subsampling and nonparametric confidence intervals.
Abstract: The Jackknife Estimate of Bias; The Jackknife Estimate of Variance; Bias of the Jackknife Variance Estimate; The Bootstrap; The Infinitesimal Jackknife, the Delta Method and the Influence Function; Cross-Validation, Jackknife and Bootstrap; Balanced Repeated Replications (Half-Sampling); Random Subsampling; Nonparametric Confidence Intervals.

7,007 citations