Journal Article

DiveRsity: An R package for the estimation and exploration of population genetics parameters and their associated errors

01 Aug 2013-Methods in Ecology and Evolution (Wiley/Blackwell (10.1111))-Vol. 4, Iss: 8, pp 782-788
TL;DR: diveRsity is a new R package for calculating various diversity statistics, including common diversity partitioning statistics (θ, GST) and population differentiation statistics (DJost, G'ST, χ2 test for population heterogeneity), among others.
Abstract: We present a new R package, diveRsity, for the calculation of various diversity statistics, including common diversity partitioning statistics (θ, GST) and population differentiation statistics (DJost, G'ST, χ2 test for population heterogeneity), among others. The package calculates these estimators along with their respective bootstrapped confidence intervals at the locus, pairwise sample population and global levels. Various plotting tools are also provided for a visual evaluation of estimated values, allowing users to critically assess the validity and significance of statistical tests from a biological perspective. diveRsity has a set of unique features that provide an informed framework for assessing whether traditional F-statistics are valid for the inference of demography, with reference to specific marker types, particularly highly polymorphic microsatellite loci. However, the package can be readily used for other co-dominant marker types (e.g. allozymes, SNPs). Detailed examples of usage and descriptions of package capabilities are provided. The examples demonstrate useful strategies for the exploration of data and interpretation of results generated by diveRsity. Additional online resources for the package are also described, including a GUI web app version intended for those with more limited experience using R for statistical analysis.
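The bootstrapped confidence intervals mentioned in the abstract can be illustrated with a minimal percentile bootstrap. This is a sketch under stated assumptions, not diveRsity's implementation: it resamples diploid individuals with replacement within a single sample and uses plain gene diversity (1 - sum of squared allele frequencies) as the statistic. The function names, the genotype representation and the resampling scheme are all hypothetical choices made for illustration; the package's own resampling and bias corrections may differ.

```python
import random


def expected_heterozygosity(genotypes):
    """Gene diversity 1 - sum(p_i^2) from a list of diploid genotypes (allele pairs)."""
    counts = {}
    for a, b in genotypes:
        counts[a] = counts.get(a, 0) + 1
        counts[b] = counts.get(b, 0) + 1
    total = 2 * len(genotypes)
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def percentile_bootstrap(genotypes, statistic, n_boot=1000, alpha=0.05, seed=1):
    """Percentile confidence interval obtained by resampling individuals with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(genotypes)
    # One replicate = the statistic recomputed on n individuals drawn with replacement.
    reps = sorted(
        statistic([genotypes[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = reps[int((alpha / 2) * n_boot)]
    upper = reps[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Hypothetical single-locus sample: each tuple is one individual's two alleles.
sample = [(1, 2), (1, 1), (2, 3), (3, 3), (1, 3), (2, 2)]
print(expected_heterozygosity(sample))
print(percentile_bootstrap(sample, expected_heterozygosity, n_boot=200))
```

Any other estimator (GST, G'ST, DJost) could be passed as `statistic`, provided it accepts the resampled data in the same form.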
Citations
Journal Article
TL;DR: An approach to assess the impacts of global climate change on biodiversity that takes into account adaptive genetic variation and evolutionary potential is presented, showing that considering local climatic adaptations reduces range loss projections but increases the potential for competition between species.
Abstract: Local adaptations can determine the potential of populations to respond to environmental changes, yet adaptive genetic variation is commonly ignored in models forecasting species vulnerability and biogeographical shifts under future climate change. Here we integrate genomic and ecological modeling approaches to identify genetic adaptations associated with climate in two cryptic forest bats. We then incorporate this information directly into forecasts of range changes under future climate change and assessment of population persistence through the spread of climate-adaptive genetic variation (evolutionary rescue potential). Considering climate-adaptive potential reduced range loss projections, suggesting that failure to account for intraspecific variability can result in overestimation of future losses. On the other hand, range overlap between species was projected to increase, indicating that interspecific competition is likely to play an important role in limiting species' future ranges. We show that although evolutionary rescue is possible, it depends on a population's adaptive capacity and connectivity. Hence, we stress the importance of incorporating genomic data and landscape connectivity in climate change vulnerability assessments and conservation management.

248 citations

Journal Article
TL;DR: A new simpler and more efficient approach for understanding gene flow patterns is presented that allows the estimation of directional components of genetic divergence between pairs of populations at low computational effort, using any of the classical or modern measures of genetic differentiation.
Abstract: Understanding the population structure and patterns of gene flow within species is of fundamental importance to the study of evolution. In the fields of population and evolutionary genetics, measures of genetic differentiation are commonly used to gather this information. One potential caveat is that these measures assume gene flow to be symmetric. However, asymmetric gene flow is common in nature, especially in systems driven by physical processes such as wind or water currents. As information about levels of asymmetric gene flow among populations is essential for the correct interpretation of the distribution of contemporary genetic diversity within species, this should not be overlooked. To obtain information on asymmetric migration patterns from genetic data, complex models based on maximum-likelihood or Bayesian approaches generally need to be employed, often at great computational cost. Here, a new simpler and more efficient approach for understanding gene flow patterns is presented. This approach allows the estimation of directional components of genetic divergence between pairs of populations at low computational effort, using any of the classical or modern measures of genetic differentiation. These directional measures of genetic differentiation can further be used to calculate directional relative migration and to detect asymmetries in gene flow patterns. This can be done in a user-friendly web application called divMigrate-online introduced in this study. Using simulated data sets with known gene flow regimes, we demonstrate that the method is capable of resolving complex migration patterns under a range of study designs.

198 citations

Journal Article
TL;DR: divMigrate-online is a web application that estimates directional components of genetic divergence between pairs of populations at low computational effort, using any of the classical or modern measures of genetic differentiation. These directional measures can further be used to calculate directional relative migration and to detect asymmetries in gene flow patterns.
Abstract: Understanding the population structure and patterns of gene flow within species is of fundamental importance to the study of evolution. In the fields of population and evolutionary genetics, measures of genetic differentiation are commonly used to gather this information. One potential caveat is that these measures assume gene flow to be symmetric. However, asymmetric gene flow is common in nature, especially in systems driven by physical processes such as wind or water currents. Since information about levels of asymmetric gene flow among populations is essential for the correct interpretation of the distribution of contemporary genetic diversity within species, this should not be overlooked. To obtain information on asymmetric migration patterns from genetic data, complex models based on maximum likelihood or Bayesian approaches generally need to be employed, often at great computational cost. Here, a new simpler and more efficient approach for understanding gene flow patterns is presented. This approach allows the estimation of directional components of genetic divergence between pairs of populations at low computational effort, using any of the classical or modern measures of genetic differentiation. These directional measures of genetic differentiation can further be used to calculate directional relative migration and to detect asymmetries in gene flow patterns. This can be done in a user-friendly web application called divMigrate-online introduced in this paper. Using simulated data sets with known gene flow regimes, we demonstrate that the method is capable of resolving complex migration patterns under a range of study designs.

186 citations

Journal Article
TL;DR: This study uses double‐digest restriction‐associated DNA sequencing to recover thousands of single nucleotide polymorphisms (SNPs) for two physically isolated populations of Amphirrhox longifolia, a nonmodel plant species for which no reference genome is available.
Abstract: High-throughput DNA sequencing facilitates the analysis of large portions of the genome in nonmodel organisms, ensuring high accuracy of population genetic parameters. However, empirical studies evaluating the appropriate sample size for these kinds of studies are still scarce. In this study, we use double-digest restriction-associated DNA sequencing (ddRADseq) to recover thousands of single nucleotide polymorphisms (SNPs) for two physically isolated populations of Amphirrhox longifolia (Violaceae), a nonmodel plant species for which no reference genome is available. We used resampling techniques to construct simulated populations with a random subset of individuals and SNPs to determine how many individuals and biallelic markers should be sampled for accurate estimates of intra- and interpopulation genetic diversity. We identified 3646 and 4900 polymorphic SNPs for the two populations of A. longifolia, respectively. Our simulations show that, overall, a sample size greater than eight individuals has little impact on estimates of genetic diversity within A. longifolia populations, when 1000 SNPs or higher are used. Our results also show that even at a very small sample size (i.e. two individuals), accurate estimates of FST can be obtained with a large number of SNPs (≥1500). These results highlight the potential of high-throughput genomic sequencing approaches to address questions related to evolutionary biology in nonmodel organisms. Furthermore, our findings also provide insights into the optimization of sampling strategies in the era of population genomics.

186 citations


Cites background or methods from "DiveRsity: An R package for the est..."

  • ...This analysis does not require standardization for markers with few allelic states (e.g. SNPs; Meirmans & Hedrick 2011); therefore, we do not report results for any ‘corrected’ FST analogues such as G’ST or Jost’s D (Keenan et al. 2013)....

    [...]

  • ...We then estimated population genetic differentiation by calculating FST (Weir & Cockerham 1984) using the R package DIVERSITY (Keenan et al. 2013)....

    [...]


Journal Article
TL;DR: The R package strataG is introduced as a user-friendly population genetics toolkit that provides easy access to a suite of standard genetic summaries as well as the ability to rapidly manipulate stratified genetic data for custom analyses.
Abstract: We introduce the R package strataG as a user-friendly population genetics toolkit. strataG provides easy access to a suite of standard genetic summaries as well as the ability to rapidly manipulate stratified genetic data for custom analyses. Tests of population subdivision with most common measures of population subdivision (e.g., FST, GST, ΦST, χ2) can be conducted within a single function. The package also provides wrapper functions that allow users to configure and run popular external programs such as genepop, structure, and fastsimcoal from within R, and smoothly interface with the popular R packages adegenet and pegas. strataG is intended to be an open-source dynamic package that will grow with future needs and user input.

182 citations

References
Journal Article
TL;DR: The purpose of this discussion is to offer some unity to various estimation formulae and to point out that correlations of genes in structured populations, with which F-statistics are concerned, are expressed very conveniently with a set of parameters treated by Cockerham (1969, 1973).
Abstract: This journal frequently contains papers that report values of F-statistics estimated from genetic data collected from several populations. These parameters, FST, FIT, and FIS, were introduced by Wright (1951), and offer a convenient means of summarizing population structure. While there is some disagreement about the interpretation of the quantities, there is considerably more disagreement on the method of evaluating them. Different authors make different assumptions about sample sizes or numbers of populations and handle the difficulties of multiple alleles and unequal sample sizes in different ways. Wright himself, for example, did not consider the effects of finite sample size. The purpose of this discussion is to offer some unity to various estimation formulae and to point out that correlations of genes in structured populations, with which F-statistics are concerned, are expressed very conveniently with a set of parameters treated by Cockerham (1969, 1973). We start with the parameters and construct appropriate estimators for them, rather than beginning the discussion with various data functions. The extension of Cockerham's work to multiple alleles and loci will be made explicit, and the use of jackknife procedures for estimating variances will be advocated. All of this may be regarded as an extension of a recent treatment of estimating the coancestry coefficient to serve as a mea-

17,890 citations


"DiveRsity: An R package for the est..." refers methods in this paper

  • ...This R package allows the estimation of various population genetic summary statistics including the two ‘traditional’ F-statistics analogues; θ (Weir & Cockerham 1984) and GST (Nei & Chesser 1983), and the two ‘new’ differentiation statistics; G'ST (Hedrick 2005) and DJost (Jost 2008), as well as…...

    [...]

Journal Article
TL;DR: GST and its relatives are often interpreted as measures of differentiation between subpopulations, with values near zero supposedly indicating low differentiation, but GST necessarily approaches zero when gene diversity is high, and it is not monotonic with increasing differentiation.
Abstract: G(ST) and its relatives are often interpreted as measures of differentiation between subpopulations, with values near zero supposedly indicating low differentiation. However, G(ST) necessarily approaches zero when gene diversity is high, even if subpopulations are completely differentiated, and it is not monotonic with increasing differentiation. Likewise, when diversity is equated with heterozygosity, standard similarity measures formed by taking the ratio of mean within-subpopulation diversity to total diversity necessarily approach unity when diversity is high, even if the subpopulations are completely dissimilar (no shared alleles). None of these measures can be interpreted as measures of differentiation or similarity. The derivations of these measures contain two subtle misconceptions which cause their paradoxical behaviours. Conclusions about population differentiation, gene flow, relatedness, and conservation priority will often be wrong when based on these fixation indices or similarity measures. These are not statistical issues; the problems persist even when true population frequencies are used in the calculations. Recent advances in the mathematics of diversity identify the misconceptions, and yield mathematically consistent descriptive measures of population structure which eliminate the paradoxes produced by standard measures. These measures can be directly related to the migration and mutation rates of the finite-island model.

2,262 citations
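The claim in the abstract above, that GST approaches zero at high gene diversity even when subpopulations share no alleles, can be checked with a small worked example. The sketch below uses the parametric definitions for a single locus and equally weighted subpopulations (HS, HT, GST = (HT - HS)/HT, Hedrick's G'ST = GST(n - 1 + HS)/((n - 1)(1 - HS)) and Jost's D = ((HT - HS)/(1 - HS)) · n/(n - 1)) applied to true allele frequencies. It does not use the sample-size-corrected estimators that packages such as diveRsity report, and the function names are hypothetical.

```python
def differentiation(freqs):
    """Single-locus HS, HT, GST, G'ST and Jost's D from true allele frequencies.

    freqs: one allele-frequency list per subpopulation, all in the same allele order.
    Subpopulations are weighted equally.
    """
    n = len(freqs)
    n_alleles = len(freqs[0])
    # Mean within-subpopulation gene diversity.
    hs = sum(1.0 - sum(p * p for p in pop) for pop in freqs) / n
    # Gene diversity of the pooled (mean) allele frequencies.
    mean_p = [sum(pop[i] for pop in freqs) / n for i in range(n_alleles)]
    ht = 1.0 - sum(p * p for p in mean_p)
    gst = (ht - hs) / ht
    g_prime_st = gst * (n - 1 + hs) / ((n - 1) * (1.0 - hs))
    d_jost = (ht - hs) / (1.0 - hs) * n / (n - 1)
    return {"HS": hs, "HT": ht, "GST": gst, "G'ST": g_prime_st, "D": d_jost}


def disjoint_pair(m):
    """Two subpopulations with no shared alleles, each with m equally frequent alleles."""
    a = [1.0 / m] * m + [0.0] * m
    b = [0.0] * m + [1.0 / m] * m
    return [a, b]


for m in (1, 2, 10, 50):
    print(m, differentiation(disjoint_pair(m)))
```

For this construction GST = 1/(2m - 1), so with ten alleles per subpopulation GST is 1/19 (about 0.053) although the subpopulations are completely differentiated, whereas G'ST and D stay at 1 for every m.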


"DiveRsity: An R package for the est..." refers background or methods in this paper

  • ...F-statistics estimators (e.g. θ, GST) suffer from an incompatibility when applied to highly polymorphic microsatellite markers (Hedrick 1999; Jost 2008), as a result of their negative dependence on within subpopulation heterozygosity (Jost 2008)....

    [...]

  • ...Attempts have been made to overcome this issue, most notably by Hedrick (2005), with the development of G'ST and more recently, Jost (2008) with the development of DJost....

    [...]

  • ...It is not the purpose of this study to elaborate on such issues; however, interested readers are encouraged to see Jost (2008), Meirmans & Hedrick (2011) and Whitlock (2011) for useful reviews....

    [...]


  • ...This R package allows the estimation of various population genetic summary statistics including the two ‘traditional’ F-statistics analogues; θ (Weir & Cockerham 1984) and GST (Nei & Chesser 1983), and the two ‘new’ differentiation statistics; G'ST (Hedrick 2005) and DJost (Jost 2008), as well as their unbiased/nearly unbiased estimators....

    [...]

Journal Article
TL;DR: The BIC provides an approximation to a Bayesian hypothesis test, does not require the specification of priors, and can be easily calculated from SPSS output.
Abstract: In the field of psychology, the practice of p value null-hypothesis testing is as widespread as ever. Despite this popularity, or perhaps because of it, most psychologists are not aware of the statistical peculiarities of the p value procedure. In particular, p values are based on data that were never observed, and these hypothetical data are themselves influenced by subjective intentions. Moreover, p values do not quantify statistical evidence. This article reviews these p value problems and illustrates each problem with concrete examples. The three problems are familiar to statisticians but may be new to psychologists. A practical solution to these p value problems is to adopt a model selection perspective and use the Bayesian information criterion (BIC) for statistical inference (Raftery, 1995). The BIC provides an approximation to a Bayesian hypothesis test, does not require the specification of priors, and can be easily calculated from SPSS output.

1,887 citations
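The BIC-based approximation to a Bayesian hypothesis test described in the abstract above can be sketched in a few lines. The example assumes the standard definition BIC = -2 ln L + k ln n and the usual approximation BF01 ≈ exp((BIC(H1) - BIC(H0)) / 2), with equal prior probability on the two hypotheses when converting to a posterior. The function names and the numeric inputs are illustrative only and are not taken from the article.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: -2 ln L + k ln n."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def bf01_from_bic(bic_h0, bic_h1):
    """Approximate Bayes factor in favour of H0 from the two BIC values."""
    return math.exp((bic_h1 - bic_h0) / 2.0)


def posterior_h0(bf01):
    """Posterior probability of H0, assuming H0 and H1 are equally likely a priori."""
    return bf01 / (1.0 + bf01)


# Hypothetical maximised log-likelihoods for a null model and a one-parameter-richer model.
bic_null = bic(log_likelihood=-120.0, n_params=1, n_obs=50)
bic_alt = bic(log_likelihood=-118.5, n_params=2, n_obs=50)
bf01 = bf01_from_bic(bic_null, bic_alt)
print(bf01, posterior_h0(bf01))
```

A BF01 above 1 favours the null model and a value below 1 favours the alternative; unlike a p value, the result can express support for either hypothesis.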


"DiveRsity: An R package for the est..." refers methods in this paper

  • ...This decision was taken given the lack of meaningful information conveyed through the use of P-values in this context, as well as the many misconceptions that exist regarding the biological interpretation of P-values in relation to these statistics (Wagenmakers 2007)....

    [...]

Journal Article
TL;DR: Two programs, GENOTYPE and GENODIVE, are presented for analyses of clonal diversity in asexually reproducing organisms; GENOTYPE can also be used for detecting genotyping errors in studies of sexual organisms.
Abstract: Investigating diversity in asexual organisms using molecular markers involves the assignment of individuals to clonal lineages and the subsequent analysis of clonal diversity. Assignment is possible using a distance matrix in combination with a user-specified threshold, defined as the maximum distance between two individuals that are considered to belong to the same clonal lineage. Analysis of clonal diversity requires tests for differences in diversity and clonal composition between populations. We developed two programs, GENOTYPE and GENODIVE, for such analyses of clonal diversity in asexually reproducing organisms. Additionally, GENOTYPE can be used for detecting genotyping errors in studies of sexual organisms.

1,846 citations

Trending Questions (1)
How can pairwise differences be used to estimate genetic diversity in loci?

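diveRsity estimates diversity partitioning statistics (θ, GST) and population differentiation statistics (DJost, G'ST, χ2 test for population heterogeneity) for each pair of sample populations, as well as per locus and globally, each with bootstrapped confidence intervals.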