Showing papers in "Molecular Ecology Resources in 2010"


Journal ArticleDOI
TL;DR: The main innovations of the new version of the Arlequin program include enhanced outputs in XML format, the possibility to embed graphics displaying computation results directly into output files, and the implementation of a new method to detect loci under selection from genome scans.
Abstract: We present here a new version of the Arlequin program available under three different forms: a Windows graphical version (Winarl35), a console version of Arlequin (arlecore), and a specific console version to compute summary statistics (arlsumstat). The command-line versions run under both Linux and Windows. The main innovations of the new version include enhanced outputs in XML format, the possibility to embed graphics displaying computation results directly into output files, and the implementation of a new method to detect loci under selection from genome scans. Command-line versions are designed to handle large series of files, and arlsumstat can be used to generate summary statistics from simulated data sets within an Approximate Bayesian Computation framework.

13,581 citations


Journal ArticleDOI
TL;DR: In this paper, the authors describe colony, a computer program implementing full-pedigree likelihood methods to simultaneously infer sibship and parentage among individuals using multilocus genotype data.
Abstract: Pedigrees, depicting genealogical relationships between individuals, are important in several research areas. Molecular markers allow inference of pedigrees in wild species where relationship information is impossible to collect by observation. Marker data are analysed statistically using methods based on Mendelian inheritance rules. There are numerous computer programs available to conduct pedigree analysis, but most software is inflexible, both in terms of assumptions and data requirements. Most methods only accommodate monogamous diploid species using codominant markers without genotyping error. In addition, most commonly used methods use pairwise comparisons rather than a full-pedigree likelihood approach, which considers the likelihood of the entire pedigree structure and allows the simultaneous inference of parentage and sibship. Here, we describe colony, a computer program implementing full-pedigree likelihood methods to simultaneously infer sibship and parentage among individuals using multilocus genotype data. colony can be used for both diploid and haplodiploid species; it can use dominant and codominant markers, and can accommodate, and estimate, genotyping error at each locus. In addition, colony can carry out these inferences for both monoecious and dioecious species. The program is available as a Microsoft Windows version, which includes a graphical user interface, and a Macintosh version, which uses an R-based interface.

1,472 citations


Journal ArticleDOI
TL;DR: Software for the measurement of genetic diversity (SMOGD) is a web‐based application for the calculation of the recently proposed genetic diversity indices G′ST and Dest and generates genetic distance matrices from pairwise comparisons between populations.
Abstract: Software for the measurement of genetic diversity (SMOGD) is a web-based application for the calculation of the recently proposed genetic diversity indices G′ST and Dest. SMOGD includes bootstrapping functionality for estimating the variance, standard error and confidence intervals of estimated parameters, and SMOGD also generates genetic distance matrices from pairwise comparisons between populations. SMOGD accepts standard, multilocus Genepop and Arlequin formatted input files and produces HTML and tab-delimited output. This allows easy data submission, quick visualization, and rapid import of results into spreadsheet or database programs.

631 citations
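
The two indices named in the SMOGD entry above can be illustrated at a single locus. The sketch below assumes known population allele frequencies and equal population weights; it uses the plain parametric forms of Hedrick's G′ST and Jost's D, not the bootstrapped sample estimators (such as Dest) that SMOGD reports. The function names are hypothetical and are not part of SMOGD.

```python
def heterozygosity(freqs):
    """Expected heterozygosity, 1 - sum(p_i^2), for one allele-frequency vector."""
    return 1.0 - sum(p * p for p in freqs)


def differentiation(pop_freqs):
    """Return (G_ST, G'_ST, D) for one locus.

    pop_freqs: one allele-frequency list per population (at least two
    populations), all lists in the same allele order.
    Assumption: populations are weighted equally and frequencies are known
    without sampling error.
    """
    k = len(pop_freqs)
    n_alleles = len(pop_freqs[0])
    # Mean within-population heterozygosity.
    h_s = sum(heterozygosity(f) for f in pop_freqs) / k
    # Heterozygosity of the pooled (mean) allele frequencies.
    pooled = [sum(f[a] for f in pop_freqs) / k for a in range(n_alleles)]
    h_t = heterozygosity(pooled)
    if h_t == 0.0:
        return 0.0, 0.0, 0.0  # monomorphic locus: no differentiation to measure
    g_st = (h_t - h_s) / h_t
    # Hedrick's standardization: G_ST divided by its maximum given H_S and k.
    g_st_prime = g_st * (k - 1 + h_s) / ((k - 1) * (1.0 - h_s))
    # Jost's D.
    d = ((h_t - h_s) / (1.0 - h_s)) * (k / (k - 1))
    return g_st, g_st_prime, d
```

For two populations that share no alleles but are each polymorphic, for example `differentiation([[0.5, 0.5, 0, 0], [0, 0, 0.5, 0.5]])`, G_ST is only 1/3 while both G′ST and D equal 1, which is the behaviour that motivated the newer indices.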


Journal ArticleDOI
TL;DR: Numerical simulations show that in tests of significance of the relationship between simple variables and multivariate data tables, the power of linear correlation, regression and canonical analysis is far greater than that of the Mantel test and derived forms, meaning that the former methods are much more likely than the latter to detect a relationship when one is present in the data.
Abstract: The Mantel test is widely used to test the linear or monotonic independence of the elements in two distance matrices. It is one of the few appropriate tests when the hypothesis under study can only be formulated in terms of distances; this is often the case with genetic data. In particular, the Mantel test has been widely used to test for spatial relationship between genetic data and spatial layout of the sampling locations. We describe the domain of application of the Mantel test and derived forms. Formula development demonstrates that the sum-of-squares (SS) partitioned in Mantel tests and regression on distance matrices differs from the SS partitioned in linear correlation, regression and canonical analysis. Numerical simulations show that in tests of significance of the relationship between simple variables and multivariate data tables, the power of linear correlation, regression and canonical analysis is far greater than that of the Mantel test and derived forms, meaning that the former methods are much more likely than the latter to detect a relationship when one is present in the data. Examples of difference in power are given for the detection of spatial gradients. Furthermore, the Mantel test does not correctly estimate the proportion of the original data variation explained by spatial structures. The Mantel test should not be used as a general method for the investigation of linear relationships or spatial structures in univariate or multivariate data. Its use should be restricted to tests of hypotheses that can only be formulated in terms of distances.

622 citations
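
For readers unfamiliar with the procedure discussed in the entry above, a simple Mantel test can be sketched as follows. This is a minimal, standard-library illustration under stated assumptions (square symmetric distance matrices, Pearson correlation, a one-sided permutation test); the function names and defaults are hypothetical and it is not the code used in the paper's simulations.

```python
import random
from math import sqrt


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square matrix into a list."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel_test(dist_a, dist_b, permutations=999, seed=0):
    """Return (r, p) for a one-sided simple Mantel test.

    Rows and columns of dist_a are permuted together, which keeps the
    matrix a valid distance matrix while breaking its link to dist_b.
    """
    rng = random.Random(seed)
    n = len(dist_a)
    flat_b = upper_triangle(dist_b)
    observed = pearson(upper_triangle(dist_a), flat_b)
    hits = 1  # the observed statistic counts as one member of the reference set
    order = list(range(n))
    for _ in range(permutations):
        rng.shuffle(order)
        permuted = [[dist_a[order[i]][order[j]] for j in range(n)]
                    for i in range(n)]
        if pearson(upper_triangle(permuted), flat_b) >= observed:
            hits += 1
    return observed, hits / (permutations + 1)
```

Note that the statistic correlates the n(n-1)/2 pairwise distances rather than the n original observations, which is the source of the sum-of-squares difference the abstract describes.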


Journal ArticleDOI
TL;DR: A web tool called seqphase is presented that generates phase input files from fasta sequence alignments and converts phase output files back into fasta.
Abstract: The program phase is widely used for Bayesian inference of haplotypes from diploid genotypes; however, manually creating phase input files from sequence alignments is an error-prone and time-consuming process, especially when dealing with numerous variable sites and/or individuals. Here, a web tool called seqphase is presented that generates phase input files from fasta sequence alignments and converts phase output files back into fasta. During the production of the phase input file, several consistency checks are performed on the dataset, and suitable command-line options to be used for the actual phase data analysis are suggested. seqphase was written in perl and is freely accessible over the Internet at the address http://www.mnhn.fr/jfflot/seqphase.

474 citations


Journal ArticleDOI
TL;DR: Overall, it is found that parentage analysis is feasible and satisfying in most systems, and a simple roadmap is provided to help other scientists navigate the confusing topography of statistical techniques.
Abstract: The use of molecular techniques for parentage analysis has been a booming science for over a decade. The most important technological breakthrough was the introduction of microsatellite markers to molecular ecology, an advance that was accompanied by a proliferation and refinement of statistical techniques for the analysis of parentage data. Over the last several years, we have seen steady progress in a number of areas related to parentage analysis, and the prospects for successful studies continue to improve. Here, we provide an updated guide for scientists interested in embarking on parentage analysis in natural or artificial populations of organisms, with a particular focus on computer software packages that implement various methods of analysis. Our survey of the literature shows that there are a few established methods that perform extremely well in the analysis of most types of parentage studies. However, particular experimental designs or study systems can benefit from some of the less well-known computer packages available. Overall, we find that parentage analysis is feasible and satisfying in most systems, and we try to provide a simple roadmap to help other scientists navigate the confusing topography of statistical techniques.

448 citations


Journal ArticleDOI
TL;DR: It is shown that models without admixture are not robust to the inclusion of admixed individuals in the sample, thus providing an incorrect assessment of population genetic structure in many cases.
Abstract: This article reviews recent developments in Bayesian algorithms that explicitly include geographical information in the inference of population structure. Current models substantially differ in their prior distributions and background assumptions, falling into two broad categories: models with or without admixture. To aid users of this new generation of spatially explicit programs, we clarify the assumptions underlying the models, and we test these models in situations where their assumptions are not met. We show that models without admixture are not robust to the inclusion of admixed individuals in the sample, thus providing an incorrect assessment of population genetic structure in many cases. In contrast, admixture models are robust to an absence of admixture in the sample. We also give statistical and conceptual reasons why data should be explored using spatially explicit models that include admixture.

290 citations


Journal ArticleDOI
TL;DR: A new software package (introgress) provides functions for analysing introgression of genotypes between divergent, hybridizing lineages, including estimating genomic clines from multi‐locus genotype data and testing for deviations from neutral expectations.
Abstract: A new software package (introgress) provides functions for analysing introgression of genotypes between divergent, hybridizing lineages, including estimating genomic clines from multi-locus genotype data and testing for deviations from neutral expectations. The software works with co-dominant, dominant and haploid marker data, and does not require fixed allelic differences between parental populations for the sampled genetic markers. Permutation and parametric procedures generate neutral expectations for introgression and provide a basis for significance tests of observed genomic clines. The software also implements maximum likelihood estimates of hybrid index from genotypic data and a number of graphical analyses. The package is an extension of the R statistical software, is written in the R language and is freely available through the Comprehensive R Archive Network (CRAN; http://cran.r-project.org/). In this study, we describe introgress and demonstrate its use with a sample data set.

273 citations


Journal ArticleDOI
TL;DR: Methods for measuring and interpreting introgression at multiple loci in hybrid zones are reviewed, focusing on the problem of identifying loci that contribute to reproductive isolation, and future prospects for differential introgressive studies on a genomic scale are outlined.
Abstract: Hybrids between species provide information about the evolutionary processes involved in divergence. In addition to creating hybrids in the laboratory, biologists can take advantage of natural hybrid zones to understand the factors that shape gene flow between divergent lineages. In the early stages of speciation, most regions of the genome continue to flow freely between populations. Alternatively, the subset of the genome that confers reproductive barriers between nascent species is expected to reject introgression. Now enabled by advances in genomics, this perspective is motivating detailed comparisons of gene flow across genomic regions in hybrid zones. Here, I review methods for measuring and interpreting introgression at multiple loci in hybrid zones, focusing on the problem of identifying loci that contribute to reproductive isolation. Emerging patterns from multi-locus studies of hybrid zones are highlighted, including remarkable variance in introgression across the genome. Although existing methods have been useful, there is scope for development of new analytical approaches that better connect differential patterns of gene flow in hybrid zones with current knowledge of speciation mechanisms. I outline future prospects for differential introgression studies on a genomic scale.

207 citations


Journal ArticleDOI
TL;DR: High-throughput sequencing was used to obtain a large number of microsatellite loci from the venomous snake Agkistrodon contortrix, the copperhead; the approach was rapid, cost-effective and identified thousands of useful microsatellite loci in a previously unstudied species.
Abstract: Optimal integration of next-generation sequencing into mainstream research requires re-evaluation of how problems can be reasonably overcome and what questions can be asked. One potential application is the rapid acquisition of genomic information to identify microsatellite loci for evolutionary, population genetic and chromosome linkage mapping research on non-model and not previously sequenced organisms. Here, we report on results using high-throughput sequencing to obtain a large number of microsatellite loci from the venomous snake Agkistrodon contortrix, the copperhead. We used the 454 Genome Sequencer FLX next-generation sequencing platform to sample randomly 27 Mbp (128 773 reads) of the copperhead genome, thus sampling about 2% of the genome of this species. We identified microsatellite loci in 11.3% of all reads obtained, with 14 612 microsatellite loci identified in total, 4564 of which had flanking sequences suitable for polymerase chain reaction primer design. The random sequencing-based approach to identify microsatellites was rapid, cost-effective and identified thousands of useful microsatellite loci in a previously unstudied species.

206 citations
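
The entry above does not say how microsatellites were detected in the reads, so the following is only a generic illustration of scanning a read for tandem repeats. The thresholds (motifs of 2 to 6 bases, at least five tandem copies) and the function name are assumptions chosen for the example, not the paper's criteria.

```python
import re

# Assumption: a microsatellite is a 2-6 bp motif repeated at least 5 times in
# tandem. The lazy quantifier makes the regex report the shortest motif first.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{4,}")


def find_microsatellites(read):
    """Return (motif, repeat_count, start, end) for each tandem repeat in a read."""
    hits = []
    for match in SSR_PATTERN.finditer(read.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAAAA
        start, end = match.span()
        hits.append((motif, (end - start) // len(motif), start, end))
    return hits
```

A real screen would also check that enough non-repetitive sequence flanks each hit on both sides to place PCR primers, which is the filter that reduced 14 612 loci to 4564 in the abstract.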


Journal ArticleDOI
TL;DR: Advantages and disadvantages of past and recent technologies for SNP discovery and genotyping are discussed, and a variety of SNP discovery and genotyping studies in ecology and evolution are summarized; the most efficient approach to SNP discovery depends on the research questions that the markers are to resolve as well as the focal species.
Abstract: Single nucleotide polymorphisms (SNPs) have gained wide use in humans and model species and are becoming the marker of choice for applications in other species. Technology that was developed for work in model species may provide useful tools for SNP discovery and genotyping in non-model organisms. However, SNP discovery can be expensive, labour intensive, and introduce ascertainment bias. In addition, the most efficient approaches to SNP discovery will depend on the research questions that the markers are to resolve as well as the focal species. We discuss advantages and disadvantages of several past and recent technologies for SNP discovery and genotyping and summarize a variety of SNP discovery and genotyping studies in ecology and evolution.

Journal ArticleDOI
TL;DR: This model implements individual-based population modelling with Mendelian inheritance and k-allele mutation on a resistant landscape and simulates changes in population and genotypes through time as functions of individual-based movement, reproduction, mortality and dispersal on a continuous cost surface.
Abstract: Spatially explicit simulation of gene flow in complex landscapes is essential to explain observed population responses and provide a foundation for landscape genetics. To address this need, we wrote a spatially explicit, individual-based population genetics model (CDPOP). The model implements individual-based population modelling with Mendelian inheritance and k-allele mutation on a resistant landscape. The model simulates changes in population and genotypes through time as functions of individual-based movement, reproduction, mortality and dispersal on a continuous cost surface. This model will be a valuable tool for the study of landscape genetics by increasing our understanding about the effects of life history, vagility and differential models of landscape resistance on the genetic structure of populations in complex landscapes.

Journal ArticleDOI
TL;DR: The suitability of the P6 loop for analysis of samples containing degraded ancient DNA from a mixture of species is demonstrated by high‐throughput parallel pyrosequencing of permafrost‐preserved DNA and reconstruction of two plant communities from the last glacial period.
Abstract: Palaeoenvironments and former climates are typically inferred from pollen and macrofossil records. This approach is time-consuming and suffers from low taxonomic resolution and biased taxon sampling. Here, we test an alternative DNA-based approach utilizing the P6 loop in the chloroplast trnL (UAA) intron; a short (13–158 bp) and variable region with highly conserved flanking sequences. For taxonomic reference, a whole trnL intron sequence database was constructed from recently collected material of 842 species, representing all widespread and/or ecologically important taxa of the species-poor arctic flora. The P6 loop alone allowed identification of all families, most genera (>75%) and one-third of the species, thus providing much higher taxonomic resolution than pollen records. The suitability of the P6 loop for analysis of samples containing degraded ancient DNA from a mixture of species is demonstrated by high-throughput parallel pyrosequencing of permafrost-preserved DNA and reconstruction of two plant communities from the last glacial period. Our approach opens new possibilities for DNA-based assessment of ancient as well as modern biodiversity of many groups of organisms using environmental samples.

Journal ArticleDOI
Junghwa An, Arnaud Béchet, Åsa Berggren, Sarah K. Brown, Michael William Bruford, Qingui Cai, Anna Cassel-Lundhagen, Frank Cézilly, Song-Lin Chen, Wei Cheng, Sung Kyoung Choi, X. Y. Ding, Yong Fan, Kevin A. Feldheim, Z. Y. Feng, Vicki L. Friesen, Maria Gaillard, Juan A. Galaraza, Leonardo A. Gallo, K. N. Ganeshaiah, Julia Geraci, John G. Gibbons, William Stewart Grant, Zac Grauvogel, Susanne Gustafsson, Jeffrey Robert Guyon, L. Han, Daniel D. Heath, Sofia Hemmilä, J. Derek Hogan, B. W. Hou, Jernej Jakše, Branka Javornik, Peter Kaňuch, Kyung i.Kl Kim, Kyung Seok Kim, Sang Gyu Kim, Sang In Kim, Woo-Jin Kim, Yi Kyung Kim, Maren A. Klich, Brian R. Kreiser, Ye Seul Kwan, Athena Lam, Kelly Lasater, Martin Lascoux, Hang Lee, Yun Sun Lee, D. L. Li, Shao Jing Li, W. Y. Li, Xiaolin Liao, Zlatko Liber, Lin Lin, Shaoying Liu, Xin Hui Luo, Y. H. Ma, Yajun Ma, Paula Marchelli, Mi Sook Min, Maria Domenica Moccia, Kumara P. Mohana, Marcelle Moore, James A. Morris-Pocock, Han Chan Park, Monika Pfunder, Radosavljević Ivan, Gudasalamani Ravikanth, George K. Roderick, Antonis Rokas, Benjamin N. Sacks, Christopher A. Saski, Zlatko Šatović, Sean D. Schoville, Federico Sebastiani, Zhen Xia Sha, Eun Ha Shin, Carolina Soliani, N. Sreejayan, Zhengxin Sun, Yong Tao, Scott A. Taylor, William D. Templin, R. Uma Shaanker, Ramesh Vasudeva, Giovanni G. Vendramin, Ryan P. Walter, Gui Zhong Wang, Ke Jian Wang, Yi Wang, Rémi Wattier, Fuwen Wei, Alex Widmer, Stefan Woltmann, Yong Jin Won, Jing Wu, M. L. Xie, Gen-Bo Xu, Xiao Jun Xu, Hai Hui Ye, Xiangjiang Zhan, F. Zhang, J. Zhong
TL;DR: This article documents the addition of 411 microsatellite marker loci and 15 pairs of Single Nucleotide Polymorphism (SNP) sequencing primers to the Molecular Ecology Resources Database.
Abstract: This article documents the addition of 411 microsatellite marker loci and 15 pairs of Single Nucleotide Polymorphism (SNP) sequencing primers to the Molecular Ecology Resources Database. Loci were developed for the following species: Acanthopagrus schlegeli, Anopheles lesteri, Aspergillus clavatus, Aspergillus flavus, Aspergillus fumigatus, Aspergillus oryzae, Aspergillus terreus, Branchiostoma japonicum, Branchiostoma belcheri, Colias behrii, Coryphopterus personatus, Cynoglossus semilaevis, Dendrobium officinale, Dysoxylum malabaricum, Metrioptera roeselii, Myrmeciza exsul, Ochotona thibetana, Neosartorya fischeri, Nothofagus pumilio, Onychodactylus fischeri, Phoenicopterus roseus, Salvia officinalis L., Scylla paramamosain, Silene latifo, Sula sula, and Vulpes vulpes. These loci were cross-tested on the following species: Aspergillus giganteus, Colias pelidne, Colias interior, Colias meadii, Colias eurytheme, Coryphopterus lipernes, Coryphopterus glaucofrenum, Coryphopterus eidolon, Gnatholepis thompsoni, Elacatinus evelynae, Dendrobium loddigesii, Dendrobium devonianum, Dysoxylum binectariferum, Nothofagus antarctica, Nothofagus dombeyii, Nothofagus nervosa, Nothofagus obliqua, Sula nebouxii, and Sula variegata. This article also documents the addition of 39 sequencing primer pairs and 15 allele specific primers or probes for Paralithodes camtschaticus.

Journal ArticleDOI
TL;DR: The results on Pleistocene cave bear samples show that DNA yields are quantitatively comparable, and in fact even slightly better than with silica batch extraction, while at the same time the number of samples that can conveniently be processed in parallel increases and both bench time and costs decrease using this method.
Abstract: Genetic analyses using museum specimens and ancient DNA from fossil samples are becoming increasingly important in phylogenetic and especially population genetic studies. Recent progress in ancient DNA sequencing technologies has substantially increased DNA sequence yields and, in combination with barcoding methods, has enabled large-scale studies using any type of DNA. Moreover, more and more studies now use nuclear DNA sequences in addition to mitochondrial ones. Unfortunately, nuclear DNA is, due to its much lower copy number in living cells compared to mitochondrial DNA, much more difficult to obtain from low-quality samples. Therefore, a DNA extraction method that optimizes DNA yields from low-quality samples and at the same time allows processing many samples within a short time frame is immediately required. In fact, the major bottleneck in the analysis process using samples containing low amounts of degraded DNA now lies in the extraction of samples, as column-based methods using commercial kits are fast but have proven to give very low yields, while more efficient methods are generally very time-consuming. Here, we present a method that combines the high DNA yield of batch-based silica extraction with the time-efficiency of column-based methods. Our results on Pleistocene cave bear samples show that DNA yields are quantitatively comparable, and in fact even slightly better than with silica batch extraction, while at the same time the number of samples that can conveniently be processed in parallel increases and both bench time and costs decrease using this method. Thus, this method is suited for harvesting the power of high-throughput sequencing using the DNA preserved in the millions of paleontological and museum specimens.

Journal ArticleDOI
TL;DR: A new approach to create microsatellite primer sets that have high utility across a wide range of species was developed, and its success was demonstrated using birds.
Abstract: We have developed a new approach to create microsatellite primer sets that have high utility across a wide range of species. The success of this method was demonstrated using birds. We selected 35 avian EST microsatellite loci that had a high degree of sequence homology between the zebra finch Taeniopygia guttata and the chicken Gallus gallus and designed primer sets in which the primer bind sites were identical in both species. For 33 conserved primer sets, on average, 100% of loci amplified in each of 17 passerine species and 99% of loci in five non-passerine species. The genotyping of four individuals per species revealed that 24-76% (mean 48%) of loci were polymorphic in the passerines and 18-26% (mean 21%) in the non-passerines. When at least 17 individuals were genotyped per species for four Fringillidae finch species, 71-85% of loci were polymorphic, observed heterozygosity was above 0.50 for most loci and no locus deviated significantly from Hardy-Weinberg proportions. This new set of microsatellite markers is of higher cross-species utility than any set previously designed. The loci described are suitable for a range of applications that require polymorphic avian markers, including paternity and population studies. They will facilitate comparisons of bird genome organization, including genome mapping and studies of recombination, and allow comparisons of genetic variability between species whilst avoiding ascertainment bias. The costs and time to develop new loci can now be avoided for many applications in numerous species. Furthermore, our method can be readily used to develop microsatellite markers of high utility across other taxa.

Journal ArticleDOI
TL;DR: genhet is an R function which calculates the five most used estimates of individual heterozygosity, and can be applied to any diploid genotype dataset, without any limitation in the number of individuals, loci or alleles.
Abstract: genhet is an R function which calculates the five most used estimates of individual heterozygosity. The advantage of this program is that it can be applied to any diploid genotype dataset, without any limitation in the number of individuals, loci or alleles. Its detailed manual should allow people who have never used R before to make the function work quite easily. The program is freely available at http://www.aureliecoulon.net/research/ac-computer-programs.html.
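
The abstract does not list the five estimates, so as a hedged illustration the sketch below computes only the simplest measure of individual heterozygosity, the proportion of typed loci at which an individual is heterozygous. It is written in Python rather than R, the function name and data layout are hypothetical, and it does not reproduce genhet's interface.

```python
def proportion_heterozygous(genotype):
    """Proportion of typed loci that are heterozygous for one diploid individual.

    genotype: list of (allele_1, allele_2) tuples, one per locus; a locus with
    a missing allele is given as None and is excluded from the denominator.
    Returns None when no locus was typed.
    """
    typed = [(a, b) for a, b in genotype if a is not None and b is not None]
    if not typed:
        return None
    heterozygous = sum(1 for a, b in typed if a != b)
    return heterozygous / len(typed)
```

For an individual typed at three of four loci, `proportion_heterozygous([(120, 124), (98, 98), (None, None), (201, 205)])` gives 2/3. Measures that weight loci by allele frequency additionally need population-level frequencies, which this sketch does not compute.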

Journal ArticleDOI
TL;DR: This review addresses well-established typing methods such as Single Strand Conformation Polymorphism (SSCP), Denaturing Gradient Gel Electrophoresis (DGGE), Reference Strand Conformational Analysis (RSCA) and cloning of PCR products, as well as direct amplicon sequencing followed by the computational inference of alleles and next generation sequencing (NGS) technologies.
Abstract: Genes of the major histocompatibility complex (MHC) are considered a paradigm of adaptive evolution at the molecular level and as such are frequently investigated by evolutionary biologists and ecologists. Accurate genotyping is essential for understanding of the role that MHC variation plays in natural populations, but may be extremely challenging. Here, I discuss the DNA-based methods currently used for genotyping MHC in non-model vertebrates, as well as techniques likely to find widespread use in the future. I also highlight the aspects of MHC structure that are relevant for genotyping, and detail the challenges posed by the complex genomic organization and high sequence variation of MHC loci. Special emphasis is placed on designing appropriate PCR primers, accounting for artefacts and the problem of genotyping alleles from multiple, co-amplifying loci, a strategy which is frequently necessary due to the structure of the MHC. The suitability of typing techniques is compared in various research situations, strategies for efficient genotyping are discussed and areas of likely progress in future are identified. This review addresses the well established typing methods such as the Single Strand Conformation Polymorphism (SSCP), Denaturing Gradient Gel Electrophoresis (DGGE), Reference Strand Conformational Analysis (RSCA) and cloning of PCR products. In addition, it includes the intriguing possibility of direct amplicon sequencing followed by the computational inference of alleles and also next generation sequencing (NGS) technologies; the latter technique may, in the future, find widespread use in typing complex multilocus MHC systems.

Journal ArticleDOI
TL;DR: The results evidenced the usefulness of the DNA barcodes for cataloguing Cuban freshwater fish species and for identifying those groups that deserve further taxonomic attention.
Abstract: Despite ongoing efforts to protect species and ecosystems in Cuba, habitat degradation, overuse and introduction of alien species have posed serious challenges to native freshwater fish species. In spite of the accumulated knowledge on the systematics of this freshwater ichthyofauna, recent results suggested that we are far from having a complete picture of the Cuban freshwater fish diversity. It is estimated that 40% of freshwater Cuban fish are endemic; however, this number may be even higher. Partial sequences (652 bp) of the mitochondrial gene COI (cytochrome c oxidase subunit I) were used to barcode 126 individuals, representing 27 taxonomically recognized species in 17 genera and 10 families. Analysis was based on Kimura 2-parameter genetic distances, and for four genera a character-based analysis (population aggregation analysis) was also used. The mean conspecific, congeneric and confamiliar genetic distances were 0.6%, 9.1% and 20.2% respectively. Molecular species identification was in concordance with current taxonomical classification in 96.4% of cases, and based on the neighbour-joining trees, in all but one instance, members of a given genus clustered within the same clade. Within the genus Gambusia, genetic divergence analysis suggests that there may be at least four cryptic species. In contrast, low genetic divergence and a lack of diagnostic sites suggest that Rivulus insulaepinorum may be conspecific with Rivulus cylindraceus. Distance and character-based analysis were completely concordant, suggesting that they complement species identification. Overall, the results evidenced the usefulness of the DNA barcodes for cataloguing Cuban freshwater fish species and for identifying those groups that deserve further taxonomic attention.
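
The Kimura 2-parameter distance mentioned in the abstract above can be computed for a pair of aligned sequences as sketched below. With P the proportion of sites differing by a transition and Q the proportion differing by a transversion, the distance is d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)]. The function name and the handling of gaps and ambiguous bases (such sites are skipped) are assumptions made for this example, not details taken from the study.

```python
from math import log, sqrt

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # log() raises ValueError when the sequences are too divergent (saturation).
    return -0.5 * log((1.0 - 2.0 * p - q) * sqrt(1.0 - 2.0 * q))
```

A single transition in ten compared sites gives a distance of about 0.112, slightly above the raw difference of 0.1 because the model corrects for unseen multiple substitutions.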

Journal ArticleDOI
TL;DR: The r package pedantics implements tools to facilitate power and sensitivity analyses of pedigree-related studies of natural populations, including functions that permute pedigree data in various ways with the goal of mimicking patterns of pedigree errors and missingness.
Abstract: Analyses of pedigrees and pedigree-derived parameters (e.g. relatedness and fitness) provide some of the most informative types of studies in evolutionary biology. The r package pedantics implements tools to facilitate power and sensitivity analyses of pedigree-related studies of natural populations. Functions are available to permute pedigree data in various ways with the goal of mimicking patterns of pedigree errors and missingness that occur in studies of natural populations. Another set of functions simulates genetic and phenotypic data based on arbitrary pedigrees. Finally, functions are also available with which visual and numerical representations of pedigree structure can be generated.

Journal ArticleDOI
TL;DR: This review will mainly focus on the application of metabolomics in plant ecology and genetics and some of the most commonly used analytical metabolomic platforms are briefly discussed.
Abstract: Metabolomics is a fast developing field of comprehensive untargeted chemical analyses. It has many applications and can in principle be used on any organism without prior knowledge of the metabolome or genome. The amount of functional information that is acquired with metabolomics largely depends on whether a metabolome database has been developed for the focal species. Metabolomics is a level downstream from transcriptomics and proteomics and has been widely advertised as a functional genomics and systems biology tool. Indeed, it has been successfully applied to link phenotypes to genotypes in the model plant Arabidopsis thaliana. Metabolomics is also increasingly being used in ecology (ecological metabolomics) and environmental sciences (environmental metabolomics). In ecology, the technique has led to novel insights into the mechanisms of plant resistance to herbivores. Some of the most commonly used analytical metabolomic platforms are briefly discussed in this review, as well as their limitations. We will mainly focus on the application of metabolomics in plant ecology and genetics.

Journal ArticleDOI
TL;DR: Three widely used software programs for selecting markers informative for population assignment suffer from an upward bias, which is largest when screening many candidate loci from poorly differentiated populations.
Abstract: It is well known that statistical classification procedures should be assessed using data that are separate from those used to train the classifier. This principle is commonly overlooked when the classification procedure in question is population assignment using a set of genetic markers that were chosen specifically on the basis of their allele frequencies from amongst a larger number of candidate markers. This oversight leads to a systematic upward bias in the predicted accuracy of the chosen set of markers for population assignment. Three widely used software programs for selecting markers informative for population assignment suffer from this bias. The extent of this bias is documented through a small set of simulations. The relative effect of the bias is largest when screening many candidate loci from poorly differentiated populations. Simple unbiased methods are presented and their use encouraged.
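The bias described above can be illustrated with a small simulation. The following is a minimal sketch, not the paper's own simulation and not any of the three programs it mentions. It assumes haploid biallelic loci, two populations with identical allele frequencies (so the true assignment accuracy is 50%), and a simple hold-out split as the unbiased check. All function names and parameter values are hypothetical.

```python
import math
import random

def simulate(n_ind, freqs, rng):
    """Draw haploid 0/1 genotypes for n_ind individuals at each locus."""
    return [[1 if rng.random() < p else 0 for p in freqs] for _ in range(n_ind)]

def estimate_freqs(sample):
    """Allele-1 frequency per locus, with a small pseudocount to avoid log(0)."""
    n = len(sample)
    return [(sum(ind[l] for ind in sample) + 0.5) / (n + 1)
            for l in range(len(sample[0]))]

def log_lik(ind, freqs, loci):
    """Log-likelihood of one individual's genotype at the chosen loci."""
    return sum(math.log(freqs[l] if ind[l] else 1 - freqs[l]) for l in loci)

def accuracy(pop_a, pop_b, fa, fb, loci):
    """Fraction of individuals assigned to their true population (ties count as misses)."""
    hits = sum(log_lik(i, fa, loci) > log_lik(i, fb, loci) for i in pop_a)
    hits += sum(log_lik(i, fb, loci) > log_lik(i, fa, loci) for i in pop_b)
    return hits / (len(pop_a) + len(pop_b))

def one_replicate(rng, n_loci=200, n_ind=40, k=10):
    # Assumption: both populations share the SAME frequencies,
    # so any assignment accuracy above 50% is spurious.
    freqs = [rng.uniform(0.2, 0.8) for _ in range(n_loci)]
    train_a, train_b = simulate(n_ind, freqs, rng), simulate(n_ind, freqs, rng)
    test_a, test_b = simulate(n_ind, freqs, rng), simulate(n_ind, freqs, rng)
    fa, fb = estimate_freqs(train_a), estimate_freqs(train_b)
    # Pick the k loci with the largest estimated frequency difference.
    loci = sorted(range(n_loci), key=lambda l: abs(fa[l] - fb[l]), reverse=True)[:k]
    on_selection_data = accuracy(train_a, train_b, fa, fb, loci)
    on_held_out_data = accuracy(test_a, test_b, fa, fb, loci)
    return on_selection_data, on_held_out_data

rng = random.Random(42)
results = [one_replicate(rng) for _ in range(5)]
biased = sum(r[0] for r in results) / len(results)
holdout = sum(r[1] for r in results) / len(results)
print(f"accuracy on selection data: {biased:.2f}; on held-out data: {holdout:.2f}")
```

Because the two populations are identical by construction, an accuracy clearly above 0.5 on the selection data reflects the selection bias rather than real signal, while the held-out individuals should give a value near 0.5.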

Journal ArticleDOI
TL;DR: It is demonstrated that in this genus, deep genetic divisions expected on the basis of mtDNA barcoding are not always reflected in the nuclear genome, and the use of AFLP markers is advocated as a check when mtDNA barcoding gives unexpected results.
Abstract: Mimicry and extensive geographical subspecies polymorphism combine to make species in the ithomiine butterfly genus Mechanitis (Lepidoptera; Nymphalidae) difficult to determine. We use mitochondrial DNA (mtDNA) barcoding, nuclear sequences and amplified fragment length polymorphism (AFLP) genotyping to investigate species limits in this genus. Although earlier biosystematic studies based on morphology described only four species, mtDNA barcoding revealed eight well-differentiated haplogroups, suggesting the presence of four new putative ‘cryptic species’. However, AFLP markers supported only one of these four new ‘cryptic species’ as biologically meaningful. We demonstrate that in this genus, deep genetic divisions expected on the basis of mtDNA barcoding are not always reflected in the nuclear genome, and advocate the use of AFLP markers as a check when mtDNA barcoding gives unexpected results.

Journal ArticleDOI
TL;DR: Rhh is an extension package for the statistical software r that estimates the heterozygosity-heterozygosity correlation and calculates three measures of individual multilocus heterozygosity: homozygosity by loci, internal relatedness and standardized heterozygosity. It has a homepage at http://www.helsinki.fi/biosci/egru/research/software.
Abstract: Individual multilocus heterozygosity estimates based on a limited number of loci are expected to correlate only weakly with the inbreeding level of an individual. Before using multilocus heterozygosity estimates in studies of inbreeding, their ability to capture information on inbreeding in the given setting should be tested. A convenient method for this is to compute the heterozygosity-heterozygosity correlation, i.e. the mean correlation between multilocus heterozygosity estimates calculated from random samples of loci, which should be positive if multilocus heterozygosity carries a signature of inbreeding. Rhh is an extension package for the statistical software r that estimates this correlation and calculates three measures of individual multilocus heterozygosity: homozygosity by loci, internal relatedness and standardized heterozygosity. The extension package is available through the CRAN (http://cran.r-project.org) and has a homepage at http://www.helsinki.fi/biosci/egru/research/software.
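A heterozygosity-heterozygosity correlation of the kind described can be sketched as follows. This is an illustrative Python outline, not the Rhh package's code or interface. It assumes the loci are split into two random halves in each replicate and uses the plain proportion of heterozygous loci as the heterozygosity measure, rather than the three measures named above. All names and the toy data are hypothetical.

```python
import random
from math import sqrt

def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def het_het_correlation(het, n_reps=200, seed=1):
    """Mean correlation between heterozygosity computed on two random halves of the loci.

    het[i][l] is 1 if individual i is heterozygous at locus l, else 0.
    """
    rng = random.Random(seed)
    loci = list(range(len(het[0])))
    half = len(loci) // 2
    cors = []
    for _ in range(n_reps):
        rng.shuffle(loci)
        first, second = loci[:half], loci[half:]
        h1 = [sum(ind[l] for l in first) / len(first) for ind in het]
        h2 = [sum(ind[l] for l in second) / len(second) for ind in het]
        cors.append(pearson(h1, h2))
    return sum(cors) / len(cors)

# Toy data (assumption): 25 outbred individuals and 25 inbred individuals,
# the latter with a lower per-locus probability of being heterozygous.
rng = random.Random(3)
het = [[1 if rng.random() < p_het else 0 for _ in range(20)]
       for p_het in [0.6] * 25 + [0.2] * 25]
print(round(het_het_correlation(het), 3))
```

In the toy data, individuals differ in their underlying heterozygosity, so the two halves of the loci carry a shared signal and the mean correlation should be positive. If all individuals had the same expected heterozygosity, the value would be expected to sit near zero.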

Journal ArticleDOI
TL;DR: The taxonomic distribution of numts is investigated in 11 lineages of Orthoptera by analysing cloned COI sequences, and the effect of primer specificity on numt coamplification is tested in four lineages. Numts are coamplified in all 11 taxa with universal primers, which suggests that numts may be widespread in other taxonomic groups as well.
Abstract: DNA barcoding is a diagnostic method of species identification based on sequencing a short mitochondrial DNA fragment of cytochrome oxidase I (COI), but its ability to correctly diagnose species is limited by the presence of nuclear mitochondrial pseudogenes (numts). Numts can be coamplified with the mitochondrial orthologue when using universal primers, which can lead to incorrect species identification and an overestimation of the number of species. Some researchers have proposed that using more specific primers may help eliminate numt coamplification, but the efficacy of this method has not been thoroughly tested. In this study, we investigate the taxonomic distribution of numts in 11 lineages within the insect order Orthoptera, by analysing cloned COI sequences and further test the effects of primer specificity on eliminating numt coamplification in four lineages. We find that numts are coamplified in all 11 taxa using universal (barcoding) primers, which suggests that numts may be widespread in other taxonomic groups as well. Increased primer specificity is only effective at reducing numt coamplification in some species tested, and only eliminates it in one species tested. Furthermore, we find that a number of numts do not have stop codons or indels, making it difficult to distinguish them from mitochondrial orthologues, thus putting the efficacy of barcoding quality control measures under question. Our findings suggest that numt coamplification is a serious problem for DNA barcoding and more quality control measures should be implemented to identify and eliminate numts prior to using mitochondrial barcodes for species diagnoses.
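The stop-codon and indel checks mentioned above can be sketched in a few lines. This is a hypothetical illustration, not the authors' pipeline. It assumes the invertebrate mitochondrial genetic code (NCBI translation table 5), in which TAA and TAG are the stop codons, and it treats a length difference that is not a multiple of three as a sign of a frameshifting indel. Function names and the example sequences are invented.

```python
# Stop codons under the invertebrate mitochondrial code (NCBI translation table 5).
MITO_STOPS = {"TAA", "TAG"}

def in_frame_stops(seq, frame=0):
    """Return the codon indices of in-frame stop codons in an ungapped reading of seq."""
    seq = seq.upper().replace("-", "")
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    return [i for i, codon in enumerate(codons) if codon in MITO_STOPS]

def looks_suspicious(seq, expected_len, frame=0):
    """Flag a sequence that has a frameshifting length difference or an in-frame stop."""
    ungapped = seq.replace("-", "")
    frameshift = (len(ungapped) - expected_len) % 3 != 0
    return frameshift or bool(in_frame_stops(seq, frame))

# Invented toy sequences, far shorter than a real barcode fragment.
print(in_frame_stops("ATGTAAGGA"))       # contains the stop codon TAA
print(in_frame_stops("ATGTGAGGA"))       # TGA is not a stop in this code
print(looks_suspicious("ATGGA", 6))      # one base short: frameshift
```

As the abstract points out, a number of numts have neither stop codons nor indels, so a sequence that passes a check like this one is not thereby shown to be the mitochondrial orthologue.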

Journal ArticleDOI
TL;DR: This work considers how natural stratifications influence the understanding of effective population size and how to estimate it, and what the consequences are for conservation and management of natural populations.
Abstract: The concept of effective population size (Ne) is based on an elegantly simple idea which, however, rapidly becomes very complex when applied to most real-world situations. In natural populations, spatial and temporal stratifications create different classes of individuals with different vital rates, and this in turn affects (generally reduces) Ne in complex ways. I consider how these natural stratifications influence our understanding of effective size and how to estimate it, and what the consequences are for conservation and management of natural populations. Important points that emerge include the following: 1. The relative influences of local vs metapopulation Ne depend on a variety of factors, including the time frame of interest. 2. Levels of diversity in local populations are strongly influenced by even low levels of migration, so these measures are not reliable indicators of local Ne. 3. For long-term effective size, obtaining a reliable estimate of mutation rate is the most important consideration; unless this is accomplished, estimates can be biased by orders of magnitude. 4. At least some estimators of contemporary Ne appear to be robust to relatively high (approximately 10%) equilibrium levels of migration, so under many realistic scenarios they might yield reliable estimates of local Ne. 5. Age structure probably has little effect on long-term estimators of Ne but can strongly influence contemporary estimates. 6. More research is needed in several key areas: (i) to disentangle effects of selection and drift in metapopulations connected by intermediate levels of migration; (ii) to elucidate the relationship between Nb (effective number of breeders per year) and Ne per generation in age-structured populations; (iii) to perform rigorous sensitivity analyses of new likelihood and coalescent-based methods for estimating demographic and evolutionary histories.

Journal ArticleDOI
TL;DR: A method for designing phylum‐specific hybrid primers is presented and used to develop COI primers for the Echinodermata; sequencing confirmed both the quality of the sequences (>500 bp, no pseudogenes) and their utility as a DNA barcode.
Abstract: Recent research has shown the usefulness of the Folmer region of the cytochrome oxidase I (COI) as a genetic barcode to assist in species delimitation of echinoderms. However, amplification of COI is often challenging in echinoderms (low success or pseudogenes). We present a method that allows the design of phylum-specific hybrid primers, and use this to develop COI primers for the Echinodermata. We aligned COI sequences from 310 echinoderm species and designed all possible primers along the consensus sequence with two methods (standard degenerate and hybrid). We found much lower degeneracy for hybrid primers (4-fold degeneracy) than for standard degenerate primers (≥48-fold degeneracy). We then designed the most conserved hybrid primers to amplify a >500-bp region within COI. These primers successfully amplified this gene region in all tested taxa (123 species across all echinoderm classes). Sequencing of 30 species among these confirmed both the quality of the sequences (>500 bp, no pseudogenes) and their utility as a DNA barcode. This method should be useful for developing primers for other mitochondrial genes and other phyla. The method will also be of interest for the development of future projects involving both community-based genetic assessments on macroorganisms and biodiversity assessment of environmental samples using high-throughput sequencing.
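To make the notion of "fold degeneracy" concrete, the sketch below counts how many distinct sequences a primer written with standard IUPAC ambiguity codes represents. It is only an illustration of the general definition: the abstract does not say how degeneracy is counted for the hybrid primers, and the example primer is invented rather than taken from the paper.

```python
# Number of bases each IUPAC nucleotide code can stand for.
IUPAC_CHOICES = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}

def degeneracy(primer):
    """Return the fold degeneracy: the number of distinct sequences the primer encodes."""
    fold = 1
    for base in primer.upper():
        fold *= IUPAC_CHOICES[base]  # KeyError on characters outside the IUPAC table
    return fold

# Hypothetical primer with three ambiguous positions: Y (2), N (4), R (2).
print(degeneracy("ACYGGNTGRAT"))  # 2 * 4 * 2 = 16
```

Degeneracy multiplies across positions, which is why a handful of ambiguous sites is enough to reach the 48-fold or higher values quoted for the standard degenerate primers.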

Journal ArticleDOI
TL;DR: The combination of two regions, ITS and trnH‐psbA, is the best choice for DNA identification of Alnus species, as an improvement and supplement for morphologically based taxonomy; the study also suggests a relatively reliable and open taxonomic system based on the linked DNA and morphological data.
Abstract: One nuclear and three chloroplast DNA regions (ITS, rbcL, matK and trnH-psbA) were used to identify the species of Alnus (Betulaceae). The results showed that 23 out of all 26 Alnus species in the world, represented by 131 samples, had their own specific molecular character states, especially for three morphologically confused species (Alnus formosana, Alnus japonica and Alnus maritima). The discriminating power of the four markers at the species level was 10% (rbcL), 31.25% (matK), 63.6% (trnH-psbA) and 76.9% (ITS). For ITS, the mean value of genetic distance between species was more than 10 times the intraspecific distance (0.009%), and 13 species had unique character states that differentiated them from other species of Alnus. The trnH-psbA region had higher mean values of genetic distance between and within species (2.1% and 0.68% respectively) than any other region tested. Using the trnH-psbA region, 13 species are distinguished from 22 species, and seven species have a single diagnostic site. The combination of two regions, ITS and trnH-psbA, is the best choice for DNA identification of Alnus species, as an improvement and supplement for morphologically based taxonomy. This study illustrates the potential for certain DNA regions to be used as a novel internet carrier of biological information by combining DNA sequences with existing morphological characters, and suggests a relatively reliable and open taxonomic system based on the linked DNA and morphological data.

Journal ArticleDOI
TL;DR: The method of extracting DNA from herbarium samples using a small amount of tissue is reliable and could be used for important historical specimens, and the efficiency of these methods illustrates that standardization and streamlining of sample processing should be shifted from the laboratory to the field.
Abstract: We present two methods for DNA extraction from fresh and dried mushrooms that are adaptable to high-throughput sequencing initiatives, such as DNA barcoding. Our results show that these protocols yield 85% sequencing success from recently collected materials. Tests with both recent ( 100 years) specimens reveal that older collections have low success rates and may be an inefficient resource for populating a barcode database. However, our method of extracting DNA from herbarium samples using a small amount of tissue is reliable and could be used for important historical specimens. The application of these protocols greatly reduces time, and therefore cost, of generating DNA sequences from mushrooms and other fungi vs. traditional extraction methods. The efficiency of these methods illustrates that standardization and streamlining of sample processing should be shifted from the laboratory to the field.

Journal ArticleDOI
TL;DR: A model‐based approach to estimate local population FST’s, based on the multinomial‐Dirichlet distribution (the so‐called F‐model), is reviewed to foster its use for studying the genetic structure of metapopulations, and the derivation of the Bayesian formulation is presented.
Abstract: We review a model-based approach to estimate local population FST's that is based on the multinomial-Dirichlet distribution, the so-called F-model. As opposed to the standard method of estimating a single FST value, this approach takes into account the fact that in most if not all realistic situations, local populations differ in their effective sizes and migration rates. Therefore, the use of this approach can help better describe the genetic structure of populations. Despite this obvious advantage, this method has remained largely underutilized by molecular ecologists. Thus, the objective of this review is to foster its use for studying the genetic structure of metapopulations. We present the derivation of the Bayesian formulation for the estimation of population-specific FST's based on the multinomial-Dirichlet distribution. We describe several recent applications of the F-model and present the results of a small simulation study that explains how the F-model can help better describe the genetic structure of populations.
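The multinomial-Dirichlet structure can be sketched as a small generative simulation. The abstract does not give the parameterisation, so the code below assumes the form commonly associated with the F-model: local allele frequencies in population j are drawn from a Dirichlet distribution with parameters theta_j * p, where p holds the ancestral (migrant-pool) frequencies and theta_j = 1/FST_j - 1, and sampled allele counts are multinomial given those local frequencies. This is not the review's own simulation study, and all function names and values are hypothetical.

```python
import random

def dirichlet(alphas, rng):
    """Draw from a Dirichlet distribution by normalising independent gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

def local_freqs(ancestral, fst, rng):
    """Local allele frequencies for one population with population-specific fst."""
    theta = 1.0 / fst - 1.0  # assumed link between FST and the Dirichlet scale
    return dirichlet([theta * p for p in ancestral], rng)

def sample_counts(freqs, n, rng):
    """Multinomial allele counts for n sampled gene copies."""
    counts = [0] * len(freqs)
    for allele in rng.choices(range(len(freqs)), weights=freqs, k=n):
        counts[allele] += 1
    return counts

def implied_fst(ancestral, fst, n_loci, rng):
    """Recover FST as Var(local frequency) / (p * (1 - p)) across simulated loci."""
    p = ancestral[0]
    x = [local_freqs(ancestral, fst, rng)[0] for _ in range(n_loci)]
    var = sum((v - p) ** 2 for v in x) / n_loci
    return var / (p * (1 - p))

rng = random.Random(7)
ancestral = [0.5, 0.5]
# Two local populations with different FST values drift by different amounts.
for fst in (0.02, 0.2):
    print(fst, round(implied_fst(ancestral, fst, 20000, rng), 3))
print(sample_counts(local_freqs(ancestral, 0.2, rng), 50, rng))
```

Under this assumed parameterisation the variance of a local allele frequency equals p(1 - p) multiplied by the population's FST, so the printed ratios should land close to the values fed in. That is one way to see why a single global FST hides differences between local populations.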