Journal ArticleDOI

Whole-genome sequencing of multiple Arabidopsis thaliana populations

TL;DR: Describes the majority of common small-scale polymorphisms and many larger insertions and deletions in the A. thaliana pan-genome, along with their effects on gene function and the patterns of local and global linkage among these variants.
Abstract: The plant Arabidopsis thaliana occurs naturally in many different habitats throughout Eurasia. As a foundation for identifying genetic variation contributing to adaptation to diverse environments, a 1001 Genomes Project to sequence geographically diverse A. thaliana strains has been initiated. Here we present the first phase of this project, based on population-scale sequencing of 80 strains drawn from eight regions throughout the species' native range. We describe the majority of common small-scale polymorphisms as well as many larger insertions and deletions in the A. thaliana pan-genome, their effects on gene function, and the patterns of local and global linkage among these variants. The action of processes other than spontaneous mutation is identified by comparing the spectrum of mutations that have accumulated since A. thaliana diverged from its closest relative 10 million years ago with the spectrum observed in the laboratory. Recent species-wide selective sweeps are rare, and potentially deleterious mutations are more common in marginal populations.


Citations
Journal Article
Fumio Tajima
30 Oct 1989-Genomics
TL;DR: It is suggested that the natural selection against large insertion/deletion is so weak that a large amount of variation is maintained in a population.

11,521 citations

Journal ArticleDOI
TL;DR: The Ensembl Variant Effect Predictor can simplify and accelerate variant interpretation in a wide range of study designs.
Abstract: The Ensembl Variant Effect Predictor is a powerful toolset for the analysis, annotation, and prioritization of genomic variants in coding and non-coding regions. It provides access to an extensive collection of genomic annotation, with a variety of interfaces to suit different requirements, and simple options for configuring and extending analysis. It is open source, free to use, and supports full reproducibility of results. The Ensembl Variant Effect Predictor can simplify and accelerate variant interpretation in a wide range of study designs.

4,658 citations

Journal ArticleDOI
Klaus F. X. Mayer, Jane Rogers, Jaroslav Doležel, Curtis J. Pozniak, Kellye Eversole, Catherine Feuillet, Bikram S. Gill, Bernd Friebe, Adam J. Lukaszewski, Pierre Sourdille, Takashi R. Endo, M. Kubaláková, Jarmila Číhalíková, Zdeňka Dubská, Jan Vrána, Romana Šperková, Hana Šimková, Melanie Febrer, Leah Clissold, Kirsten McLay, Kuldeep Singh, Parveen Chhuneja, Nagendra K. Singh, Jitendra P. Khurana, Eduard Akhunov, Frédéric Choulet, Adriana Alberti, Valérie Barbe, Patrick Wincker, Hiroyuki Kanamori, Fuminori Kobayashi, Takeshi Itoh, Takashi Matsumoto, Hiroaki Sakai, Tsuyoshi Tanaka, Jianzhong Wu, Yasunari Ogihara, Hirokazu Handa, P. Ron Maclachlan, Andrew G. Sharpe, Darrin Klassen, David Edwards, Jacqueline Batley, Odd-Arne Olsen, Simen Rød Sandve, Sigbjørn Lien, Burkhard Steuernagel, Brande B. H. Wulff, Mario Caccamo, Sarah Ayling, Ricardo H. Ramirez-Gonzalez, Bernardo J. Clavijo, Jonathan M. Wright, Matthias Pfeifer, Manuel Spannagl, Mihaela Martis, Martin Mascher, Jarrod Chapman, Jesse Poland, Uwe Scholz, Kerrie Barry, Robbie Waugh, Daniel S. Rokhsar, Gary J. Muehlbauer, Nils Stein, Heidrun Gundlach, Matthias Zytnicki, Véronique Jamilloux, Hadi Quesneville, Thomas Wicker, Primetta Faccioli, Moreno Colaiacovo, Antonio Michele Stanca, Hikmet Budak, Luigi Cattivelli, Natasha Glover, Lise Pingault, Etienne Paux, Sapna Sharma, Rudi Appels, Matthew I. Bellgard, Brett Chapman, Thomas Nussbaumer, Kai Christian Bader, Hélène Rimbert, Shichen Wang, Ron Knox, Andrzej Kilian, Michael Alaux, Françoise Alfama, Loïc Couderc, Nicolas Guilhot, Claire Viseux, Mikaël Loaec, Beat Keller, Sébastien Praud
18 Jul 2014-Science
TL;DR: Insights into the genome biology of a polyploid crop provide a springboard for faster gene isolation, rapid genetic marker development, and precise breeding to meet the needs of increasing food demand worldwide.
Abstract: An ordered draft sequence of the 17-gigabase hexaploid bread wheat (Triticum aestivum) genome has been produced by sequencing isolated chromosome arms. We have annotated 124,201 gene loci distributed nearly evenly across the homeologous chromosomes and subgenomes. Comparative gene analysis of wheat subgenomes and extant diploid and tetraploid wheat relatives showed that high sequence similarity and structural conservation are retained, with limited gene loss, after polyploidization. However, across the genomes there was evidence of dynamic gene gain, loss, and duplication since the divergence of the wheat lineages. A high degree of transcriptional autonomy and no global dominance was found for the subgenomes. These insights into the genome biology of a polyploid crop provide a springboard for faster gene isolation, rapid genetic marker development, and precise breeding to meet the needs of increasing food demand worldwide.

1,421 citations

Journal ArticleDOI
TL;DR: The relevance of biological factors including effect size, sample size, genetic heterogeneity, genomic confounding, linkage disequilibrium and spurious association, and statistical tools to account for these are presented.
Abstract: Over the last 10 years, high-density SNP arrays and DNA re-sequencing have illuminated the majority of the genotypic space for a number of organisms, including humans, maize, rice and Arabidopsis. For any researcher willing to define and score a phenotype across many individuals, Genome Wide Association Studies (GWAS) present a powerful tool to reconnect this trait back to its underlying genetics. In this review we discuss the biological and statistical considerations that underpin a successful analysis or otherwise. The relevance of biological factors including effect size, sample size, genetic heterogeneity, genomic confounding, linkage disequilibrium and spurious association, and statistical tools to account for these are presented. GWAS can offer a valuable first insight into trait architecture or candidate loci for subsequent validation.

1,088 citations


Cites background from "Whole-genome sequencing of multiple..."

  • ...Recent whole-genome sequencing has revealed a much higher SNP density in Arabidopsis [44-46], with approximately 7 million SNPs within a worldwide sample....


Journal ArticleDOI
TL;DR: Genomic tools are now allowing genome-wide studies, and recent theoretical advances can help to design research strategies that combine genomics and field experiments to examine the genetics of local adaptation.
Abstract: It is increasingly important to improve our understanding of the genetic basis of local adaptation because of its relevance to climate change, crop and animal production, and conservation of genetic resources. Phenotypic patterns that are generated by spatially varying selection have long been observed, and both genetic mapping and field experiments provided initial insights into the genetic architecture of adaptive traits. Genomic tools are now allowing genome-wide studies, and recent theoretical advances can help to design research strategies that combine genomics and field experiments to examine the genetics of local adaptation. These advances are also allowing research in non-model species, the adaptation patterns of which may differ from those of traditional model species.

1,060 citations

References
Journal ArticleDOI
TL;DR: A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original.
Abstract: The BLAST programs are widely used tools for searching protein and DNA databases for sequence similarities. For protein comparisons, a variety of definitional, algorithmic and statistical refinements described here permits the execution time of the BLAST programs to be decreased substantially while enhancing their sensitivity to weak similarities. A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original. In addition, a method is introduced for automatically combining statistically significant alignments produced by BLAST into a position-specific score matrix, and searching the database using this matrix. The resulting Position-Specific Iterated BLAST (PSI-BLAST) program runs at approximately the same speed per iteration as gapped BLAST, but in many cases is much more sensitive to weak but biologically relevant sequence similarities. PSI-BLAST is used to uncover several new and interesting members of the BRCT superfamily.

70,111 citations


"Whole-genome sequencing of multiple..." refers methods in this paper

  • ...lyrata protein sequence was then used as query for a PSI-BLAS...


Journal ArticleDOI
TL;DR: Burrows-Wheeler Alignment tool (BWA) is implemented, a new read alignment package that is based on backward search with Burrows–Wheeler Transform (BWT), to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps.
Abstract: Motivation: The enormous amount of short reads generated by the new DNA sequencing technologies calls for the development of fast and accurate read alignment programs. A first generation of hash table-based methods has been developed, including MAQ, which is accurate, feature rich and fast enough to align short reads from a single individual. However, MAQ does not support gapped alignment for single-end reads, which makes it unsuitable for alignment of longer reads where indels may occur frequently. The speed of MAQ is also a concern when the alignment is scaled up to the resequencing of hundreds of individuals. Results: We implemented Burrows-Wheeler Alignment tool (BWA), a new read alignment package that is based on backward search with Burrows–Wheeler Transform (BWT), to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps. BWA supports both base space reads, e.g. from Illumina sequencing machines, and color space reads from AB SOLiD machines. Evaluations on both simulated and real data suggest that BWA is ~10–20× faster than MAQ, while achieving similar accuracy. In addition, BWA outputs alignment in the new standard SAM (Sequence Alignment/Map) format. Variant calling and other downstream analyses after the alignment can be achieved with the open source SAMtools software package. Availability: http://maq.sourceforge.net

43,862 citations
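
The backward search over a Burrows–Wheeler Transform mentioned in the BWA abstract above can be illustrated with a toy example. The following is a minimal sketch, not BWA's implementation: it handles exact matches only (no mismatches or gaps), builds the suffix array by naive sorting, and the names `bwt_index` and `count_matches` are hypothetical helpers introduced here for illustration.

```python
def bwt_index(text):
    """Build a toy BWT index (assumes '$' does not occur in text)."""
    text += "$"  # sentinel, sorts before A/C/G/T
    # Naive suffix array: fine for a demo, far too slow for a genome.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # C[c]: number of characters in text strictly smaller than c.
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += text.count(c)
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (ch == c)
    return C, occ, len(bwt)


def count_matches(pattern, C, occ, n):
    """Count exact occurrences of pattern via backward search."""
    lo, hi = 0, n  # half-open interval of suffix-array rows
    for ch in reversed(pattern):  # extend the match one base to the left
        if ch not in C:
            return 0
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


C, occ, n = bwt_index("GATTACAGATTACA")
print(count_matches("ATTA", C, occ, n))
```

Each loop iteration narrows the suffix-array interval using only the `C` and `occ` tables, so the query cost depends on the read length rather than the reference length, which is the property that makes this family of indexes attractive for short-read alignment.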

Journal ArticleDOI
01 Jun 2000-Genetics
TL;DR: Proposes a model-based clustering method for using multilocus genotype data to infer population structure and assign individuals to populations; it can be applied to most of the commonly used genetic markers, provided that they are not closely linked.
Abstract: We describe a model-based clustering method for using multilocus genotype data to infer population structure and assign individuals to populations. We assume a model in which there are K populations (where K may be unknown), each of which is characterized by a set of allele frequencies at each locus. Individuals in the sample are assigned (probabilistically) to populations, or jointly to two or more populations if their genotypes indicate that they are admixed. Our model does not assume a particular mutation process, and it can be applied to most of the commonly used genetic markers, provided that they are not closely linked. Applications of our method include demonstrating the presence of population structure, assigning individuals to populations, studying hybrid zones, and identifying migrants and admixed individuals. We show that the method can produce highly accurate assignments using modest numbers of loci (e.g., seven microsatellite loci in an example using genotype data from an endangered bird species). The software used for this article is available from http://www.stats.ox.ac.uk/~pritch/home.html.

27,454 citations

Journal ArticleDOI
Fumio Tajima
01 Nov 1989-Genetics
TL;DR: The relationship between the two estimates of genetic variation at the DNA level, namely the number of segregating sites and the average number of nucleotide differences estimated from pairwise comparison, is investigated in this article.
Abstract: The relationship between the two estimates of genetic variation at the DNA level, namely the number of segregating sites and the average number of nucleotide differences estimated from pairwise comparison, is investigated. It is found that the correlation between these two estimates is large when the sample size is small, and decreases slowly as the sample size increases. Using the relationship obtained, a statistical method for testing the neutral mutation hypothesis is developed. This method needs only the data of DNA polymorphism, namely the genetic variation within population at the DNA level. A simple method of computer simulation, that was used in order to obtain the distribution of a new statistic developed, is also presented. Applying this statistical method to the five regions of DNA sequences in Drosophila melanogaster, it is found that large insertion/deletion (greater than 100 bp) is deleterious. It is suggested that the natural selection against large insertion/deletion is so weak that a large amount of variation is maintained in a population.

11,417 citations
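
The test described in the abstract above compares two estimates of genetic variation: the number of segregating sites S and the average number of pairwise nucleotide differences π. A minimal sketch of the resulting statistic, commonly called Tajima's D, follows. It assumes aligned, gap-free haploid sequences of equal length, and the function name `tajimas_d` is a hypothetical helper introduced here; the constants follow the standard published form of the statistic.

```python
from collections import Counter
from math import sqrt


def tajimas_d(seqs):
    """Return (S, pi, D) for aligned, equal-length sequences.

    D is None when there are no segregating sites.
    """
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least 4 sequences (variance is 0 below that)")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to equal length")

    seg_sites = 0
    pair_diffs = 0  # summed over all pairs and all columns
    for column in zip(*seqs):
        counts = Counter(column)
        if len(counts) > 1:
            seg_sites += 1
            # Pairs that differ at this column.
            pair_diffs += (n * n - sum(c * c for c in counts.values())) // 2
    pi = pair_diffs / (n * (n - 1) / 2)
    if seg_sites == 0:
        return 0, 0.0, None

    # Constants that depend only on the sample size n.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    d = (pi - seg_sites / a1) / sqrt(variance)
    return seg_sites, pi, d


S, pi, D = tajimas_d(["AAAA", "AAAT", "AATT", "ATTT"])
print(S, round(pi, 3), round(D, 3))
```

Under neutrality the two estimates are expected to agree, so D is near zero; an excess of rare variants pushes D negative and an excess of intermediate-frequency variants pushes it positive.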
