Paired-end RAD-seq for de novo assembly and marker design without available reference
TLDR
A method for de novo assembly of paired-end RAD-seq data that produces extended contigs flanking a restriction site; it was used to reconstruct one-tenth of the guppy genome as 200–500 bp contigs associated with EcoRI recognition sites.
Abstract:
Motivation: Next-generation sequencing technologies have facilitated the study of organisms on a genome-wide scale. A recent method called restriction site associated DNA sequencing (RAD-seq) allows sequence information to be sampled at reduced complexity across a target genome using the Illumina platform. Single-end RAD-seq has been shown to provide a large number of informative genetic markers in reference as well as non-reference organisms.
Results: Here, we present a method for de novo assembly of paired-end RAD-seq data that produces extended contigs flanking a restriction site. We were able to reconstruct one-tenth of the guppy genome, represented by 200–500 bp contigs associated with EcoRI recognition sites. In addition, these contigs were used as a reference, allowing the detection of thousands of new polymorphic markers that are informative for mapping and population genetic studies in the guppy.
Availability: A Perl and C++ implementation of the method demonstrated in this article is available at http://guppy.weigelworld.org/weigeldatabases/radMarkers/ as the package RApiD.
Contact: christine.dreyer@tuebingen.mpg.de
Supplementary Information: Supplementary data are available at Bioinformatics online.
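The idea behind paired-end RAD-seq assembly, as summarized in the abstract, can be illustrated with a minimal sketch. This is not the RApiD implementation. It assumes that barcodes have already been trimmed, that each first read begins with the EcoRI cut-site remainder `AATTC`, and that read pairs from the same locus share an identical first read (a real pipeline would tolerate sequencing errors and polymorphisms). The function name `group_mates_by_rad_tag` and the `min_pairs` parameter are hypothetical.

```python
from collections import defaultdict

# Assumption: after barcode trimming, first reads start with the part of the
# EcoRI recognition site (GAATTC) that remains after digestion.
ECORI_REMAINDER = "AATTC"


def group_mates_by_rad_tag(read_pairs, site=ECORI_REMAINDER, min_pairs=2):
    """Group second reads by the sequence of their site-anchored first read.

    read_pairs: iterable of (read1, read2) sequence strings.
    Returns a dict mapping each first-read sequence (the RAD tag) to the list
    of its mate reads, keeping only tags supported by at least min_pairs pairs.
    """
    loci = defaultdict(list)
    for read1, read2 in read_pairs:
        # Discard pairs whose first read does not start at the restriction site.
        if read1.startswith(site):
            loci[read1].append(read2)
    return {tag: mates for tag, mates in loci.items() if len(mates) >= min_pairs}


if __name__ == "__main__":
    pairs = [
        ("AATTCGGA", "TTTT"),
        ("AATTCGGA", "CCCC"),
        ("AATTCTTT", "GGGG"),  # only one pair, dropped with min_pairs=2
        ("GGGGAATT", "AAAA"),  # does not start at the cut site, dropped
    ]
    for tag, mates in group_mates_by_rad_tag(pairs).items():
        print(tag, mates)
```

Each resulting group of mate reads originates from sheared fragments of varying length next to one restriction site, so in a sketch like this every group would be handed to a local assembler to build one extended contig per site.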
Citations
Journal Article
Harnessing the power of RADseq for ecological and evolutionary genomics
TL;DR: This Review provides a comprehensive discussion of RADseq methods to aid researchers in choosing among the many different approaches and avoiding erroneous scientific conclusions from RADseq data, a problem that has plagued other genetic marker types in the past.
Journal Article
Restriction site-associated DNA sequencing, genotyping error estimation and de novo assembly optimization for population genetic inference
Alicia Mastretta-Yanes, Nils Arrigo, Nadir Alvarez, Tove H. Jorgensen, Daniel Piñero, Brent C. Emerson
TL;DR: Individual sample replicates are used, under the expectation of identical genotypes, to quantify genotyping error in the absence of a reference genome and optimize de novo assembly parameters within the program Stacks, by minimizing error and maximizing the retrieval of informative loci.
Journal Article
Special features of RAD Sequencing data: implications for genotyping.
John W. Davey, Timothee Cezard, Pablo Fuentes-Utrilla, Cathlene Eland, Karim Gharbi, Mark Blaxter
TL;DR: It is shown that there are several sources of bias specific to RAD‐Seq that are not explicitly addressed by current genotyping tools, namely restriction fragment bias, restriction site heterozygosity and PCR GC content bias, and that most RAD loci will be accurately genotyped by existing tools.
Journal Article
Sequence Capture versus Restriction Site Associated DNA Sequencing for Shallow Systematics.
TL;DR: It is argued that sequence capture should be given greater attention as a method of obtaining data for studies in shallow systematics and comparative phylogeography.
Journal Article
Genomic patterns of introgression in rainbow and westslope cutthroat trout illuminated by overlapping paired‐end RAD sequencing
Paul A. Hohenlohe, Mitch D. Day, Stephen J. Amish, Michael R. Miller, Nicholas Kamps-Hughes, Matthew C. Boyer, Clint C. Muhlfeld, Fred W. Allendorf, Eric A. Johnson, Gordon Luikart
TL;DR: A novel approach, overlapping paired‐end RAD sequencing, is presented; it generates RAD contigs of >300–400 bp that provide sufficient flanking sequence for the design of high‐throughput SNP genotyping arrays and for strict filtering to identify duplicate paralogous loci.
References
Journal Article
Rapid SNP Discovery and Genetic Mapping Using Sequenced RAD Markers
Nathan A. Baird, Paul D. Etter, Tressa S. Atwood, Mark Currey, Anthony L. Shiver, Zachary A. Lewis, Eric U. Selker, William A. Cresko, Eric A. Johnson
TL;DR: The sequencing of restriction-site associated DNA (RAD) tags was described, which identified more than 13,000 SNPs, and three traits in two model organisms were mapped, using less than half the capacity of one Illumina sequencing run.
Journal Article
Mapping short DNA sequencing reads and calling variants using mapping quality scores
Heng Li, Jue Ruan, Richard Durbin
TL;DR: This work describes MAQ, software that builds assemblies by mapping shotgun short reads to a reference genome, using quality scores to derive genotype calls of the consensus sequence of a diploid genome, e.g., from a human sample.
Journal Article
High-quality draft assemblies of mammalian genomes from massively parallel sequence data
Sante Gnerre, Iain MacCallum, Dariusz Przybylski, Filipe J. Ribeiro, Joshua N. Burton, Bruce J. Walker, Ted Sharpe, Giles Hall, Terrance Shea, Sean M. Sykes, Aaron M. Berlin, Daniel Aird, Maura Costello, Riza M. Daza, Louise Williams, Robert Nicol, Andreas Gnirke, Chad Nusbaum, Eric S. Lander, David B. Jaffe
TL;DR: An algorithm for genome assembly, ALLPATHS-LG, is developed and applied to massively parallel DNA sequence data from the human and mouse genomes, generated on the Illumina platform; the resulting assemblies have good accuracy, short-range contiguity, long-range connectivity, and coverage of the genome.
Journal Article
Reconstructing Indian Population History
TL;DR: It is predicted that there will be an excess of recessive diseases in India, which should be possible to screen and map genetically. Ancestral North Indian ancestry is higher in traditionally upper caste and Indo-European speakers.
Journal Article
Population Genomics of Parallel Adaptation in Threespine Stickleback using Sequenced RAD Tags
Paul A. Hohenlohe, Susan Bassham, Paul D. Etter, Nicholas Stiffler, Eric A. Johnson, William A. Cresko
TL;DR: Overall estimates of genetic diversity and differentiation among populations confirm the biogeographic hypothesis that large panmictic oceanic populations have repeatedly given rise to phenotypically divergent freshwater populations and identify several novel regions showing parallel differentiation across independent populations.