Redundans: an assembly pipeline for highly heterozygous genomes
Leszek P. Pryszcz,Toni Gabaldón +1 more
Reads0
Chats0
TLDR
A pipeline that specifically deals with the assembly of heterozygous genomes by introducing a step to recognise and selectively remove alternative heterozygOUS contigs is developed and compared to other existing tools.Abstract:
Many genomes display high levels of heterozygosity (i.e. presence of different alleles at the same loci in homologous chromosomes), being those of hybrid organisms an extreme such case. The assembly of highly heterozygous genomes from short sequencing reads is a challenging task because it is difficult to accurately recover the different haplotypes. When confronted with highly heterozygous genomes, the standard assembly process tends to collapse homozygous regions and reports heterozygous regions in alternative contigs. The boundaries between homozygous and heterozygous regions result in multiple assembly paths that are hard to resolve, which leads to highly fragmented assemblies with a total size larger than expected. This, in turn, causes numerous problems in downstream analyses such as fragmented gene models, wrong gene copy number, or broken synteny. To circumvent these caveats we have developed a pipeline that specifically deals with the assembly of heterozygous genomes by introducing a step to recognise and selectively remove alternative heterozygous contigs. We tested our pipeline on simulated and naturally-occurring heterozygous genomes and compared its accuracy to other existing tools. Our method is freely available at https://github.com/Gabaldonlab/redundans.read more
Citations
More filters
SPAdes, a new genome assembly algorithm and its applications to single-cell sequencing ( 7th Annual SFAF Meeting, 2012)
TL;DR: SPAdes as mentioned in this paper is a new assembler for both single-cell and standard (multicell) assembly, and demonstrate that it improves on the recently released E+V-SC assembler and on popular assemblers Velvet and SoapDeNovo (for multicell data).
Journal ArticleDOI
Purge Haplotigs: allelic contig reassignment for third-gen diploid genome assemblies.
TL;DR: Purge Haplotigs improves the haploid and diploid representations of third-gen sequencing based genome assemblies by identifying and reassigning allelic contigs and is less likely to over-purge repetitive or paralogous elements compared to alignment-only based methods.
Journal ArticleDOI
MitoZ: a toolkit for animal mitochondrial genome assembly, annotation and visualization.
TL;DR: A mitogenome toolkit MitoZ is developed, consisting of independent modules of de novo assembly, findMitoScaf (find Mitochondrial Scaffolds), annotation and visualization, that can generate mitogenomes assembly together with annotations and visualization results from HTS raw reads.
Journal ArticleDOI
A model species for agricultural pest genomics: The genome of the Colorado potato beetle, Leptinotarsa decemlineata (Coleoptera: Chrysomelidae)
Sean D. Schoville,Yolanda H. Chen,Martin Andersson,Joshua B. Benoit,Anita Bhandari,Julia H. Bowsher,Kristian Brevik,Kaat Cappelle,Mei-Ju May Chen,Anna K. Childers,Christopher P. Childers,Olivier Christiaens,Justin Clements,Elise M. Didion,Elena N. Elpidina,Patamarerk Engsontia,Markus Friedrich,Inmaculada García-Robles,Richard A. Gibbs,Chandan Goswami,Alessandro Grapputo,Kristina Gruden,Marcin Grynberg,Bernard Henrissat,Bernard Henrissat,Bernard Henrissat,Emily C. Jennings,Jeffery W. Jones,Megha Kalsi,Sher Afzal Khan,Abhishek Kumar,Abhishek Kumar,Fei Li,Vincent Lombard,Vincent Lombard,Xingzhou Ma,Alexander G. Martynov,Nicholas J. Miller,Robert F. Mitchell,Monica Munoz-Torres,Anna Muszewska,Brenda Oppert,Subba Reddy Palli,Kristen A. Panfilio,Kristen A. Panfilio,Yannick Pauchet,Lindsey C. Perkin,Marko Petek,Monica F. Poelchau,Eric Record,Joseph P. Rinehart,Hugh M. Robertson,Andrew J. Rosendale,Victor M. Ruiz-Arroyo,Guy Smagghe,Zsofia Szendrei,Gregg W.C. Thomas,Alex S. Torson,Iris M. Vargas Jentzsch,Matthew T. Weirauch,Matthew T. Weirauch,Ashley D. Yates,George D. Yocum,June-Sun Yoon,Stephen Richards +64 more
TL;DR: Surprisingly, the suite of genes involved in insecticide resistance is similar to other beetles, and duplications in the RNAi pathway might explain why Leptinotarsa decemlineata has high sensitivity to dsRNA.
Journal ArticleDOI
The Reference Genome of Tea Plant and Resequencing of 81 Diverse Accessions Provide Insights into Its Genome Evolution and Adaptation.
En-Hua Xia,Wei Tong,Yan Hou,Yanlin An,Linbo Chen,Qiong Wu,Yun-Long Liu,Jie Yu,Fangdong Li,Ruopei Li,Penghui Li,Huijuan Zhao,Ruoheng Ge,Jin Huang,Ali Inayat Mallano,Yanrui Zhang,Shengrui Liu,Wei-Wei Deng,Chuankui Song,Zhaoliang Zhang,Jian Zhao,Shu Wei,Zhengzhu Zhang,Tao Xia,Chaoling Wei,Xiaochun Wan +25 more
TL;DR: A high-quality reference genome of the tea plant consisting of 15 pseudo-chromosomes, 70.38% of which are LTR retrotransposons is presented, showing the evidence that LTR-RTs play critical roles in the genome size expansion and transcriptional diversification of tea plant genes through preferential gene insertions in promoter regions and introns.
References
More filters
Journal ArticleDOI
Fast and accurate short read alignment with Burrows–Wheeler transform
Heng Li,Richard Durbin +1 more
TL;DR: Burrows-Wheeler Alignment tool (BWA) is implemented, a new read alignment package that is based on backward search with Burrows–Wheeler Transform (BWT), to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps.
Journal ArticleDOI
Initial sequencing and analysis of the human genome.
Eric S. Lander,Lauren Linton,Bruce W. Birren,Chad Nusbaum,Michael C. Zody,Jennifer Baldwin,Keri Devon,Ken Dewar,Michael Doyle,William Fitzhugh,Roel Funke,Diane Gage,Katrina Harris,Andrew Heaford,John Howland,Lisa Kann,Jessica A. Lehoczky,Rosie Levine,Paul A. McEwan,Kevin McKernan,James Meldrim,Jill P. Mesirov,Cher Miranda,William Morris,Jerome Naylor,Christina Raymond,Mark Rosetti,Ralph Santos,Andrew Sheridan,Carrie Sougnez,Nicole Stange-Thomann,Nikola Stojanovic,Aravind Subramanian,Dudley Wyman,Jane Rogers,John Sulston,R Ainscough,Stephan Beck,David Bentley,John Burton,C M Clee,Nigel P. Carter,Alan Coulson,Rebecca Deadman,Panos Deloukas,Andrew Dunham,Ian Dunham,Richard Durbin,Lisa French,Darren Grafham,Simon G. Gregory,Tim Hubbard,Sean Humphray,Adrienne Hunt,Matthew Jones,Christine Lloyd,Amanda McMurray,Lucy Matthews,Simon Mercer,Sarah Milne,James C. Mullikin,Andrew J. Mungall,Robert W. Plumb,Mark T. Ross,Ratna Shownkeen,Sarah Sims,Robert H. Waterston,Richard K. Wilson,LaDeana W. Hillier,John Douglas Mcpherson,Marco A. Marra,Elaine R. Mardis,Lucinda Fulton,Asif T. Chinwalla,Kymberlie H. Pepin,Warren Gish,Stephanie L. Chissoe,Michael C. Wendl,Kim D. Delehaunty,Tracie L. Miner,Andrew Delehaunty,Jason B. Kramer,Lisa Cook,Robert S. Fulton,Douglas L. Johnson,Patrick Minx,Sandra W. Clifton,Trevor Hawkins,Elbert Branscomb,Paul Predki,Paul G. Richardson,Sarah Wenning,Tom Slezak,Norman A. Doggett,Jan Fang Cheng,Anne S. Olsen,Susan Lucas,Christopher J. Elkin,Edward Uberbacher,Marvin Frazier,Richard A. Gibbs,Donna M. Muzny,Steven E. Scherer,John Bouck,Erica Sodergren,Kim C. Worley,Catherine M. Rives,James H. Gorrell,Michael L. Metzker,Susan L. Naylor,Raju Kucherlapati,David L. Nelson,George M. Weinstock,Yoshiyuki Sakaki,Asao Fujiyama,Masahira Hattori,Tetsushi Yada,Atsushi Toyoda,Takehiko Itoh,Chiharu Kawagoe,Hidemi Watanabe,Yasushi Totoki,Todd D. Taylor,Jean Weissenbach,Roland Heilig,William Saurin,François Artiguenave,Philippe Brottier,Thomas Brüls,Eric Pelletier,Catherine Robert,Patrick Wincker,André Rosenthal,Matthias Platzer,Gerald Nyakatura,Stefan Taudien,Andreas Rump,Douglas R. Smith,Lynn Doucette-Stamm,Marc Rubenfield,Keith Weinstock,Mei Lee Hong,Joann Dubois,Huanming Yang,Jun Yu,Jian Wang,Guyang Huang,Jun Gu,Leroy Hood,Lee Rowen,Anup Madan,Shizen Qin,Ronald W. Davis,Nancy A. Federspiel,A. Pia Abola,Michael Proctor,Bruce A. Roe,Feng Chen,Huaqin Pan,Juliane Ramser,Hans Lehrach,Richard Reinhardt,W. Richard McCombie,Melissa De La Bastide,Neilay Dedhia,H. Blöcker,K. Hornischer,Gabriele Nordsiek,Richa Agarwala,L. Aravind,Jeffrey A. Bailey,Alex Bateman,Serafim Batzoglou,Ewan Birney,Peer Bork,Daniel G. Brown,Christopher B. Burge,Lorenzo Cerutti,Hsiu Chuan Chen,Deanna M. Church,Michele Clamp,Richard R. Copley,Tobias Doerks,Sean R. Eddy,Evan E. Eichler,Terrence S. Furey,James E. Galagan,James G. R. Gilbert,Cyrus L. Harmon,Yoshihide Hayashizaki,David Haussler,Henning Hermjakob,Karsten Hokamp,Wonhee Jang,L. Steven Johnson,Thomas A. Jones,Simon Kasif,Arek Kaspryzk,Scot Kennedy,W. James Kent,Paul Kitts,Eugene V. Koonin,Ian F Korf,David Kulp,Doron Lancet,Todd M. Lowe,Aoife McLysaght,Tarjei S. Mikkelsen,John V. Moran,Nicola Mulder,Victor J. Pollara,Chris P. Ponting,Greg Schuler,Jörg Schultz,Guy Slater,Arian F.A. Smit,Elia Stupka,Joseph Szustakowki,Danielle Thierry-Mieg,Jean Thierry-Mieg,Lukas Wagner,John W. Wallis,Raymond Wheeler,Alan Williams,Yuri I. Wolf,Kenneth H. Wolfe,Shiaw Pyng Yang,Ru Fang Yeh,Francis S. Collins,Mark S. Guyer,Jane Peterson,Adam Felsenfeld,Kris A. Wetterstrand,Richard M. Myers,Jeremy Schmutz,Mark Dickson,Jane Grimwood,David R. Cox,Maynard V. Olson,Rajinder Kaul,Christopher K. Raymond,Nobuyoshi Shimizu,Kazuhiko Kawasaki,Shinsei Minoshima,Glen A. Evans,Maria Athanasiou,Roger A. Schultz,Aristides Patrinos,Michael J. Morgan +248 more
TL;DR: The results of an international collaboration to produce and make freely available a draft sequence of the human genome are reported and an initial analysis is presented, describing some of the insights that can be gleaned from the sequence.
Journal ArticleDOI
Ultrafast and memory-efficient alignment of short DNA sequences to the human genome
TL;DR: Bowtie extends previous Burrows-Wheeler techniques with a novel quality-aware backtracking algorithm that permits mismatches and can be used simultaneously to achieve even greater alignment speeds.
Journal ArticleDOI
SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing
Anton Bankevich,Sergey Nurk,Dmitry Antipov,Alexey Gurevich,Mikhail Dvorkin,Alexander S. Kulikov,Valery M. Lesin,Sergey I. Nikolenko,Son Pham,Andrey D. Prjibelski,Alexey V. Pyshkin,Alexander Sirotkin,Nikolay Vyahhi,Glenn Tesler,Max A. Alekseyev,Pavel A. Pevzner +15 more
TL;DR: SPAdes generates single-cell assemblies, providing information about genomes of uncultivatable bacteria that vastly exceeds what may be obtained via traditional metagenomics studies.
SPAdes, a new genome assembly algorithm and its applications to single-cell sequencing ( 7th Annual SFAF Meeting, 2012)
TL;DR: SPAdes as mentioned in this paper is a new assembler for both single-cell and standard (multicell) assembly, and demonstrate that it improves on the recently released E+V-SC assembler and on popular assemblers Velvet and SoapDeNovo (for multicell data).