FLASH: Fast Length Adjustment of Short Reads to Improve Genome Assemblies
Tanja Magoc, Steven L. Salzberg
TL;DR: FLASH is a fast computational tool that extends short reads by overlapping paired-end reads from fragment libraries that are sufficiently short. When FLASH was used to extend reads prior to assembly, the resulting assemblies had substantially greater N50 lengths for both contigs and scaffolds.
Abstract:
Motivation: Next-generation sequencing technologies generate very large numbers of short reads. Even with very deep genome coverage, short read lengths cause problems in de novo assemblies. The use of paired-end libraries with a fragment size shorter than twice the read length provides an opportunity to generate much longer reads by overlapping and merging read pairs before assembling a genome.
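To make the overlap condition concrete: with read length r and fragment length f < 2r, the two mates share roughly 2r - f bases. The following is a minimal Python sketch of merging one read pair by that overlap. It is not FLASH's implementation (which is written in C); the function names, the `min_overlap` and `max_mismatch_ratio` parameters, and the rule of keeping the first read's base at mismatched positions are all assumptions chosen for illustration.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def merge_pair(read1: str, read2: str,
               min_overlap: int = 10,
               max_mismatch_ratio: float = 0.25):
    """Merge a paired-end read pair by overlapping the 3' ends.

    Hypothetical helper: tries every overlap length from longest to
    shortest, keeps the one with the lowest mismatch ratio, and returns
    the merged sequence, or None if no acceptable overlap exists.
    """
    # The second mate is sequenced from the opposite strand, so flip it
    # into the orientation of the first read before comparing.
    mate = reverse_complement(read2)

    best = None  # (mismatch_ratio, overlap_length)
    longest = min(len(read1), len(mate))
    for overlap in range(longest, min_overlap - 1, -1):
        tail = read1[-overlap:]
        head = mate[:overlap]
        mismatches = sum(1 for a, b in zip(tail, head) if a != b)
        ratio = mismatches / overlap
        if ratio <= max_mismatch_ratio and (best is None or ratio < best[0]):
            best = (ratio, overlap)

    if best is None:
        return None

    # Simplification: in the overlap, keep read1's bases as they are.
    return read1 + mate[best[1]:]


if __name__ == "__main__":
    fragment = "ACGTTGCAAGGCTTAACCGGATCGATTACG"  # f = 30
    r = 20                                       # read length, 2r > f
    read1 = fragment[:r]
    read2 = reverse_complement(fragment[-r:])
    print(merge_pair(read1, read2) == fragment)
```

In this toy case the expected overlap is 2 * 20 - 30 = 10 bases, which is also the default `min_overlap`. A production tool would additionally weigh base quality scores when resolving mismatches.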
Results: We present FLASH, a fast computational tool to extend the length of short reads by overlapping paired-end reads from fragment libraries that are sufficiently short. We tested the correctness of the tool on one million simulated read pairs, and we then applied it as a pre-processor for genome assemblies of Illumina reads from the bacterium Staphylococcus aureus and human chromosome 14. FLASH correctly extended and merged reads >99% of the time on simulated reads with an error rate of <1%. With adequately set parameters, FLASH correctly merged reads over 90% of the time even when the reads contained up to 5% errors. When FLASH was used to extend reads prior to assembly, the resulting assemblies had substantially greater N50 lengths for both contigs and scaffolds.
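The assemblies above are compared by N50 length. As a reminder of what that statistic measures, here is a short Python sketch using the common definition: the length L such that contigs (or scaffolds) of length at least L contain at least half of the total assembled bases. The function name `n50` and the example lengths are illustrative assumptions, not values from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Sort lengths from longest to shortest and accumulate them; the N50
    is the length at which the running sum first reaches half of the
    total. Returns 0 for an empty input.
    """
    if not lengths:
        return 0
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point issues with total / 2.
        if 2 * running >= total:
            return length
    return 0  # not reached for non-empty input with positive lengths


if __name__ == "__main__":
    contigs = [80, 70, 50, 40, 30, 20]  # made-up lengths, total = 290
    print(n50(contigs))
```

For these made-up lengths, the two longest contigs sum to 150, which is at least half of 290, so the N50 is 70. Longer merged reads tend to help an assembler join sequence that would otherwise remain in separate contigs, which raises this value.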
Availability and Implementation: The FLASH system is implemented in C and is freely available as open-source code at http://www.cbcb.umd.edu/software/flash.
Contact: t.magoc@gmail.com
Citations
Journal Article
Evolution and Global Transmission of a Multidrug-Resistant, Community-Associated Methicillin-Resistant Staphylococcus aureus Lineage from the Indian Subcontinent
Eike J. Steinig, Sebastián Duchêne, D. Ashley Robinson, Stefan Monecke, Maho Yokoyama, Maisem Laabei, Peter Slickers, Patiyan Andersson, Deborah A Williamson, Angela Kearns, Richard V. Goering, Elizabeth Dickson, Ralf Ehricht, Margaret Ip, Matthew V. N. O'Sullivan, Geoffrey W. Coombs, Andreas Petersen, Gráinne I. Brennan, Anna C. Shore, David C. Coleman, Annalisa Pantosti, Hermínia de Lencastre, Henrik Westh, Nobumichi Kobayashi, Helen Heffernan, Birgit Strommenger, Franziska Layer, Stefan Weber, Hege Vangstein Aamot, Leila Skakni, Sharon J. Peacock, Derek S. Sarovich, Simon R. Harris, Julian Parkhill, Ruth C. Massey, M. T. G. Holden, Stephen D. Bentley, Steven Y. C. Tong
TL;DR: The Bengal Bay clone emerged from a virulent progenitor circulating on the Indian subcontinent, and its subsequent global transmission was associated with travel or family contact in the region, demonstrating the importance of whole-genome sequencing for tracking the evolution of emerging and resistant pathogens.
Journal Article
Long-term soil transplant simulating climate change with latitude significantly alters microbial temporal turnover.
Yuting Liang, Yuji Jiang, Feng Wang, Chongqing Wen, Ye Deng, Kai Xue, Yujia Qin, Yunfeng Yang, Liyou Wu, Jizhong Zhou, Bo Sun
TL;DR: Climate warming led to a faster succession rate of microbial communities, as well as lower species richness and compositional changes, compared with in situ conditions and climate cooling, which may be related to the high metabolic rates and intense competition under higher temperature.
Journal Article
Gut microbiota density influences host physiology and is shaped by host and microbial factors
Eduardo J. Contijoch, Graham J. Britton, Chao Yang, Ilaria Mogno, Zhihua Li, Ruby Ng, Sean R. Llewellyn, Sheela Hira, Crystal H Johnson, Keren M. Rabinowitz, Revital Barkan, Iris Dotan, Robert Hirten, Shih Chen Fu, Yuying Luo, Nancy Yang, Tramy Luong, Philippe R Labrias, Sergio A. Lira, Inga Peter, Ari Grinspan, Jose C. Clemente, Roman Kosoy, Seunghee Kim-Schulze, Xiaochen Qin, Anabella Castillo, Amanda Hurley, Ashish Atreja, Jason Rogers, Farah Fasihuddin, Merjona Saliaj, Amy Nolan, Pamela Reyes-Mercedes, Carina Rodriguez, Sarah Aly, Kenneth Santa-Cruz, Lauren A. Peters, Mayte Suárez-Fariñas, R Huang, Ke Hao, Jun Zhu, Bin Zhang, Bojan Losic, Haritz Irizar, Won-Min Song, Antonio Fabio Di Narzo, Wenhui Wang, Benjamin L. Cohen, Christopher J. DiMaio, David A. Greenwald, Steven H. Itzkowitz, Aimee L. Lucas, James F. Marion, Elana Maser, Ryan C. Ungaro, Steven Naymagon, Joshua Novak, Brijen Shah, Thomas A. Ullman, Peter H. Rubin, James F. George, Peter Legnani, Shannon Telesco, Joshua R. Friedman, Carrie Brodmerkel, S Plevy, Judy H. Cho, Jean-Frederic Colombel, Eric E. Schadt, Carmen Argmann, Marla Dubinsky, Andrew Kasarskis, Bruce E. Sands, Jeremiah J. Faith
TL;DR: Understanding the interplay between microbiota and disease in terms of microbiota density, host carrying capacity, and microbiota fitness provides new insights into microbiome structure and microbiome-targeted therapeutics.
Journal Article
Response of Soil Microbial Communities to Elevated Antimony and Arsenic Contamination Indicates the Relationship between the Innate Microbiota and Contaminant Fractions
Weimin Sun, Enzong Xiao, Tangfu Xiao, Valdis Krumins, Qi Wang, Max M. Häggblom, Yiran Dong, Song Tang, Min Hu, Baoqin Li, Bingqing Xia, Wei Liu
TL;DR: Observed interactions among various Sb and As fractions and the soil microbiota suggest the potential for bioremediation of Sb- and As-contaminated soils.
Journal Article
Nanobodies from camelid mice and llamas neutralize SARS-CoV-2 variants.
Jianliang Xu, Kai Xu, Seolkyoung Jung, Andrea Conte, Jenna Ariel Lieberman, Frauke Muecksch, Julio C. C. Lorenzi, Solji Park, Fabian Schmidt, Zijun Wang, Yaoxing Huang, Yang Luo, Manoj S. Nair, Pengfei Wang, Jonathan E. Schulz, Lino Tessarollo, Tatsiana Bylund, Gwo-Yu Chuang, Adam S. Olia, Tyler Stephens, I-Ting Teng, Yaroslav Tsybovsky, Tongqing Zhou, Vincent J. Munster, David D. Ho, Theodora Hatziioannou, Paul D. Bieniasz, Michel C. Nussenzweig, Peter D. Kwong, Rafael Casellas
TL;DR: The authors identify two groups of highly neutralizing nanobodies against SARS-CoV-2, group 1 and group 2; group 2 is almost exclusively focused on the RBD-ACE2 interface and does not neutralize SARS-CoV-2 variants that carry E484K or N501Y substitutions.
References
Journal Article
The Sequence Alignment/Map format and SAMtools
Heng Li, Bob Handsaker, Alec Wysoker, T. J. Fennell, Jue Ruan, Nils Homer, Gabor T. Marth, Gonçalo R. Abecasis, Richard Durbin
TL;DR: SAMtools implements various utilities for post-processing alignments in the SAM format, such as indexing, variant calling, and alignment viewing, and thus provides universal tools for processing read alignments.
Journal Article
Ultrafast and memory-efficient alignment of short DNA sequences to the human genome
TL;DR: Bowtie extends previous Burrows-Wheeler techniques with a novel quality-aware backtracking algorithm that permits mismatches; multiple processor cores can be used simultaneously to achieve even greater alignment speeds.
Journal Article
Versatile and open software for comparing large genomes
Stefan Kurtz, Adam M. Phillippy, Arthur L. Delcher, Michael E. Smoot, Martin Shumway, Corina Antonescu, Steven L. Salzberg
TL;DR: The newest version of MUMmer easily handles comparisons of large eukaryotic genomes at varying evolutionary distances, as demonstrated by applications to multiple genomes.
Journal Article
De novo assembly of human genomes with massively parallel short read sequencing
Ruiqiang Li, Hongmei Zhu, Jue Ruan, Wubin Qian, Xiaodong Fang, Zhongbin Shi, Yingrui Li, Shengting Li, Gao Shan, Karsten Kristiansen, Songgang Li, Huanming Yang, Jing Wang, Jun Wang
TL;DR: The development of this de novo short read assembly method creates new opportunities for building reference sequences and carrying out accurate analyses of unexplored genomes in a cost-effective way.
Journal Article
High-quality draft assemblies of mammalian genomes from massively parallel sequence data
Sante Gnerre, Iain MacCallum, Dariusz Przybylski, Filipe J. Ribeiro, Joshua N. Burton, Bruce J. Walker, Ted Sharpe, Giles Hall, Terrance Shea, Sean M. Sykes, Aaron M. Berlin, Daniel Aird, Maura Costello, Riza M. Daza, Louise Williams, Robert Nicol, Andreas Gnirke, Chad Nusbaum, Eric S. Lander, David B. Jaffe
TL;DR: ALLPATHS-LG, an algorithm for genome assembly, is applied to massively parallel DNA sequence data from the human and mouse genomes generated on the Illumina platform; the resulting draft assemblies have good accuracy, short-range contiguity, long-range connectivity, and coverage of the genome.
Related Papers (5)
QIIME allows analysis of high-throughput community sequencing data.
J. Gregory Caporaso, Justin Kuczynski, Jesse Stombaugh, Kyle Bittinger, Frederic D. Bushman, Elizabeth K. Costello, Noah Fierer, Antonio Gonzalez Peña, Julia K. Goodrich, Jeffrey I. Gordon, Gavin A. Huttley, Scott T. Kelley, Dan Knights, Jeremy E. Koenig, Ruth E. Ley, Catherine A. Lozupone, Daniel McDonald, Brian D. Muegge, Meg Pirrung, Jens Reeder, Joel Sevinsky, Peter J. Turnbaugh, William A. Walters, Jeremy Widmann, Tanya Yatsunenko, Jesse R. Zaneveld, Rob Knight
Trimmomatic: a flexible trimmer for Illumina sequence data
Introducing mothur: Open-Source, Platform-Independent, Community-Supported Software for Describing and Comparing Microbial Communities
Patrick D. Schloss, Sarah L. Westcott, Thomas Ryabin, Justine R. Hall, Martin Hartmann, Emily B. Hollister, Ryan A. Lesniewski, Brian B. Oakley, Donovan H. Parks, Courtney J. Robinson, Jason W. Sahl, Blaz Stres, Gerhard G. Thallinger, David J. Van Horn, Carolyn F. Weber