Author

Jason T. Howard

Bio: Jason T. Howard is an academic researcher from Rockefeller University. The author has contributed to research in topics: Genomics & Genome. The author has an h-index of 22 and has co-authored 35 publications receiving 7093 citations. Previous affiliations of Jason T. Howard include University of North Carolina at Chapel Hill & Duke University.

Papers
Journal Article
Erich D. Jarvis1, Siavash Mirarab2, Andre J. Aberer3, Bo Li4, Bo Li5, Bo Li6, Peter Houde7, Cai Li4, Cai Li5, Simon Y. W. Ho8, Brant C. Faircloth9, Benoit Nabholz, Jason T. Howard1, Alexander Suh10, Claudia C. Weber10, Rute R. da Fonseca11, Jianwen Li, Fang Zhang Zhang, Hui Li, Long Zhou, Nitish Narula7, Nitish Narula12, Liang Liu13, Ganesh Ganapathy1, Bastien Boussau, Shamsuzzoha Bayzid2, Volodymyr Zavidovych1, Sankar Subramanian14, Toni Gabaldón15, Salvador Capella-Gutierrez, Jaime Huerta-Cepas, Bhanu Rekepalli16, Bhanu Rekepalli17, Kasper Munch18, Mikkel H. Schierup18, Bent E. K. Lindow11, Wesley C. Warren19, David A. Ray, Richard E. Green20, Michael William Bruford21, Xiangjiang Zhan21, Xiangjiang Zhan22, Andrew Dixon, Shengbin Li6, Ning Li23, Yinhua Huang23, Elizabeth P. Derryberry24, Elizabeth P. Derryberry25, Mads F. Bertelsen26, Frederick H. Sheldon24, Robb T. Brumfield24, Claudio V. Mello27, Claudio V. Mello28, Peter V. Lovell27, Morgan Wirthlin27, Maria Paula Cruz Schneider28, Francisco Prosdocimi28, José Alfredo Samaniego11, Amhed Missael Vargas Velazquez11, Alonzo Alfaro-Núñez11, Paula F. Campos11, Bent O. Petersen29, Thomas Sicheritz-Pontén29, An Pas, Thomas L. Bailey, R. Paul Scofield30, Michael Bunce31, David M. Lambert14, Qi Zhou, Polina L. Perelman32, Amy C. Driskell33, Beth Shapiro20, Zijun Xiong, Yongli Zeng, Shiping Liu, Zhenyu Li, Binghang Liu, Kui Wu, Jin Xiao, Xiong Yinqi, Quiemei Zheng, Yong Zhang, Huanming Yang, Jian Wang, Linnéa Smeds10, Frank E. Rheindt34, Michael J. Braun35, Jon Fjeldså11, Ludovic Orlando11, F. Keith Barker4, Knud A. Jønsson4, Warren E. Johnson33, Klaus-Peter Koepfli33, Stephen J. O'Brien36, David Haussler, Oliver A. Ryder, Carsten Rahbek4, Eske Willerslev11, Gary R. Graves33, Gary R. Graves4, Travis C. Glenn13, John E. McCormack37, Dave Burt38, Hans Ellegren10, Per Alström, Scott V. Edwards39, Alexandros Stamatakis3, David P. Mindell40, Joel Cracraft4, Edward L. Braun41, Tandy Warnow2, Tandy Warnow42, Wang Jun, M. Thomas P. Gilbert4, M. Thomas P. Gilbert31, Guojie Zhang11, Guojie Zhang5
12 Dec 2014-Science
TL;DR: A genome-scale phylogenetic analysis of 48 species representing all orders of Neoaves recovered a highly resolved tree that confirms previously controversial sister or close relationships and identifies the first divergence in Neoaves, two groups the authors named Passerea and Columbea.
Abstract: To better determine the history of modern birds, we performed a genome-scale phylogenetic analysis of 48 species representing all orders of Neoaves using phylogenomic methods created to handle genome-scale data. We recovered a highly resolved tree that confirms previously controversial sister or close relationships. We identified the first divergence in Neoaves, two groups we named Passerea and Columbea, representing independent lineages of diverse and convergently evolved land and water bird species. Among Passerea, we infer the common ancestor of core landbirds to have been an apex predator and confirm independent gains of vocal learning. Among Columbea, we identify pigeons and flamingoes as belonging to sister clades. Even with whole genomes, some of the earliest branches in Neoaves proved challenging to resolve, which was best explained by massive protein-coding sequence convergence and high levels of incomplete lineage sorting that occurred during a rapid radiation after the Cretaceous-Paleogene mass extinction event about 66 million years ago.

1,624 citations

Journal Article
TL;DR: This work introduces a correction algorithm and assembly strategy that uses short, high-fidelity sequences to correct the error in single-molecule sequences, leading to substantially better assemblies than current sequencing strategies.
Abstract: Single-molecule sequencing instruments can generate multikilobase sequences with the potential to greatly improve genome and transcriptome assembly. However, the error rates of single-molecule reads are high, which has limited their use thus far to resequencing bacteria. To address this limitation, we introduce a correction algorithm and assembly strategy that uses short, high-fidelity sequences to correct the error in single-molecule sequences. We demonstrate the utility of this approach on reads generated by a PacBio RS instrument from phage, prokaryotic and eukaryotic whole genomes, including the previously unsequenced genome of the parrot Melopsittacus undulatus, as well as for RNA-Seq reads of the corn (Zea mays) transcriptome. Our long-read correction achieves >99.9% base-call accuracy, leading to substantially better assemblies than current sequencing strategies: in the best example, the median contig size was quintupled relative to high-coverage, second-generation assemblies. Greater gains are predicted if read lengths continue to increase, including the prospect of single-contig bacterial chromosome assembly.

987 citations

Journal Article
Guojie Zhang1, Guojie Zhang2, Cai Li1, Qiye Li1, Bo Li1, Denis M. Larkin3, Chul Hee Lee4, Jay F. Storz5, Agostinho Antunes6, Matthew J. Greenwold7, Robert W. Meredith8, Anders Ödeen9, Jie Cui10, Qi Zhou11, Luohao Xu1, Hailin Pan1, Zongji Wang12, Lijun Jin1, Pei Zhang1, Haofu Hu1, Wei Yang1, Jiang Hu1, Jin Xiao1, Zhikai Yang1, Yang Liu1, Qiaolin Xie1, Hao Yu1, Jinmin Lian1, Ping Wen1, Fang Zhang1, Hui Li1, Yongli Zeng1, Zijun Xiong1, Shiping Liu12, Long Zhou1, Zhiyong Huang1, Na An1, Jie Wang13, Qiumei Zheng1, Yingqi Xiong1, Guangbiao Wang1, Bo Wang1, Jingjing Wang1, Yu Fan14, Rute R. da Fonseca2, Alonzo Alfaro-Núñez2, Mikkel Schubert2, Ludovic Orlando2, Tobias Mourier2, Jason T. Howard15, Ganeshkumar Ganapathy15, Andreas R. Pfenning15, Osceola Whitney15, Miriam V. Rivas15, Erina Hara15, Julia Smith15, Marta Farré3, Jitendra Narayan16, Gancho T. Slavov16, Michael N Romanov17, Rui Borges6, João Paulo Machado6, Imran Khan6, Mark S. Springer18, John Gatesy18, Federico G. Hoffmann19, Juan C. Opazo20, Olle Håstad21, Roger H. Sawyer7, Heebal Kim4, Kyu-Won Kim4, Hyeon Jeong Kim4, Seoae Cho4, Ning Li22, Yinhua Huang22, Michael William Bruford23, Xiangjiang Zhan13, Andrew Dixon, Mads F. Bertelsen24, Elizabeth P. Derryberry25, Wesley C. Warren26, Richard K. Wilson26, Shengbin Li27, David A. Ray19, Richard E. Green28, Stephen J. O'Brien29, Darren K. Griffin17, Warren E. Johnson30, David Haussler28, Oliver A. Ryder, Eske Willerslev2, Gary R. Graves31, Per Alström21, Jon Fjeldså32, David P. Mindell33, Scott V. Edwards34, Edward L. Braun35, Carsten Rahbek32, David W. Burt36, Peter Houde37, Yong Zhang1, Huanming Yang38, Jian Wang1, Erich D. Jarvis15, M. Thomas P. Gilbert2, M. Thomas P. Gilbert39, Jun Wang 
12 Dec 2014-Science
TL;DR: This work explored bird macroevolution using full genomes from 48 avian species representing all major extant clades to reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits.
Abstract: Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits.

872 citations

Journal Article
01 Apr 2010-Nature
TL;DR: This work shows that song behaviour engages gene regulatory networks in the zebra finch brain, altering the expression of long non-coding RNAs, microRNAs, transcription factors and their targets and shows evidence for rapid molecular evolution in the songbird lineage of genes that are regulated during song experience.
Abstract: The zebra finch is an important model organism in several fields with unique relevance to human neuroscience. Like other songbirds, the zebra finch communicates through learned vocalizations, an ability otherwise documented only in humans and a few other animals and lacking in the chicken, the only bird with a sequenced genome until now. Here we present a structural, functional and comparative analysis of the genome sequence of the zebra finch (Taeniopygia guttata), which is a songbird belonging to the large avian order Passeriformes. We find that the overall structures of the genomes are similar in zebra finch and chicken, but they differ in many intrachromosomal rearrangements, lineage-specific gene family expansions, the number of long-terminal-repeat-based retrotransposons, and mechanisms of sex chromosome dosage compensation. We show that song behaviour engages gene regulatory networks in the zebra finch brain, altering the expression of long non-coding RNAs, microRNAs, transcription factors and their targets. We also show evidence for rapid molecular evolution in the songbird lineage of genes that are regulated during song experience. These results indicate an active involvement of the genome in neural processes underlying vocal communication and identify potential genetic substrates for the evolution and regulation of this behaviour.

837 citations

Journal Article
Keith Bradnam1, Joseph Fass1, Anton Alexandrov, Paul Baranay2, Michael Bechner, Inanc Birol, Sébastien Boisvert3, Jarrod Chapman4, Guillaume Chapuis5, Guillaume Chapuis6, Rayan Chikhi6, Rayan Chikhi5, Hamidreza Chitsaz7, Wen-Chi Chou8, Jacques Corbeil3, Cristian Del Fabbro9, T. Roderick Docking, Richard Durbin10, Dent Earl11, Scott J. Emrich12, Pavel Fedotov, Nuno A. Fonseca13, Ganeshkumar Ganapathy14, Richard A. Gibbs15, Sante Gnerre16, Elenie Godzaridis3, Steve Goldstein, Matthias Haimel13, Giles Hall16, David Haussler11, Joseph B. Hiatt17, Isaac Ho4, Jason T. Howard14, Martin Hunt10, Shaun D. Jackman, David B. Jaffe16, Erich D. Jarvis14, Huaiyang Jiang15, Sergey Kazakov, Paul J. Kersey13, Jacob O. Kitzman17, James R. Knight, Sergey Koren18, Tak-Wah Lam, Dominique Lavenier6, Dominique Lavenier5, François Laviolette3, Yingrui Li, Zhenyu Li, Binghang Liu, Yue Liu15, Ruibang Luo, Iain MacCallum16, Matthew D. MacManes19, Nicolas Maillet5, Sergey Melnikov, Bruno Vieira20, Delphine Naquin5, Zemin Ning10, Thomas D. Otto10, Benedict Paten11, Octávio S. Paulo20, Adam M. Phillippy18, Francisco Pina-Martins20, Michael Place, Dariusz Przybylski16, Xiang Qin15, Carson Qu15, Filipe J. Ribeiro16, Stephen Richards15, Daniel S. Rokhsar4, Daniel S. Rokhsar19, J. Graham Ruby21, J. Graham Ruby22, Simone Scalabrin9, Michael C. Schatz23, David C. Schwartz, Alexey Sergushichev, Ted Sharpe16, Timothy I. Shaw8, Jay Shendure17, Yujian Shi, Jared T. Simpson10, Henry Song15, Fedor Tsarev, Francesco Vezzi24, Riccardo Vicedomini9, Jun Wang, Kim C. Worley15, Shuangye Yin16, Siu-Ming Yiu, Jianying Yuan, Guojie Zhang, Hao Zhang, Shiguo Zhou, Ian F Korf1 
TL;DR: Assemblathon 2 provided a variety of sequence data to be assembled for three vertebrate species (a bird, a fish, and a snake), resulting in 43 submitted assemblies from 21 participating teams.
Abstract: Background - The process of generating raw genome sequence data continues to become cheaper, faster, and more accurate. However, assembly of such data into high-quality, finished genome sequences remains challenging. Many genome assembly tools are available, but they differ greatly in terms of their performance (speed, scalability, hardware requirements, acceptance of newer read technologies) and in their final output (composition of assembled sequence). More importantly, it remains largely unclear how to best assess the quality of assembled genome sequences. The Assemblathon competitions are intended to assess current state-of-the-art methods in genome assembly. Results - In Assemblathon 2, we provided a variety of sequence data to be assembled for three vertebrate species (a bird, a fish, and a snake). This resulted in a total of 43 submitted assemblies from 21 participating teams. We evaluated these assemblies using a combination of optical map data, Fosmid sequences, and several statistical methods. From over 100 different metrics, we chose ten key measures by which to assess the overall quality of the assemblies. Conclusions - Many current genome assemblers produced useful assemblies, containing a significant representation of their genes, regulatory sequences, and overall genome structure. However, the high degree of variability between the entries suggests that there is still much room for improvement in the field of genome assembly and that approaches which work well in assembling the genome of one species may not necessarily work well for another.

690 citations


Cited by
Journal Article
TL;DR: featureCounts is a read summarization program suitable for counting reads generated from either RNA or genomic DNA sequencing experiments; it implements highly efficient chromosome hashing and feature blocking techniques.
Abstract: MOTIVATION: Next-generation sequencing technologies generate millions of short sequence reads, which are usually aligned to a reference genome. In many applications, the key information required for downstream analysis is the number of reads mapping to each genomic feature, for example to each exon or each gene. The process of counting reads is called read summarization. Read summarization is required for a great variety of genomic analyses but has so far received relatively little attention in the literature. RESULTS: We present featureCounts, a read summarization program suitable for counting reads generated from either RNA or genomic DNA sequencing experiments. featureCounts implements highly efficient chromosome hashing and feature blocking techniques. It is considerably faster than existing methods (by an order of magnitude for gene-level summarization) and requires far less computer memory. It works with either single or paired-end reads and provides a wide range of options appropriate for different sequencing applications. AVAILABILITY AND IMPLEMENTATION: featureCounts is available under GNU General Public License as part of the Subread (http://subread.sourceforge.net) or Rsubread (http://www.bioconductor.org) software packages.

14,103 citations
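
To make the idea of read summarization in the entry above concrete, here is a minimal sketch that counts aligned reads per genomic feature. It is not the featureCounts implementation: the function name `count_reads_per_feature`, the tuple layouts, the half-open coordinate convention, and the rule of skipping reads that overlap more than one feature are all assumptions made for illustration. Grouping features by chromosome only loosely echoes the chromosome hashing mentioned in the abstract.

```python
def count_reads_per_feature(reads, features):
    """Count reads overlapping each feature (illustrative sketch only).

    reads:    iterable of (chrom, start, end), half-open coordinates (assumed)
    features: iterable of (name, chrom, start, end), half-open coordinates (assumed)
    """
    # Group features by chromosome so each read is compared only with
    # features on its own chromosome.
    by_chrom = {}
    counts = {}
    for name, chrom, start, end in features:
        by_chrom.setdefault(chrom, []).append((start, end, name))
        counts[name] = 0

    for chrom, r_start, r_end in reads:
        # Two half-open intervals overlap when each starts before the other ends.
        hits = {
            name
            for f_start, f_end, name in by_chrom.get(chrom, [])
            if r_start < f_end and f_start < r_end
        }
        # Assumption: a read hitting several features is ambiguous and is skipped.
        if len(hits) == 1:
            counts[hits.pop()] += 1
    return counts


features = [
    ("geneA", "chr1", 100, 200),
    ("geneB", "chr1", 150, 300),
    ("geneC", "chr2", 0, 50),
]
reads = [
    ("chr1", 90, 110),   # overlaps geneA only
    ("chr1", 160, 180),  # overlaps geneA and geneB, so it is skipped
    ("chr1", 250, 260),  # overlaps geneB only
    ("chr2", 10, 20),    # overlaps geneC only
    ("chr3", 0, 10),     # no features on this chromosome
]
result = count_reads_per_feature(reads, features)
```

This linear scan per chromosome is fine for a toy example; a real tool would index features so that each read is compared with only a few candidates.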

Journal Article
TL;DR: Canu, a successor of Celera Assembler that is specifically designed for noisy single-molecule sequences, is presented, demonstrating that Canu can reliably assemble complete microbial genomes and near-complete eukaryotic chromosomes using either Pacific Biosciences or Oxford Nanopore technologies.
Abstract: Long-read single-molecule sequencing has revolutionized de novo genome assembly and enabled the automated reconstruction of reference-quality genomes. However, given the relatively high error rates of such technologies, efficient and accurate assembly of large repeats and closely related haplotypes remains challenging. We address these issues with Canu, a successor of Celera Assembler that is specifically designed for noisy single-molecule sequences. Canu introduces support for nanopore sequencing, halves depth-of-coverage requirements, and improves assembly continuity while simultaneously reducing runtime by an order of magnitude on large genomes versus Celera Assembler 8.2. These advances result from new overlapping and assembly algorithms, including an adaptive overlapping strategy based on tf-idf weighted MinHash and a sparse assembly graph construction that avoids collapsing diverged repeats and haplotypes. We demonstrate that Canu can reliably assemble complete microbial genomes and near-complete eukaryotic chromosomes using either Pacific Biosciences (PacBio) or Oxford Nanopore technologies and achieves a contig NG50 of >21 Mbp on both human and Drosophila melanogaster PacBio data sets. For assembly structures that cannot be linearly represented, Canu provides graph-based assembly outputs in graphical fragment assembly (GFA) format for analysis or integration with complementary phasing and scaffolding techniques. The combination of such highly resolved assembly graphs with long-range scaffolding information promises the complete and automated assembly of complex genomes.

4,806 citations
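
The contig NG50 figure quoted in the entry above is a continuity statistic: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the estimated genome size. The sketch below shows one way to compute it; the function name `ng50` and the example lengths are hypothetical and are not taken from Canu or its data sets.

```python
def ng50(contig_lengths, genome_size):
    """Return the NG50 of an assembly, or None if the contigs cover
    less than half of genome_size (illustrative sketch only)."""
    target = genome_size / 2
    covered = 0
    # Walk contigs from longest to shortest, accumulating covered bases.
    for length in sorted(contig_lengths, reverse=True):
        covered += length
        if covered >= target:
            return length
    return None


# Hypothetical contig lengths, in arbitrary units.
lengths = [80, 70, 50, 40, 30, 20]

# N50 is the same calculation with the assembly's own total length as the reference.
n50_value = ng50(lengths, sum(lengths))
ng50_value = ng50(lengths, 400)
```

Because NG50 uses an external genome size estimate rather than the assembly total, it stays comparable across assemblies of different total length.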

Journal Article
TL;DR: The approach to utilizing available RNA-Seq and other data types in the authors' manual curation process for vertebrate, plant, and other species is summarized, and a new direction for prokaryotic genomes and protein name management is described.
Abstract: The RefSeq project at the National Center for Biotechnology Information (NCBI) maintains and curates a publicly available database of annotated genomic, transcript, and protein sequence records (http://www.ncbi.nlm.nih.gov/refseq/). The RefSeq project leverages the data submitted to the International Nucleotide Sequence Database Collaboration (INSDC) against a combination of computation, manual curation, and collaboration to produce a standard set of stable, non-redundant reference sequences. The RefSeq project augments these reference sequences with current knowledge including publications, functional features and informative nomenclature. The database currently represents sequences from more than 55,000 organisms (>4800 viruses, >40,000 prokaryotes and >10,000 eukaryotes; RefSeq release 71), ranging from a single record to complete genomes. This paper summarizes the current status of the viral, prokaryotic, and eukaryotic branches of the RefSeq project, reports on improvements to data access and details efforts to further expand the taxonomic representation of the collection. We also highlight diverse functional curation initiatives that support multiple uses of RefSeq data including taxonomic validation, genome annotation, comparative genomics, and clinical testing. We summarize our approach to utilizing available RNA-Seq and other data types in our manual curation process for vertebrate, plant, and other species, and describe a new direction for prokaryotic genomes and protein name management.

4,104 citations

Journal Article
TL;DR: This work presents a hierarchical genome-assembly process (HGAP) for high-quality de novo microbial genome assemblies using only a single, long-insert shotgun DNA library in conjunction with Single Molecule, Real-Time (SMRT) DNA sequencing.
Abstract: We present a hierarchical genome-assembly process (HGAP) for high-quality de novo microbial genome assemblies using only a single, long-insert shotgun DNA library in conjunction with Single Molecule, Real-Time (SMRT) DNA sequencing. Our method uses the longest reads as seeds to recruit all other reads for construction of highly accurate preassembled reads through a directed acyclic graph-based consensus procedure, which we follow with assembly using off-the-shelf long-read assemblers. In contrast to hybrid approaches, HGAP does not require highly accurate raw reads for error correction. We demonstrate efficient genome assembly for several microorganisms using as few as three SMRT Cell zero-mode waveguide arrays of sequencing and for BACs using just one SMRT Cell. Long repeat regions can be successfully resolved with this workflow. We also describe a consensus algorithm that incorporates SMRT sequencing primary quality values to produce de novo genome sequence exceeding 99.999% accuracy.

3,647 citations

Journal Article
TL;DR: PartitionFinder 2 is a program for automatically selecting best-fit partitioning schemes and models of evolution for phylogenetic analyses that includes the ability to analyze morphological datasets, new methods to analyze genome-scale datasets, and new output formats to facilitate interoperability with downstream software.
Abstract: PartitionFinder 2 is a program for automatically selecting best-fit partitioning schemes and models of evolution for phylogenetic analyses. PartitionFinder 2 is substantially faster and more efficient than version 1, and incorporates many new methods and features. These include the ability to analyze morphological datasets, new methods to analyze genome-scale datasets, new output formats to facilitate interoperability with downstream software, and many new models of molecular evolution. PartitionFinder 2 is freely available under an open source license and works on Windows, OSX, and Linux operating systems. It can be downloaded from www.robertlanfear.com/partitionfinder. The source code is available at https://github.com/brettc/partitionfinder.

3,445 citations