Author
Hui Li
Other affiliations: Beijing Genomics Institute
Bio: Hui Li is an academic researcher from University of Pennsylvania. The author has contributed to research in topics: Virus & Hepatitis C virus. The author has an h-index of 15 and has co-authored 20 publications receiving 3,326 citations. Previous affiliations of Hui Li include Beijing Genomics Institute.
Topics: Virus, Hepatitis C virus, Viral replication, Genome, Epitope
Papers
••
Duke University1, University of Texas at Austin2, Heidelberg Institute for Theoretical Studies3, Xi'an Jiaotong University4, Beijing Genomics Institute5, American Museum of Natural History6, New Mexico State University7, University of Sydney8, University of California9, Uppsala University10, University of Copenhagen11, Okinawa Institute of Science and Technology12, University of Georgia13, Griffith University14, Catalan Institution for Research and Advanced Studies15, Oak Ridge National Laboratory16, Joint Institute for Nuclear Research17, Aarhus University18, Washington University in St. Louis19, University of California, Santa Cruz20, Cardiff University21, Kunming Institute of Zoology22, China Agricultural University23, Louisiana State University24, Tulane University25, Copenhagen Zoo26, Oregon Health & Science University27, Federal University of Pará28, Technical University of Denmark29, Canterbury Museum30, Curtin University31, Novosibirsk State University32, Smithsonian Institution33, National University of Singapore34, National Museum of Natural History35, Nova Southeastern University36, Occidental College37, University of Edinburgh38, Harvard University39, University of California, San Francisco40, University of Florida41, University of Illinois at Urbana–Champaign42
TL;DR: A genome-scale phylogenetic analysis of 48 species representing all orders of Neoaves recovered a highly resolved tree that confirms previously controversial sister or close relationships and identifies the first divergence in Neoaves, two groups the authors named Passerea and Columbea.
Abstract: To better determine the history of modern birds, we performed a genome-scale phylogenetic analysis of 48 species representing all orders of Neoaves using phylogenomic methods created to handle genome-scale data. We recovered a highly resolved tree that confirms previously controversial sister or close relationships. We identified the first divergence in Neoaves, two groups we named Passerea and Columbea, representing independent lineages of diverse and convergently evolved land and water bird species. Among Passerea, we infer the common ancestor of core landbirds to have been an apex predator and confirm independent gains of vocal learning. Among Columbea, we identify pigeons and flamingoes as belonging to sister clades. Even with whole genomes, some of the earliest branches in Neoaves proved challenging to resolve, which was best explained by massive protein-coding sequence convergence and high levels of incomplete lineage sorting that occurred during a rapid radiation after the Cretaceous-Paleogene mass extinction event about 66 million years ago.
1,624 citations
••
Beijing Genomics Institute1, University of Copenhagen2, Royal Veterinary College3, Seoul National University4, University of Nebraska–Lincoln5, University of Porto6, University of South Carolina7, Montclair State University8, Uppsala University9, National University of Singapore10, University of California, Berkeley11, South China University of Technology12, Chinese Academy of Sciences13, Kunming Institute of Zoology14, Howard Hughes Medical Institute15, Aberystwyth University16, University of Kent17, University of California, Riverside18, Mississippi State University19, Austral University of Chile20, Swedish University of Agricultural Sciences21, China Agricultural University22, Cardiff University23, Copenhagen Zoo24, Louisiana State University25, Washington University in St. Louis26, Xi'an Jiaotong University27, University of California, Santa Cruz28, Nova Southeastern University Oceanographic Center29, Smithsonian Conservation Biology Institute30, National Museum of Natural History31, Natural History Museum32, University of California, San Francisco33, Harvard University34, University of Florida35, University of Edinburgh36, New Mexico State University37, Macau University of Science and Technology38, Curtin University39
TL;DR: This work explored bird macroevolution using full genomes from 48 avian species representing all major extant clades to reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits.
Abstract: Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits.
872 citations
••
TL;DR: TF viruses are enriched for higher Env content, enhanced cell-free infectivity, improved dendritic cell interaction, and relative IFN-α resistance, which should be considered in the development and testing of AIDS vaccines.
Abstract: Defining the virus–host interactions responsible for HIV-1 transmission, including the phenotypic requirements of viruses capable of establishing de novo infections, could be important for AIDS vaccine development. Previous analyses have failed to identify phenotypic properties other than chemokine receptor 5 (CCR5) and CD4+ T-cell tropism that are preferentially associated with viral transmission. However, most of these studies were limited to examining envelope (Env) function in the context of pseudoviruses. Here, we generated infectious molecular clones of transmitted founder (TF; n = 27) and chronic control (CC; n = 14) viruses of subtypes B (n = 18) and C (n = 23) and compared their phenotypic properties in assays specifically designed to probe the earliest stages of HIV-1 infection. We found that TF virions were 1.7-fold more infectious (P = 0.049) and contained 1.9-fold more Env per particle (P = 0.048) compared with CC viruses. TF viruses were also captured by monocyte-derived dendritic cells 1.7-fold more efficiently (P = 0.035) and more readily transferred to CD4+ T cells (P = 0.025). In primary CD4+ T cells, TF and CC viruses replicated with comparable kinetics; however, when propagated in the presence of IFN-α, TF viruses replicated to higher titers than CC viruses. This difference was significant for subtype B (P = 0.000013) but not subtype C (P = 0.53) viruses, possibly reflecting demographic differences of the respective patient cohorts. Together, these data indicate that TF viruses are enriched for higher Env content, enhanced cell-free infectivity, improved dendritic cell interaction, and relative IFN-α resistance. These viral properties, which likely act in concert, should be considered in the development and testing of AIDS vaccines.
384 citations
••
University of Oxford1, University of Washington2, University of Tennessee3, University of Pennsylvania4, Duke University5, University of Cape Town6, Los Alamos National Laboratory7, University of North Carolina at Chapel Hill8, Centre for the AIDS Programme of Research in South Africa9, Santa Fe Institute10
TL;DR: This work explains how CD8+ T cells can exert significant and sustained HIV-1 pressure even when escape is very slow, and shows that, within an individual, the impacts of other T cell factors on HIV-1 escape should be considered in the context of immunodominance.
Abstract: HIV-1 accumulates mutations in and around reactive epitopes to escape recognition and killing by CD8+ T cells. Measurements of HIV-1 time to escape should therefore provide information on which parameters are most important for T cell–mediated in vivo control of HIV-1. Primary HIV-1–specific T cell responses were fully mapped in 17 individuals, and the time to virus escape, which ranged from days to years, was measured for each epitope. While higher magnitude of an individual T cell response was associated with more rapid escape, the most significant T cell measure was its relative immunodominance measured in acute infection. This identified subject-level or “vertical” immunodominance as the primary determinant of in vivo CD8+ T cell pressure in HIV-1 infection. Conversely, escape was slowed significantly by lower population variability, or entropy, of the epitope targeted. Immunodominance and epitope entropy combined to explain half of all the variability in time to escape. These data explain how CD8+ T cells can exert significant and sustained HIV-1 pressure even when escape is very slow and that within an individual, the impacts of other T cell factors on HIV-1 escape should be considered in the context of immunodominance.
194 citations
••
TL;DR: A new stochastic model of the HCV life cycle is developed and it is found that the accumulation of mutations is surprisingly slow: at 30 days, the viral population on average is still 46% identical to its transmitted viral genome.
Abstract: Hepatitis C virus (HCV) is present in the host with multiple variants generated by its error prone RNA-dependent RNA polymerase. Little is known about the initial viral diversification and the viral life cycle processes that influence diversity. We studied the diversification of HCV during acute infection in 17 plasma donors, with frequent sampling early in infection. To analyze these data, we developed a new stochastic model of the HCV life cycle. We found that the accumulation of mutations is surprisingly slow: at 30 days, the viral population on average is still 46% identical to its transmitted viral genome. Fitting the model to the sequence data, we estimate the median in vivo viral mutation rate is 2.5×10⁻⁵ mutations per nucleotide per genome replication (range 1.6–6.2×10⁻⁵), about 5-fold lower than previous estimates. To confirm these results we analyzed the frequency of stop codons (N = 10) among all possible non-sense mutation targets (M = 898,335), and found a mutation rate of 2.8–3.2×10⁻⁵, consistent with the estimate from the dynamical model. The slow accumulation of mutations is consistent with slow turnover of infected cells and replication complexes within infected cells. This slow turnover is also inferred from the viral load kinetics. Our estimated mutation rate, which is similar to that of other RNA viruses (e.g., HIV and influenza), is also compatible with the accumulation of substitutions seen in HCV at the population level. Our model identifies the relevant processes (long-lived cells and slow turnover of replication complexes) and parameters involved in determining the rate of HCV diversification.
153 citations
Cited by
••
TL;DR: The approach to utilizing available RNA-Seq and other data types in the authors' manual curation process for vertebrate, plant, and other species is summarized, and a new direction for prokaryotic genomes and protein name management is described.
Abstract: The RefSeq project at the National Center for Biotechnology Information (NCBI) maintains and curates a publicly available database of annotated genomic, transcript, and protein sequence records (http://www.ncbi.nlm.nih.gov/refseq/). The RefSeq project leverages the data submitted to the International Nucleotide Sequence Database Collaboration (INSDC) against a combination of computation, manual curation, and collaboration to produce a standard set of stable, non-redundant reference sequences. The RefSeq project augments these reference sequences with current knowledge including publications, functional features and informative nomenclature. The database currently represents sequences from more than 55,000 organisms (>4800 viruses, >40,000 prokaryotes and >10,000 eukaryotes; RefSeq release 71), ranging from a single record to complete genomes. This paper summarizes the current status of the viral, prokaryotic, and eukaryotic branches of the RefSeq project, reports on improvements to data access and details efforts to further expand the taxonomic representation of the collection. We also highlight diverse functional curation initiatives that support multiple uses of RefSeq data including taxonomic validation, genome annotation, comparative genomics, and clinical testing. We summarize our approach to utilizing available RNA-Seq and other data types in our manual curation process for vertebrate, plant, and other species, and describe a new direction for prokaryotic genomes and protein name management.
4,104 citations
••
TL;DR: PartitionFinder 2 is a program for automatically selecting best-fit partitioning schemes and models of evolution for phylogenetic analyses that includes the ability to analyze morphological datasets, new methods to analyze genome-scale datasets, and new output formats to facilitate interoperability with downstream software.
Abstract: PartitionFinder 2 is a program for automatically selecting best-fit partitioning schemes and models of evolution for phylogenetic analyses. PartitionFinder 2 is substantially faster and more efficient than version 1, and incorporates many new methods and features. These include the ability to analyze morphological datasets, new methods to analyze genome-scale datasets, new output formats to facilitate interoperability with downstream software, and many new models of molecular evolution. PartitionFinder 2 is freely available under an open source license and works on Windows, OSX, and Linux operating systems. It can be downloaded from www.robertlanfear.com/partitionfinder. The source code is available at https://github.com/brettc/partitionfinder.
3,445 citations
•
TL;DR: This paper describes a test based on two conserved CHD (chromo-helicase-DNA-binding) genes that are located on the avian sex chromosomes of all birds, with the possible exception of the ratites (ostriches, etc.).
2,554 citations
••
Duke University1, University of Texas at Austin2, Heidelberg Institute for Theoretical Studies3, American Museum of Natural History4, Xi'an Jiaotong University5, Beijing Genomics Institute6, New Mexico State University7, University of Sydney8, University of California9, Uppsala University10, University of Copenhagen11, Okinawa Institute of Science and Technology12, University of Georgia13, Griffith University14, Catalan Institution for Research and Advanced Studies15, Joint Institute for Nuclear Research16, Oak Ridge National Laboratory17, Aarhus University18, Washington University in St. Louis19, University of California, Santa Cruz20, Cardiff University21, Kunming Institute of Zoology22, China Agricultural University23, Louisiana State University24, Tulane University25, Copenhagen Zoo26, Federal University of Pará27, Oregon Health & Science University28, Technical University of Denmark29, Canterbury Museum30, Curtin University31, Novosibirsk State University32, Smithsonian Institution33, National University of Singapore34, National Museum of Natural History35, Nova Southeastern University36, Occidental College37, University of Edinburgh38, Harvard University39, University of California, San Francisco40, University of Florida41, University of Illinois at Urbana–Champaign42
TL;DR: A genome-scale phylogenetic analysis of 48 species representing all orders of Neoaves recovered a highly resolved tree that confirms previously controversial sister or close relationships and identifies the first divergence in Neoaves, two groups the authors named Passerea and Columbea.
Abstract: To better determine the history of modern birds, we performed a genome-scale phylogenetic analysis of 48 species representing all orders of Neoaves using phylogenomic methods created to handle genome-scale data. We recovered a highly resolved tree that confirms previously controversial sister or close relationships. We identified the first divergence in Neoaves, two groups we named Passerea and Columbea, representing independent lineages of diverse and convergently evolved land and water bird species. Among Passerea, we infer the common ancestor of core landbirds to have been an apex predator and confirm independent gains of vocal learning. Among Columbea, we identify pigeons and flamingoes as belonging to sister clades. Even with whole genomes, some of the earliest branches in Neoaves proved challenging to resolve, which was best explained by massive protein-coding sequence convergence and high levels of incomplete lineage sorting that occurred during a rapid radiation after the Cretaceous-Paleogene mass extinction event about 66 million years ago.
1,624 citations
••
TL;DR: This work presents BUSCO v3 with example analyses that highlight the wide‐ranging utility of BUSCO assessments, which extend beyond quality control of genomics data sets to applications in comparative genomics analyses, gene predictor training, metagenomics, and phylogenomics.
Abstract: Genomics promises comprehensive surveying of genomes and metagenomes, but rapidly changing technologies and expanding data volumes make evaluation of completeness a challenging task. Technical sequencing quality metrics can be complemented by quantifying completeness of genomic data sets in terms of the expected gene content of Benchmarking Universal Single-Copy Orthologs (BUSCO, http://busco.ezlab.org). The latest software release implements a complete refactoring of the code to make it more flexible and extendable to facilitate high-throughput assessments. The original six lineage assessment data sets have been updated with improved species sampling, 34 new subsets have been built for vertebrates, arthropods, fungi, and prokaryotes that greatly enhance resolution, and data sets are now also available for nematodes, protists, and plants. Here, we present BUSCO v3 with example analyses that highlight the wide-ranging utility of BUSCO assessments, which extend beyond quality control of genomics data sets to applications in comparative genomics analyses, gene predictor training, metagenomics, and phylogenomics.
1,575 citations