Author

David K. Smith

Bio: David K. Smith is an academic researcher from the University of Hong Kong. The author has contributed to research in topics: Influenza A virus & Influenza A virus subtype H5N1. The author has an h-index of 27 and has co-authored 68 publications that have received 4,564 citations. Previous affiliations of David K. Smith include Li Ka Shing Faculty of Medicine, University of Hong Kong & Shantou University.


Papers
Journal Article
TL;DR: ggtree is an R package that provides programmable visualization and annotation of phylogenetic trees; it can read more tree file formats than other software and supports visualization of phylo, multiphylo, phylo4, phylo4d, obkdata and phyloseq tree objects defined in other R packages.
Abstract: We present an R package, ggtree, which provides programmable visualization and annotation of phylogenetic trees. ggtree can read more tree file formats than other software, including newick, nexus, NHX, phylip and jplace formats, and supports visualization of phylo, multiphylo, phylo4, phylo4d, obkdata and phyloseq tree objects defined in other R packages. It can also extract the tree/branch/node-specific and other data from the analysis outputs of beast, epa, hyphy, paml, phylodog, pplacer, r8s, raxml and revbayes software, and allows using these data to annotate the tree. The package allows colouring and annotation of a tree by numerical/categorical node attributes, manipulating a tree by rotating, collapsing and zooming out clades, highlighting user selected clades or operational taxonomic units and exploration of a large tree by zooming into a selected portion. A two-dimensional tree can be drawn by scaling the tree width based on an attribute of the nodes. A tree can be annotated with an associated numerical matrix (as a heat map), multiple sequence alignment, subplots or silhouette images. The package ggtree is released under the artistic-2.0 license. The source code and documents are freely available through Bioconductor (http://www.bioconductor.org/packages/ggtree).

2,692 citations

Journal Article
10 Oct 2013-Nature
TL;DR: It is shown that H7 viruses probably transferred from domestic duck to chicken populations in China on at least two independent occasions and subsequently reassorted with enzootic H9N2 viruses to generate the H7N9 outbreak lineage, and a related previously unrecognized H7N7 lineage.
Abstract: A novel H7N9 influenza A virus first detected in March 2013 has since caused more than 130 human infections in China, resulting in 40 deaths. Preliminary analyses suggest that the virus is a reassortant of H7, N9 and H9N2 avian influenza viruses, and carries some amino acids associated with mammalian receptor binding, raising concerns of a new pandemic. However, neither the source populations of the H7N9 outbreak lineage nor the conditions for its genesis are fully known. Using a combination of active surveillance, screening of virus archives, and evolutionary analyses, here we show that H7 viruses probably transferred from domestic duck to chicken populations in China on at least two independent occasions. We show that the H7 viruses subsequently reassorted with enzootic H9N2 viruses to generate the H7N9 outbreak lineage, and a related previously unrecognized H7N7 lineage. The H7N9 outbreak lineage has spread over a large geographic region and is prevalent in chickens at live poultry markets, which are thought to be the immediate source of human infections. Whether the H7N9 outbreak lineage has, or will, become enzootic in China and neighbouring regions requires further investigation. The discovery here of a related H7N7 influenza virus in chickens that has the ability to infect mammals experimentally, suggests that H7 viruses may pose threats beyond the current outbreak. The continuing prevalence of H7 viruses in poultry could lead to the generation of highly pathogenic variants and further sporadic human infections, with a continued risk of the virus acquiring human-to-human transmissibility.

420 citations

Journal Article
01 Jan 2016-Science
TL;DR: Camels serve as an important reservoir for the maintenance and diversification of the MERS-CoVs and are the source of human infections with this virus, according to surveillance in Saudi Arabia in 2014 and 2015.
Abstract: Outbreaks of Middle East respiratory syndrome (MERS) raise questions about the prevalence and evolution of the MERS coronavirus (CoV) in its animal reservoir. Our surveillance in Saudi Arabia in 2014 and 2015 showed that viruses of the MERS-CoV species and a human CoV 229E-related lineage co-circulated at high prevalence, with frequent co-infections in the upper respiratory tract of dromedary camels. Including viruses of the betacoronavirus 1 species, we found that dromedary camels share three CoV species with humans. Several MERS-CoV lineages were present in camels, including a recombinant lineage that has been dominant since December 2014 and that subsequently led to the human outbreaks in 2015. Camels therefore serve as an important reservoir for the maintenance and diversification of the MERS-CoVs and are the source of human infections with this virus.

373 citations

Journal Article
TL;DR: The distinctive amino acid biases of high-B-factor ordered regions, short disordered regions, and long disordered regions indicate that the sequence determinants for these flexibility categories differ from one another, whereas the significantly-greater-than-chance predictability of these categories from sequence suggests that flexible ordered regions and short disorder are, to a significant degree, encoded at the primary structure level.
Abstract: Comparisons were made among four categories of protein flexibility: (1) low-B-factor ordered regions, (2) high-B-factor ordered regions, (3) short disordered regions, and (4) long disordered regions. Amino acid compositions of the four categories were found to be significantly different from each other, with high-B-factor ordered and short disordered regions being the most similar pair. The high-B-factor (flexible) ordered regions are characterized by a higher average flexibility index, higher average hydrophilicity, higher average absolute net charge, and higher total charge than disordered regions. The low-B-factor regions are significantly enriched in hydrophobic residues and depleted in the total number of charged residues compared to the other three categories. We examined the predictability of the high-B-factor regions and developed a predictor that discriminates between regions of low and high B-factors. This predictor achieved an accuracy of 70% and a correlation of 0.43 with experimental data, outperforming the 64% accuracy and 0.32 correlation of predictors based solely on flexibility indices. To further clarify the differences between short disordered regions and ordered regions, a predictor of short disordered regions was developed. Its relatively high accuracy of 81% indicates considerable differences between ordered and disordered regions. The distinctive amino acid biases of high-B-factor ordered regions, short disordered regions, and long disordered regions indicate that the sequence determinants for these flexibility categories differ from one another, whereas the significantly-greater-than-chance predictability of these categories from sequence suggests that flexible ordered regions, short disorder, and long disorder are, to a significant degree, encoded at the primary structure level.

333 citations

Journal Article
TL;DR: It was found that the 1918 pandemic H1N1 virus contained genes with mammalian-like viral codon usage patterns, indicating that the introduction of this virus to humans was not through in toto transfer of an avian influenza virus.
Abstract: The influenza A virus is an important infectious cause of morbidity and mortality in humans and was responsible for 3 pandemics in the 20th century. As the replication of the influenza virus is based on its host's machinery, codon usage of its viral genes might be subject to host selection pressures, especially after interspecies transmission. A better understanding of viral evolution and host adaptive responses might help control this disease. Relative Synonymous Codon Usage (RSCU) values of the genes from segment 1 to segment 6 of avian and human influenza viruses, including pandemic H1N1, were studied via Correspondence Analysis (CA). The codon usage patterns of seasonal human influenza viruses were distinct among their subtypes and different from those of avian viruses. Newly isolated viruses could be added to the CA results, creating a tool to investigate the host origin and evolution of viral genes. It was found that the 1918 pandemic H1N1 virus contained genes with mammalian-like viral codon usage patterns, indicating that the introduction of this virus to humans was not through in toto transfer of an avian influenza virus. Many human viral genes had directional changes in codon usage over time of viral isolation, indicating the effect of host selection pressures. These changes reduced the overall GC content and the usage of G at the third codon position in the viral genome. Limited evidence of translational selection pressure was found in a few viral genes. Codon usage patterns from CA allowed identification of host origin and evolutionary trends in influenza viruses, providing an alternative method and a tool to understand the evolution of influenza viruses. Human influenza viruses are subject to selection pressure on codon usage which might assist in understanding the characteristics of newly emerging viruses.

265 citations
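
The Relative Synonymous Codon Usage (RSCU) measure named in the codon usage abstract above is conventionally defined as a codon's observed count divided by the count expected if all synonymous codons for that amino acid were used equally. The Python sketch below illustrates that standard definition only. It is not the authors' pipeline, and the function name `rscu`, the choice of the standard genetic code, and the toy sequence are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group codons into synonymous families keyed by amino acid.
SYNONYMS = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    SYNONYMS[aa].append(codon)


def rscu(sequence):
    """Return {codon: RSCU} for an in-frame coding sequence (hypothetical helper).

    RSCU = observed codon count * family size / total count of the family.
    Stop codons and single-codon amino acids (Met, Trp) are skipped, as are
    amino acids that do not occur in the sequence.
    """
    seq = sequence.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = (seq[i:i + 3] for i in range(0, usable, 3))
    counts = Counter(c for c in codons if c in CODON_TABLE)

    values = {}
    for aa, family in SYNONYMS.items():
        if aa == "*" or len(family) == 1:
            continue
        total = sum(counts[c] for c in family)
        if total == 0:
            continue
        for c in family:
            values[c] = counts[c] * len(family) / total
    return values


# Toy example: four alanine codons (GCT, GCT, GCC, GCA).
example = rscu("GCTGCTGCCGCA")
print(example["GCT"])  # 2.0
print(example["GCG"])  # 0.0
```

An RSCU value of 1.0 means a codon is used as often as expected under equal synonymous usage; values above or below 1.0 indicate over- or under-representation. A table of such values per gene is the kind of input a correspondence analysis would take, although the study's exact preprocessing is not given in the abstract.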


Cited by
01 Feb 2015
TL;DR: In this article, the authors describe the integrative analysis of 111 reference human epigenomes generated as part of the NIH Roadmap Epigenomics Consortium, profiled for histone modification patterns, DNA accessibility, DNA methylation and RNA expression.
Abstract: The reference human genome sequence set the stage for studies of genetic variation and its association with human disease, but epigenomic studies lack a similar reference. To address this need, the NIH Roadmap Epigenomics Consortium generated the largest collection so far of human epigenomes for primary cells and tissues. Here we describe the integrative analysis of 111 reference human epigenomes generated as part of the programme, profiled for histone modification patterns, DNA accessibility, DNA methylation and RNA expression. We establish global maps of regulatory elements, define regulatory modules of coordinated activity, and their likely activators and repressors. We show that disease- and trait-associated genetic variants are enriched in tissue-specific epigenomic marks, revealing biologically relevant cell types for diverse human traits, and providing a resource for interpreting the molecular basis of human disease. Our results demonstrate the central role of epigenomic information for understanding gene regulation, cellular differentiation and human disease.

4,409 citations

Journal Article
TL;DR: The viral factors that enabled the emergence of diseases such as severe acute respiratory syndrome and Middle East respiratory syndrome are explored and the diversity and potential of bat-borne coronaviruses are highlighted.
Abstract: Severe acute respiratory syndrome coronavirus (SARS-CoV) and Middle East respiratory syndrome coronavirus (MERS-CoV) are two highly transmissible and pathogenic viruses that emerged in humans at the beginning of the 21st century. Both viruses likely originated in bats, and genetically diverse coronaviruses that are related to SARS-CoV and MERS-CoV were discovered in bats worldwide. In this Review, we summarize the current knowledge on the origin and evolution of these two pathogenic coronaviruses and discuss their receptor usage; we also highlight the diversity and potential of spillover of bat-borne coronaviruses, as evidenced by the recent spillover of swine acute diarrhoea syndrome coronavirus (SADS-CoV) to pigs. Coronaviruses have a broad host range and distribution, and some highly pathogenic lineages have spilled over to humans and animals. Here, Cui, Li and Shi explore the viral factors that enabled the emergence of diseases such as severe acute respiratory syndrome and Middle East respiratory syndrome.

3,970 citations

Journal Article
TL;DR: The emergence of Middle East respiratory syndrome coronavirus (MERS-CoV) in 2012 marked the second introduction of a highly pathogenic coronavirus into the human population in the twenty-first century, and the current state of development of measures to combat emerging coronaviruses is discussed.
Abstract: The emergence of Middle East respiratory syndrome coronavirus (MERS-CoV) in 2012 marked the second introduction of a highly pathogenic coronavirus into the human population in the twenty-first century. The continuing introductions of MERS-CoV from dromedary camels, the subsequent travel-related viral spread, the unprecedented nosocomial outbreaks and the high case-fatality rates highlight the need for prophylactic and therapeutic measures. Scientific advancements since the 2002-2003 severe acute respiratory syndrome coronavirus (SARS-CoV) pandemic allowed for rapid progress in our understanding of the epidemiology and pathogenesis of MERS-CoV and the development of therapeutics. In this Review, we detail our present understanding of the transmission and pathogenesis of SARS-CoV and MERS-CoV, and discuss the current state of development of measures to combat emerging coronaviruses.

2,794 citations
