Author

William W. L. Hsiao

Bio: William W. L. Hsiao is an academic researcher from University of British Columbia. The author has contributed to research in topics: Genome & Ontology (information science). The author has an h-index of 26 and has co-authored 53 publications receiving 3833 citations. Previous affiliations of William W. L. Hsiao include Simon Fraser University & Provincial Health Services Authority.


Papers
Journal Article
TL;DR: A new Resistomes & Variants module provides analysis and statistical summary of in silico predicted resistance variants from 82 pathogens and over 100 000 genomes, making it possible to summarize predicted resistance using the information included in CARD, identify trends in AMR mobility and determine previously undescribed and novel resistance variants.
Abstract: The Comprehensive Antibiotic Resistance Database (CARD; https://card.mcmaster.ca) is a curated resource providing reference DNA and protein sequences, detection models and bioinformatics tools on the molecular basis of bacterial antimicrobial resistance (AMR). CARD focuses on providing high-quality reference data and molecular sequences within a controlled vocabulary, the Antibiotic Resistance Ontology (ARO), designed by the CARD biocuration team to integrate with software development efforts for resistome analysis and prediction, such as CARD's Resistance Gene Identifier (RGI) software. Since 2017, CARD has expanded through extensive curation of reference sequences, revision of the ontological structure, curation of over 500 new AMR detection models, development of a new classification paradigm and expansion of analytical tools. Most notably, a new Resistomes & Variants module provides analysis and statistical summary of in silico predicted resistance variants from 82 pathogens and over 100 000 genomes. By adding these resistance variants to CARD, we are able to summarize predicted resistance using the information included in CARD, identify trends in AMR mobility and determine previously undescribed and novel resistance variants. Here, we describe updates and recent expansions to CARD and its biocuration process, including new resources for community biocuration of AMR molecular reference data.

1,526 citations

Journal Article
TL;DR: Overall, RHA1 appears to have evolved to simultaneously catabolize a diverse range of plant-derived compounds in an O2-rich environment and is established as an important model for studying actinomycete physiology.
Abstract: Rhodococcus sp. RHA1 (RHA1) is a potent polychlorinated biphenyl-degrading soil actinomycete that catabolizes a wide range of compounds and represents a genus of considerable industrial interest. RHA1 has one of the largest bacterial genomes sequenced to date, comprising 9,702,737 bp (67% G+C) arranged in a linear chromosome and three linear plasmids. A targeted insertion methodology was developed to determine the telomeric sequences. RHA1's 9,145 predicted protein-encoding genes are exceptionally rich in oxygenases (203) and ligases (192). Many of the oxygenases occur in the numerous pathways predicted to degrade aromatic compounds (30) or steroids (4). RHA1 also contains 24 nonribosomal peptide synthase genes, six of which exceed 25 kbp, and seven polyketide synthase genes, providing evidence that rhodococci harbor an extensive secondary metabolism. Among sequenced genomes, RHA1 is most similar to those of nocardial and mycobacterial strains. The genome contains few recent gene duplications. Moreover, three different analyses indicate that RHA1 has acquired fewer genes by recent horizontal transfer than most bacteria characterized to date and far fewer than Burkholderia xenovorans LB400, whose genome size and catabolic versatility rival those of RHA1. RHA1 and LB400 thus appear to demonstrate that ecologically similar bacteria can evolve large genomes by different means. Overall, RHA1 appears to have evolved to simultaneously catabolize a diverse range of plant-derived compounds in an O2-rich environment. In addition to establishing RHA1 as an important model for studying actinomycete physiology, this study provides critical insights that facilitate the exploitation of these industrially important microorganisms.

625 citations

Journal Article
TL;DR: IslandPath is a network service which incorporates multiple DNA signals and genome annotation features into a graphical display of a bacterial or archaeal genome, to aid the detection of genomic islands.
Abstract: Summary: Genomic islands (clusters of genes of potential horizontal origin in a prokaryotic genome) are frequently associated with a particular adaptation of a microbe that is of medical, agricultural or environmental importance, such as antibiotic resistance, pathogen virulence, or metal resistance. While many sequence features associated with such islands have been adopted separately in applications for analysis of genomic islands, including pathogenicity islands, there is no single application that integrates multiple features for island detection. IslandPath is a network service which incorporates multiple DNA signals and genome annotation features into a graphical display of a bacterial or archaeal genome, to aid the detection of genomic islands. Availability: This application is available at http://www.pathogenomics.sfu.ca/islandpath and the source code is freely available, under GNU public licence, from the

337 citations

Journal Article
TL;DR: Gene expression patterns in gut biopsies from individuals with common variable immunodeficiency or with HIV infection and intestinal malabsorption were very similar to those of the B cell–deficient mice, providing a possible explanation for a longstanding enigmatic association between immunodeficiency and defective lipid absorption in humans.
Abstract: Using a systems biology approach, we discovered and dissected a three-way interaction between the immune system, the intestinal epithelium and the microbiota. We found that, in the absence of B cells, or of IgA, and in the presence of the microbiota, the intestinal epithelium launches its own protective mechanisms, upregulating interferon-inducible immune response pathways and simultaneously repressing Gata4-related metabolic functions. This shift in intestinal function leads to lipid malabsorption and decreased deposition of body fat. Network analysis revealed the presence of two interconnected epithelial-cell gene networks, one governing lipid metabolism and another regulating immunity, that were inversely expressed. Gene expression patterns in gut biopsies from individuals with common variable immunodeficiency or with HIV infection and intestinal malabsorption were very similar to those of the B cell–deficient mice, providing a possible explanation for a longstanding enigmatic association between immunodeficiency and defective lipid absorption in humans.

325 citations

Journal Article
TL;DR: The accuracy of several sequence composition-based GI predictors is evaluated using datasets built independently of sequence composition. The comparative genomics approach, IslandPick, was the most accurate when compared with a curated list of GIs, indicating that the constructed datasets are suitable.
Abstract: Genomic islands (GIs) are clusters of genes in prokaryotic genomes of probable horizontal origin. GIs are disproportionately associated with microbial adaptations of medical or environmental interest. Recently, multiple programs for automated detection of GIs have been developed that utilize sequence composition characteristics, such as G+C ratio and dinucleotide bias. To robustly evaluate the accuracy of such methods, we propose that a dataset of GIs be constructed using criteria that are independent of sequence composition-based analysis approaches. We developed a comparative genomics approach (IslandPick) that identifies both very probable islands and non-island regions. The approach involves 1) flexible, automated selection of comparative genomes for each query genome, using a distance function that picks appropriate genomes for identification of GIs, 2) identification of regions unique to the query genome, compared with the chosen genomes (positive dataset) and 3) identification of regions conserved across all genomes (negative dataset). Using our constructed datasets, we investigated the accuracy of several sequence composition-based GI prediction tools. Our results indicate that AlienHunter has the highest recall, but the lowest measured precision, while SIGI-HMM is the most precise method. SIGI-HMM and IslandPath/DIMOB have comparable overall highest accuracy. Our comparative genomics approach, IslandPick, was the most accurate, compared with a curated list of GIs, indicating that we have constructed suitable datasets. This represents the first evaluation, using diverse and independent datasets that were not artificially constructed, of the accuracy of several sequence composition-based GI predictors. The caveats associated with this analysis and proposals for optimal island prediction are discussed.

258 citations
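
As a rough illustration of the evaluation described in the abstract above, the following Python sketch scores a set of predicted islands against a positive (island) dataset and a negative (non-island) dataset. The function names, the inclusive-interval representation, and the per-nucleotide counting are assumptions made for illustration; the abstract does not state how overlap was counted.

```python
def covered_positions(intervals):
    """Expand inclusive (start, end) intervals into a set of positions."""
    positions = set()
    for start, end in intervals:
        positions.update(range(start, end + 1))
    return positions


def precision_recall(predicted, positive, negative):
    """Score predicted islands per nucleotide (an assumed counting scheme).

    True positives:  predicted positions inside the positive dataset.
    False positives: predicted positions inside the negative dataset.
    False negatives: positive-dataset positions that were not predicted.
    Positions in neither dataset are ignored.
    """
    pred = covered_positions(predicted)
    pos = covered_positions(positive)
    neg = covered_positions(negative)

    tp = len(pred & pos)
    fp = len(pred & neg)
    fn = len(pos - pred)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical coordinates, chosen only to exercise the function.
precision, recall = precision_recall(
    predicted=[(1, 10), (21, 25)],
    positive=[(1, 20)],
    negative=[(21, 40)],
)
```

Under this scheme a tool that predicts many large regions tends to gain recall at the cost of precision, which is the kind of trade-off the abstract reports between AlienHunter and SIGI-HMM.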


Cited by
01 Jun 2012
TL;DR: SPAdes is a new assembler for both single-cell and standard (multicell) assembly that improves on the recently released E+V-SC assembler (for single-cell data) and on the popular assemblers Velvet and SoapDeNovo (for multicell data).
Abstract: The lion's share of bacteria in various environments cannot be cloned in the laboratory and thus cannot be sequenced using existing technologies. A major goal of single-cell genomics is to complement gene-centric metagenomic data with whole-genome assemblies of uncultivated organisms. Assembly of single-cell data is challenging because of highly non-uniform read coverage as well as elevated levels of sequencing errors and chimeric reads. We describe SPAdes, a new assembler for both single-cell and standard (multicell) assembly, and demonstrate that it improves on the recently released E+V-SC assembler (specialized for single-cell data) and on popular assemblers Velvet and SoapDeNovo (for multicell data). SPAdes generates single-cell assemblies, providing information about genomes of uncultivatable bacteria that vastly exceeds what may be obtained via traditional metagenomics studies. SPAdes is available online ( http://bioinf.spbau.ru/spades ). It is distributed as open source software.

10,124 citations

Journal Article
TL;DR: An objective measure of genome quality is proposed that can be used to select genomes suitable for specific gene- and genome-centric analyses of microbial communities and is shown to provide accurate estimates of genome completeness and contamination and to outperform existing approaches.
Abstract: Large-scale recovery of genomes from isolates, single cells, and metagenomic data has been made possible by advances in computational methods and substantial reductions in sequencing costs. Although this increasing breadth of draft genomes is providing key information regarding the evolutionary and functional diversity of microbial life, it has become impractical to finish all available reference genomes. Making robust biological inferences from draft genomes requires accurate estimates of their completeness and contamination. Current methods for assessing genome quality are ad hoc and generally make use of a limited number of “marker” genes conserved across all bacterial or archaeal genomes. Here we introduce CheckM, an automated method for assessing the quality of a genome using a broader set of marker genes specific to the position of a genome within a reference genome tree and information about the collocation of these genes. We demonstrate the effectiveness of CheckM using synthetic data and a wide range of isolate-, single-cell-, and metagenome-derived genomes. CheckM is shown to provide accurate estimates of genome completeness and contamination and to outperform existing approaches. Using CheckM, we identify a diverse range of errors currently impacting publicly available isolate genomes and demonstrate that genomes obtained from single cells and metagenomic data vary substantially in quality. In order to facilitate the use of draft genomes, we propose an objective measure of genome quality that can be used to select genomes suitable for specific gene- and genome-centric analyses of microbial communities.

5,788 citations
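
To make the two quality measures in the abstract above concrete, here is a deliberately simplified Python sketch of marker-gene-based completeness and contamination. It is not CheckM's method: it ignores lineage-specific marker selection and the collocation of marker genes that the abstract describes. The function name `estimate_quality` and the marker identifiers are hypothetical.

```python
from collections import Counter


def estimate_quality(marker_hits, marker_set):
    """Return (completeness %, contamination %) from marker gene hits.

    Simplifying assumptions for this sketch:
    - every marker in marker_set is expected exactly once per genome;
    - completeness is the share of expected markers seen at least once;
    - contamination is the number of extra copies relative to the number
      of expected markers.
    """
    counts = Counter(hit for hit in marker_hits if hit in marker_set)
    expected = len(marker_set)
    if expected == 0:
        raise ValueError("marker_set must not be empty")

    present = sum(1 for marker in marker_set if counts[marker] >= 1)
    extra_copies = sum(counts[marker] - 1 for marker in marker_set
                       if counts[marker] > 1)

    completeness = 100.0 * present / expected
    contamination = 100.0 * extra_copies / expected
    return completeness, contamination


# Hypothetical marker identifiers; "x" is not a marker and is ignored.
completeness, contamination = estimate_quality(
    marker_hits=["m1", "m2", "m2", "m3", "x"],
    marker_set={"m1", "m2", "m3", "m4"},
)
```

A missing marker lowers completeness, while a duplicated marker raises contamination, since a single-copy gene seen twice suggests sequence from a second organism.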

Journal Article
25 Jun 2010-PLOS ONE
TL;DR: A new method is described for aligning two or more genomes that have undergone rearrangements due to recombination and substantial amounts of segmental gain and loss. It demonstrates high accuracy in situations where genomes have undergone biologically feasible amounts of genome rearrangement, segmental gain and loss.
Abstract: Background: Multiple genome alignment remains a challenging problem. Effects of recombination including rearrangement, segmental duplication, gain, and loss can create a mosaic pattern of homology even among closely related organisms.

3,302 citations

Journal Article
27 Mar 2014-Cell
TL;DR: In high-income countries, overuse of antibiotics, changes in diet, and elimination of constitutive partners, such as nematodes, may have selected for a microbiota that lacks the resilience and diversity required to establish balanced immune responses.

3,257 citations

Journal Article
TL;DR: The large-scale dynamics of the microbiome can be described by many of the tools and observations used in the study of population ecology, and deciphering the metagenome and its aggregate genetic information can also be used to understand the functional properties of the microbial community.
Abstract: Interest in the role of the microbiome in human health has burgeoned over the past decade with the advent of new technologies for interrogating complex microbial communities. The large-scale dynamics of the microbiome can be described by many of the tools and observations used in the study of population ecology. Deciphering the metagenome and its aggregate genetic information can also be used to understand the functional properties of the microbial community. Both the microbiome and metagenome probably have important functions in health and disease; their exploration is a frontier in human genetics.

2,650 citations