
Showing papers by "Liqing Zhang" published in 2016


Journal ArticleDOI
15 Sep 2016-PLOS ONE
TL;DR: A new online user-friendly metagenomics analysis server called MetaStorm is developed, which facilitates customization of computational analysis for metagenomic data sets and provides enhanced interactive visualization to allow researchers to explore and manipulate taxonomy and functional annotation at various levels of resolution.
Abstract: Metagenomics is a trending research area, calling for the need to analyze large quantities of data generated from next generation DNA sequencing technologies. The need to store, retrieve, analyze, share, and visualize such data challenges current online computational systems. Interpretation and annotation of specific information is especially a challenge for metagenomic data sets derived from environmental samples, because current annotation systems only offer broad classification of microbial diversity and function. Moreover, existing resources are not configured to readily address common questions relevant to environmental systems. Here we developed a new online user-friendly metagenomic analysis server called MetaStorm (http://bench.cs.vt.edu/MetaStorm/), which facilitates customization of computational analysis for metagenomic data sets. Users can upload their own reference databases to tailor the metagenomics annotation to focus on various taxonomic and functional gene markers of interest. MetaStorm offers two major analysis pipelines: an assembly-based annotation pipeline and the standard read annotation pipeline used by existing web servers. These pipelines can be selected individually or together. Overall, MetaStorm provides enhanced interactive visualization to allow researchers to explore and manipulate taxonomy and functional annotation at various levels of resolution.

63 citations


Journal ArticleDOI
TL;DR: This rare flood afforded the opportunity to gain deeper insight into factors influencing the spread of ARGs in watersheds: bulk water bacterial phylogeny correlated with ARG profiles, while sediment phylogeny varied along the river's anthropogenic gradient.
Abstract: Record-breaking floods in September 2013 caused massive damage to homes and infrastructure across the Colorado Front Range and heavily impacted the Cache La Poudre River watershed. Given the unique nature of this watershed as a test-bed for tracking environmental pathways of antibiotic resistance gene (ARG) dissemination, we sought to determine the impact of extreme flooding on ARG reservoirs in river water and sediment. We utilized high-throughput DNA sequencing to obtain metagenomic profiles of ARGs before and after flooding, and investigated 23 antibiotics and 14 metals as putative selective agents during post-flood recovery. With 277 ARG subtypes identified across samples, total bulk water ARGs decreased following the flood but recovered to near pre-flood abundances by ten months post-flood at both a pristine site and at a site historically heavily influenced by wastewater treatment plants and animal feeding operations. Network analysis of de novo assembled sequencing reads into 52,556 scaffolds identified ARGs likely located on mobile genetic elements, with up to 11 ARGs per plasmid-associated scaffold. Bulk water bacterial phylogeny correlated with ARG profiles while sediment phylogeny varied along the river's anthropogenic gradient. This rare flood afforded the opportunity to gain deeper insight into factors influencing the spread of ARGs in watersheds.

57 citations


Journal ArticleDOI
01 Dec 2016-Anaerobe
TL;DR: The effect of quercetin on the growth and genetic expression of three different commensal gut bacteria was documented for the first time and provides insight into the interactions between genetic regulation and growth.

36 citations


Journal ArticleDOI
Hong Tran, Xiaowei Wu, Saima Sultana Tithi, Ming-an Sun, Hehuang Xie, Liqing Zhang
24 Mar 2016-PLOS ONE
TL;DR: A Bayesian model that assigns multireads to their most likely locations based on the posterior probability derived from information hidden in uniquely aligned reads is developed and shows robust performance with low coverage depth, making it particularly attractive considering the prohibitive cost of bisulfite sequencing.
Abstract: DNA methylation is an epigenetic modification critical for normal development and diseases. The determination of genome-wide DNA methylation at single-nucleotide resolution is made possible by sequencing bisulfite treated DNA with next generation high-throughput sequencing. However, aligning bisulfite short reads to a reference genome remains challenging as only a limited proportion of them (around 50–70%) can be aligned uniquely; a significant proportion, known as multireads, are mapped to multiple locations and thus discarded from downstream analyses, causing financial waste and biased methylation inference. To address this issue, we develop a Bayesian model that assigns multireads to their most likely locations based on the posterior probability derived from information hidden in uniquely aligned reads. Analyses of both simulated data and real hairpin bisulfite sequencing data show that our method can effectively assign approximately 70% of the multireads to their best locations with up to 90% accuracy, leading to a significant increase in the overall mapping efficiency. Moreover, the assignment model shows robust performance with low coverage depth, making it particularly attractive considering the prohibitive cost of bisulfite sequencing. Additionally, results show that longer reads help improve the performance of the assignment model. The assignment model is also robust to varying degrees of methylation and varying sequencing error rates. Finally, incorporating prior knowledge on mutation rate and context specific methylation level into the assignment model increases inference accuracy. The assignment model is implemented in the BAM-ABS package and freely available at https://github.com/zhanglabvt/BAM_ABS.
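
The idea of assigning a multiread by posterior probability can be illustrated with a minimal sketch. This is not the BAM-ABS implementation: the function names, the per-site likelihood, the `error_rate` parameter, and the `min_posterior` threshold are all assumptions made for illustration. The sketch assumes that each candidate location has per-cytosine methylation levels already estimated from uniquely aligned reads, and that the read is summarized as a list of C/T calls at those cytosines.

```python
def multiread_posteriors(read_calls, site_methylation, priors=None, error_rate=0.01):
    """Posterior probability of each candidate location for one multiread.

    read_calls: list of bools, True if the read shows C (methylated) at a cytosine.
    site_methylation: one list per candidate location, giving the methylation
        level (0..1) at each cytosine, estimated from uniquely aligned reads.
    priors: optional prior probability per location (uniform if omitted).
    error_rate: hypothetical per-base sequencing error rate.
    """
    n = len(site_methylation)
    if priors is None:
        priors = [1.0 / n] * n
    weights = []
    for prior, levels in zip(priors, site_methylation):
        likelihood = 1.0
        for is_c, m in zip(read_calls, levels):
            # Probability of observing C at this site, allowing for miscalls.
            p_c = m * (1 - error_rate) + (1 - m) * error_rate
            likelihood *= p_c if is_c else 1 - p_c
        weights.append(prior * likelihood)
    total = sum(weights)
    if total == 0:
        # No location explains the read; fall back to the priors.
        return list(priors)
    return [w / total for w in weights]


def assign_multiread(posteriors, min_posterior=0.9):
    """Return the index of the best location, or None if it is not confident enough."""
    best = max(range(len(posteriors)), key=lambda i: posteriors[i])
    return best if posteriors[best] >= min_posterior else None


# A read with two methylated calls; location 0 is highly methylated, location 1 is not.
post = multiread_posteriors([True, True], [[0.9, 0.9], [0.1, 0.1]], error_rate=0.0)
print(assign_multiread(post))  # prints 0
```

Leaving low-confidence reads unassigned (returning `None`) mirrors the abstract's observation that only a fraction of multireads can be placed reliably; the 0.9 cutoff here is arbitrary.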

4 citations


Journal ArticleDOI
TL;DR: Results from this study demonstrate that there is a clear interaction between the polyphenols quercetin and naringenin and the probiotic LGG, and an identifiable pattern of gene expression is revealed.
Abstract: Plant polyphenols quercetin and naringenin are considered healthy dietary compounds; however, little is known of their effect on the probiotic Lactobacillus rhamnosus GG (LGG). In this study it was discovered that both quercetin and naringenin produced temporary inhibition of LGG growth, particularly at 8 hours post inoculation, with LGG eventually recovering from this suppression. The observed growth inhibition was regarded as a phenotypic response of LGG to the polyphenols; we hypothesized that the subsequent recovery was due to unknown, underlying genetic factors. The molecular response of LGG to quercetin and naringenin was determined through RNA analysis using the Helicos single molecule sequencing platform. The expression profiles of LGG grown in the presence of either quercetin or naringenin were divergent from each other, with only a few similarities, indicating that these polyphenols inhibit growth through separate mechanisms. LGG treated with quercetin demonstrated upregulation of genes associated with DNA repair and transcriptional regulation, and a decrease in expression of genes involved in metabolism and protein movement through the cell wall. LGG treated with naringenin showed increased expression of genes associated with metabolism and decreased expression of genes involved in stress response. Results from this study demonstrate that there is a clear interaction between the polyphenols quercetin and naringenin and the probiotic LGG. The RNA expression analysis provides unique insight into the molecular response of LGG to quercetin and naringenin, revealing an identifiable pattern of gene expression.

3 citations


Proceedings ArticleDOI
Mohammad Shabbir Hasan, Xiaowei Wu, Layne T. Watson, Zhiyi Li, Liqing Zhang
01 Oct 2016
TL;DR: UPS-indel is described, a utility tool that creates a universal positioning system for indels so that equivalent indels can be identified easily by a simple comparison of their coordinates generated by the proposed positioning system.
Abstract: An indel, the insertion or deletion of base pairs in the sequence of an organism, is a very common form of genetic variation in the human genome. Because indels contribute to genetic diversity and human disease, they are considered an important area of genome research. With progress in Next Generation Sequencing (NGS), many indel calling tools have been developed, and different databases store the results of these tools for future research. Different indels, though differing in allele sequence and position, can be biologically equivalent when they lead to the same altered sequences. Storing these biologically equivalent indels as distinct entries in databases causes data redundancy. Previous research showed that about 10% of human indels stored in dbSNP are redundant due to the lack of a unified system for identifying and representing equivalent indels. In this paper we describe UPS-indel, a utility tool that creates a universal positioning system for indels so that equivalent indels can be identified easily by a simple comparison of their coordinates generated by the proposed positioning system. Applying UPS-indel, we identify nearly 15% redundant indels in dbSNP (version 142) across all human chromosomes, higher than the previous report. UPS-indel is written in C++ and is freely available at http://bench.cs.vt.edu/ups-indel.
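
The notion of equivalence used in this abstract (two indels are equivalent when they produce the same altered sequence) can be shown with a small sketch. This is only the brute-force definition, not the UPS-indel positioning system, whose coordinate scheme is not described here. The function names, the 0-based `pos` convention, and the `(pos, kind, seq)` tuple layout are assumptions for illustration.

```python
def apply_indel(ref, pos, kind, seq):
    """Return the sequence obtained by applying one indel to ref.

    pos is 0-based. An insertion places seq before ref[pos];
    a deletion removes seq starting at ref[pos].
    """
    if kind == "ins":
        return ref[:pos] + seq + ref[pos:]
    if kind == "del":
        if ref[pos:pos + len(seq)] != seq:
            raise ValueError("deleted allele does not match the reference")
        return ref[:pos] + ref[pos + len(seq):]
    raise ValueError(f"unknown indel kind: {kind!r}")


def are_equivalent(ref, indel_a, indel_b):
    """Two indels are equivalent if they yield the same altered sequence."""
    return apply_indel(ref, *indel_a) == apply_indel(ref, *indel_b)


ref = "ACGTGTGA"
# Deleting "GT" at position 2 and deleting "TG" at position 3 both give "ACGTGA".
print(are_equivalent(ref, (2, "del", "GT"), (3, "del", "TG")))  # prints True
print(are_equivalent(ref, (2, "del", "GT"), (7, "del", "A")))   # prints False
```

Comparing full altered sequences is impractical at chromosome scale, which is why a coordinate-based representation such as the one the abstract describes is attractive: equivalence becomes a comparison of coordinates rather than of sequences.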

1 citation


Journal ArticleDOI
TL;DR: SPAI (Single Platform for Analyzing Indels) is an interactive Graphical User Interface (GUI) tool that lets users run several popular indel calling tools and analyze the indel calling results without any command-line programming.
Abstract: Insertions and deletions (indels) are the most common form of structural variation in the human genome. Indels not only contribute to genetic diversity but also cause diseases. Therefore, assessing indels in the human genome has become a topic of interest to the research community. This increasing interest in indel calling research has resulted in the development of many indel calling tools. However, all of these tools are command-line based and require Computer Science (CS) expertise to run, which makes them challenging for researchers from non-CS backgrounds. In this paper, we describe an interactive platform named SPAI, which stands for Single Platform for Analyzing Indels. As a Graphical User Interface (GUI) tool, SPAI lets users run several popular indel calling tools and perform several analyses on the indel calling results without knowing any command-line programming. SPAI is written in Java and has been tested on the Linux operating system.

Proceedings ArticleDOI
01 Oct 2016
TL;DR: An automated tool is developed that systematically identifies and consolidates duplicated code blocks and is applied to improve the quality of several commonly used C++ libraries, including SeqAn, BEDtools, and NCBI C++ Toolkit.
Abstract: As computing is an enabling tool of bioinformatics, software quality can influence not only the efficiency of the research process, but also the degree of confidence in scientific findings. As we discovered, popular bioinformatics C++ libraries suffer from problems that make their code hard to maintain, fine-tune, and extend. In particular, code duplication caused by the ubiquitous copy-and-paste development practice substantially complicates software maintenance and evolution. The presence of multiple clones of the same code snippet multiplies the amount of effort required to modify or extend it. In this paper, we present the results of a systematic study we have conducted to understand the code quality of popular bioinformatics libraries. Based on the results of our study, we developed an automated tool that systematically identifies and consolidates duplicated code blocks. Here we describe our tool, ReBio, and the results of applying it to improve the quality of several commonly used C++ libraries, including SeqAn, BEDtools, and NCBI C++ Toolkit. Our results reveal that these libraries indeed suffer from poor maintainability, and that our automated tool can effectively improve their quality.
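
To make the idea of identifying duplicated code blocks concrete, here is a minimal sketch of one common approach: sliding a fixed-size window over whitespace-normalized lines and grouping identical windows. The abstract does not describe how ReBio detects clones, so this is an assumed stand-in technique, and the function name and `window` parameter are hypothetical.

```python
from collections import defaultdict


def find_duplicate_blocks(source, window=3):
    """Map each repeated block of `window` lines to its starting line numbers.

    Lines are compared after collapsing whitespace, and blank lines are ignored,
    so copies that differ only in spacing are still grouped together.
    """
    lines = [(number, " ".join(text.split()))
             for number, text in enumerate(source.splitlines(), start=1)]
    lines = [(number, text) for number, text in lines if text]

    starts_by_block = defaultdict(list)
    for i in range(len(lines) - window + 1):
        chunk = lines[i:i + window]
        block = "\n".join(text for _, text in chunk)
        starts_by_block[block].append(chunk[0][0])

    # Keep only blocks that occur more than once.
    return {block: starts for block, starts in starts_by_block.items()
            if len(starts) > 1}


code = """int a = 1;
int b = 2;
int c = a + b;
foo();
int a = 1;
int  b = 2;
int c = a + b;
"""
for block, starts in find_duplicate_blocks(code).items():
    print(starts)  # prints [1, 5]
```

A textual window like this only catches near-verbatim copies; clones with renamed identifiers would need token-level or syntax-tree comparison, and consolidating the clones into shared functions is a separate refactoring step.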