Author

Haizhou Wang

Bio: Haizhou Wang is an academic researcher from New Mexico State University. The author has contributed to research in topics: Inference & Heuristic (computer science). The author has an h-index of 3, co-authored 4 publications receiving 1691 citations.

Papers
01 Jul 2012
TL;DR: A comprehensive blind assessment of over 30 network inference methods on Escherichia coli, Staphylococcus aureus, Saccharomyces cerevisiae and in silico microarray data defines the performance, data requirements and inherent biases of different inference approaches, and provides guidelines for algorithm application and development.
Abstract: Reconstructing gene regulatory networks from high-throughput data is a long-standing challenge. Through the Dialogue on Reverse Engineering Assessment and Methods (DREAM) project, we performed a comprehensive blind assessment of over 30 network inference methods on Escherichia coli, Staphylococcus aureus, Saccharomyces cerevisiae and in silico microarray data. We characterize the performance, data requirements and inherent biases of different inference approaches, and we provide guidelines for algorithm application and development. We observed that no single inference method performs optimally across all data sets. In contrast, integration of predictions from multiple inference methods shows robust and high performance across diverse data sets. We thereby constructed high-confidence networks for E. coli and S. aureus, each comprising ∼1,700 transcriptional interactions at a precision of ∼50%. We experimentally tested 53 previously unobserved regulatory interactions in E. coli, of which 23 (43%) were supported. Our results establish community-based methods as a powerful and robust tool for the inference of transcriptional gene regulatory networks.

1,355 citations

Journal ArticleDOI
TL;DR: In this paper, a dynamic programming algorithm for optimal one-dimensional clustering is proposed, which is implemented as an R package called Ckmeans.1d.dp.
Abstract: The heuristic k-means algorithm, widely used for cluster analysis, does not guarantee optimality. We developed a dynamic programming algorithm for optimal one-dimensional clustering. The algorithm is implemented as an R package called Ckmeans.1d.dp. We demonstrate its advantage in optimality and runtime over the standard iterative k-means algorithm.

328 citations
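
The idea behind optimal one-dimensional clustering by dynamic programming can be illustrated with a minimal sketch. This is not the Ckmeans.1d.dp implementation; it is a plain quadratic-time recurrence written for illustration, and the function name `optimal_1d_kmeans` and its return format are assumptions. It relies on the fact that, once the values are sorted, each cluster in an optimal solution is a contiguous run, so the best cost of splitting the first j values into c clusters can be built from the best cost of splitting a shorter prefix into c - 1 clusters.

```python
from typing import List, Sequence, Tuple


def optimal_1d_kmeans(values: Sequence[float], k: int) -> Tuple[float, List[List[float]]]:
    """Return (minimum within-cluster sum of squares, clusters) for 1-D data.

    Illustrative sketch only: an O(k * n^2) dynamic program, not the
    algorithm as implemented in the Ckmeans.1d.dp package.
    """
    xs = sorted(values)
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= number of values")

    # Prefix sums of x and x^2 give the cost of any contiguous run in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, x in enumerate(xs):
        s1[i + 1] = s1[i] + x
        s2[i + 1] = s2[i] + x * x

    def sse(i: int, j: int) -> float:
        """Sum of squared deviations from the mean for xs[i:j], with j > i."""
        total = s1[j] - s1[i]
        return max(0.0, (s2[j] - s2[i]) - total * total / (j - i))

    inf = float("inf")
    # cost[c][j]: best cost of splitting the first j sorted values into c clusters.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    # split[c][j]: start index of the last cluster in that best split.
    split = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0

    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                candidate = cost[c - 1][i] + sse(i, j)
                if candidate < cost[c][j]:
                    cost[c][j] = candidate
                    split[c][j] = i

    # Walk the split table backwards to recover the clusters.
    clusters: List[List[float]] = []
    j = n
    for c in range(k, 0, -1):
        i = split[c][j]
        clusters.append(xs[i:j])
        j = i
    clusters.reverse()
    return cost[k][n], clusters
```

Unlike the iterative heuristic, this recurrence examines every contiguous split of the sorted values, so the result does not depend on an initial choice of cluster centres.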

Journal ArticleDOI
TL;DR: The HPN-DREAM network inference challenge, which focused on learning causal influences in signaling networks, used phosphoprotein data from cancer cell lines as well as in silico data from a nonlinear dynamical model to score networks.
Abstract: It remains unclear whether causal, rather than merely correlational, relationships in molecular networks can be inferred in complex biological settings. Here we describe the HPN-DREAM network inference challenge, which focused on learning causal influences in signaling networks. We used phosphoprotein data from cancer cell lines as well as in silico data from a nonlinear dynamical model. Using the phosphoprotein data, we scored more than 2,000 networks submitted by challenge participants. The networks spanned 32 biological contexts and were scored in terms of causal validity with respect to unseen interventional data. A number of approaches were effective, and incorporating known biology was generally advantageous. Additional sub-challenges considered time-course prediction and visualization. Our results suggest that learning causal relationships may be feasible in complex settings such as disease states. Furthermore, our scoring approach provides a practical way to empirically assess inferred molecular networks in a causal sense.

231 citations

Journal ArticleDOI
TL;DR: The CGLN method offers constrained network inference without requiring prior probabilities and thus can promote novel interactions, consistent with the discovery process of scientific facts that are not yet in common beliefs.
Abstract: Integrating prior molecular network knowledge into interpretation of new experimental data is routine practice in biology research. However, a dilemma for deciphering interactome using Bayes’ rule is the demotion of novel interactions with low prior probabilities. Here the authors present constrained generalised logical network (CGLN) inference to predict novel interactions in dynamic networks, respecting previously known interactions and observed temporal coherence. It encodes prior interactions as probabilistic logic rules called local constraints, and forms global constraints using observed dynamic patterns. CGLN finds constraint-satisfying trajectories by solving a k-stops problem in the state space of dynamic networks and then reconstructs candidate networks. They benchmarked CGLN on randomly generated networks, and CGLN outperformed its alternatives when 50% or more interactions in a network are given as local constraints. CGLN is then applied to infer dynamic protein interaction networks regulating invadopodium formation in motile cancer cells. CGLN predicted 134 novel protein interactions for their involvement in invadopodium formation. The most frequently predicted interactions centre around focal adhesion kinase and tyrosine kinase substrate TKS4, and 14 interactions are supported by the literature in molecular contexts related to invadopodium formation. As an alternative to the Bayesian paradigm, the CGLN method offers constrained network inference without requiring prior probabilities and thus can promote novel interactions, consistent with the discovery process of scientific facts that are not yet in common beliefs.

1 citation


Cited by
Journal ArticleDOI
TL;DR: On a compendium of single-cell data from tumors and brain, it is demonstrated that cis-regulatory analysis can be exploited to guide the identification of transcription factors and cell states.
Abstract: We present SCENIC, a computational method for simultaneous gene regulatory network reconstruction and cell-state identification from single-cell RNA-seq data (http://scenic.aertslab.org). On a compendium of single-cell data from tumors and brain, we demonstrate that cis-regulatory analysis can be exploited to guide the identification of transcription factors and cell states. SCENIC provides critical biological insights into the mechanisms driving cellular heterogeneity.

2,277 citations

Journal ArticleDOI
20 Feb 2014-Immunity
TL;DR: By integrating murine data from the ImmGen project, this work proposes a refined, activation-independent core signature for human and murine macrophages that serves as a framework for future research into regulation of macrophage activation in health and disease.

1,648 citations

Posted ContentDOI
31 May 2017-bioRxiv
TL;DR: SCENIC (Single Cell rEgulatory Network Inference and Clustering) is the first method to analyze scRNA-seq data using a network-centric, rather than cell-centric approach and allows for the simultaneous tracing of genomic regulatory programs and the mapping of cellular identities emerging from these programs.
Abstract: Single-cell RNA-seq allows building cell atlases of any given tissue and infer the dynamics of cellular state transitions during developmental or disease trajectories. Both the maintenance and transitions of cell states are encoded by regulatory programs in the genome sequence. However, this regulatory code has not yet been exploited to guide the identification of cellular states from single-cell RNA-seq data. Here we describe a computational resource, called SCENIC (Single Cell rEgulatory Network Inference and Clustering), for the simultaneous reconstruction of gene regulatory networks (GRNs) and the identification of stable cell states, using single-cell RNA-seq data. SCENIC outperforms existing approaches at the level of cell clustering and transcription factor identification. Importantly, we show that cell state identification based on GRNs is robust towards batch-effects and technical-biases. We applied SCENIC to a compendium of single-cell data from the mouse and human brain and demonstrate that the proper combinations of transcription factors, target genes, enhancers, and cell types can be identified. Moreover, we used SCENIC to map the cell state landscape in melanoma and identified a gene regulatory network underlying a proliferative melanoma state driven by MITF and STAT and a contrasting network controlling an invasive state governed by NFATC2 and NFIB. We further validated these predictions by showing that two transcription factors are predominantly expressed in early metastatic sentinel lymph nodes. In summary, SCENIC is the first method to analyze scRNA-seq data using a network-centric, rather than cell-centric approach. SCENIC is generic, easy to use, and flexible, and allows for the simultaneous tracing of genomic regulatory programs and the mapping of cellular identities emerging from these programs. Availability: SCENIC is available as an R workflow based on three new R/Bioconductor packages: GENIE3, RcisTarget and AUCell. 
As a scalable alternative to GENIE3, we also provide GRNboost, paving the way towards network analysis across millions of single cells.

1,101 citations

Journal ArticleDOI
12 Oct 2012-Cell
TL;DR: It is found that cooperatively bound BATF and IRF4 contribute to initial chromatin accessibility and, with STAT3, initiate a transcriptional program that is then globally tuned by the lineage-specifying TF RORγt, which plays a focal deterministic role at key loci.

1,021 citations

Journal ArticleDOI
TL;DR: It is shown how network techniques can help in the identification of single-target, edgetic, multi-target and allo-network drug target candidates and an optimized protocol of network-aided drug development is suggested, and a list of systems-level hallmarks of drug quality is provided.

806 citations