Showing papers in "BMC Systems Biology in 2013"


Journal ArticleDOI
TL;DR: COBRA for Python (COBRApy), a Python package that supports basic COBRA methods, is described; its object-oriented design facilitates the representation of the complex biological processes of metabolism and gene expression.
Abstract: COnstraint-Based Reconstruction and Analysis (COBRA) methods are widely used for genome-scale modeling of metabolic networks in both prokaryotes and eukaryotes. Due to the successes with metabolism, there is an increasing effort to apply COBRA methods to reconstruct and analyze integrated models of cellular processes. The COBRA Toolbox for MATLAB is a leading software package for genome-scale analysis of metabolism; however, it was not designed to elegantly capture the complexity inherent in integrated biological networks and lacks an integration framework for the multiomics data used in systems biology. The openCOBRA Project is a community effort to promote constraints-based research through the distribution of freely available software. Here, we describe COBRA for Python (COBRApy), a Python package that provides support for basic COBRA methods. COBRApy is designed in an object-oriented fashion that facilitates the representation of the complex biological processes of metabolism and gene expression. COBRApy does not require MATLAB to function; however, it includes an interface to the COBRA Toolbox for MATLAB to facilitate use of legacy codes. For improved performance, COBRApy includes parallel processing support for computationally intensive processes. COBRApy is an object-oriented framework designed to meet the computational challenges associated with the next generation of stoichiometric constraint-based models and high-density omics data sets. http://opencobra.sourceforge.net/

881 citations


Journal ArticleDOI
TL;DR: A computational framework to identify miRNA-disease associations is developed, and a bipartite miRNA-disease network is constructed for systematically analyzing the global properties of miRNA regulation of disease genes.
Abstract: Background: MicroRNAs (miRNAs) are important post-transcriptional regulators that have been demonstrated to play an important role in human diseases. Elucidating the associations between miRNAs and diseases at the systematic level will deepen our understanding of the molecular mechanisms of diseases. However, the miRNA-disease associations identified by previous computational methods are far from complete, and more effort is needed.

202 citations


Journal ArticleDOI
TL;DR: With SignaLink 2 as a single resource, users can effectively analyze signaling pathways, scaffold proteins, modifier enzymes, transcription factors and miRNAs that are important in the regulation of signaling processes.
Abstract: Signaling networks in eukaryotes are made up of upstream and downstream subnetworks. The upstream subnetwork contains the intertwined network of signaling pathways, while the downstream regulatory part contains transcription factors and their binding sites on the DNA as well as microRNAs and their mRNA targets. Currently, most signaling and regulatory databases contain only a subsection of this network, making comprehensive analyses highly time-consuming and dependent on specific data handling expertise. The need for detailed mapping of signaling systems is also supported by the fact that several drug development failures were caused by undiscovered cross-talk or regulatory effects of drug targets. We previously created a uniformly curated signaling pathway resource, SignaLink, to facilitate the analysis of pathway cross-talks. Here, we present SignaLink 2, which significantly extends the coverage and applications of its predecessor. We developed a novel concept to integrate and utilize different subsections (i.e., layers) of the signaling network. The multi-layered (onion-like) database structure is made up of signaling pathways, their pathway regulators (e.g., scaffold and endocytotic proteins) and modifier enzymes (e.g., phosphatases, ubiquitin ligases), as well as transcriptional and post-transcriptional regulators of all of these components. The user-friendly website allows the interactive exploration of how each signaling protein is regulated. The customizable download page enables the analysis of any user-specified part of the signaling network. Compared to other signaling resources, distinctive features of SignaLink 2 are the following: 1) it involves experimental data not only from humans but also from two invertebrate model organisms, C. elegans and D. melanogaster; 2) it combines manual curation with large-scale datasets; 3) it provides confidence scores for each interaction; 4) it operates a customizable download page with multiple file formats (e.g., BioPAX, Cytoscape, SBML). Non-profit users can access SignaLink 2 free of charge at http://SignaLink.org . With SignaLink 2 as a single resource, users can effectively analyze signaling pathways, scaffold proteins, modifier enzymes, transcription factors and miRNAs that are important in the regulation of signaling processes. This integrated resource allows the systems-level examination of how cross-talks and signaling flow are regulated, and provides data for cross-species comparisons and drug discovery analyses.

184 citations


Journal ArticleDOI
TL;DR: The Path2Models project has automatically generated mathematical models from pathway representations using a suite of freely available software, resulting in more than 140 000 freely available models.
Abstract: Background: Systems biology projects and omics technologies have led to a growing number of biochemical pathway models and reconstructions. However, the majority of these models are still created de novo, based on literature mining and the manual processing of pathway data. Results: To increase the efficiency of model creation, the Path2Models project has automatically generated mathematical models from pathway representations using a suite of freely available software. Data sources include KEGG, BioCarta, MetaCyc and SABIO-RK. Depending on the source data, three types of models are provided: kinetic, logical and constraint-based. Models from over 2 600 organisms are encoded consistently in SBML, and are made freely available through BioModels Database at http://www.ebi.ac.uk/biomodels-main/path2models. Each model contains the list of participants, their interactions, the relevant mathematical constructs, and initial parameter values. Most models are also available as easy-to-understand graphical SBGN maps.

161 citations


Journal ArticleDOI
TL;DR: 3Omics incorporates the advantages and functionality of existing software into a single platform, thereby simplifying data analysis and enabling the user to perform a one-click integrated analysis.
Abstract: Background: Integrative and comparative analyses of multiple transcriptomics, proteomics and metabolomics datasets require an intensive knowledge of tools and background concepts. Thus, it is challenging for users to perform such analyses, highlighting the need for a single tool for such purposes. The 3Omics one-click web tool was developed to visualize and rapidly integrate multiple human inter- or intra-transcriptomic, proteomic, and metabolomic data by combining five commonly used analyses: correlation networking, coexpression, phenotyping, pathway enrichment, and GO (Gene Ontology) enrichment. Results: 3Omics generates inter-omic correlation networks to visualize relationships in data with respect to time or experimental conditions for all transcripts, proteins and metabolites. If only two of three omics datasets are input, then 3Omics supplements the missing transcript, protein or metabolite information related to the input data by text-mining the PubMed database. 3Omics’ coexpression analysis assists in revealing functions shared among different omics datasets. 3Omics’ phenotype analysis integrates Online Mendelian Inheritance in Man with available transcript or protein data. Pathway enrichment analysis on metabolomics data by 3Omics reveals enriched pathways in the KEGG/HumanCyc database. 3Omics performs statistical Gene Ontology-based functional enrichment analyses to display significantly overrepresented GO terms in transcriptomic experiments. Although the principal application of 3Omics is the integration of multiple omics datasets, it is also capable of analyzing individual omics datasets. The information obtained from the 3Omics analyses in Case Studies 1 and 2 is also in accordance with comprehensive findings in the literature. Conclusions: 3Omics incorporates the advantages and functionality of existing software into a single platform, thereby simplifying data analysis and enabling the user to perform a one-click integrated analysis.
Visualization and analysis results are downloadable for further user customization and analysis. The 3Omics software can be freely accessed at http://3omics.cmdm.tw.

155 citations


Journal ArticleDOI
TL;DR: The Systems Biology Markup Language (SBML) Qualitative Models Package (qual) is an extension of the SBML Level 3 standard designed for computer representation of qualitative models of biological networks.
Abstract: Background: Qualitative frameworks, especially those based on the logical discrete formalism, are increasingly used to model regulatory and signalling networks. A major advantage of these frameworks is that they do not require precise quantitative data, and that they are well-suited for studies of large networks. While numerous groups have developed specific computational tools that provide original methods to analyse qualitative models, a standard format to exchange qualitative models has been missing. Results: We present the Systems Biology Markup Language (SBML) Qualitative Models Package (“qual”), an extension of the SBML Level 3 standard designed for computer representation of qualitative models of biological networks. We demonstrate the interoperability of models via SBML qual through the analysis of a specific signalling network by three independent software tools. Furthermore, the collective effort to define the SBML qual format paved the way for the development of LogicalModel, an open-source model library, which will facilitate the adoption of the format as well as the collaborative development of algorithms to analyse qualitative models. Conclusions: SBML qual allows the exchange of qualitative models among a number of complementary software tools. SBML qual has the potential to promote collaborative work on the development of novel computational approaches, as well as on the specification and the analysis of comprehensive qualitative models of regulatory and signalling networks.

129 citations
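
As a rough illustration of the logical discrete formalism that the SBML qual abstract above refers to, the sketch below updates a toy three-node Boolean network synchronously. The node names and rules are hypothetical, chosen only for illustration; this is neither SBML qual syntax nor the API of any of the tools mentioned.

```python
from typing import Callable, Dict, List

State = Dict[str, bool]

# Hypothetical toy rules: each node's next value is a logical
# function of the current state of the whole network.
RULES: Dict[str, Callable[[State], bool]] = {
    "signal": lambda s: s["signal"],                         # input, held constant
    "kinase": lambda s: s["signal"] and not s["inhibitor"],  # activated unless inhibited
    "inhibitor": lambda s: s["kinase"],                      # negative feedback on kinase
}


def step(state: State) -> State:
    """Synchronous update: every node reads the same current state."""
    return {node: rule(state) for node, rule in RULES.items()}


def trajectory(state: State, n_steps: int) -> List[State]:
    """Return the initial state followed by n_steps successive updates."""
    states = [state]
    for _ in range(n_steps):
        states.append(step(states[-1]))
    return states


start = {"signal": True, "kinase": False, "inhibitor": False}
for t, s in enumerate(trajectory(start, 4)):
    print(t, s)
```

With the signal held on, this toy network never settles into a fixed point; the negative feedback makes it cycle, which is the kind of qualitative behaviour such models are used to study without any kinetic parameters.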


Journal ArticleDOI
TL;DR: Sybil is an open source software library for constraint-based analyses in R. It provides efficient methods for flux-balance analysis (FBA), MOMA, and ROOM that are about ten times faster than previous implementations when calculating the effect of whole-genome single gene deletions in silico on a complete E. coli metabolic model.
Abstract: Constraint-based analyses of metabolic networks are widely used to simulate the properties of genome-scale metabolic networks. Publicly available implementations tend to be slow, impeding large scale analyses such as the genome-wide computation of pairwise gene knock-outs, or the automated search for model improvements. Furthermore, available implementations cannot easily be extended or adapted by users. Here, we present sybil, an open source software library for constraint-based analyses in R; R is a free, platform-independent environment for statistical computing and graphics that is widely used in bioinformatics. Among other functions, sybil currently provides efficient methods for flux-balance analysis (FBA), MOMA, and ROOM that are about ten times faster than previous implementations when calculating the effect of whole-genome single gene deletions in silico on a complete E. coli metabolic model. Due to the object-oriented architecture of sybil, users can easily build analysis pipelines in R or even implement their own constraint-based algorithms. Based on its highly efficient communication with different mathematical optimisation programs, sybil facilitates the exploration of high-dimensional optimisation problems on small time scales. Sybil and all its dependencies are open source. Sybil and its documentation are available for download from the comprehensive R archive network (CRAN).

117 citations


Journal ArticleDOI
TL;DR: FluMap is a comprehensive pathway map that can serve as a graphically presented knowledge base and as a platform for analyzing functional interactions between IAV and host factors; computational network analyses using FluMap are demonstrated to identify targets.
Abstract: Background: Influenza is a common infectious disease caused by influenza viruses. Annual epidemics cause severe illnesses, deaths, and economic loss around the world. To better defend against influenza viral infection, it is essential to understand its mechanisms and associated host responses. Many studies have been conducted to elucidate these mechanisms; however, the overall picture remains incompletely understood. A systematic understanding of influenza viral infection in host cells is needed to facilitate the identification of influential host response mechanisms and potential drug targets.

117 citations


Journal ArticleDOI
TL;DR: iTO977 is a comprehensive genome-scale metabolic model that contains more reactions, metabolites and genes than previous models; it was used to simulate yeast metabolism under four different growth conditions, and experimental data from these conditions were integrated into the model.
Abstract: Background: The genome-scale metabolic model of Saccharomyces cerevisiae, first presented in 2003, was the first genome-scale network reconstruction for a eukaryotic organism. Since then continuous efforts have been made in order to improve and expand the yeast metabolic network. Results: Here we present iTO977, a comprehensive genome-scale metabolic model that contains more reactions, metabolites and genes than previous models. The model was constructed based on two earlier reconstructions, namely iIN800 and the consensus network, and then improved and expanded using gap-filling methods and by introducing new reactions and pathways based on studies of the literature and databases. The model was shown to perform well both for growth simulations in different media and gene essentiality analysis for single and double knock-outs. Further, the model was used as a scaffold for integrating transcriptomics, and flux data from four different conditions in order to identify transcriptionally controlled reactions, i.e. reactions that change both in flux and transcription between the compared conditions. Conclusion: We present a new yeast model that represents a comprehensive up-to-date collection of knowledge on yeast metabolism. The model was used for simulating the yeast metabolism under four different growth conditions and experimental data from these four conditions was integrated to the model. The model together with experimental data is a useful tool to identify condition-dependent changes of metabolism between different environmental conditions.

116 citations


Journal ArticleDOI
TL;DR: This approach takes a more holistic view by also considering drug-disease relationships, and it uses not only genes but also other features to build the disease-drug networks; it could complement current computational approaches for drug repositioning candidate discovery.
Abstract: Given the costly and time-consuming process and high attrition rates in drug discovery and development, drug repositioning or drug repurposing is considered a viable strategy both to replenish the drying drug pipelines and to surmount the innovation gap. Although there is a growing recognition that mechanistic relationships from the molecular to the systems level should be integrated into drug discovery paradigms, relatively few studies have integrated information about heterogeneous networks into computational drug-repositioning candidate discovery platforms. Using known disease-gene and drug-target relationships from the KEGG database, we built a weighted disease and drug heterogeneous network. The nodes represent drugs or diseases, while the edges represent a shared gene, biological process, pathway, phenotype or a combination of these features. We clustered this weighted network to identify modules and then assembled all possible drug-disease pairs (putative drug repositioning candidates) from these modules. We validated our predictions by testing their robustness and evaluated them by their overlap with drug indications that were either reported in published literature or investigated in clinical trials. Previous computational approaches for drug repositioning focused on either drug-drug or disease-disease similarity, whereas we have taken a more holistic approach by also considering drug-disease relationships. Further, we considered not only genes but also other features to build the disease-drug networks. Despite the relative simplicity of our approach, based on the robustness analyses and the overlap of some of our predictions with drug indications that are under investigation, we believe our approach could complement the current computational approaches for drug repositioning candidate discovery.

115 citations
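
The network construction described in the drug repositioning abstract above (nodes are drugs or diseases, edges reflect shared features) can be sketched roughly as follows. The abstract does not say how edges are weighted, so the Jaccard index used here is an assumption, and all node and gene names are made up; the clustering and validation steps are not shown.

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Fraction of shared features; an assumed weighting, not the paper's."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_network(features: Dict[str, Set[str]],
                  min_weight: float = 0.0) -> Dict[Tuple[str, str], float]:
    """Connect every pair of nodes whose feature sets overlap."""
    edges = {}
    for u, v in combinations(sorted(features), 2):
        weight = jaccard(features[u], features[v])
        if weight > min_weight:
            edges[(u, v)] = weight
    return edges


# Hypothetical toy input: each node maps to its associated genes.
features = {
    "drug:D1": {"GENE_A", "GENE_B"},
    "disease:X": {"GENE_B", "GENE_C"},
    "disease:Y": {"GENE_D"},
}
network = build_network(features)
print(network)
```

In this toy input only the pair (disease:X, drug:D1) shares a gene, so it is the only edge; in a real setting the same scheme could be extended to pathways, biological processes or phenotypes by adding them to each node's feature set.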


Journal ArticleDOI
TL;DR: The integrated model of the circadian clock circuit and ABA-regulated environmental sensing allowed us to explain multiple experimental observations on the timing and stomatal responses to genetic and environmental perturbations, and crystallise a new role of TOC1 as an environmental sensor.
Abstract: Background: 24-hour biological clocks are intimately connected to the cellular signalling network, which complicates the analysis of clock mechanisms. The transcriptional regulator TOC1 (TIMING OF CAB EXPRESSION 1) is a founding component of the gene circuit in the plant circadian clock. Recent results show that TOC1 suppresses transcription of multiple target genes within the clock circuit, far beyond its previously-described regulation of the morning transcription factors LHY (LATE ELONGATED HYPOCOTYL) and CCA1 (CIRCADIAN CLOCK ASSOCIATED 1). It is unclear how this pervasive effect of TOC1 affects the dynamics of the clock and its outputs. TOC1 also appears to function in a nested feedback loop that includes signalling by the plant hormone Abscisic Acid (ABA), which is upregulated by abiotic stresses, such as drought. ABA treatments both alter TOC1 levels and affect the clock’s timing behaviour. Conversely, the clock rhythmically modulates physiological processes induced by ABA, such as the closing of stomata in the leaf epidermis. In order to understand the dynamics of the clock and its outputs under changing environmental conditions, the reciprocal interactions between the clock and other signalling pathways must be integrated. Results: We extended the mathematical model of the plant clock gene circuit by incorporating the repression of multiple clock genes by TOC1, observed experimentally. The revised model more accurately matches the data on the clock’s molecular profiles and timing behaviour, explaining the clock’s responses in TOC1 over-expression and toc1 mutant plants. A simplified representation of ABA signalling allowed us to investigate the interactions of ABA and circadian pathways. Increased ABA levels lengthen the free-running period of the clock, consistent with the experimental data. 
Adding stomatal closure to the model, as a key ABA- and clock-regulated downstream process, allowed us to describe TOC1 effects on the rhythmic gating of stomatal closure. Conclusions: The integrated model of the circadian clock circuit and ABA-regulated environmental sensing allowed us to explain multiple experimental observations on the timing and stomatal responses to genetic and environmental perturbations. These results crystallise a new role of TOC1 as an environmental sensor, which both affects the pace of the central oscillator and modulates the kinetics of downstream processes.

Journal ArticleDOI
TL;DR: A move from a descriptive approach to a predictive one: rather than correlating biological network topology to generic properties such as robustness, it is used to predict specific functions or phenotypes, which points to new avenues of research.
Abstract: Molecular interactions are often represented as network models which have become the common language of many areas of biology. Graphs serve as convenient mathematical representations of network models and have themselves become objects of study. Their topology has been intensively researched over the last decade after evidence was found that they share underlying design principles with many other types of networks. Initial studies suggested that molecular interaction network topology is related to biological function and evolution. However, further whole-network analyses did not lead to a unified view on what this relation may look like, with conclusions highly dependent on the type of molecular interactions considered and the metrics used to study them. It is unclear whether global network topology drives function, as suggested by some researchers, or whether it is simply a byproduct of evolution or even an artefact of representing complex molecular interaction networks as graphs. Nevertheless, network biology has progressed significantly over the last years. We review the literature, focusing on two major developments. First, realizing that molecular interaction networks can be naturally decomposed into subsystems (such as modules and pathways), topology is increasingly studied locally rather than globally. Second, there is a move from a descriptive approach to a predictive one: rather than correlating biological network topology to generic properties such as robustness, it is used to predict specific functions or phenotypes. Taken together, this change in focus from globally descriptive to locally predictive points to new avenues of research. In particular, multi-scale approaches are developments promising to drive the study of molecular interaction networks further.

Journal ArticleDOI
TL;DR: The present approach contributes to the understanding of lag phase, the least studied of bacterial growth phases, by developing an assay based on imaging flow cytometry of fluorescent reporter cells that overcomes the challenges inherent in studying lag phase.
Abstract: Background: Lag phase is a period of time with no growth that occurs when stationary phase bacteria are transferred to a fresh medium. Bacteria in lag phase seem inert: their biomass does not increase. The low number of cells and low metabolic activity make it difficult to study this phase. As a consequence, it has not been studied as thoroughly as other bacterial growth phases. However, lag phase has important implications for bacterial infections and food safety. We asked which, if any, genes are expressed in the lag phase of Escherichia coli, and what their dynamic expression pattern is. Results: We developed an assay based on imaging flow cytometry of fluorescent reporter cells that overcomes the challenges inherent in studying lag phase. We distinguish between lag1 phase, in which there is no biomass growth, and lag2 phase, in which there is biomass growth but no cell division. We find that in lag1 phase, most promoters are not active, except for those of the enzymes that utilize the specific carbon source in the medium. These genes show promoter activities that increase exponentially with time, despite the fact that the cells do not measurably increase in size. An oxidative stress promoter, katG, is also active. When cells enter lag2 and begin to grow in size, they switch to a full growth program of promoter activity including ribosomal and metabolic genes. Conclusions: The observed exponential increase in enzymes for the specific carbon source followed by an abrupt switch to production of general growth genes is a solution of an optimal control model, known as bang-bang control. The present approach contributes to the understanding of lag phase, the least studied of bacterial growth phases.

Journal ArticleDOI
TL;DR: Experimental results on model plant Arabidopsis thaliana show that, compared to an existing approach, HPGA reduces the error rate of measuring plant area by half, and raises a hypothesis that knocking out cfq changes the sensitivity of the energy distribution under fluctuating light conditions to repress leaf growth.
Abstract: Taking advantage of the current rapid development in imaging systems and computer vision algorithms, we present HPGA, a high-throughput phenotyping platform for plant growth modeling and functional analysis, which yields a better understanding of energy distribution with regard to the balance between growth and defense. HPGA has two components, PAE (Plant Area Estimation) and GMA (Growth Modeling and Analysis). In PAE, by taking the complex leaf overlap problem into consideration, the area of every plant is measured from top-view images in four steps. Given the abundant measurements obtained with PAE, in the second module, GMA, a nonlinear growth model is applied to generate growth curves, followed by functional data analysis. Experimental results on the model plant Arabidopsis thaliana show that, compared to an existing approach, HPGA reduces the error rate of measuring plant area by half. The application of HPGA to cfq mutant plants under fluctuating light reveals a correlation between low photosynthetic rates and small plant area (compared to wild type), which raises the hypothesis that knocking out cfq changes the sensitivity of the energy distribution under fluctuating light conditions to repress leaf growth. HPGA is available at http://www.msu.edu/~jinchen/HPGA .

Journal ArticleDOI
TL;DR: A STRING-based stress response network model integrating important players for the general and specialized metabolite stress response in C. acetobutylicum is built, informing the molecular basis of Clostridium responses to toxic metabolites in natural ecosystems and the microbiome.
Abstract: Background: Organisms of the genus Clostridium are Gram-positive endospore formers of great importance to the carbon cycle and to human normo- and pathophysiology, as well as in biofuel and biorefinery applications. Exposure of Clostridium organisms to chemical and, in particular, toxic metabolite stress is ubiquitous in both natural (such as the human microbiome) and engineered environments, engaging both the general stress response and specialized programs. Yet, despite its fundamental and applied significance, this stress response remains largely unexplored at the systems level.

Journal ArticleDOI
TL;DR: It is demonstrated that phenotypic noise does differ quantitatively between natural populations, which supports the possibility that, if noise is adaptive, microevolution may tune it in the wild.
Abstract: Background: Most quantitative measures of phenotypic traits represent macroscopic contributions of large numbers of cells. Yet, cells of a tissue do not behave similarly, and molecular studies on several organisms have shown that regulations can be highly stochastic, sometimes generating diversified cellular phenotypes within tissues. Phenotypic noise, defined here as trait variability among isogenic cells of the same type and sharing a common environment, has therefore received a lot of attention. Given the potential fitness advantage provided by phenotypic noise in fluctuating environments, the possibility that it is directly subjected to evolutionary selection is being considered. For selection to act, phenotypic noise must differ between contemporary genotypes. Whether this is the case or not remains, however, unclear because phenotypic noise has very rarely been quantified in natural populations.

Journal ArticleDOI
Anat Bren, Yuval Hart, Erez Dekel, Daniel A. Koster, Uri Alon
TL;DR: The observed sharp stop of growth accompanied by a pulsed expression of assimilation genes allows bacteria to compensate for the drop in nutrients, suggesting a strategy used by the cells to prolong exponential growth under limiting substrate.
Abstract: Bacterial growth as a function of nutrients has been studied for decades, but is still not fully understood. In particular, the growth laws under dynamically changing environments have been difficult to explore, because of the rapidly changing conditions. Here, we address this challenge by means of a robotic assay and measure bacterial growth rate, promoter activity and substrate level at high temporal resolution across the entire growth curve in batch culture. As a model system, we study E. coli growing under nitrogen or carbon limitation, and explore the dynamics in the last generation of growth where nutrient levels can drop rapidly. We find that growth stops abruptly under limiting nitrogen or carbon, but slows gradually when nutrients are not limiting. By measuring growth rate at a 3 min time resolution, and inferring the instantaneous substrate level, s, we find that the reduction in growth rate μ under nutrient limitation follows Monod’s law, $\mu = \mu_{\max}\, s / (K_s + s)$. By following promoter activity of different genes we find that the abrupt stop of growth under nitrogen or carbon limitation is accompanied by a pulse-like up-regulation of the expression of genes in the relevant nutrient assimilation pathways. We further find that the sharp stop of growth is conditional on the presence of regulatory proteins in the assimilation pathway. The observed sharp stop of growth accompanied by a pulsed expression of assimilation genes allows bacteria to compensate for the drop in nutrients, suggesting a strategy used by the cells to prolong exponential growth under limiting substrate.
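
Monod's law, which the abstract above invokes, is conventionally written as μ = μ_max · s / (K_s + s), where μ_max is the maximal growth rate and K_s the substrate level at which growth is half-maximal. A minimal sketch of that relationship follows; the parameter values are illustrative assumptions, not measurements from the paper.

```python
def monod_growth_rate(s: float, mu_max: float, k_s: float) -> float:
    """Growth rate under Monod's law for substrate level s (same units as k_s)."""
    if s < 0:
        raise ValueError("substrate level must be non-negative")
    return mu_max * s / (k_s + s)


# Illustrative (assumed) parameters: mu_max in 1/h, k_s in arbitrary units.
MU_MAX = 1.0
K_S = 0.5

for s in (0.0, 0.5, 5.0, 50.0):
    print(f"s = {s:5.1f}  ->  mu = {monod_growth_rate(s, MU_MAX, K_S):.3f}")
```

The printed values show why growth can stop abruptly at the end of batch culture: the rate stays close to μ_max while s is much larger than K_s and falls steeply only once s approaches K_s.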

Journal ArticleDOI
TL;DR: WGCNA represents an alternative strategy to large scale sequencing for the identification of potential oncogenic drivers, based on a systems view of signaling networks, and identifies spleen tyrosine kinase (SYK) both as a candidate biomarker to stratify SCLC patients and as a potential therapeutic target.
Abstract: Oncogenic mechanisms in small-cell lung cancer remain poorly understood, leaving this tumor with the worst prognosis among all lung cancers. Unlike in other cancer types, genomic sequencing approaches have been of limited success in small-cell lung cancer, i.e., no mutated oncogenes with potential driver characteristics have emerged, as is the case for activating mutations of epidermal growth factor receptor in non-small-cell lung cancer. Differential gene expression analysis has also produced SCLC signatures with limited application, since they are generally not robust across datasets. Nonetheless, additional genomic approaches are warranted, due to the increasing availability of suitable small-cell lung cancer datasets. Gene co-expression network approaches are a recent and promising avenue, since they have been successful in identifying gene modules that drive phenotypic traits in several biological systems, including other cancer types. We derived an SCLC-specific classifier from weighted gene co-expression network analysis (WGCNA) of a lung cancer dataset. The classifier, termed SCLC-specific hub network (SSHN), robustly separates SCLC from other lung cancer types across multiple datasets and multiple platforms, including RNA-seq and shotgun proteomics. The classifier was also conserved in SCLC cell lines. SSHN is enriched for co-expressed signaling network hubs strongly associated with the SCLC phenotype. Twenty of these hubs are actionable kinases with oncogenic potential, among which spleen tyrosine kinase (SYK) exhibits one of the highest overall statistical associations to SCLC. In patient tissue microarrays and cell lines, SCLC can be separated into SYK-positive and -negative. SYK siRNA decreases proliferation rate and increases cell death of SYK-positive SCLC cell lines, suggesting a role for SYK as an oncogenic driver in a subset of SCLC. SCLC treatment has thus far been limited to chemotherapy and radiation.
Our WGCNA analysis identifies SYK both as a candidate biomarker to stratify SCLC patients and as a potential therapeutic target. In summary, WGCNA represents an alternative strategy to large scale sequencing for the identification of potential oncogenic drivers, based on a systems view of signaling networks. This strategy is especially useful in cancer types where no actionable mutations have emerged.
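
As a rough illustration of the co-expression idea behind WGCNA, the sketch below builds an unsigned soft-threshold adjacency, a_ij = |cor(i, j)|^beta, and ranks genes by their summed connectivity, with high-connectivity genes acting as candidate hubs. The gene names, the expression values and the exponent beta = 6 are hypothetical choices made for this example; this is not the authors' pipeline or the SSHN classifier.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def soft_adjacency(expr, beta=6):
    """Unsigned weighted adjacency a_ij = |cor(i, j)| ** beta, zero on the diagonal."""
    genes = list(expr)
    return {
        g: {h: 0.0 if g == h else abs(pearson(expr[g], expr[h])) ** beta
            for h in genes}
        for g in genes
    }

def connectivity(adj):
    """Connectivity of each gene: the sum of its adjacencies to all other genes."""
    return {g: sum(row.values()) for g, row in adj.items()}

# Toy expression profiles (hypothetical genes, four samples each).
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 1.0, 3.0, 2.0],
}
adj = soft_adjacency(expr)
k = connectivity(adj)
ranked = sorted(k, key=k.get, reverse=True)  # most connected genes first
```

Raising the correlation to a power suppresses weak correlations while keeping strong ones, which is why tightly co-expressed genes dominate the connectivity ranking.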

Journal ArticleDOI
TL;DR: The NetGenerator V2.0 algorithm, a heuristic for network inference, is proposed and described, which automatically generates a system of differential equations modelling structure and dynamics of the network based on time-resolved gene expression data.
Abstract: Background: Inference of gene-regulatory networks (GRNs) is important for understanding behaviour and potential treatment of biological systems. Knowledge about GRNs gained from transcriptome analysis can be increased by multiple experiments and/or multiple stimuli. Since GRNs are complex and dynamical, appropriate methods and algorithms are needed for constructing models describing these dynamics. Algorithms based on heuristic approaches reduce the effort in parameter identification and computation time. Results: The NetGenerator V2.0 algorithm, a heuristic for network inference, is proposed and described. It automatically generates a system of differential equations modelling structure and dynamics of the network based on time-resolved gene expression data. In contrast to a previous version, the inference considers multi-stimuli multi-experiment data and contains different methods for integrating prior knowledge. The resulting significant changes in the algorithmic procedures are explained in detail. NetGenerator is applied to relevant benchmark examples evaluating the inference for data from experiments with different stimuli. Also, the underlying GRN of chondrogenic differentiation, a real-world multi-stimulus problem, is inferred and analysed. Conclusions: NetGenerator is able to determine the structure and parameters of GRNs and their dynamics. The new features of the algorithm extend the range of possible experimental set-ups, results and biological interpretations. Based upon benchmarks, the algorithm provides good results in terms of specificity, sensitivity, efficiency and model fit.
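
To make "a system of differential equations modelling structure and dynamics of the network" concrete, here is a minimal sketch that simulates a small network with forward Euler. It assumes, purely for illustration, a linear model dx/dt = W x + B u(t); the abstract does not state the model form, and the matrices `W` and `B`, the stimulus `u` and the function name `simulate` are hypothetical rather than part of NetGenerator.

```python
def simulate(W, B, u, x0, dt=0.01, steps=2000):
    """Forward-Euler integration of dx/dt = W x + B u(t).

    W[i][j]: assumed influence of gene j on gene i (diagonal = self-decay).
    B[i]:    assumed direct effect of the external stimulus on gene i.
    u:       function of time returning the stimulus strength.
    """
    x = list(x0)
    trajectory = [list(x)]
    size = len(x)
    for n in range(steps):
        t = n * dt
        dx = [sum(W[i][j] * x[j] for j in range(size)) + B[i] * u(t)
              for i in range(size)]
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
        trajectory.append(list(x))
    return trajectory

# Hypothetical two-gene cascade: the stimulus activates gene 0,
# gene 0 activates gene 1, and both genes decay at unit rate.
W = [[-1.0, 0.0],
     [1.0, -1.0]]
B = [1.0, 0.0]
step_stimulus = lambda t: 1.0  # constant stimulus switched on at t = 0

trajectory = simulate(W, B, step_stimulus, x0=[0.0, 0.0])
final_state = trajectory[-1]  # both genes approach their steady state of 1.0
```

In an inference setting the entries of `W` and `B` would be the unknowns fitted to time-resolved expression data; here they are fixed by hand only to show the forward simulation.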

Journal ArticleDOI
TL;DR: In this article, the authors identify and quantify the role of cell motility, cell-to-cell adhesion, and cell proliferation in cell colony expansion, and use this information to understand how each mechanism contributes to the expansion process.
Abstract: Background: The expansion of cell colonies is driven by a delicate balance of several mechanisms including cell motility, cell-to-cell adhesion and cell proliferation. New approaches that can be used to independently identify and quantify the role of each mechanism will help us understand how each mechanism contributes to the expansion process. Standard mathematical modelling approaches to describe such cell colony expansion typically neglect cell-to-cell adhesion, despite the fact that cell-to-cell adhesion is thought to play an important role.

Journal ArticleDOI
TL;DR: A framework for integration of subset models, based on a system biology approach, is proposed and demonstrated how this framework can be used to integrate mathematical models of the immune response from several published sources and describe qualitative predictions of global immune system response arising from the integrated, hybrid model.
Abstract: The complexity and multiscale nature of the mammalian immune response provides an excellent test bed for the potential of mathematical modeling and simulation to facilitate mechanistic understanding. Historically, mathematical models of the immune response focused on subsets of the immune system and/or specific aspects of the response. Mathematical models have been developed for the humoral side of the immune response, or for the cellular side, or for cytokine kinetics, but rarely have they been proposed to encompass the overall system complexity. We propose here a framework for integration of subset models, based on a system biology approach. A dynamic simulator, the Fully-integrated Immune Response Model (FIRM), was built in a stepwise fashion by integrating published subset models and adding novel features. The approach used to build the model includes the formulation of the network of interacting species and the subsequent introduction of rate laws to describe each biological process. The resulting model represents a multi-organ structure, comprised of the target organ where the immune response takes place, circulating blood, lymphoid T, and lymphoid B tissue. The cell types accounted for include macrophages, a few T-cell lineages (cytotoxic, regulatory, helper 1, and helper 2), and B-cell activation to plasma cells. Four different cytokines were accounted for: IFN-γ, IL-4, IL-10 and IL-12. In addition, generic inflammatory signals are used to represent the kinetics of IL-1, IL-2, and TGF-β. Cell recruitment, differentiation, replication, apoptosis and migration are described as appropriate for the different cell types. The model is a hybrid structure containing information from several mammalian species. The structure of the network was built to be physiologically and biochemically consistent. 
Rate laws for all the cellular fate processes, growth factor production rates and half-lives, together with antibody production rates and half-lives, are provided. The results demonstrate how this framework can be used to integrate mathematical models of the immune response from several published sources and describe qualitative predictions of global immune system response arising from the integrated, hybrid model. In addition, we show how the model can be expanded to include novel biological findings. Case studies were carried out to simulate TB infection, tumor rejection, response to a blood borne pathogen and the consequences of accounting for regulatory T-cells. The final result of this work is a postulated and increasingly comprehensive representation of the mammalian immune system, based on physiological knowledge and susceptible to further experimental testing and validation. We believe that the integrated nature of FIRM has the potential to simulate a range of responses under a variety of conditions, from modeling of immune responses after tuberculosis (TB) infection to tumor formation in tissues. FIRM also has the flexibility to be expanded to include both complex and novel immunological response features as our knowledge of the immune system advances.

Journal ArticleDOI
TL;DR: In this article, the authors investigated parameter correlations in nonlinear dynamic models and found that a biological model usually contains a large number of correlated parameters leading to non-identifiability problems.
Abstract: Background: One of the challenging tasks in systems biology is parameter estimation in nonlinear dynamic models. A biological model usually contains a large number of correlated parameters leading to non-identifiability problems. Although many approaches have been developed to address both structural and practical non-identifiability problems, very few studies have been made to systematically investigate parameter correlations.

Journal ArticleDOI
TL;DR: Network-based correlation analysis identified conserved metabolites including malate, pyruvate, 2-oxoglutarate, glutamate and fructose-6-phosphate, which may provide a more significant marker of hypoxia in cancer.
Abstract: Background: Metabolomics has become increasingly popular in the study of disease phenotypes and molecular pathophysiology. One branch of metabolomics that encompasses the high-throughput screening of cellular metabolism is metabolic profiling. In the present study, the metabolic profiles of different tumour cells from colorectal carcinoma and breast adenocarcinoma were exposed to hypoxic and normoxic conditions and these have been compared to reveal the potential metabolic effects of hypoxia on the biochemistry of the tumour cells; this may contribute to their survival in oxygen compromised environments. In an attempt to analyse the complex interactions between metabolites beyond routine univariate and multivariate data analysis methods, correlation analysis has been integrated with a human metabolic reconstruction to reveal connections between pathways that are associated with normoxic or hypoxic oxygen environments.

Journal ArticleDOI
TL;DR: NaviCell is one of several recent efforts to provide a single environment for exploring large maps of molecular interactions, organizing discussion of their content, and maintaining them.
Abstract: Background: Molecular biology knowledge can be formalized and systematically represented in a computer-readable form as a comprehensive map of molecular interactions. There exist an increasing number of maps of molecular interactions containing detailed and step-wise description of various cell mechanisms. It is difficult to explore these large maps, to organize discussion of their content and to maintain them. Several efforts were recently made to combine these capabilities together in one environment, and NaviCell is one of them.

Journal ArticleDOI
TL;DR: A precise method is presented for processing and converting KEGG pathways into initial metabolic and signaling models encoded in the standardized community pathway formats SBML and BioPAX; according to the authors, no other approach can appropriately construct metabolic models from KEGG pathways, including correct reactions with stoichiometry.
Abstract: The KEGG PATHWAY database provides a plethora of pathways for a diversity of organisms. All pathway components are directly linked to other KEGG databases, such as KEGG COMPOUND or KEGG REACTION. Therefore, the pathways can be extended with an enormous amount of information and provide a foundation for initial structural modeling approaches. As a drawback, KGML-formatted KEGG pathways are primarily designed for visualization purposes and often omit important details for the sake of a clear arrangement of their entries. Thus, a direct conversion into systems biology models would produce incomplete and erroneous models. Here, we present a precise method for processing and converting KEGG pathways into initial metabolic and signaling models encoded in the standardized community pathway formats SBML (Levels 2 and 3) and BioPAX (Levels 2 and 3). This method involves correcting invalid or incomplete KGML content, creating complete and valid stoichiometric reactions, translating relations to signaling models and augmenting the pathway content with various information, such as cross-references to Entrez Gene, OMIM, UniProt, ChEBI, and many more. Finally, we compare several existing conversion tools for KEGG pathways and show that the conversion from KEGG to BioPAX does not involve a loss of information, whilst lossless translations to SBML can only be performed using SBML Level 3, including its recently proposed qualitative models and groups extension packages. Building correct BioPAX and SBML signaling models from the KEGG database is a unique characteristic of the proposed method. Further, there is no other approach that is able to appropriately construct metabolic models from KEGG pathways, including correct reactions with stoichiometry. The resulting initial models, which contain valid and comprehensive SBML or BioPAX code and a multitude of cross-references, lay the foundation to facilitate further modeling steps.

Journal ArticleDOI
TL;DR: An in-depth overview of the BiNoM functions is provided, and novel aspects such as the support of the BioPAX Level 3 format and the implementation of a new algorithm for the quantification of pathways for influence networks are detailed.
Abstract: Background: Public repositories of biological pathways and networks have greatly expanded in recent years. Such databases contain many pathways that facilitate the analysis of high-throughput experimental work and the formulation of new biological hypotheses to be tested, a fundamental principle of the systems biology approach. However, large-scale molecular maps are not always easy to mine and interpret.

Journal ArticleDOI
TL;DR: In silico experiments show that the response of the LuxR/LuxI system depends on the interplay between non-stationary and stochastic effects and that the burst size of the transcription/translation noise at the level of LuxR controls the phenotypic variability of the population.
Abstract: A wide range of bacteria species are known to communicate through the so called quorum sensing (QS) mechanism by means of which they produce a small molecule that can freely diffuse in the environment and in the cells. Upon reaching a threshold concentration, the signalling molecule activates the QS-controlled genes that promote phenotypic changes. This mechanism, for its simplicity, has become the model system for studying the emergence of a global response in prokaryotic cells. Yet, how cells precisely measure the signal concentration and act coordinately, despite the presence of fluctuations that unavoidably affects cell regulation and signalling, remains unclear. We propose a model for the QS signalling mechanism in Vibrio fischeri based on the synthetic strains lux01 and lux02. Our approach takes into account the key regulatory interactions between LuxR and LuxI, the autoinducer transport, the cellular growth and the division dynamics. By using both deterministic and stochastic models, we analyze the response and dynamics at the single-cell level and compare them to the global response at the population level. Our results show how fluctuations interfere with the synchronization of the cell activation and lead to a bimodal phenotypic distribution. In this context, we introduce the concept of precision in order to characterize the reliability of the QS communication process in the colony. We show that increasing the noise in the expression of LuxR helps cells to get activated at lower autoinducer concentrations but, at the same time, slows down the global response. The precision of the QS switch under non-stationary conditions decreases with noise, while at steady-state it is independent of the noise value. 
Our in silico experiments show that the response of the LuxR/LuxI system depends on the interplay between non-stationary and stochastic effects and that the burst size of the transcription/translation noise at the level of LuxR controls the phenotypic variability of the population. These results, together with recent experimental evidence on LuxR regulation in wild-type species, suggest that bacteria have evolved mechanisms to regulate the intensity of those fluctuations.

Journal ArticleDOI
TL;DR: The integrated approach gives a more complete picture of the set of miRNAs identified and the Wnt pathway, which represents an important surrogate marker of melanoma progression, which shows its promising potential.
Abstract: High-throughput (omic) data have become more widespread in both quantity and frequency of use, thanks to technological advances, lower costs and higher precision. Consequently, computational scientists are confronted by two parallel challenges: on one side, the design of efficient methods to interpret each of these data in their own right (gene expression signatures, protein markers, etc.) and, on the other side, realization of a novel, pressing request from the biological field to design methodologies that allow for these data to be interpreted as a whole, i.e. not only as the union of relevant molecules in each of these layers, but as a complex molecular signature containing proteins, mRNAs and miRNAs, all of which must be directly associated in the results of analyses that are able to capture inter-layers connections and complexity.

Journal ArticleDOI
TL;DR: The application of instationary 13C-based metabolic flux analysis to P. pastoris provides an experimental framework with improved capabilities to explore the regulation of the carbon and energy metabolism of this yeast, particularly for the case of methanol and multicarbon source metabolism.
Abstract: Background: Several studies have shown that the utilization of mixed carbon feeds instead of methanol as sole carbon source is beneficial for protein production with the methylotrophic yeast Pichia pastoris. In particular, growth under mixed feed conditions appears to alleviate the metabolic burden related to stress responses triggered by protein overproduction and secretion. Yet, detailed analyses of the metabolome and fluxome under mixed carbon source metabolizing conditions are missing. To obtain a detailed flux distribution of central carbon metabolism, including the pentose phosphate pathway under methanol-glucose conditions, we have applied metabolomics and instationary 13C flux analysis in chemostat cultivations. Results: Instationary 13C-based metabolic flux analysis using GC-MS and LC-MS measurements in time allowed for an accurate mapping of metabolic fluxes of glycolysis, pentose phosphate and methanol assimilation pathways. Compared to previous results from NMR-derived stationary state labelling data (proteinogenic amino acids, METAFoR) more fluxes could be determined with higher accuracy. Furthermore, using a thermodynamic metabolic network analysis the metabolite measurements and metabolic flux directions were validated. Notably, the concentration of several metabolites of the upper glycolysis and pentose phosphate pathway increased under glucose-methanol feeding compared to the reference glucose conditions, indicating a shift in the thermodynamic driving forces. Conversely, the extracellular concentrations of all measured metabolites were lower compared with the corresponding exometabolome of glucose-grown P. pastoris cells. The instationary 13C flux analysis resulted in fluxes comparable to those previously obtained from NMR datasets of proteinogenic amino acids, but allowed several additional insights. Specifically, i) in vivo metabolic flux estimations were expanded to a larger metabolic network e.g. by including trehalose recycling, which accounted for about 1.5% of the glucose uptake rate; ii) the reversibility of glycolytic/gluconeogenesis, TCA cycle and pentose phosphate pathways reactions was estimated, revealing a significant gluconeogenic flux from the dihydroxyacetone phosphate/glyceraldehyde phosphate pool to glucose-6P. The origin of this finding could be carbon recycling from the methanol assimilatory pathway to the pentose phosphate pool. Additionally, high exchange fluxes of oxaloacetate with aspartate as well as malate indicated amino acid pool buffering and the activity of the malate/Asp shuttle; iii) the ratio of methanol oxidation vs utilization appeared to be lower (54 vs 79% assimilated methanol directly oxidized to CO2).

Journal ArticleDOI
TL;DR: The results indicate that gene expressions or CNVs indeed provide extra useful information to the original data for the identification of core modules in cancer, and provide several candidate pathways or core modules recurrently perturbed in GBM or ovarian carcinoma for further studies.
Abstract: Understanding the molecular mechanisms underlying cancer is an important step for the effective diagnosis and treatment of cancer patients. With the huge volume of data from the large-scale cancer genomics projects, an open challenge is to distinguish driver mutations, pathways, and gene sets (or core modules) that contribute to cancer formation and progression from random passengers which accumulate in somatic cells but do not contribute to tumorigenesis. Due to mutational heterogeneity, current analyses are often restricted to known pathways and functional modules for enrichment of somatic mutations. Therefore, discovery of new pathways and functional modules is a pressing need. In this study, we propose a novel method to identify Mutated Core Modules in Cancer (iMCMC) without any prior information other than cancer genomic data from patients with tumors. This is a network-based approach in which three kinds of data are integrated: somatic mutations, copy number variations (CNVs), and gene expressions. Firstly, the first two datasets are merged to obtain a mutation matrix, based on which a weighted mutation network is constructed where the vertex weight corresponds to gene coverage and the edge weight corresponds to the mutual exclusivity between gene pairs. Similarly, a weighted expression network is generated from the expression matrix where the vertex and edge weights correspond to the influence of a gene mutation on other genes and the Pearson correlation of gene mutation-correlated expressions, respectively. Then an integrative network is obtained by further combining these two networks, and the most coherent subnetworks are identified by using an optimization model. Finally, we obtained the core modules for tumors by filtering with significance and exclusivity tests.
We applied iMCMC to the Cancer Genome Atlas (TCGA) glioblastoma multiforme (GBM) and ovarian carcinoma data, and identified several mutated core modules, some of which are involved in known pathways. Most of the implicated genes are oncogenes or tumor suppressors previously reported to be related to carcinogenesis. As a comparison, we also performed iMCMC on two of the three kinds of data, i.e., the datasets combining somatic mutations with CNVs and secondly the datasets combining somatic mutations with gene expressions. The results indicate that gene expressions or CNVs indeed provide extra useful information to the original data for the identification of core modules in cancer. This study demonstrates the utility of our iMCMC by integrating multiple data sources to identify mutated core modules in cancer. In addition to presenting a generally applicable methodology, our findings provide several candidate pathways or core modules recurrently perturbed in GBM or ovarian carcinoma for further studies.
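
The weighted mutation network described above can be sketched in a few lines. The abstract only says that vertex weights correspond to gene coverage and edge weights to mutual exclusivity; the formulas used here (coverage as the fraction of patients carrying a mutation, exclusivity as the share of patients mutated in exactly one gene of a pair among those mutated in either) are assumptions for illustration, and the gene and patient identifiers are hypothetical toy data rather than TCGA results.

```python
from itertools import combinations

def coverage(patients_with_mutation, n_patients):
    """Vertex weight: fraction of patients carrying a mutation in the gene."""
    return len(patients_with_mutation) / n_patients

def exclusivity(p_a, p_b):
    """Edge weight: patients mutated in exactly one of the two genes,
    divided by patients mutated in at least one of them."""
    union = p_a | p_b
    if not union:
        return 0.0
    return len(p_a ^ p_b) / len(union)

def mutation_network(mutations, n_patients):
    """Build vertex and edge weights from a gene -> set-of-patients mapping."""
    vertices = {g: coverage(p, n_patients) for g, p in mutations.items()}
    edges = {
        (a, b): exclusivity(mutations[a], mutations[b])
        for a, b in combinations(sorted(mutations), 2)
    }
    return vertices, edges

# Toy mutation matrix for six patients (hypothetical identifiers).
mutations = {
    "g1": {1, 2, 3},
    "g2": {4, 5},
    "g3": {3, 4},
}
vertices, edges = mutation_network(mutations, n_patients=6)
```

In this toy input `g1` and `g2` never co-occur in a patient, so their edge receives the maximum weight of 1.0, while pairs that share patients score lower.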