
Showing papers in "BMC Systems Biology in 2012"


Journal ArticleDOI
TL;DR: HINT addresses the ubiquitous need for a repository of high-quality protein-protein interactions, which can be used to generate hypotheses about specific proteins and/or pathways, as well as to analyze global properties of cellular networks.
Abstract: A global map of protein-protein interactions in cellular systems provides key insights into the workings of an organism. A repository of well-validated high-quality protein-protein interactions can be used in both large- and small-scale studies to generate and validate a wide range of functional hypotheses. We develop HINT ( http://hint.yulab.org ) - a database of high-quality protein-protein interactomes for human, Saccharomyces cerevisiae, Schizosaccharomyces pombe, and Oryza sativa. These were collected from several databases and filtered both systematically and manually to remove low-quality/erroneous interactions. The resulting datasets are classified by type (binary physical interactions vs. co-complex associations) and data source (high-throughput systematic setups vs. literature-curated small-scale experiments). We find strong sociological sampling biases in literature-curated datasets of small-scale interactions. An interactome without such sampling biases was used to understand network properties of human disease-genes - hubs are unlikely to cause disease, but if they do, they usually cause multiple disorders. HINT is of significant interest to researchers in all fields of biology as it addresses the ubiquitous need for a repository of high-quality protein-protein interactions. These datasets can be used to generate hypotheses about specific proteins and/or pathways, as well as to analyze global properties of cellular networks. HINT will be regularly updated and all versions will be tracked.

400 citations


Journal ArticleDOI
TL;DR: A novel, robust, and accurate scoring technique for stability selection is introduced, improving the performance of feature selection with LARS; the resulting method was ranked among the top GRN inference methods in the DREAM5 gene network inference challenge and was evaluated as the best linear-regression-based method in the challenge.
Abstract: Background Inferring the structure of gene regulatory networks (GRN) from a collection of gene expression data has many potential applications, from the elucidation of complex biological processes to the identification of potential drug targets. It is however a notoriously difficult problem, for which the many existing methods reach limited accuracy.

372 citations


Journal ArticleDOI
TL;DR: The eicosanoid metabolic pathway is identified, especially reactions catalyzing the production of leukotrienes from arachidonic acid, as a source of potential drug targets that selectively affect tumor tissues.
Abstract: Background Human tissues perform diverse metabolic functions. Mapping out these tissue-specific functions in genome-scale models will advance our understanding of the metabolic basis of various physiological and pathological processes. The global knowledgebase of metabolic functions categorized for the human genome (Human Recon 1) coupled with abundant high-throughput data now makes possible the reconstruction of tissue-specific metabolic models. However, the number of available tissue-specific models remains incomplete compared with the large diversity of human tissues.

268 citations


Journal ArticleDOI
TL;DR: The integration of protein-protein interaction network and gene expression data can help improve the precision of predicting essential proteins, and the proposed new centrality measure PeC is an effective essential protein discovery method.
Abstract: Background: Identification of essential proteins is always a challenging task since it requires experimental approaches that are time-consuming and laborious. With the advances in high-throughput technologies, a large number of protein-protein interactions are available, which have produced unprecedented opportunities for detecting proteins’ essentialities at the network level. There have been a series of computational approaches proposed for predicting essential proteins based on network topologies. However, the network topology-based centrality measures are very sensitive to the robustness of the network. Therefore, a new robust essential protein discovery method would be of great value. Results: In this paper, we propose a new centrality measure, named PeC, based on the integration of protein-protein interaction and gene expression data. The performance of PeC is validated based on the protein-protein interaction network of Saccharomyces cerevisiae. The experimental results show that the prediction precision of PeC clearly exceeds that of the other fifteen previously proposed centrality measures: Degree Centrality (DC), Betweenness Centrality (BC), Closeness Centrality (CC), Subgraph Centrality (SC), Eigenvector Centrality (EC), Information Centrality (IC), Bottle Neck (BN), Density of Maximum Neighborhood Component (DMNC), Local Average Connectivity-based method (LAC), Sum of ECC (SoECC), Range-Limited Centrality (RL), L-index (LI), Leader Rank (LR), Normalized α-Centrality (NC), and Moduland-Centrality (MC). Notably, the improvement of PeC over the classic centrality measures (BC, CC, SC, EC, and BN) is more than 50% when predicting no more than 500 proteins. Conclusions: We demonstrate that the integration of protein-protein interaction network and gene expression data can help improve the precision of predicting essential proteins. The new centrality measure, PeC, is an effective essential protein discovery method.

209 citations
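The abstract above describes PeC as a combination of topological cohesion and expression correlation summed over a protein's neighbors. The sketch below illustrates that idea for an unweighted network; it is not the authors' implementation, and the function names, the non-negativity clamp on the correlation, and the toy edge-clustering definition are our assumptions.

```python
from math import sqrt

def pearson(x, y):
    # plain Pearson correlation coefficient of two equal-length profiles
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (sx * sy)

def edge_clustering(adj, u, v):
    # fraction of possible triangles that the edge (u, v) actually closes
    common = len(adj[u] & adj[v])
    denom = min(len(adj[u]) - 1, len(adj[v]) - 1)
    return common / denom if denom > 0 else 0.0

def pec_like(adj, expr):
    # score each protein by summing, over its neighbors, the product of
    # topological cohesion (edge clustering) and expression correlation
    return {v: sum(edge_clustering(adj, v, u) * max(0.0, pearson(expr[v], expr[u]))
                   for u in adj[v])
            for v in adj}
```

On a toy network, a protein inside a co-expressed triangle scores much higher than a peripheral, anti-correlated one, which is the qualitative behavior the measure is after.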


Journal ArticleDOI
TL;DR: The Cell Collective is a web-based platform that enables laboratory scientists from across the globe to collaboratively build large-scale models of various biological processes, and simulate/analyze them in real time.
Abstract: Despite decades of new discoveries in biomedical research, the overwhelming complexity of cells has been a significant barrier to a fundamental understanding of how cells work as a whole. As such, the holistic study of biochemical pathways requires computer modeling. Due to the complexity of cells, it is not feasible for one person or group to model the cell in its entirety. The Cell Collective is a platform that allows the world-wide scientific community to create these models collectively. Its interface enables users to build and use models without specifying any mathematical equations or computer code - addressing one of the major hurdles with computational research. In addition, this platform allows scientists to simulate and analyze the models in real-time on the web, including the ability to simulate loss/gain of function and test what-if scenarios in real time. The Cell Collective is a web-based platform that enables laboratory scientists from across the globe to collaboratively build large-scale models of various biological processes, and simulate/analyze them in real time. In this manuscript, we show examples of its application to a large-scale model of signal transduction.

203 citations


Journal ArticleDOI
TL;DR: In this article, the authors propose a formalism that can take advantage of complex and dynamic networks to build models of cellular signaling in both physiological and diseased situations, using context-specific medium/high throughput proteomic data.
Abstract: Background Cells process signals using complex and dynamic networks. Studying how this is performed in a context and cell type specific way is essential to understand signaling both in physiological and diseased situations. Context-specific medium/high throughput proteomic data measured upon perturbation is now relatively easy to obtain but formalisms that can take advantage of these features to build models of signaling are still comparatively scarce.

200 citations


Journal ArticleDOI
TL;DR: This paper demonstrates, in a series of examples with high relevance to the molecular systems biology community, that the proposed software framework, URDME, is a useful tool for both practitioners and developers of spatial stochastic simulation algorithms.
Abstract: Experiments in silico using stochastic reaction-diffusion models have emerged as an important tool in molecular systems biology. Designing computational software for such applications poses several challenges. Firstly, realistic lattice-based modeling for biological applications requires a consistent way of handling complex geometries, including curved inner and outer boundaries. Secondly, spatiotemporal stochastic simulations are computationally expensive due to the fast time scales of individual reaction and diffusion events when compared to the biological phenomena of actual interest. We therefore argue that simulation software needs to be computationally efficient, employing sophisticated algorithms, yet at the same time flexible in order to meet present and future needs of increasingly complex biological modeling. We have developed URDME, a flexible software framework for general stochastic reaction-transport modeling and simulation. URDME uses Unstructured triangular and tetrahedral meshes to resolve general geometries, and relies on the Reaction-Diffusion Master Equation formalism to model the processes under study. An interface to mature external geometry- and mesh-handling software (Comsol Multiphysics) provides a stable and interactive environment for model construction. The core simulation routines are logically separated from the model-building interface and written in a low-level language for computational efficiency. The connection to the geometry-handling software is realized via a Matlab interface which facilitates script computing, data management, and post-processing. For practitioners, the software therefore behaves much like an interactive Matlab toolbox. At the same time, it is possible to modify and extend URDME with newly developed simulation routines.
Since the overall design effectively hides the complexity of managing the geometry and meshes, this means that newly developed methods may be tested in a realistic setting already at an early stage of development. In this paper we demonstrate, in a series of examples with high relevance to the molecular systems biology community, that the proposed software framework is a useful tool for both practitioners and developers of spatial stochastic simulation algorithms. Through the combined efforts of algorithm development and improved modeling accuracy, increasingly complex biological models become feasible to study through computational methods. URDME is freely available at http://www.urdme.org .

168 citations


Journal ArticleDOI
TL;DR: Vanted is a stand-alone framework that supports scientists during the data analysis and interpretation phase. It comprises a comprehensive set of seven main tasks, ranging from network reconstruction, data visualization, and integration of various data types to network simulation and data exploration, combined with manifold support of systems biology standards for visualization and data exchange.
Abstract: Background Experimental datasets are becoming larger and increasingly complex, spanning different data domains, thereby expanding the requirements for respective tool support for their analysis. Networks provide a basis for the integration, analysis and visualization of multi-omics experimental datasets.

164 citations


Journal ArticleDOI
TL;DR: An update to the Yeast Consensus Reconstruction is constructed, Yeast 5, which expands and refines the computational reconstruction of yeast metabolism and improves the predictive accuracy of a stoichiometrically constrained yeast metabolic model.
Abstract: Background: Efforts to improve the computational reconstruction of the Saccharomyces cerevisiae biochemical reaction network and to refine the stoichiometrically constrained metabolic models that can be derived from such a reconstruction have continued since the first stoichiometrically constrained yeast genome scale metabolic model was published in 2003. Continuing this ongoing process, we have constructed an update to the Yeast Consensus Reconstruction, Yeast 5. The Yeast Consensus Reconstruction is a product of efforts to forge a community-based reconstruction emphasizing standards compliance and biochemical accuracy via evidence-based selection of reactions. It draws upon models published by a variety of independent research groups as well as information obtained from biochemical databases and primary literature. Results: Yeast 5 refines the biochemical reactions included in the reconstruction, particularly reactions involved in sphingolipid metabolism; updates gene-reaction annotations; and emphasizes the distinction between reconstruction and stoichiometrically constrained model. Although it was not a primary goal, this update also improves the accuracy of model prediction of viability and auxotrophy phenotypes and increases the number of epistatic interactions. This update maintains an emphasis on standards compliance, unambiguous metabolite naming, and computer-readable annotations available through a structured document format. Additionally, we have developed MATLAB scripts to evaluate the model’s predictive accuracy and to demonstrate basic model applications such as simulating aerobic and anaerobic growth. These scripts, which provide an independent tool for evaluating the performance of various stoichiometrically constrained yeast metabolic models using flux balance analysis, are included as Additional files 1, 2 and 3. 
Conclusions: Yeast 5 expands and refines the computational reconstruction of yeast metabolism and improves the predictive accuracy of a stoichiometrically constrained yeast metabolic model. It differs from previous reconstructions and models by emphasizing the distinction between the yeast metabolic reconstruction and the stoichiometrically constrained model, and makes both available as Additional file 4 and Additional file 5 and at http://yeast.sf.net/ as separate systems biology markup language (SBML) files. Through this separation, we intend to make the modeling process more accessible, explicit, transparent, and reproducible.

157 citations


Journal ArticleDOI
TL;DR: STEPS simulates models of cellular reaction–diffusion systems with complex boundaries with high accuracy and high performance in C/C++, controlled by a powerful and user-friendly Python interface.
Abstract: Models of cellular molecular systems are built from components such as biochemical reactions (including interactions between ligands and membrane-bound proteins), conformational changes and active and passive transport. A discrete, stochastic description of the kinetics is often essential to capture the behavior of the system accurately. Where spatial effects play a prominent role the complex morphology of cells may have to be represented, along with aspects such as chemical localization and diffusion. This high level of detail makes efficiency a particularly important consideration for software that is designed to simulate such systems. We describe STEPS, a stochastic reaction–diffusion simulator developed with an emphasis on simulating biochemical signaling pathways accurately and efficiently. STEPS supports all the above-mentioned features, and well-validated support for SBML allows many existing biochemical models to be imported reliably. Complex boundaries can be represented accurately in externally generated 3D tetrahedral meshes imported by STEPS. The powerful Python interface facilitates model construction and simulation control. STEPS implements the composition and rejection method, a variation of the Gillespie SSA, supporting diffusion between tetrahedral elements within an efficient search and update engine. Additional support for well-mixed conditions and for deterministic model solution is implemented. Solver accuracy is confirmed with an original and extensive validation set consisting of isolated reaction, diffusion and reaction–diffusion systems. Accuracy imposes upper and lower limits on tetrahedron sizes, which are described in detail. By comparing to Smoldyn, we show how the voxel-based approach in STEPS is often faster than particle-based methods, with increasing advantage in larger systems, and by comparing to MesoRD we show the efficiency of the STEPS implementation. 
STEPS simulates models of cellular reaction–diffusion systems with complex boundaries with high accuracy and high performance in C/C++, controlled by a powerful and user-friendly Python interface. STEPS is free for use and is available at http://steps.sourceforge.net/

152 citations
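STEPS implements the composition-and-rejection variant of the Gillespie SSA; the underlying logic is easiest to see in the classic direct method, sketched below for a single well-mixed compartment. This is a generic illustration, not STEPS code, and all names and data structures are our own.

```python
import random

def gillespie_direct(x, reactions, t_end, seed=0):
    """Direct-method SSA. x: dict of species counts;
    reactions: list of (propensity_fn, stoichiometry_change) pairs."""
    rng = random.Random(seed)
    t, traj = 0.0, [(0.0, dict(x))]
    while t < t_end:
        props = [a(x) for a, _ in reactions]
        a0 = sum(props)
        if a0 == 0:                      # no reaction can fire any more
            break
        t += rng.expovariate(a0)         # exponential waiting time to next event
        r = rng.random() * a0            # pick a reaction proportional to propensity
        for (a, change), p in zip(reactions, props):
            if r < p:
                for species, delta in change.items():
                    x[species] += delta
                break
            r -= p
        traj.append((t, dict(x)))
    return traj
```

For the isomerization A → B with propensity 0.5·A, the trajectory conserves A + B at every step and eventually converts all of A, as expected for this simple system.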


Journal ArticleDOI
TL;DR: A new general method is derived and shown to correctly describe the statistics of intrinsic noise about the macroscopic concentrations under timescale separation conditions, which is expected to be of widespread utility in studying the dynamics of large noisy reaction networks.
Abstract: Background: It is well known that the deterministic dynamics of biochemical reaction networks can be more easily studied if timescale separation conditions are invoked (the quasi-steady-state assumption). In this case the deterministic dynamics of a large network of elementary reactions are well described by the dynamics of a smaller network of effective reactions. Each of the latter represents a group of elementary reactions in the large network and has associated with it an effective macroscopic rate law. A popular method to achieve model reduction in the presence of intrinsic noise consists of using the effective macroscopic rate laws to heuristically deduce effective probabilities for the effective reactions which then enables simulation via the stochastic simulation algorithm (SSA). The validity of this heuristic SSA method is a priori doubtful because the reaction probabilities for the SSA have only been rigorously derived from microscopic physics arguments for elementary reactions. Results: We here obtain, by rigorous means and in closed-form, a reduced linear Langevin equation description of the stochastic dynamics of monostable biochemical networks in conditions characterized by small intrinsic noise and timescale separation. The slow-scale linear noise approximation (ssLNA), as the new method is called, is used to calculate the intrinsic noise statistics of enzyme and gene networks. The results agree very well with SSA simulations of the non-reduced network of elementary reactions. In contrast the conventional heuristic SSA is shown to overestimate the size of noise for Michaelis-Menten kinetics, considerably under-estimate the size of noise for Hill-type kinetics and in some cases even miss the prediction of noise-induced oscillations. Conclusions: A new general method, the ssLNA, is derived and shown to correctly describe the statistics of intrinsic noise about the macroscopic concentrations under timescale separation conditions. 
The ssLNA provides a simple and accurate means of performing stochastic model reduction and hence it is expected to be of widespread utility in studying the dynamics of large noisy reaction networks, as is common in computational and systems biology.

Journal ArticleDOI
TL;DR: Using quantitative transcriptomics data acquired from Saccharomyces cerevisiae cultures under two growth conditions, the method outperforms traditional approaches for predicting experimentally measured exometabolic flux that are reliant upon maximisation of the rate of biomass production.
Abstract: Constraint-based analysis of genome-scale metabolic models typically relies upon maximisation of a cellular objective function such as the rate or efficiency of biomass production. Whilst this assumption may be valid in the case of microorganisms growing under certain conditions, it is likely invalid in general, and especially for multicellular organisms, where cellular objectives differ greatly both between and within cell types. Moreover, for the purposes of biotechnological applications, it is normally the flux to a specific metabolite or product that is of interest rather than the rate of production of biomass per se.
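The constraint-based setup the abstract refers to — maximize a flux objective subject to steady-state mass balance S·v = 0 and flux bounds — reduces to a linear program. A minimal sketch for a hypothetical three-reaction toy network follows, using scipy.optimize.linprog; the network, bounds, and objective are illustrative assumptions, not the authors' model.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network: uptake -> A -> B -> product (fluxes v1, v2, v3).
S = np.array([[1, -1,  0],    # metabolite A: produced by v1, consumed by v2
              [0,  1, -1]])   # metabolite B: produced by v2, consumed by v3
c = np.array([0, 0, -1])      # maximize v3; linprog minimizes, hence the sign
bounds = [(0, 10), (0, None), (0, None)]  # uptake capacity capped at 10 units

res = linprog(c, A_eq=S, b_eq=np.zeros(2), bounds=bounds, method="highs")
fluxes = res.x  # optimal steady-state flux distribution
```

Maximizing the flux to a product rather than a biomass pseudo-reaction, as the abstract advocates, amounts to nothing more than choosing a different objective vector c.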

Journal ArticleDOI
TL;DR: Comparing C3 and C4 metabolic networks using improved constraint-based models for Arabidopsis and maize demonstrated that, in contrast to C3, C4 plants have a less dense topology, higher robustness, better modularity, and higher CO2 and radiation use efficiency.
Abstract: The C4 photosynthetic cycle supercharges photosynthesis by concentrating CO2 around ribulose-1,5-bisphosphate carboxylase and significantly reduces the oxygenation reaction. Therefore, engineering the C4 feature into C3 plants has been suggested as a feasible way to increase the photosynthesis and yield of C3 plants, such as rice, wheat, and potato. To identify the possible transition from C3 to C4 plants, a systematic comparison of C3 and C4 metabolism is necessary. We compared C3 and C4 metabolic networks using the improved constraint-based models for Arabidopsis and maize. By graph theory, we found that the C3 network exhibits a denser topology than C4. The simulation of enzyme knockouts demonstrated that both C3 and C4 networks are very robust, especially when optimizing CO2 fixation. Moreover, the C4 network has better robustness whether the objective function is biomass synthesis or CO2 fixation. In addition, all the essential reactions in the C3 network are also essential for C4, while some other reactions are specifically essential for C4, which validated that the basic metabolism of C4 plants is similar to that of C3, but C4 is more complex. We also identified more correlated reaction sets in C4, and demonstrated that C4 plants have better modularity, with a more complex mechanism coordinating the reactions and pathways, than C3 plants. We also found that biomass production and CO2 fixation increase faster with light intensity and CO2 concentration in C4 than in C3, reflecting the more efficient use of light and CO2 in C4 plants. Finally, we explored the contribution of different C4 subtypes to biomass production by setting specific constraints. All results are consistent with the actual situation, which indicates that Flux Balance Analysis is a powerful method to study plant metabolism at the systems level.
We demonstrated that, in contrast to C3, C4 plants have a less dense topology, higher robustness, better modularity, and higher CO2 and radiation use efficiency. In addition, preliminary analysis indicated that the rate of CO2 fixation and biomass production in the PCK subtype is superior to that in the NADP-ME and NAD-ME subtypes under sufficient supply of water and nitrogen.

Journal ArticleDOI
TL;DR: Together, these changes improve the ability of DREM 2.0 to accurately recover dynamic regulatory networks and make it much easier to use for analyzing such networks in several species with varying degrees of interaction information.
Abstract: Modeling dynamic regulatory networks is a major challenge since much of the protein-DNA interaction data available is static. The Dynamic Regulatory Events Miner (DREM) uses a Hidden Markov Model-based approach to integrate this static interaction data with time series gene expression, leading to models that can determine when transcription factors (TFs) activate genes and what genes they regulate. DREM has been used successfully in diverse areas of biological research. However, several issues were not addressed by the original version. DREM 2.0 is a comprehensive software package for reconstructing dynamic regulatory networks that supports both an interactive graphical mode and a batch mode. Version 2.0 introduces a set of new features that are unique in comparison with other software. First, we provide static interaction data for additional species. Second, DREM 2.0 now accepts continuous binding values, and we added a new method to utilize TF expression levels when searching for dynamic models. Third, we added support for discriminative motif discovery, which is particularly powerful for species with limited experimental interaction data. Finally, we improved the visualization to support the new features. Combined, these changes improve the ability of DREM 2.0 to accurately recover dynamic regulatory networks and make it much easier to use for analyzing such networks in several species with varying degrees of interaction information. DREM 2.0 provides a unique framework for constructing and visualizing dynamic regulatory networks. DREM 2.0 can be downloaded from: http://www.sb.cs.cmu.edu/drem .

Journal ArticleDOI
TL;DR: Seventy-two miRNA-mRNA pairs, formed by 22 dysregulated miRNAs and their 58 target mRNAs and identified by the multi-step approach, appear to be involved in CRC tumorigenesis.
Abstract: MicroRNAs (miRNAs) are involved in carcinogenesis and tumor progression by regulating post-transcriptional gene expression. However, the miRNA-mRNA regulatory network is far from being fully understood. The objective of this study is to identify colorectal cancer (CRC) specific miRNAs and their target mRNAs using a multi-step approach. A multi-step approach combining microarray miRNA and mRNA expression profiling and bioinformatics analysis was adopted to identify the CRC-specific miRNA-mRNA regulatory network. First, 32 differentially expressed miRNAs and 2916 mRNAs from CRC samples and their corresponding normal epithelial tissues were identified by miRNA and mRNA microarray, respectively. Secondly, 22 dysregulated miRNAs and their 58 target mRNAs (72 miRNA-mRNA pairs) were identified by a combination of Pearson’s correlation analysis and prediction by the databases TargetScan and miRanda. Bioinformatics analysis revealed that these miRNA-mRNA pairs were involved in the Wnt signaling pathway. Additionally, 6 up-regulated miRNAs (mir-21, mir-223, mir-224, mir-29a, mir-29b, and mir-27a) and 4 down-regulated predicted target mRNAs (SFRP1, SFRP2, RNF138, and KLF4) were selected to validate their expression levels and anti-correlation in an extended cohort of CRC patients by qRT-PCR. Except for mir-27a, the differential expression and anti-correlation were confirmed. Finally, a transfection assay was performed to validate the regulatory relationship between mir-29a and KLF4 at both the RNA and protein levels. The 72 miRNA-mRNA pairs, formed by 22 dysregulated miRNAs and their 58 target mRNAs and identified by the multi-step approach, appear to be involved in CRC tumorigenesis. These results warrant further investigation via functional studies to fully understand the underlying regulatory mechanisms of miRNAs in CRC.
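The correlation-based filtering step described above — keeping only predicted target pairs whose expression is anti-correlated across samples — can be sketched as follows. The threshold, identifiers, and data layout are illustrative assumptions, not values taken from the study.

```python
from math import sqrt

def pearson(x, y):
    # Pearson correlation of two expression profiles across the same samples
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (sx * sy)

def anticorrelated_pairs(mirna_expr, mrna_expr, predicted_targets, r_cut=-0.5):
    # keep (miRNA, mRNA) pairs that are both database-predicted targets and
    # negatively correlated across the matched sample expression profiles
    pairs = []
    for mir, targets in predicted_targets.items():
        for gene in targets:
            r = pearson(mirna_expr[mir], mrna_expr[gene])
            if r <= r_cut:
                pairs.append((mir, gene, r))
    return pairs
```

Intersecting this anti-correlation filter with TargetScan/miRanda-style predictions, as the study does, sharply reduces the false-positive rate of either source alone.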

Journal ArticleDOI
TL;DR: It is found that cor(K,C) reveals the profound effect of Huntington’s disease on samples from the caudate nucleus relative to other brain regions, highlighting a new strategy for exploring the effects of disease on sets of genes.
Abstract: Background: Genomic datasets generated by new technologies are increasingly prevalent in disparate areas of biological research. While many studies have sought to characterize relationships among genomic features, commensurate efforts to characterize relationships among biological samples have been less common. Consequently, the full extent of sample variation in genomic studies is often under-appreciated, complicating downstream analytical tasks such as gene co-expression network analysis. Results: Here we demonstrate the use of network methods for characterizing sample relationships in microarray data generated from human brain tissue. We describe an approach for identifying outlying samples that does not depend on the choice or use of clustering algorithms. We introduce a battery of measures for quantifying the consistency and integrity of sample relationships, which can be compared across disparate studies, technology platforms, and biological systems. Among these measures, we provide evidence that the correlation between the connectivity and the clustering coefficient (two important network concepts) is a sensitive indicator of homogeneity among biological samples. We also show that this measure, which we refer to as cor(K,C), can distinguish biologically meaningful relationships among subgroups of samples. Specifically, we find that cor(K,C) reveals the profound effect of Huntington’s disease on samples from the caudate nucleus relative to other brain regions. Furthermore, we find that this effect is concentrated in specific modules of genes that are naturally co-expressed in human caudate nucleus, highlighting a new strategy for exploring the effects of disease on sets of genes. Conclusions: These results underscore the importance of systematically exploring sample relationships in large genomic datasets before seeking to analyze genomic feature activity. 
We introduce a standardized platform for this purpose using freely available R software that has been designed to enable iterative and interactive exploration of sample networks.
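The cor(K,C) measure is simply the Pearson correlation between each node's connectivity and its clustering coefficient. The published method works with weighted sample networks; the unweighted version below only illustrates the concept, and all names are ours.

```python
from math import sqrt

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return (sum((a - mx) * (b - my) for a, b in zip(x, y)) / (sx * sy)
            if sx and sy else 0.0)

def clustering_coeff(adj, v):
    # fraction of a node's neighbor pairs that are themselves connected
    nbrs = list(adj[v])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if nbrs[j] in adj[nbrs[i]])
    return 2 * links / (k * (k - 1))

def cor_k_c(adj):
    # correlation between node connectivity K and clustering coefficient C
    nodes = list(adj)
    k = [len(adj[v]) for v in nodes]
    c = [clustering_coeff(adj, v) for v in nodes]
    return pearson(k, c)
```

A drop in cor(K,C) within a subgroup of samples is the signal the authors use to flag heterogeneity, such as the Huntington's disease effect in caudate nucleus samples.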

Journal ArticleDOI
TL;DR: An iterative method for predicting essential proteins by integrating orthology with PPI networks, named ION, identifies a large number of essential proteins that have been ignored by eight other existing centrality methods because of their low connectivity.
Abstract: Identification of essential proteins plays a significant role in understanding the minimal requirements for cellular survival and development. Many computational methods have been proposed for predicting essential proteins using the topological features of protein-protein interaction (PPI) networks. However, most of these methods ignore the intrinsic biological meaning of proteins. Moreover, PPI data contain many false positives and false negatives. To overcome these limitations, many research groups have recently started to focus on identifying essential proteins by integrating PPI networks with other biological information. However, none of their methods has been widely acknowledged. Considering that essential proteins are more evolutionarily conserved than nonessential proteins and frequently bind each other, we propose an iterative method for predicting essential proteins by integrating orthology with PPI networks, named ION. Unlike other methods, ION identifies essential proteins based not only on the connections between proteins but also on their orthologous properties and the features of their neighbors. ION is implemented to predict essential proteins in S. cerevisiae. Experimental results show that ION achieves higher identification accuracy than eight other existing centrality methods in terms of area under the curve (AUC). Moreover, ION identifies a large number of essential proteins that have been ignored by the eight other centrality methods because of their low connectivity. Many proteins ranked in the top 100 by ION are both essential and belong to complexes with certain biological functions. Furthermore, no matter how many reference organisms were selected, ION outperforms all eight other existing centrality methods, and using as many reference organisms as possible further improves its performance. Additionally, ION also shows good prediction performance in E. coli K-12.
The accuracy of predicting essential proteins can be improved by integrating the orthology with PPI networks.
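ION's core idea — iteratively propagating an orthology-derived score over the PPI network so that a protein's rank reflects both its own conservation and that of its neighbors — resembles a personalized PageRank. The sketch below is a hedged illustration of that scheme; the damping factor, update rule, and names are our assumptions, not the published algorithm.

```python
def ion_like(adj, orthology, alpha=0.5, iters=100):
    # score(v) blends the protein's own orthology evidence with the
    # degree-normalized scores of its interaction partners
    score = dict(orthology)
    for _ in range(iters):
        score = {v: (1 - alpha) * orthology[v]
                    + alpha * sum(score[u] / len(adj[u]) for u in adj[v])
                 for v in adj}
    return score
```

Because the orthology term acts as a restart vector, a conserved but low-degree protein can still rank highly, which is precisely the low-connectivity behavior the abstract highlights.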

Journal ArticleDOI
TL;DR: AlzPathway is the first comprehensive map of intracellular, intercellular, and extracellular AD signaling pathways, which can enable mechanistic deciphering of AD pathogenesis.
Abstract: Background: Alzheimer’s disease (AD) is the most common cause of dementia among the elderly. To clarify the pathogenesis of AD, thousands of reports have accumulated. However, knowledge of signaling pathways in the field of AD has not previously been compiled as a database. Description: Here, we have constructed a publicly available pathway map called “AlzPathway” that comprehensively catalogs signaling pathways in the field of AD. We have collected and manually curated over 100 review articles related to AD, and have built an AD pathway map using CellDesigner. AlzPathway is currently composed of 1347 molecules and 1070 reactions in neuron, brain blood barrier, presynaptic, postsynaptic, astrocyte, and microglial cells and their cellular localizations. AlzPathway is available as both an SBML (Systems Biology Markup Language) map for CellDesigner and a high-resolution image map. AlzPathway is also available as a web service (online map) based on the Payao system, a community-based, collaborative web service platform for pathway model curation, enabling continuous updates by AD researchers. Conclusions: AlzPathway is the first comprehensive map of intracellular, intercellular, and extracellular AD signaling pathways, which can enable mechanistic deciphering of AD pathogenesis. The AlzPathway map is accessible at http://alzpathway.org/.

Journal ArticleDOI
TL;DR: Both genome-scale metabolic models of P. pastoris and P. stipitis are useful frameworks to explore the versatility of these yeasts and to capitalize on their biotechnological potentials.
Abstract: Background Pichia stipitis and Pichia pastoris have long been investigated due to their native abilities to metabolize every sugar from lignocellulose and to modulate methanol consumption, respectively. The latter has been driving the production of several recombinant proteins. As a result, significant advances in their biochemical knowledge, as well as in genetic engineering and fermentation methods have been generated. The release of their genome sequences has allowed systems level research.

Journal ArticleDOI
TL;DR: In this article, the prediction profile likelihood (PPL) is used to calculate reliable confidence intervals for model predictions, enabling a data-based observability analysis of a biochemical network.
Abstract: Background: Predicting a system’s behavior based on a mathematical model is a primary task in Systems Biology. If the model parameters are estimated from experimental data, the parameter uncertainty has to be translated into confidence intervals for model predictions. For dynamic models of biochemical networks, the nonlinearity in combination with the large number of parameters hampers the calculation of prediction confidence intervals and renders classical approaches hardly feasible. Results: In this article, reliable confidence intervals are calculated based on the prediction profile likelihood. Such prediction confidence intervals of the dynamic states can be utilized for a data-based observability analysis. The method is also applicable if there are non-identifiable parameters, which yield insufficiently specified model predictions that can be interpreted as non-observability. Moreover, a validation profile likelihood is introduced that should be applied when noisy validation experiments are to be interpreted. Conclusions: The presented methodology allows the propagation of uncertainty from experimental data to model predictions. Although presented in the context of ordinary differential equations, the concept is general and also applicable to other types of models. Matlab code which can be used as a template to implement the method is provided at http://www.fdmold.uni-freiburg.de/∼ckreutz/PPL.
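The core construction can be sketched on a toy model (our own example, not the authors' Matlab code): fit y = A·exp(-k·t) to noisy data, then profile a predicted value z = y(t_pred) by re-optimizing the remaining parameter freedom under the constraint that the model reproduces z.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
t = np.linspace(0, 2, 10)
sigma = 0.05
y_obs = 1.0 * np.exp(-1.5 * t) + rng.normal(0, sigma, t.size)
t_pred = 3.0    # extrapolation point whose prediction we profile

def chi2_at_prediction(z):
    # minimize chi2 over k, with A eliminated via the constraint A*exp(-k*t_pred) = z
    def chi2(k):
        A = z * np.exp(k * t_pred)
        return np.sum((y_obs - A * np.exp(-k * t)) ** 2) / sigma**2
    return minimize_scalar(chi2, bounds=(0.1, 5.0), method="bounded").fun

z_grid = np.linspace(0.001, 0.06, 120)
ppl = np.array([chi2_at_prediction(z) for z in z_grid])
# 95% prediction confidence interval: all z within the chi2 threshold of the minimum
inside = z_grid[ppl <= ppl.min() + 3.84]
print(f"95% PCI for y({t_pred}): [{inside.min():.4f}, {inside.max():.4f}]")
```

The resulting interval is a prediction confidence interval in the paper's sense: it quantifies how tightly the data pin down the extrapolated state, without ever computing parameter confidence intervals directly.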

Journal ArticleDOI
TL;DR: Y. lipolytica iNL895 represents the first well-annotated metabolic model of an oleaginous yeast, providing a base for future metabolic improvement, and a starting point for the metabolic reconstruction of other species in the Yarrowia clade and other oleaginous yeasts.
Abstract: Yarrowia lipolytica is an oleaginous yeast which has emerged as an important microorganism for several biotechnological processes, such as the production of organic acids, lipases and proteases. It is also considered a good candidate for single-cell oil production. Although some of its metabolic pathways are well studied, its metabolic engineering is hindered by the lack of a genome-scale model that integrates the current knowledge about its metabolism. Combining in silico tools and expert manual curation, we have produced an accurate genome-scale metabolic model for Y. lipolytica. Using a scaffold derived from a functional metabolic model of the well-studied but phylogenetically distant yeast S. cerevisiae, we mapped conserved reactions, rewrote gene associations, added species-specific reactions and inserted specialized copies of scaffold reactions to account for species-specific expansion of protein families. We used physiological measures obtained under lab conditions to validate our predictions. Y. lipolytica iNL895 represents the first well-annotated metabolic model of an oleaginous yeast, providing a base for future metabolic improvement, and a starting point for the metabolic reconstruction of other species in the Yarrowia clade and other oleaginous yeasts.
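Predictions from a genome-scale reconstruction like iNL895 are typically obtained by flux balance analysis (FBA). A minimal sketch on a hypothetical three-reaction network (toy stoichiometry, not from the model) shows the calculation that scales up to hundreds of reactions:

```python
import numpy as np
from scipy.optimize import linprog

# Metabolites (rows): A, B.  Reactions (columns):
#   R1: -> A (uptake)   R2: A -> B   R3: B -> (biomass drain)
S = np.array([[1.0, -1.0,  0.0],
              [0.0,  1.0, -1.0]])
bounds = [(0, 10), (0, None), (0, None)]   # substrate uptake capped at 10
c = np.array([0.0, 0.0, -1.0])             # maximize R3 (linprog minimizes c @ v)
res = linprog(c, A_eq=S, b_eq=np.zeros(2), bounds=bounds)
print(res.x)   # steady state forces v1 = v2 = v3, so the optimum is [10, 10, 10]
```

The steady-state constraint S·v = 0 plus flux bounds define the feasible space; validation against physiological measurements, as in the paper, amounts to comparing such optimal fluxes with measured uptake and growth rates.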

Journal ArticleDOI
TL;DR: The NPA scoring method leverages high-throughput measurements and a priori literature-derived knowledge in the form of network models to characterize the activity change for a broad collection of biological processes at high-resolution.
Abstract: High-throughput measurement technologies produce data sets that have the potential to elucidate the biological impact of disease, drug treatment, and environmental agents on humans. The scientific community faces an ongoing challenge in the analysis of these rich data sources to more accurately characterize biological processes that have been perturbed at the mechanistic level. Here, a new approach is built on previous methodologies in which high-throughput data was interpreted using prior biological knowledge of cause and effect relationships. These relationships are structured into network models that describe specific biological processes, such as inflammatory signaling or cell cycle progression. This enables quantitative assessment of network perturbation in response to a given stimulus.
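The aggregation step can be illustrated with a deliberately simplified score (our own construction, not the published NPA formula): a network model lists signed cause-and-effect edges from a process node to measurable genes, and the score aggregates sign-adjusted differential expression over those edges.

```python
import numpy as np

def perturbation_score(edges, log2fc):
    """edges: (gene, sign) pairs downstream of the process node;
    log2fc: measured differential expression per gene."""
    vals = np.array([sign * log2fc[gene] for gene, sign in edges])
    return float(np.mean(vals))

# toy inflammatory-signaling model: three activated targets, one repressed target
edges = [("IL6", +1), ("TNF", +1), ("CXCL8", +1), ("SOCS3", -1)]
log2fc = {"IL6": 2.0, "TNF": 1.5, "CXCL8": 1.0, "SOCS3": -0.5}
score = perturbation_score(edges, log2fc)
print(score)   # positive: the data coherently activate the modeled process
```

A positive score means the measured changes agree with the causal directions encoded in the network model; incoherent data would drive the score toward zero or below.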

Journal ArticleDOI
TL;DR: A simplified model of metabolic switching suggests that the extra energy generated during acetate production produces an additional optimal growth mode that smoothens the metabolic switch in E. coli.
Abstract: Low-yield metabolism is a puzzling phenomenon in many unicellular and multicellular organisms. In an abundance of glucose, many cells use a highly wasteful fermentation pathway despite the availability of a high-yield pathway such as oxidative phosphorylation, which produces many more ATP molecules per glucose. Some of these organisms, including the lactic acid bacterium Lactococcus lactis, downregulate their high-yield pathway in favor of the low-yield pathway. Other organisms, including Escherichia coli, do not reduce the flux through the high-yield pathway, employing the low-yield pathway in parallel with a fully active high-yield pathway. Why do some species use the high-yield and low-yield pathways concurrently, and what makes others downregulate the high-yield pathway? A classic rationale for metabolic fermentation is overflow metabolism. Because the throughput of metabolic pathways is limited, an influx of glucose exceeding a pathway's throughput capacity is thought to be redirected into an alternative, low-yield pathway. This overflow rationale suggests that cells would only use fermentation once the high-yield pathway runs at maximum rate, but it cannot explain why cells would decrease the flux through the high-yield pathway. Using flux balance analysis with molecular crowding (FBAwMC), a recent extension to flux balance analysis (FBA) that assumes the total flux through the metabolic network is limited, we investigate the differences between Saccharomyces cerevisiae and L. lactis, which downregulate the high-yield pathway at increasing glucose concentrations, and E. coli, which keeps the high-yield pathway functioning at maximal rate. FBAwMC correctly predicts the metabolic switching mode in all three organisms, suggesting that metabolic network architecture is responsible for the differences in switching mode.
Based on our analysis, we expect gradual, "overflow-like" switching behavior in organisms that have an additional energy-yielding pathway that does not consume NADH (e.g., acetate production in E. coli). Flux decrease through the high-yield pathway is expected in organisms in which the high-yield and low-yield pathways compete for NADH. In support of this analysis, a simplified model of metabolic switching suggests that the extra energy generated during acetate production produces an additional optimal growth mode that smoothens the metabolic switch in E. coli. Maintaining redox balance is key to explaining why some microbes decrease the flux through the high-yield pathway, while other microbes use "overflow-like" low-yield metabolism.
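The distinctive ingredient of FBAwMC is a solvent-capacity (crowding) constraint on top of ordinary FBA. A two-pathway toy model (illustrative numbers, not the paper's genome-scale model) reproduces the overflow-like switch: at high glucose uptake the crowding constraint binds and the low-yield pathway turns on alongside the still-active high-yield pathway.

```python
from scipy.optimize import linprog

def optimal_fluxes(glucose_uptake, a_hy=0.1, a_ly=0.005, capacity=1.0):
    """Maximize ATP production; variables are [v_highyield, v_lowyield]."""
    c = [-30.0, -2.0]            # ATP per glucose: 30 (high-yield) vs 2 (low-yield)
    A_ub = [[1.0, 1.0],          # total glycolytic flux limited by glucose uptake
            [a_hy, a_ly]]        # crowding constraint: sum(a_i * v_i) <= capacity
    b_ub = [glucose_uptake, capacity]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 2)
    return res.x

low = optimal_fluxes(5.0)     # glucose-limited: high-yield pathway only
high = optimal_fluxes(50.0)   # crowding-limited: low-yield flux switches on
print(low, high)
```

Because the low-yield pathway carries less crowding cost per unit flux, the optimum at high uptake mixes both pathways, which is the E. coli-like concurrent-use mode described above.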

Journal ArticleDOI
TL;DR: An algorithm for modeling biological networks in a discrete framework with continuous time, based on a continuous-time Markov process applied to a Boolean state space, which makes it possible to describe kinetic phenomena that were difficult to handle in the original models.
Abstract: Mathematical modeling is used as a Systems Biology tool to answer biological questions, and more precisely, to validate a network that describes biological observations and predict the effect of perturbations. This article presents an algorithm for modeling biological networks in a discrete framework with continuous time.
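The idea of a continuous-time Markov process on a Boolean state space can be sketched as follows (our own minimal construction in the spirit of the abstract): each node flips toward its Boolean target value with a node-specific rate, and transitions are sampled with the Gillespie algorithm.

```python
import random

def boolean_target(state):
    # toy two-node circuit: A activates B, B inhibits A (a negative-feedback oscillator)
    a, b = state
    return (int(not b), a)

def gillespie_boolean(state, rates, t_max, rng):
    t, traj = 0.0, [(0.0, state)]
    while t < t_max:
        target = boolean_target(state)
        # nodes whose current value disagrees with their Boolean target
        flippable = [i for i in range(len(state)) if state[i] != target[i]]
        if not flippable:
            break                      # fixed point reached
        total = sum(rates[i] for i in flippable)
        t += rng.expovariate(total)    # waiting time to the next transition
        r, acc = rng.random() * total, 0.0
        for i in flippable:            # choose the flipping node proportionally to rate
            acc += rates[i]
            if r <= acc:
                state = tuple(v if j != i else 1 - v for j, v in enumerate(state))
                break
        traj.append((t, state))
    return traj

rng = random.Random(1)
traj = gillespie_boolean((1, 0), rates=[1.0, 2.0], t_max=20.0, rng=rng)
print(traj[:5])
```

The logic stays Boolean, but the rate parameters now carry kinetic information, which is precisely what a purely synchronous Boolean update cannot express.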

Journal ArticleDOI
TL;DR: It is demonstrated that optimal tracer design does not need to be a pure simulation-based trial-and-error process; rather, rational insights into tracer design can be gained through the application of the EMU basis vector methodology.
Abstract: Background 13C-Metabolic flux analysis (13C-MFA) is a standard technique to probe cellular metabolism and elucidate in vivo metabolic fluxes. 13C-Tracer selection is an important step in conducting 13C-MFA; however, current methods are restricted to trial-and-error approaches, which commonly focus on an arbitrary subset of the tracer design space. To systematically probe the complete tracer design space, especially for complex systems such as mammalian cells, there is a pressing need for new rational approaches to identify optimal tracers.

Journal ArticleDOI
TL;DR: The Flux Analysis and Modeling Environment (FAME) is the first web-based modeling tool that combines the tasks of creating, editing, running, and analyzing/visualizing stoichiometric models into a single program.
Abstract: Background: The creation and modification of genome-scale metabolic models is a task that requires specialized software tools. While these are available, subsequently running or visualizing a model often relies on disjoint code, which adds additional actions to the analysis routine and, in our experience, renders these applications suboptimal for routine use by (systems) biologists. Results: The Flux Analysis and Modeling Environment (FAME) is the first web-based modeling tool that combines the tasks of creating, editing, running, and analyzing/visualizing stoichiometric models into a single program. Analysis results can be automatically superimposed on familiar KEGG-like maps. FAME is written in PHP and uses the Python-based PySCeS-CBM for its linear solving capabilities. It comes with a comprehensive manual and a quick-start tutorial, and can be accessed online at http://f-a-m-e.org/. Conclusions: With FAME, we present the community with an open source, user-friendly, web-based “one stop shop” for stoichiometric modeling. We expect the application will be of substantial use to investigators and educators alike.

Journal ArticleDOI
TL;DR: Results show that bimodal signaling response distributions do not necessarily imply digital (ultrasensitive or bistable) single cell signaling, and the interplay between protein expression noise and network topologies can bring about digital population responses from analog single cell dose responses.
Abstract: Background: Cell-to-cell variability in protein expression can be large, and its propagation through signaling networks affects biological outcomes. Here, we apply deterministic and probabilistic models and biochemical measurements to study how network topologies and cell-to-cell protein abundance variations interact to shape signaling responses. Results: We observe bimodal distributions of extracellular signal-regulated kinase (ERK) responses to epidermal growth factor (EGF) stimulation, which are generally thought to indicate bistable or ultrasensitive signaling behavior in single cells. Surprisingly, we find that a simple MAPK/ERK-cascade model with negative feedback that displays graded, analog ERK responses at a single cell level can explain the experimentally observed bimodality at the cell population level. Model analysis suggests that a conversion of graded input–output responses in single cells to digital responses at the population level is caused by a broad distribution of ERK pathway activation thresholds brought about by cell-to-cell variability in protein expression. Conclusions: Our results show that bimodal signaling response distributions do not necessarily imply digital (ultrasensitive or bistable) single cell signaling, and the interplay between protein expression noise and network topologies can bring about digital population responses from analog single cell dose responses. Thus, cells can retain the benefits of robustness arising from negative feedback, while simultaneously generating population-level on/off responses that are thought to be critical for regulating cell fate decisions.
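The population-level mechanism in the Conclusions can be reproduced in a few lines (toy numbers, not the paper's MAPK/ERK model): every single cell responds to dose with a graded Hill curve, but cell-to-cell variability in the activation threshold, a proxy for protein expression noise, makes the population response distribution bimodal.

```python
import numpy as np

rng = np.random.default_rng(42)
dose = 1.0
# lognormal spread of per-cell EC50 thresholds (expression-noise proxy)
ec50 = rng.lognormal(mean=0.0, sigma=1.0, size=5000)
n_hill = 4   # each single cell is graded, not bistable
response = dose**n_hill / (dose**n_hill + ec50**n_hill)

# the population splits into a low-response and a high-response mode
low = float(np.mean(response < 0.2))
high = float(np.mean(response > 0.8))
print(f"fraction low: {low:.2f}, fraction high: {high:.2f}, middle: {1 - low - high:.2f}")
```

Most cells land near 0 or near 1 even though no individual dose-response curve is switch-like, which is exactly why a bimodal readout alone cannot prove single-cell bistability.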

Journal ArticleDOI
TL;DR: Results of this simulation study coincide with published experimental results and show the knockdown of the acetoacetyl-CoA transferase increases butanol to acetone selectivity, while the simultaneous over-expression of the aldehyde/alcohol dehydrogenase greatly increases ethanol production.
Abstract: Genome-scale metabolic networks and flux models are an effective platform for linking an organism genotype to its phenotype. However, few modeling approaches offer predictive capabilities to evaluate potential metabolic engineering strategies in silico. A new method called “flux balance analysis with flux ratios (FBrAtio)” was developed in this research and applied to a new genome-scale model of Clostridium acetobutylicum ATCC 824 (iCAC490) that contains 707 metabolites and 794 reactions. FBrAtio was used to model wild-type metabolism and metabolically engineered strains of C. acetobutylicum where only flux ratio constraints and thermodynamic reversibility of reactions were required. The FBrAtio approach allowed solutions to be found through standard linear programming. Five flux ratio constraints were required to achieve a qualitative picture of wild-type metabolism for C. acetobutylicum for the production of: (i) acetate, (ii) lactate, (iii) butyrate, (iv) acetone, (v) butanol, (vi) ethanol, (vii) CO2 and (viii) H2. Results of this simulation study coincide with published experimental results and show the knockdown of the acetoacetyl-CoA transferase increases butanol to acetone selectivity, while the simultaneous over-expression of the aldehyde/alcohol dehydrogenase greatly increases ethanol production. FBrAtio is a promising new method for constraining genome-scale models using internal flux ratios. The method was effective for modeling wild-type and engineered strains of C. acetobutylicum.
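The reason FBrAtio stays within standard linear programming is that a flux ratio at a branch point is itself a linear constraint: v1/(v1 + v2) = r rearranges to (1 - r)·v1 - r·v2 = 0. A toy branch point (hypothetical network, not iCAC490) shows the conversion:

```python
import numpy as np
from scipy.optimize import linprog

# Metabolite M: produced by uptake R0, consumed by branch reactions R1 and R2
S = np.array([[1.0, -1.0, -1.0]])          # steady state: v0 = v1 + v2
r = 0.8                                    # route 80% of branch flux through R1
ratio_row = np.array([[0.0, 1.0 - r, -r]]) # (1 - r)*v1 - r*v2 = 0
A_eq = np.vstack([S, ratio_row])
b_eq = np.zeros(2)
res = linprog(c=[0.0, -1.0, 0.0],          # maximize v1 (linprog minimizes)
              A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, 10), (0, None), (0, None)])
print(res.x)   # expect v0 = 10, v1 = 8, v2 = 2
```

Engineering interventions such as a knockdown can then be screened by changing r and re-solving, without any nonlinear machinery.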

Journal ArticleDOI
TL;DR: The model clearly demonstrates that, of the two putative mechanisms implicated in the dysregulation of cholesterol metabolism with age, alterations to the removal rate of plasma LDL-C have the most significant impact, and small changes to the number of hepatic LDL receptors can result in a significant rise in LDL-C.
Abstract: Background: Global demographic changes have stimulated marked interest in the process of ageing. There has been, and will continue to be, an unrelenting rise in the number of the oldest old (>85 years of age). With an ageing population comes an increase in the prevalence of age-related disease. Of the diseases of ageing, cardiovascular disease (CVD) has by far the highest prevalence. It is regarded that a finely tuned lipid profile may help to prevent CVD, as there is a long-established relationship between alterations to lipid metabolism and CVD risk. In fact, elevated plasma cholesterol, particularly Low Density Lipoprotein Cholesterol (LDL-C), has consistently stood out as a risk factor for having a cardiovascular event. Moreover, it is widely acknowledged that LDL-C may rise with age in both sexes in a wide variety of groups. The aim of this work was to use a whole-body mathematical model to investigate why LDL-C rises with age, and to test the hypothesis that mechanistic changes to cholesterol absorption and LDL-C removal from the plasma are responsible for the rise. The whole-body mechanistic nature of the model differs from previous models of cholesterol metabolism, which have either focused on intracellular cholesterol homeostasis or concentrated on an isolated area of lipoprotein dynamics. The model integrates current and previously published data relating to molecular biology, physiology, ageing and nutrition in an integrated fashion. Results: The model was used to test the hypothesis that alterations to the rate of cholesterol absorption and changes to the rate of removal of LDL-C from the plasma are integral to understanding why LDL-C rises with age. The model demonstrates that increasing the rate of intestinal cholesterol absorption from 50% to 80% by age 65 years can result in an increase of LDL-C by as much as 34 mg/dL in a hypothetical male subject.
The model also shows that decreasing the rate of hepatic clearance of LDL-C gradually to 50% by age 65 years can result in an increase of LDL-C by as much as 116 mg/dL. Conclusions: Our model clearly demonstrates that, of the two putative mechanisms that have been implicated in the dysregulation of cholesterol metabolism with age, alterations to the removal rate of plasma LDL-C have the most significant impact on cholesterol metabolism; small changes to the number of hepatic LDL receptors can result in a significant rise in LDL-C. This first whole-body systems-based model of cholesterol balance could potentially be used as a tool to further improve our understanding of whole-body cholesterol metabolism and its dysregulation with age. Furthermore, with further fine-tuning, the model may help to investigate dietary and lifestyle regimes that have the potential to mitigate the effects of ageing on cholesterol metabolism.
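The qualitative effect of reduced hepatic clearance can be captured in a deliberately tiny one-compartment sketch (our own toy, not the authors' whole-body model): d[LDL]/dt = production - k(t)·[LDL], with the clearance rate k declining linearly with age.

```python
from scipy.integrate import solve_ivp

def final_ldl(clearance_scale, production=100.0, k_clear=1.0, years=40.0):
    """Plasma LDL-C after `years`, with clearance declining linearly to
    clearance_scale * baseline over that period (illustrative units)."""
    def rhs(t, y):
        k = k_clear * (1 + (clearance_scale - 1) * t / years)
        return [production - k * y[0]]
    sol = solve_ivp(rhs, (0, years), [production / k_clear])
    return float(sol.y[0, -1])

baseline = final_ldl(1.0)   # clearance unchanged: LDL-C stays at steady state
reduced = final_ldl(0.5)    # clearance halved by the end: LDL-C nearly doubles
print(baseline, reduced)
```

Because steady-state LDL-C scales as production/clearance, halving clearance roughly doubles plasma LDL-C, consistent with the model's finding that the removal rate dominates the age-related rise.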

Journal ArticleDOI
Zuguang Gu1, Jialin Liu1, Kunming Cao1, Junfeng Zhang1, Jin Wang1 
TL;DR: This work demonstrates how the choice of pathway structure and centrality measurement, as well as the presence of key genes, affects pathway significance; the method benefits the systematic analysis of biological pathways and helps to extract more meaningful information from gene expression data.
Abstract: Biological pathways are important for understanding biological mechanisms. Thus, finding important pathways that underlie biological problems helps researchers to focus on the most relevant sets of genes. Pathways resemble networks with complicated structures, but most of the existing pathway enrichment tools ignore topological information embedded within pathways, which limits their applicability. A systematic and extensible pathway enrichment method in which nodes are weighted by network centrality was proposed. We demonstrate how choice of pathway structure and centrality measurement, as well as the presence of key genes, affects pathway significance. We emphasize two improvements of our method over current methods. First, allowing for the diversity of genes’ characters and the difficulty of covering gene importance from all aspects, we set centrality as an optional parameter in the model. Second, nodes rather than genes form the basic unit of pathways, such that one node can be composed of several genes and one gene may reside in different nodes. By comparing our methodology to the original enrichment method using both simulation data and real-world data, we demonstrate the efficacy of our method in finding new pathways from biological perspective. Our method can benefit the systematic analysis of biological pathways and help to extract more meaningful information from gene expression data. The algorithm has been implemented as an R package CePa, and also a web-based version of CePa is provided.