Showing papers presented at "Computational Methods in Systems Biology in 2018"


Book ChapterDOI
12 Sep 2018
TL;DR: An approach to data-driven control for artificial pancreas systems by learning neural network models of human insulin-glucose physiology from available patient data and using a mixed integer optimization approach to control blood glucose levels in real-time using the inferred models.
Abstract: In this paper, we provide an approach to data-driven control for artificial pancreas systems by learning neural network models of human insulin-glucose physiology from available patient data and using a mixed integer optimization approach to control blood glucose levels in real-time using the inferred models. First, our approach learns neural networks to predict the future blood glucose values from given data on insulin infusion and their resulting effects on blood glucose levels. However, to provide guarantees on the resulting model, we use quantile regression to fit multiple neural networks that predict upper and lower quantiles of the future blood glucose levels, in addition to the mean.
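Quantile regression of the kind mentioned above is commonly trained with the pinball (quantile) loss. The following is a minimal sketch in plain Python, with hypothetical helper names (`pinball_loss`, `best_constant`); it is not the authors' network or training code. It only illustrates that the constant minimising the loss at level `tau` is an empirical `tau`-quantile of the data.

```python
def pinball_loss(y_true, y_pred, tau):
    """Mean pinball (quantile) loss at quantile level tau in (0, 1)."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction is weighted by tau, over-prediction by (1 - tau).
        total += max(tau * diff, (tau - 1.0) * diff)
    return total / len(y_true)


def best_constant(y_true, tau, candidates):
    """Return the candidate constant prediction with the lowest pinball loss."""
    return min(
        candidates,
        key=lambda c: pinball_loss(y_true, [c] * len(y_true), tau),
    )


if __name__ == "__main__":
    glucose = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # toy values, not patient data
    for tau in (0.1, 0.5, 0.9):
        print(tau, best_constant(glucose, tau, glucose))
```

Fitting one model per quantile level (for example a lower, a central and an upper level) gives a predicted band rather than a single value, which is the role the abstract assigns to the multiple networks.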

20 citations


Book ChapterDOI
12 Sep 2018
TL;DR: Model abstraction can play a crucial role in multi-scale modeling of biological systems by providing simpler models of intra-cellular dynamics that are much faster to simulate, so that the analysis scales better at the tissue level.
Abstract: Multi-scale models of biological systems, for instance of tissues composed of millions of cells, are extremely demanding to simulate, even with High Performance Computing (HPC) facilities, particularly when each cell is described by a detailed model of some intra-cellular pathways and cells are coupled and interact at the tissue level. Model abstraction can play a crucial role in this setting by providing simpler models of intra-cellular dynamics that are much faster to simulate, so that the analysis scales better at the tissue level. Abstractions themselves can be very challenging to build ab initio. A more viable strategy is to learn them from single-cell simulation data.

14 citations


Book ChapterDOI
12 Sep 2018
TL;DR: The results indicate that, subject to specific physiological constraints, optimal parameter values can be found within the mRNA–microRNA motif that can benefit the cell by lowering the gene-expression noise.
Abstract: Cells use various regulatory motifs, including feedforward loops, to control the intrinsic noise that arises in gene expression at low copy numbers. Here we study one such system, which is broadly inspired by the interaction between an mRNA molecule and an antagonistic microRNA molecule encoded by the same gene. The two reaction species are synchronously produced, individually degraded, and the second species (microRNA) exerts an antagonistic pressure on the first species (mRNA). Using linear-noise approximation, we show that the noise in the first species, which we quantify by the Fano factor, is sub-Poissonian, and exhibits a nonmonotonic response both to the species lifetime ratio and to the strength of the antagonistic interaction. Additionally, we use the Chemical Reaction Network Theory to prove that the first species distribution is Poissonian if the first species is much more stable than the second. Finally, we identify a special parametric regime, supporting a broad range of behaviour, in which the distribution can be analytically described in terms of the confluent hypergeometric limit function. We verify our analysis against large-scale kinetic Monte Carlo simulations. Our results indicate that, subject to specific physiological constraints, optimal parameter values can be found within the mRNA–microRNA motif that can benefit the cell by lowering the gene-expression noise.
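The Fano factor used above is the variance-to-mean ratio of the copy-number distribution; a Poisson distribution has a Fano factor of 1, so values below 1 are sub-Poissonian. A minimal sketch of the empirical estimate, assuming a list of sampled copy numbers and hypothetical function names:

```python
from statistics import mean, pvariance


def fano_factor(counts):
    """Variance-to-mean ratio of a sample of molecule copy numbers."""
    m = mean(counts)
    if m == 0:
        raise ValueError("Fano factor is undefined for a zero mean")
    return pvariance(counts) / m


def is_sub_poissonian(counts):
    """True when the sample is less noisy than a Poisson distribution."""
    return fano_factor(counts) < 1.0


if __name__ == "__main__":
    samples = [2, 4, 4, 4, 5, 5, 7, 9]  # toy counts, not simulation output
    print(fano_factor(samples), is_sub_poissonian(samples))
```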

10 citations


Book ChapterDOI
12 Sep 2018
TL;DR: It is shown how the formula robustness can be used in BIOCHAM-4 with no extra cost as an objective function in the parameter optimization procedure, to actually improve the model robustness.
Abstract: BIOCHAM-4 is a tool for modeling, analyzing and synthesizing biochemical reaction networks with respect to some formal, yet possibly imprecise, specification of their behavior. We focus here on one new capability of this tool to optimize the robustness of a parametric model with respect to a specification of its dynamics in quantitative temporal logic. More precisely, we present two complementary notions of robustness: the statistical notion of model robustness to parameter perturbations, defined as its mean functionality, and a metric notion of formula satisfaction robustness, defined as the penetration depth in the validity domain of the temporal logic constraints. We show how the formula robustness can be used in BIOCHAM-4 with no extra cost as an objective function in the parameter optimization procedure, to actually improve the model robustness. We illustrate these unique features with a classical example of the hybrid systems community and provide some performance figures on a model of MAPK signalling with 37 parameters.

9 citations


Book ChapterDOI
12 Sep 2018
TL;DR: A data analysis approach using Gaussian processes and machine learning techniques to infer components of the MoA of an unknown agent from time series transcriptomics, proteomics, and metabolomics data is described.
Abstract: Identifying the mechanism of action (MoA) of an unknown, possibly novel, substance (chemical, protein, or pathogen) is a significant challenge. Biologists typically spend years working out the MoA for known compounds. MoA determination is especially challenging if there is no prior knowledge and if there is an urgent need to understand the mechanism for rapid treatment and/or prevention of global health emergencies. In this paper, we describe a data analysis approach using Gaussian processes and machine learning techniques to infer components of the MoA of an unknown agent from time series transcriptomics, proteomics, and metabolomics data.

7 citations


Book ChapterDOI
12 Sep 2018
TL;DR: In this article, the authors focus on rate-independent function computation with CRNs, where the initial concentrations of some species serve as the input and the eventual steady-state concentration of another species serves as the output.
Abstract: Biological regulatory networks depend upon chemical interactions to process information. Engineering such molecular computing systems is a major challenge for synthetic biology and related fields. The chemical reaction network (CRN) model idealizes chemical interactions, abstracting away specifics of the molecular implementation, and allowing rigorous reasoning about the computational power of chemical kinetics. Here we focus on function computation with CRNs, where we think of the initial concentrations of some species as the input and the eventual steady-state concentration of another species as the output. Specifically, we are concerned with CRNs that are rate-independent (the computation must be correct independent of the reaction rate law) and composable (\(f \circ g\) can be computed by concatenating the CRNs computing f and g). Rate independence and composability are important engineering desiderata, permitting implementations that violate mass-action kinetics, or even “well-mixedness”, and allowing the systematic construction of complex computation via modular design. We show that to construct composable rate-independent CRNs, it is necessary and sufficient to ensure that the output species of a module is not a reactant in any reaction within the module. We then exactly characterize the functions computable by such CRNs as superadditive, positive-continuous, and piecewise rational linear. Our results show that composability severely limits rate-independent computation unless more sophisticated input/output encodings are used.
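As a toy illustration of rate independence (not taken from the paper), consider the single reaction X1 + X2 -> Y, whose output Y is never a reactant. The sketch below Euler-integrates it under two different, arbitrarily chosen rate laws and checks that the output settles at min(x1, x2) in both cases. The function names, step size and rate laws are assumptions made for this example.

```python
def simulate_min_crn(x1, x2, rate_law, dt=0.001, steps=20000):
    """Euler-integrate X1 + X2 -> Y and return the final amount of Y."""
    y = 0.0
    for _ in range(steps):
        flux = rate_law(x1, x2) * dt
        # Never consume more than is available of either reactant.
        flux = max(0.0, min(flux, x1, x2))
        x1 -= flux
        x2 -= flux
        y += flux
    return y


def mass_action(a, b):
    """Mass-action kinetics with rate constant 1."""
    return a * b


def limiting_reactant(a, b):
    """A non-mass-action law: rate proportional to the scarcer reactant."""
    return 0.5 * min(a, b)


if __name__ == "__main__":
    for law in (mass_action, limiting_reactant):
        print(law.__name__, round(simulate_min_crn(3.0, 5.0, law), 3))
```

Both runs approach 3.0, the minimum of the two inputs, because the reaction can only stop once one reactant is exhausted, whatever the rate law.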

7 citations


Book ChapterDOI
12 Sep 2018
TL;DR: The approach is extended to cope with the important use case of finding diverse solutions of a problem with a large number of solutions and the results of the proposed approach are demonstrated on two different benchmark scenarios in systems biology.
Abstract: Logical modeling has been widely used to understand and expand the knowledge about protein interactions among different pathways. Realizing this, the caspo-ts system has been proposed recently to learn logical models from time series data. It uses Answer Set Programming to enumerate Boolean Networks (BNs) given prior knowledge networks and phosphoproteomic time series data. In the resulting sequence of solutions, similar BNs are typically clustered together. This can be problematic for large scale problems where we cannot explore the whole solution space in reasonable time. Our approach extends the caspo-ts system to cope with the important use case of finding diverse solutions of a problem with a large number of solutions. We first present the algorithm for finding diverse solutions and then we demonstrate the results of the proposed approach on two different benchmark scenarios in systems biology: (1) an artificial dataset to model TCR signaling and (2) the HPN-DREAM challenge dataset to model breast cancer cell lines.

7 citations


Book ChapterDOI
12 Sep 2018
TL;DR: This work formalizes the semantics of a core protocol language as a stochastic hybrid process, which provides a unified description for the models of biochemical systems being experimented on, together with the discrete events representing the liquid-handling steps of biological protocols.
Abstract: Both experimental and computational biology are becoming increasingly automated. Laboratory experiments are now performed automatically on high-throughput machinery, while computational models are synthesized or inferred automatically from data. However, integration between automated tasks in the process of biological discovery is still lacking, largely due to incompatible or missing formal representations. While theories are expressed formally as computational models, existing languages for encoding and automating experimental protocols often lack formal semantics. This makes it challenging to extract novel understanding by identifying when theory and experimental evidence disagree due to errors in the models or the protocols used to validate them. To address this, we formalize the syntax of a core protocol language, which provides a unified description for the models of biochemical systems being experimented on, together with the discrete events representing the liquid-handling steps of biological protocols. We present both a deterministic and a stochastic semantics for this language, both defined in terms of hybrid processes. In particular, the stochastic semantics captures uncertainties in equipment tolerances, making it a suitable tool for both experimental and computational biologists. We illustrate how the proposed protocol language can be used for automated verification and synthesis of laboratory experiments on case studies from the fields of chemistry and molecular programming.

6 citations


Book ChapterDOI
12 Sep 2018
TL;DR: A new, efficient algorithm for inferring, from time-series data or high-throughput data (e.g., flow cytometry), stochastic rate parameters for chemical reaction network models, that can work with incomplete datasets missing some model species, and with multiple datasets originating from experiment repetitions.
Abstract: We present a new, efficient algorithm for inferring, from time-series data or high-throughput data (e.g., flow cytometry), stochastic rate parameters for chemical reaction network models. Our algorithm combines the Gillespie stochastic simulation algorithm (including approximate variants such as tau-leaping) with the cross-entropy method. Also, it can work with incomplete datasets missing some model species, and with multiple datasets originating from experiment repetitions. We evaluate our algorithm on a number of challenging case studies, including bistable systems (Schlögl’s and toggle switch) and experimental data.
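The two ingredients named above can be sketched separately. The code below is a simplified illustration, not the authors' algorithm: it uses a one-species birth-death network, a one-dimensional Gaussian sampling distribution, and a hypothetical score (distance between simulated and observed mean counts). All function names and default values are assumptions.

```python
import random
from statistics import mean, pstdev


def gillespie_birth_death(k_birth, k_death, x0, t_end, rng):
    """Gillespie SSA for 0 -> X (rate k_birth) and X -> 0 (rate k_death * x).

    Returns the copy number of X at time t_end.
    """
    t, x = 0.0, x0
    while True:
        a_birth, a_death = k_birth, k_death * x
        a_total = a_birth + a_death
        if a_total <= 0.0:
            return x  # no reaction can fire any more
        t += rng.expovariate(a_total)  # waiting time to the next reaction
        if t > t_end:
            return x
        if rng.random() * a_total < a_birth:
            x += 1
        else:
            x -= 1


def cross_entropy_minimize(score, mu, sigma, rng,
                           n_samples=50, n_elite=10, iters=15):
    """Cross-entropy method for one parameter with a Gaussian proposal."""
    for _ in range(iters):
        samples = [rng.gauss(mu, sigma) for _ in range(n_samples)]
        elite = sorted(samples, key=score)[:n_elite]  # lowest scores
        # Refit the proposal to the elite samples; keep sigma positive.
        mu, sigma = mean(elite), max(pstdev(elite), 1e-12)
    return mu


def make_score(observed_mean, k_death, t_end, rng, n_runs=20):
    """Hypothetical score: mismatch between simulated and observed means."""
    def score(k_birth):
        if k_birth <= 0.0:
            return float("inf")
        sims = [gillespie_birth_death(k_birth, k_death, 0, t_end, rng)
                for _ in range(n_runs)]
        return abs(mean(sims) - observed_mean)
    return score


if __name__ == "__main__":
    rng = random.Random(0)
    score = make_score(observed_mean=10.0, k_death=1.0, t_end=5.0, rng=rng)
    print(cross_entropy_minimize(score, mu=5.0, sigma=5.0, rng=rng))
```

Because the score here is itself a noisy simulation estimate, the returned value varies with the random seed; it should be read as a demonstration of the loop structure only.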

6 citations


Book ChapterDOI
12 Sep 2018
TL;DR: This work demonstrates the scalability and biological relevance of the formal reasoning approach enabling the synthesis of biological networks capable of reproducing some experimentally observed behavior by revealing the requirement for certain motifs in the network governing stem cell pluripotency.
Abstract: A recurring set of small sub-networks have been identified as the building blocks of biological networks across diverse organisms. These network motifs have been associated with certain dynamical behaviors and define key modules that are important for understanding complex biological programs. Besides studying the properties of motifs in isolation, existing algorithms often evaluate the occurrence frequency of a specific motif in a given biological network compared to that in random networks of similar structure. However, it remains challenging to relate the structure of motifs to the observed and expected behavior of the larger network. Indeed, even the precise structure of these biological networks remains largely unknown. Previously, we developed a formal reasoning approach enabling the synthesis of biological networks capable of reproducing some experimentally observed behavior. Here, we extend this approach to allow reasoning about the requirement for specific network motifs as a way of explaining how these behaviors arise. We illustrate the approach by analyzing the motifs involved in sign-sensitive delay and pulse generation. We demonstrate the scalability and biological relevance of the approach by revealing the requirement for certain motifs in the network governing stem cell pluripotency.

6 citations


Book ChapterDOI
12 Sep 2018
TL;DR: The main features of KaSa are illustrated on a model of the extracellular activation of the transforming growth factor, TGF-b, where it warns about rules that may never be applied, about potential irreversible transformations of proteins and about the potential formation of unbounded molecular compounds.
Abstract: KaSa is a static analyzer for Kappa models. Its goal is two-fold. Firstly, KaSa assists the modeler by warning about potential issues in the model. Secondly, KaSa may provide useful properties to check that what is implemented is what the modeler has in mind and to provide a quick overview of the model for people who have not written it. The cornerstone of KaSa is a fix-point engine which detects some patterns that may never occur whatever the evolution of the system may be. From this, much useful information may be collected: KaSa warns about rules that may never be applied, about potential irreversible transformations of proteins (that may not be reverted even thanks to an arbitrary number of computation steps) and about the potential formation of unbounded molecular compounds. Lastly, KaSa detects potential influences (activation/inhibition relation) between rules. In this paper, we illustrate the main features of KaSa on a model of the extracellular activation of the transforming growth factor, TGF-b.

Book ChapterDOI
12 Sep 2018
TL;DR: Boolean networks, introduced by Kauffman, are a popular and well-established framework for modelling gene regulatory networks and their associated signalling pathways; the framework is able to capture the important dynamical properties of the system under study, thus facilitating the modelling and analysis of large biological networks as a whole.
Abstract: Boolean networks (BNs), introduced by Kauffman [3], are a popular and well-established framework for modelling gene regulatory networks and their associated signalling pathways. The main advantage of this framework is that it is relatively simple and yet able to capture the important dynamical properties of the system under study, thus facilitating the modelling and analysis of large biological networks as a whole.

Book ChapterDOI
12 Sep 2018
TL;DR: A method of defining clusters based on mutual distance is discussed and applied to a set of transmission microscopy images of VEGF receptors; a geometric shape is assigned to each cluster using a previously developed procedure that relates closely to distance-based clustering.
Abstract: Cell membrane-bound receptors control signal initiation in many important cellular signaling pathways. In many such systems, receptor dimerization or cross-linking is a necessary step for activation, making signaling pathways sensitive to the distribution of receptors in the membrane. Microscopic imaging and modern labeling techniques reveal that certain receptor types tend to co-localize in clusters, ranging from a few to tens, and sometimes hundreds of members. The origin of these clusters is not well understood but they are likely not the result of chemical binding. Our goal is to build a simple, descriptive framework which provides quantitative measures that can be compared across samples and systems, as groundwork for more ambitious modeling aimed at uncovering specific biochemical mechanisms. Here we discuss a method of defining clusters based on mutual distance, applying it to a set of transmission microscopy images of VEGF receptors. Preliminary analysis using standard measures such as the Hopkins’ statistic reveals a compelling difference between the observed distributions and random placement. A key element to cluster identification is identifying an optimal length parameter \(L^*\). Distance based clustering hinges on the separation between two length scales: the typical distance between neighboring points within a cluster vs. the typical distance between clusters. This provides a guiding principle to identify \(L^*\) from experimentally derived cluster scaling functions. In addition, we assign a geometric shape to each cluster, using a previously developed procedure that relates closely to distance based clustering. We applied the cluster [support] identification procedure to the entire data set. The observed particle distribution results are consistent with the random placement of receptors within the clusters and, to a lesser extent, the random placement of the clusters on the cell membrane. Deviations from uniformity are typically due to large scale gradients in receptor density and/or the emergence of “mega-clusters” that are very likely the expression of a different biological function than the one behind the emergence of the quasi-ubiquitous small scale clusters.
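One common way to realise clustering by mutual distance is single-linkage grouping: two points belong to the same cluster when a chain of neighbours, each closer than a chosen length, connects them. The abstract does not spell out the exact rule used, so the sketch below is an assumption, with a hypothetical function name and toy coordinates. It uses a union-find structure and shows how the cluster count depends on the length parameter.

```python
import math


def distance_clusters(points, length):
    """Group 2-D points into single-linkage clusters at scale `length`.

    Returns a sorted list of clusters, each a list of point indices.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) < length:
                parent[find(i)] = find(j)  # merge the two clusters

    clusters = {}
    for i in range(len(points)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


if __name__ == "__main__":
    receptors = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10)]  # toy positions
    for length in (0.5, 1.5, 20.0):
        print(length, len(distance_clusters(receptors, length)))
```

Scanning `length` and recording the number of clusters gives a curve with a plateau when the within-cluster and between-cluster length scales are well separated, which is the kind of separation the abstract relies on to pick \(L^*\).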

Book ChapterDOI
12 Sep 2018
TL;DR: The new version of ASSA-PBN enables the support for context-sensitive PBNs (CPBNs), which can well balance the uncertainty and stability of the modelled biological systems.
Abstract: We present a major new release of ASSA-PBN, a software tool for modelling, simulation, and analysis of probabilistic Boolean networks (PBNs). The new version enables the support for context-sensitive PBNs (CPBNs), which can well balance the uncertainty and stability of the modelled biological systems. It contributes mainly in three aspects. Firstly, it designs a high-level language for specifying CPBNs. Secondly, it implements various simulation-based methods for simulating CPBNs and analysing their long-run dynamics. Last but not least, it provides an efficient method to identify all the attractors of a CPBN. Thanks to its divide and conquer strategy, the implemented detection algorithm can deal with large and realistic biological networks under both synchronous and asynchronous updating schemes.
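For intuition about attractor detection, the sketch below enumerates the attractors of a small, plain Boolean network under synchronous updating by exhaustive state-space search. This is an assumption-laden toy: it does not handle probabilistic or context-sensitive networks, it does not use the tool's divide and conquer strategy, and its cost grows exponentially with the number of nodes. The function name is hypothetical.

```python
from itertools import product


def synchronous_attractors(update_fns):
    """Return the attractors of a Boolean network as a set of frozensets.

    `update_fns[i]` maps the current state tuple to the next value of node i.
    """
    n = len(update_fns)

    def step(state):
        return tuple(int(f(state)) for f in update_fns)

    attractors = set()
    for state in product((0, 1), repeat=n):
        seen = {}  # state -> position along the trajectory
        while state not in seen:
            seen[state] = len(seen)
            state = step(state)
        # `state` is the first repeated state, so the cycle consists of
        # every visited state from its first occurrence onwards.
        cycle = [s for s, i in seen.items() if i >= seen[state]]
        attractors.add(frozenset(cycle))
    return attractors


if __name__ == "__main__":
    # Two nodes that copy each other: two fixed points and one 2-cycle.
    network = [lambda s: s[1], lambda s: s[0]]
    for attractor in synchronous_attractors(network):
        print(sorted(attractor))
```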

Book ChapterDOI
12 Sep 2018
TL;DR: A unified approach is introduced for querying simulation traces of rule-based models about the statistical behavior of individual agents, in which a query consists of a trace pattern along with an expression that depends on the variables captured by this pattern.
Abstract: In this paper, we introduce a unified approach for querying simulation traces of rule-based models about the statistical behavior of individual agents. In our approach, a query consists of a trace pattern along with an expression that depends on the variables captured by this pattern. On a given trace, it evaluates to the multiset of all values of the expression for every possible matching of the pattern. We illustrate our proposed query language on a simple example, and then discuss its semantics and implementation for the Kappa language. Finally, we provide a detailed use case where we analyze the dynamics of \(\beta \)-catenin degradation in Wnt signaling from an agent-centric perspective.

Book ChapterDOI
12 Sep 2018
TL;DR: This work extends the recently-developed model known as thermodynamic binding networks, demonstrating programmable kinetic barriers that arise solely from the thermodynamic driving forces of bond formation and the configurational entropy of forming separate complexes.
Abstract: Engineering molecular systems that exhibit complex behavior requires the design of kinetic barriers. For example, an effective catalytic pathway must have a large barrier when the catalyst is absent. While programming such energy barriers seems to require knowledge of the specific molecular substrate, we develop a novel substrate-independent approach. We extend the recently-developed model known as thermodynamic binding networks, demonstrating programmable kinetic barriers that arise solely from the thermodynamic driving forces of bond formation and the configurational entropy of forming separate complexes. Our kinetic model makes relatively weak assumptions, which implies that energy barriers predicted by our model would exist in a wide variety of systems and conditions. We demonstrate that our model is robust by showing that several variations in its definition result in equivalent energy barriers. We apply this model to design catalytic systems with an arbitrarily large energy barrier to uncatalyzed reactions. Our results yield robust amplifiers using DNA strand displacement, a popular technology for engineering synthetic reaction pathways, and suggest design strategies for preventing undesired kinetic behavior in a variety of molecular systems.

Book ChapterDOI
12 Sep 2018
TL;DR: This poster describes a novel work-in-progress reparametrization of a frequently used non-linear ordinary differential equation model for inferring gene regulations from expression data that makes inference over the model stable and amenable to fully Bayesian treatment with state of the art Hamiltonian Monte Carlo methods.
Abstract: This poster describes a novel work-in-progress reparametrization of a frequently used non-linear ordinary differential equation (ODE) model for inferring gene regulations from expression data. We show that in its commonly used form, the model cannot always determine the sign of the regulatory effect as well as other parameters of the model. The proposed reparametrization makes inference over the model stable and amenable to fully Bayesian treatment with state of the art Hamiltonian Monte Carlo methods.

Book ChapterDOI
12 Sep 2018
TL;DR: A whole genome metabolic model (GEM) is essentially a reconstruction of a network of enzyme-enabled chemical reactions representing the metabolism of an organism, based on information present in its genome; a biomass function is added to such models as an overall indicator of the model’s viability.
Abstract: A whole genome metabolic model (GEM) is essentially a reconstruction of a network of enzyme-enabled chemical reactions representing the metabolism of an organism, based on information present in its genome. Such models have been designed so that flux balance analysis (FBA) can be applied in order to analyse metabolism under steady state. For this purpose, a biomass function is added to these models as an overall indicator of the model’s viability.

Book ChapterDOI
12 Sep 2018
TL;DR: In this article, the QBF (quantified Boolean formula) satisfiability problem is used to encode synthesis questions in which a partially known VTS, for example one with undiscovered edges, must be completed under a given property such as stability.
Abstract: Vesicle Traffic Systems (VTSs) are the material transport mechanisms among the compartments inside biological cells. The compartments are viewed as nodes that are labeled with the chemicals they contain, and the transport channels are similarly viewed as labeled edges between the nodes. Understanding VTSs is an ongoing area of research, and for many cells they are only partially known. For example, there may be undiscovered edges, nodes, or their labels in a VTS of a cell. It has been speculated that there are properties that VTSs must satisfy, for example stability, i.e., every chemical that leaves a compartment comes back. Many synthesis questions may arise in this scenario, where we want to complete a partially known VTS under a given property. In the paper, we present novel encodings of the above questions into QBF (quantified Boolean formula) satisfiability problems. We have implemented the encodings in a highly configurable tool and applied it to a couple of found-in-nature VTSs and several synthetic graphs. Our results demonstrate that our method can scale up to the graphs of interest.

Book ChapterDOI
12 Sep 2018
TL;DR: This work proposes a new pipeline based on taking a “systems” approach to metagenomics analysis, in this case to analyse human gut microbiome data, and examines a sample as a self-contained, open system with a distinct functional profile.
Abstract: Metagenomics is the science of analysing the structure and function of DNA samples taken from the environment (e.g. soil or human gut) as opposed to a single organism. So far, researchers have applied traditional genomics tools and pipelines, such as species identification, sequence alignment and assembly, to metagenomics analysis. In addition to being computationally expensive, these approaches lack an emphasis on the functional profile of the sample regardless of species diversity, and how it changes under different conditions. They also ignore unculturable species and genes undergoing horizontal transfer. We propose a new pipeline based on taking a “systems” approach to metagenomics analysis, in this case to analyse human gut microbiome data. Instead of identifying existing species, we examine a sample as a self-contained, open system with a distinct functional profile. The pipeline was used to analyse data from an experiment performed on the gut microbiomes of lean, obese and overweight twins. Previous analysis of this data focused only on taxonomic binning. Using our systems metagenomics approach, our analysis found two very different functional profiles for lean and obese twins, with obese ones being distinctly more diverse. There are also interesting differences in metabolic pathways which could indicate specific driving forces for obesity.

Book ChapterDOI
12 Sep 2018
TL;DR: LNA++ is presented, which allows for fast derivation and simulation of the LNA, including the computation of means, covariances, and temporal cross-covariances; for efficient parameter estimation and uncertainty analysis, LNA++ implements first and second order sensitivity equations.
Abstract: The linear noise approximation (LNA) provides an approximate description of the statistical moments of stochastic chemical reaction networks (CRNs). LNA is a commonly used modeling paradigm describing the probability distribution of systems of biochemical species in the intracellular environment. Unlike exact formulations, the LNA remains computationally feasible even for CRNs with many reactions. The tractability of the LNA makes it a common choice for inference of unknown chemical reaction parameters. However, this task is impeded by a lack of suitable inference tools for arbitrary CRN models. In particular, no available tool provides temporal cross-correlations, parameter sensitivities and efficient numerical integration. In this manuscript we present LNA++, which allows for fast derivation and simulation of the LNA including the computation of means, covariances, and temporal cross-covariances. For efficient parameter estimation and uncertainty analysis, LNA++ implements first and second order sensitivity equations. Interfaces are provided for easy integration with Matlab and Python.
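For orientation, the LNA moment equations referred to above can be written in one standard textbook form. The notation here is an assumption and is not taken from the LNA++ manuscript: \(S\) is the stoichiometric matrix, \(f\) the vector of macroscopic reaction rates, \(\phi\) the mean concentrations, \(\Sigma\) the covariance of the concentrations, and \(\Omega\) the system volume.

```latex
\[
\frac{d\phi}{dt} = S\, f(\phi), \qquad
A(\phi) = S\, \frac{\partial f}{\partial \phi}(\phi),
\]
\[
\frac{d\Sigma}{dt} = A(\phi)\,\Sigma + \Sigma\, A(\phi)^{\top}
  + \frac{1}{\Omega}\, S\, \operatorname{diag}\!\big(f(\phi)\big)\, S^{\top},
\]
\[
\frac{\partial}{\partial \tau}\, C(t+\tau, t) = A\big(\phi(t+\tau)\big)\, C(t+\tau, t),
\qquad C(t, t) = \Sigma(t).
\]
```

The first line gives the means, the second the covariances, and the third the temporal cross-covariances \(C(t+\tau, t)\) between the state at times \(t+\tau\) and \(t\); all three are ordinary differential equations, which is why the approximation stays tractable for networks with many reactions.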

Book ChapterDOI
12 Sep 2018
TL;DR: A new strategy to engineer genes with predefined transcription dynamics (mean and standard deviation of the distribution of RNA numbers of a cell population) is proposed, using stochastic modelling followed by genetic engineering to design synthetic promoters whose rate-limiting step kinetics allow a desired RNA production kinetics to be achieved.
Abstract: Recent developments in live-cell time-lapse microscopy and signal processing methods for single-cell, single-RNA detection now allow characterizing the in vivo dynamics of RNA production of Escherichia coli promoters at the single event level. These dynamics are mostly controlled at the promoter region, which can be engineered with single nucleotide precision. Based on these developments, we propose a new strategy to engineer genes with predefined transcription dynamics (mean and standard deviation of the distribution of RNA numbers of a cell population). For this, we use stochastic modelling followed by genetic engineering to design synthetic promoters whose rate-limiting step kinetics allow a desired RNA production kinetics to be achieved. We present an example where, from a pre-defined kinetics, a stochastic model is first designed, from which a promoter is selected based on its rate-limiting step kinetics. Next, we engineer mutant promoters and select the one that best fits the intended distribution of RNA numbers in a cell population. As the modelling strategies and databases of models, genetic constructs, and information on the kinetics of these constructs improve, we expect our strategy to be able to accommodate a wide variety of pre-defined RNA production kinetics.