Showing papers presented at "Computational Methods in Systems Biology in 2017"


Book ChapterDOI
27 Sep 2017
TL;DR: The strong (uniform computability) Turing completeness of chemical reaction networks over a finite set of molecular species under the differential semantics is derived, solving a long standing open problem.
Abstract: When seeking to understand how computation is carried out in the cell to maintain itself in its environment, process signals and make decisions, the continuous nature of protein interaction processes forces us to consider also analog computation models and mixed analog-digital computation programs. However, recent results in the theory of analog computability and complexity establish fundamental links with classical programming. In this paper, we derive from these results the strong (uniform computability) Turing completeness of chemical reaction networks over a finite set of molecular species under the differential semantics, solving a long-standing open problem. Furthermore, we derive from the proof a compiler of mathematical functions into elementary chemical reactions. We illustrate the reaction code generated by our compiler on trigonometric functions, and on various sigmoid functions which can serve as markers of presence or absence for implementing program control instructions in the cell and imperative programs. Then we start comparing our compiler-generated circuits to the natural circuit of the MAPK signaling network, which plays the role of an analog-digital converter in the cell with a Hill-type sigmoid input/output function.

63 citations
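
To make "differential semantics" concrete for the entry above, here is a minimal sketch in Python. It is not output of the authors' compiler: the reaction list is a hand-written dual-rail encoding (an assumption made for illustration) of the linear system x' = -y, y' = x, whose solution from x(0) = 1, y(0) = 0 is x(t) = cos(t). Each signed quantity is stored as a difference of two non-negative concentrations (`xp - xm`, `yp - ym`), and the species names, rate constants, and the `simulate_cos` helper are all hypothetical.

```python
import math

# Each reaction is (reactants, products, rate constant) with stoichiometry dicts.
# Hypothetical dual-rail encoding: x = xp - xm and y = yp - ym.
REACTIONS = [
    ({"ym": 1}, {"ym": 1, "xp": 1}, 1.0),  # contributes +ym to xp'  (x' = -y)
    ({"yp": 1}, {"yp": 1, "xm": 1}, 1.0),  # contributes +yp to xm'
    ({"xp": 1}, {"xp": 1, "yp": 1}, 1.0),  # contributes +xp to yp'  (y' = x)
    ({"xm": 1}, {"xm": 1, "ym": 1}, 1.0),  # contributes +xm to ym'
    ({"xp": 1, "xm": 1}, {}, 10.0),        # annihilation keeps both rails bounded
    ({"yp": 1, "ym": 1}, {}, 10.0),
]


def derivatives(state):
    """Mass-action ODE right-hand side induced by REACTIONS."""
    d = {species: 0.0 for species in state}
    for reactants, products, k in REACTIONS:
        flux = k
        for species, n in reactants.items():
            flux *= state[species] ** n
        for species, n in reactants.items():
            d[species] -= n * flux
        for species, n in products.items():
            d[species] += n * flux
    return d


def rk4_step(state, dt):
    """One classical Runge-Kutta step on the concentration vector."""
    def shifted(slope, h):
        return {s: state[s] + h * slope[s] for s in state}

    k1 = derivatives(state)
    k2 = derivatives(shifted(k1, dt / 2))
    k3 = derivatives(shifted(k2, dt / 2))
    k4 = derivatives(shifted(k3, dt))
    return {
        s: state[s] + dt / 6 * (k1[s] + 2 * k2[s] + 2 * k3[s] + k4[s])
        for s in state
    }


def simulate_cos(t_end, dt=1e-3):
    """Integrate the network and read cos(t_end) off the two x rails."""
    state = {"xp": 1.0, "xm": 0.0, "yp": 0.0, "ym": 0.0}
    for _ in range(round(t_end / dt)):
        state = rk4_step(state, dt)
    return state["xp"] - state["xm"]


if __name__ == "__main__":
    print(simulate_cos(1.0), math.cos(1.0))
```

The annihilation reactions cancel out of the differences `xp - xm` and `yp - ym`, so they change the individual concentrations but not the encoded values.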


Book ChapterDOI
27 Sep 2017
TL;DR: The software Pint is devoted to the scalable analysis of the traces of automata networks, which encompass Boolean and discrete networks; it implements formal approximations of transient reachability-related properties, including mutation prediction and model reduction.
Abstract: The software Pint is devoted to the scalable analysis of the traces of automata networks, which encompass Boolean and discrete networks. Pint implements formal approximations of transient reachability-related properties, including mutation prediction and model reduction.

31 citations


Book ChapterDOI
27 Sep 2017
TL;DR: This article establishes a complete characterization of temporal perturbations, whether permanent (mutations) or only temporary, to achieve the existential or inevitable reachability of an arbitrary state of the system.
Abstract: Cellular reprogramming, a technique that opens huge opportunities in modern and regenerative medicine, heavily relies on identifying key genes to perturb. Most computational methods focus on finding mutations to apply to the initial state in order to control which attractor the cell will reach. However, it has been shown, and is proved in this article, that waiting between the perturbations and using the transient dynamics of the system allows new reprogramming strategies. To identify these temporal perturbations, we consider a qualitative model of regulatory networks, and rely on Petri nets to model their dynamics and the putative perturbations. Our method establishes a complete characterization of temporal perturbations, whether permanent (mutations) or only temporary, to achieve the existential or inevitable reachability of an arbitrary state of the system. We apply a prototype implementation on small models from the literature and show that we are able to derive temporal perturbations to achieve trans-differentiation.

18 citations


Book ChapterDOI
27 Sep 2017
TL;DR: An extensive in silico evaluation of the robust AP design is provided, demonstrating the potential of this approach, without explicit meal announcements, to support high carbohydrate disturbances and to regulate glucose levels in large clusters of virtual patients learned from population-wide survey data.
Abstract: We present a fully closed-loop design for an artificial pancreas (AP) which regulates the delivery of insulin for the control of Type I diabetes. Our AP controller operates in a fully automated fashion, without requiring any manual interaction (e.g. in the form of meal announcements) with the patient. A major obstacle to achieving closed-loop insulin control is the uncertainty in those aspects of a patient’s daily behavior that significantly affect blood glucose, especially in relation to meals and physical activity. To handle such uncertainties, we develop a data-driven robust model-predictive control framework, where we capture a wide range of individual meal and exercise patterns using uncertainty sets learned from historical data. These sets are then used in the controller and state estimator to achieve automated, precise, and personalized insulin therapy. We provide an extensive in silico evaluation of our robust AP design, demonstrating the potential of this approach, without explicit meal announcements, to support high carbohydrate disturbances and to regulate glucose levels in large clusters of virtual patients learned from population-wide survey data.

18 citations


Book ChapterDOI
27 Sep 2017
TL;DR: A theoretical framework where mutations and drug actions are seen as topological perturbations/actions on molecular networks inducing cell phenotype reprogramming is proposed and a new algorithm using abductive reasoning principles inferring the minimal causal topological actions leading to an expected behavior at stable state is presented.
Abstract: A major challenge in cancer research is to determine the genetic mutations causing the cancerous phenotype of cells and, conversely, the actions of drugs initiating programmed cell death in cancer cells. However, such a challenge is compounded by the complexity of the genotype-phenotype relationship and therefore requires relating the molecular effects of mutations and drugs to their consequences on cellular phenotypes. Discovering these complex relationships is at the root of the discovery of new molecular drug targets and the investigation of cancer etiology. In their elucidation, computational methods play a major role for the inference of the molecular causal actions from molecular and biological networks data analysis. In this article, we propose a theoretical framework where mutations and drug actions are seen as topological perturbations/actions on molecular networks inducing cell phenotype reprogramming. The framework is based on Boolean control networks where the topological network actions are modelled by control parameters. We present a new algorithm using abductive reasoning principles inferring the minimal causal topological actions leading to an expected behavior at stable state. The framework is validated on a model of the network regulating the proliferation/apoptosis switch in breast cancer by automatically discovering driver genes and finding drug targets.

18 citations


Book ChapterDOI
27 Sep 2017
TL;DR: This tool provides a finite partition of parameter space such that for each region in this partition a global description of the dynamical behavior of a network is given via a directed acyclic graph called a Morse graph.
Abstract: We present a computational tool DSGRN for exploring network dynamics across the global parameter space for switching model representations of regulatory networks. This tool provides a finite partition of parameter space such that for each region in this partition a global description of the dynamical behavior of a network is given via a directed acyclic graph called a Morse graph. Using this method, parameter regimes or entire networks may be rejected as viable models for representing the underlying regulatory mechanisms.

17 citations


Book ChapterDOI
27 Sep 2017
TL;DR: The main current functionalities of KaDE are described and some benchmarks on case studies are given and the definition of Kappa models as a set of context-free rewrite rules is explained.
Abstract: Kappa is a formal language that can be used to model systems of biochemical interactions among proteins. It offers several semantics to describe the behaviour of Kappa models at different levels of abstraction. Each Kappa model is a set of context-free rewrite rules. One way to understand the semantics of a Kappa model is to read its rules as an implicit description of a (potentially infinite) reaction network. KaDE interprets this definition to compile Kappa models into reaction networks (or equivalently into sets of ordinary differential equations). KaDE uses a static analysis that identifies pairs of sites that are indistinguishable from the rules' point of view, to infer backward and forward bisimulations, hence reducing the size of the underlying reaction networks without having to generate them explicitly. In this paper, we describe the main current functionalities of KaDE and we give some benchmarks on case studies.

14 citations


Book ChapterDOI
27 Sep 2017
TL;DR: This manuscript proposes a method to adaptively and automatically select these population sizes using the cross-validated approximation error of a kernel density estimate of the particles in the current population to select the number of particles for the subsequent population.
Abstract: Parameter inference and model selection in systems biology often require likelihood-free methods, such as Approximate Bayesian Computation (ABC). In recent years, this approach has frequently been combined with a Sequential Monte Carlo (ABC-SMC) scheme. In this scheme, the approximation of the posterior distribution through a population of particles is iteratively improved by a sequential sampling strategy. However, it has been difficult to give general guidelines on how to choose the size of these populations. In this manuscript, we propose a method to adaptively and automatically select these population sizes. The method exploits the cross-validated approximation error of a kernel density estimate of the particles in the current population to select the number of particles for the subsequent population.

14 citations
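
As background for the ABC-SMC entry above, the following Python sketch shows plain ABC rejection sampling, the likelihood-free step that a sequential scheme repeats with shrinking tolerances. It does not implement the adaptive population-size selection proposed in the paper. The toy model (estimating the mean of a unit-variance Gaussian from its sample mean), the function names, the tolerance, and the fixed population size are all assumptions chosen for illustration.

```python
import random
import statistics


def abc_rejection(observed, prior_sampler, simulator, distance,
                  epsilon, n_particles, rng):
    """Collect n_particles parameter draws whose simulated summary
    lies within epsilon of the observed summary."""
    accepted = []
    while len(accepted) < n_particles:
        theta = prior_sampler(rng)          # draw a candidate from the prior
        simulated = simulator(theta, rng)   # run the model; no likelihood needed
        if distance(simulated, observed) <= epsilon:
            accepted.append(theta)
    return accepted


def simulate_mean(theta, rng, n=50):
    """Toy model: summary statistic is the mean of n draws from N(theta, 1)."""
    return statistics.fmean(rng.gauss(theta, 1.0) for _ in range(n))


if __name__ == "__main__":
    rng = random.Random(1)
    observed = simulate_mean(2.0, rng)      # synthetic "data", true theta = 2
    particles = abc_rejection(
        observed,
        prior_sampler=lambda r: r.uniform(-5.0, 5.0),
        simulator=simulate_mean,
        distance=lambda a, b: abs(a - b),
        epsilon=0.2,
        n_particles=200,                    # fixed here; the paper adapts this
        rng=rng,
    )
    print(statistics.fmean(particles), statistics.stdev(particles))
```

The `n_particles` argument is exactly the quantity the paper proposes to choose automatically from one population to the next.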


Book ChapterDOI
27 Sep 2017
TL;DR: This work presents a formalization of arrhythmia-detection algorithms in the language of Quantitative Regular Expressions, a flexible formal language for specifying complex numerical queries over data streams, with provable runtime and memory consumption guarantees.
Abstract: Motivated by the problem of verifying the correctness of arrhythmia-detection algorithms, we present a formalization of these algorithms in the language of Quantitative Regular Expressions. QREs are a flexible formal language for specifying complex numerical queries over data streams, with provable runtime and memory consumption guarantees. The medical-device algorithms of interest include peak detection (where a peak in a cardiac signal indicates a heartbeat) and various discriminators, each of which uses a feature of the cardiac signal to distinguish fatal from non-fatal arrhythmias. Expressing these algorithms’ desired output in current temporal logics, and implementing them via monitor synthesis, is cumbersome, error-prone, computationally expensive, and sometimes infeasible.

14 citations


Book ChapterDOI
27 Sep 2017
TL;DR: A novel method for detecting tSCCs in parametrised graphs is introduced with a parallel algorithm and evaluated on discrete abstractions of several non-linear biological models.
Abstract: Complex behaviour arising in biological systems is typically characterised by various kinds of attractors. An important problem in this area is to determine these attractors. Biological systems are usually described by highly parametrised dynamical models that can be represented as parametrised graphs typically constructed as discrete abstractions of continuous-time models. In such models, attractors are observed in the form of terminal strongly connected components (tSCCs). In this paper, we introduce a novel method for detecting tSCCs in parametrised graphs. The method is supplied with a parallel algorithm and evaluated on discrete abstractions of several non-linear biological models.

12 citations
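
For readers unfamiliar with terminal strongly connected components, the sketch below finds them in an ordinary (non-parametrised) directed graph using Kosaraju's algorithm. It is only an illustration of the tSCC notion used in the entry above; it is not the paper's parametrised or parallel method, and the graph representation and function names are assumptions.

```python
def strongly_connected_components(graph):
    """Kosaraju's algorithm. `graph` maps every node to a list of successors.
    Returns (list of components, dict node -> component index)."""
    # Pass 1: record nodes in order of DFS completion (iterative DFS).
    order, visited = [], set()
    for root in graph:
        if root in visited:
            continue
        visited.add(root)
        stack = [(root, iter(graph[root]))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:  # all successors explored
                order.append(node)
                stack.pop()

    # Pass 2: explore the reversed graph in decreasing completion order.
    reverse = {node: [] for node in graph}
    for node, successors in graph.items():
        for nxt in successors:
            reverse[nxt].append(node)

    component, components = {}, []
    for root in reversed(order):
        if root in component:
            continue
        index = len(components)
        component[root] = index
        members, todo = [], [root]
        while todo:
            node = todo.pop()
            members.append(node)
            for prev in reverse[node]:
                if prev not in component:
                    component[prev] = index
                    todo.append(prev)
        components.append(members)
    return components, component


def terminal_sccs(graph):
    """Components with no edge leaving them, i.e. candidate attractors."""
    components, component = strongly_connected_components(graph)
    return [
        frozenset(members)
        for index, members in enumerate(components)
        if all(component[nxt] == index for node in members for nxt in graph[node])
    ]


if __name__ == "__main__":
    toy = {"a": ["b"], "b": ["a", "c"], "c": ["d"], "d": ["c"], "e": ["a"], "f": []}
    print(terminal_sccs(toy))
```

In this toy graph the cycle between `c` and `d` and the isolated sink `f` are terminal, while the cycle between `a` and `b` is not because `b` has an edge to `c`.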


Book ChapterDOI
27 Sep 2017
TL;DR: In this paper, the authors extend existing Hidden Markov Models (HMMs) for DNA methylation by describing the occurrence of spatial methylation patterns over time and propose several models with different neighborhood dependencies.
Abstract: DNA methylation is an epigenetic mechanism whose important role in development has been widely recognized. This epigenetic modification results in heritable changes in gene expression not encoded by the DNA sequence. The underlying mechanisms controlling DNA methylation are only partly understood and recently different mechanistic models of enzyme activities responsible for DNA methylation have been proposed. Here we extend existing Hidden Markov Models (HMMs) for DNA methylation by describing the occurrence of spatial methylation patterns over time and propose several models with different neighborhood dependencies. We perform numerical analysis of the HMMs applied to bisulfite sequencing measurements and accurately predict wild-type data. In addition, we find evidence that the enzymes’ activities depend on the left 5’ neighborhood but not on the right 3’ neighborhood.

Book ChapterDOI
27 Sep 2017
TL;DR: The general question of what constitutes bio-curation for rule-based modelling of cellular signalling is posed and a general approach is presented, based on rewriting in hierarchies of graphs, together with a specific instantiation of the methodology that addresses the particular bio-curation problem.
Abstract: The general question of what constitutes bio-curation for rule-based modelling of cellular signalling is posed. A general approach to the problem is presented, based on rewriting in hierarchies of graphs, together with a specific instantiation of the methodology that addresses our particular bio-curation problem. The current state of the ongoing development of the KAMI (Knowledge Aggregator & Model Instantiator) bio-curation tool, based on this approach, is detailed along with our plans for future development.

Book ChapterDOI
27 Sep 2017
TL;DR: This work presents an approach to analyze sets of monotonic Boolean models consistent with given signed interactions between systems components and shows that for each such model constraints on its behavior can be derived from a universally constructed state transition graph essentially capturing possible sign changes of the derivative.
Abstract: In the face of incomplete data on a system of interest, constraint-based Boolean modeling still allows for elucidating system characteristics by analyzing sets of models consistent with the available information. In this setting, methods not depending on consideration of every single model in the set are necessary for efficient analysis. Drawing from ideas developed in qualitative differential equation theory, we present an approach to analyze sets of monotonic Boolean models consistent with given signed interactions between systems components. We show that for each such model constraints on its behavior can be derived from a universally constructed state transition graph essentially capturing possible sign changes of the derivative. Reachability results of the modeled system, e.g., concerning trap or no-return sets, can then be derived without enumerating and analyzing all models in the set. The close correspondence of the graph to similar objects for differential equations furthermore opens up ways to relate Boolean and continuous models.

Book ChapterDOI
27 Sep 2017
TL;DR: This work proposes a framework that utilizes the results from state-of-the-art biomedical literature mining, biological system modeling and analysis techniques, and provides means to scientists to assemble and reason about information from voluminous, fragmented and sometimes inconsistent literature.
Abstract: Biomedical research results are being published at a high rate, and with existing search engines, the vast amount of published work is usually easily accessible. However, reproducing published results, whether experimental data or observations, is often not viable. In this work, we propose a framework to overcome some of the issues of reproducing previous research, and to ensure re-usability of published information. We present here a framework that utilizes the results from state-of-the-art biomedical literature mining, biological system modeling and analysis techniques, and provides scientists with the means to assemble and reason about information from voluminous, fragmented and sometimes inconsistent literature. The overall process of automated reading, assembly and reasoning can speed up discoveries from the order of decades to the order of hours or days. Our framework described here allows for rapidly conducting thousands of in silico experiments that are designed as part of this process.

Book ChapterDOI
27 Sep 2017
TL;DR: Methods for using the PL STM knowledge base and the PLA tools to explain observed perturbations of signaling pathways when cells are treated with drugs targeting specific activities or protein states are described.
Abstract: Pathway Logic (PL) is a general system for modeling signal transduction and other cellular processes with the objective of understanding how cells work. Each specific model system builds on a knowledge base of rules formalizing local process steps such as post-translational modification. The Pathway Logic Assistant (PLA) is a collection of visualization and reasoning tools that allow users to derive specific executable models by specifying an initial state. The resulting network of rule instances describes possible behaviors of the modelled system. Subnets and pathways can then be computed (they are not hard wired) by specifying states to reach and/or to avoid. The STM knowledge base is a curated collection of signal transduction rules supported by experimental evidence. In this paper we describe methods for using the PL STM knowledge base and the PLA tools to explain observed perturbations of signaling pathways when cells are treated with drugs targeting specific activities or protein states. We also explore ideas for conjecturing targets of unknown drugs. We illustrate the methods on phosphoproteomics data (RPPA) from SKMEL133 melanoma cancer cells treated with different drugs targeting components of cancer signaling pathways. Existing curated knowledge allowed us to explain many of the responses. Conflicts between the STM model predictions and the data suggest missing requirements for rules to apply.

Book ChapterDOI
27 Sep 2017
TL;DR: A technique is presented for discovering behavioural properties of bio-pathway models whose dynamics is modelled as a system of ordinary differential equations (ODEs), based on a set of property templates specified in bounded linear-time temporal logic (BLTL).
Abstract: Identifying non-trivial requirements for large complex dynamical systems is a challenging but fruitful task. Once identified, such requirements can be used to validate updated versions of the system and verify functionally similar systems. Here we present a technique for discovering behavioural properties of bio-pathway models whose dynamics is modelled as a system of ordinary differential equations (ODEs). These models are usually accompanied at best by high-level functional requirements while undergoing many revisions as new experimental data becomes available. In this setting we first specify a set of property templates using bounded linear-time temporal logic (BLTL). A template will have the skeletal structure of a BLTL formula, but the time bounds associated with the temporal operator as well as the value bounds associated with the system variables encoded as atomic propositions will be unknown parameters. We classify a given model’s behavior as corresponding to one of these templates using a convolutional neural network. We then synthesize a concrete property from this template by estimating its parameters via a standard search procedure combined with statistical model checking (SMC). We have synthesized and validated properties of a number of pathway models of varying complexity using our method.

Book ChapterDOI
27 Sep 2017
TL;DR: This paper investigates the use of the Probably Approximately Correct (PAC) learning framework of Leslie Valiant as a method for the automated discovery of influence models of biochemical processes from Boolean and stochastic traces and evaluates the performance of this approach on a model of T-lymphocyte differentiation.
Abstract: Automating the process of model building from experimental data is a very desirable goal to palliate the lack of modellers for many applications. However, despite the spectacular progress of machine learning techniques in data analytics, classification, clustering and prediction making, learning dynamical models from data time-series is still challenging. In this paper we investigate the use of the Probably Approximately Correct (PAC) learning framework of Leslie Valiant as a method for the automated discovery of influence models of biochemical processes from Boolean and stochastic traces. We show that Thomas’ Boolean influence systems can be naturally represented by k-CNF formulae, and learned from time-series data with a number of Boolean activation samples per species quasi-linear in the precision of the learned model, and that positive Boolean influence systems can be represented by monotone DNF formulae and learned actively with both activation samples and oracle calls. We consider Boolean traces and Boolean abstractions of stochastic simulation traces, and study the space-time tradeoff between the diversity of initial states and the length of the time horizon, and its impact on the error bounds provided by the PAC learning algorithms. We evaluate the performance of this approach on a model of T-lymphocyte differentiation, with and without prior knowledge, and discuss its merits as well as its limitations with respect to realistic experiments.
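
The k-CNF learning mentioned in this abstract can be illustrated with the classical elimination algorithm from Valiant's PAC framework: start from the conjunction of every clause with at most k literals and delete each clause that some positive sample violates. The Python sketch below is an assumption-laden toy (hypothetical function names, a hand-picked target function, exhaustive positive samples rather than time-series traces), not the authors' implementation.

```python
from itertools import combinations, product


def all_clauses(n_vars, k):
    """Every disjunction of at most k literals over n_vars Boolean variables.
    A literal is a pair (variable index, required truth value)."""
    clauses = set()
    for size in range(1, k + 1):
        for variables in combinations(range(n_vars), size):
            for signs in product((True, False), repeat=size):
                clauses.add(frozenset(zip(variables, signs)))
    return clauses


def satisfies(assignment, clause):
    """True if at least one literal of the clause holds in the assignment."""
    return any(assignment[var] == sign for var, sign in clause)


def learn_k_cnf(positive_examples, n_vars, k):
    """Elimination algorithm: keep only clauses true on every positive sample."""
    hypothesis = all_clauses(n_vars, k)
    for example in positive_examples:
        hypothesis = {c for c in hypothesis if satisfies(example, c)}
    return hypothesis


def evaluate(hypothesis, assignment):
    """A CNF holds when all of its clauses hold."""
    return all(satisfies(assignment, clause) for clause in hypothesis)


if __name__ == "__main__":
    # Hypothetical activation rule: the target is activated by A or B,
    # and inhibited by C.
    def target(a, b, c):
        return (a or b) and not c

    states = list(product((True, False), repeat=3))
    positives = [s for s in states if target(*s)]
    learned = learn_k_cnf(positives, n_vars=3, k=2)
    for s in states:
        print(s, evaluate(learned, s), target(*s))
```

Because only positive samples are used, the learned formula is the most specific k-CNF consistent with them; with fewer samples it may reject states that the true rule accepts, which is where the PAC error bounds come in.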

Book ChapterDOI
27 Sep 2017
TL;DR: The study of complex biological processes requires forgoing simplified models in favor of extensive ones, yet these models’ size and complexity place them beyond understanding, and identifying functional patterns in such a network requires new automated methods.
Abstract: The study of complex biological processes requires forgoing simplified models in favor of extensive ones. Yet, these models’ size and complexity place them beyond understanding. Their analysis requires new methods for identifying general patterns. The Transforming Growth Factor TGF-β is a multifunctional cytokine that regulates mammalian cell development, differentiation, and homeostasis. Depending on the context, it can play the antagonistic roles of growth inhibitor or of tumor promoter. Its context-dependent pleiotropic nature is associated with complex signaling pathways. The most comprehensive model of TGF-β-dependent signaling is composed of 15,934 chains of reactions (trajectories) linking TGF-β to at least one of its 159 target genes. Identifying functional patterns in such a network requires new automated methods.

BookDOI
27 Sep 2017
TL;DR: This book constitutes the refereed proceedings of the 15th International Conference on Computational Methods in Systems Biology, CMSB 2017, held in Darmstadt, Germany, in September 2017 and contains 15 full papers, 4 tool papers and 4 posters presented together with 1 invited talk.
Abstract: This book constitutes the refereed proceedings of the 15th International Conference on Computational Methods in Systems Biology, CMSB 2017, held in Darmstadt, Germany, in September 2017. The 15 full papers, 4 tool papers and 4 posters presented together with 1 invited talk were carefully reviewed and selected from 41 regular paper submissions. Topics of interest include formalisms for modeling biological processes; models and their biological applications; frameworks for model verification, validation, analysis, and simulation of biological systems; high-performance computational systems biology and parallel implementations; model inference from experimental data; model integration from biological databases; multi-scale modeling and analysis methods; and computational approaches for synthetic biology.

Proceedings Article
01 Jan 2017
TL;DR: A large homogeneous population of cells is considered, where each cell is governed by the same complex biological pathway, and several representations over distributions of populations of cells obtained from several fine-grained models of pathways are compared.
Abstract: We consider a large homogeneous population of cells, where each cell is governed by the same complex biological pathway. A good modeling of the inherent variability of biological species is of crucial importance to the understanding of how the population evolves. In this work, we handle this variability by considering multivariate distributions, where each species is a random variable. Usually, the number of species in a pathway, and thus the number of variables, is high. This appealing approach thus quickly faces the curse of dimensionality: representing exactly the distribution of a large number of variables is intractable. To make this approach tractable, we explore different techniques to approximate the original joint distribution by meaningful and tractable ones. The idea is to consider families of joint probability distributions on large sets of random variables that admit a compact representation, and then select within this family the one that best approximates the desired intractable one. Natural measures of approximation accuracy can be derived from information theory. We compare several representations over distributions of populations of cells obtained from several fine-grained models of pathways (e.g., ODEs). We also explore the interest of such approximate distributions for approximate inference algorithms [1, 2] for coarse-grained abstractions of biological pathways [3]. Results: Our approximation scheme is to drop most correlations between variables. Indeed, when many variables are conditionally independent, the multivariate distribution can be compactly represented. The key is to keep the most relevant correlations, evaluated using the mutual information (MI) between two variables. The simplest approximation is called fully factored (FF), and assumes that all the variables are independent. It leads to very compact representation and fast computations, but it also leads to fairly inaccurate results as correlations between variables are entirely lost, even for highly correlated species (MI = 0.6). Alternately, one can preserve a few of the strongest correlations, selected using MI, giving rise to a set of disjoint clusters of variables. For efficiency reasons, we used clusters of size two. This model was able to capture some of the most significant correlations between pairs of variables (representing around 30% of the total MI), but dropped significant ones (MI = 0.2).

Book ChapterDOI
27 Sep 2017
TL;DR: A more abstract framework making equilibria first-class citizens is proposed, in which rules describe qualitative equilibrium changes and the chaining of rules is controlled by constraints expressed in extended temporal logic, fostering the detection of toxicity pathways.
Abstract: Toxicology aims at studying the adverse effects of exogenous chemicals on organisms. As these effects mainly concern metabolic pathways, reasoning about toxicity would involve metabolism modeling approaches. Usually, metabolic network modeling approaches are rule-based and describe chemical reactions, indirectly depicting equilibria as results of competing rule kinetics. By altering these kinetics, an exogenous compound can shift the system equilibria and induce toxicity. As equilibria are kept implicit, the identification of possible toxicity pathways is hindered as they require a fine understanding of chemical reaction dynamics to infer possible equilibria disruptions. Paradoxically, the toxicity pathways are based on a succession of very abstract (coarse-grained) events. To reduce this mismatch, we propose a more abstract framework making equilibria first-class citizens. Our rules describe qualitative equilibrium changes and the chaining of rules is controlled by constraints expressed in extended temporal logic. This higher abstraction level fosters the detection of toxicity pathways, as we will show through an example of endocrine disruption of the thyroid hormone system.

Book ChapterDOI
27 Sep 2017
TL;DR: The package TransferEntropyPT provides R functions to calculate the transfer entropy (TE) for time series of (binned) data and a function to assess the statistical significance of the TE using permutation tests on the sequential data of the time series.
Abstract: The package TransferEntropyPT provides R functions to calculate the transfer entropy (TE) [6] for time series of (binned) data. The package provides a function to assess the statistical significance of the TE using permutation tests on the sequential data of the time series. The underlying code base is written in C++ for computational efficiency and makes use of the boost and OpenMP libraries for parallelization of the data-parallel tasks in the permutation tests. In addition to p-values from hypothesis tests on independence, the package provides direct access to the percentiles themselves. An anticipatory toy model and a biological network are used as showcases. Here, the concentration time series of each molecular species is tested and assessed against those of every other species.
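
To show what a transfer entropy with a permutation test computes, here is a small Python sketch. It does not reproduce the TransferEntropyPT package or its R interface: the function names, the history length of one time step, the plug-in (count-based) estimator, and the way the p-value is formed are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def transfer_entropy(source, target):
    """Plug-in estimate (in bits) of TE from `source` to `target` for binned
    series of equal length, using one step of history:
    sum p(x1, x0, y0) * log2( p(x1 | x0, y0) / p(x1 | x0) )."""
    n = len(target) - 1
    triples = Counter(zip(target[1:], target[:-1], source[:-1]))
    past_pairs = Counter(zip(target[:-1], source[:-1]))
    self_pairs = Counter(zip(target[1:], target[:-1]))
    past_target = Counter(target[:-1])
    te = 0.0
    for (x_next, x_prev, y_prev), count in triples.items():
        p_joint = count / n
        p_given_both = count / past_pairs[(x_prev, y_prev)]
        p_given_self = self_pairs[(x_next, x_prev)] / past_target[x_prev]
        te += p_joint * math.log2(p_given_both / p_given_self)
    return te


def permutation_p_value(source, target, n_permutations=200, seed=0):
    """Share of shuffled-source surrogates whose TE reaches the observed TE.
    Shuffling destroys the temporal alignment between source and target."""
    rng = random.Random(seed)
    observed = transfer_entropy(source, target)
    surrogate = list(source)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(surrogate)
        if transfer_entropy(surrogate, target) >= observed:
            hits += 1
    return (hits + 1) / (n_permutations + 1)


if __name__ == "__main__":
    rng = random.Random(0)
    driver = [rng.randint(0, 1) for _ in range(1000)]
    follower = [0] + driver[:-1]          # follower copies driver one step later
    print(transfer_entropy(driver, follower))
    print(transfer_entropy(follower, driver))
    print(permutation_p_value(driver, follower, n_permutations=100))
```

In the toy data the follower copies the driver with a one-step delay, so information flows from driver to follower but not in the reverse direction, which is the asymmetry transfer entropy is designed to detect.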