Modeling gene expression with differential equations.

Ting Chen, Hongyu He, George M. Church

01 Dec 1998, pp. 29-40

TL;DR: The results suggest that a minor set of temporal data may be sufficient to construct the model at the genome level. A comprehensive discussion of other extended models is also given: the RNA Model, the Protein Model, and the Time Delay Model.

Abstract: We propose a differential equation model for gene expression and provide two methods to construct the model from a set of temporal data. We model both transcription and translation by kinetic equations with feedback loops from translation products to transcription. Degradation of proteins and mRNAs is also incorporated. We study two methods to construct the model from experimental data: Minimum Weight Solutions to Linear Equations (MWSLE), which determines the regulation by solving under-determined linear equations, and Fourier Transform for Stable Systems (FTSS), which refines the model with cell cycle constraints. The results suggest that a minor set of temporal data may be sufficient to construct the model at the genome level. We also give a comprehensive discussion of other extended models: the RNA Model, the Protein Model, and the Time Delay Model.

Summary

Introduction

The progress of genome sequencing and gene recognition has been quite significant in the last few years.
It is widely believed that gene expression data contains rich information that could reveal the higher-order structures of an organism and even interpret its behavior.
In addition to the Boolean networks, other models are also studied.
In summary, the authors propose a Linear Transcription Model for gene expression, as well as two algorithms to construct the model from a set of temporal samples of mRNAs and proteins: Minimum Weight Solutions to Linear Equations and Fourier Transform for Stable Systems.

Dynamic System for Gene Expression

The transcription of a gene begins with transcription elements, mostly proteins and RNAs, binding to regulatory sites on DNA.
The frequency of this binding affects the level of expression.
On the other hand, since the DNA sequence is unchanged, the transcription is mostly determined by the amounts of transcription proteins.
The authors assume the translation mechanism is relatively stable (at least for a short time), so the feedback from proteins to mRNAs has no effect.
The change in mRNA concentrations ($dr/dt$) equals the transcription ($f(p)$) minus the degradation ($Vr$), and similarly, the change in protein concentrations ($dp/dt$) equals the translation ($Lr$) minus the degradation ($Up$).

Linear Transcription Model

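Even when the transcription functions are not linear, the authors can still justify the linear assumption with a first-order Taylor approximation around the protein concentrations at time zero.
Because both $V$ and $U$, the degradation rates, are nonsingular diagonal matrices, the authors can assume the equation has a unique solution.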

Reconstructing Models from Temporal Data

Unfortunately, matrix M has yet to be determined because its sub-matrices are mostly unknown.
The authors will discuss how to determine M from temporal experimental data.
The authors will assume that they obtain a set of time-series samples $x(t_0), x(t_1), \ldots, x(t_k)$, where $x$ includes both mRNA and protein concentrations.

Fourier Transform for Stable Systems

The system is semistable if all the real parts of the eigenvalues of $M$ are non-positive.
The system is stable if it is semistable and all the polynomials $q_{ij}(t)$ are constants.
Let matrix $Q = \{q_{ij}\}$, so Equation 3 can be simplified as $x(t) = Q e^{\lambda t}$ (4). The authors observe that at every cell cycle, many genes repeat their expression patterns.
The transcription analysis of the yeast mitotic cell cycle [10] revealed many similar expression patterns between two consecutive cell cycles.
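The semistability condition above depends only on the eigenvalues of the transition matrix, so it can be checked numerically. The following is a minimal sketch in Python with NumPy; the function name `classify_stability`, the tolerance `tol`, and the two example matrices are illustrative assumptions and do not come from the paper. The sketch only tests semistability. Deciding whether a semistable system is also stable would additionally require checking that every $q_{ij}(t)$ is constant, which is not attempted here.

```python
import numpy as np

def classify_stability(M, tol=1e-9):
    """Return 'semistable' if every eigenvalue of M has real part <= tol,
    otherwise 'not semistable'. `tol` absorbs floating-point round-off."""
    eigenvalues = np.linalg.eigvals(M)
    if np.all(eigenvalues.real <= tol):
        return "semistable"
    return "not semistable"

# Hypothetical 2x2 examples (not taken from the paper).
rotation = np.array([[0.0, -1.0], [1.0, 0.0]])  # eigenvalues +i and -i: purely periodic
growing = np.array([[0.5, 0.0], [0.0, -1.0]])   # eigenvalue 0.5 has a positive real part

print(classify_stability(rotation))
print(classify_stability(growing))
```

A purely imaginary eigenvalue pair, as in `rotation`, corresponds to the periodic expression patterns that motivate the cell cycle constraint.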

Minimum Weight Solutions to Linear Equations

The over-determined linear equations can be solved by using least-squares analysis, which takes O(k) time.
The authors can apply Lemma 1 to Equations (6)-(9) and obtain the following theorem: Theorem 2. Model 1 can be constructed in $O(n^{h+1})$ time.
The additional "+1" comes from solving n genes.
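To make the counting argument concrete, here is a hedged Python sketch of a minimum weight search for a single gene. It assumes that $h$ bounds the number of non-zero unknowns (regulators) per gene and that the equations for one gene can be written as a generic linear system $Aw = b$; the exhaustive subset enumeration, the function name `min_weight_solution`, and the random test data are illustrative assumptions, not the authors' algorithm. Enumerating all column subsets of size at most $h$ costs on the order of $n^h$ least-squares fits per gene, and repeating the search for $n$ genes gives the $n^{h+1}$ growth quoted in Theorem 2.

```python
import numpy as np
from itertools import combinations

def min_weight_solution(A, b, h, tol=1e-8):
    """Search for w with at most h non-zero entries such that A w = b.

    Tries supports of increasing size and stops at the first size
    whose best least-squares residual falls below `tol`.
    """
    n = A.shape[1]
    best = None
    for size in range(1, h + 1):
        for cols in combinations(range(n), size):
            idx = list(cols)
            coef, *_ = np.linalg.lstsq(A[:, idx], b, rcond=None)
            resid = np.linalg.norm(A[:, idx] @ coef - b)
            if best is None or resid < best[0]:
                best = (resid, idx, coef)
        if best[0] < tol:
            break  # smallest support that fits the data
    resid, idx, coef = best
    w = np.zeros(n)
    w[idx] = coef
    return w, resid

# Hypothetical under-determined system: 3 equations, 6 unknowns, 2 true regulators.
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 6))
w_true = np.zeros(6)
w_true[[1, 4]] = [2.0, -1.0]
b = A @ w_true
w, resid = min_weight_solution(A, b, h=2)
```

The search is only practical when $h$ is a small constant, which matches the assumption the summary attributes to MWSLE.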

RNA Model

Various recent techniques have focused on profiling mRNA concentrations.
Gene expression can be partially modeled by the following dynamic system of mRNA concentrations.
There exists one general inverse $C^{-1}$ that matches the real situation.
This is consistent with their understanding that proteins (and other subsumed feedbacks) are major operators in transcription and translation, and thus determine the fate of gene expression.
mRNA concentrations alone, handled in this manner at least, are not sufficient to model the whole system of gene expression.

Time-Delay Model

The real gene expression mechanism has time delays in transcription and translation.
Therefore, the authors obtain the following theorem.
This theorem is in the same style as the other theorems the authors have proved, but apparently weaker: the constraints of the degree of Q(t) do not hold.
All the interesting questions regarding stability of Model 4 can be answered through the studies on the set S.

Limitations of the models and the approaches

Like many other models, the Linear Transcription Model (Model 1) does not consider time delays in transcription and translation.
This assumption greatly reduces the complexity of the problem.
The most significant limitation comes from ignoring other regulators such as metabolites.
It is known that many genes and other factors directly or indirectly affect the pathway that feeds back to transcription.
This assumption does not hold for some genes, and cell cycle length may vary too.

Figures (2)

Figure 1: Simplified dynamic system of gene regulation emphasizing feedback on transcription.

Figure 2: A stable system (top) and a semistable system (bottom).

MODELING GENE EXPRESSION WITH DIFFERENTIAL EQUATIONS

TING CHEN
Department of Genetics, Harvard Medical School
Room 407, 77 Avenue Louis Pasteur, Boston, MA 02115 USA
tchen@salt2.med.harvard.edu

HONGYU L. HE
Department of Mathematics, Massachusetts Institute of Technology
Room 2-487, Cambridge, MA 02139 USA
hongyu@math.mit.edu

GEORGE M. CHURCH
Department of Genetics, Harvard Medical School
200 Longwood Avenue, Boston, MA 02115 USA
church@salt2.med.harvard.edu

We propose a differential equation model for gene expression and provide two methods to construct the model from a set of temporal data. We model both transcription and translation by kinetic equations with feedback loops from translation products to transcription. Degradation of proteins and mRNAs is also incorporated. We study two methods to construct the model from experimental data: Minimum Weight Solutions to Linear Equations (MWSLE), which determines the regulation by solving under-determined linear equations, and Fourier Transform for Stable Systems (FTSS), which refines the model with cell cycle constraints. The results suggest that a minor set of temporal data may be sufficient to construct the model at the genome level. We also give a comprehensive discussion of other extended models: the RNA Model, the Protein Model, and the Time Delay Model.

Introduction

The progress of genome sequencing and gene recognition has been quite significant in the last few years. However, the gap between a complete genome sequence and a functional understanding of an organism is still huge. Many questions about gene functions, expression mechanisms, and global integration of individual mechanisms remain open. Due to the recent success of bioengineering techniques, a series of large-scale analysis tools have been developed to discover the functional organization of cells. DNA arrays and mass spectrometry have emerged as powerful techniques that are capable of profiling RNA and protein expression at a whole-genome level.

Published in the 1999 Pacific Symposium on Biocomputing

It is widely believed that gene expression data contains rich information that could reveal the higher-order structures of an organism and even interpret its behavior. Conceivably within a few years, a large amount of expression data will be produced regularly as the cost of such experiments diminishes. Biologists are expecting powerful computational tools to extract functional information from the data. Critical effort has recently been made to build models to analyze them.

One of the most studied models is the Boolean Network, where a gene has one of only two states (ON and OFF), and the state is determined by a Boolean function of the states of some other genes. Somogyi and Sniegoski showed that Boolean networks have features similar to those in biological systems, such as global complex behavior, self-organization, stability, redundancy, and periodicity. Liang et al. implemented a reverse-engineering algorithm to infer gene regulations and Boolean functions by computing the mutual information between a gene and its candidate regulatory genes. Akutsu et al. gave an algorithmic analysis of the problem of identifying Boolean networks from data obtained by multiple gene disruptions and gene over-expressions with regard to the number of experiments and the complexity of experiments.

In addition to the Boolean networks, other models are also studied. Thieffry and Thomas discussed a generalized logical model and a feedback-loop analysis. They suggested that a logical approach can be used to get a first overview of a differential model and thus help to build and refine the model. McAdams and Shapiro proposed a nice hybrid model that integrates a conventional biochemical kinetic model within the framework of a circuit simulation. However, it is not clear how to determine model parameters from experimental data. Gene expression data can also be analyzed directly by statistical and optimization methods. Michaels et al. measured gene expression temporally and applied statistical clustering methods to reveal the correlations between patterns of gene expression and phenotypic changes. Chen et al. transferred experimental data into a gene regulation graph and imposed optimization constraints to infer the true regulation by eliminating the errors in the graph.

In practice, the determination of the networks has to (1) derive regulatory functions from a small set of data samples; (2) scale up to the genome level; and (3) take into account the time delay in transcription and translation. In this paper, we propose a linear differential equation model for gene expression and two algorithms to solve the differential equations. Potentially, our methods answer the practical questions in (1) and (2), and we also make an effort to incorporate (3). In summary,



• We propose a Linear Transcription Model for gene expression, as well as two algorithms to construct the model from a set of temporal samples of mRNAs and proteins: Minimum Weight Solutions to Linear Equations and Fourier Transform for Stable Systems.

• We discuss three extended models: RNA Model, Protein Model and Time Delay Model, among which the Protein Model parameters can be reconstructed through a set of temporal samples of protein expression levels.

• Our results suggest that it is possible to determine most of the gene regulation at the genome level from a minor set of accurate temporal data.

[Figure 1: Simplified dynamic system of gene regulation emphasizing feedback on transcription. Diagram labels: Genes, mRNAs, Proteins, Degradation, Feedback Control.]

Dynamic System for Gene Expression

The transcription of a gene begins with transcription elements, mostly proteins and RNAs, binding to regulatory sites on DNA. The frequency of this binding affects the level of expression. Experiments have verified that a stronger binding site will increase the effect of a protein on transcription rate. On the other hand, since the DNA sequence is unchanged, the transcription is mostly determined by the amounts of transcription proteins. In translation, proteins are synthesized at ribosomes. An mRNA can be translated into one or multiple copies of corresponding proteins, which can further change the transcription of other genes. A feedback network of genes, mRNAs and proteins is shown in Figure 1.

In Figure 1, we ignore other feedback such as mRNAs to genes, since we subsume such effects in the protein feedback indicated. We assume the translation mechanism is relatively stable (at least for a short time), so the feedback from proteins to mRNAs has no effect. Each mRNA and protein molecule degrades randomly, and its components are recycled in the cell. One important feedback missing here is from metabolites to the transcription, which also plays a key role in signaling. Then, Figure 1 can be modeled as a nonlinear dynamic system:

$$\frac{dr}{dt} = f(p) - Vr, \qquad \frac{dp}{dt} = Lr - Up \qquad (1)$$

where the variables are functions of time $t$ and defined as follows:

$n$: the number of genes in the genome;
$r$: mRNA concentrations, $n$-dimensional vector-valued functions of $t$;
$p$: protein concentrations, $n$-dimensional vector-valued functions of $t$;
$f(p)$: transcription functions, $n$-dimensional vector polynomials on $p$;
$L$: translational constants, $n \times n$ non-degenerate diagonal matrix;
$V$: degradation rates of mRNAs, $n \times n$ non-degenerate diagonal matrix;
$U$: degradation rates of proteins, $n \times n$ non-degenerate diagonal matrix.

The change in mRNA concentrations ($dr/dt$) equals the transcription ($f(p)$) minus the degradation ($Vr$), and similarly, the change in protein concentrations ($dp/dt$) equals the translation ($Lr$) minus the degradation ($Up$). Here, $L$, $V$ and $U$ are non-degenerate diagonal matrices, because we assume both the translation rates and the degradation rates are constants for each species. Also, we consider zero time delay in transcription and translation, and leave the time delay case to a later section.
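As an illustration of Equation 1, the sketch below integrates the system numerically with a classical fourth-order Runge-Kutta step. Everything specific in it is an assumption made for the example: the function name `simulate`, the two-gene network, the affine transcription function `f`, and all rate constants are hypothetical and are not taken from the paper.

```python
import numpy as np

def simulate(f, L, V, U, r0, p0, dt=0.01, steps=1000):
    """Integrate dr/dt = f(p) - V r, dp/dt = L r - U p with classical RK4.

    Returns an array of shape (steps + 1, 2n) holding [r, p] at each time step.
    """
    n = len(r0)

    def rhs(x):
        r, p = x[:n], x[n:]
        return np.concatenate([f(p) - V @ r, L @ r - U @ p])

    x = np.concatenate([r0, p0]).astype(float)
    traj = [x.copy()]
    for _ in range(steps):
        k1 = rhs(x)
        k2 = rhs(x + 0.5 * dt * k1)
        k3 = rhs(x + 0.5 * dt * k2)
        k4 = rhs(x + dt * k3)
        x = x + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        traj.append(x.copy())
    return np.array(traj)

# Hypothetical two-gene network: protein 2 represses gene 1, protein 1 activates gene 2.
C = np.array([[0.0, -0.5], [0.8, 0.0]])
basal = np.array([1.0, 0.2])
f = lambda p: basal + C @ p          # assumed transcription function
L = np.diag([1.0, 1.0])              # translation constants
V = np.diag([1.0, 1.0])              # mRNA degradation rates
U = np.diag([0.5, 0.5])              # protein degradation rates

traj = simulate(f, L, V, U, r0=np.zeros(2), p0=np.zeros(2), dt=0.01, steps=5000)
```

With these made-up parameters the trajectory settles toward the concentrations at which transcription balances mRNA degradation and translation balances protein degradation.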

Linear Transcription Model

First we assume the transcription functions, $f(p)$, to be linear functions of $p$. For example, a combined effect of activators and inhibitors in transcription can be described by a linear function in the form of $w_1[\text{activators}] - w_2[\text{inhibitors}]$, where $w_1$ and $w_2$ are contributions of the activators and the inhibitors to the gene regulation. Otherwise, we can still make the assumption from the following argument.

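We let $p_0$ be the value of $p$ at time zero, and take the first-order Taylor approximation:

$$f(p) \approx f(p_0) + \frac{\partial f}{\partial p}(p_0)\,(p - p_0) = Cp + s$$

where $C = \frac{\partial f}{\partial p}(p_0)$ and $s = f(p_0) - \frac{\partial f}{\partial p}(p_0)\,p_0$. Therefore, we may study Equation 1 (near $p_0$) in the form

$$\frac{dr}{dt} = Cp - Vr + s, \qquad \frac{dp}{dt} = Lr - Up$$

To eliminate $s$ by variable substitution, we apply $r = \bar{r} + r_s$ and $p = \bar{p} + p_s$ into Equation 1 to calculate what constants $r_s$ and $p_s$ suffice to eliminate $s$, and obtain

$$\frac{d\bar{r}}{dt} = C\bar{p} - V\bar{r}, \qquad \frac{d\bar{p}}{dt} = L\bar{r} - U\bar{p}$$

where $r_s$ and $p_s$ can be determined by the following equation:

$$\begin{pmatrix} -V & C \\ L & -U \end{pmatrix} \begin{pmatrix} r_s \\ p_s \end{pmatrix} = \begin{pmatrix} -s \\ 0 \end{pmatrix}$$

Because both $V$ and $U$, the degradation rates, are nonsingular diagonal matrices, we can assume the equation has a unique solution. Therefore it suffices to consider the following dynamic system even if $f(p)$ is nonlinear.

$$\frac{dr}{dt} = Cp - Vr, \qquad \frac{dp}{dt} = Lr - Up \qquad (2)$$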
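The linearization argument can be sketched numerically. The code below estimates the Jacobian of a transcription function at $p_0$ by central differences, forms the constant term left over by the first-order Taylor approximation, and solves the block linear system for the constant shifts that remove it. The function names `jacobian` and `linearize_and_shift`, the symbols `C`, `s`, `r_s` and `p_s` as used here, and the example transcription function are assumptions chosen for illustration; the solve step also assumes the block matrix is nonsingular.

```python
import numpy as np

def jacobian(f, p0, eps=1e-6):
    """Central-difference estimate of the Jacobian of f at p0."""
    n = len(p0)
    J = np.zeros((n, n))
    for j in range(n):
        step = np.zeros(n)
        step[j] = eps
        J[:, j] = (f(p0 + step) - f(p0 - step)) / (2 * eps)
    return J

def linearize_and_shift(f, p0, L, V, U):
    """Linearize f near p0 as C p + s, then find constant shifts (r_s, p_s)
    solving [[-V, C], [L, -U]] [r_s; p_s] = [-s; 0]."""
    C = jacobian(f, p0)
    s = f(p0) - C @ p0
    n = len(p0)
    M = np.block([[-V, C], [L, -U]])
    rhs = np.concatenate([-s, np.zeros(n)])
    shift = np.linalg.solve(M, rhs)   # requires M to be nonsingular
    return C, s, shift[:n], shift[n:]

# Hypothetical nonlinear transcription function for two genes.
f = lambda p: np.array([1.0 / (1.0 + p[1]), 0.2 + 0.8 * p[0]])
I2 = np.eye(2)
C, s, r_s, p_s = linearize_and_shift(f, np.array([0.5, 1.0]), L=I2, V=I2, U=0.5 * I2)
```

After the shift, the variables $r - r_s$ and $p - p_s$ obey a homogeneous linear system of the same form as the one without the constant term.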

We can dene the Linear Transcription Model as

Model 1

Let

;

)

be variables for mRNAs and proteins, M bea



transition matrix, and gene expression can bemodeled by the fol lowing

dynamic system:

wher e M



;

V C

;
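A short sketch of how the transition matrix of Model 1 can be assembled from its four blocks, assuming the block layout with $-V$ and $C$ in the top row and $L$ and $-U$ in the bottom row. The helper name `transition_matrix` and the numeric values are hypothetical.

```python
import numpy as np

def transition_matrix(C, L, V, U):
    """Assemble the 2n x 2n matrix M = [[-V, C], [L, -U]]."""
    return np.block([[-V, C], [L, -U]])

# Hypothetical two-gene parameters.
C = np.array([[0.0, -0.5], [0.8, 0.0]])   # transcription matrix
L = np.diag([1.0, 2.0])                   # translation constants
V = np.diag([0.3, 0.4])                   # mRNA degradation rates
U = np.diag([0.1, 0.2])                   # protein degradation rates
M = transition_matrix(C, L, V, U)

r = np.array([1.0, 2.0])                  # mRNA concentrations
p = np.array([0.5, 1.5])                  # protein concentrations
x = np.concatenate([r, p])
dxdt = M @ x                              # equals [C p - V r, L r - U p]
```

Stacking $r$ and $p$ into one vector is what lets the two coupled equations be treated as a single linear system.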



Solution to Linear Transcription Model

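Assume $M$ has $2n$ eigenvalues $\lambda = (\lambda_1, \ldots, \lambda_{2n})$. It is well-known that the dynamic system in Model 1 has the following solution:

Theorem 1. The solution to Model 1 is of the form

$$x(t) = Q(t)\,e^{\lambda t} \qquad (3)$$

where $Q(t) = \{q_{ij}(t)\}$ satisfies

$$\sum_{j=1}^{2n} \deg(q_{ij}(t)) + 1 \le 2n \quad \text{for } i = 1, 2, \ldots, 2n.$$

$Q(t)$ is a $2n \times 2n$ matrix whose elements are polynomial functions of $t$, and $\deg()$ returns the degree of a polynomial function.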
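For the special case in which $M$ is diagonalizable, every entry of $Q(t)$ is a constant and the solution can be computed directly from an eigendecomposition. The sketch below covers only that case; the function name `solve_model1`, the example matrix, and the initial state are assumptions for illustration, and $e^{\lambda t}$ is read as the vector with entries $e^{\lambda_j t}$.

```python
import numpy as np

def solve_model1(M, x0, t):
    """Evaluate x(t) = Q e^{lambda t} for dx/dt = M x, assuming M is diagonalizable.

    Q[i, j] = P[i, j] * c[j], where the columns of P are eigenvectors of M
    and c solves P c = x0.
    """
    lam, P = np.linalg.eig(M)
    c = np.linalg.solve(P, x0.astype(complex))
    Q = P * c                      # scale column j of P by c[j]
    return (Q @ np.exp(lam * t)).real

# Hypothetical transition matrix for two genes (blocks -V, C, L, -U).
M = np.array([
    [-1.0,  0.0,  0.0, -0.5],
    [ 0.0, -1.0,  0.8,  0.0],
    [ 1.0,  0.0, -0.5,  0.0],
    [ 0.0,  1.0,  0.0, -0.5],
])
x0 = np.array([1.0, 0.0, 0.5, 0.2])
x_at_1 = solve_model1(M, x0, 1.0)
```

When $M$ has repeated eigenvalues without a full set of eigenvectors, the polynomial entries of $Q(t)$ are no longer constant and this shortcut does not apply.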

Frequently Asked Questions (18)

Q1. What are the two techniques that can be used to construct the model?

DNA arrays and mass spectrometry have emerged as powerful techniques that are capable of profiling RNA and protein expression at a whole-genome level.

Q2. What can be done to change the transcription of other genes?

An mRNA can be translated into one or multiple copies of corresponding proteins, which can further change the transcription of other genes.

Q3. What is the studied model of the Boolean Network?

One of the most studied models is the Boolean Network, where a gene has one of only two states (ON and OFF), and the state is determined by a boolean function of the states of some other genes.

Q4. What is the definition of a stable system?

The gene expression system has to be a stable system since an exponential or a polynomial growth rate of a gene or a protein is unlikely to happen.

Q5. What are the main features of the boolean networks?

Somogyi and Sniegoski [2] showed that Boolean networks have features similar to those in biological systems, such as global complex behavior, self-organization, stability, redundancy, and periodicity.

Q6. How many years will a large amount of expression data be produced regularly?

Conceivably within a few years, a large amount of expression data will be produced regularly as the cost of such experiments diminishes.

Q7. What is the simplest way to determine qij?

The system is unstable if there exists a positive eigenvalue of $M$, because the term $q_{ij}(t)e^{\lambda_j t}$ is an exponential function if $\lambda_j$ has a positive value.

Q8. What is the main purpose of this paper?

In this paper, the authors propose a linear differential equation model for gene expression and two algorithms to solve the differential equations.

Q9. What is the way to determine the transcription of a gene?

The authors assume the translation mechanism is relatively stable (at least for a short time), so the feedback from proteins to mRNAs has no effect.

Q10. How did Chen et al. create the graph?

Chen et al. [6] transferred experimental data into a gene regulation graph and imposed optimization constraints to infer the true regulation by eliminating the errors in the graph.

Q11. What are $dr/dt$ and $dp/dt$ for mRNAs and proteins?

The variables are: $n$, the number of genes in the genome; $r$, mRNA concentrations, $n$-dimensional vector-valued functions of $t$; $p$, protein concentrations, $n$-dimensional vector-valued functions of $t$; $f(p)$, transcription functions, $n$-dimensional vector polynomials on $p$; $L$, translational constants, $n \times n$ non-degenerate diagonal matrix; $V$, degradation rates of mRNAs, $n \times n$ non-degenerate diagonal matrix; $U$, degradation rates of proteins, $n \times n$ non-degenerate diagonal matrix. The change in mRNA concentrations ($dr/dt$) equals the transcription ($f(p)$) minus the degradation ($Vr$), and similarly, the change in protein concentrations ($dp/dt$) equals the translation ($Lr$) minus the degradation ($Up$).

Q12. What is $c_{ij}$ in Model 1?

The transcription matrix $C$ in Model 1 represents gene regulatory networks: $c_{ij} \neq 0$ indicates gene $j$ is a regulator for the transcription of gene $i$, and $c_{ij} = 0$ indicates gene $j$ is not a regulator for gene $i$.

Q13. What is the important feedback for the transcription of a gene?

On the other hand, since the DNA sequence is unchanged, the transcription is mostly determined by the amounts of transcription proteins.

Q14. What is the other approach of MWSLE?

The other approach of MWSLE assumes the number of regulators of a gene is a small constant, but the actual number may be much larger than expected and the solution may be intractable computationally.

Q15. What is the simplest way to determine M from temporal experimental data?

The authors will assume that they obtain a set of time-series samples $x(t_0), x(t_1), \ldots, x(t_k)$, where $x$ includes both mRNA and protein concentrations.

Q16. What is the final equation for $d^2p/dt^2$?

The final equation is

$$\frac{d^2p}{dt^2} = \left(-LVL^{-1} - U\right)\frac{dp}{dt} + \left(-LVL^{-1}U + LC\right)p \qquad (11)$$

Here, $L$ is a non-degenerate diagonal matrix and its inverse $L^{-1}$ exists.
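The signs in Equation 11 can be checked by eliminating $r$ from the two first-order equations of the linear model. The derivation below is a reconstruction that assumes the starting point is $dr/dt = Cp - Vr$ and $dp/dt = Lr - Up$ with constant matrices; it is offered as a consistency check rather than as the authors' own derivation.

```latex
\begin{align*}
\frac{dp}{dt} &= Lr - Up
  \;\Longrightarrow\; r = L^{-1}\!\left(\frac{dp}{dt} + Up\right), \\
\frac{d^2p}{dt^2} &= L\frac{dr}{dt} - U\frac{dp}{dt}
  = L\,(Cp - Vr) - U\frac{dp}{dt} \\
&= LCp - LVL^{-1}\!\left(\frac{dp}{dt} + Up\right) - U\frac{dp}{dt} \\
&= \left(-LVL^{-1} - U\right)\frac{dp}{dt} + \left(-LVL^{-1}U + LC\right)p .
\end{align*}
```

Because only $p$ and its derivatives remain, the resulting second-order system involves protein measurements alone.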

Q17. What is the solution of the dynamic system in Model 1?

It is well-known that the dynamic system in Model 1 has the following solution: Theorem 1. The solution to Model 1 is of the form $x(t) = Q(t)e^{\lambda t}$ (3), where $Q(t) = \{q_{ij}(t)\}$ satisfies $\sum_{j=1}^{2n} \deg(q_{ij}(t)) + 1 \le 2n$ for $i = 1, 2, \ldots, 2n$. $Q(t)$ is a $2n \times 2n$ matrix whose elements are polynomial functions of $t$, and $\deg()$ returns the degree of a polynomial function.

Q18. What is the form of the solutions to Model 4?

The solutions to Model 4 are of the following form: $(r, p)^T = Q(t)e^{\lambda t}$, where $\lambda$ are eigenvalues of $S$, and $Q(t)$ is a matrix whose elements are polynomials on $t$.
