Proceedings ArticleDOI

Modeling gene expression with differential equations.

01 Dec 1998-pp 29-40
TL;DR: The results suggest that a minor set of temporal data may be sufficient to construct the model at the genome level; a comprehensive discussion of three extended models is also given: the RNA Model, the Protein Model, and the Time Delay Model.
Abstract: We propose a differential equation model for gene expression and provide two methods to construct the model from a set of temporal data. We model both transcription and translation by kinetic equations with feedback loops from translation products to transcription. Degradation of proteins and mRNAs is also incorporated. We study two methods to construct the model from experimental data: Minimum Weight Solutions to Linear Equations (MWSLE), which determines the regulation by solving under-determined linear equations, and Fourier Transform for Stable Systems (FTSS), which refines the model with cell cycle constraints. The results suggest that a minor set of temporal data may be sufficient to construct the model at the genome level. We also give a comprehensive discussion of other extended models: the RNA Model, the Protein Model, and the Time Delay Model.

Summary (2 min read)

Introduction

  • The progress of genome sequencing and gene recognition has been quite significant in the last few years.
  • It is widely believed that gene expression data contains rich information that could discover the higher-order structures of an organism and even interpret its behavior.
  • In addition to Boolean networks, other models have also been studied.
  • In summary, the authors propose a Linear Transcription Model for gene expression, as well as two algorithms to construct the model from a set of temporal samples of mRNAs and proteins: Minimum Weight Solutions to Linear Equations and Fourier Transform for Stable Systems.

Dynamic System for Gene Expression

  • The transcription of a gene begins with transcription elements, mostly proteins and RNAs, binding to regulatory sites on DNA.
  • The frequency of this binding affects the level of expression.
  • On the other hand, since the DNA sequence is unchanged, the transcription is mostly determined by the amounts of transcription proteins.
  • The authors assume the translation mechanism is relatively stable (at least for a short time), so the feedback from proteins to mRNAs has no effect.
  • The change in mRNA concentrations ($dr/dt$) equals the transcription ($f(p)$) minus the degradation ($Vr$), and similarly, the change in protein concentrations ($dp/dt$) equals the translation ($Lr$) minus the degradation ($Up$).

Linear Transcription Model

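  • If the transcription functions are not linear, the authors can still justify the linear assumption with a first-order Taylor approximation around the protein concentrations at time zero.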
  • Because both V and U , the degradation rates, are nonsingular diagonal matrices, the authors can assume the equation has a unique solution.

Reconstructing Models from Temporal Data

  • Unfortunately, matrix M has yet to be determined because its sub-matrices are mostly unknown.
  • The authors will discuss how to determine M from temporal experimental data.
  • The authors will assume that they obtain a set of time-series samples $x(t_0), x(t_1), \ldots, x(t_k)$, where $x$ includes both mRNA and protein concentrations.

Fourier Transform for Stable Systems

  • The system is semistable if all the real parts of the eigenvalues $\lambda$ are non-positive.
  • The system is stable if it is semistable and all the polynomials $q_{ij}(t)$ are constants.
  • Let matrix $Q = \{q_{ij}\}$, so Equation 3 can be simplified as $x(t) = Qe^{\lambda t}$ (4). The authors observe that at every cell cycle, many genes repeat their expression patterns.
  • The transcription analysis of the yeast mitotic cell cycle [10] revealed many similar expression patterns between two consecutive cell cycles.

Minimum Weight Solutions to Linear Equations

  • The over-determined linear equations can be solved by using least-square analysis, which takes O(k) time.
  • The authors can apply Lemma 1 to Equations (6)-(9) and obtain the following theorem: Theorem 2. Model 1 can be constructed in $O(n^{h+1})$ time.
  • The additional "+1" in the exponent comes from solving for n genes.
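As a rough illustration of the minimum-weight idea described above, the following sketch searches, for a single gene, over all regulator subsets of size at most h and returns the smallest subset whose least-squares fit reproduces the samples exactly. It is a brute-force stand-in under stated assumptions, not the authors' algorithm or their Lemma 1: the function name `min_weight_solution`, the tolerance `tol`, and the toy data are all hypothetical. Rows of `A` are assumed to hold protein levels at each sample time, and `b` the corresponding left-hand side for one gene.

```python
from itertools import combinations

import numpy as np


def min_weight_solution(A, b, h, tol=1e-8):
    """Find x with A @ x ~= b using as few nonzeros as possible (at most h).

    Brute force over column subsets of increasing size; illustrative only.
    Returns (support, x), or (None, None) if no subset of size <= h fits.
    """
    k, n = A.shape
    if np.linalg.norm(b) <= tol:
        return (), np.zeros(n)
    for size in range(1, h + 1):
        for support in combinations(range(n), size):
            cols = list(support)
            # Least-squares fit restricted to the chosen candidate regulators
            coef, *_ = np.linalg.lstsq(A[:, cols], b, rcond=None)
            if np.linalg.norm(A[:, cols] @ coef - b) <= tol:
                x = np.zeros(n)
                x[cols] = coef
                return support, x
    return None, None


# Toy data (made up): 6 candidate regulators, only 4 samples, 2 true regulators
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 6))
x_true = np.zeros(6)
x_true[[1, 4]] = [2.0, -1.5]
b = A @ x_true

support, x_hat = min_weight_solution(A, b, h=2)
```

The search visits on the order of $n^h$ subsets for one gene; repeating it for every gene is where the extra factor of n mentioned above would come from.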

RNA Model

  • Various recent techniques have focused on profiling mRNA concentrations.
  • Gene expression can be partially modeled by the following dynamic system of mRNA concentrations.
  • There exists one general inverse $C^{-1}$ that matches the real situation.
  • This is consistent with their understanding that proteins (and other subsumed feedbacks) are major operators in transcription and translation, and thus determine the fate of gene expression.
  • mRNA concentrations alone, handled in this manner at least, are not sufficient to model the whole system of gene expression.

Time-Delay Model

  • The real gene expression mechanism has time delays in transcription and translation.
  • Therefore, the authors obtain the following theorem.
  • This theorem is in the same style as the other theorems the authors have proved, but apparently weaker: the constraints of the degree of Q(t) do not hold.
  • All the interesting questions regarding stability of Model 4 can be answered through the studies on the set S.

Limitations of the models and the approaches

  • Like many other models, the Linear Transcription Model (Model 1) does not consider time delays in transcription and translation.
  • This assumption greatly reduces the complexity of the problem.
  • The most significant limitation comes from ignorance of other regulators such as metabolites.
  • It is known that many genes and other factors directly or indirectly affect the pathway that feeds back to transcription.
  • This assumption does not hold for some genes, and cell cycle length may vary too.


MODELING GENE EXPRESSION WITH DIFFERENTIAL EQUATIONS (a)
TING CHEN
Department of Genetics, Harvard Medical School
Room 407, 77 Avenue Louis Pasteur, Boston, MA 02115 USA
tchen@salt2.med.harvard.edu
HONGYU L. HE
Department of Mathematics, Massachusetts Institute of Technology
Room 2-487, Cambridge, MA 02139 USA
hongyu@math.mit.edu
GEORGE M. CHURCH
Department of Genetics, Harvard Medical School
200 Longwood Avenue, Boston, MA 02115 USA
church@salt2.med.harvard.edu
We prop ose a dierential equation mo del for gene expression and provide two
methods to construct the model from a set of temporal data. We model b oth tran-
scription and translation by kinetic equations with feedback loops from translation
products to transcription. Degradation of proteins and mRNAs is also incorpo-
rated. We study two metho ds to construct the model from exp erimental data:
Minimum Weight Solutions to Linear Equations (MWSLE), which determines the
regulation by solving under-determined linear equations, and Fourier Transform
for Stable Systems (FTSS), which renes the model with cell cycle constraints.
The results suggest that a minor set of temporal data may b e sucienttocon-
struct the mo del at the genome level. We also give a comprehensive discussion of
other extended models: the RNA Mo del, the Protein Model, and the Time Delay
Model.
Introduction
The progress of genome sequencing and gene recognition has been quite significant in the last few years. However, the gap between a complete genome sequence and a functional understanding of an organism is still huge. Many questions about gene functions, expression mechanisms, and global integration of individual mechanisms remain open. Due to the recent success of bioengineering techniques, a series of large-scale analysis tools have been developed to discover the functional organization of cells. DNA arrays and mass spectrometry have emerged as powerful techniques that are capable of profiling RNA and protein expression at a whole-genome level.
(a) Published in the 1999 Pacific Symposium on Biocomputing.

It is widely believed that gene expression data contains rich information that could discover the higher-order structures of an organism and even interpret its behavior. Conceivably within a few years, a large amount of expression data will be produced regularly as the cost of such experiments diminishes. Biologists are expecting powerful computational tools to extract functional information from the data. Critical effort is being made recently to build models to analyze them.
One of the most studied models is the Boolean Network, where a gene has one of only two states (ON and OFF), and the state is determined by a Boolean function of the states of some other genes. Somogyi and Sniegoski [2] showed that Boolean networks have features similar to those in biological systems, such as global complex behavior, self-organization, stability, redundancy, and periodicity. Liang et al. [3] implemented a reverse-engineering algorithm to infer gene regulations and Boolean functions by computing the mutual information between a gene and its candidate regulatory genes. Akutsu et al. [4] gave an algorithmic analysis of the problem of identifying Boolean networks from data obtained by multiple gene disruption and gene over-expressions in regard to the number of experiments and the complexity of experiments.
In addition to Boolean networks, other models are also studied. Thieffry and Thomas [5] discussed a generalized logical model and a feedback-loop analysis. They suggested that a logical approach can be used to get a first overview of a differential model and thus help to build and refine the model. McAdams and Shapiro [1] proposed a nice hybrid model that integrates a conventional biochemical kinetic model within the framework of a circuit simulation. However, it is not clear how to determine model parameters from experimental data. Gene expression data can also be analyzed directly by statistical and optimization methods. Michaels et al. [7] measured gene expression temporally and applied statistical clustering methods to reveal the correlations between patterns of gene expression and phenotypic changes. Chen et al. [6] transferred experimental data into a gene regulation graph and imposed optimization constraints to infer the true regulation by eliminating the errors in the graph.
In practice, the determination of the networks has to (1) derive regulatory functions from a small set of data samples; (2) scale up to the genome level; and (3) take into account the time delay in transcription and translation. In this paper, we propose a linear differential equation model for gene expression and two algorithms to solve the differential equations. Potentially, our methods answer the practical questions in (1) and (2), and we also make an effort to incorporate (3). In summary,
  • We propose a Linear Transcription Model for gene expression, as well as two algorithms to construct the model from a set of temporal samples of mRNAs and proteins: Minimum Weight Solutions to Linear Equations and Fourier Transform for Stable Systems.
  • We discuss three extended models: RNA Model, Protein Model and Time Delay Model, among which the Protein Model parameters can be reconstructed through a set of temporal samples of protein expression levels.
  • Our results suggest that it is possible to determine most of the gene regulation at the genome level from a minor set of accurate temporal data.
[Figure 1 diagram: Genes, mRNAs, Proteins; labels L, V and U (Degradation), C (Feedback Control).]
Figure 1: Simplified dynamic system of gene regulation emphasizing feedback on transcription.
Dynamic System for Gene Expression
The transcription of a gene begins with transcription elements, mostly proteins and RNAs, binding to regulatory sites on DNA. The frequency of this binding affects the level of expression. Experiments have verified that a stronger binding site will increase the effect of a protein on transcription rate. On the other hand, since the DNA sequence is unchanged, the transcription is mostly determined by the amounts of transcription proteins. In translation, proteins are synthesized at ribosomes. An mRNA can be translated into one or multiple copies of corresponding proteins, which can further change the transcription of other genes. A feedback network of genes, mRNAs and proteins is shown in Figure 1.
In Figure 1, we ignore other feedback such as mRNAs to genes, since we subsume such effects in the protein feedback indicated. We assume the translation mechanism is relatively stable (at least for a short time), so the feedback from proteins to mRNAs has no effect. Each mRNA and protein molecule degrades randomly, and its components are recycled in the cell. One important feedback missing here is from metabolites to the transcription, which also plays a key role in signaling. Then, Figure 1 can be modeled as a nonlinear dynamic system:
$$\frac{dr}{dt} = f(p) - Vr, \qquad \frac{dp}{dt} = Lr - Up \qquad (1)$$
where the variables are functions of time $t$ and defined as follows:
  $n$: the number of genes in the genome;
  $r$: mRNA concentrations, $n$-dimensional vector-valued functions of $t$;
  $p$: protein concentrations, $n$-dimensional vector-valued functions of $t$;
  $f(p)$: transcription functions, $n$-dimensional vector polynomials on $p$;
  $L$: translational constants, $n \times n$ non-degenerate diagonal matrix;
  $V$: degradation rates of mRNAs, $n \times n$ non-degenerate diagonal matrix;
  $U$: degradation rates of proteins, $n \times n$ non-degenerate diagonal matrix.
The change in mRNA concentrations ($dr/dt$) equals the transcription ($f(p)$) minus the degradation ($Vr$), and similarly, the change in protein concentrations ($dp/dt$) equals the translation ($Lr$) minus the degradation ($Up$). Here, $L$, $V$ and $U$ are non-degenerate diagonal matrices, because we assume both the translation rates and the degradation rates are constants for each species. Also, we consider zero time delay in transcription and translation, and leave the time delay case to a later section.
Linear Transcription Model
First we assume the transcription functions, $f(p)$, to be linear functions of $p$, $f(p) = Cp$. For example, a combined effect of activators and inhibitors in transcription can be described by a linear function in the form of $w_a[\text{activators}] - w_i[\text{inhibitors}]$, where $w_a$ and $w_i$ are contributions of the activators and the inhibitors to the gene regulation. Otherwise, we can still make the assumption from the following argument.
We let $p_0$ be the value of $p$ at time zero, and take the first-order Taylor approximation:
$$f(p) = f(p_0) + \left.\frac{df(p)}{dp}\right|_{p_0}(p - p_0) = Cp + s$$
where $C = \left.\frac{df(p)}{dp}\right|_{p_0}$ and $s = f(p_0) - \left.\frac{df(p)}{dp}\right|_{p_0} p_0$. Therefore, we may study Equation 1 (near $p_0$):
$$\frac{dr}{dt} = Cp - Vr + s, \qquad \frac{dp}{dt} = Lr - Up$$
To eliminate $s$ by variable substitution, we substitute $r \to r + r_s$ and $p \to p + p_s$ into Equation 1, calculate what constants $r_s$ and $p_s$ suffice to eliminate $s$, and obtain
$$\frac{dr}{dt} = Cp - Vr + (Cp_s - Vr_s) + s, \qquad \frac{dp}{dt} = Lr - Up + (Lr_s - Up_s)$$
where $r_s$ and $p_s$ can be determined by the following equation:
$$\begin{pmatrix} -V & C \\ L & -U \end{pmatrix} \begin{pmatrix} r_s \\ p_s \end{pmatrix} = \begin{pmatrix} -s \\ 0 \end{pmatrix}$$
Because both $V$ and $U$, the degradation rates, are nonsingular diagonal matrices, we can assume the equation has a unique solution. Therefore it suffices to consider the following dynamic system even if $f(p)$ is nonlinear.
$$\frac{dr}{dt} = Cp - Vr, \qquad \frac{dp}{dt} = Lr - Up \qquad (2)$$
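The variable substitution can be checked numerically. The sketch below, a minimal illustration rather than anything taken from the paper, solves the block system for $r_s$ and $p_s$ and computes the two constant terms that the substitution is meant to cancel. All parameter values are randomly generated placeholders, and the sketch assumes the block matrix happens to be nonsingular for them.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3

# Hypothetical parameters: positive diagonal rates, small arbitrary C, arbitrary s
V = np.diag(rng.uniform(0.5, 1.5, n))   # mRNA degradation rates
U = np.diag(rng.uniform(0.5, 1.5, n))   # protein degradation rates
L = np.diag(rng.uniform(0.5, 1.5, n))   # translation constants
C = rng.normal(scale=0.1, size=(n, n))  # linearized transcription matrix
s = rng.normal(size=n)                  # constant term from the Taylor expansion

# Block system [[-V, C], [L, -U]] @ [r_s; p_s] = [-s; 0]
M = np.block([[-V, C], [L, -U]])
rhs = np.concatenate([-s, np.zeros(n)])
shift = np.linalg.solve(M, rhs)
r_s, p_s = shift[:n], shift[n:]

# Constant terms left over after the substitution; both should be ~0
resid_r = C @ p_s - V @ r_s + s
resid_p = L @ r_s - U @ p_s
```

If both residuals vanish, the shifted variables obey the homogeneous system of Equation 2.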
We can dene the Linear Transcription Model as
Model 1
Let
x
=(
r
;
p
)
T
be variables for mRNAs and proteins, M bea
2
n
2
n
transition matrix, and gene expression can bemodeled by the fol lowing
dynamic system:
d
x
dt
=
M
x
wher e M
=
;
V C
L
;
U
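To make Model 1 concrete, here is a minimal sketch that assembles $M$ from its four blocks and integrates $dx/dt = Mx$ with a fixed-step Runge-Kutta scheme. The two-gene network, every rate constant, and the helper names `build_M` and `simulate` are assumptions chosen for illustration; the paper does not prescribe a numerical integrator.

```python
import numpy as np


def build_M(V, C, L, U):
    """Assemble the 2n x 2n transition matrix [[-V, C], [L, -U]] of Model 1."""
    return np.block([[-V, C], [L, -U]])


def simulate(M, x0, t_end, steps):
    """Integrate dx/dt = M x with classical fixed-step RK4; returns (times, states)."""
    dt = t_end / steps
    xs = np.empty((steps + 1, len(x0)))
    xs[0] = x0
    for k in range(steps):
        x = xs[k]
        k1 = M @ x
        k2 = M @ (x + 0.5 * dt * k1)
        k3 = M @ (x + 0.5 * dt * k2)
        k4 = M @ (x + dt * k3)
        xs[k + 1] = x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return np.linspace(0.0, t_end, steps + 1), xs


# Hypothetical two-gene network: protein 1 activates gene 2, protein 2 represses gene 1
V = np.diag([1.0, 0.8])                    # mRNA degradation
U = np.diag([0.5, 0.6])                    # protein degradation
L = np.diag([1.2, 1.0])                    # translation
C = np.array([[0.0, -0.4], [0.3, 0.0]])    # c_ij != 0: gene j regulates gene i
M = build_M(V, C, L, U)

x0 = np.array([1.0, 0.5, 0.2, 0.1])        # (r1, r2, p1, p2) at t = 0
times, xs = simulate(M, x0, t_end=5.0, steps=500)
```

Each row of `xs` holds the mRNA concentrations followed by the protein concentrations at the corresponding entry of `times`.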
Solution to Linear Transcription Model
Assume $M$ has $2n$ eigenvalues $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_{2n})^T$. It is well-known that the dynamic system in Model 1 has the following solution:
Theorem 1. The solution to Model 1 is of the form
$$x(t) = Q(t)e^{\lambda t} \qquad (3)$$
where $Q(t) = \{q_{ij}(t)\}$ satisfies
$$\sum_{j=1}^{2n} \deg(q_{ij}(t)) + 1 \le 2n \quad \text{for } i = 1, 2, \ldots, 2n$$
$Q(t)$ is a $2n \times 2n$ matrix whose elements are polynomial functions of $t$, and $\deg()$ returns the degree of a polynomial function.
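In the generic case where $M$ has distinct eigenvalues, the polynomials $q_{ij}(t)$ reduce to constants and the solution can be written down from an eigendecomposition. The sketch below illustrates that special case only; it assumes $M$ is diagonalizable, and the names `solution_coefficients`, `evaluate`, and `classify`, along with the example matrix, are hypothetical. The classification uses only the sign of the real parts of the eigenvalues and does not check the extra condition on the polynomials.

```python
import numpy as np


def solution_coefficients(M, x0):
    """Return (lam, Q) such that x(t) = Q @ exp(lam * t), assuming M is diagonalizable."""
    lam, P = np.linalg.eig(M)
    c = np.linalg.solve(P, x0)   # coordinates of x0 in the eigenvector basis
    Q = P * c                    # column j of P scaled by c_j gives constant q_ij
    return lam, Q


def evaluate(lam, Q, t):
    """Evaluate x(t); the imaginary parts cancel for real M and real x0."""
    return (Q @ np.exp(lam * t)).real


def classify(lam, tol=1e-12):
    """'unstable' if some eigenvalue has positive real part, else 'semistable'."""
    return "unstable" if np.max(lam.real) > tol else "semistable"


# Hypothetical 2-gene example: M = [[-V, C], [L, -U]] with made-up rates
M = np.array([[-1.0, 0.0, 0.0, -0.4],
              [0.0, -0.8, 0.3, 0.0],
              [1.2, 0.0, -0.5, 0.0],
              [0.0, 1.0, 0.0, -0.6]])
x0 = np.array([1.0, 0.5, 0.2, 0.1])

lam, Q = solution_coefficients(M, x0)
x_at_2 = evaluate(lam, Q, 2.0)
label = classify(lam)
```

A repeated eigenvalue with too few eigenvectors would break this shortcut, which is exactly the situation where the non-constant polynomials in Theorem 1 appear.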

Citations
Journal ArticleDOI
TL;DR: This paper reviews formalisms that have been employed in mathematical biology and bioinformatics to describe genetic regulatory systems, in particular directed graphs, Bayesian networks, Boolean networks and their generalizations, ordinary and partial differential equations, qualitative differential equations, stochastic equations, and so on.
Abstract: The spatiotemporal expression of genes in an organism is determined by regulatory systems that involve a large number of genes connected through a complex network of interactions. As an intuitive understanding of the behavior of these systems is hard to obtain, computer tools for the modeling and simulation of genetic regulatory networks will be indispensable. This report reviews formalisms that have been employed in mathematical biology and bioinformatics to describe genetic regulatory systems, in particular directed graphs, Bayesian networks, ordinary and partial differential equations, stochastic equations, Boolean networks and their generalizations, qualitative differential equations, and rule-based formalisms. In addition, the report discusses how these formalisms have been used in the modeling and simulation of regulatory systems.

2,739 citations


Cites background from "Modeling gene expression with diffe..."

  • ...…putative regulatory connections between coexpressed genes in the graph, such as the analysis of time lags (Arkin and Ross, 1995; Arkin et al., 1997; Chen et al., 1999a), the performance of perturbation experiments (Holstege et al., 1998; Hughes et al., 2000; Laub et al., 2000; Spellman et al.,…...

    [...]

  • ...Related work taking ordinary differential equations and difference equations as their point of departure has come up recently (Chen et al., 1999b; D’haeseleer et al., 2000; Noda et al., 1998; van Someren et al., 2000; Weaver et al., 1999; Wessels et al., 2001)....

    [...]

Journal ArticleDOI
TL;DR: It is concluded that the combination of predictive modeling with systematic experimental verification will be required to gain a deeper insight into living organisms, therapeutic targeting and bioengineering.
Abstract: Advances in molecular biological, analytical, and computational technologies are enabling us to systematically investigate the complex molecular processes underlying biological systems. In particular, using high-throughput gene expression assays, we are able to measure the output of the gene regulatory network. We aim here to review datamining and modeling approaches for conceptualizing and unraveling the functional relationships implicit in these datasets. Clustering of co-expression profiles allows us to infer shared regulatory inputs and functional pathways. We discuss various aspects of clustering, ranging from distance measures to clustering algorithms and multiple-cluster memberships. More advanced analysis aims to infer causal connections between genes directly, i.e., who is regulating whom and how. We discuss several approaches to the problem of reverse engineering of genetic networks, from discrete Boolean networks, to continuous linear and non-linear models. We conclude that the combination of predictive modeling with systematic experimental verification will be required to gain a deeper insight into living organisms, therapeutic targeting, and bioengineering.

1,010 citations

Journal ArticleDOI
TL;DR: This review deals with the reconstruction of gene regulatory networks (GRNs) from experimental data through computational methods, and discusses approaches that enable the modelling of the dynamics of gene regulatory systems.
Abstract: Systems biology aims to develop mathematical models of biological systems by integrating experimental and theoretical techniques. During the last decade, many systems biological approaches that base on genome-wide data have been developed to unravel the complexity of gene regulation. This review deals with the reconstruction of gene regulatory networks (GRNs) from experimental data through computational methods. Standard GRN inference methods primarily use gene expression data derived from microarrays. However, the incorporation of additional information from heterogeneous data sources, e.g. genome sequence and protein-DNA interaction data, clearly supports the network inference process. This review focuses on promising modelling approaches that use such diverse types of molecular biological information. In particular, approaches are discussed that enable the modelling of the dynamics of gene regulatory systems. The review provides an overview of common modelling schemes and learning algorithms and outlines current challenges in GRN modelling.

742 citations

Journal ArticleDOI
TL;DR: The findings demonstrate how the network inference performance varies with the training set size, the degree of inadequacy of prior assumptions, the experimental sampling strategy and the inclusion of further, sequence-based information.
Abstract: Motivation: Bayesian networks have been applied to infer genetic regulatory interactions from microarray gene expression data. This inference problem is particularly hard in that interactions between hundreds of genes have to be learned from very small data sets, typically containing only a few dozen time points during a cell cycle. Most previous studies have assessed the inference results on real gene expression data by comparing predicted genetic regulatory interactions with those known from the biological literature. This approach is controversial due to the absence of known gold standards, which renders the estimation of the sensitivity and specificity, that is, the true and (complementary) false detection rate, unreliable and difficult. The objective of the present study is to test the viability of the Bayesian network paradigm in a realistic simulation study. First, gene expression data are simulated from a realistic biological network involving DNAs, mRNAs, inactive protein monomers and active protein dimers. Then, interaction networks are inferred from these data in a reverse engineering approach, using Bayesian networks and Bayesian learning with Markov chain Monte Carlo. Results: The simulation results are presented as receiver operator characteristics curves. This allows estimating the proportion of spurious gene interactions incurred for a specified target proportion of recovered true interactions. The findings demonstrate how the network inference performance varies with the training set size, the degree of inadequacy of prior assumptions, the experimental sampling strategy and the inclusion of further, sequence-based information. Availability: The programs and data used in the present study are available from http://www.bioss.sari.ac.uk/~dirk/ Supplements

564 citations


Cites background from "Modeling gene expression with diffe..."

  • ...…refined level of detail is a mathematical description of the biophysical processes in terms of a system of coupled differential equations that describe, for example, the processes of transcription factor binding, diffusion, protein and RNA degradation, etc.; see, for instance, Chen et al. (1999)....

    [...]

Journal ArticleDOI
TL;DR: In this paper, a model for genetic regulatory networks with time delays, described by functional differential equations or delay differential equations (DDE), provides necessary and sufficient conditions for simplifying the genetic network model, and further analyze nonlinear properties of the model in terms of local stability and bifurcation.
Abstract: Presents a model for genetic regulatory networks with time delays, which is described by functional differential equations or delay differential equations (DDE), provide necessary and sufficient conditions for simplifying the genetic network model, and further analyze nonlinear properties of the model in terms of local stability and bifurcation. The proposed model transforms the original interacting network into several simple transcendental equations at an equilibrium, thereby significantly reducing the computational complexity and making analysis of stability and bifurcation tractable for even large-scale networks. Finally, to test the theory, a repressilator model is used as an example for numerical simulation.

406 citations


Cites methods from "Modeling gene expression with diffe..."

  • ...He is also with CREST, Japan Science and Technology Corporation (JST), Kawaguchi, Saitama 332, Japan (e-mail: aihara@sat.t.u-tokyo.ac.jp)....

    [...]

References
Journal ArticleDOI
TL;DR: The genome-wide characterization of mRNA transcript levels during the cell cycle of the budding yeast S. cerevisiae indicates a mechanism for local chromosomal organization in global mRNA regulation and links a range of human genes to cell cycle period-specific biological functions.

2,232 citations


Additional excerpts

  • ...(9) $\frac{r_i(t_k) - r_i(t_{k-1})}{t_k - t_{k-1}} = c_{i1}p_1(t_k) + \cdots + c_{in}p_n(t_k) - v_{ii}r_i(t_k)$ (10)...

    [...]

Proceedings Article
01 Jan 1998
TL;DR: This study investigates the possibility of completely inferring a complex regulatory network architecture from input/output patterns of its variables using binary models of genetic networks, and finds the problem to be tractable within the conditions tested so far.
Abstract: Given the immanent gene expression mapping covering whole genomes during development, health and disease, we seek computational methods to maximize functional inference from such large data sets. Is it possible, in principle, to completely infer a complex regulatory network architecture from input/output patterns of its variables? We investigated this possibility using binary models of genetic networks. Trajectories, or state transition tables of Boolean nets, resemble time series of gene expression. By systematically analyzing the mutual information between input states and output states, one is able to infer the sets of input elements controlling each element or gene in the network. This process is unequivocal and exact for complete state transition tables. We implemented this REVerse Engineering ALgorithm (REVEAL) in a C program, and found the problem to be tractable within the conditions tested so far. For n = 50 (elements) and k = 3 (inputs per element), the analysis of incomplete state transition tables (100 state transition pairs out of a possible 10(exp 15)) reliably produced the original rule and wiring sets. While this study is limited to synchronous Boolean networks, the algorithm is generalizable to include multi-state models, essentially allowing direct application to realistic biological data sets. The ability to adequately solve the inverse problem may enable in-depth analysis of complex dynamic systems in biology and other fields.

1,031 citations

Journal ArticleDOI
04 Aug 1995-Science
TL;DR: A hybrid modeling approach is proposed that integrates conventional biochemical kinetic modeling within the framework of a circuit simulation for in vivo behavior of phage lambda.
Abstract: Genetic networks with tens to hundreds of genes are difficult to analyze with currently available techniques. Because of the many parallels in the function of these biochemically based genetic circuits and electrical circuits, a hybrid modeling approach is proposed that integrates conventional biochemical kinetic modeling within the framework of a circuit simulation. The circuit diagram of the bacteriophage lambda lysis-lysogeny decision circuit represents connectivity in signal paths of the biochemical components. A key feature of the lambda genetic circuit is that operons function as active integrated logic components and introduce signal time delays essential for the in vivo behavior of phage lambda.

538 citations

Journal ArticleDOI
TL;DR: An introduction to Boolean networks and their relevance to present-day experimental research is provided, bringing us closer to an understanding of complex molecular physiological processes like brain development and intractable medical problems of immediate importance.
Abstract: Molecular genetics presents an increasingly complex picture of the genome and biological function. Evidence is mounting for distributed function, redundancy, and combinatorial coding in the regulation of genes. Satisfactory explanation will require the concept of a parallel processing signaling network. Here we provide an introduction to Boolean networks and their relevance to present-day experimental research. Boolean network models exhibit global complex behavior, self-organization, stability, redundancy and periodicity, properties that deeply characterize biological systems. While the life sciences must inevitably face the issue of complexity, we may well look to cybernetics for a modeling language such as Boolean networks which can manageably describe parallel processing biological systems and provide a framework for the growing accumulation of data. We finally discuss experimental strategies and database systems that will enable mapping of genetic networks. The synthesis of these approaches holds an immense potential for new discoveries on the intimate nature of genetic networks, bringing us closer to an understanding of complex molecular physiological processes like brain development, and intractable medical problems of immediate importance, such as neurodegenerative disorders, cancer, and a variety of genetic diseases.

365 citations

Proceedings Article
01 Jan 1998
TL;DR: This work presents a strategy for the analysis for large-scale quantitative gene expression measurement data from time course experiments that takes advantage of cluster analysis and graphical visualization methods to reveal correlated patterns of gene expression from time series data.
Abstract: The discovery of any new gene requires an analysis of the expression context for that gene. Now that the cDNA and genomic sequencing projects are progressing at such a rapid rate, high throughput gene expression screening approaches are beginning to appear to take advantage of that data. We present a strategy for the analysis for large-scale quantitative gene expression measurement data from time course experiments. Our approach takes advantage of cluster analysis and graphical visualization methods to reveal correlated patterns of gene expression from time series data. The coherence of these patterns suggests an order that conforms to a notion of shared pathways and control processes that can be experimentally verified.

221 citations

Frequently Asked Questions (18)
Q1. What are the two techniques that can be used to construct the model?

DNA arrays and mass spectrometry have emerged as powerful techniques that are capable of profiling RNA and protein expression at a whole-genome level.

An mRNA can be translated into one or multiple copies of corresponding proteins, which can further change the transcription of other genes. 

One of the most studied models is the Boolean Network, where a gene has one of only two states (ON and OFF), and the state is determined by a boolean function of the states of some other genes. 

The gene expression system has to be a stable system since an exponential or a polynomial growth rate of a gene or a protein is unlikely to happen. 

Somogyi and Sniegoski [2] showed that Boolean networks have features similar to those in biological systems, such as global complex behavior, self-organization, stability, redundancy, and periodicity.

Conceivably within a few years, a large amount of expression data will be produced regularly as the cost of such experiments diminishes. 

The system is unstable if there exists a positive eigenvalue in $\lambda$, because the term $q_{ij}(t)e^{\lambda_j t}$ is an exponential function if $\lambda_j$ has a positive value.

In this paper, the authors propose a linear differential equation model for gene expression and two algorithms to solve the differential equations.

The authors assume the translation mechanism is relatively stable (at least for a short time), so the feedback from proteins to mRNAs has no effect.

Chen et al. [6] transferred experimental data into a gene regulation graph and imposed optimization constraints to infer the true regulation by eliminating the errors in the graph.

$n$: the number of genes in the genome; $r$: mRNA concentrations, $n$-dimensional vector-valued functions of $t$; $p$: protein concentrations, $n$-dimensional vector-valued functions of $t$; $f(p)$: transcription functions, $n$-dimensional vector polynomials on $p$; $L$: translational constants, $n \times n$ non-degenerate diagonal matrix; $V$: degradation rates of mRNAs, $n \times n$ non-degenerate diagonal matrix; $U$: degradation rates of proteins, $n \times n$ non-degenerate diagonal matrix. The change in mRNA concentrations ($dr/dt$) equals the transcription ($f(p)$) minus the degradation ($Vr$), and similarly, the change in protein concentrations ($dp/dt$) equals the translation ($Lr$) minus the degradation ($Up$).

The transcription matrix $C$ in Model 1 represents gene regulatory networks: $c_{ij} \neq 0$ indicates gene $j$ is a regulator for the transcription of gene $i$, and $c_{ij} = 0$ indicates gene $j$ is not a regulator for gene $i$.

On the other hand, since the DNA sequence is unchanged, the transcription is mostly determined by the amounts of transcription proteins. 

The other approach of MWSLE assumes the number of regulators of a gene is a small constant, but the actual number may be much larger than expected and the solution may be intractable computationally. 

The authors will assume that they obtain a set of time-series samples $x(t_0), x(t_1), \ldots, x(t_k)$, where $x$ includes both mRNA and protein concentrations.

The final equation is $\frac{d^2p}{dt^2} = (-LVL^{-1} - U)\frac{dp}{dt} + (-LVL^{-1}U + LC)p$ (11). Here, $L$ is a non-degenerate diagonal matrix and its inverse $L^{-1}$ exists.
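The second-order protein equation can be checked by eliminating $r$ from Equation 2. The steps below are a reconstruction from the definitions of Model 1, offered as a sanity check rather than as the authors' own derivation.

```latex
\begin{aligned}
r &= L^{-1}\left(\frac{dp}{dt} + Up\right)
  && \text{from } \frac{dp}{dt} = Lr - Up \\
\frac{d^2p}{dt^2} &= L\frac{dr}{dt} - U\frac{dp}{dt}
  = L\,(Cp - Vr) - U\frac{dp}{dt}
  && \text{using } \frac{dr}{dt} = Cp - Vr \\
&= LCp - LVL^{-1}\left(\frac{dp}{dt} + Up\right) - U\frac{dp}{dt} \\
&= \left(-LVL^{-1} - U\right)\frac{dp}{dt} + \left(-LVL^{-1}U + LC\right)p
\end{aligned}
```

The inverse $L^{-1}$ is needed only in the first step, which is why the non-degeneracy of $L$ matters here.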

It is well-known that the dynamic system in Model 1 has the following solution: Theorem 1. The solution to Model 1 is of the form $x(t) = Q(t)e^{\lambda t}$ (3), where $Q(t) = \{q_{ij}(t)\}$ satisfies $\sum_{j=1}^{2n} \deg(q_{ij}(t)) + 1 \le 2n$ for $i = 1, 2, \ldots, 2n$. $Q(t)$ is a $2n \times 2n$ matrix whose elements are polynomial functions of $t$, and $\deg()$ returns the degree of a polynomial function.

The solutions to Model 4 are of the following form: $(r, p)^T = Q(t)e^{\lambda t}$, where $\lambda$ are eigenvalues of $S$, and $Q(t)$ is a matrix whose elements are polynomials on $t$.