
Incorporating Heuristics in a Swarm Intelligence Framework for Inferring Gene Regulatory Networks from Gene Expression Time Series

22 Sep 2008-pp 323-330
TL;DR: This paper implements an ant system to generate candidate network structures using a particle swarm optimization algorithm, and extends this approach by incorporating domain-specific heuristics to the ant system, as a mechanism that has the potential to bias the pheromone amplification effect towards biologically plausible relationships.
Abstract: In this paper, we address the problem of reverse-engineering a gene regulatory network from gene expression time series. We approach the problem by implementing an ant system to generate candidate network structures. The quality of a candidate structure is evaluated using a particle swarm optimization algorithm that tunes the parameters of the corresponding model, by minimizing the error between the actual time series and the trained model's output. We extend this approach by incorporating domain-specific heuristics to the ant system, as a mechanism that has the potential to bias the pheromone amplification effect towards biologically plausible relationships. We apply the method to a subset of genes from a real world data set and report on the results.

Summary (2 min read)

1 Introduction

  • Certain genes code for special proteins called transcription factors, which are responsible for regulating the expression of other genes.
  • The authors model a GRN as a graph, upon which the ant colony optimization (ACO) meta-heuristic is implemented for the selection of putative GRN architectures.
  • The selected structure is then modelled as a recurrent neural network (RNN), whose parameters (weights and bias terms) are optimized using particle swarm optimization (PSO), so as to minimize the error between the model’s output and the actual time series.
  • In the next section, the authors present an overview of existing approaches to the problem of GRN inference from time course gene expression data.

2 Existing Approaches

  • The earliest approaches to the problem of inferring gene relationships from time course gene expression data were cluster analysis methods, mostly based on global correlation metrics such as the Pearson correlation coefficient and mutual information, that extracted co-regulation information out of co-expressed gene clusters [5][6].
  • Nevertheless, cluster analysis is still useful, primarily as a technique to reduce the search space and improve the performance of algorithms.
  • The probabilistic nature of dynamic Bayesian networks lets them represent noisy, stochastic processes well, which makes them good candidates for addressing the problem of inferring gene regulatory networks [9].
  • Ressom et al. [4] implement a swarm intelligence framework where an ant system, driven only by pheromone amplification, is used for the selection of putative network structures.
  • After a structure has been formed, the corresponding model (RNN) is optimized using PSO, in order to evaluate the quality of the selected structure.

3 Methods

  • Network architectures are constructed using the ACO meta-heuristic [14], whereby artificial ants navigate a graph of N nodes, where N is the number of genes in the time series.
  • After the threshold of maximum allowed PSO iterations has been reached, the minimum achieved error $\epsilon(w_S)$ is returned to the ACO algorithm as the quality of the selected structure S.
  • These expression changes in a gene’s temporal profile are encoded as ‘events’, by calculating the slope of the expression profile at every time interval and classifying it as either ‘R’ (rising), ‘F’ (falling) or ‘C’ (constant).
  • The inhibitory relationship is scored by complementing the target’s event string: ‘R’s are swapped with ‘F’s, while ‘C’s remain intact.

4 Results

  • The authors selected 5 cyclin genes that are known to be involved in cell cycle regulation, from the S. cerevisiae data set published in Spellman et al. [18], for the purpose of comparing their results to those of Ressom et al. [4].
  • The yeast data set contains multiple time series from the yeast cell cycle; the authors chose the cdc15 time series, consisting of 24 time points (more than the others).
  • The authors performed 10 such experiments and recorded the number of times each edge was selected.
  • The incorporation of the selected heuristic metric does not seem to influence structure selection in a decisive manner.

5 Further Work

  • The early results presented in this paper form part of an ongoing study into a swarm intelligence perspective on the problem of reverse-engineering gene regulatory networks.
  • The proposed framework allows for the incorporation of an arbitrary number of problem-specific heuristics, perhaps with an appropriately defined weighting scheme, to a model-based optimization approach.
  • Furthermore, the authors note that their experiments have used a hand-picked subset of temporal gene expression profiles.
  • An investigation of the algorithm’s scalability is necessary, particularly when considering the full set of genes, whose expression levels are captured in a real world data set.


Incorporating heuristics in a swarm intelligence
framework for inferring gene regulatory
networks from gene expression time series
Kyriakos Kentzoglanakis, Matthew Poole, and Carl Adams
School of Computing
University of Portsmouth
Portsmouth, UK
kyriakos.kentzoglanakis@port.ac.uk
matthew.poole@port.ac.uk
carl.adams@port.ac.uk
Abstract. In this paper, we address the problem of reverse-engineering
a gene regulatory network from gene expression time series. We approach
the problem by implementing an ant system to generate candidate network
structures. The quality of a candidate structure is evaluated using
a particle swarm optimization algorithm that tunes the parameters of
the corresponding model, by minimizing the error between the actual
time series and the trained model’s output. We extend this approach by
incorporating domain-specific heuristics to the ant system, as a
mechanism that has the potential to bias the pheromone amplification effect
towards biologically plausible relationships. We apply the method to a
subset of genes from a real world data set and report on the results.
1 Introduction
Gene expression is the process by which a gene’s DNA sequence is converted
through a series of steps into a functional product: the protein. This cellular
process constitutes the central dogma of molecular biology, i.e. that genes code
for proteins. During this process, DNA is first transcribed (copied) to an in-
termediate macromolecular form, the mRNA (messenger RNA), which is then
translated to protein. Proteins are involved in essential functions of a living or-
ganism, including transcription, the catalysis of chemical reactions, cell signalling
etc.
Certain genes code for special proteins called transcription factors, which are
responsible for regulating the expression of other genes (targets). Transcription
factors bind a cis-regulatory site in the promoter region of the target gene, thus
inducing a change in the target’s rate of transcription. The nature of change
specifies this effect as either activatory, in case of an increase in the target’s rate
of transcription, or repressive (inhibitory) in case of a decrease [1].
A gene regulatory network (GRN) is a complex network of causal relation-
ships between genes, where connections represent regulatory interactions be-
tween activators or repressors and targets.

With the advent of DNA microarray technology that measures the mRNA
levels of thousands of targets, it has become possible to observe such complex
biological processes by taking snapshots of the cellular state and capturing the
expression profiles of thousands of genes simultaneously. Gene expression data
can either be static, with gene profiles from different organisms, each typically
characterized by a class value, or dynamic in the form of gene expression time
series from the same organism.
The problem of reverse-engineering GRNs from gene expression data is a
major issue in systems biology [2]. A principal obstacle is the relative insufficiency
of observations (typically tens or a few hundreds) compared to the number of
genes measured (in the order of thousands or a few tens of thousands), the
so-called curse of dimensionality.
Additionally, the common practice of validating the biological plausibility
of inferred causal relationships by consulting the relevant literature, albeit un-
avoidable, is controversial because, in the absence of such experimental evidence
for a putative connection, there is no apparent method of classifying it either as
a previously unknown interaction or as just a spurious edge [3].
In this paper, we describe a swarm intelligence approach to the problem of
reverse-engineering GRNs from gene expression time series. We model a GRN as
a graph, upon which the ant colony optimization (ACO) meta-heuristic is imple-
mented for the selection of putative GRN architectures. The selected structure is
then modelled as a recurrent neural network (RNN), whose parameters (weights
and bias terms) are optimized using particle swarm optimization (PSO), so as
to minimize the error between the model’s output and the actual time series.
Our approach extends the work by Ressom et al. [4], first by changing the way
candidate architectures are constructed by individual artificial ants and, second,
by introducing a heuristic metric with the intention to bias the probabilistic edge
selection process towards biologically plausible relationships.
In the next section, we present an overview of existing approaches to the
problem of GRN inference from time course gene expression data. In section
3, the proposed framework is outlined by describing its components and their
interrelationships. In section 4, we report on the results of applying the method
to a subset of known genes from the yeast gene expression data set and we
discuss some of the issues that emerged, before the paper’s conclusion in section
5.
2 Existing Approaches
The earliest approaches to the problem of inferring gene relationships from time
course gene expression data were cluster analysis methods, mostly based on
global correlation metrics such as the Pearson correlation coefficient and mutual
information, that extracted co-regulation information out of co-expressed gene
clusters [5][6]. These pioneering, model-free methods essentially group genes
according to their expression levels, providing an insight into the functionality of
unknown genes based on the cluster in which they belong. However, they do not

take the temporal nature of data into consideration and do not assign regulatory
roles to genes, since, given two genes that are co-expressed (have similar expres-
sion), it is not clear which regulates the other. Nevertheless, cluster analysis is
still useful, primarily as a technique to reduce the search space and improve the
performance of algorithms.
Model-based methods, on the other hand, operate by assuming the existence
of a model that represents the gene regulatory network and attempt to train
this model based on the available artificial or experimental data. In essence,
they attempt to reconstruct the architecture by reproducing the system dynam-
ics. Such models include Boolean networks, Bayesian networks, linear additive
models, systems of differential equations, power law systems etc. [7]
In Boolean networks, the state of a node at one time point is a boolean
function of the states of K other nodes at the previous time point. As such,
they constitute binary idealizations of genetic network architectures that, while
succeeding in the simulation and analysis of global dynamics [8], seem to suffer
from the problem of information loss during data binarization.
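The Boolean network update described above can be illustrated with a minimal sketch. The three rules below are hypothetical and chosen only to show a synchronous update in which each node depends on K = 2 other nodes; they do not come from any of the cited works.

```python
def boolean_step(state, rules):
    """Synchronously compute every node's next state from the previous state."""
    return tuple(rule(state) for rule in rules)


# Hypothetical 3-node network; each rule reads K = 2 other nodes.
rules = [
    lambda s: s[1] and s[2],        # node 0 is on if nodes 1 and 2 are on
    lambda s: s[0] or s[2],         # node 1 is on if node 0 or node 2 is on
    lambda s: (not s[0]) and s[1],  # node 2 needs node 1 on and node 0 off
]

next_state = boolean_step((True, False, True), rules)
```

Because every expression level must first be reduced to on or off, intermediate levels are discarded before the rules are ever applied, which is the information loss mentioned above.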
Dynamic Bayesian networks are models of joint, multivariate probability
distributions that attempt to represent conditional independence relationships
between variables. Their probabilistic nature lets them represent noisy, stochastic
processes well, which makes them good candidates for addressing the
problem of inferring gene regulatory networks [9].
In linear additive (neural) models [10][11], the output of each node is a com-
bination of inputs from all other nodes, a function of the weighted sum of their
expression levels. Zero weights indicate no regulation, positive weights signify
activation, while negative weights signify repression. The assumption of linear-
ity is not a severe one [12], especially if one considers the statistical treatment
of microarray data and the increased levels of noise.
Ressom et al. [4] implement a swarm intelligence framework where an ant
system, driven only by pheromone amplification, is used for the selection of
putative network structures. For each gene (regulator), each artificial ant considers
all $2^n$ regulator-target combinations, where n is the number of genes, for the
construction of a candidate architecture. After a structure has been formed, the
corresponding model (RNN) is optimized using PSO, in order to evaluate the
quality of the selected structure.
Xu et al. [13] deploy a discrete version of PSO for structure selection and a
continuous version for model training. They also discuss the relative difficulty
of reconstructing the correct regulatory network structure over reproducing the
correct dynamics, explaining that there is no unique network to satisfy the data
upon which inference is based. Reconstructing the structure depends upon re-
producing the system dynamics and, therefore, is a problem of higher order.
3 Methods
Our approach uses an ACO implementation, on a graph with nodes representing
genes and directed edges representing regulatory (causal) relationships, to select

putative network architectures, driven by pheromone amplification and heuristic
information, where:
  • pheromone trails are updated according to the ability of the model (RNN)
    that represents the selected structure to reproduce the time series, after
    having been trained using a PSO algorithm.
  • the desirability value for a particular edge is calculated by a suitably defined
    heuristic function.
A candidate gene network structure is represented by a recurrent neural
network model, whose update equation is given by:
$$x_i(t) = f\left(\sum_{j=1}^{N} w_{ij}\, x_j(t-1) + b_i\right) \qquad (1)$$
where $x_i(t)$ is the value (expression level) of node i at time t, $b_i$ a bias term and
weights $w_{ij}$ express the influence of node j to node i, ranging from -1 (gene j
represses gene i) to 1 (gene j activates gene i). A value of 0 signifies no regulation.
f is a nonlinear transfer function, either the logistic or the hyperbolic tangent.
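As a minimal sketch of equation (1), the update can be written as a single matrix-vector product. The function name `rnn_step`, the choice of the hyperbolic tangent as default and the 3-gene weight matrix are illustrative assumptions, not values from the paper.

```python
import numpy as np


def rnn_step(x_prev, W, b, transfer=np.tanh):
    """Apply equation (1) once: x(t) = f(W x(t-1) + b).

    W[i, j] is the influence of gene j on gene i, b[i] the bias of gene i.
    """
    return transfer(W @ x_prev + b)


# Hypothetical 3-gene example: gene 0 activates gene 1, gene 1 represses gene 2.
W = np.array([[0.0, 0.0, 0.0],
              [0.8, 0.0, 0.0],
              [0.0, -0.6, 0.0]])
b = np.zeros(3)
x0 = np.array([1.0, 0.0, 0.0])
x1 = rnn_step(x0, W, b)
```

Storing the weights with the target as the row index and the regulator as the column index matches the subscript order of $w_{ij}$ in equation (1).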
Network architectures are constructed using the ACO meta-heuristic [14],
whereby artificial ants navigate a graph of N nodes, where N is the number of
genes in the time series. Each artificial ant probabilistically selects K regula-
tor nodes for each target node in the graph, resulting in a candidate network
structure $S = \{e_{ji}\}$ of $NK$ connections. The parameter K reflects the fact that
gene networks are sparse and that a gene is regulated by only a handful of other
genes. An edge $e_{ji}$ represents a regulatory relationship from node j to node i.
The probability of selection of node j as a potential regulator of node i is given
by:
$$p_{ij} = \frac{\tau_{ij}^{\alpha}\, \eta_{ij}^{\beta}}{\sum_{j=1}^{N} \tau_{ij}^{\alpha}\, \eta_{ij}^{\beta}} \qquad (2)$$
where $\tau_{ij}$ is the pheromone value of edge $e_{ji}$, $\eta_{ij}$ is the selection desirability
of edge $e_{ji}$ based on a suitably defined heuristic function and α, β are their
respective relative influences.
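The construction step can be sketched as follows, under a few assumptions that the text does not settle: the K regulators of a target are drawn without replacement, self-regulation is not excluded, and the function names and default values of α and β are illustrative only.

```python
import numpy as np


def selection_probabilities(tau, eta, alpha=1.0, beta=1.0):
    """Row i holds p_ij over all candidate regulators j, as in equation (2)."""
    scores = (tau ** alpha) * (eta ** beta)
    return scores / scores.sum(axis=1, keepdims=True)


def construct_structure(tau, eta, K, rng, alpha=1.0, beta=1.0):
    """One ant's walk: pick K distinct regulators j for every target i.

    Returns a set of (j, i) pairs, each standing for an edge e_ji.
    Sampling without replacement is an assumption made for this sketch.
    """
    p = selection_probabilities(tau, eta, alpha, beta)
    N = tau.shape[0]
    return {(int(j), i)
            for i in range(N)
            for j in rng.choice(N, size=K, replace=False, p=p[i])}


rng = np.random.default_rng(0)
tau = np.ones((5, 5))   # uniform initial pheromone (hypothetical)
eta = np.ones((5, 5))   # uninformative heuristic (hypothetical)
S = construct_structure(tau, eta, K=2, rng=rng)
```

With N = 5 and K = 2 the resulting structure always has NK = 10 edges, two per target, which is the sparsity constraint that K is meant to express.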
After a candidate structure S has been constructed, its quality is assessed by
tuning the corresponding model’s parameters in order to compare its predicted
output with the actual time series. The synaptic weights of the edges that are
not part of the selected structure are locked to 0.
Optimization of the model’s parameters is performed using a PSO algorithm
[15], where each particle’s position is encoded as a vector $w_S$ of size $N(K + 1)$
that contains the weights of the selected edges, as well as the bias terms. The
quality of a particle’s position is determined by calculating the MSE between
the predicted model output and the actual time series:
$$\epsilon(w_S) = \frac{1}{TN} \sum_{t=1}^{T} \sum_{i=1}^{N} \left[x_i(t) - x_i^{w_S}(t)\right]^2 \qquad (3)$$
where T is the number of available time points, N is the number of genes, $x_i(t)$
is the actual expression level of the i-th gene at time t and $x_i^{w_S}(t)$ is the predicted
expression level of the i-th gene at time t. The predicted time series are calculated
by setting up the model using $w_S$ and running it using each state of the actual
time series, in order to obtain the next state of the predicted time series.
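The evaluation of a particle can be sketched as below. Several details are assumptions made for illustration: a structure is a set of (j, i) pairs for edges from regulator j to target i, the particle stores the edge weights in sorted edge order followed by the N bias terms, and the mean is taken over the T - 1 transitions for which a preceding actual state exists.

```python
import numpy as np


def decode(w_S, structure, N):
    """Unpack a particle position into a weight matrix W and a bias vector b.

    Edges outside the structure keep a weight of 0. The vector layout
    (sorted edge weights, then biases) is an assumption of this sketch.
    """
    edges = sorted(structure)
    W = np.zeros((N, N))
    for (j, i), w in zip(edges, w_S[:len(edges)]):
        W[i, j] = w
    return W, np.asarray(w_S[len(edges):], dtype=float)


def mse(w_S, structure, series, transfer=np.tanh):
    """Mean squared one-step-ahead prediction error, in the spirit of (3).

    series has shape (T, N); each actual state predicts the next state.
    """
    T, N = series.shape
    W, b = decode(w_S, structure, N)
    predicted = transfer(series[:-1] @ W.T + b)
    return float(np.mean((series[1:] - predicted) ** 2))


structure = {(0, 1), (1, 0)}          # hypothetical 2-gene structure
w_S = [0.5, -0.5, 0.0, 0.0]           # two edge weights, two biases
error = mse(w_S, structure, np.zeros((4, 2)))
```

Feeding each actual state into the model, rather than the model's own previous prediction, keeps errors from accumulating along the series.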
After the threshold of maximum allowed PSO iterations has been reached, the
minimum achieved error $\epsilon(w_S)$ is returned to the ACO algorithm as the quality
of the selected structure S. The pheromone matrix is then updated according
to:
$$\tau_{ij} = \frac{1}{\epsilon(w_S)}, \qquad e_{ji} \in S \qquad (4)$$
The incorporation of heuristics to probabilistic structure selection offers a
way of enriching a domain-agnostic procedure with problem-specific insights. The
heuristic factor $\eta_{ij}$ from equation (2) can be defined as a function
$\eta : \mathbb{N} \times \mathbb{N} \to \mathbb{R}$
that maps a pair (i, j) to a score that reflects the strength and nature of gene j’s
influence on gene i. In this context, strength means the likelihood of regulation
and nature means the type of regulation (activation or repression).
Table 1. Scoring matrix for event matching. The score of a pair of symbols is a function of the
time lag dt between two events. S(dt) is a linearly decreasing function with 0 < S(dt) < 1, so that
the bigger the time lag, the less likely a causal effect is to be assumed. In case of a negative dt, the
match is assigned a maximum penalty. Parameters a and b range from 0 to 1 and their role is to
emphasize particular matching forms, based on biological arguments [16].

         R         C    F
    R    S(dt)     0    bS(dt)
    C    0         0    0
    F    bS(dt)    0    aS(dt)

For the purpose of demonstrating our approach, we are using a heuristic
proposed by Kwon et al. [16]. They hypothesize that if a rise in the expression
of gene A is followed by a rise in the expression of gene B, then this indicates
that gene A potentially activates gene B. Conversely, if a rise in the expression
of gene A is followed by a fall in the expression of gene B, then gene A is a
potential repressor for gene B.
These expression changes in a gene’s temporal profile are encoded as ‘events’,
by calculating the slope of the expression profile at every time interval and
classifying it as either ‘R’ (rising), ‘F’ (falling) or ‘C’ (constant). A variation of
the Needleman-Wunsch algorithm for sequence alignment [17] is then used to
determine the best possible alignment for a pair of event strings, by using the
event scoring matrix shown in Table 1.
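The event encoding step can be sketched as follows. The tolerance `tol` that decides when a slope counts as constant, and the unit spacing between time points, are assumptions made for illustration; the text does not state how the three classes are separated.

```python
def encode_events(profile, tol=0.05):
    """Turn an expression profile into a string of 'R', 'F' and 'C' events.

    One event is produced per time interval, from the slope between
    consecutive measurements (unit time step assumed).
    """
    events = []
    for previous, current in zip(profile, profile[1:]):
        slope = current - previous
        if slope > tol:
            events.append("R")   # rising
        elif slope < -tol:
            events.append("F")   # falling
        else:
            events.append("C")   # constant within the tolerance
    return "".join(events)


events = encode_events([0.0, 0.5, 0.5, 0.1])
```

A profile of T time points yields an event string of length T - 1, and it is these strings, not the raw expression levels, that are aligned.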
Given the expression levels of two genes, one of which is assumed to be the
regulator and the other the target, the algorithm first calculates the score for
the presumed activatory relationship and then for the inhibitory relationship,
by complementing the event string of the target. This is done by swapping ‘R’s
with ‘F’s, while ‘C’s remain intact. The maximum score of the two is returned

References
Proceedings ArticleDOI
06 Aug 2002
TL;DR: A concept for the optimization of nonlinear functions using particle swarm methodology is introduced, and the evolution of several paradigms is outlined, and an implementation of one of the paradigm is discussed.
Abstract: A concept for the optimization of nonlinear functions using particle swarm methodology is introduced. The evolution of several paradigms is outlined, and an implementation of one of the paradigms is discussed. Benchmark testing of the paradigm is described, and applications, including nonlinear function optimization and neural network training, are proposed. The relationships between particle swarm optimization and both artificial life and genetic algorithms are described.

35,104 citations


"Incorporating Heuristics in a Swarm..." refers methods in this paper

  • ...Optimization of the model’s parameters is performed using a PSO algorithm [15], where each particle’s position is encoded as a vector wS of size N(K + 1) that contains the weights of the selected edges, as well as the bias terms....

    [...]

Journal ArticleDOI
TL;DR: A system of cluster analysis for genome-wide expression data from DNA microarray hybridization is described that uses standard statistical algorithms to arrange genes according to similarity in pattern of gene expression, finding in the budding yeast Saccharomyces cerevisiae that clustering gene expression data groups together efficiently genes of known similar function.
Abstract: A system of cluster analysis for genome-wide expression data from DNA microarray hybridization is de- scribed that uses standard statistical algorithms to arrange genes according to similarity in pattern of gene expression. The output is displayed graphically, conveying the clustering and the underlying expression data simultaneously in a form intuitive for biologists. We have found in the budding yeast Saccharomyces cerevisiae that clustering gene expression data groups together efficiently genes of known similar function, and we find a similar tendency in human data. Thus patterns seen in genome-wide expression experiments can be inter- preted as indications of the status of cellular processes. Also, coexpression of genes of known function with poorly charac- terized or novel genes may provide a simple means of gaining leads to the functions of many genes for which information is not available currently.

16,371 citations


"Incorporating Heuristics in a Swarm..." refers background in this paper

  • ..., that extracted co-regulation information out of co-expressed gene clusters [5][6]....

    [...]

Journal ArticleDOI
TL;DR: A computer adaptable method for finding similarities in the amino acid sequences of two proteins has been developed and it is possible to determine whether significant homology exists between the proteins to trace their possible evolutionary development.

11,844 citations


"Incorporating Heuristics in a Swarm..." refers methods in this paper

  • ...A variation of the Needleman-Wunsch algorithm for sequence alignment [17] is then used to determine the best possible alignment for a pair of event strings, by using the event scoring matrix shown in Table 1....

    [...]

BookDOI
01 Jan 1999
TL;DR: This chapter discusses Ant Foraging Behavior, Combinatorial Optimization, and Routing in Communications Networks, and its application to Data Analysis and Graph Partitioning.
Abstract: 1. Introduction 2. Ant Foraging Behavior, Combinatorial Optimization, and Routing in Communications Networks 3. Division of Labor and Task Allocation 4. Cemetery Organization, Brood Sorting, Data Analysis, and Graph Partitioning 5. Self-Organization and Templates: Application to Data Analysis and Graph Partitioning 6. Nest Building and Self-Assembling 7. Cooperative Transport by Insects and Robots 8. Epilogue

5,822 citations


"Incorporating Heuristics in a Swarm..." refers methods in this paper

  • ...Network architectures are constructed using the ACO meta-heuristic [14], whereby artificial ants navigate a graph of N nodes, where N is the number of genes in the time series....

    [...]

Journal ArticleDOI
TL;DR: A comprehensive catalog of yeast genes whose transcript levels vary periodically within the cell cycle is created, and it is found that the mRNA levels of more than half of these 800 genes respond to one or both of these cyclins.
Abstract: We sought to create a comprehensive catalog of yeast genes whose transcript levels vary periodically within the cell cycle. To this end, we used DNA microarrays and samples from yeast cultures sync...

5,176 citations

Frequently Asked Questions (12)
Q1. What have the authors contributed in "Incorporating Heuristics in a Swarm Intelligence Framework for Inferring Gene Regulatory Networks from Gene Expression Time Series"?

In this paper, the authors address the problem of reverse-engineering a gene regulatory network from gene expression time series. The authors approach the problem by implementing an ant system to generate candidate network structures. The quality of a candidate structure is evaluated using a particle swarm optimization algorithm that tunes the parameters of the corresponding model, by minimizing the error between the actual time series and the trained model's output. The authors extend this approach by incorporating domain-specific heuristics to the ant system, as a mechanism that has the potential to bias the pheromone amplification effect towards biologically plausible relationships. The authors apply the method to a subset of genes from a real world data set and report on the results.

Dynamic Bayesian networks are models of joint, multivariate probability distributions that attempt to represent conditional independence relationships between variables. 

After a candidate structure S has been constructed, its quality is assessed by tuning the corresponding model’s parameters in order to compare its predicted output with the actual time series. 

Their strength in representing noisy, stochastic processes, which stems from their probabilistic nature, makes them good candidates for addressing the problem of inferring gene regulatory networks [9].

Such models include Boolean networks, Bayesian networks, linear additive models, systems of differential equations, power law systems, etc. [7].

The proposed framework allows for the incorporation of an arbitrary number of problem-specific heuristics, perhaps with an appropriately defined weighting scheme, to a model-based optimization approach. 

Their approach extends the work by Ressom et al. [4], first by changing the way candidate architectures are constructed by individual artificial ants and, second, by introducing a heuristic metric with the intention to bias the probabilistic edge selection process towards biologically plausible relationships. 

After the threshold of maximum allowed PSO iterations has been reached, the minimum achieved error $\epsilon(w_S)$ is returned to the ACO algorithm as the quality of the selected structure $S$. The pheromone matrix is then updated according to:

$$\tau_{ij} = \frac{1}{\epsilon(w_S)} \quad \forall e_{ji} \in S \qquad (4)$$

The incorporation of heuristics to probabilistic structure selection offers a way of enriching a domain-agnostic procedure with problem-specific insights.
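The pheromone update in Eq. (4) can be illustrated with a minimal Python sketch. The function name `update_pheromone`, the nested-list pheromone matrix, and the representation of a structure as a set of `(j, i)` pairs are assumptions made for illustration; they are not taken from the paper. The sketch also assumes the error is strictly positive.

```python
def update_pheromone(tau, structure, error):
    """Set tau[i][j] = 1 / error for every edge e_ji in the structure.

    tau       -- N x N nested list of pheromone values (hypothetical layout)
    structure -- iterable of (j, i) pairs, meaning node j regulates node i
    error     -- minimum error returned by the parameter optimizer (> 0)
    """
    if error <= 0:
        raise ValueError("error must be strictly positive")
    for j, i in structure:
        tau[i][j] = 1.0 / error  # lower error gives a larger pheromone value
    return tau


tau = [[1.0, 1.0], [1.0, 1.0]]
update_pheromone(tau, {(1, 0)}, error=0.5)
print(tau)
```

Only edges that belong to the evaluated structure are touched, so structures with a lower error leave a stronger trail on their edges.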

These expression changes in a gene’s temporal profile are encoded as ‘events’, by calculating the slope of the expression profile at every time interval and classifying it as either ‘R’ (rising), ‘F’ (falling) or ‘C’ (constant). 
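The event encoding described above might look like the following sketch. The slope threshold `tol` and the assumption of unit spacing between time points are illustrative choices; the excerpt does not state how a slope is judged to be constant.

```python
def encode_events(profile, tol=0.05):
    """Encode a temporal expression profile as a string of R/F/C events.

    tol is a hypothetical threshold: slopes with magnitude at most tol
    are classified as constant. Time points are assumed equally spaced.
    """
    events = []
    for prev, curr in zip(profile, profile[1:]):
        slope = curr - prev  # slope over one unit time interval
        if slope > tol:
            events.append("R")  # rising
        elif slope < -tol:
            events.append("F")  # falling
        else:
            events.append("C")  # constant
    return "".join(events)


print(encode_events([0.1, 0.5, 0.52, 0.2, 0.2]))  # RCFC
```

A profile with T time points yields an event string of length T - 1, one event per interval.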

The probability of selection of node $j$ as a potential regulator of node $i$ is given by:

$$p_{ij} = \frac{\tau_{ij}^{\alpha}\,\eta_{ij}^{\beta}}{\sum_{j=1}^{N} \tau_{ij}^{\alpha}\,\eta_{ij}^{\beta}} \qquad (2)$$

where $\tau_{ij}$ is the pheromone value of edge $e_{ji}$, $\eta_{ij}$ is the selection desirability of edge $e_{ji}$ based on a suitably defined heuristic function, and $\alpha$, $\beta$ are their respective relative influences.
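As a worked illustration of Eq. (2), the sketch below computes the selection probabilities for a single target node. The function name and the list-based inputs are assumptions for illustration, and the numeric values are made up.

```python
def selection_probabilities(tau_i, eta_i, alpha=1.0, beta=1.0):
    """Return p_ij for one target node i over all candidate regulators j.

    tau_i -- pheromone values tau_ij for j = 1..N
    eta_i -- heuristic desirabilities eta_ij for j = 1..N
    alpha, beta -- relative influence of pheromone and heuristic
    """
    weights = [(t ** alpha) * (e ** beta) for t, e in zip(tau_i, eta_i)]
    total = sum(weights)  # normalizing denominator of Eq. (2)
    return [w / total for w in weights]


# Made-up values: the weights are 1, 2, 2, so the probabilities are 0.2, 0.4, 0.4.
print(selection_probabilities([1.0, 2.0, 1.0], [1.0, 1.0, 2.0]))
```

Setting `beta` to 0 removes the influence of the heuristic, so selection is then driven by pheromone alone.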

A variation of the Needleman-Wunsch algorithm for sequence alignment [17] is then used to determine the best possible alignment for a pair of event strings, by using the event scoring matrix shown in Table 1. 
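For orientation, here is a sketch of the standard Needleman-Wunsch global alignment score applied to event strings. It is not the authors' variation, and because Table 1 is not reproduced here, the scoring values and the gap penalty below are placeholders chosen purely for illustration.

```python
def nw_score(a, b, score, gap=-1):
    """Standard Needleman-Wunsch global alignment score of strings a and b.

    score -- dict mapping (event, event) pairs to a match/mismatch score
    gap   -- linear gap penalty (placeholder value, not from the paper)
    """
    rows, cols = len(a) + 1, len(b) + 1
    dp = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dp[i][0] = i * gap  # prefix of a aligned against gaps only
    for j in range(1, cols):
        dp[0][j] = j * gap  # prefix of b aligned against gaps only
    for i in range(1, rows):
        for j in range(1, cols):
            dp[i][j] = max(
                dp[i - 1][j - 1] + score[(a[i - 1], b[j - 1])],  # align events
                dp[i - 1][j] + gap,  # gap in b
                dp[i][j - 1] + gap,  # gap in a
            )
    return dp[-1][-1]


# Placeholder scoring matrix (NOT Table 1): +1 for equal events, -1 otherwise.
EVENTS = "RFC"
PLACEHOLDER_SCORES = {(x, y): (1 if x == y else -1) for x in EVENTS for y in EVENTS}

print(nw_score("RFC", "RC", PLACEHOLDER_SCORES))
```

Swapping in the actual event scoring matrix only requires replacing `PLACEHOLDER_SCORES`; the dynamic-programming recurrence stays the same.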

Each artificial ant probabilistically selects $K$ regulator nodes for each target node in the graph, resulting in a candidate network structure $S = \{e_{ji}\}$ of $NK$ connections.
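The construction step can be sketched as follows. The sketch assumes that the K regulators of a target are drawn without replacement, which is one way to obtain exactly NK distinct connections, and that a node may be chosen as its own regulator; neither detail is confirmed by the excerpt. The function name and data layout are hypothetical.

```python
import random


def construct_structure(prob, k, rng):
    """Build one candidate structure as a set of (j, i) edges (j regulates i).

    prob -- N x N nested list, prob[i][j] is the selection probability of j for i
    k    -- number of regulators per target node (must not exceed N)
    rng  -- random.Random instance, passed in for reproducibility
    """
    n = len(prob)
    structure = set()
    for i in range(n):
        candidates = list(range(n))
        weights = [prob[i][j] for j in candidates]
        for _ in range(k):
            # Weighted draw among the remaining candidates (no replacement).
            pick = rng.choices(range(len(candidates)), weights=weights)[0]
            structure.add((candidates.pop(pick), i))
            weights.pop(pick)
    return structure


n, k = 4, 2
uniform = [[1.0 / n] * n for _ in range(n)]
edges = construct_structure(uniform, k, random.Random(0))
print(len(edges))  # 8, i.e. N * K connections
```

Passing the random generator explicitly keeps repeated runs comparable when the selection probabilities change between iterations.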