
Showing papers in "PLOS Computational Biology in 2008"


Journal ArticleDOI
TL;DR: This study provides new evidence that there is disrupted organization of functional brain networks in AD, and suggests that these network measures may be useful as an imaging-based biomarker to distinguish AD from healthy aging.
Abstract: Functional brain networks detected in task-free (“resting-state”) functional magnetic resonance imaging (fMRI) have a small-world architecture that reflects a robust functional organization of the brain. Here, we examined whether this functional organization is disrupted in Alzheimer's disease (AD). Task-free fMRI data from 21 AD subjects and 18 age-matched controls were obtained. Wavelet analysis was applied to the fMRI data to compute frequency-dependent correlation matrices. Correlation matrices were thresholded to create 90-node undirected graphs of functional brain networks. Small-world metrics (characteristic path length and clustering coefficient) were computed using graph analytical methods. In the low frequency interval 0.01 to 0.05 Hz, functional brain networks in controls showed small-world organization of brain activity, characterized by a high clustering coefficient and a low characteristic path length. In contrast, functional brain networks in AD showed loss of small-world properties, characterized by a significantly lower clustering coefficient (p<0.01), indicative of disrupted local connectivity. Clustering coefficients for the left and right hippocampus were significantly lower (p<0.01) in the AD group compared to the control group. Furthermore, the clustering coefficient distinguished AD participants from the controls with a sensitivity of 72% and specificity of 78%. Our study provides new evidence that there is disrupted organization of functional brain networks in AD. Small-world metrics can characterize the functional organization of the brain in AD, and our findings further suggest that these network measures may be useful as an imaging-based biomarker to distinguish AD from healthy aging.
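
The graph pipeline described here is easy to sketch in Python with networkx; in this minimal version, random time series with a shared signal stand in for the wavelet-derived fMRI correlations, and the threshold is an arbitrary illustrative choice rather than the values examined in the paper.

```python
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)
n_regions = 90                        # 90-node parcellation, as in the study
common = rng.standard_normal(200)     # shared signal so that correlations exist
ts = 0.5 * common + rng.standard_normal((n_regions, 200))
corr = np.corrcoef(ts)                # stand-in for the wavelet correlation matrix

threshold = 0.3                       # illustrative; the paper examines a range
adj = (np.abs(corr) > threshold).astype(int)
np.fill_diagonal(adj, 0)

G = nx.from_numpy_array(adj)
C = nx.average_clustering(G)          # clustering coefficient
# characteristic path length, computed on the largest connected component
giant = G.subgraph(max(nx.connected_components(G), key=len))
L = nx.average_shortest_path_length(giant)
print(f"C = {C:.3f}, L = {L:.3f}")
```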

1,086 citations


Journal ArticleDOI
TL;DR: It is argued that elaborating principled and informed models is a prerequisite for grounding empirical neuroscience in a cogent theoretical framework, commensurate with the achievements in the physical sciences.
Abstract: The cortex is a complex system, characterized by its dynamics and architecture, which underlie many functions such as action, perception, learning, language, and cognition. Its structural architecture has been studied for more than a hundred years; however, its dynamics have been addressed much less thoroughly. In this paper, we review and integrate, in a unifying framework, a variety of computational approaches that have been used to characterize the dynamics of the cortex, as evidenced at different levels of measurement. Computational models at different space-time scales help us understand the fundamental mechanisms that underpin neural processes and relate these processes to neuroscience data. Modeling at the single neuron level is necessary because this is the level at which information is exchanged between the computing elements of the brain: the neurons. Mesoscopic models tell us how neural elements interact to yield emergent behavior at the level of microcolumns and cortical columns. Macroscopic models can inform us about whole brain dynamics and interactions between large-scale neural systems such as cortical regions, the thalamus, and brain stem. Each level of description relates uniquely to neuroscience data, from single-unit recordings, through local field potentials to functional magnetic resonance imaging (fMRI), electroencephalogram (EEG), and magnetoencephalogram (MEG). Models of the cortex can establish which types of large-scale neuronal networks can perform computations and characterize their emergent properties. Mean-field and related formulations of dynamics also play an essential and complementary role as forward models that can be inverted given empirical data. This makes dynamic models critical in integrating theory and experiments. We argue that elaborating principled and informed models is a prerequisite for grounding empirical neuroscience in a cogent theoretical framework, commensurate with the achievements in the physical sciences.

986 citations


Journal ArticleDOI
TL;DR: It appears that the ability of MHC class II molecules to bind variable length peptides, which requires the correct assignment of peptide binding cores, is a critical factor limiting the performance of existing prediction tools.
Abstract: The identification of MHC class II restricted peptide epitopes is an important goal in immunological research. A number of computational tools have been developed for this purpose, but there is a lack of large-scale systematic evaluation of their performance. Herein, we used a comprehensive dataset consisting of more than 10,000 previously unpublished MHC-peptide binding affinities, 29 peptide/MHC crystal structures, and 664 peptides experimentally tested for CD4+ T cell responses to systematically evaluate the performances of publicly available MHC class II binding prediction tools. While in selected instances the best tools were associated with AUC values up to 0.86, in general, class II predictions did not perform as well as historically noted for class I predictions. It appears that the ability of MHC class II molecules to bind variable length peptides, which requires the correct assignment of peptide binding cores, is a critical factor limiting the performance of existing prediction tools. To improve performance, we implemented a consensus prediction approach that combines methods with top performances. We show that this consensus approach achieved best overall performance. Finally, we make the large datasets used publicly available as a benchmark to facilitate further development of MHC class II binding peptide prediction methods.
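
A consensus of the kind described can be sketched as a median-rank combination: each tool ranks all candidate peptides, and the consensus score is the median of the per-tool ranks. The scores below are invented stand-ins, not outputs of the predictors evaluated in the paper.

```python
import numpy as np
from scipy.stats import rankdata

# rows: candidate peptides; columns: binding scores from three hypothetical tools
scores = np.array([
    [0.9, 0.7, 0.8],
    [0.2, 0.4, 0.1],
    [0.6, 0.9, 0.7],
])

# rank within each method (1 = best predicted binder), then take the median rank
ranks = np.column_stack([rankdata(-scores[:, j]) for j in range(scores.shape[1])])
consensus = np.median(ranks, axis=1)
print(consensus)   # lower = more confidently predicted to bind
```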

784 citations


Journal ArticleDOI
TL;DR: A general model is described that subsumes many parametric models for continuous data, all of which can be inverted using exactly the same scheme, dynamic expectation maximization; the inversion can be formulated as a simple neural network that may provide a useful metaphor for inference and learning in the brain.
Abstract: This paper describes a general model that subsumes many parametric models for continuous data. The model comprises hidden layers of state-space or dynamic causal models, arranged so that the output of one provides input to another. The ensuing hierarchy furnishes a model for many types of data, of arbitrary complexity. Special cases range from the general linear model for static data to generalised convolution models, with system noise, for nonlinear time-series analysis. Crucially, all of these models can be inverted using exactly the same scheme, namely, dynamic expectation maximization. This means that a single model and optimisation scheme can be used to invert a wide range of models. We present the model and a brief review of its inversion to disclose the relationships among, apparently, diverse generative models of empirical data. We then show that this inversion can be formulated as a simple neural network and may provide a useful metaphor for inference and learning in the brain.

771 citations


Journal ArticleDOI
TL;DR: This work describes conditions when a close relationship exists between network analysis and microarray data analysis techniques, and provides a rough dictionary for translating between the two fields.
Abstract: The merging of network theory and microarray data analysis techniques has spawned a new field: gene coexpression network analysis. While network methods are increasingly used in biology, the network vocabulary of computational biologists tends to be far more limited than that of, say, social network theorists. Here we review and propose several potentially useful network concepts. We take advantage of the relationship between network theory and the field of microarray data analysis to clarify the meaning of and the relationship among network concepts in gene coexpression networks. Network theory offers a wealth of intuitive concepts for describing the pairwise relationships among genes, which are depicted in cluster trees and heat maps. Conversely, microarray data analysis techniques (singular value decomposition, tests of differential expression) can also be used to address difficult problems in network theory. We describe conditions when a close relationship exists between network analysis and microarray data analysis techniques, and provide a rough dictionary for translating between the two fields. Using the angular interpretation of correlations, we provide a geometric interpretation of network theoretic concepts and derive unexpected relationships among them. We use the singular value decomposition of module expression data to characterize approximately factorizable gene coexpression networks, i.e., adjacency matrices that factor into node specific contributions. High and low level views of coexpression networks allow us to study the relationships among modules and among module genes, respectively. We characterize coexpression networks where hub genes are significant with respect to a microarray sample trait and show that the network concept of intramodular connectivity can be interpreted as a fuzzy measure of module membership. We illustrate our results using human, mouse, and yeast microarray gene expression data. The unification of coexpression network methods with traditional data mining methods can inform the application and development of systems biologic methods.
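
Several of the concepts named here translate directly into a few lines of numpy. A minimal sketch on toy expression data: a soft-thresholded correlation adjacency, per-gene connectivity, and an SVD-based module eigengene whose correlation with each gene acts as a fuzzy measure of module membership. The power beta is a conventional choice, not a value from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
expr = rng.standard_normal((50, 20))      # toy data: 50 genes x 20 samples

beta = 6                                   # soft-thresholding power (a common choice)
adj = np.abs(np.corrcoef(expr)) ** beta    # weighted adjacency from gene correlations
np.fill_diagonal(adj, 0)
connectivity = adj.sum(axis=1)             # (intramodular) connectivity of each gene

# the module "eigengene": first right-singular vector of the centered expression
U, s, Vt = np.linalg.svd(expr - expr.mean(axis=1, keepdims=True), full_matrices=False)
eigengene = Vt[0]
# correlation with the eigengene: a fuzzy measure of module membership
kME = np.array([np.corrcoef(g, eigengene)[0, 1] for g in expr])
print(connectivity[:5], kME[:5])
```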

731 citations


Journal ArticleDOI
TL;DR: Support vector machines are widely used in computational biology due to their high accuracy, their ability to deal with high-dimensional and large datasets, and their flexibility in modeling diverse sources of data.
Abstract: The increasing wealth of biological data coming from a large variety of platforms and the continued development of new high-throughput methods for probing biological systems require increasingly more sophisticated computational approaches. Putting all these data in simple-to-use databases is a first step; but realizing the full potential of the data requires algorithms that automatically extract regularities from the data, which can then lead to biological insight. Many of the problems in computational biology are in the form of prediction: starting from prediction of a gene's structure, prediction of its function, interactions, and role in disease. Support vector machines (SVMs) and related kernel methods are extremely good at solving such problems [1]–[3]. SVMs are widely used in computational biology due to their high accuracy, their ability to deal with high-dimensional and large datasets, and their flexibility in modeling diverse sources of data [2], [4]–[6]. The simplest form of a prediction problem is binary classification: trying to discriminate between objects that belong to one of two categories—positive (+1) or negative (−1). SVMs use two key concepts to solve this problem: large margin separation and kernel functions. The idea of large margin separation can be motivated by classification of points in two dimensions (see Figure 1). A simple way to classify the points is to draw a straight line and call points lying on one side positive and on the other side negative. If the two sets are well separated, one would intuitively draw the separating line such that it is as far as possible away from the points in both sets (see Figures 2 and 3). This intuitive choice captures the idea of large margin separation, which is mathematically formulated in the section Classification with Large Margin. [Figure 1 caption: A linear classifier separating two classes of points (squares and circles) in two dimensions. The decision boundary divides the space into two sets depending on the sign of f(x) = 〈w,x〉+b. The grayscale level represents the value of the discriminant function f(x): dark for low values and a light shade for high values.]
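
For readers who want to try the large-margin idea directly, here is a minimal linear-SVM example using scikit-learn (one possible toolkit; the primer itself is not tied to any particular library).

```python
from sklearn import svm
from sklearn.datasets import make_blobs

# two well-separated point clouds in two dimensions
X, y = make_blobs(n_samples=40, centers=2, random_state=0)
y = 2 * y - 1                            # relabel to -1/+1 as in the text

clf = svm.SVC(kernel="linear", C=1.0)    # large-margin linear separation
clf.fit(X, y)

w, b = clf.coef_[0], clf.intercept_[0]   # f(x) = <w, x> + b
print("decision values:", clf.decision_function(X[:3]))
```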

660 citations


Journal ArticleDOI
TL;DR: It is shown that a simple V1-like model—a neuroscientist's “null” model—outperforms state-of-the-art object recognition systems (biologically inspired and otherwise) on a standard, ostensibly natural image recognition test.
Abstract: Progress in understanding the brain mechanisms underlying vision requires the construction of computational models that not only emulate the brain's anatomy and physiology, but ultimately match its performance on visual tasks. In recent years, “natural” images have become popular in the study of vision and have been used to show apparently impressive progress in building such models. Here, we challenge the use of uncontrolled “natural” images in guiding that progress. In particular, we show that a simple V1-like model—a neuroscientist's “null” model, which should perform poorly at real-world visual object recognition tasks—outperforms state-of-the-art object recognition systems (biologically inspired and otherwise) on a standard, ostensibly natural image recognition test. As a counterpoint, we designed a “simpler” recognition test to better span the real-world variation in object pose, position, and scale, and we show that this test correctly exposes the inadequacy of the V1-like model. Taken together, these results demonstrate that tests based on uncontrolled natural images can be seriously misleading, potentially guiding progress in the wrong direction. Instead, we reexamine what it means for images to be natural and argue for a renewed focus on the core problem of object recognition—real-world image variation.
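
The flavor of a V1-like front end can be sketched as a small bank of oriented Gabor filters with pooled energy outputs. This toy version (using scikit-image on a random image, with arbitrary orientations and frequencies) is far simpler than the model evaluated in the paper.

```python
import numpy as np
from skimage.filters import gabor

rng = np.random.default_rng(0)
image = rng.random((64, 64))          # stand-in for an input image

features = []
for theta in np.linspace(0, np.pi, 4, endpoint=False):   # 4 orientations
    for frequency in (0.1, 0.2):                          # 2 spatial frequencies
        real, imag = gabor(image, frequency=frequency, theta=theta)
        features.append(np.sqrt(real**2 + imag**2).mean())  # pooled filter energy
print(np.round(features, 4))
```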

608 citations


Journal ArticleDOI
TL;DR: It is shown that comparable resting state networks emerge from a stability analysis of the network dynamics using biologically realistic primate brain connectivity, although anatomical information alone does not identify the network.
Abstract: Traditionally, brain function is studied through measuring physiological responses in controlled sensory, motor, and cognitive paradigms. However, even at rest, in the absence of overt goal-directed behavior, collections of cortical regions consistently show temporally coherent activity. In humans, these resting state networks have been shown to greatly overlap with functional architectures present during consciously directed activity, which motivates the interpretation of rest activity as daydreaming, free association, stream of consciousness, and inner rehearsal. In monkeys, however, it has been shown that similar coherent fluctuations are present during deep anesthesia when there is no consciousness. Here, we show that comparable resting state networks emerge from a stability analysis of the network dynamics using biologically realistic primate brain connectivity, although anatomical information alone does not identify the network. We specifically demonstrate that noise and time delays via propagation along connecting fibres are essential for the emergence of the coherent fluctuations of the default network. The spatiotemporal network dynamics evolves on multiple temporal scales and displays the intermittent neuroelectric oscillations in the fast frequency regimes, 1–100 Hz, commonly observed in electroencephalographic and magnetoencephalographic recordings, as well as the hemodynamic oscillations in the ultraslow regimes, <0.1 Hz, observed in functional magnetic resonance imaging. The combination of anatomical structure and time delays creates a space–time structure in which the neural noise enables the brain to explore various functional configurations representing its dynamic repertoire.

563 citations


Journal ArticleDOI
TL;DR: This work represents progress towards producing constraint-based models of metabolism that are specific to the conditions where the expression profiling data is available, and provides a quantitative inconsistency score indicating how consistent a set of gene expression data is with a particular metabolic objective.
Abstract: Reconstructions of cellular metabolism are publicly available for a variety of different microorganisms and some mammalian genomes. To date, these reconstructions are “genome-scale” and strive to include all reactions implied by the genome annotation, as well as those with direct experimental evidence. Clearly, many of the reactions in a genome-scale reconstruction will not be active under particular conditions or in a particular cell type. Methods to tailor these comprehensive genome-scale reconstructions into context-specific networks will aid predictive in silico modeling for a particular situation. We present a method called Gene Inactivity Moderated by Metabolism and Expression (GIMME) to achieve this goal. The GIMME algorithm uses quantitative gene expression data and one or more presupposed metabolic objectives to produce the context-specific reconstruction that is most consistent with the available data. Furthermore, the algorithm provides a quantitative inconsistency score indicating how consistent a set of gene expression data is with a particular metabolic objective. We show that this algorithm produces results consistent with biological experiments and intuition for adaptive evolution of bacteria, rational design of metabolic engineering strains, and human skeletal muscle cells. This work represents progress towards producing constraint-based models of metabolism that are specific to the conditions where the expression profiling data is available.
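
The core of GIMME is a pair of linear programs. The sketch below, on an invented two-metabolite toy network with a made-up expression penalty, illustrates the logic rather than reproducing the published implementation: first compute the maximal objective by FBA, then minimize penalized flux through low-expression reactions while demanding a fraction of that optimum.

```python
import numpy as np
from scipy.optimize import linprog

# toy network: v1 uptakes A; v2 and v3 both convert A -> B; v4 drains B (objective)
S = np.array([[1, -1, -1,  0],    # metabolite A
              [0,  1,  1, -1]])   # metabolite B
bounds = [(0, 10), (0, None), (0, None), (0, None)]

# step 1: plain FBA -- maximize the objective flux v4 (linprog minimizes, hence -v4)
fba = linprog(c=[0, 0, 0, -1], A_eq=S, b_eq=[0, 0], bounds=bounds)
v_obj_max = -fba.fun

# step 2: GIMME-style LP -- penalize flux through reactions whose expression is
# below threshold (here, v2's gene sits 3 units below it), while demanding at
# least 90% of the maximal objective
penalty = [0, 3, 0, 0]            # (threshold - expression), clipped at zero
res = linprog(c=penalty, A_eq=S, b_eq=[0, 0],
              A_ub=[[0, 0, 0, -1]], b_ub=[-0.9 * v_obj_max], bounds=bounds)
print("fluxes:", res.x)           # v2 ends up ~0: the low-expression route is avoided
```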

538 citations


Journal ArticleDOI
TL;DR: In this article, a mathematical model that exploits the temporal structure of fast sensory input to track the slower trajectories of their underlying causes is proposed, which can be used to understand brain function.
Abstract: In this paper, we suggest that cortical anatomy recapitulates the temporal hierarchy that is inherent in the dynamics of environmental states. Many aspects of brain function can be understood in terms of a hierarchy of temporal scales at which representations of the environment evolve. The lowest level of this hierarchy corresponds to fast fluctuations associated with sensory processing, whereas the highest levels encode slow contextual changes in the environment, under which faster representations unfold. First, we describe a mathematical model that exploits the temporal structure of fast sensory input to track the slower trajectories of their underlying causes. This model of sensory encoding or perceptual inference establishes a proof of concept that slowly changing neuronal states can encode the paths or trajectories of faster sensory states. We then review empirical evidence that suggests that a temporal hierarchy is recapitulated in the macroscopic organization of the cortex. This anatomic-temporal hierarchy provides a comprehensive framework for understanding cortical function: the specific time-scale that engages a cortical area can be inferred by its location along a rostro-caudal gradient, which reflects the anatomical distance from primary sensory areas. This is most evident in the prefrontal cortex, where complex functions can be explained as operations on representations of the environment that change slowly. The framework provides predictions about, and principled constraints on, cortical structure–function relationships, which can be tested by manipulating the time-scales of sensory input.

533 citations


Journal ArticleDOI
TL;DR: It is shown that the space of shapes adopted by the nematode Caenorhabditis elegans is low dimensional, with just four dimensions accounting for 95% of the shape variance, and Stimulus-dependent correlations among the different modes suggest that one can generate more reliable behaviors by synchronizing stimuli to the state of the worm in shape space.
Abstract: A major challenge in analyzing animal behavior is to discover some underlying simplicity in complex motor actions. Here, we show that the space of shapes adopted by the nematode Caenorhabditis elegans is low dimensional, with just four dimensions accounting for 95% of the shape variance. These dimensions provide a quantitative description of worm behavior, and we partially reconstruct “equations of motion” for the dynamics in this space. These dynamics have multiple attractors, and we find that the worm visits these in a rapid and almost completely deterministic response to weak thermal stimuli. Stimulus-dependent correlations among the different modes suggest that one can generate more reliable behaviors by synchronizing stimuli to the state of the worm in shape space. We confirm this prediction, effectively “steering” the worm in real time.
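
The dimensionality-reduction step can be illustrated with ordinary PCA. Here, synthetic "posture" data with four planted modes stands in for the measured tangent-angle curves; with real worm data, the paper finds four dimensions suffice.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# synthetic postures: 1000 shapes, each a vector of tangent angles at 100 points,
# generated from four hidden modes plus a little noise
latent = rng.standard_normal((1000, 4))
angles = latent @ rng.standard_normal((4, 100)) + 0.05 * rng.standard_normal((1000, 100))

pca = PCA().fit(angles)
cum = np.cumsum(pca.explained_variance_ratio_)
n_dims = int(np.searchsorted(cum, 0.95)) + 1
print(f"{n_dims} modes capture 95% of the shape variance")
```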

Journal ArticleDOI
TL;DR: It is concluded that it is difficult to annotate an RNA unequivocally as protein-coding or noncoding, with overlapping protein-coding and noncoding transcripts further confounding this distinction.
Abstract: The assumption that RNA can be readily classified into either protein-coding or non-protein–coding categories has pervaded biology for close to 50 years. Until recently, discrimination between these two categories was relatively straightforward: most transcripts were clearly identifiable as protein-coding messenger RNAs (mRNAs), and readily distinguished from the small number of well-characterized non-protein–coding RNAs (ncRNAs), such as transfer, ribosomal, and spliceosomal RNAs. Recent genome-wide studies have revealed the existence of thousands of noncoding transcripts, whose function and significance are unclear. The discovery of this hidden transcriptome and the implicit challenge it presents to our understanding of the expression and regulation of genetic information has made the need to distinguish between mRNAs and ncRNAs both more pressing and more complicated. In this Review, we consider the diverse strategies employed to discriminate between protein-coding and noncoding transcripts and the fundamental difficulties that are inherent in what may superficially appear to be a simple problem. Misannotations can also run in both directions: some ncRNAs may actually encode peptides, and some of those currently thought to do so may not. Moreover, recent studies have shown that some RNAs can function both as mRNAs and intrinsically as functional ncRNAs, which may be a relatively widespread phenomenon. We conclude that it is difficult to annotate an RNA unequivocally as protein-coding or noncoding, with overlapping protein-coding and noncoding transcripts further confounding this distinction. In addition, the finding that some transcripts can function both intrinsically at the RNA level and to encode proteins suggests a false dichotomy between mRNAs and ncRNAs. Therefore, the functionality of any transcript at the RNA level should not be discounted.

Journal ArticleDOI
TL;DR: The evolution of new enzymatic activities, both in nature and in the laboratory, is dependent on the compensatory, stabilizing effect of apparently “silent” mutations in regions of the protein that are irrelevant to its function.
Abstract: Numerous studies have noted that the evolution of new enzymatic specificities is accompanied by loss of the protein's thermodynamic stability (ΔΔG), thus suggesting a tradeoff between the acquisition of new enzymatic functions and stability. However, since most mutations are destabilizing (ΔΔG>0), one should ask how destabilizing the mutations that confer new or altered enzymatic functions are relative to all other mutations. We applied ΔΔG computations by FoldX to analyze the effects of 548 mutations that arose from the directed evolution of 22 different enzymes. The stability effects, location, and type of function-altering mutations were compared to ΔΔG changes arising from all possible point mutations in the same enzymes. We found that mutations that modulate enzymatic functions are mostly destabilizing (average ΔΔG = +0.9 kcal/mol), and are almost as destabilizing as the "average" mutation in these enzymes (+1.3 kcal/mol). Although their stability effects are not as dramatic as in key catalytic residues, mutations that modify the substrate binding pockets, and thus mediate new enzymatic specificities, place a larger stability burden than surface mutations that underlie neutral, non-adaptive evolutionary changes. How are the destabilizing effects of functional mutations balanced to enable adaptation? Our analysis also indicated that many mutations that appear in directed evolution variants with no obvious role in the new function exert stabilizing effects that may compensate for the destabilizing effects of the crucial function-altering mutations. Thus, the evolution of new enzymatic activities, both in nature and in the laboratory, is dependent on the compensatory, stabilizing effect of apparently "silent" mutations in regions of the protein that are irrelevant to its function.

Journal ArticleDOI
TL;DR: It is shown that classifiers using pathway activity achieve better performance than classifiers based on individual gene expression, for both simple and complex case-control studies including differentiation of perturbed from non-perturbed cells and subtyping of several different kinds of cancer.
Abstract: The advent of microarray technology has made it possible to classify disease states based on gene expression profiles of patients. Typically, marker genes are selected by measuring the power of their expression profiles to discriminate among patients of different disease states. However, expression-based classification can be challenging in complex diseases due to factors such as cellular heterogeneity within a tissue sample and genetic heterogeneity across patients. A promising technique for coping with these challenges is to incorporate pathway information into the disease classification procedure in order to classify disease based on the activity of entire signaling pathways or protein complexes rather than on the expression levels of individual genes or proteins. We propose a new classification method based on pathway activities inferred for each patient. For each pathway, an activity level is summarized from the gene expression levels of its condition-responsive genes (CORGs), defined as the subset of genes in the pathway whose combined expression delivers optimal discriminative power for the disease phenotype. We show that classifiers using pathway activity achieve better performance than classifiers based on individual gene expression, for both simple and complex case-control studies including differentiation of perturbed from non-perturbed cells and subtyping of several different kinds of cancer. Moreover, the new method outperforms several previous approaches that use a static (i.e., non-conditional) definition of pathways. Within a pathway, the identified CORGs may facilitate the development of better diagnostic markers and the discovery of core alterations in human disease.
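
The CORG search can be sketched as a greedy procedure: rank pathway genes by individual discriminative power, then grow the gene set while the t-score of the averaged activity keeps improving. The data below are synthetic, and details (scoring function, normalization) differ from the published method.

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
n_case = n_ctrl = 20
expr = rng.standard_normal((30, n_case + n_ctrl))   # 30 pathway genes x 40 samples
expr[:5, :n_case] += 1.0                            # 5 genes carry real case/control signal
labels = np.array([1] * n_case + [0] * n_ctrl)

def tscore(activity):
    """Discriminative power of an activity vector across samples."""
    return abs(ttest_ind(activity[labels == 1], activity[labels == 0]).statistic)

# greedy search: order genes by individual t-score, then extend the set while
# the t-score of the averaged pathway activity keeps improving
order = np.argsort([-tscore(g) for g in expr])
best, corgs = 0.0, []
for g in order:
    score = tscore(expr[corgs + [g]].mean(axis=0))
    if score <= best:
        break
    best, corgs = score, corgs + [g]
print("condition-responsive genes (CORGs):", corgs, " t =", round(best, 2))
```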

Journal ArticleDOI
TL;DR: Computer simulations are used to show that the experimentally observed territory shapes and spatial distances between marked chromosome sites for human, Drosophila, and budding yeast chromosomes can be reproduced by a parameter-free minimal model of decondensing chromosomes.
Abstract: During interphase, chromosomes decondense, but fluorescence in situ hybridization experiments reveal the existence of distinct territories occupied by individual chromosomes inside the nuclei of most eukaryotic cells. We use computer simulations to show that the existence and stability of territories is a kinetic effect that can be explained without invoking an underlying nuclear scaffold or protein-mediated interactions between DNA sequences. In particular, we show that the experimentally observed territory shapes and spatial distances between marked chromosome sites for human, Drosophila, and budding yeast chromosomes can be reproduced by a parameter-free minimal model of decondensing chromosomes. Our results suggest that the observed interphase structure and dynamics are due to generic polymer effects: confined Brownian motion conserving the local topological state of long chain molecules and segregation of mutually unentangled chains due to topological constraints.

Journal ArticleDOI
TL;DR: The proposed network model, coordinating the physical body of a humanoid robot through high-dimensional sensori-motor control, successfully situated itself within a physical environment; the results suggest that it is not only the spatial connections between neurons but also the timescales of neural activity that act as important mechanisms leading to functional hierarchy in neural systems.
Abstract: It is generally thought that skilled behavior in human beings results from a functional hierarchy of the motor control system, within which reusable motor primitives are flexibly integrated into various sensori-motor sequence patterns. The underlying neural mechanisms governing the way in which continuous sensori-motor flows are segmented into primitives and the way in which series of primitives are integrated into various behavior sequences have, however, not yet been clarified. In earlier studies, this functional hierarchy has been realized through the use of explicit hierarchical structure, with local modules representing motor primitives in the lower level and a higher module representing sequences of primitives switched via additional mechanisms such as gate-selecting. When sequences contain similarities and overlap, however, a conflict arises in such earlier models between generalization and segmentation, induced by this separated modular structure. To address this issue, we propose a different type of neural network model. The current model neither makes use of separate local modules to represent primitives nor introduces explicit hierarchical structure. Rather than forcing architectural hierarchy onto the system, functional hierarchy emerges through a form of self-organization that is based on two distinct types of neurons, each with different time properties (“multiple timescales”). Through the introduction of multiple timescales, continuous sequences of behavior are segmented into reusable primitives, and the primitives, in turn, are flexibly integrated into novel sequences. In experiments, the proposed network model, coordinating the physical body of a humanoid robot through high-dimensional sensori-motor control, also successfully situated itself within a physical environment. Our results suggest that it is not only the spatial connections between neurons but also the timescales of neural activity that act as important mechanisms leading to functional hierarchy in neural systems.
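
The "multiple timescales" idea reduces to giving different pools of leaky-integrator units different time constants. Below is a minimal untrained sketch with assumed parameters and random weights; the actual model is trained and coupled to a robot's sensori-motor stream.

```python
import numpy as np

rng = np.random.default_rng(0)
n_fast, n_slow = 20, 5
tau = np.array([2.0] * n_fast + [50.0] * n_slow)   # two pools, two time constants
n = n_fast + n_slow
W = rng.standard_normal((n, n)) / np.sqrt(n)       # random recurrent weights

u = np.zeros(n)                                    # internal states
history = []
for t in range(200):
    u = u + (-u + W @ np.tanh(u) + (t < 10)) / tau # brief input pulse, then free dynamics
    history.append(np.tanh(u).copy())
history = np.array(history)
# fast units react quickly; slow units integrate over a much longer horizon
print("fast unit range:", round(float(np.ptp(history[:, 0])), 2),
      "slow unit range:", round(float(np.ptp(history[:, -1])), 2))
```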

Journal ArticleDOI
TL;DR: A genome-wide map of ∼380,000 yeast nucleosomes is used, and it is found that Poly(dA:dT) tracts are an important component of these nucleosome positioning signals and that their nucleosome-disfavoring action results in large nucleosome depletion over them and over their flanking regions and enhances the accessibility of transcription factors to their cognate sites.
Abstract: The detailed positions of nucleosomes profoundly impact gene regulation and are partly encoded by the genomic DNA sequence. However, less is known about the functional consequences of this encoding. Here, we address this question using a genome-wide map of ∼380,000 yeast nucleosomes that we sequenced in their entirety. Utilizing the high resolution of our map, we refine our understanding of how nucleosome organizations are encoded by the DNA sequence and demonstrate that the genomic sequence is highly predictive of the in vivo nucleosome organization, even across new nucleosome-bound sequences that we isolated from fly and human. We find that Poly(dA:dT) tracts are an important component of these nucleosome positioning signals and that their nucleosome-disfavoring action results in large nucleosome depletion over them and over their flanking regions and enhances the accessibility of transcription factors to their cognate sites. Our results suggest that the yeast genome may utilize these nucleosome positioning signals to regulate gene expression with different transcriptional noise and activation kinetics and DNA replication with different origin efficiency. These distinct functions may be achieved by encoding both relatively closed (nucleosome-covered) chromatin organizations over some factor binding sites, where factors must compete with nucleosomes for DNA access, and relatively open (nucleosome-depleted) organizations over other factor sites, where factors bind without competition.
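
Locating candidate poly(dA:dT) tracts in a sequence is a simple scan; here is a minimal sketch with an arbitrary minimum tract length of five, unrelated to the thresholds used in the paper.

```python
import re

# find poly(dA:dT) tracts (runs of A or of T) of at least 5 bp
seq = "CGTAAAAAACGGATTTTTTTCGATCG"
tracts = [(m.start(), m.group()) for m in re.finditer(r"A{5,}|T{5,}", seq)]
print(tracts)   # -> [(3, 'AAAAAA'), (13, 'TTTTTTT')]
```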

Journal ArticleDOI
TL;DR: A rigorous analysis of six variants of the genomewide protein interaction network for Saccharomyces cerevisiae demonstrated that the majority of hubs are essential due to their involvement in Essential Complex Biological Modules, a group of densely connected proteins with shared biological function that are enriched in essential proteins.
Abstract: The centrality-lethality rule, which notes that high-degree nodes in a protein interaction network tend to correspond to proteins that are essential, suggests that the topological prominence of a protein in a protein interaction network may be a good predictor of its biological importance. Even though the correlation between degree and essentiality was confirmed by many independent studies, the reason for this correlation remains elusive. Several hypotheses about putative connections between essentiality of hubs and the topology of protein–protein interaction networks have been proposed, but as we demonstrate, these explanations are not supported by the properties of protein interaction networks. To identify the main topological determinant of essentiality and to provide a biological explanation for the connection between the network topology and essentiality, we performed a rigorous analysis of six variants of the genomewide protein interaction network for Saccharomyces cerevisiae obtained using different techniques. We demonstrated that the majority of hubs are essential due to their involvement in Essential Complex Biological Modules, a group of densely connected proteins with shared biological function that are enriched in essential proteins. Moreover, we rejected two previously proposed explanations for the centrality-lethality rule, one relating the essentiality of hubs to their role in the overall network connectivity and another relying on the recently published essential protein interactions model.

Journal ArticleDOI
TL;DR: Nonnegative matrix factorization is reviewed as a data analytical and interpretive tool in computational biology with an emphasis on molecular pattern discovery, class comparison and prediction, cross-platform and cross-species analysis, functional characterization of genes and biomedical informatics.
Abstract: In the last decade, advances in high-throughput technologies such as DNA microarrays have made it possible to simultaneously measure the expression levels of tens of thousands of genes and proteins. This has resulted in large amounts of biological data requiring analysis and interpretation. Nonnegative matrix factorization (NMF) was introduced as an unsupervised, parts-based learning paradigm involving the decomposition of a nonnegative matrix V into two nonnegative matrices, W and H, via a multiplicative updates algorithm. In the context of a p×n gene expression matrix V consisting of observations on p genes from n samples, each column of W defines a metagene, and each column of H represents the metagene expression pattern of the corresponding sample. NMF has been primarily applied in an unsupervised setting in image and natural language processing. More recently, it has been successfully utilized in a variety of applications in computational biology. Examples include molecular pattern discovery, class comparison and prediction, cross-platform and cross-species analysis, functional characterization of genes and biomedical informatics. In this paper, we review this method as a data analytical and interpretive tool in computational biology with an emphasis on these applications.
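
The multiplicative-update algorithm mentioned above is short enough to write out in full. This sketch uses the standard Lee–Seung updates for the Frobenius objective on a toy nonnegative matrix; production uses would add convergence checks and multiple restarts.

```python
import numpy as np

def nmf(V, k, n_iter=200, eps=1e-9):
    """Lee-Seung multiplicative updates minimizing ||V - WH||_F."""
    rng = np.random.default_rng(0)
    p, n = V.shape
    W = rng.random((p, k))       # each column of W is a metagene
    H = rng.random((k, n))       # each column of H: metagene pattern of one sample
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

V = np.abs(np.random.default_rng(1).standard_normal((100, 20)))   # toy p x n matrix
W, H = nmf(V, k=3)
print("reconstruction error:", round(float(np.linalg.norm(V - W @ H)), 3))
```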

Journal ArticleDOI
TL;DR: A time- and state-dependent measure of integrated information, φ, is introduced; it captures the repertoire of causal states available to a system as a whole and appears to be a useful metric for characterizing the capacity of any physical system to integrate information.
Abstract: This paper introduces a time- and state-dependent measure of integrated information, φ, which captures the repertoire of causal states available to a system as a whole. Specifically, φ quantifies how much information is generated (uncertainty is reduced) when a system enters a particular state through causal interactions among its elements, above and beyond the information generated independently by its parts. Such mathematical characterization is motivated by the observation that integrated information captures two key phenomenological properties of consciousness: (i) there is a large repertoire of conscious experiences so that, when one particular experience occurs, it generates a large amount of information by ruling out all the others; and (ii) this information is integrated, in that each experience appears as a whole that cannot be decomposed into independent parts. This paper extends previous work on stationary systems and applies integrated information to discrete networks as a function of their dynamics and causal architecture. An analysis of basic examples indicates the following: (i) φ varies depending on the state entered by a network, being higher if active and inactive elements are balanced and lower if the network is inactive or hyperactive. (ii) φ varies for systems with identical or similar surface dynamics depending on the underlying causal architecture, being low for systems that merely copy or replay activity states. (iii) φ varies as a function of network architecture. High φ values can be obtained by architectures that conjoin functional specialization with functional integration. Strictly modular and homogeneous systems cannot generate high φ because the former lack integration, whereas the latter lack information. Feedforward and lattice architectures are capable of generating high φ but are inefficient. (iv) In Hopfield networks, φ is low for attractor states and neutral states, but increases if the networks are optimized to achieve tension between local and global interactions. These basic examples appear to match well against neurobiological evidence concerning the neural substrates of consciousness. More generally, φ appears to be a useful metric to characterize the capacity of any physical system to integrate information.

Journal ArticleDOI
TL;DR: Brain signal variability increased with age, and showed strong negative correlations with intrasubject RT variability and positive correlations with accuracy, suggesting that the moment-to-moment variability in brain activity may be a critical index of the cognitive capacity of the brain.
Abstract: As the brain matures, its responses become optimized. Behavioral measures show this through improved accuracy and decreased trial-to-trial variability. The question remains whether the supporting brain dynamics show a similar decrease in variability. We examined the relation between variability in single trial evoked electrical activity of the brain (measured with EEG) and performance of a face memory task in children (8–15 y) and young adults (20–33 y). Behaviorally, children showed slower, more variable response times (RT), and less accurate recognition than adults. However, brain signal variability increased with age, and showed strong negative correlations with intrasubject RT variability and positive correlations with accuracy. Thus, maturation appears to lead to a brain with greater functional variability, which is indicative of enhanced neural complexity. This variability may reflect a broader repertoire of metastable brain states and more fluid transitions among them that enable optimum responses. Our results suggest that the moment-to-moment variability in brain activity may be a critical index of the cognitive capacity of the brain.

Journal ArticleDOI
TL;DR: A simulated network of excitatory and inhibitory neurons modeling a local cortical population encoded static input spike rates into gamma-range oscillations generated by inhibitory–excitatory neural interactions, and encoded slow dynamic features of the input into slow LFP fluctuations mediated by stimulus–neural interactions.
Abstract: Recordings of local field potentials (LFPs) reveal that the sensory cortex displays rhythmic activity and fluctuations over a wide range of frequencies and amplitudes. Yet, the role of this kind of activity in encoding sensory information remains largely unknown. To understand the rules of translation between the structure of sensory stimuli and the fluctuations of cortical responses, we simulated a sparsely connected network of excitatory and inhibitory neurons modeling a local cortical population, and we determined how the LFPs generated by the network encode information about input stimuli. We first considered simple static and periodic stimuli and then naturalistic input stimuli based on electrophysiological recordings from the thalamus of anesthetized monkeys watching natural movie scenes. We found that the simulated network produced stimulus-related LFP changes that were in striking agreement with the LFPs obtained from the primary visual cortex. Moreover, our results demonstrate that the network encoded static input spike rates into gamma-range oscillations generated by inhibitory–excitatory neural interactions and encoded slow dynamic features of the input into slow LFP fluctuations mediated by stimulus–neural interactions. The model cortical network processed dynamic stimuli with naturalistic temporal structure by using low and high response frequencies as independent communication channels, again in agreement with recent reports from visual cortex responses to naturalistic movies. One potential function of this frequency decomposition into independent information channels operated by the cortical network may be that of enhancing the capacity of the cortical column to encode our complex sensory environment.

Journal ArticleDOI
TL;DR: An effective solution is presented for the problem of sequential decision making, represented as a fixed-time game: a player takes sequential actions in a changing noisy environment so as to maximize a cumulative reward.

Abstract: The idea that cognitive activity can be understood using nonlinear dynamics has been intensively discussed for the last 15 years. One of the popular points of view is that metastable states play a key role in the execution of cognitive functions. Experimental and modeling studies suggest that most of these functions are the result of transient activity of large-scale brain networks in the presence of noise. Such transients may consist of a sequential switching between different metastable cognitive states. The main problem faced when using dynamical theory to describe transient cognitive processes is the fundamental contradiction between reproducibility and flexibility of transient behavior. In this paper, we propose a theoretical description of transient cognitive dynamics based on the interaction of functionally dependent metastable cognitive states. The mathematical image of such transient activity is a stable heteroclinic channel, i.e., a set of trajectories in the vicinity of a heteroclinic skeleton that consists of saddles and unstable separatrices that connect their surroundings. We suggest a basic mathematical model, a strongly dissipative dynamical system, and formulate the conditions for the robustness and reproducibility of cognitive transients that satisfy the competing requirements for stability and flexibility. Based on this approach, we describe here an effective solution for the problem of sequential decision making, represented as a fixed time game: a player takes sequential actions in a changing noisy environment so as to maximize a cumulative reward. As we predict and verify in computer simulations, noise plays an important role in optimizing the gain.
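
A stable heteroclinic channel can be illustrated with the classic winnerless-competition form of the generalized Lotka–Volterra equations. The inhibition matrix below is a textbook-style choice satisfying the saddle-stability conditions, not the paper's specific model; small noise keeps the trajectory moving along the channel, producing sequential yet reproducible switching.

```python
import numpy as np

# generalized Lotka-Volterra with asymmetric inhibition: each saddle (one species
# dominant) is metastable, and weak noise carries the state from one to the next
rho = np.array([[1.0, 2.0, 0.5],
                [0.5, 1.0, 2.0],
                [2.0, 0.5, 1.0]])
dt, steps = 0.01, 40000
rng = np.random.default_rng(0)
x = np.full(3, 0.3)
trace = np.empty((steps, 3))
for t in range(steps):
    drift = x * (1.0 - rho @ x)
    x = x + dt * drift + 1e-4 * np.sqrt(dt) * rng.standard_normal(3)
    x = np.clip(x, 1e-12, None)          # keep populations positive
    trace[t] = x
# the sampled dominant state shows sequential, reproducible switching
print(np.argmax(trace[::5000], axis=1))
```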

Journal ArticleDOI
TL;DR: It is shown that it is possible to deduce whether players make inferences about each other and to quantify their sophistication on the basis of choices in sequential games; moreover, exactly the same sophisticated behaviour can be achieved by optimising the utility function itself (through prosocial utility), producing unsophisticated but apparently altruistic agents.
Abstract: This paper introduces a model of ‘theory of mind’, namely, how we represent the intentions and goals of others to optimise our mutual interactions. We draw on ideas from optimum control and game theory to provide a ‘game theory of mind’. First, we consider the representations of goals in terms of value functions that are prescribed by utility or rewards. Critically, the joint value functions and ensuing behaviour are optimised recursively, under the assumption that I represent your value function, your representation of mine, your representation of my representation of yours, and so on ad infinitum. However, if we assume that the degree of recursion is bounded, then players need to estimate the opponent's degree of recursion (i.e., sophistication) to respond optimally. This induces a problem of inferring the opponent's sophistication, given behavioural exchanges. We show it is possible to deduce whether players make inferences about each other and quantify their sophistication on the basis of choices in sequential games. This rests on comparing generative models of choices with, and without, inference. Model comparison is demonstrated using simulated and real data from a ‘stag-hunt’. Finally, we note that exactly the same sophisticated behaviour can be achieved by optimising the utility function itself (through prosocial utility), producing unsophisticated but apparently altruistic agents. This may be relevant ethologically in hierarchal game theory and coevolution.

Journal ArticleDOI
TL;DR: Gapped Local Alignment of Motifs is especially promising for short protein motifs, and it should improve the ability to identify the protein cleavage sites, interaction sites, post-translational modification attachment sites, etc., that underlie much of biology.
Abstract: Biology is encoded in molecular sequences: deciphering this encoding remains a grand scientific challenge. Functional regions of DNA, RNA, and protein sequences often exhibit characteristic but subtle motifs; thus, computational discovery of motifs in sequences is a fundamental and much-studied problem. However, most current algorithms do not allow for insertions or deletions (indels) within motifs, and the few that do have other limitations. We present a method, GLAM2 (Gapped Local Alignment of Motifs), for discovering motifs allowing indels in a fully general manner, and a companion method GLAM2SCAN for searching sequence databases using such motifs. GLAM2 is a generalization of the gapless Gibbs sampling algorithm. It re-discovers variable-width protein motifs from the PROSITE database significantly more accurately than the alternative methods PRATT and SAM-T2K. Furthermore, it usefully refines protein motifs from the ELM database: in some cases, the refined motifs make orders of magnitude fewer overpredictions than the original ELM regular expressions. GLAM2 performs respectably on the BAliBASE multiple alignment benchmark, and may be superior to leading multiple alignment methods for “motif-like” alignments with N- and C-terminal extensions. Finally, we demonstrate the use of GLAM2 to discover protein kinase substrate motifs and a gapped DNA motif for the LIM-only transcriptional regulatory complex: using GLAM2SCAN, we identify promising targets for the latter. GLAM2 is especially promising for short protein motifs, and it should improve our ability to identify the protein cleavage sites, interaction sites, post-translational modification attachment sites, etc., that underlie much of biology. It may be equally useful for arbitrarily gapped motifs in DNA and RNA, although fewer examples of such motifs are known at present. GLAM2 is public domain software, available for download at http://bioinformatics.org.au/glam2.

Journal ArticleDOI
TL;DR: It is conjectured that both expected score distributions have simple, predictable forms when full probabilistic modeling methods are used, which enables efficient and accurate determination of expectation values (E-values) for both Viterbi and Forward scores for probabilistic local alignments.

Abstract: Sequence database searches require accurate estimation of the statistical significance of scores. Optimal local sequence alignment scores follow Gumbel distributions, but determining an important parameter of the distribution (λ) requires time-consuming computational simulation. Moreover, optimal alignment scores are less powerful than probabilistic scores that integrate over alignment uncertainty ("Forward" scores), but the expected distribution of Forward scores remains unknown. Here, I conjecture that both expected score distributions have simple, predictable forms when full probabilistic modeling methods are used. For a probabilistic model of local sequence alignment, optimal alignment bit scores ("Viterbi" scores) are Gumbel-distributed with constant λ = log 2, and the high-scoring tail of Forward scores is exponential with the same constant λ. Simulation studies support these conjectures over a wide range of profile/sequence comparisons, using 9,318 profile hidden Markov models from the Pfam database. This enables efficient and accurate determination of expectation values (E-values) for both Viterbi and Forward scores for probabilistic local alignments.
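
Given the conjectured forms, converting a bit score into an E-value is a one-liner for each score type. In the sketch below, the location parameters mu and tau are invented for illustration; in practice they are fitted per profile.

```python
import numpy as np

lam = np.log(2)        # the conjectured constant lambda = log 2 for bit scores
mu, tau = -8.0, -4.0   # location parameters, invented here; fitted per profile in practice
N = 1_000_000          # number of database sequences searched

def viterbi_pvalue(score):
    """Gumbel tail for optimal-alignment (Viterbi) bit scores."""
    return 1.0 - np.exp(-np.exp(-lam * (score - mu)))

def forward_pvalue(score):
    """Exponential high-scoring tail, same lambda, for Forward bit scores."""
    return min(1.0, np.exp(-lam * (score - tau)))

for s in (20.0, 30.0):
    print(f"bit score {s}: Viterbi E = {N * viterbi_pvalue(s):.2g}, "
          f"Forward E = {N * forward_pvalue(s):.2g}")
```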

Journal ArticleDOI
TL;DR: The resulting learning theory predicts that even difficult credit-assignment problems can be solved in a self-organizing manner through reward-modulated STDP, and provides a possible functional explanation for trial-to-trial variability, which is characteristic for cortical networks of neurons but has no analogue in currently existing artificial computing systems.
Abstract: Reward-modulated spike-timing-dependent plasticity (STDP) has recently emerged as a candidate for a learning rule that could explain how behaviorally relevant adaptive changes in complex networks of spiking neurons could be achieved in a self-organizing manner through local synaptic plasticity. However, the capabilities and limitations of this learning rule could so far only be tested through computer simulations. This article provides tools for an analytic treatment of reward-modulated STDP, which allows us to predict under which conditions reward-modulated STDP will achieve a desired learning effect. These analytical results imply that neurons can learn through reward-modulated STDP to classify not only spatial but also temporal firing patterns of presynaptic neurons. They also can learn to respond to specific presynaptic firing patterns with particular spike patterns. Finally, the resulting learning theory predicts that even difficult credit-assignment problems, where it is very hard to tell which synaptic weights should be modified in order to increase the global reward for the system, can be solved in a self-organizing manner through reward-modulated STDP. This yields an explanation for a fundamental experimental result on biofeedback in monkeys by Fetz and Baker. In this experiment monkeys were rewarded for increasing the firing rate of a particular neuron in the cortex and were able to solve this extremely difficult credit assignment problem. Our model for this experiment relies on a combination of reward-modulated STDP with variable spontaneous firing activity. Hence it also provides a possible functional explanation for trial-to-trial variability, which is characteristic for cortical networks of neurons but has no analogue in currently existing artificial computing systems. In addition our model demonstrates that reward-modulated STDP can be applied to all synapses in a large recurrent neural network without endangering the stability of the network dynamics.
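
The basic mechanics of reward-modulated STDP (an eligibility trace built from spike pairings, converted into weight change by a global reward signal) can be sketched for a single synapse. All parameters below are assumed, spikes are generated as random Poisson-like events, and the reward here is random rather than behavior-dependent, unlike in the paper's analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
tau_pre = tau_post = 20.0      # STDP window time constants (ms)
tau_e = 500.0                  # eligibility trace time constant (ms)
A_plus, A_minus, lr = 1.0, -1.2, 0.005

w = 0.5
x_pre = x_post = elig = 0.0    # presynaptic trace, postsynaptic trace, eligibility
for t in range(5000):          # 1 ms steps
    x_pre *= np.exp(-1 / tau_pre)
    x_post *= np.exp(-1 / tau_post)
    elig *= np.exp(-1 / tau_e)
    if rng.random() < 0.02:    # presynaptic spike: depression ~ recent post activity
        elig += A_minus * x_post
        x_pre += 1.0
    if rng.random() < 0.02:    # postsynaptic spike: potentiation ~ recent pre activity
        elig += A_plus * x_pre
        x_post += 1.0
    # sparse global neuromodulatory reward gates the trace into a weight change
    if rng.random() < 0.01:
        w = float(np.clip(w + lr * elig, 0.0, 1.0))
print("final weight:", round(w, 3))
```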

Journal ArticleDOI
TL;DR: This work demonstrates that long-term evolution of complex gene regulatory networks in a changing environment can lead to a striking increase in the efficiency of generating beneficial mutations, and shows that the population evolves towards genotype-phenotype mappings that allow for an orchestrated network-wide change in the gene expression pattern.
Abstract: Gene regulatory networks are perhaps the most important organizational level in the cell where signals from the cell state and the outside environment are integrated in terms of activation and inhibition of genes. For the last decade, the study of such networks has been fueled by large-scale experiments and renewed attention from the theoretical field. Different models have been proposed to, for instance, investigate expression dynamics, explain the network topology we observe in bacteria and yeast, and for the analysis of evolvability and robustness of such networks. Yet how these gene regulatory networks evolve and become evolvable remains an open question. An individual-oriented evolutionary model is used to shed light on this matter. Each individual has a genome from which its gene regulatory network is derived. Mutations, such as gene duplications and deletions, alter the genome, while the resulting network determines the gene expression pattern and hence fitness. With this protocol we let a population of individuals evolve under Darwinian selection in an environment that changes through time. Our work demonstrates that long-term evolution of complex gene regulatory networks in a changing environment can lead to a striking increase in the efficiency of generating beneficial mutations. We show that the population evolves towards genotype-phenotype mappings that allow for an orchestrated network-wide change in the gene expression pattern, requiring only a few specific gene indels. The genes involved are hubs of the networks, or directly influencing the hubs. Moreover, throughout the evolutionary trajectory the networks maintain their mutational robustness. In other words, evolution in an alternating environment leads to a network that is sensitive to a small class of beneficial mutations, while the majority of mutations remain neutral: an example of evolution of evolvability.

Journal ArticleDOI
TL;DR: The analysis of the network structure revealed the paths through which light, food, and heat can entrain the circadian clock and identified that NR3C1 and FKBP/HSP90 complexes are central to the control of circadian genes through diverse environmental signals.
Abstract: Circadian rhythm is fundamental in regulating a wide range of cellular, metabolic, physiological, and behavioral activities in mammals. Although a small number of key circadian genes have been identified through extensive molecular and genetic studies in the past, the existence of other key circadian genes and how they drive the genomewide circadian oscillation of gene expression in different tissues still remains unknown. Here we try to address these questions by integrating all available circadian microarray data in mammals. We identified 41 common circadian genes that showed circadian oscillation in a wide range of mouse tissues with a remarkable consistency of circadian phases across tissues. Comparisons across mouse, rat, rhesus macaque, and human showed that the circadian phases of known key circadian genes were delayed for 4–5 hours in rat compared to mouse and 8–12 hours in macaque and human compared to mouse. A systematic gene regulatory network for the mouse circadian rhythm was constructed after incorporating promoter analysis and transcription factor knockout or mutant microarray data. We observed the significant association of cis-regulatory elements: EBOX, DBOX, RRE, and HSE with the different phases of circadian oscillating genes. The analysis of the network structure revealed the paths through which light, food, and heat can entrain the circadian clock and identified that NR3C1 and FKBP/HSP90 complexes are central to the control of circadian genes through diverse environmental signals. Our study improves our understanding of the structure, design principle, and evolution of gene regulatory networks involved in the mammalian circadian rhythm.
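
Estimating the circadian phase of an expression time course is commonly done by cosinor-style regression on a 24 h harmonic. This sketch on synthetic data illustrates that general idea, not the paper's specific meta-analysis pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(0, 48, 4.0)         # sampled every 4 h for two days
# synthetic profile peaking at hour 6, plus measurement noise
y = 2 * np.cos(2 * np.pi * (t - 6) / 24) + 0.1 * rng.standard_normal(t.size)

X = np.column_stack([np.ones_like(t),
                     np.cos(2 * np.pi * t / 24),
                     np.sin(2 * np.pi * t / 24)])
b0, bc, bs = np.linalg.lstsq(X, y, rcond=None)[0]
amplitude = np.hypot(bc, bs)
phase_h = (np.arctan2(bs, bc) * 24 / (2 * np.pi)) % 24
print(f"amplitude = {amplitude:.2f}, peak phase = {phase_h:.1f} h")
```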

Journal ArticleDOI
TL;DR: A genome-scale constraint-based model of the metabolism of Pseudomonas putida yields valuable insights into genotype–phenotype relationships and provides a sound framework to explore this versatile bacterium and to capitalize on its vast biotechnological potential.
Abstract: A cornerstone of biotechnology is the use of microorganisms for the efficient production of chemicals and the elimination of harmful waste. Pseudomonas putida is an archetype of such microbes due to its metabolic versatility, stress resistance, amenability to genetic modifications, and vast potential for environmental and industrial applications. To address both the elucidation of the metabolic wiring in P. putida and its uses in biocatalysis, in particular for the production of non-growth-related biochemicals, we developed and present here a genome-scale constraint-based model of the metabolism of P. putida KT2440. Network reconstruction and flux balance analysis (FBA) enabled definition of the structure of the metabolic network, identification of knowledge gaps, and pinpointing of essential metabolic functions, facilitating thereby the refinement of gene annotations. FBA and flux variability analysis were used to analyze the properties, potential, and limits of the model. These analyses allowed identification, under various conditions, of key features of metabolism such as growth yield, resource distribution, network robustness, and gene essentiality. The model was validated with data from continuous cell cultures, high-throughput phenotyping data, 13C measurements of internal flux distributions, and specifically generated knock-out mutants. Auxotrophy was correctly predicted in 75% of the cases. These systematic analyses revealed that the metabolic network structure is the main factor determining the accuracy of predictions, whereas biomass composition has negligible influence. Finally, we drew on the model to devise metabolic engineering strategies to improve production of polyhydroxyalkanoates, a class of biotechnologically useful compounds whose synthesis is not coupled to cell survival. The solidly validated model yields valuable insights into genotype–phenotype relationships and provides a sound framework to explore this versatile bacterium and to capitalize on its vast biotechnological potential.
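
FBA and flux variability analysis are both linear programs. The sketch below runs them on an invented two-metabolite toy network (the real model has hundreds of reactions); the 99.9% factor is a small numerical tolerance on the biomass optimum, a common convention rather than a value from the paper.

```python
import numpy as np
from scipy.optimize import linprog

# toy network: v1 uptakes A; v2 and v3 are alternate routes A -> B; v4 is biomass
S = np.array([[1, -1, -1,  0],
              [0,  1,  1, -1]], dtype=float)
bounds = [(0, 10), (0, None), (0, None), (0, None)]

# FBA: maximize biomass flux v4 (linprog minimizes, hence -v4)
opt = linprog(c=[0, 0, 0, -1], A_eq=S, b_eq=[0, 0], bounds=bounds)
v_bio = -opt.fun

# FVA: min and max of every flux while holding biomass near its optimum
A_ub, b_ub = [[0, 0, 0, -1]], [-0.999 * v_bio]
for j in range(S.shape[1]):
    c = np.zeros(4)
    c[j] = 1.0
    lo = linprog(c=c, A_ub=A_ub, b_ub=b_ub, A_eq=S, b_eq=[0, 0], bounds=bounds).fun
    hi = -linprog(c=-c, A_ub=A_ub, b_ub=b_ub, A_eq=S, b_eq=[0, 0], bounds=bounds).fun
    print(f"v{j+1}: [{lo:.1f}, {hi:.1f}]")   # v2 and v3 emerge as interchangeable routes
```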