Showing papers on "Interaction network" published in 2008


Journal ArticleDOI
01 Jul 2008
TL;DR: This article characterizes four classes of drug–target interaction networks in humans involving enzymes, ion channels, G-protein-coupled receptors (GPCRs) and nuclear receptors, and reveals significant correlations between drug structure similarity, target sequence similarity and the drug–target interaction network topology.
Abstract: Motivation: The identification of interactions between drugs and target proteins is a key area in genomic drug discovery. Therefore, there is a strong incentive to develop new methods capable of detecting these potential drug–target interactions efficiently. Results: In this article, we characterize four classes of drug–target interaction networks in humans involving enzymes, ion channels, G-protein-coupled receptors (GPCRs) and nuclear receptors, and reveal significant correlations between drug structure similarity, target sequence similarity and the drug–target interaction network topology. We then develop new statistical methods to predict unknown drug–target interaction networks from chemical structure and genomic sequence information simultaneously on a large scale. The originality of the proposed method lies in the formalization of the drug–target interaction inference as a supervised learning problem for a bipartite graph, the lack of need for 3D structure information of the target proteins, and in the integration of chemical and genomic spaces into a unified space that we call ‘pharmacological space’. In the results, we demonstrate the usefulness of our proposed method for the prediction of the four classes of drug–target interaction networks. Our comprehensively predicted drug–target interaction networks enable us to suggest many potential drug–target interactions and to increase research productivity toward genomic drug discovery. Availability: Software is available upon request. Contact: Yoshihiro.Yamanishi@ensmp.fr Supplementary information: Datasets and all prediction results are available at http://web.kuicr.kyoto-u.ac.jp/supp/yoshi/drugtarget/.

926 citations


Journal ArticleDOI
01 Jul 2008
TL;DR: This work presents the first exact solution for identifying functional modules in PPI networks, computing optimal-scoring subnetworks via integer-linear programming and a connection to the well-known prize-collecting Steiner tree problem from Operations Research.
Abstract: Motivation: With the exponential growth of expression and protein–protein interaction (PPI) data, the frontier of research in systems biology shifts more and more to the integrated analysis of these large datasets. Of particular interest is the identification of functional modules in PPI networks, sharing common cellular function beyond the scope of classical pathways, by means of detecting differentially expressed regions in PPI networks. This requires on the one hand an adequate scoring of the nodes in the network to be identified and on the other hand the availability of an effective algorithm to find the maximally scoring network regions. Various heuristic approaches have been proposed in the literature. Results: Here we present the first exact solution for this problem, which is based on integer-linear programming and its connection to the well-known prize-collecting Steiner tree problem from Operations Research. Despite the NP-hardness of the underlying combinatorial problem, our method typically computes provably optimal subnetworks in large PPI networks in a few minutes. An essential ingredient of our approach is a scoring function defined on network nodes. We propose a new additive score with two desirable properties: (i) it is scalable by a statistically interpretable parameter and (ii) it allows a smooth integration of data from various sources. We apply our method to a well-established lymphoma microarray dataset in combination with associated survival data and the large interaction network of HPRD to identify functional modules by computing optimal-scoring subnetworks. In particular, we find a functional interaction module associated with proliferation over-expressed in the aggressive ABC subtype as well as modules derived from non-malignant by-stander cells. Availability: Our software is available freely for non-commercial purposes at http://www.planet-lisa.net. Contact: tobias.mueller@biozentrum.uni-wuerzburg.de
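To make the "maximally scoring network region" idea concrete, here is a minimal sketch that finds the best-scoring connected subnetwork of a tiny graph by exhaustive enumeration. This is not the integer-linear programming method of the paper; brute force is swapped in purely for illustration and only works on toy inputs. The node names, scores, and function names are hypothetical.

```python
from itertools import combinations

def is_connected(nodes, adj):
    """Depth-first check that `nodes` induce a connected subgraph."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v in nodes and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes

def best_connected_subnetwork(scores, edges):
    """Exhaustively search all node subsets (exponential; toy sizes only)."""
    adj = {n: set() for n in scores}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    best_nodes, best_score = None, float("-inf")
    names = sorted(scores)
    for k in range(1, len(names) + 1):
        for subset in combinations(names, k):
            if is_connected(subset, adj):
                total = sum(scores[n] for n in subset)
                if total > best_score:
                    best_nodes, best_score = set(subset), total
    return best_nodes, best_score

# Hypothetical additive node scores: positive = differentially expressed.
scores = {"A": 2.0, "B": -1.0, "C": 3.0, "D": -4.0, "E": 1.5}
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
module, total = best_connected_subnetwork(scores, edges)
print(module, total)
```

In this toy path graph the search accepts the mildly negative node B because it connects two positive nodes, but it refuses to cross the strongly negative node D.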

513 citations


Journal ArticleDOI
TL;DR: The high temporal plasticity in species composition and interaction identity coupled with the low variation in network structure properties imply that tight and specialized coevolution might not be as important as previously suggested and that plant-pollinator interaction networks might be less prone to detrimental effects of disturbance than previously thought.
Abstract: We analysed the dynamics of a plant-pollinator interaction network of a scrub community surveyed over four consecutive years. Species composition within the annual networks showed high temporal variation. Temporal dynamics were also evident in the topology of the network, as interactions among plants and pollinators did not remain constant through time. This change involved both the number and the identity of interacting partners. Strikingly, few species and interactions were consistently present in all four annual plant-pollinator networks (53% of the plant species, 21% of the pollinator species and 4.9% of the interactions). The high turnover in species-to-species interactions was mainly the effect of species turnover (c. 70% in pairwise comparisons among years), and less the effect of species flexibility to interact with new partners (c. 30%). We conclude that specialization in plant-pollinator interactions might be highly overestimated when measured over short periods of time. This is because many plant or pollinator species appear as specialists in 1 year, but tend to be generalists or to interact with different partner species when observed in other years. The high temporal plasticity in species composition and interaction identity coupled with the low variation in network structure properties (e.g. degree centralization, connectance, nestedness, average distance and network diameter) imply (i) that tight and specialized coevolution might not be as important as previously suggested and (ii) that plant-pollinator interaction networks might be less prone to detrimental effects of disturbance than previously thought. We suggest that this may be due to the opportunistic nature of plant and animal species regarding the available partner resources they depend upon at any particular time.
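The split between turnover caused by species appearing or disappearing and turnover caused by persisting species switching partners can be sketched as follows. This is an illustrative assumption about how such a partition might be computed, not the authors' procedure; the function name and species labels are hypothetical.

```python
def partition_interaction_turnover(year1, year2):
    """Split interactions that differ between two years into two groups.

    year1, year2: sets of (plant, pollinator) pairs.
    Returns (species_driven, rewired):
      species_driven - at least one partner is absent from one of the years
      rewired        - both partners occur in both years, yet the link changed
    """
    species1 = {s for pair in year1 for s in pair}
    species2 = {s for pair in year2 for s in pair}
    shared_species = species1 & species2
    changed = year1 ^ year2  # interactions seen in exactly one year
    rewired = {
        (plant, animal)
        for plant, animal in changed
        if plant in shared_species and animal in shared_species
    }
    return changed - rewired, rewired

# Hypothetical two-year survey.
year1 = {("P1", "A"), ("P1", "B"), ("P2", "B")}
year2 = {("P1", "A"), ("P2", "A"), ("P2", "C")}
species_driven, rewired = partition_interaction_turnover(year1, year2)
print(len(species_driven), len(rewired))
```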

483 citations


Journal ArticleDOI
01 Jun 2008-Ecology
TL;DR: The day-to-day dynamics of an arctic pollination interaction network over two consecutive seasons are studied and temporal dynamics provides a mechanistic explanation for previously reported network patterns such as the heterogeneous distribution of number of interactions across species.
Abstract: Despite a strong current interest in ecological networks, the bulk of studies are static descriptions of the structure of networks, and very few analyze their temporal dynamics. Yet, understanding network dynamics is important in order to relate network patterns to ecological processes. We studied the day-to-day dynamics of an arctic pollination interaction network over two consecutive seasons. First, we found that new species entering the network tend to interact with already well-connected species, although there are deviations from this trend due, for example, to morphological mismatching between plant and pollinator traits and nonoverlapping phenophases of plant and pollinator species. Thus, temporal dynamics provides a mechanistic explanation for previously reported network patterns such as the heterogeneous distribution of number of interactions across species. Second, we looked for the ecological properties most likely to be mediating this dynamical process and found that both abundance and phenophase length were important determinants of the number of links per species.

456 citations


Journal ArticleDOI
TL;DR: A rigorous analysis of six variants of the genomewide protein interaction network for Saccharomyces cerevisiae demonstrated that the majority of hubs are essential due to their involvement in Essential Complex Biological Modules, a group of densely connected proteins with shared biological function that are enriched in essential proteins.
Abstract: The centrality-lethality rule, which notes that high-degree nodes in a protein interaction network tend to correspond to proteins that are essential, suggests that the topological prominence of a protein in a protein interaction network may be a good predictor of its biological importance. Even though the correlation between degree and essentiality was confirmed by many independent studies, the reason for this correlation remains elusive. Several hypotheses about putative connections between essentiality of hubs and the topology of protein–protein interaction networks have been proposed, but as we demonstrate, these explanations are not supported by the properties of protein interaction networks. To identify the main topological determinant of essentiality and to provide a biological explanation for the connection between the network topology and essentiality, we performed a rigorous analysis of six variants of the genomewide protein interaction network for Saccharomyces cerevisiae obtained using different techniques. We demonstrated that the majority of hubs are essential due to their involvement in Essential Complex Biological Modules, a group of densely connected proteins with shared biological function that are enriched in essential proteins. Moreover, we rejected two previously proposed explanations for the centrality-lethality rule, one relating the essentiality of hubs to their role in the overall network connectivity and another relying on the recently published essential protein interactions model.

410 citations


Journal ArticleDOI
01 Dec 2008-Ecology
TL;DR: It is shown that both unweighted network metrics (connectance, nestedness, and degree distribution) and weighted network metrics (interaction evenness, interaction strength asymmetry) are strongly constrained and biased by the number of observations.
Abstract: The structure of ecological interaction networks is often interpreted as a product of meaningful ecological and evolutionary mechanisms that shape the degree of specialization in community associations. However, here we show that both unweighted network metrics (connectance, nestedness, and degree distribution) and weighted network metrics (interaction evenness, interaction strength asymmetry) are strongly constrained and biased by the number of observations. Rarely observed species are inevitably regarded as "specialists," irrespective of their actual associations, leading to biased estimates of specialization. Consequently, a skewed distribution of species observation records (such as the lognormal), combined with a relatively low sampling density typical for ecological data, already generates a "nested" and poorly "connected" network with "asymmetric interaction strengths" when interactions are neutral. This is confirmed by null model simulations of bipartite networks, assuming that partners associate randomly in the absence of any specialization and any variation in the correspondence of biological traits between associated species (trait matching). Variation in the skewness of the frequency distribution fundamentally changes the outcome of network metrics. Therefore, interpretation of network metrics in terms of fundamental specialization and trait matching requires an appropriate control for such severe constraints imposed by information deficits. When using an alternative approach that controls for these effects, most natural networks of mutualistic or antagonistic systems show a significantly higher degree of reciprocal specialization (exclusiveness) than expected under neutral conditions. A higher exclusiveness is coherent with a tighter coevolution and suggests a lower ecological redundancy than implied by nested networks.
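The sampling argument can be illustrated with a minimal neutral simulation: partners are drawn at random in proportion to abundance, with no specialization at all. This is a sketch under stated assumptions, not the null model used in the paper; the abundances, function names, and the number of observations are hypothetical.

```python
import random

def neutral_network(plant_abundance, animal_abundance, n_obs, seed=0):
    """Draw n_obs interaction records with partners chosen independently,
    each in proportion to its abundance (no trait matching)."""
    rng = random.Random(seed)
    plants, animals = list(plant_abundance), list(animal_abundance)
    plant_w = [plant_abundance[p] for p in plants]
    animal_w = [animal_abundance[a] for a in animals]
    counts = {}
    for _ in range(n_obs):
        p = rng.choices(plants, weights=plant_w)[0]
        a = rng.choices(animals, weights=animal_w)[0]
        counts[(p, a)] = counts.get((p, a), 0) + 1
    return counts

def partners_and_observations(counts):
    """Per animal species: (number of distinct partners, number of records)."""
    partners, observations = {}, {}
    for (p, a), n in counts.items():
        partners.setdefault(a, set()).add(p)
        observations[a] = observations.get(a, 0) + n
    return {a: (len(partners[a]), observations[a]) for a in partners}

# Hypothetical skewed abundances and a low sampling density.
plant_abundance = {"P1": 50, "P2": 20, "P3": 5, "P4": 1}
animal_abundance = {"A1": 40, "A2": 10, "A3": 2, "A4": 1}
counts = neutral_network(plant_abundance, animal_abundance, n_obs=30, seed=1)
stats = partners_and_observations(counts)
for animal, (n_partners, n_records) in sorted(stats.items()):
    print(animal, n_partners, n_records)
```

A species can never show more partners than it has records, so any species observed once necessarily looks like a strict specialist, whatever its true behavior.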

401 citations


Book
01 Jan 2008
TL;DR: InnateDB is a publicly available, manually curated, integrative biology database of the human and mouse molecules, experimentally verified interactions and pathways involved in innate immunity, along with centralized annotation on the broader human and mouse interactomes.
Abstract: Although considerable progress has been made in dissecting the signaling pathways involved in the innate immune response, it is now apparent that this response can no longer be productively thought of in terms of simple linear pathways. InnateDB (www.innatedb.ca) has been developed to facilitate systems-level analyses that will provide better insight into the complex networks of pathways and interactions that govern the innate immune response. InnateDB is a publicly available, manually curated, integrative biology database of the human and mouse molecules, experimentally verified interactions and pathways involved in innate immunity, along with centralized annotation on the broader human and mouse interactomes. To date, more than 3500 innate immunity-relevant interactions have been contextually annotated through the review of 1000 plus publications. Integrated into InnateDB are novel bioinformatics resources, including network visualization software, pathway analysis, orthologous interaction network construction and the ability to overlay user-supplied gene expression data in an intuitively displayed molecular interaction network and pathway context, which will enable biologists without a computational background to explore their data in a more systems-oriented manner.

356 citations


Journal ArticleDOI
01 Jul 2008
TL;DR: This work introduces an automatic approach based on text mining and network analysis to predict gene-disease associations and evaluated the approach for prostate cancer, finding that the central genes in this disease-specific network are likely to be related to the disease.
Abstract: Motivation: Understanding the role of genetics in diseases is one of the most important aims of the biological sciences. The completion of the Human Genome Project has led to a rapid increase in the number of publications in this area. However, the coverage of curated databases that provide information manually extracted from the literature is limited. Another challenge is that determining disease-related genes requires laborious experiments. Therefore, predicting good candidate genes before experimental analysis will save time and effort. We introduce an automatic approach based on text mining and network analysis to predict gene-disease associations. We collected an initial set of known disease-related genes and built an interaction network by automatic literature mining based on dependency parsing and support vector machines. Our hypothesis is that the central genes in this disease-specific network are likely to be related to the disease. We used the degree, eigenvector, betweenness and closeness centrality metrics to rank the genes in the network. Results: The proposed approach can be used to extract known and to infer unknown gene-disease associations. We evaluated the approach for prostate cancer. Eigenvector and degree centrality achieved high accuracy. A total of 95% of the top 20 genes ranked by these methods are confirmed to be related to prostate cancer. On the other hand, betweenness and closeness centrality predicted more genes whose relation to the disease is currently unknown and are candidates for experimental study. Availability: A web-based system for browsing the disease-specific gene-interaction networks is available at: http://gin.ncibi.org Contact: [email protected]
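Ranking genes by centrality can be sketched in a few lines. The example below computes degree and closeness centrality on a tiny hypothetical interaction network; it does not reproduce the text-mining pipeline, and the gene labels and function names are assumptions made for illustration.

```python
from collections import deque

def build_adjacency(edges):
    """Undirected adjacency sets from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def degree_centrality(adj):
    """Fraction of the other nodes each node is linked to."""
    n = len(adj)
    return {u: len(nbrs) / (n - 1) for u, nbrs in adj.items()}

def closeness_centrality(adj):
    """Reachable nodes divided by the sum of shortest-path distances (BFS)."""
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result

# Hypothetical disease-specific gene interaction network.
edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G4", "G5")]
adj = build_adjacency(edges)
degree = degree_centrality(adj)
closeness = closeness_centrality(adj)
ranked_by_degree = sorted(degree, key=degree.get, reverse=True)
print(ranked_by_degree[0], degree["G1"], closeness["G1"])
```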

355 citations


Journal ArticleDOI
TL;DR: In this article, the Ising model was applied to reconstruct the functional networks of cortical neurons using correlation analysis to identify functional connectivity, and the results suggest that cortical networks are optimized for the coexistence of local and global computations.
Abstract: A small-world network has been suggested to be an efficient solution for achieving both modular and global processing—a property highly desirable for brain computations. Here, we investigated functional networks of cortical neurons using correlation analysis to identify functional connectivity. To reconstruct the interaction network, we applied the Ising model based on the principle of maximum entropy. This allowed us to assess the interactions by measuring pairwise correlations and to assess the strength of coupling from the degree of synchrony. Visual responses were recorded in visual cortex of anesthetized cats, simultaneously from up to 24 neurons. First, pairwise correlations captured most of the patterns in the population's activity and, therefore, provided a reliable basis for the reconstruction of the interaction networks. Second, and most importantly, the resulting networks had small-world properties; the average path lengths were as short as in simulated random networks, but the clustering coefficients were larger. Neurons differed considerably with respect to the number and strength of interactions, suggesting the existence of “hubs” in the network. Notably, there was no evidence for scale-free properties. These results suggest that cortical networks are optimized for the coexistence of local and global computations: feature detection and feature integration or binding.
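The two statistics behind the small-world comparison, the clustering coefficient and the average path length, can be computed as in the sketch below. It uses a regular ring lattice as a stand-in network; the recorded cortical data and the Ising-model reconstruction are not reproduced, and all names are hypothetical.

```python
from collections import deque
from itertools import combinations

def ring_lattice(n, k):
    """Each node links to its k nearest neighbours on each side of a ring."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj

def average_clustering(adj):
    """Mean over nodes of the fraction of neighbour pairs that are linked."""
    total = 0.0
    for u, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbours contribute 0
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)

def average_path_length(adj):
    """Mean shortest-path length over all reachable ordered pairs (BFS)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

lattice = ring_lattice(n=8, k=2)
clustering = average_clustering(lattice)
path_length = average_path_length(lattice)
print(clustering, path_length)
```

A network is usually called small-world when its clustering is well above that of a random graph with the same number of nodes and edges while its average path length is comparably short.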

334 citations


Journal ArticleDOI
TL;DR: This work describes a number of computational protocols for protein interaction prediction based on the structural, genomic, and biological context of proteins in complete genomes, and details methods for protein interaction network visualization and analysis.
Abstract: Recently a number of computational approaches have been developed for the prediction of protein-protein interactions. Complete genome sequencing projects have provided the vast amount of information needed for these analyses. These methods utilize the structural, genomic, and biological context of proteins and genes in complete genomes to predict protein interaction networks and functional linkages between proteins. Given that experimental techniques remain expensive, time-consuming, and labor-intensive, these methods represent an important advance in proteomics. Some of these approaches utilize sequence data alone to predict interactions, while others combine multiple computational and experimental datasets to accurately build protein interaction maps for complete genomes. These methods represent a complementary approach to current high-throughput projects whose aim is to delineate protein interaction maps in complete genomes. We will describe a number of computational protocols for protein interaction prediction based on the structural, genomic, and biological context of proteins in complete genomes, and detail methods for protein interaction network visualization and analysis.

216 citations


Journal ArticleDOI
TL;DR: An efficient heuristic algorithm, QCUT, which combines spectral graph partitioning and local search to optimize Q, is proposed, and it is shown that QCUT can find higher modularities and is more scalable than the existing algorithms.
Abstract: Community structure is an important property of complex networks. The automatic discovery of such structure is a fundamental task in many disciplines, including sociology, biology, engineering, and computer science. Recently, several community discovery algorithms have been proposed based on the optimization of a modularity function $(Q)$. However, the problem of modularity optimization is NP-hard and the existing approaches often suffer from a prohibitively long running time or poor quality. Furthermore, it has been recently pointed out that algorithms based on optimizing $Q$ will have a resolution limit; i.e., communities below a certain scale may not be detected. In this research, we first propose an efficient heuristic algorithm QCUT, which combines spectral graph partitioning and local search to optimize $Q$. Using both synthetic and real networks, we show that QCUT can find higher modularities and is more scalable than the existing algorithms. Furthermore, using QCUT as an essential component, we propose a recursive algorithm HQCUT to solve the resolution limit problem. We show that HQCUT can successfully detect communities at a much finer scale or with a higher accuracy than the existing algorithms. We also discuss two possible reasons that can cause the resolution limit problem and provide a method to distinguish them. Finally, we apply QCUT and HQCUT to study a protein-protein interaction network and show that the combination of the two algorithms can reveal interesting biological results that may be otherwise undetected.
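For reference, the modularity function $Q$ that such algorithms optimize can be evaluated directly for a given partition. The sketch below uses the standard Newman-Girvan form, $Q = \sum_c \left[ l_c/m - (d_c/2m)^2 \right]$, where $l_c$ is the number of edges inside community $c$, $d_c$ the total degree of its nodes, and $m$ the number of edges. It only scores a partition; it is not QCUT or HQCUT, and the example graph and names are hypothetical.

```python
def modularity(edges, community_of):
    """Newman-Girvan modularity of a partition of an undirected simple graph.

    edges: list of (u, v) pairs, each edge listed once.
    community_of: dict mapping every node to a community label.
    """
    m = len(edges)
    internal = {}    # l_c: edges with both ends in community c
    degree_sum = {}  # d_c: summed degree of the nodes in community c
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )

# Two triangles joined by a single bridge edge.
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
two_groups = {1: "a", 2: "a", 3: "a", 4: "b", 5: "b", 6: "b"}
one_group = {node: "all" for node in range(1, 7)}
q_split = modularity(edges, two_groups)
q_whole = modularity(edges, one_group)
print(q_split, q_whole)
```

Splitting the graph at the bridge gives $Q = 5/14$, while placing every node in one community gives $Q = 0$.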

Journal ArticleDOI
15 Aug 2008-Proteins
TL;DR: This work proposes an algorithm for detecting gene–disease associations based on the human protein–protein interaction network, known gene–disease associations, protein sequence, and protein functional information at the molecular level, and provides evidence that, despite the noise/incompleteness of experimental data and unfinished ontology of diseases, identification of candidate genes can be successful even when a large number of candidate disease terms are predicted on simultaneously.
Abstract: One of the most important tasks of modern bioinformatics is the development of computational tools that can be used to understand and treat human disease. To date, a variety of methods have been explored and algorithms for candidate gene prioritization are gaining in their usefulness. Here, we propose an algorithm for detecting gene-disease associations based on the human protein-protein interaction network, known gene-disease associations, protein sequence, and protein functional information at the molecular level. Our method, PhenoPred, is supervised: first, we mapped each gene/protein onto the spaces of disease and functional terms based on distance to all annotated proteins in the protein interaction network. We also encoded sequence, function, physicochemical, and predicted structural properties, such as secondary structure and flexibility. We then trained support vector machines to detect gene-disease associations for a number of terms in Disease Ontology and provided evidence that, despite the noise/incompleteness of experimental data and unfinished ontology of diseases, identification of candidate genes can be successful even when a large number of candidate disease terms are predicted on simultaneously.

Journal ArticleDOI
TL;DR: This work proposes a method in which all direct and indirect interactions are first weighted using a topological weight (FS-Weight), which estimates the strength of functional association; the approach can be used to improve the precision of clusters predicted by various existing clustering algorithms.
Abstract: Protein complexes are fundamental for understanding principles of cellular organizations. As the sizes of protein–protein interaction (PPI) networks are increasing, accurate and fast protein complex prediction from these PPI networks can serve as a guide for biological experiments to discover novel protein complexes. However, it is not easy to predict protein complexes from PPI networks, especially in situations where the PPI network is noisy and still incomplete. Here, we study the use of indirect interactions between level-2 neighbors (level-2 interactions) for protein complex prediction. We know from previous work that proteins which do not interact but share interaction partners (level-2 neighbors) often share biological functions. We have proposed a method in which all direct and indirect interactions are first weighted using topological weight (FS-Weight), which estimates the strength of functional association. Interactions with low weight are removed from the network, while level-2 interactions with high weight are introduced into the interaction network. Existing clustering algorithms can then be applied to this modified network. We have also proposed a novel algorithm that searches for cliques in the modified network, and merges cliques to form clusters using a “partial clique merging” method. Experiments show that (1) the use of indirect interactions and topological weight to augment protein–protein interactions can be used to improve the precision of clusters predicted by various existing clustering algorithms; and (2) our complex-finding algorithm performs very well on interaction networks modified in this way. Since no other information except the original PPI network is used, our approach would be very useful for protein complex prediction, especially for prediction of novel protein complexes.

Journal ArticleDOI
TL;DR: This paper significantly extends the class of pathways that can be efficiently queried to the case of trees, and graphs of bounded treewidth, and implements a tool for tree queries, called QNet, and uses it to perform the first large-scale cross-species comparison of protein complexes.
Abstract: Molecular interaction databases can be used to study the evolution of molecular pathways across species. Querying such pathways is a challenging computational problem, and recent efforts have been limited to simple queries (paths), or simple networks (forests). In this paper, we significantly extend the class of pathways that can be efficiently queried to the case of trees, and graphs of bounded treewidth. Our algorithm allows the identification of non-exact (homeomorphic) matches, exploiting the color coding technique of Alon et al. (1995). We implement a tool for tree queries, called QNet, and test its retrieval properties in simulations and on real network data. We show that QNet searches queries with up to nine proteins in seconds on current networks, and outperforms sequence-based searches. We also use QNet to perform the first large-scale cross-species comparison of protein complexes, by querying known yeast complexes against a fly protein interaction network. This comparison points to strong conservation between the two species, and underscores the importance of our tool in mining protein interaction networks.

Proceedings ArticleDOI
01 Nov 2008
TL;DR: This work is the first attempt to predict the global set of interactions between HIV-1 and human host cellular proteins and proposes a supervised learning framework, where multiple information data sources are utilized.
Abstract: Human immunodeficiency virus-1 (HIV-1) in acquired immune deficiency syndrome (AIDS) relies on human host cell proteins in virtually every aspect of its life cycle. Knowledge of the set of interacting human and viral proteins would greatly contribute to our understanding of the mechanisms of infection and subsequently to the design of new therapeutic approaches. This work is the first attempt to predict the global set of interactions between HIV-1 and human host cellular proteins. We propose a supervised learning framework, where multiple information data sources are utilized, including cooccurrence of functional motifs and their interaction domains and protein classes, gene ontology annotations, posttranslational modifications, tissue distributions and gene expression profiles, topological properties of the human protein in the interaction network and the similarity of HIV-1 proteins to human proteins’ known binding partners. We trained and tested a Random Forest (RF) classifier with this extensive feature set. The model’s predictions achieved an average Mean Average Precision (MAP) score of 23%. Among the predicted interactions was for example the pair, HIV-1 protein tat and human vitamin D receptor. This interaction had recently been independently validated experimentally. The rank-ordered lists of predicted interacting pairs are a rich source for generating biological hypotheses. Amongst the novel predictions, transcription regulator activity, immune system process and macromolecular complex were the top most significant molecular function, process and cellular compartments, respectively. Supplementary material is available at URL www.cs.cmu.edu/~oznur/hiv/hivPPI.html
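The Mean Average Precision score reported above can be computed as in this minimal sketch. It assumes each ranking is a list of binary labels ordered from highest to lowest predicted score and that all known positives appear in the list; how the rankings were grouped in the paper is not stated here, so the grouping and the example labels are hypothetical.

```python
def average_precision(ranked_labels):
    """Average of precision@k taken at the rank k of each true positive."""
    hits, total = 0, 0.0
    for rank, is_positive in enumerate(ranked_labels, start=1):
        if is_positive:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

def mean_average_precision(rankings):
    """Mean of the per-ranking average precision values."""
    return sum(average_precision(r) for r in rankings) / len(rankings)

# Hypothetical rank-ordered predictions (1 = known interaction, 0 = not known).
rankings = [
    [1, 0, 1, 0],  # AP = (1/1 + 2/3) / 2 = 5/6
    [0, 1],        # AP = (1/2) / 1 = 1/2
]
map_score = mean_average_precision(rankings)
print(round(map_score, 4))
```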

Journal ArticleDOI
TL;DR: This work presents the most complete analysis of the proteasome interaction network to date, providing an inclusive set of physical interaction data consistent with physiological roles for the proteasome that have been suggested primarily through genetic analyses.
Abstract: Quantitative analysis of tandem-affinity purified cross-linked (x) protein complexes (QTAX) is a powerful technique for the identification of protein interactions, including weak and/or transient components. Here, we apply a QTAX-based tag-team mass spectrometry strategy coupled with protein network analysis to acquire a comprehensive and detailed assessment of the protein interaction network of the yeast 26S proteasome. We have determined that the proteasome network is composed of at least 471 proteins, significantly more than the total number of proteins identified by previous reports using proteasome subunits as baits. Validation of the selected proteasome-interacting proteins by reverse copurification and immunoblotting experiments with and without cross-linking, further demonstrates the power of the QTAX strategy for capturing protein interactions of all natures. In addition, >80% of the identified interactions have been confirmed by existing data using protein network analysis. Moreover, evidence obtained through network analysis links the proteasome to protein complexes associated with diverse cellular functions. This work presents the most complete analysis of the proteasome interaction network to date, providing an inclusive set of physical interaction data consistent with physiological roles for the proteasome that have been suggested primarily through genetic analyses. Moreover, the methodology described here is a general proteomic tool for the comprehensive study of protein interaction networks.

Journal ArticleDOI
TL;DR: A multi-step but easy-to-follow framework for identifying protein complexes from MS pull-down data assesses interaction affinity between two proteins based on similarity of their co-purification patterns derived from MS data and constructs a protein interaction network by adopting a knowledge-guided threshold selection method.
Abstract: Motivation: Recent improvements in high-throughput Mass Spectrometry (MS) technology have expedited genome-wide discovery of protein–protein interactions by providing a capability of detecting protein complexes in a physiological setting. Computational inference of protein interaction networks and protein complexes from MS data is challenging. Advances are required in developing robust and seamlessly integrated procedures for assessment of protein–protein interaction affinities, mathematical representation of protein interaction networks, discovery of protein complexes and evaluation of their biological relevance. Results: A multi-step but easy-to-follow framework for identifying protein complexes from MS pull-down data is introduced. It assesses interaction affinity between two proteins based on similarity of their co-purification patterns derived from MS data. It constructs a protein interaction network by adopting a knowledge-guided threshold selection method. Based on the network, it identifies protein complexes and infers their core components using a graph-theoretical approach. It deploys a statistical evaluation procedure to assess biological relevance of each found complex. On Saccharomyces cerevisiae pull-down data, the framework outperformed other more complicated schemes by at least 10% in F1-measure and identified 610 protein complexes with high-functional homogeneity based on the enrichment in Gene Ontology (GO) annotation. Manual examination of the complexes brought forward hypotheses on the causes of false identifications. Namely, co-purification of different protein complexes as mediated by a common non-protein molecule, such as DNA, might be a source of false positives. Protein identification bias in pull-down technology, such as the hydrophilic bias, could result in false negatives. Contact: samatovan@ornl.gov Supplementary information: Supplementary data are available at Bioinformatics online.

Journal ArticleDOI
TL;DR: A detailed and curated map of molecular interactions taking place in the regulation of the cell cycle by the retinoblastoma protein (RB/RB1) is presented, which contains more details about RB/E2F interaction network than existing large‐scale pathway databases.
Abstract: We present, here, a detailed and curated map of molecular interactions taking place in the regulation of the cell cycle by the retinoblastoma protein (RB/RB1). Deregulations and/or mutations in this pathway are observed in most human cancers. The map was created using Systems Biology Graphical Notation language with the help of CellDesigner 3.5 software and converted into BioPAX 2.0 pathway description format. In the current state the map contains 78 proteins, 176 genes, 99 protein complexes, 208 distinct chemical species and 165 chemical reactions. Overall, the map recapitulates biological facts from approximately 350 publications annotated in the diagram. The network contains more details about RB/E2F interaction network than existing large-scale pathway databases. Structural analysis of the interaction network revealed a modular organization of the network, which was used to elaborate a more summarized, higher-level representation of RB/E2F network. The simplification of complex networks opens the road for creating realistic computational models of this regulatory pathway.

Journal ArticleDOI
TL;DR: A new methodology called SCAN (Structural Clustering Algorithm for Networks) is devised that can efficiently find clusters or functional modules in complex biological networks as well as hubs and outliers and classify nodes into various roles based on their structures.
Abstract: Background: Biological systems can be modeled as complex network systems with many interactions between the components. These interactions give rise to the function and behavior of that system. For example, the protein-protein interaction network is the physical basis of multiple cellular functions. One goal of emerging systems biology is to analyze very large complex biological networks such as protein-protein interaction networks, metabolic networks, and regulatory networks to identify functional modules and assign functions to certain components of the system. Network modules do not occur by chance, so identification of modules is likely to capture the biologically meaningful interactions in large-scale PPI data. Unfortunately, existing computer-based clustering methods developed to find those modules are either not so accurate or too slow.
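
The abstract above does not describe how SCAN groups nodes. As a rough illustration only, the sketch below computes the structural similarity on which SCAN is usually described as being built: the number of shared neighbors of two nodes (each node counted as its own neighbor), normalized by the geometric mean of their neighborhood sizes. The toy graph and function names are hypothetical.

```python
import math

# Hypothetical toy network: a triangle A-B-C with a pendant node D on C.
adjacency = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}

def closed_neighborhood(adjacency, node):
    """Neighbors of a node plus the node itself."""
    return adjacency[node] | {node}

def structural_similarity(adjacency, u, v):
    """Shared closed-neighborhood size, normalized by the geometric mean
    of the two neighborhood sizes."""
    nu = closed_neighborhood(adjacency, u)
    nv = closed_neighborhood(adjacency, v)
    return len(nu & nv) / math.sqrt(len(nu) * len(nv))

for u, v in [("A", "B"), ("A", "C"), ("C", "D")]:
    print(u, v, round(structural_similarity(adjacency, u, v), 3))
```

Nodes inside a dense module share most of their neighbors and score near 1, while a node attached by a single edge scores lower; that contrast is what allows such a method to separate cluster members from hubs and outliers.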

Journal ArticleDOI
TL;DR: The Pathway Interaction Database (PID), a freely available collection of curated and peer-reviewed pathways composed of human molecular signaling and regulatory events and key cellular processes, serves as a research tool for the cancer research community and others interested in cellular pathways.
Abstract: The Pathway Interaction Database (PID, http://pid.nci.nih.gov) is a freely available collection of curated and peer-reviewed pathways composed of human molecular signaling and regulatory events and key cellular processes. Created in a collaboration between the U.S. National Cancer Institute and Nature Publishing Group, the database serves as a research tool for the cancer research community and others interested in cellular pathways, such as neuroscientists, developmental biologists, and immunologists. PID offers a range of search features to facilitate pathway exploration. Users can browse the predefined set of pathways or create interaction network maps centered on a single molecule or cellular process of interest. In addition, the batch query tool allows users to upload long list(s) of molecules, such as those derived from microarray experiments, and either overlay these molecules onto predefined pathways or visualize the complete molecular connectivity map. Users can also download molecule lists, citation lists and complete database content in extensible markup language (XML) and Biological Pathways Exchange (BioPAX) Level 2 format. The database is updated with new pathway content every month and supplemented by specially commissioned articles on the practical uses of other relevant online tools.

Journal ArticleDOI
TL;DR: Several network concepts have emerged in proteomics; although all are formulated as networks, they represent widely different physical systems, so it is important to highlight their differences and to take caution when applying topological analysis.
Abstract: The formulation of network models from global protein studies is essential to understand the functioning of organisms. Network models of the proteome enable the application of Complex Network Analysis, a quantitative framework to investigate large complex networks using techniques from graph theory, statistical physics, dynamical systems and other fields. This approach has provided many insights into the functional organization of the proteome so far and will likely continue to do so. Currently, several network concepts have emerged in the field of proteomics. It is important to highlight the differences between these concepts, since different representations allow different insights into functional organization. One such concept is the protein interaction network, which contains proteins as nodes and undirected edges representing the occurrence of binding in large-scale protein-protein interaction studies. A second concept is the protein-signaling network, in which the nodes correspond to levels of post-translationally modified forms of proteins and directed edges to causal effects through post-translational modification, such as phosphorylation. Several other network concepts were introduced for proteomics. Although all formulated as networks, the concepts represent widely different physical systems. Therefore caution should be taken when applying relevant topological analysis. We review recent literature formulating and analyzing such networks.

Journal ArticleDOI
TL;DR: Topological analysis of the global correlation between microRNA (miRNA) regulation and protein‐protein interaction network in human showed that target genes of individual miRNA tend to be hubs and bottlenecks in the network.
Abstract: We have performed topological analysis to elucidate the global correlation between microRNA (miRNA) regulation and protein-protein interaction network in human. The analysis showed that target genes of individual miRNA tend to be hubs and bottlenecks in the network. While proteins directly regulated by miRNA might not form a network module themselves, the miRNA-target genes and their interacting neighbors jointly showed significantly higher modularity. Our findings shed light on how miRNA may regulate the protein interaction network.
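
The abstract does not define "hubs" and "bottlenecks". A common convention, assumed here, is that hubs are nodes of high degree and bottlenecks are nodes of high betweenness centrality. The sketch below computes both on a hypothetical toy network using a plain-Python version of Brandes' algorithm for unweighted, undirected graphs.

```python
from collections import deque

# Hypothetical toy network: H links a, b and c; c also links d.
adjacency = {
    "H": {"a", "b", "c"},
    "a": {"H"},
    "b": {"H"},
    "c": {"H", "d"},
    "d": {"c"},
}

def degrees(adjacency):
    """Number of interaction partners per node (hubs have high degree)."""
    return {v: len(nbrs) for v, nbrs in adjacency.items()}

def betweenness(adjacency):
    """Unnormalized betweenness centrality (Brandes' algorithm)."""
    centrality = {v: 0.0 for v in adjacency}
    for source in adjacency:
        # Breadth-first search counting shortest paths from the source.
        order = []
        predecessors = {v: [] for v in adjacency}
        n_paths = {v: 0 for v in adjacency}
        distance = {v: -1 for v in adjacency}
        n_paths[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    n_paths[w] += n_paths[v]
                    predecessors[w].append(v)
        # Accumulate path dependencies from the farthest nodes inward.
        dependency = {v: 0.0 for v in adjacency}
        for w in reversed(order):
            for v in predecessors[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2 for v, c in centrality.items()}

print(degrees(adjacency))
print(betweenness(adjacency))
```

In this toy graph, `H` has both the highest degree and the highest betweenness, while `c` has modest degree but still lies on every shortest path to `d`, which is the kind of node the bottleneck notion is meant to capture.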

Journal ArticleDOI
TL;DR: NeAT is a suite of computer tools that integrate various algorithms for the analysis of biological networks: comparison between graphs, between clusters, or between graphs and clusters; network randomization; analysis of degree distribution; network-based clustering and path finding.
Abstract: Network Analysis Tools (NeAT) is a suite of computer tools that integrate various algorithms for the analysis of biological networks: comparison between graphs, between clusters, or between graphs and clusters; network randomization; analysis of degree distribution; network-based clustering and path finding. The tools are interconnected to enable a stepwise analysis of the network through a complete analytical workflow. In this protocol, we present a typical case of utilization, where the tasks above are combined to decipher a protein-protein interaction network retrieved from the STRING database. The results returned by NeAT are typically subnetworks, networks enriched with additional information (i.e., clusters or paths) or tables displaying statistics. Typical networks comprising several thousands of nodes and arcs can be analyzed within a few minutes. The complete protocol can be read and executed in approximately 1 h.
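
Two of the tasks listed above, analysis of the degree distribution and network randomization, can be sketched in a few lines. This is not NeAT's interface or implementation; the function names, the toy edge list, and the choice of degree-preserving edge swaps as the randomization scheme are assumptions made for illustration.

```python
import random
from collections import Counter

# Hypothetical undirected network given as an edge list.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E"), ("E", "F")]

def node_degrees(edges):
    """Degree of every node in an undirected edge list."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return degree

def degree_distribution(edges):
    """Map each degree value to the number of nodes having it."""
    return Counter(node_degrees(edges).values())

def randomize(edges, n_swaps, seed=0):
    """Degree-preserving randomization: repeatedly pick two edges
    a-b and c-d and rewire them to a-d and c-b."""
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if len({a, b, c, d}) < 4:
            continue  # edges share a node; skipping avoids self-loops
        if frozenset((a, d)) in present or frozenset((c, b)) in present:
            continue  # swap would duplicate an existing edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {frozenset((a, d)), frozenset((c, b))}
        edges[i], edges[j] = (a, d), (c, b)
    return edges

shuffled = randomize(edges, n_swaps=100)
print(degree_distribution(edges))
print(shuffled)
```

Because every swap leaves each node's degree unchanged, the randomized network keeps the original degree distribution, which makes it a reasonable null model when asking whether clusters or paths found in the real network could arise by chance.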

Journal ArticleDOI
TL;DR: The objective of the present study is to systematically analyze the complex effects of interrelated genes and provide a framework for revealing their relationships in association with a specific disease (asthma) and suggest unknown candidate target genes associated with asthma.

Journal ArticleDOI
TL;DR: An alternative stochastic model is proposed, which adds each protein sequentially to a growing network in a manner analogous to protein crystal growth (CG) in solution, and is well supported by the spatial arrangement of protein complexes of known 3-D structure, suggesting a plausible physical mechanism for network evolution.
Abstract: Proteins interact in complex protein–protein interaction (PPI) networks whose topological properties—such as scale-free topology, hierarchical modularity, and dissortativity—have suggested models of network evolution. Currently preferred models invoke preferential attachment or gene duplication and divergence to produce networks whose topology matches that observed for real PPIs, thus supporting these as likely models for network evolution. Here, we show that the interaction density and homodimeric frequency are highly protein age–dependent in real PPI networks in a manner which does not agree with these canonical models. In light of these results, we propose an alternative stochastic model, which adds each protein sequentially to a growing network in a manner analogous to protein crystal growth (CG) in solution. The key ideas are (1) interaction probability increases with availability of unoccupied interaction surface, thus following an anti-preferential attachment rule, (2) as a network grows, highly connected sub-networks emerge into protein modules or complexes, and (3) once a new protein is committed to a module, further connections tend to be localized within that module. The CG model produces PPI networks consistent in both topology and age distributions with real PPI networks and is well supported by the spatial arrangement of protein complexes of known 3-D structure, suggesting a plausible physical mechanism for network evolution.
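
The three rules of the growth process can be mimicked with a toy simulation. This is a heavily simplified sketch, not the authors' CG model: the fixed per-protein `capacity` (standing in for interaction surface), the `extra_links` parameter, and the use of the anchor's current partners as its "module" are all assumptions introduced here.

```python
import random

def grow_network(n_proteins, capacity=4, extra_links=1, seed=0):
    """Toy sequential growth: each new protein attaches to an existing
    protein chosen with probability proportional to its free capacity
    (anti-preferential attachment), then adds further links only among
    that protein's current partners (links stay local to the module)."""
    rng = random.Random(seed)
    neighbors = {0: {1}, 1: {0}}  # start from a single interacting pair

    def free(p):
        return capacity - len(neighbors[p])

    for new in range(2, n_proteins):
        candidates = [p for p in neighbors if free(p) > 0]
        if not candidates:
            neighbors[new] = set()  # no free surface anywhere
            continue
        weights = [free(p) for p in candidates]
        anchor = rng.choices(candidates, weights=weights, k=1)[0]
        # Partners of the anchor that can still accept a link.
        module = [p for p in neighbors[anchor] if free(p) > 0]
        neighbors[new] = {anchor}
        neighbors[anchor].add(new)
        rng.shuffle(module)
        for p in module[:extra_links]:
            if free(new) > 0:
                neighbors[new].add(p)
                neighbors[p].add(new)
    return neighbors

network = grow_network(50)
print(max(len(partners) for partners in network.values()))
```

Because a protein's chance of gaining a partner falls as its surface fills up, no node can accumulate links indefinitely, in contrast to preferential attachment, where well-connected nodes attract ever more links.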

Journal ArticleDOI
TL;DR: This work introduces an interactive inference algorithm to infer a realizable S-system structure for biochemical networks and guarantees the minimum solution for the epsilon-constrained problem to achieve the minimum interaction network for the inference problem.
Abstract: Motivation: The inference of biochemical networks, such as gene regulatory networks, protein–protein interaction networks, and metabolic pathway networks, from time-course data is one of the main challenges in systems biology. The ultimate goal of inferred modeling is to obtain expressions that quantitatively understand every detail and principle of biological systems. To infer a realizable S-system structure, most articles have applied sums of magnitude of kinetic orders as a penalty term in the fitness evaluation. How to tune a penalty weight to yield a realizable model structure is the main issue for the inverse problem. No guideline has been published for tuning a suitable penalty weight to infer a suitable model structure of biochemical networks. Results: We introduce an interactive inference algorithm to infer a realizable S-system structure for biochemical networks. The inference problem is formulated as a multiobjective optimization problem to minimize simultaneously the concentration error, slope error and interaction measure in order to find a suitable S-system model structure and its corresponding model parameters. The multiobjective optimization problem is solved by the ε-constraint method to minimize the interaction measure subject to the expectation constraints for the concentration and slope error criteria. The theorems serve to guarantee the minimum solution for the ε-constrained problem to achieve the minimum interaction network for the inference problem. The approach could avoid assigning a penalty weight for sums of magnitude of kinetic orders. Contact: chmfsw@ccu.edu.tw Supplementary information: Supplementary data are available at Bioinformatics online.
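
The formulation described above can be written out schematically. The first line is the standard S-system form; the second is the ε-constraint problem as the abstract describes it. The symbols $J_I$, $J_C$, $J_S$ (interaction measure, concentration error, slope error), the bounds $\varepsilon_C$, $\varepsilon_S$, and the parameter vector $\theta$ are names assumed here for illustration, and the abstract does not give the exact definitions of the three criteria.

```latex
% S-system model with n state variables X_i, rate constants
% \alpha_i, \beta_i and kinetic orders g_{ij}, h_{ij}
\dot{X}_i = \alpha_i \prod_{j=1}^{n} X_j^{g_{ij}}
          - \beta_i \prod_{j=1}^{n} X_j^{h_{ij}},
\qquad i = 1, \dots, n

% \varepsilon-constraint problem over \theta = (\alpha, \beta, g, h):
% minimize the interaction measure while bounding both error criteria
\min_{\theta} \; J_I(\theta)
\quad \text{subject to} \quad
J_C(\theta) \le \varepsilon_C, \qquad
J_S(\theta) \le \varepsilon_S
```

Turning the two error criteria into constraints is what removes the penalty weight: instead of tuning a weight on the sum of kinetic-order magnitudes, the user states acceptable error levels directly.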

Journal ArticleDOI
TL;DR: The results suggest an intriguing conclusion: although redundancy is typically transient on evolutionary time scales, it tends to be preserved among some of the central proteins in the cellular interaction network.
Abstract: The widely observed dispensability of duplicate genes is typically interpreted to suggest that a proportion of the duplicate pairs are at least partially redundant in their functions, thus allowing for compensatory effects. However, because redundancy is expected to be evolutionarily short lived, there is currently debate on both the proportion of redundant duplicates and their functional importance. Here, we examined these compensatory interactions by relying on a genome-wide data analysis, followed by experiments and literature mining in yeast. Our data, thus, strongly suggest that compensated duplicates are not randomly distributed within the protein interaction network but are rather strategically allocated to the most highly connected proteins. This design is appealing because it suggests that many of the potentially vulnerable nodes that would otherwise be highly sensitive to mutations are often protected by redundancy. Furthermore, divergence analyses show that this association between redundancy and protein connectivity becomes even more significant among the ancient duplicates, suggesting that these functional overlaps have undergone purifying selection. Our results suggest an intriguing conclusion: although redundancy is typically transient on evolutionary time scales, it tends to be preserved among some of the central proteins in the cellular interaction network.

Journal ArticleDOI
TL;DR: This meta‐analysis exploits non‐protein‐based data, but successfully predicts associations, including 5589 novel human physical protein associations, with measured accuracies of 54±10%, comparable to direct large‐scale interaction assays.
Abstract: The human protein interaction network will offer global insights into the molecular organization of cells and provide a framework for modeling human disease, but the network's large scale demands new approaches. We report a set of 7000 physical associations among human proteins inferred from indirect evidence: the comparison of human mRNA co-expression patterns with those of orthologous genes in five other eukaryotes, which we demonstrate identifies proteins in the same physical complexes. To evaluate the accuracy of the predicted physical associations, we apply quantitative mass spectrometry shotgun proteomics to measure elution profiles of 3013 human proteins during native biochemical fractionation, demonstrating systematically that putative interaction partners tend to co-sediment. We further validate uncharacterized proteins implicated by the associations in ribosome biogenesis, including WBSCR20C, associated with Williams-Beuren syndrome. This meta-analysis therefore exploits non-protein-based data, but successfully predicts associations, including 5589 novel human physical protein associations, with measured accuracies of 54±10%, comparable to direct large-scale interaction assays. The new associations' derivation from conserved in vivo phenomena argues strongly for their biological relevance.

Journal ArticleDOI
TL;DR: The network-based target selection (BioNet) approach described here is an example of a general strategy for targeting co-functioning proteins by structural genomics projects.

Journal ArticleDOI
TL;DR: It is found that cancer genes are fragile components of the human gene repertoire, sensitive to dosage modification, and other nodes of the human PIN with similar properties are rare and probably enriched in candidate cancer genes.