Showing papers on "Interaction network" published in 2004


Journal ArticleDOI
TL;DR: This work proposes a community standard data model for the representation and exchange of protein interaction data, jointly developed by members of the Proteomics Standards Initiative (PSI) and the Human Proteome Organization (HUPO).
Abstract: A major goal of proteomics is the complete description of the protein interaction network underlying cell physiology. A large number of small-scale and, more recently, large-scale experiments have contributed to expanding our understanding of the nature of the interaction network. However, the necessary data integration across experiments is currently hampered by the fragmentation of publicly available protein interaction data, which exists in different formats in databases, on authors' websites or sometimes only in print publications. Here, we propose a community standard data model for the representation and exchange of protein interaction data. This data model has been jointly developed by members of the Proteomics Standards Initiative (PSI), a work group of the Human Proteome Organization (HUPO), and is supported by major protein interaction data providers, in particular the Biomolecular Interaction Network Database (BIND), Cellzome (Heidelberg, Germany), the Database of Interacting Proteins (DIP), Dana Farber Cancer Institute (Boston, MA, USA), the Human Protein Reference Database (HPRD), Hybrigenics (Paris, France), the European Bioinformatics Institute's (EMBL-EBI, Hinxton, UK) IntAct, the Molecular Interactions (MINT, Rome, Italy) database, the Protein-Protein Interaction Database (PPID, Edinburgh, UK) and the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING, EMBL, Heidelberg, Germany).

658 citations


Journal ArticleDOI
TL;DR: The Biomolecular Interaction Network Database (BIND) archives biomolecular interaction, reaction, complex and pathway information and provides users with methods to discover interactions and molecular mechanisms.
Abstract: The Biomolecular Interaction Network Database (BIND) (http://bind.ca) archives biomolecular interaction, reaction, complex and pathway information. Our aim is to curate the details about molecular interactions that arise from published experimental research and to provide this information, as well as tools to enable data analysis, freely to researchers worldwide. BIND data are curated into a comprehensive machine-readable archive of computable information, and BIND provides users with methods to discover interactions and molecular mechanisms. BIND has worked to develop new methods for visualization that amplify the underlying annotation of genes and proteins to facilitate the study of molecular interaction networks. BIND has maintained an open database policy since its inception in 1999. Data growth has proceeded at a tremendous rate, approaching over 100 000 records. New services provided include a new BIND Query and Submission interface, a Simple Object Access Protocol (SOAP) service and the Small Molecule Interaction Database (http://smid.blueprint.org) that allows users to determine probable small molecule binding sites of new sequences and examine conserved binding residues.

635 citations


Journal ArticleDOI
TL;DR: This work compares four available databases that approximate the protein interaction network of the yeast, Saccharomyces cerevisiae, aiming to uncover the network's generic large‐scale properties and the impact of the proteins' function and cellular localization on the network topology.
Abstract: The elucidation of the cell’s large-scale organization is a primary challenge for post-genomic biology, and understanding the structure of protein interaction networks offers an important starting point for such studies. We compare four available databases that approximate the protein interaction network of the yeast, Saccharomyces cerevisiae, aiming to uncover the network’s generic large-scale properties and the impact of the proteins’ function and cellular localization on the network topology. We show how each database supports a scale-free topology with hierarchical modularity, indicating that these features represent a robust and generic property of the protein interaction network. We also find strong correlations between the network’s structure and the functional role and subcellular localization of its protein constituents, concluding that most functional and/or localization classes appear as relatively segregated subnetworks of the full protein interaction network. The uncovered systematic differences between the four protein interaction databases reflect their relative coverage for different functional and localization classes and provide a guide for their utility in various bioinformatics studies.

623 citations


Journal ArticleDOI
TL;DR: The coupled dynamics of the internal states of a set of interacting elements and the network of interactions among them and the formation of a hierarchical interaction network that sustains a highly cooperative stationary state are explored.
Abstract: We explore the coupled dynamics of the internal states of a set of interacting elements and the network of interactions among them. Interactions are modeled by a spatial game and the network of interaction links evolves adapting to the outcome of the game. As an example, we consider a model of cooperation in which the adaptation is shown to facilitate the formation of a hierarchical interaction network that sustains a highly cooperative stationary state. The resulting network has the characteristics of a small world network when a mechanism of local neighbor selection is introduced in the adaptive network dynamics. The highly connected nodes in the hierarchical structure of the network play a leading role in the stability of the network. Perturbations acting on the state of these special nodes trigger global avalanches leading to complete network reorganization.

482 citations


Journal ArticleDOI
TL;DR: Protein interaction networks summarize large amounts of protein-protein interaction data, both from individual, small-scale experiments and from automated high-throughput screens, for reconstructing the human protein interaction network.

359 citations


Journal ArticleDOI
TL;DR: An integrated approach combining large-scale protein interaction mapping, exploration of the interaction network, and cellular functional assays performed on newly identified proteins involved in a human signaling pathway is presented, validating this integrated functional proteomics approach.
Abstract: Access to the human genome facilitates extensive functional proteomics studies. Here, we present an integrated approach combining large-scale protein interaction mapping, exploration of the interaction network, and cellular functional assays performed on newly identified proteins involved in a human signaling pathway. As a proof of principle, we studied the Smad signaling system, which is regulated by members of the transforming growth factor β (TGFβ) superfamily. We used two-hybrid screening to map Smad signaling protein–protein interactions and to establish a network of 755 interactions, involving 591 proteins, 179 of which were poorly or not annotated. The exploration of such complex interaction databases is improved by the use of PIMRider, a dedicated navigation tool accessible through the Web. The biological meaning of this network is illustrated by the presence of 18 known Smad-associated proteins. Functional assays performed in mammalian cells including siRNA knock-down experiments identified eight novel proteins involved in Smad signaling, thus validating this integrated functional proteomics approach.

315 citations


Journal ArticleDOI
TL;DR: In this article, a detailed statistical analysis of the protein interactions in Saccharomyces cerevisiae based on several large-throughput datasets is presented, where the authors infer rate estimates for two key evolutionary processes shaping the network: (i) gene duplications and (ii) gain and loss of interactions through mutations in existing proteins, referred as link dynamics.
Abstract: The structure of molecular networks derives from dynamical processes on evolutionary time scales. For protein interaction networks, global statistical features of their structure can now be inferred consistently from several large-throughput datasets. Understanding the underlying evolutionary dynamics is crucial for discerning random parts of the network from biologically important properties shaped by natural selection. We present a detailed statistical analysis of the protein interactions in Saccharomyces cerevisiae based on several large-throughput datasets. Protein pairs resulting from gene duplications are used as tracers into the evolutionary past of the network. From this analysis, we infer rate estimates for two key evolutionary processes shaping the network: (i) gene duplications and (ii) gain and loss of interactions through mutations in existing proteins, which are referred to as link dynamics. Importantly, the link dynamics is asymmetric, i.e., the evolutionary steps are mutations in just one of the binding partners. The link turnover is shown to be much faster than gene duplications. Both processes are assembled into an empirically grounded, quantitative model for the evolution of protein interaction networks. According to this model, the link dynamics is the dominant evolutionary force shaping the statistical structure of the network, while the slower gene duplication dynamics mainly affects its size. Specifically, the model predicts (i) a broad distribution of the connectivities (i.e., the number of binding partners of a protein) and (ii) correlations between the connectivities of interacting proteins, a specific consequence of the asymmetry of the link dynamics. Both features have been observed in the protein interaction network of S. cerevisiae.

220 citations


Journal ArticleDOI
TL;DR: A novel method is introduced, the evolutionary excess retention (ER), allowing for a robust and strong correlation between the conservation, essentiality, and connectivity of a yeast protein, and concludes that the relevance of the hubs for the network integrity is simultaneously reflected by a considerable probability of simultaneously being evolutionarily conserved and essential.
Abstract: The integrity of the yeast protein-protein interaction network is maintained by a few highly connected proteins, or hubs, which hold the numerous less-connected proteins together. The structural importance and the increased essentiality of these proteins suggest that they are likely to be conserved in evolution, implying a strong relationship between the number of interactions and their evolutionary distance to its orthologs in other organisms. The existence of this coherence was recently reported to strongly depend on the quality of the protein interaction and orthologs data. Here, we introduce a novel method, the evolutionary excess retention (ER), allowing us to uncover a robust and strong correlation between the conservation, essentiality, and connectivity of a yeast protein. We conclude that the relevance of the hubs for the network integrity is simultaneously reflected by a considerable probability of simultaneously being evolutionarily conserved and essential, an observation that does not have an equivalent for nonessential proteins. Providing a thorough assessment of the impact noisy and incomplete data have on our findings, we conclude that our results are largely insensitive to the quality of the utilized data.

154 citations


Journal ArticleDOI
TL;DR: It is shown that the positive feedback regulation of E2F1 and a double activator-inhibitor module can lead to bistability and core modules can explain major features of the complex G1/S network and have a robust decision taking function.
Abstract: Motivation: Mathematical models of the cell cycle can contribute to an understanding of its basic mechanisms. Modern simulation tools make the analysis of key components and their interactions very effective. This paper focuses on the role of small modules and feedbacks in the gene-protein network governing the G1/S transition in mammalian cells. Mutations in this network may lead to uncontrolled cell proliferation. Bifurcation analysis helps to identify the key components of this extremely complex interaction network. Results: We identify various positive and negative feedback loops in the network controlling the G1/S transition. It is shown that the positive feedback regulation of E2F1 and a double activator-inhibitor module can lead to bistability. Extensions of the core module preserve the essential features such as bistability. The complete model exhibits a transcritical bifurcation in addition to bistability. We relate these bifurcations to the cell cycle checkpoint and the G1/S phase transition point. Thus, core modules can explain major features of the complex G1/S network and have a robust decision taking function.
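
The link between positive feedback and bistability mentioned in this abstract can be illustrated with a generic one-variable toy model. The sketch below is not the paper's G1/S model: the rate law, the decay constant 0.4, and the function names `rate` and `settle` are all assumptions chosen so that the steady states fall at x = 0, 0.5 and 2. It only shows that a sigmoidal self-activation term competing with linear decay can send nearby starting points to different stable states.

```python
def rate(x, decay=0.4):
    """Toy rate law (assumed): sigmoidal self-activation minus linear decay."""
    return x * x / (1.0 + x * x) - decay * x


def settle(x0, dt=0.01, steps=20000):
    """Integrate dx/dt = rate(x) with forward Euler and return the final value."""
    x = x0
    for _ in range(steps):
        x += dt * rate(x)
    return x


# With decay = 0.4 the steady states are x = 0 (stable), 0.5 (unstable), 2 (stable).
low = settle(0.3)   # starts below the unstable threshold
high = settle(0.7)  # starts above the unstable threshold
print(round(low, 3), round(high, 3))
```

Starting points on either side of the unstable state end in different stable states, which is the switch-like behaviour that bistability refers to.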

148 citations


Journal ArticleDOI
TL;DR: A global network investigation of the genotype/phenotype data set developed for the recovery of the yeast Saccharomyces cerevisiae from exposure to DNA-damaging agents is presented, enabling explicit study of how protein–protein interaction network characteristics may be associated with phenotypic functional effects.
Abstract: Using genome-wide information to understand holistically how cells function is a major challenge of the postgenomic era. Recent efforts to understand molecular pathway operation from a global perspective have lacked experimental data on phenotypic context, so insights concerning biologically relevant network characteristics of key genes or proteins have remained largely speculative. Here, we present a global network investigation of the genotype/phenotype data set we developed for the recovery of the yeast Saccharomyces cerevisiae from exposure to DNA-damaging agents, enabling explicit study of how protein–protein interaction network characteristics may be associated with phenotypic functional effects. We show that toxicity-modulating proteins have similar topological properties as essential proteins, suggesting that cells initiate highly coordinated responses to damage similar to those needed for vital cellular functions. We also identify toxicologically important protein complexes, pathways, and modules. These results have potential implications for understanding toxicity-modulating processes relevant to a number of human diseases, including cancer and aging.

139 citations


Journal ArticleDOI
TL;DR: A new algorithm for clustering vertices of a protein-protein interaction network using a density function, providing disjoint classes that can be applied to other organisms as well as to other types of interaction graphs, such as genetic interactions.
Abstract: Developing reliable and efficient strategies for inferring the function of as yet uncharacterized proteins from interaction networks is of crucial interest in the current context of high-throughput data generation. In this paper, we develop a new algorithm for clustering vertices of a protein-protein interaction network using a density function, providing disjoint classes. Applied to the yeast interaction network, the classes obtained appear to be biologically significant. The partitions are then used to make functional predictions for uncharacterized yeast proteins, using an annotation procedure that takes into account the binary interactions between proteins inside the classes. We show that this procedure is able to enhance performance with respect to previous approaches. Finally, we propose a new annotation for 37 previously uncharacterized yeast proteins. We believe that our results represent a significant improvement for the inference of cellular functions that can be applied to other organisms as well as to other types of interaction graphs, such as genetic interactions.

Journal ArticleDOI
Haiyuan Yu, Xiaowei Zhu, Dov Greenbaum, John E. Karro, Mark Gerstein
TL;DR: TopNet, an automated web tool designed to address the challenge of comparing the topologies of sub-networks, found that soluble proteins had more interactions than membrane proteins and amongst soluble proteins, those that were highly expressed, had many polar amino acids, and had many alpha helices tended to have the most interaction partners.
Abstract: Biological networks are a topic of great current interest, particularly with the publication of a number of large genome-wide interaction datasets. They are globally characterized by a variety of graph-theoretic statistics, such as the degree distribution, clustering coefficient, characteristic path length and diameter. Moreover, real protein networks are quite complex and can often be divided into many sub-networks through systematic selection of different nodes and edges. For instance, proteins can be sub-divided by expression level, length, amino-acid composition, solubility, secondary structure and function. A challenging research question is to compare the topologies of sub-networks, looking for global differences associated with different types of proteins. TopNet is an automated web tool designed to address this question, calculating and comparing topological characteristics for different sub-networks derived from any given protein network. It provides reasonable solutions to the calculation of network statistics for sub-networks embedded within a larger network and gives simplified views of a sub-network of interest, allowing one to navigate through it. After constructing TopNet, we applied it to the interaction networks and protein classes currently available for yeast. We were able to find a number of potential biological correlations. In particular, we found that soluble proteins had more interactions than membrane proteins. Moreover, amongst soluble proteins, those that were highly expressed, had many polar amino acids, and had many alpha helices, tended to have the most interaction partners. Interestingly, TopNet also turned up some systematic biases in the current yeast interaction network: on average, proteins with a known functional classification had many more interaction partners than those without. This phenomenon may reflect the incompleteness of the experimentally determined yeast interaction network.
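
The graph-theoretic statistics named in this abstract (degree, clustering coefficient, characteristic path length, diameter) can be sketched for a sub-network as follows. This is a minimal illustration, not TopNet's implementation: the toy network, the "soluble" node class, and all function names are assumptions, and the sub-network is taken to be the induced subgraph, which is only one of several reasonable ways to handle a sub-network embedded in a larger one.

```python
from collections import deque


def induced_subnetwork(adj, keep):
    """Restrict an undirected adjacency map to the nodes in `keep`."""
    keep = set(keep)
    return {u: adj[u] & keep for u in adj if u in keep}


def clustering_coefficient(adj, u):
    """Fraction of pairs of u's neighbours that are themselves linked."""
    nbrs = list(adj[u])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if nbrs[j] in adj[nbrs[i]])
    return 2.0 * links / (k * (k - 1))


def distances_from(adj, source):
    """Breadth-first search distances from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def summarize(adj):
    """Average degree, average clustering, characteristic path length, diameter."""
    n = len(adj)
    lengths = [d for u in adj
               for v, d in distances_from(adj, u).items() if v != u]
    return {
        "avg_degree": sum(len(nbrs) for nbrs in adj.values()) / n,
        "avg_clustering": sum(clustering_coefficient(adj, u) for u in adj) / n,
        "char_path_length": sum(lengths) / len(lengths) if lengths else 0.0,
        "diameter": max(lengths, default=0),
    }


# Hypothetical toy network; node names and the "soluble" class are invented.
full = {
    "A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"},
    "D": {"C", "E"}, "E": {"D"},
}
soluble = induced_subnetwork(full, {"A", "B", "C", "D"})
print(summarize(full))
print(summarize(soluble))
```

Path statistics here are averaged over reachable pairs only, so a disconnected sub-network still yields finite values; other conventions are possible.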

Journal ArticleDOI
TL;DR: Large-scale analysis of genetic and physical interaction networks has begun to reveal the global organization of the cell, and the nascent field of chemical genetics promises a host of small-molecule probes to explore these emerging networks.

Journal ArticleDOI
TL;DR: This review discusses how the integration of genetics and technologies such as transcriptomics and proteomics, combined with mathematical modeling, may lead to an understanding of networks that influences the development of atherosclerosis.

Journal ArticleDOI
TL;DR: Recent advances in the methods for deriving the portion of the protein network mediated by these domain families are reviewed and how specific biological outputs could emerge in vivo despite the observed promiscuity in peptide recognition in vitro is discussed.

Journal ArticleDOI
TL;DR: It is proposed that the understanding of the mechanisms that generate the scale-free protein interaction network, and possibly other biological networks, requires consideration of protein function.
Abstract: Protein interactions are central to most biological processes. We investigated the dynamics of emergence of the protein interaction network of Saccharomyces cerevisiae by mapping origins of proteins on an evolutionary tree. We demonstrate that evolutionary periods are characterized by distinct connectivity levels of the emerging proteins. We found that the most-connected group of proteins dates to the eukaryotic radiation, and the more ancient group of pre-eukaryotic proteins is less connected. We show that functional classes have different average connectivity levels and that the time of emergence of these functional classes parallels the observed connectivity variation in evolution. We take these findings as evidence that the evolution of function might be the reason for the differences in connectivity throughout evolutionary time. We propose that the understanding of the mechanisms that generate the scale-free protein interaction network, and possibly other biological networks, requires consideration of protein function.

Journal ArticleDOI
01 Jan 2004
TL;DR: A general computational procedure for identifying the ligand peptides of PRMs by combining protein sequence information and observed physical interactions into a simple probabilistic model and from it derive an interaction-mediated de novo motif-finding framework.
Abstract: Motivation: Many protein-protein interactions are mediated by peptide recognition modules (PRMs), compact domains that bind to short peptides, and play a critical role in a wide array of biological processes. Recent experimental protein interaction data provide us with an opportunity to examine whether we may explain, or even predict their interactions by computational sequence analysis. Such a question was recently posed by the use of random peptide screens to characterize the ligands of one such PRM, the SH3 domain. Results: We describe a general computational procedure for identifying the ligand peptides of PRMs by combining protein sequence information and observed physical interactions into a simple probabilistic model and from it derive an interaction-mediated de novo motif-finding framework. Using a recent all-versus-all yeast two-hybrid SH3 domain interaction network, we demonstrate that our technique can be used to derive independent predictions of interactions mediated by SH3 domains. We show that only when sequence information is combined with such all-versus-all protein interaction datasets are we capable of identifying motifs with sufficient sensitivity and specificity for predicting interactions. The algorithm is general so that it may be applied to other PRM domains (e.g. SH2, WW, PDZ). Availability: The Netmotsa software and source code, as part of a general Gibbs motif sampling library, are available at http://sf.net/projects/netmotsa

Journal ArticleDOI
TL;DR: Compared with previous clustering methods, the clustering method ADJW performs well both in retaining a meaningful image of the protein interaction network and in enriching the image with biological information, and is therefore more suitable for visualization of the network.
Abstract: The refinement and high-throughput of protein interaction detection methods offer us a protein-protein interaction network in yeast. The challenge coming along with the network is to find better ways to make it accessible for biological investigation. Visualization would be helpful for extraction of meaningful biological information from the network. However, traditional ways of visualizing the network are unsuitable because of the large number of proteins. Here, we provide a simple but information-rich approach for visualization which integrates topological and biological information. In our method, the topological information such as quasi-cliques or spoke-like modules of the network is extracted into a clustering tree, where biological information spanning from protein functional annotation to expression profile correlations can be annotated onto the representation of it. We have developed a software tool named PINC based on our approach. Compared with previous clustering methods, our clustering method ADJW performs well both in retaining a meaningful image of the protein interaction network and in enriching the image with biological information, and is therefore more suitable for visualization of the network.

Journal ArticleDOI
TL;DR: A new algorithm for the detection and quantification of hierarchical modularity is described, and it is demonstrated that the yeast protein-protein interaction network does have a hierarchically modular organization.
Abstract: Networks of interactions evolve in many different domains. They tend to have topological characteristics in common, possibly due to common factors in the way the networks grow and develop. It has been recently suggested that one such common characteristic is the presence of a hierarchically modular organization. In this paper, we describe a new algorithm for the detection and quantification of hierarchical modularity, and demonstrate that the yeast protein-protein interaction network does have a hierarchically modular organization. We further show that such organization is evident in artificial networks produced by computational evolution using a gene duplication operator, but not in those developing via preferential attachment of new nodes to highly connected existing nodes.

Journal ArticleDOI
TL;DR: APIN (Agile Protein Interaction Network browser) is a bioinformatic tool, currently in development, for browsing protein interaction databases.
Abstract: In recent years, the biomolecular sciences have been driven forward by overwhelming advances in new biotechnological high-throughput experimental methods and bioinformatic genome-wide computational methods. Such breakthroughs are producing huge amounts of new data that need to be carefully analysed to obtain correct and useful scientific knowledge. One of the fields where this advance has become more intense is the study of the network of ‘protein–protein interactions’, i.e. the ‘interactome’. In this short review we comment on the main data and databases produced in this field in the last 5 years. We also present a rationalized scheme of biological definitions that will be useful for a better understanding and interpretation of ‘what a protein–protein interaction is’ and ‘which types of protein–protein interactions are found in a living cell’. Finally, we comment on some assignments of interactome data to defined types of protein interaction and we present a new bioinformatic tool called APIN (Agile Protein Interaction Network browser), which is in development and will be applied to browsing protein interaction databases.

Book ChapterDOI
TL;DR: A variety of applications are discussed, including screening the network to identify pathways responsible for gene expression changes observed in galactose-induced cells, and identifying groups of interacting proteins that are essential for the cellular response to DNA damage.
Abstract: In the post-genomic era, the first step in any study of protein function is a homology search against the complete genome sequence of the organism of interest. By analogy, if we also wish to elucidate the cadre of signaling and regulatory pathways in the cell, an extremely powerful first step is to construct a complete network of protein-protein and transcriptional interactions and then search through this network to identify critical pathways in a top-down fashion. Like genomic sequence, the molecular interaction network provides a broad foundation for more directed studies to follow. We illustrate this strategy using a large network of 12,232 interactions in yeast. A variety of applications are discussed, including screening the network to identify pathways responsible for gene expression changes observed in galactose-induced cells, and identifying groups of interacting proteins that are essential (by phenotypic assay) for the cellular response to DNA damage.

Journal ArticleDOI
TL;DR: In this paper, the formalization of complex network concepts in terms of discrete mathematics, especially mathematical morphology, allows a series of generalizations and important results ranging from new measurements of the network topology to new network growth models.
Abstract: This work describes how the formalization of complex network concepts in terms of discrete mathematics, especially mathematical morphology, allows a series of generalizations and important results ranging from new measurements of the network topology to new network growth models. First, the concepts of node degree and clustering coefficient are extended in order to characterize not only specific nodes, but any generic subnetwork. Second, distance transforms and rings are used to further extend those concepts in order to obtain a signature, instead of a single scalar measurement, ranging from the single node to whole graph scales. The enhanced discriminative potential of such extended measurements is illustrated with respect to the identification of correspondence between nodes in two complex networks, namely a protein-protein interaction network and a perturbed version of it. The use of other measurements derived from mathematical morphology is also suggested as a means to characterize complex network connectivity in a more comprehensive fashion.

Journal ArticleDOI
TL;DR: A method to assess systematically which of a set of proposed network generation algorithms gives the most accurate description of a given biological network is presented, and it is shown that different duplication-mutation schemes best describe the E. coli genetic network, the S. cerevisiae protein interaction network, and the C. elegans neuronal network.
Abstract: Recent genomic and bioinformatic advances have motivated the development of numerous network models intending to describe graphs of biological, technological, and sociological origin. In most cases the success of a model has been evaluated by how well it reproduces a few key features of the real-world data, such as degree distributions, mean geodesic lengths, and clustering coefficients. Often pairs of models can reproduce these features with indistinguishable fidelity despite being generated by vastly different mechanisms. In such cases, these few target features are insufficient to distinguish which of the different models best describes real world networks of interest; moreover, it is not clear a priori that any of the presently-existing algorithms for network generation offers a predictive description of the networks inspiring them. We present a method to assess systematically which of a set of proposed network generation algorithms gives the most accurate description of a given biological network. To derive discriminative classifiers, we construct a mapping from the set of all graphs to a high-dimensional (in principle infinite-dimensional) "word space". This map defines an input space for classification schemes which allow us to state unambiguously which models are most descriptive of a given network of interest. Our training sets include networks generated from 17 models either drawn from the literature or introduced in this work. We show that different duplication-mutation schemes best describe the E. coli genetic network, the S. cerevisiae protein interaction network, and the C. elegans neuronal network, out of a set of network models including a linear preferential attachment model and a small-world model. Our method is a first step towards systematizing network models and assessing their predictability, and we anticipate its usefulness for a number of communities.
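
To make the idea of a duplication-mutation growth scheme concrete, here is a minimal sketch of one generic variant. It is not claimed to match any of the 17 models in the paper: the parameters `p_keep` and `p_new`, their values, and the function names are assumptions made for illustration. Each step copies a random existing node, keeps each inherited link with some probability, and occasionally links the copy to its parent.

```python
import random


def duplication_mutation(n_nodes, p_keep=0.4, p_new=0.1, seed=0):
    """Grow an undirected graph by node duplication followed by edge mutation.

    p_keep: probability that the copy retains each of the parent's links.
    p_new:  probability of linking the copy to its parent.
    Both parameter names and values are illustrative assumptions.
    """
    rng = random.Random(seed)
    adj = {0: {1}, 1: {0}}  # seed graph: a single interacting pair
    while len(adj) < n_nodes:
        parent = rng.choice(sorted(adj))
        child = len(adj)
        inherited = {v for v in sorted(adj[parent]) if rng.random() < p_keep}
        if rng.random() < p_new:
            inherited.add(parent)
        adj[child] = inherited
        for v in inherited:
            adj[v].add(child)  # keep the adjacency map symmetric
    return adj


def degree_histogram(adj):
    """Map each degree to the number of nodes with that degree."""
    hist = {}
    for nbrs in adj.values():
        hist[len(nbrs)] = hist.get(len(nbrs), 0) + 1
    return dict(sorted(hist.items()))


network = duplication_mutation(200)
print(degree_histogram(network))
```

The degree histogram is one of the "few key features" the abstract says are often insufficient on their own to tell generation mechanisms apart, which is the motivation for the classifier approach it describes.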

Journal Article
TL;DR: This short review comments on the main data and databases produced in this field in the last 5 years and presents a rationalized scheme of biological definitions that will be useful for a better understanding and interpretation of ‘what a protein–protein interaction is’ and ‘which types of protein–protein interactions are found in a living cell’.
Abstract: In recent years, the biomolecular sciences have been driven forward by overwhelming advances in new biotechnological high-throughput experimental methods and bioinformatic genome-wide computational methods. Such breakthroughs are producing huge amounts of new data that need to be carefully analysed to obtain correct and useful scientific knowledge. One of the fields where this advance has become more intense is the study of the network of 'protein-protein interactions', i.e. the 'interactome'. In this short review we comment on the main data and databases produced in this field in the last 5 years. We also present a rationalized scheme of biological definitions that will be useful for a better understanding and interpretation of 'what a protein-protein interaction is' and 'which types of protein-protein interactions are found in a living cell'. Finally, we comment on some assignments of interactome data to defined types of protein interaction and we present a new bioinformatic tool called APIN (Agile Protein Interaction Network browser), which is in development and will be applied to browsing protein interaction databases.

Journal ArticleDOI
01 Oct 2004-Proteins
TL;DR: A unified representation of the protein–protein and complex–complex networks based on an underlying bipartite graph model that is an advance over existing models of the network and allows for weighting of connections between proteins shared in more than one complex.
Abstract: The protein interaction network presents one perspective for understanding cellular processes. Recent experiments employing high-throughput mass spectrometric characterizations have resulted in large data sets of physiologically relevant multiprotein complexes. We present a unified representation of such data sets based on an underlying bipartite graph model that is an advance over existing models of the network. Our unified representation allows for weighting of connections between proteins shared in more than one complex, as well as addressing the higher level organization that occurs when the network is viewed as consisting of protein complexes that share components. This representation also allows for the application of the rigorous MinMaxCut graph clustering algorithm for the determination of relevant protein modules in the networks. Statistically significant annotations of clusters in the protein-protein and complex-complex networks using terms from the Gene Ontology indicate that this method will be useful for posing hypotheses about uncharacterized components of protein complexes or uncharacterized relationships between protein complexes.
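The bipartite idea in this abstract can be illustrated with a minimal sketch. It assumes a toy mapping from complex identifiers to member proteins (the names `C1`, `P1`, and the function `project_bipartite` are hypothetical) and assumes the simplest possible weighting: a protein pair is weighted by the number of complexes containing both proteins, and a complex pair by the number of proteins they share. The paper's actual weighting scheme and the MinMaxCut clustering step are not reproduced here.

```python
from collections import defaultdict
from itertools import combinations


def project_bipartite(complexes):
    """Project a protein/complex bipartite graph onto its two node sets.

    complexes: dict mapping a complex id to the set of its member proteins
    (hypothetical toy input, not data from the paper).
    Returns (protein_weights, complex_weights), both dicts keyed by sorted pairs.
    """
    # Protein-protein network: weight = number of complexes containing both proteins.
    protein_weights = defaultdict(int)
    for members in complexes.values():
        for a, b in combinations(sorted(members), 2):
            protein_weights[(a, b)] += 1

    # Complex-complex network: weight = number of proteins the two complexes share.
    complex_weights = {}
    for c1, c2 in combinations(sorted(complexes), 2):
        shared = len(complexes[c1] & complexes[c2])
        if shared:
            complex_weights[(c1, c2)] = shared

    return dict(protein_weights), complex_weights


# Toy data: two overlapping complexes and one isolated single-protein complex.
toy = {
    "C1": {"P1", "P2", "P3"},
    "C2": {"P2", "P3", "P4"},
    "C3": {"P5"},
}
protein_net, complex_net = project_bipartite(toy)
print(protein_net[("P2", "P3")])  # P2 and P3 co-occur in C1 and C2
print(complex_net)                # only C1 and C2 share members
```

Keeping both projections derived from the same membership table is what lets the two views stay consistent; a graph clustering method could then be run on either weighted network.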

Journal ArticleDOI
TL;DR: A new, fast-layout algorithm and its implementation called WebInterViewer is presented, which can visualize data directly from protein interaction databases and provides several abstraction and comparison operations for analyzing large-scale biological networks effectively.
Abstract: Molecular interaction networks, such as those involving protein-protein and protein-DNA interactions, often consist of thousands of nodes or even more, which severely limits the usefulness of many graph drawing tools because they become too slow for interactive analysis of the networks and because they produce cluttered drawings with many edge crossings. We present a new, fast-layout algorithm and its implementation called WebInterViewer for visualizing large-scale molecular interaction networks. WebInterViewer (i) finds a layout of the connected components of an entire network, (ii) finds a global layout of nodes with respect to pivot nodes within the connected components and (iii) refines the local layout of each connected component by first relocating midnodes with respect to their cutvertices and the direct neighbors of the cutvertices, and then relocating all nodes with respect to their neighbors within distance 2. The advantages of WebInterViewer over classical graph drawing methods include the facts that (i) it is an order of magnitude faster, (ii) it can visualize data directly from protein interaction databases and (iii) it provides several abstraction and comparison operations for analyzing large-scale biological networks effectively. WebInterViewer is accessible at http://interviewer.inha.ac.kr/.

Proceedings Article
22 Aug 2004
TL;DR: The technique of differential association rule mining is applied to the comparison of protein annotations within an interaction network and between different interaction networks and is able to find rules that explain known properties of protein interaction networks as well as rules that show promise for advanced study.
Abstract: Protein-protein interactions are of great interest to biologists. A variety of high-throughput techniques have been devised, each of which leads to a separate definition of an interaction network. The concept of differential association rule mining is introduced to study the annotations of proteins in the context of one or more interaction networks. Differences among items across edges of a network are explicitly targeted. As a second step we identify differences between networks that are separately defined on the same set of nodes. The technique of differential association rule mining is applied to the comparison of protein annotations within an interaction network and between different interaction networks. In both cases we were able to find rules that explain known properties of protein interaction networks as well as rules that show promise for advanced study.

Proceedings ArticleDOI
07 Oct 2004
TL;DR: The experiments of SPIE-DM indicate that the system is very promising for extracting and mining from biomedical literature databases.
Abstract: We present a biomedical literature data mining system SPIE-DM (Scalable and Portable Information Extraction and Data Mining) to extract and mine the protein-protein interaction network from biomedical literature such as MedLine. SPIE-DM consists of two phases: in phase 1, we develop a scalable and portable IE method (SPIE) to extract the protein-protein interactions from the biomedical literature. These extracted protein-protein interactions form a scale-free network graph. In phase 2, we apply a novel clustering method, SFCluster, to mine the protein-protein interaction network. The clusters in the network graph represent some potential protein complexes, which are very important for biologists studying protein functionality. The clustering algorithm considers the characteristics of scale-free network graphs and is based on the local density of the vertex and its neighborhood functions, which can be used to find more meaningful clusters at different density levels. The experiments of SPIE-DM on around 1600 chromatin proteins indicate that our system is very promising for extracting and mining from biomedical literature databases.

Journal ArticleDOI
TL;DR: This new result allows us for example to partly solve the topology of the genetic regulatory network ruling the flowering in Arabidopsis thaliana.
Abstract: This paper deals with the problem of reconstructing the intergenic interaction graph from the raw data of genetic co-expression coming from new bio-array technologies (DNA-arrays, protein-arrays, etc.). These new imaging devices in general only give information about the asymptotic part (fixed configurations of co-expression or limit cycles of such configurations) of the dynamical evolution of the regulatory networks (genetic and/or proteic) underlying the functioning of living systems. Extracting the causal structure and interaction coefficients of a gene interaction network from the observed configurations is a complex problem. But if all the fixed configurations are supposedly observed and if they are factorizable into two or more subsets of values, then the interaction graph possesses as many connected components as the number of factors and the solution is obtained in polynomial time. This new result allows us, for example, to partly solve the topology of the genetic regulatory network ruling flowering in Arabidopsis thaliana.

Journal ArticleDOI
TL;DR: A protein interaction prediction service system is designed and implemented, based on the domain combination based protein-protein interaction prediction technique, which is known to show superior accuracy to other conventional computational protein-protein interaction prediction methods.
Abstract: With the recognition of the importance of computational approaches for protein-protein interaction prediction, many techniques have been developed to computationally predict protein-protein interactions. However, few techniques are actually implemented and announced in service form for general users to readily access and use. In this paper, we design and implement a protein interaction prediction service system based on the domain combination based protein-protein interaction prediction technique, which is known to show superior accuracy to other conventional computational protein-protein interaction prediction methods. In the prediction accuracy test of the method, high sensitivity (77%) and specificity (95%) are achieved for test protein pairs containing common domains with learning sets of proteins in yeast. The stability of the method is also demonstrated through testing over DIP CORE, HMS-PCI, and TAP data. The functions of the system are divided into core, subsidiary, and general service function categories. The core function category includes the functions that can be provided only by using the domain combination based protein-protein interaction prediction method. Interaction prediction for a single protein pair and visualization of interaction probability distributions are the functions in this category. The subsidiary function category includes the functions that can be derived from the core functions. Domain combination pair search with high appearance probability and construction of the protein interaction network are the functions in this category. Lastly, the general service function category includes the functions that can be implemented by collecting and organizing the protein and domain data on the Internet. Performance, openness and flexibility are the major design goals and they are achieved by adopting parallel execution techniques, Web Services standards, and layered architecture respectively. In this paper, several representative user interfaces of the system are also introduced with comprehensive usage guides.