Book Chapter

Domain-Domain Interactions

TL;DR: This work documents known information on domain-domain interfaces, which play an essential role in specific molecular functions under many conditions, to aid the understanding of molecular function in cellular biology.
Abstract: Multi-domain proteins are increasingly abundant in cellular systems, from lower to higher organisms. Their occurrence is therefore an important and critical phenomenon in the evolution of protein architecture, allowing multiple molecular functions in a single reaction center. Hence, domain-domain interfaces play an essential role in specific molecular functions, with multiplicity under many conditions. We document known information on domain-domain interfaces to aid the understanding of molecular function in cellular biology.
References
Journal Article
TL;DR: With the advancement of genomic technology and genome-wide analysis, more and more organisms are being studied extensively for gene expression on a global scale, and computational methods have been developed to predict protein–protein interactions.
Abstract: With the advancement of genomic technology and genome-wide analysis of organisms, more and more organisms are being studied extensively for gene expression on a global scale. Expression profiling is now being used increasingly to analyze gene functions or to functionally group genes on the basis of their expression profiles (Lockhart and Winzeler 2000). After the completion of the genome sequence of Saccharomyces cerevisiae (Goffeau et al. 1996), a budding yeast, many researchers have undertaken the task of functionally analyzing the yeast genome, comprising ∼6280 proteins (YPD), of which roughly one-third do not have known functions (Mewes et al. 2002). Genes can be clustered on the basis of similar expression profiles. This makes it possible to assign a biological function to genes, depending on the functions of other genes in the cluster (Eisen et al. 1998). However, expression profiling gives an indirect measure of a gene product's biological and cellular function. A more complete study of an organism could possibly be achieved by looking at not only the mRNA levels but also the proteins they encode. It is well known that mRNA levels alone are not sufficient to group genes into different functions, because not all mRNAs end up being translated. Most biological functions within a cell are carried out by proteins and most cellular processes and biochemical events are ultimately achieved by interactions of proteins with one another. Thus, it is important to look at protein expression and their interactions simultaneously. Affinity chromatography, two-hybrid assay, copurification, coimmunoprecipitation, and cross-linking are some of the tools used to verify proteins that are associated physically with one another. Among these techniques, the two-hybrid assay has been used widely to analyze protein–protein interactions in Saccharomyces cerevisiae (Ito et al. 2000, 2001a; Uetz et al. 2000). 
Their protein interaction profiles have made it possible to look at the interaction networks comprising a large number of proteins and to also functionally classify proteins of unknown function. Uetz et al. (2000) used two different approaches in their two-hybrid experiments. The first was a protein array approach with 192 yeast proteins as bait, Gal4–DNA-binding domain fusions, and ∼6000 yeast transformants as prey, Gal4-activation domain fusions. The second, an interaction sequence tag (IST) approach, used high-throughput screens of an activation domain library encoding ∼6000 yeast genes that were pooled. All yeast proteins were cloned into DNA-binding domain vectors. Of the 6144 yeast ORF PCR products, 5345 were successfully cloned. Their first approach revealed 281 interactions, with less stringent selection criteria, using HIS3. The second approach revealed 692 interactions with the more stringent URA3 selection method. Ito et al. (2001a) used a similar method and reported 4549 interactions among 3278 proteins. Some interactions in both data sets were repeated (bait and prey exchanged). They imposed a more rigorous selection criterion including four reporter genes, ADE2, HIS3, URA3, and MEL1, to minimize false positives due to promoter-specific activation. All of these genes have Gal4-responsive promoters.
Computational methods have been developed to predict protein–protein interactions. Those approaches include the Rosetta stone/gene fusion method (Enright et al. 1999; Marcotte et al. 1999a), the phylogenetic profile method (Pellegrini et al. 1999) and the method combining multiple sources of data (Marcotte et al. 1999b). Other computational methods to predict protein–protein interaction have been presented on the basis of different principles, including the interaction domain pair profile method (Rain et al. 2001; Wojcik and Schachter 2001) and the support vector machine learning method (Bock and Gough 2001). Gomez et al. (2001) developed probabilistic models for protein–protein interactions. Sprinzak and Margalit (2001) analyzed over-represented sequence-signature pairs among protein–protein interactions.
In our study, we use the protein–protein interaction (PPI) data sets of Uetz and Ito to predict domain–domain interactions (DDI) in yeast proteins. The protein-domain information is obtained from a protein-domain family database called PFAM (Bateman et al. 2000). Because every protein can be characterized by either a distinct domain or a combination of domains, understanding domain interactions is crucial to understanding the nature and extent of biomolecular interactions. Our study predicts probable domain–domain interactions solely on the basis of the information of protein–protein interactions. Because proteins interact with one another through their specific domains, predicting domain–domain interactions on a global scale from the entire protein interaction data set makes it possible to predict previously unknown protein–protein interactions from their domains. Thus, domain interactions extend the functional significance of proteins and present a global view of the protein–protein interaction network within a cell responsible for carrying out various biological and cellular functions. It is known that the yeast two-hybrid assay is not accurate in determining protein–protein interactions, and the interaction data used in our study certainly contain many false positive and false negative errors (Legrain and Selig 2000; Hazbun and Fields 2001; Mrowka et al. 2001). Taking into account these errors, we apply the Maximum Likelihood approach to estimate the probability of domain–domain interactions. We have also taken into account multiplicity of observations in the two data sets as evidenced by exchanged baits and preys, repeated interactions, and synonymously used gene names.
To assess the accuracy of our method, we predict protein–protein interactions using the inferred domain–domain interactions, and compare them with the observed interactions. The following results are obtained: (1) Our method has shown robustness in analyzing incomplete data sets and dealing with various experimental errors, and we achieve 42.5% specificity and 77.6% sensitivity using the combined Uetz and Ito data. The relatively low specificity may be caused by the fact that the observed protein–protein interactions in the Uetz and Ito combined data represent only a small fraction of all of the real interactions. (2) Comparing our predicted protein–protein interactions with the MIPS protein–protein interactions obtained by methods other than the two-hybrid assays, we show that the prediction rate of our method is about 100 times better than that of a random assignment. (3) We also compare the gene expression profile correlation coefficients of our predictions with those of random protein pairs, and our predictions have a higher mean correlation coefficient. (4) Finally, we check for biological significance of our novel predictions, and find several interesting interactions such as RPS0A interacting with APG17 and TAF40 interacting with SPT3, which are consistent with the functions of the proteins. A complete description of our model and the results is given in the sections below.
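
As a rough illustration of the evaluation step described above, the following Python sketch predicts protein–protein interactions from domain–domain interaction probabilities and then scores the predictions. It rests on assumptions that the abstract does not state: domain pairs are taken to act independently, so two proteins interact with probability 1 minus the product of (1 - p) over their domain pairs. Specificity is taken as the fraction of predicted interactions that were observed, and sensitivity as the fraction of observed interactions that were predicted. The function names, the 0.5 threshold, and the toy proteins and probabilities are all hypothetical.

```python
from itertools import product


def interaction_probability(domains_a, domains_b, ddi_prob):
    """Probability that two proteins interact, assuming independent domain pairs."""
    p_none = 1.0
    for dm, dn in product(domains_a, domains_b):
        # Domain pairs are unordered, so they are keyed by a frozenset.
        p_none *= 1.0 - ddi_prob.get(frozenset((dm, dn)), 0.0)
    return 1.0 - p_none


def evaluate(predicted, observed):
    """Return (specificity, sensitivity) under the definitions assumed above."""
    matched = len(predicted & observed)
    specificity = matched / len(predicted) if predicted else 0.0
    sensitivity = matched / len(observed) if observed else 0.0
    return specificity, sensitivity


# Hypothetical toy data: protein -> domains, and assumed DDI probabilities.
protein_domains = {"P1": ["A"], "P2": ["B"], "P3": ["A", "C"], "P4": ["D"]}
ddi_prob = {frozenset(("A", "B")): 0.9, frozenset(("C", "D")): 0.6}
observed = {frozenset(("P1", "P2")), frozenset(("P3", "P4"))}

# Predict an interaction whenever the probability reaches an assumed threshold.
names = sorted(protein_domains)
predicted = set()
for i, a in enumerate(names):
    for b in names[i + 1:]:
        p = interaction_probability(protein_domains[a], protein_domains[b], ddi_prob)
        if p >= 0.5:
            predicted.add(frozenset((a, b)))

specificity, sensitivity = evaluate(predicted, observed)
print(f"specificity={specificity:.3f} sensitivity={sensitivity:.3f}")
```

In this toy network the pair P2 and P3 is predicted but not observed, which lowers specificity even though every observed interaction is recovered. This mirrors the abstract's remark that incomplete observed data can depress specificity.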

456 citations

Journal Article
TL;DR: The updated DOMINE includes 2285 new domain–domain interactions inferred from experimentally characterized high-resolution three-dimensional structures, and about 3500 novel predictions by five computational approaches published over the last 3 years.
Abstract: DOMINE is a comprehensive collection of known and predicted domain-domain interactions (DDIs) compiled from 15 different sources. The updated DOMINE includes 2285 new DDIs inferred from experimentally characterized high-resolution three-dimensional structures, and about 3500 novel predictions by five computational approaches published over the last 3 years. These additions bring the total number of unique DDIs in the updated version to 26,219 among 5140 unique Pfam domains, a 23% increase compared to 20,513 unique DDIs among 4346 unique domains in the previous version. The updated version now contains 6634 known DDIs, and features a new classification scheme to assign confidence levels to predicted DDIs. DOMINE will serve as a valuable resource to those studying protein and domain interactions. Most importantly, DOMINE will serve as an excellent reference not only to bench scientists testing for new interactions but also to bioinformaticians seeking to predict novel protein-protein interactions based on the DDIs. The contents of DOMINE are available at http://domine.utdallas.edu.

175 citations

Journal Article
TL;DR: This study shows that integration of multiple biological data sets based on the Bayesian approach provides a reliable framework to predict domain interactions and shows that the coverage and accuracy of predicted domain interactions can be significantly increased.
Abstract: The development of high-throughput technologies has produced several large-scale protein interaction data sets for multiple species, and significant efforts have been made to analyze the data sets in order to understand protein activities. Considering that the basic units of protein interactions are domain interactions, it is crucial to understand protein interactions at the level of the domains. The availability of many diverse biological data sets provides an opportunity to discover the underlying domain interactions within protein interactions through an integration of these biological data sets. We combine protein interaction data sets from multiple species, molecular sequences, and gene ontology to construct a set of high-confidence domain-domain interactions. First, we propose a new measure, the expected number of interactions for each pair of domains, to score domain interactions based on protein interaction data in one species and show that its performance is similar to that of the E-value defined by Riley et al. [1]. Our new measure is applied to the protein interaction data sets from yeast, worm, fruitfly and humans. Second, information on pairs of domains that coexist in known proteins and on pairs of domains with the same gene ontology function annotations is incorporated to construct a high-confidence set of domain-domain interactions using a Bayesian approach. Finally, we evaluate the set of domain-domain interactions by comparing predicted domain interactions with those defined in the iPfam database [2, 3], which were derived from protein structures. The accuracy of predicted domain interactions is also confirmed by comparison with experimentally obtained domain interactions from H. pylori [4]. As a result, a total of 2,391 high-confidence domain interactions are obtained and these domain interactions are used to unravel detailed protein and domain interactions in several protein complexes.
Our study shows that integration of multiple biological data sets based on the Bayesian approach provides a reliable framework to predict domain interactions. By integrating multiple data sources, the coverage and accuracy of predicted domain interactions can be significantly increased.
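
To make the idea of Bayesian evidence integration concrete, here is a generic naive-Bayes sketch in Python. It is not the authors' model: it assumes the evidence sources are conditionally independent, so the posterior odds of a domain pair interacting equal the prior odds multiplied by one likelihood ratio per source. The function name, the prior, and the likelihood ratios are hypothetical values chosen only for illustration.

```python
import math


def posterior_probability(prior, likelihood_ratios):
    """Combine a prior with independent likelihood ratios (naive Bayes).

    prior must lie strictly between 0 and 1; each likelihood ratio is
    P(evidence | interacting) / P(evidence | not interacting) and must be positive.
    """
    # Work in log-odds so that many sources can be combined stably.
    log_odds = math.log(prior / (1.0 - prior))
    for lr in likelihood_ratios:
        log_odds += math.log(lr)
    odds = math.exp(log_odds)
    return odds / (1.0 + odds)


# Hypothetical example: a 1% prior and three supporting evidence sources,
# such as a PPI-based score, domain co-occurrence, and shared GO annotation.
p = posterior_probability(0.01, [20.0, 5.0, 3.0])
print(f"posterior = {p:.4f}")
```

The design point is that several individually weak sources can jointly produce a confident call, which is one way to read the abstract's claim that integration raises both coverage and accuracy.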

114 citations

Journal Article
TL;DR: The results indicate that different organisms use the same 'building blocks' for PPIs, suggesting that the functionality of many domain pairs in mediating protein interactions is maintained in evolution.
Abstract: Recently, there has been much interest in relating domain-domain interactions (DDIs) to protein-protein interactions (PPIs) and vice versa, in an attempt to understand the molecular basis of PPIs. Here we map structurally derived DDIs onto the cellular PPI networks of different organisms and demonstrate that there is a catalog of domain pairs that is used to mediate various interactions in the cell. We show that these DDIs occur frequently in protein complexes and that homotypic interactions (of a domain with itself) are abundant. A comparison of the repertoires of DDIs in the networks of Escherichia coli, Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, and Homo sapiens shows that many DDIs are evolutionarily conserved. Our results indicate that different organisms use the same 'building blocks' for PPIs, suggesting that the functionality of many domain pairs in mediating protein interactions is maintained in evolution.
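
Mapping a catalog of domain pairs onto a PPI network can be sketched as follows, assuming each protein is annotated with a list of domains and the catalog is a set of unordered domain pairs. The function names and the toy proteins, domains, and catalog entries are hypothetical placeholders, not data from the study.

```python
from itertools import product


def mediating_ddis(ppi, protein_domains, known_ddis):
    """Return the catalogued domain pairs that could mediate one PPI."""
    a, b = ppi
    hits = set()
    for dm, dn in product(protein_domains.get(a, ()), protein_domains.get(b, ())):
        pair = frozenset((dm, dn))  # unordered; a homotypic pair collapses to one element
        if pair in known_ddis:
            hits.add(pair)
    return hits


def is_homotypic(pair):
    """A homotypic DDI is an interaction of a domain with itself."""
    return len(pair) == 1


# Hypothetical toy data.
protein_domains = {"X": ["D1", "D2"], "Y": ["D2"], "Z": ["D3"]}
known_ddis = {frozenset(("D2",)), frozenset(("D1", "D3"))}
ppis = [("X", "Y"), ("X", "Z"), ("Y", "Z")]

explained = {ppi: mediating_ddis(ppi, protein_domains, known_ddis) for ppi in ppis}
for ppi, pairs in explained.items():
    labels = [("homotypic" if is_homotypic(p) else "heterotypic", sorted(p)) for p in pairs]
    print(ppi, labels)
```

Running the same lookup against networks from several organisms and intersecting the resulting sets of used domain pairs would give a simple view of which pairs are shared across species.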

102 citations

Journal Article
TL;DR: Results indicate that the parsimony principle provides a correct approach for detecting domain-domain contacts in a protein-protein interaction network.
Abstract: We propose a novel approach to predict domain-domain interactions from a protein-protein interaction network. In our method we apply a parsimony-driven explanation of the network, where the domain interactions are inferred using linear programming optimization, and false positives in the protein network are handled by a probabilistic construction. This method outperforms previous approaches by a considerable margin. The results indicate that the parsimony principle provides a correct approach for detecting domain-domain contacts.
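
The parsimony idea can be illustrated on a toy network: find the smallest set of domain pairs such that every observed protein interaction is explained by at least one chosen pair. The sketch below is a brute-force stand-in and not the method in the abstract, which uses linear programming and a probabilistic treatment of false positives. It assumes a noise-free network and is only practical for very small inputs. All names and data are hypothetical.

```python
from itertools import combinations, product


def candidate_pairs(ppi, protein_domains):
    """All unordered domain pairs that could explain one protein interaction."""
    a, b = ppi
    return {frozenset((dm, dn))
            for dm, dn in product(protein_domains[a], protein_domains[b])}


def most_parsimonious_ddis(ppis, protein_domains):
    """Smallest set of domain pairs covering every PPI (exhaustive search)."""
    candidates = [candidate_pairs(ppi, protein_domains) for ppi in ppis]
    universe = sorted(set().union(*candidates), key=sorted)
    # Try subsets in order of increasing size, so the first cover found is minimal.
    for size in range(len(universe) + 1):
        for chosen in combinations(universe, size):
            chosen_set = set(chosen)
            if all(chosen_set & cands for cands in candidates):
                return chosen_set
    return None  # some interaction has no candidate domain pair at all


# Hypothetical toy network.
protein_domains = {"P1": ["A", "B"], "P2": ["C"], "P3": ["A"], "P4": ["C", "D"]}
ppis = [("P1", "P2"), ("P3", "P4"), ("P1", "P4")]

solution = most_parsimonious_ddis(ppis, protein_domains)
print([sorted(pair) for pair in solution])
```

Here the single pair of domains A and C explains all three interactions, so it is preferred over any larger explanation. A linear programming relaxation serves the same objective at a scale where exhaustive search is infeasible.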

92 citations