Journal ArticleDOI

Network analysis of protein structures identifies functional residues.

03 Dec 2004-Journal of Molecular Biology (J Mol Biol)-Vol. 344, Iss: 4, pp 1135-1146
TL;DR: This work transformed protein structures into residue interaction graphs (RIGs), where amino acid residues are graph nodes and their interactions with each other are the graph edges, and found that active site, ligand-binding and evolutionarily conserved residues typically have high closeness values.
About: This article is published in Journal of Molecular Biology. The article was published on 2004-12-03. It has received 463 citations to date. The article focuses on the topics: Protein structure & Active site.
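
The closeness measure mentioned in the TL;DR can be illustrated with a minimal sketch. It assumes the standard definition, (n - 1) divided by the sum of shortest-path distances from a node to all others, on an unweighted, connected graph. The residue labels and contacts in `toy_rig` are invented for illustration; the contact criteria and any normalization used in the article are not given on this page.

```python
from collections import deque


def closeness(adjacency):
    """Closeness of each node: (n - 1) / sum of shortest-path distances.

    Assumes an undirected, connected graph given as {node: set_of_neighbors}.
    """
    n = len(adjacency)
    scores = {}
    for source in adjacency:
        # Breadth-first search yields shortest path lengths in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        scores[source] = (n - 1) / sum(dist.values())
    return scores


# Hypothetical toy graph: nodes stand for residues, edges for residue contacts.
toy_rig = {
    "A1": {"H2"},
    "H2": {"A1", "D3", "S4"},
    "D3": {"H2", "S4"},
    "S4": {"H2", "D3", "G5"},
    "G5": {"S4"},
}
scores = closeness(toy_rig)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

In this toy graph the two well-connected interior nodes rank highest, which is the kind of signal the article associates with functional residues.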
Citations
Journal ArticleDOI
TL;DR: This article outlines the history of automated protein function prediction and surveys the latest innovations in three areas: function prediction methods, standardized machine-readable functional annotation, and the assessment of function predictors.
Abstract: Overwhelmed with genomic data, biologists are facing the first big post-genomic question: what do all genes do? First, not only is the volume of pure sequence and structure data growing, but its diversity is growing as well, leading to a disproportionate growth in the number of uncharacterized gene products. Consequently, established methods of gene and protein annotation, such as homology-based transfer, are annotating less data and in many cases are amplifying existing erroneous annotation. Second, there is a need for a functional annotation which is standardized and machine readable so that function prediction programs could be incorporated into larger workflows. This is problematic due to the subjective and contextual definition of protein function. Third, there is a need to assess the quality of function predictors. Again, the subjectivity of the term ‘function’ and the various aspects of biological function make this a challenging effort. This article briefly outlines the history of automated protein function prediction and surveys the latest innovations in all three topics.

479 citations


Cites methods from "Network analysis of protein structu..."

  • ...Several more such databases and associated search algorithms exist, including the Catalytic Site Atlas [78**], PDBFun [79], PDBSite and PDBSiteScan [80, 81], SuMo [82, 83], pvSOAR [84], SARIG [85], FEATURE [86, 87], RIGOR [88] and PatchFinder [89] (Table 2)....

    [...]

Journal ArticleDOI
TL;DR: The hubs identified are found to play a role in bringing together different secondary structural elements in the tertiary structure of the proteins, and could be crucial for the folding and stability of the unique three-dimensional structure of proteins.

389 citations

Journal ArticleDOI
TL;DR: ConCavity is introduced, a small molecule binding site prediction algorithm that integrates evolutionary sequence conservation estimates with structure-based methods for identifying protein surface cavities and finds that the two approaches provide largely complementary information, which can be combined to improve upon either approach alone.
Abstract: Identifying a protein's functional sites is an important step towards characterizing its molecular function. Numerous structure- and sequence-based methods have been developed for this problem. Here we introduce ConCavity, a small molecule binding site prediction algorithm that integrates evolutionary sequence conservation estimates with structure-based methods for identifying protein surface cavities. In large-scale testing on a diverse set of single- and multi-chain protein structures, we show that ConCavity substantially outperforms existing methods for identifying both 3D ligand binding pockets and individual ligand binding residues. As part of our testing, we perform one of the first direct comparisons of conservation-based and structure-based methods. We find that the two approaches provide largely complementary information, which can be combined to improve upon either approach alone. We also demonstrate that ConCavity has state-of-the-art performance in predicting catalytic sites and drug binding pockets. Overall, the algorithms and analysis presented here significantly improve our ability to identify ligand binding sites and further advance our understanding of the relationship between evolutionary sequence conservation and structural and functional attributes of proteins. Data, source code, and prediction visualizations are available on the ConCavity web site (http://compbio.cs.princeton.edu/concavity/).

373 citations


Cites background from "Network analysis of protein structu..."

  • ..., Theoretical Microscopic Titration Curves (THEMATICS) [35], binding site similarity [36], phage display libraries [37], and residue interaction graphs [38])....

    [...]

Journal ArticleDOI
TL;DR: It is proposed that centrally conserved residues, whose removal increases the characteristic path length in protein networks, may relate to the system fragility.
Abstract: Here, we represent protein structures as residue interacting networks, which are assumed to involve a permanent flow of information between amino acids. By removal of nodes from the protein network, we identify fold centrally conserved residues, which are crucial for sustaining the shortest pathways and thus play key roles in long-range interactions. Analysis of seven protein families (myoglobins, G-protein-coupled receptors, the trypsin class of serine proteases, hemoglobins, oligosaccharide phosphorylases, nuclear receptor ligand-binding domains and retroviral proteases) confirms that experimentally many of these residues are important for allosteric communication. The agreement between the centrally conserved residues, which are key in preserving short path lengths, and residues experimentally suggested to mediate signaling further illustrates that topology plays an important role in network communication. Protein folds have evolved under constraints imposed by function. To maintain function, protein structures need to be robust to mutational events. On the other hand, robustness is accompanied by an extreme sensitivity at some crucial sites. Thus, here we propose that centrally conserved residues, whose removal increases the characteristic path length in protein networks, may relate to the system fragility.
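
The node-removal idea in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the characteristic path length is taken as the mean shortest-path length over all connected node pairs, and the wheel-shaped graph `wheel` is a hypothetical stand-in for a residue network.

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Shortest-path lengths from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def characteristic_path_length(adjacency):
    """Mean shortest-path length over all connected (ordered) node pairs."""
    total, pairs = 0, 0
    for u in adjacency:
        for v, d in bfs_distances(adjacency, u).items():
            if v != u:
                total += d
                pairs += 1
    return total / pairs if pairs else float("inf")


def remove_node(adjacency, node):
    """Copy of the graph without the given node and its edges."""
    return {u: nbrs - {node} for u, nbrs in adjacency.items() if u != node}


def path_length_change(adjacency):
    """Change in characteristic path length caused by deleting each node in turn."""
    base = characteristic_path_length(adjacency)
    return {
        node: characteristic_path_length(remove_node(adjacency, node)) - base
        for node in adjacency
    }


# Hypothetical example: a six-node ring whose nodes all touch a central hub.
ring = 6
wheel = {i: {(i - 1) % ring, (i + 1) % ring, "hub"} for i in range(ring)}
wheel["hub"] = set(range(ring))
delta = path_length_change(wheel)
most_critical = max(delta, key=delta.get)
print(most_critical, round(delta[most_critical], 3))
```

Note that averaging only over connected pairs can understate the damage when a removal disconnects the graph; a real analysis would need to handle that case explicitly.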

315 citations


Cites background from "Network analysis of protein structu..."

  • ...…proven to be useful in a number of studies, such as protein folding (Vendruscolo et al, 2002), residue contribution to the protein–protein binding free energy in given complexes (del Sol and O’Meara, 2004) and prediction of functionally important residues in enzyme families (Amitai et al, 2004)....

    [...]

  • ...Based on a large set of enzymes, Amitai et al (2004) have shown that active site residues tend to be highly central in the structure, suggesting that these positions are crucial for the transmission of information between the residues in the protein....

    [...]

Journal ArticleDOI
TL;DR: Several automated servers that integrate evidence from multiple sources have been released this year and particular improvements have been seen with methods utilizing the Gene Ontology functional annotation schema.

308 citations


Cites background from "Network analysis of protein structu..."

  • ...[35] aims to identify active site residues through network analysis of protein structures....

    [...]

References
Journal ArticleDOI
04 Jun 1998-Nature
TL;DR: Simple models of networks that can be tuned through this middle ground: regular networks ‘rewired’ to introduce increasing amounts of disorder are explored, finding that these systems can be highly clustered, like regular lattices, yet have small characteristic path lengths, like random graphs.
Abstract: Networks of coupled dynamical systems have been used to model biological oscillators, Josephson junction arrays, excitable media, neural networks, spatial games, genetic control networks and many other self-organizing systems. Ordinarily, the connection topology is assumed to be either completely regular or completely random. But many biological, technological and social networks lie somewhere between these two extremes. Here we explore simple models of networks that can be tuned through this middle ground: regular networks 'rewired' to introduce increasing amounts of disorder. We find that these systems can be highly clustered, like regular lattices, yet have small characteristic path lengths, like random graphs. We call them 'small-world' networks, by analogy with the small-world phenomenon (popularly known as six degrees of separation). The neural network of the worm Caenorhabditis elegans, the power grid of the western United States, and the collaboration graph of film actors are shown to be small-world networks. Models of dynamical systems with small-world coupling display enhanced signal-propagation speed, computational power, and synchronizability. In particular, infectious diseases spread more easily in small-world networks than in regular lattices.
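
The rewiring construction described in this abstract can be sketched roughly as below. The function names, the graph size (20 nodes, 4 neighbors each) and the rewiring probability 0.1 are illustrative assumptions; the sketch rewires the far end of each lattice edge with probability `p`, avoiding self-loops and duplicate edges, and then reports the average clustering coefficient and the characteristic path length.

```python
import random
from collections import deque


def ring_lattice(n, k):
    """Regular ring in which each node links to its k nearest neighbors (k even)."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


def rewire(adj, k, p, rng):
    """Rewire the far end of each lattice edge with probability p."""
    n = len(adj)
    new = {i: set(nbrs) for i, nbrs in adj.items()}
    for step in range(1, k // 2 + 1):
        for i in range(n):
            j = (i + step) % n
            if j in new[i] and rng.random() < p:
                # Exclude self-loops and edges that already exist.
                candidates = [c for c in range(n) if c != i and c not in new[i]]
                if candidates:
                    target = rng.choice(candidates)
                    new[i].discard(j)
                    new[j].discard(i)
                    new[i].add(target)
                    new[target].add(i)
    return new


def average_clustering(adj):
    """Mean over nodes of the fraction of neighbor pairs that are linked."""
    total = 0.0
    for nbrs in adj.values():
        d = len(nbrs)
        if d < 2:
            continue  # nodes with fewer than two neighbors contribute zero
        links = sum(1 for u in nbrs for v in nbrs if u < v and v in adj[u])
        total += links / (d * (d - 1) / 2)
    return total / len(adj)


def characteristic_path_length(adj):
    """Mean shortest-path length over all connected (ordered) node pairs."""
    total, pairs = 0, 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


lattice = ring_lattice(20, 4)
small_world = rewire(lattice, 4, 0.1, random.Random(0))
for name, g in (("lattice", lattice), ("rewired", small_world)):
    print(name, round(average_clustering(g), 3), round(characteristic_path_length(g), 3))
```

Sweeping `p` from 0 to 1 on a larger graph is the natural way to look for the regime the abstract describes, where clustering stays high while path lengths drop.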

39,297 citations

Journal ArticleDOI
TL;DR: This paper describes the goals of the PDB, the systems in place for data deposition and access, how to obtain further information, and plans for the future development of the resource.
Abstract: The Protein Data Bank (PDB; http://www.rcsb.org/pdb/ ) is the single worldwide archive of structural data of biological macromolecules. This paper describes the goals of the PDB, the systems in place for data deposition and access, how to obtain further information, and near-term plans for the future development of the resource.

34,239 citations

Journal ArticleDOI
15 Oct 1999-Science
TL;DR: A model based on these two ingredients reproduces the observed stationary scale-free distributions, which indicates that the development of large networks is governed by robust self-organizing phenomena that go beyond the particulars of the individual systems.
Abstract: Systems as diverse as genetic networks or the World Wide Web are best described as networks with complex topology. A common property of many large networks is that the vertex connectivities follow a scale-free power-law distribution. This feature was found to be a consequence of two generic mechanisms: (i) networks expand continuously by the addition of new vertices, and (ii) new vertices attach preferentially to sites that are already well connected. A model based on these two ingredients reproduces the observed stationary scale-free distributions, which indicates that the development of large networks is governed by robust self-organizing phenomena that go beyond the particulars of the individual systems.
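
The two mechanisms named in this abstract, continuous growth and preferential attachment, can be sketched in a few lines. The seed graph (a small complete graph), the parameter names `n` and `m`, and the rejection step used to keep targets distinct are assumptions made for this illustration; the rejection step makes the sampling only approximately proportional to degree.

```python
import random


def preferential_attachment(n, m, seed=0):
    """Grow a graph to n nodes; each new node links to m distinct existing nodes."""
    rng = random.Random(seed)
    # Assumed seed: a complete graph on m + 1 nodes.
    adj = {i: set(range(m + 1)) - {i} for i in range(m + 1)}
    # Every node appears here once per incident edge, so a uniform draw
    # from this list selects a node with probability proportional to its degree.
    endpoints = [i for i in adj for _ in adj[i]]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        adj[new] = set()
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            endpoints.extend([new, t])
    return adj


graph = preferential_attachment(200, 2, seed=0)
degrees = sorted((len(nbrs) for nbrs in graph.values()), reverse=True)
print("largest degrees:", degrees[:5])
print("smallest degree:", degrees[-1])
```

Plotting a histogram of `degrees` on log-log axes for a much larger `n` is the usual way to check for the heavy-tailed distribution the abstract refers to.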

33,771 citations


"Network analysis of protein structu..." refers to background in this paper

  • ...Frequently these networks include a small number of central nodes that are hubs through which many nodes can indirectly connect (23). We sought to examine whether central nodes of protein residue interaction networks correspond to functional residues....

    [...]

Journal ArticleDOI
TL;DR: In this paper, a simple model based on network growth and preferential attachment is proposed, which reproduces the power-law degree distribution of real networks and captures the evolution of networks, not just their static topology.
Abstract: The emergence of order in natural systems is a constant source of inspiration for both physical and biological sciences. While the spatial order characterizing for example the crystals has been the basis of many advances in contemporary physics, most complex systems in nature do not offer such high degree of order. Many of these systems form complex networks whose nodes are the elements of the system and edges represent the interactions between them. Traditionally complex networks have been described by the random graph theory founded in 1959 by Paul Erdős and Alfréd Rényi. One of the defining features of random graphs is that they are statistically homogeneous, and their degree distribution (characterizing the spread in the number of edges starting from a node) is a Poisson distribution. In contrast, recent empirical studies, including the work of our group, indicate that the topology of real networks is much richer than that of random graphs. In particular, the degree distribution of real networks is a power-law, indicating a heterogeneous topology in which the majority of the nodes have a small degree, but there is a significant fraction of highly connected nodes that play an important role in the connectivity of the network. The scale-free topology of real networks has very important consequences on their functioning. For example, we have discovered that scale-free networks are extremely resilient to the random disruption of their nodes. On the other hand, the selective removal of the nodes with highest degree induces a rapid breakdown of the network to isolated subparts that cannot communicate with each other. The non-trivial scaling of the degree distribution of real networks is also an indication of their assembly and evolution. Indeed, our modeling studies have shown us that there are general principles governing the evolution of networks.
Most networks start from a small seed and grow by the addition of new nodes which attach to the nodes already in the system. This process obeys preferential attachment: the new nodes are more likely to connect to nodes with already high degree. We have proposed a simple model based on these two principles which was able to reproduce the power-law degree distribution of real networks. Perhaps even more importantly, this model paved the way to a new paradigm of network modeling, trying to capture the evolution of networks, not just their static topology.

18,415 citations

Journal ArticleDOI
TL;DR: In this article, three distinct intuitive notions of centrality are uncovered and existing measures are refined to embody these conceptions, and the implications of these measures for the experimental study of small groups are examined.
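
The TL;DR does not name the three notions of centrality; the sketch below assumes they are the commonly cited degree, closeness and betweenness, computed here for an unweighted, connected, undirected graph. Betweenness uses Brandes' accumulation scheme, which is a later algorithm swapped in for convenience rather than anything taken from the article, and the star graph is a hypothetical example.

```python
from collections import deque


def degree(adj):
    """Number of neighbors of each node."""
    return {v: len(nbrs) for v, nbrs in adj.items()}


def closeness(adj):
    """(n - 1) / sum of shortest-path distances; assumes a connected graph."""
    out = {}
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        out[s] = (len(adj) - 1) / sum(dist.values())
    return out


def betweenness(adj):
    """Unnormalized shortest-path betweenness (Brandes' algorithm)."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        order = []                       # nodes in order of non-decreasing distance
        preds = {v: [] for v in adj}     # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}      # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted once from each endpoint.
    return {v: b / 2 for v, b in score.items()}


# Hypothetical star graph: one center joined to four leaves.
star = {"c": {"a", "b", "d", "e"}, "a": {"c"}, "b": {"c"}, "d": {"c"}, "e": {"c"}}
deg, clo, btw = degree(star), closeness(star), betweenness(star)
print(deg["c"], clo["c"], btw["c"])
```

On a star all three measures agree that the center is the most central node; on less symmetric graphs they can rank nodes differently, which is why the choice of measure matters.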

14,757 citations