Author

Arye Shemesh

Bio: Arye Shemesh is an academic researcher at the Weizmann Institute of Science. The author has contributed to research on the topics Active site and Protein structure. The author has an h-index of 1 and has co-authored 1 publication receiving 421 citations.

Papers
Journal Article
TL;DR: This work transformed protein structures into residue interaction graphs (RIGs), in which amino acid residues are the graph nodes and their interactions with each other are the graph edges, and found that active-site, ligand-binding, and evolutionarily conserved residues typically have high closeness values.

463 citations
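
The closeness measure mentioned in the summary above can be illustrated with a minimal sketch. It assumes the standard graph-theoretic definition of closeness (number of other nodes divided by the sum of shortest-path distances to them) and a connected, unweighted graph; the paper's exact normalization is not given here. The five-residue graph `rig` and the function name `closeness` are hypothetical and chosen only for illustration.

```python
from collections import deque


def closeness(graph):
    """Closeness of each node in a connected, unweighted graph.

    Assumed definition: (n - 1) / (sum of shortest-path distances
    from the node to every other node).
    """
    scores = {}
    for source in graph:
        # Breadth-first search gives shortest-path distances from source.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        total = sum(dist.values())
        scores[source] = (len(graph) - 1) / total if total else 0.0
    return scores


# Hypothetical toy residue interaction graph (adjacency lists).
# Residue "C" contacts three others and sits near the middle.
rig = {
    "A": ["C"],
    "B": ["C"],
    "C": ["A", "B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}

scores = closeness(rig)
most_central = max(scores, key=scores.get)
print(most_central, round(scores[most_central], 3))
```

In this toy graph the residue with the most direct and short indirect contacts gets the highest closeness, which is the kind of residue the summary associates with active-site and ligand-binding positions.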


Cited by
Journal Article
TL;DR: This article outlines the history of automated protein function prediction and surveys the latest innovations in three areas: the limits of established annotation methods, the need for standardized, machine-readable functional annotation, and the assessment of function predictors.
Abstract: Overwhelmed with genomic data, biologists are facing the first big post-genomic question: what do all genes do? First, not only is the volume of pure sequence and structure data growing, but its diversity is growing as well, leading to a disproportionate growth in the number of uncharacterized gene products. Consequently, established methods of gene and protein annotation, such as homology-based transfer, are annotating less data and in many cases are amplifying existing erroneous annotation. Second, there is a need for a functional annotation which is standardized and machine readable so that function prediction programs could be incorporated into larger workflows. This is problematic due to the subjective and contextual definition of protein function. Third, there is a need to assess the quality of function predictors. Again, the subjectivity of the term ‘function’ and the various aspects of biological function make this a challenging effort. This article briefly outlines the history of automated protein function prediction and surveys the latest innovations in all three topics.

479 citations

Journal Article
TL;DR: The hubs identified are found to play a role in bringing together different secondary structural elements in the tertiary structure of the proteins, and could be crucial for the folding and stability of the unique three-dimensional structure of proteins.

389 citations

Journal Article
TL;DR: ConCavity is introduced, a small molecule binding site prediction algorithm that integrates evolutionary sequence conservation estimates with structure-based methods for identifying protein surface cavities and finds that the two approaches provide largely complementary information, which can be combined to improve upon either approach alone.
Abstract: Identifying a protein's functional sites is an important step towards characterizing its molecular function. Numerous structure- and sequence-based methods have been developed for this problem. Here we introduce ConCavity, a small molecule binding site prediction algorithm that integrates evolutionary sequence conservation estimates with structure-based methods for identifying protein surface cavities. In large-scale testing on a diverse set of single- and multi-chain protein structures, we show that ConCavity substantially outperforms existing methods for identifying both 3D ligand binding pockets and individual ligand binding residues. As part of our testing, we perform one of the first direct comparisons of conservation-based and structure-based methods. We find that the two approaches provide largely complementary information, which can be combined to improve upon either approach alone. We also demonstrate that ConCavity has state-of-the-art performance in predicting catalytic sites and drug binding pockets. Overall, the algorithms and analysis presented here significantly improve our ability to identify ligand binding sites and further advance our understanding of the relationship between evolutionary sequence conservation and structural and functional attributes of proteins. Data, source code, and prediction visualizations are available on the ConCavity web site (http://compbio.cs.princeton.edu/concavity/).

373 citations

Journal Article
TL;DR: It is proposed that centrally conserved residues, whose removal increases the characteristic path length in protein networks, may relate to the system fragility.
Abstract: Here, we represent protein structures as residue interacting networks, which are assumed to involve a permanent flow of information between amino acids. By removal of nodes from the protein network, we identify fold centrally conserved residues, which are crucial for sustaining the shortest pathways and thus play key roles in long-range interactions. Analysis of seven protein families (myoglobins, G-protein-coupled receptors, the trypsin class of serine proteases, hemoglobins, oligosaccharide phosphorylases, nuclear receptor ligand-binding domains and retroviral proteases) confirms that experimentally many of these residues are important for allosteric communication. The agreement between the centrally conserved residues, which are key in preserving short path lengths, and residues experimentally suggested to mediate signaling further illustrates that topology plays an important role in network communication. Protein folds have evolved under constraints imposed by function. To maintain function, protein structures need to be robust to mutational events. On the other hand, robustness is accompanied by an extreme sensitivity at some crucial sites. Thus, here we propose that centrally conserved residues, whose removal increases the characteristic path length in protein networks, may relate to the system fragility.

315 citations
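
The node-removal idea in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the characteristic path length is taken to be the mean shortest-path length over all connected node pairs in an unweighted graph, and the wheel-shaped toy network, the node names, and the helper functions are all hypothetical.

```python
from collections import deque


def characteristic_path_length(graph):
    """Mean shortest-path length over all connected ordered node pairs."""
    total, pairs = 0, 0
    for source in graph:
        # Breadth-first search from each node.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        total += sum(dist.values())
        pairs += len(dist) - 1  # reachable nodes other than source
    return total / pairs if pairs else 0.0


def remove_node(graph, node):
    """Return a copy of the graph without the node and its edges."""
    return {
        u: [v for v in neighbors if v != node]
        for u, neighbors in graph.items()
        if u != node
    }


def path_length_change(graph):
    """Change in characteristic path length caused by removing each node."""
    base = characteristic_path_length(graph)
    return {
        node: characteristic_path_length(remove_node(graph, node)) - base
        for node in graph
    }


# Hypothetical toy network: a six-node ring with a central node "h"
# that contacts every ring node and so shortens many paths.
ring = ["a", "b", "c", "d", "e", "f"]
network = {"h": list(ring)}
for i, node in enumerate(ring):
    network[node] = [ring[i - 1], ring[(i + 1) % len(ring)], "h"]

delta = path_length_change(network)
most_critical = max(delta, key=delta.get)
print(most_critical, round(delta[most_critical], 3))
```

Removing the central node lengthens the average path because ring nodes lose their shortcut, while removing a ring node changes the average very little. Nodes with a large positive change are the analogue of the centrally conserved residues described in the abstract. Counting only connected pairs is a simplification; a removal that disconnects the network would need separate handling.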

Journal Article
TL;DR: Several automated servers that integrate evidence from multiple sources have been released this year and particular improvements have been seen with methods utilizing the Gene Ontology functional annotation schema.

308 citations