Open Access Journal Article

Protein–ligand binding site recognition using complementary binding-specific substructure comparison and sequence profile alignment

Jianyi Yang, +2 more
15 Oct 2013, Vol. 29, Iss. 20, pp 2588-2595
TLDR
Two new methods for complementary binding site prediction are developed, one based on binding-specific substructure comparison (TM-SITE) and another on sequence profile alignment (S-SITE). Together they demonstrate a new robust approach to protein–ligand binding site recognition that is ready for genome-wide structure-based function annotation.
Abstract
Motivation: Identification of protein–ligand binding sites is critical to protein function annotation and drug discovery. However, no single method generates optimal binding site predictions for all protein types. Combining complementary predictions is probably the most reliable solution to the problem.

Results: We develop two new methods, one based on binding-specific substructure comparison (TM-SITE) and another on sequence profile alignment (S-SITE), for complementary binding site predictions. The methods are tested on a set of 500 non-redundant proteins harboring 814 natural, drug-like and metal ion molecules. Starting from low-resolution protein structure predictions, the methods successfully recognize >51% of binding residues, with an average Matthews correlation coefficient (MCC) significantly higher (P-value < 10^-9 in Student's t-test) than other state-of-the-art methods, including COFACTOR, FINDSITE and ConCavity. When TM-SITE and S-SITE are combined with other structure-based programs, a consensus approach (COACH) can increase MCC by 15% over the best individual predictions. COACH was examined in the recent community-wide CAMEO experiment and consistently ranked as the best method in the last 22 individual datasets, with an Area Under the Curve score 22.5% higher than the second-best method. These data demonstrate a new robust approach to protein–ligand binding site recognition, which is ready for genome-wide structure-based function annotations.
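Because the abstract reports performance as a per-residue Matthews correlation coefficient, a minimal sketch of that metric may help readers interpret the numbers. The function name `mcc`, the set-based representation of binding residues and the toy residue indices are illustrative assumptions. They are not taken from the paper, and the zero-denominator convention is one common choice rather than the authors' stated procedure.

```python
import math

def mcc(predicted, actual, n_residues):
    """Matthews correlation coefficient for per-residue binding labels.

    predicted, actual: sets of residue indices labelled as binding
    n_residues: total number of residues in the protein
    """
    tp = len(predicted & actual)      # binding residues correctly predicted
    fp = len(predicted - actual)      # predicted binding, actually non-binding
    fn = len(actual - predicted)      # binding residues that were missed
    tn = n_residues - tp - fp - fn    # non-binding residues correctly left out
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: report 0.0 when the denominator vanishes.
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical 100-residue protein with five true binding residues.
predicted = {10, 11, 12, 45, 46}
actual = {10, 11, 12, 13, 46}
print(round(mcc(predicted, actual, 100), 3))  # prints 0.789
```

MCC suits this task because binding residues are a small minority of each chain, so a predictor that labels nothing as binding scores high accuracy but an MCC of zero.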


Citations
Journal Article

The I-TASSER Suite: protein structure and function prediction

TL;DR: A stand-alone I-TASSER Suite that can be used for off-line protein structure and function prediction and three complementary algorithms to enhance function inferences are developed, the consensus of which is derived by COACH using support vector machines.
Journal Article

I-TASSER server: new development for protein structure and function predictions

TL;DR: The focus is on the introduction of new methods for atomic-level structure refinement, local structure quality estimation and biological function annotations, which are designed to address the requirements from the user community and to increase the accuracy of modeling predictions.
Journal Article

Optimal classifier for imbalanced data using Matthews Correlation Coefficient metric.

TL;DR: An optimal Bayes classifier for the MCC metric is derived using an approach based on the Fréchet derivative; the proposed MCC-classifier performs close to SVM-imba while being simpler and more efficient.
Journal Article

Methane metabolism in the archaeal phylum Bathyarchaeota revealed by genome-centric metagenomics

TL;DR: These findings indicate that methane metabolism arose before the last common ancestor of the Euryarchaeota and Bathyarchaeota, and suggest that unrecognized archaeal lineages may also contribute to global methane cycling.
Journal Article

Protein Structure and Function Prediction Using I‐TASSER

TL;DR: This unit describes how to use the I‐TASSER protocol to generate structure and function prediction and how to interpret the prediction results, as well as alternative approaches for further improving the I‐TASSER modeling quality for distant‐homologous and multi‐domain protein targets.
References
Journal Article

Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.

TL;DR: A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original.
Journal Article

A general method applicable to the search for similarities in the amino acid sequence of two proteins

TL;DR: A computer adaptable method for finding similarities in the amino acid sequences of two proteins has been developed and it is possible to determine whether significant homology exists between the proteins to trace their possible evolutionary development.
Journal Article

Amino acid substitution matrices from protein blocks

TL;DR: This work has derived substitution matrices from about 2000 blocks of aligned sequence segments characterizing more than 500 groups of related proteins, leading to marked improvements in alignments and in searches using queries from each of the groups.
Journal Article

I-TASSER: a unified platform for automated protein structure and function prediction

TL;DR: The iterative threading assembly refinement (I-TASSER) server is an integrated platform for automated protein structure and function prediction based on the sequence-to-structure-to-function paradigm.
Journal Article

Protein secondary structure prediction based on position-specific scoring matrices

TL;DR: A two-stage neural network has been used to predict protein secondary structure based on the position-specific scoring matrices generated by PSI-BLAST and achieved an average Q3 score of between 76.5% and 78.3%, depending on the precise definition of observed secondary structure used, which is the highest published score for any method to date.