Journal Article

SVM prediction of ligand-binding sites in bacterial lipoproteins employing shape and physio-chemical descriptors

31 Oct 2012 - Protein and Peptide Letters (Protein Pept Lett) - Vol. 19, Iss. 11, pp. 1155-1162
TL;DR: An algorithm is presented that identifies and predicts ligand-binding sites in bacterial lipoproteins by combining three types of pocket descriptors with a Support Vector Machine (SVM) classifier.
Abstract: Bacterial lipoproteins play critical roles in various physiological processes, including the maintenance of pathogenicity, and a number of them are being considered as potential candidates for generating novel vaccines. In this work, we put forth an algorithm to identify and predict ligand-binding sites in bacterial lipoproteins. The method uses three types of pocket descriptors, namely fpocket descriptors, 3D Zernike descriptors and shell descriptors, and combines them with the Support Vector Machine (SVM) method for classification. The three types of descriptors represent shape-based properties of the pocket as well as its local physio-chemical features. All three types of descriptors, along with their hybrid combinations, are evaluated with SVM, and WEKA-InfoGain feature selection is applied to improve classification performance. Results obtained in the study show that the classifier successfully differentiates between ligand-binding and non-binding pockets. For the combination of all three types of descriptors, a 10-fold cross-validation accuracy of 86.83% is obtained on the training set, while the selected model achieves a test Matthews Correlation Coefficient (MCC) of 0.534. Individually or in combination with new and existing methods, our model can be a very useful tool for the prediction of potential ligand-binding sites in bacterial lipoproteins.
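For readers unfamiliar with the Matthews Correlation Coefficient reported in the abstract, the following is a minimal sketch of how the metric is computed from binary confusion-matrix counts. The function name and the example counts are hypothetical and are not taken from the paper; the formula itself is the standard definition of MCC.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Return the MCC for binary confusion-matrix counts.

    tp/tn/fp/fn are true positives, true negatives, false positives
    and false negatives. By convention, 0.0 is returned when any
    marginal total is zero and the ratio is undefined.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for illustration only (not the paper's data).
perfect = matthews_corrcoef(tp=50, tn=50, fp=0, fn=0)
chance = matthews_corrcoef(tp=25, tn=25, fp=25, fn=25)
inverted = matthews_corrcoef(tp=0, tn=0, fp=50, fn=50)
```

MCC ranges from -1 (every prediction wrong) through 0 (no better than chance) to +1 (perfect agreement). Because it uses all four cells of the confusion matrix, it is generally more informative than accuracy when binding and non-binding pockets are unevenly represented.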
Citations
Journal Article
TL;DR: This work combines molecular dynamics simulations of HSA and the state-of-the-art machine learning method Support Vector Machine (SVM) to predict glucose-binding pockets in HSA, revealing seven new potential glucose-binding sites in the molecule.
Abstract: Human Serum Albumin (HSA) has been suggested as an alternative biomarker to the existing Hemoglobin-A1c (HbA1c) marker for glycemic monitoring. Developing and using HSA as an alternative biomarker requires the identification of glycation sites or, equivalently, glucose-binding pockets. In this work, we combine molecular dynamics simulations of HSA and the state-of-the-art machine learning method Support Vector Machine (SVM) to predict glucose-binding pockets in HSA. The SVM uses the three-dimensional arrangement of atoms and their chemical properties to predict the glucose-binding ability of a pocket. Feature selection reveals that the arrangement of atoms and their chemical properties within the first 4 Å from the centroid of the pocket play an important role in the binding of glucose. With a 10-fold cross-validation accuracy of 84 percent, our SVM model reveals seven new potential glucose-binding sites in HSA, of which two are exposed only during the dynamics of HSA. The predictions are further corroborated using docking studies. These findings can complement studies directed towards the development of HSA as an alternative biomarker for glycemic monitoring.
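Both abstracts above report 10-fold cross-validation accuracy. As a rough illustration of what that partitioning involves, the sketch below splits sample indices into k folds so that each sample is held out exactly once. The function name is hypothetical, the split is unshuffled for simplicity, and nothing here is taken from either paper's implementation.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) index lists for k-fold cross-validation.

    Fold sizes differ by at most one, and every sample index appears in
    exactly one test fold. In practice the indices are usually shuffled
    (or stratified by class) first; that step is omitted here.
    """
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# Illustration with a hypothetical dataset of 25 pockets.
folds = list(k_fold_indices(25, k=10))
```

The reported cross-validation accuracy is then the accuracy averaged (or pooled) over the k held-out folds, with the classifier retrained on the remaining folds each time.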

9 citations


Cites methods from "SVM prediction of ligand-binding si..."

  • ...The van der Waals and short-range electrostatic interactions were estimated within a 9 Å cut-off and the long-range electrostatic interactions were assessed using the Particle Mesh Ewald (PME) method....


Proceedings Article
01 Dec 2019
TL;DR: A critical review on the recent development in machine learning based protein secondary structure prediction methods is presented and it is found that several further improvements are possible with the emergence of deep learning techniques.
Abstract: Protein secondary structure prediction plays a fundamental role in bioinformatics. Extracting valuable information from big biological data that can give insight into the 3-dimensional protein structure, and from there into its biological function, is quite challenging. In the past decade, many machine learning approaches have been applied in bioinformatics to extract knowledge from protein data. In this paper, a critical review of recent developments in machine learning based protein secondary structure prediction methods is presented. A next-generation method (deep learning) is also introduced to provide interested researchers with first-hand information on the future trend in this field. Although many approaches have yielded appreciable prediction performance, machine learning approaches are far from fulfilling their potential in biological research because of the difficulty of interpreting, from a biological perspective, how particular model features correlate with input features to yield the desired output. This study therefore finds that several further improvements are possible with the emergence of deep learning techniques.