Journal ArticleDOI

ClassAMP: A Prediction Tool for Classification of Antimicrobial Peptides

TL;DR: An algorithm called ClassAMP has been developed to predict the propensity of a protein sequence to have antibacterial, antifungal, or antiviral activity.
Abstract: Antimicrobial peptides (AMPs) are gaining popularity as anti-infective agents. Information on sequence features that contribute to target specificity of AMPs will aid in accelerating drug discovery programs involving them. In this study, an algorithm called ClassAMP using Random Forests (RFs) and Support Vector Machines (SVMs) has been developed to predict the propensity of a protein sequence to have antibacterial, antifungal, or antiviral activity. ClassAMP is available at http://www.bicnirrh.res.in/classamp/.
Citations
Journal ArticleDOI
TL;DR: This study develops a support vector machine (SVM) based computational approach for predicting AMPs, which achieved higher accuracy than several existing approaches when compared on a benchmark dataset.
Abstract: Antimicrobial peptides (AMPs) are important components of the innate immune system that have been found to be effective against disease-causing pathogens. Identification of AMPs through wet-lab experiments is expensive. Therefore, development of an efficient computational tool is essential to identify the best candidate AMPs prior to in vitro experimentation. In this study, we made an attempt to develop a support vector machine (SVM) based computational approach for prediction of AMPs with improved accuracy. Initially, compositional, physico-chemical and structural features of the peptides were generated, which were subsequently used as input to the SVM for prediction of AMPs. The proposed approach achieved higher accuracy than several existing approaches when compared on a benchmark dataset. Based on the proposed approach, an online prediction server, iAMPpred, has also been developed to help the scientific community in predicting AMPs; it is freely accessible at http://cabgrid.res.in:8080/amppred/. The proposed approach is believed to supplement the tools and techniques that have been developed in the past for prediction of AMPs.

313 citations

Journal ArticleDOI
TL;DR: This work reviews the available strategies for their synthesis, bioinformatics tools for the rational design of antimicrobial peptides with enhanced therapeutic indices, hurdles and shortcomings limiting the large-scale production of AMPs, as well as the challenges that the pharmaceutical industry faces on their use as therapeutic agents.
Abstract: Antimicrobial peptides are small molecules with activity against bacteria, yeasts, fungi, viruses, and even tumor cells, which makes these molecules attractive as therapeutic agents. Due to the alarming increase of antimicrobial resistance, interest in alternative antimicrobial agents has led to the exploitation of antimicrobial peptides, both synthetic and from natural sources. Thus, many peptide-based drugs are currently commercially available for the treatment of numerous ailments, such as hepatitis C, myeloma, skin infections, and diabetes. Initial barriers are being increasingly overcome with the development of cost-effective, more stable peptides. Herein, we review the available strategies for their synthesis, bioinformatics tools for the rational design of antimicrobial peptides with enhanced therapeutic indices, hurdles and shortcomings limiting the large-scale production of AMPs, as well as the challenges that the pharmaceutical industry faces on their use as therapeutic agents.

159 citations


Cites background from "ClassAMP: A Prediction Tool for Cla..."

  • ...…tools for the identification of antimicrobial regions and classification of antimicrobial peptide regions are also available at the Center for Genomic Regulation of Barcelona (AMPA (Torrent et al. 2012)), and the Biomedical Informatics Center in Mumbai, respectively (ClassAMP (Joseph et al. 2012))....

Journal ArticleDOI
TL;DR: It was observed that the binary-profile-based model developed in this study performs better than any other model/method on a dataset containing compositionally similar antifungal and non-antifungal peptides.
Abstract: This paper describes in silico models developed using a wide range of peptide features for predicting antifungal peptides. Our analyses indicate that certain types of residue (e.g., C, G, H, K, R, Y) are more abundant in antifungal peptides. The positional residue preference analysis reveals the prominence of particular residues (e.g., R, V, K) at the N-terminus and of certain residues (e.g., C, H) at the C-terminus. In this study, models have been developed for predicting antifungal peptides using a wide range of peptide features (like residue composition, binary profile, terminal residues). The support vector machine based model developed using compositional features of peptides achieved a maximum accuracy of 88.78% on the training dataset and 83.33% on the independent or validation dataset. Our model developed using binary patterns of terminal residues of peptides achieved a maximum accuracy of 84.88% on the training dataset and 84.64% on the validation dataset. We benchmark the models developed in this study and existing methods on a dataset containing compositionally similar antifungal and non-antifungal peptides. It was observed that the binary-profile-based model developed in this study performs better than any other model/method. To assist the scientific community, we developed a mobile app, a standalone version and a user-friendly web server, 'Antifp' (http://webs.iiitd.edu.in/raghava/antifp).
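
The two feature types named in the abstract above, residue composition and a binary profile of terminal residues, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names `composition` and `binary_profile`, the window length of 5, and the example peptide are assumptions introduced here.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(peptide):
    """Return the percentage of each standard residue in the peptide."""
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("peptide must not be empty")
    return [100.0 * peptide.count(aa) / len(peptide) for aa in AMINO_ACIDS]


def binary_profile(peptide, n_terminal=5):
    """One-hot encode the first n_terminal residues (20 values per position)."""
    window = peptide.upper()[:n_terminal]
    if len(window) < n_terminal:
        raise ValueError("peptide is shorter than the requested window")
    profile = []
    for residue in window:
        profile.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return profile


# Hypothetical example peptide, used only to show the output shapes.
example = "RVKCCHGKRY"
comp = composition(example)
prof = binary_profile(example, n_terminal=5)
print(len(comp), len(prof))  # 20 composition values, 5 * 20 binary values
```

Either vector (or their concatenation) could then be passed to a classifier such as an SVM; a C-terminal profile would use the last residues of the peptide instead of the first.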

92 citations


Cites methods from "ClassAMP: A Prediction Tool for Cla..."

  • ...ClassAMP is one of the methods which predicts the given peptide as Antibacterial, Antiviral, Antifungal, etc. with a probability score (Joseph et al., 2012)....

Journal ArticleDOI
TL;DR: A systematic evaluation of ten publicly available AMP prediction methods finds that CAMPR3(RF) provides a statistically significant improvement in performance, as measured by the area under the receiver operating characteristic (ROC) curve, relative to the other five methods.
Abstract: Motivation: Antimicrobial peptides (AMPs) are innate immune molecules that exhibit activities against a range of microbes, including bacteria, fungi, viruses and protozoa. Recent increases in microbial resistance against current drugs have led to a concomitant increase in the need for novel antimicrobial agents. Over the last decade, a number of AMP prediction tools have been designed and made freely available online. These AMP prediction tools show potential to discriminate AMPs from non-AMPs, but the relative quality of the predictions produced by the various tools is difficult to quantify. Results: We compiled two sets of AMP and non-AMP peptides, separated into three categories: antimicrobial, antibacterial and bacteriocins. Using these benchmark data sets, we carried out a systematic evaluation of ten publicly available AMP prediction methods. Among the six general AMP prediction tools (ADAM, CAMPR3(RF), CAMPR3(SVM), MLAMP, DBAASP and MLAMP), we find that CAMPR3(RF) provides a statistically significant improvement in performance, as measured by the area under the receiver operating characteristic (ROC) curve, relative to the other five methods. Surprisingly, for antibacterial prediction, the original AntiBP method significantly outperforms its successor, AntiBP2, on one benchmark dataset. The two bacteriocin prediction tools, BAGEL3 and BACTIBASE, both provide very good performance, and BAGEL3 outperforms its predecessor, BACTIBASE, on the larger of the two benchmarks. Contact: gaberemu@ngha.med.sa or william-noble@uw.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
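
The area under the ROC curve used as the comparison metric above equals the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one, with ties counted as one half. A minimal sketch of that computation follows; the function name `roc_auc` and the labels and scores are made up for illustration and are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for AMP, 0 for non-AMP; scores: higher means more AMP-like.
    A tie between a positive and a negative counts as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Illustrative values only.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(auc)
```

The pairwise loop is quadratic in the number of peptides, which is acceptable for small benchmarks; a rank-based formulation gives the same value faster on large ones.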

84 citations

Journal ArticleDOI
TL;DR: This review addresses the tools and techniques, and also their limitations, for mining AMPs from databases, which could be helpful for developing novel antimicrobial agents and combating resistant bacteria.

81 citations


Cites background or methods from "ClassAMP: A Prediction Tool for Cla..."

  • ...ClassAMP: a prediction tool for classification of antimicrobial peptides....

  • ...Currently, there are three prediction systems available as standalone programs or web servers, and these are AntiBP (Lata et al., 2007, 2010), CAMP (Thomas et al., 2010; Waghu et al., 2014), CS-AMPPred (Porto et al., 2010, 2012a), ClassAMP (Joseph et al., 2012), iAMP-2L (Xiao et al., 2013) and ADAM (Lee et al., 2015) (Table 4)....

  • ...Porto et al. (2010, 2012a) ClassAMP This supports the algorithms SVM and RF and predicts the propensity for the peptide to be antibacterial, antifungal or antiviral....

  • ...Besides, it does not predict dual activity (e.g. antibacterial and antiviral) Joseph et al. (2012) iAMP-2L This supports the algorithm fuzzy K-nearest neighbour (FKNN) and it was trained using a pseudo amino acid composition strategy (Chou, 2001)....

References
Journal Article
TL;DR: Copyright (©) 1999–2012 R Foundation for Statistical Computing; permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and permission notice are preserved on all copies.
Abstract: Copyright (©) 1999–2012 R Foundation for Statistical Computing. Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies. Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one. Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that this permission notice may be stated in a translation approved by the R Core Team.

272,030 citations

Journal ArticleDOI
01 Oct 2001
TL;DR: Internal estimates monitor error, strength, and correlation; these are used to show the response to increasing the number of features used in the splitting, and the ideas are also applicable to regression.
Abstract: Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error for forests converges a.s. to a limit as the number of trees in the forest becomes large. The generalization error of a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation between them. Using a random selection of features to split each node yields error rates that compare favorably to Adaboost (Y. Freund & R. Schapire, Machine Learning: Proceedings of the Thirteenth International conference, aaa, 148–156), but are more robust with respect to noise. Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the splitting. Internal estimates are also used to measure variable importance. These ideas are also applicable to regression.

79,257 citations

Book
Vladimir Vapnik
01 Jan 1995
TL;DR: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?
Abstract: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?

40,147 citations

01 Jan 1998
TL;DR: Presenting a method for determining the necessary and sufficient conditions for consistency of learning process, the author covers function estimates from small data pools, applying these estimations to real-life problems, and much more.
Abstract: A comprehensive look at learning and generalization theory. The statistical theory of learning and generalization concerns the problem of choosing desired functions on the basis of empirical data. Highly applicable to a variety of computer science and robotics fields, this book offers lucid coverage of the theory as a whole. Presenting a method for determining the necessary and sufficient conditions for consistency of learning process, the author covers function estimates from small data pools, applying these estimations to real-life problems, and much more.

26,531 citations

01 Jan 2007
TL;DR: Random forests add an additional layer of randomness to bagging and are robust against overfitting; the randomForest package provides an R interface to the Fortran programs by Breiman and Cutler.
Abstract: Recently there has been a lot of interest in “ensemble learning”: methods that generate many classifiers and aggregate their results. Two well-known methods are boosting (see, e.g., Schapire et al., 1998) and bagging (Breiman, 1996) of classification trees. In boosting, successive trees give extra weight to points incorrectly predicted by earlier predictors. In the end, a weighted vote is taken for prediction. In bagging, successive trees do not depend on earlier trees; each is independently constructed using a bootstrap sample of the data set. In the end, a simple majority vote is taken for prediction. Breiman (2001) proposed random forests, which add an additional layer of randomness to bagging. In addition to constructing each tree using a different bootstrap sample of the data, random forests change how the classification or regression trees are constructed. In standard trees, each node is split using the best split among all variables. In a random forest, each node is split using the best among a subset of predictors randomly chosen at that node. This somewhat counterintuitive strategy turns out to perform very well compared to many other classifiers, including discriminant analysis, support vector machines and neural networks, and is robust against overfitting (Breiman, 2001). In addition, it is very user-friendly in the sense that it has only two parameters (the number of variables in the random subset at each node and the number of trees in the forest), and is usually not very sensitive to their values. The randomForest package provides an R interface to the Fortran programs by Breiman and Cutler (available at http://www.stat.berkeley.edu/users/breiman/). This article provides a brief introduction to the usage and features of the R functions.
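
The procedure described in the abstract above (a bootstrap sample per tree, the best split chosen among a random subset of predictors, and a majority vote) can be sketched in plain Python. This is a simplified illustration and not the randomForest package or Breiman and Cutler's code: each "tree" here is a single-split stump, so the random predictor subset is drawn once per tree rather than at every node, and the names `train_forest`, `predict`, `mtry`, and the toy data are assumptions introduced for the example.

```python
import random
from collections import Counter


def bootstrap(rows, rng):
    """Sample len(rows) rows with replacement."""
    return [rng.choice(rows) for _ in rows]


def best_stump(rows, features):
    """Best (feature, threshold) split by misclassification count,
    searching only the given candidate features."""
    best = None
    for f in features:
        for threshold in sorted({x[f] for x, _ in rows}):
            left = [y for x, y in rows if x[f] <= threshold]
            right = [y for x, y in rows if x[f] > threshold]
            if not left or not right:
                continue
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0]
            errors = sum(y != left_label for y in left)
            errors += sum(y != right_label for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    return best


def train_forest(rows, n_trees=25, mtry=1, seed=0):
    """Grow n_trees stumps, each on its own bootstrap sample and feature subset."""
    rng = random.Random(seed)
    n_features = len(rows[0][0])
    forest = []
    for _ in range(n_trees):
        sample = bootstrap(rows, rng)
        candidates = rng.sample(range(n_features), mtry)  # random predictor subset
        stump = best_stump(sample, candidates)
        if stump is None:  # no usable split: always predict the majority class
            label = Counter(y for _, y in sample).most_common(1)[0][0]
            stump = (0, 0, float("inf"), label, label)
        forest.append(stump[1:])
    return forest


def predict(forest, x):
    """Simple majority vote over all stumps."""
    votes = Counter(left if x[f] <= t else right for f, t, left, right in forest)
    return votes.most_common(1)[0][0]


# Toy two-class data with two informative features (illustrative only).
data = [([0.10 + 0.02 * i, 1.0 + 0.03 * i], "a") for i in range(10)]
data += [([0.80 + 0.02 * i, 3.0 + 0.03 * i], "b") for i in range(10)]
forest = train_forest(data, n_trees=25, mtry=1, seed=0)
print(predict(forest, [0.0, 0.0]), predict(forest, [2.0, 5.0]))
```

The two parameters mentioned in the abstract correspond to `mtry` (the size of the random predictor subset) and `n_trees` (the number of trees in the forest).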

14,830 citations