
iACP: a sequence-based tool for identifying anticancer peptides

TLDR
A sequence-based predictor called iACP is reported, developed by the approach of optimizing the g-gap dipeptide components, that remarkably outperformed the existing predictors for the same purpose in both overall accuracy and stability.
Wei Chen 1,4, Hui Ding 2, Pengmian Feng 3, Hao Lin 2,4, Kuo-Chen Chou 4,5

1 Department of Physics, School of Sciences, Center for Genomics and Computational Biology, North China University of Science and Technology, Tangshan, China
2 Key Laboratory for Neuro-Information of Ministry of Education, Center of Bioinformatics, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
3 School of Public Health, North China University of Science and Technology, Tangshan, China
4 Gordon Life Science Institute, Belmont, Massachusetts, United States of America
5 Center of Excellence in Genomic Medicine Research (CEGMR), King Abdulaziz University, Jeddah, Saudi Arabia

Correspondence to:
Wei Chen, e-mail: wchen@gordonlifescience.org, chenweiimu@gmail.com
Hao Lin, e-mail: hlin@uestc.edu.cn
Kuo-Chen Chou, e-mail: kcchou@gordonlifescience.org

Keywords: anticancer peptides, PseAAC, g-gap dipeptide mode, incremental feature selection, iACP webserver

Received: January 06, 2016
Accepted: February 11, 2016
Published: March 01, 2016

Abstract
Cancer remains a major killer worldwide. Traditional methods of cancer treatment are expensive and have some deleterious side effects on normal cells. Fortunately, the discovery of anticancer peptides (ACPs) has paved a new way for cancer treatment. With the explosive growth of peptide sequences generated in the post genomic age, it is highly desired to develop computational methods for rapidly and effectively identifying ACPs, so as to speed up their application in treating cancer. Here we report a sequence-based predictor called iACP developed by the approach of optimizing the g-gap dipeptide components. It was demonstrated by rigorous cross-validations that the new predictor remarkably outperformed the existing predictors for the same purpose in both overall accuracy and stability. For the convenience of most experimental scientists, a publicly accessible web-server for iACP has been established at http://lin.uestc.edu.cn/server/iACP, by which users can easily obtain their desired results.
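
The abstract names g-gap dipeptide components without defining them. As a minimal sketch, assuming the commonly used definition (the frequency of each ordered residue pair separated by g intervening positions, normalized by the number of such pairs in the peptide), the 400 features could be computed as follows. The function name, the handling of non-standard residues, and the example peptide are illustrative assumptions, not details taken from the iACP paper or its web-server.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def g_gap_dipeptide_composition(sequence, g):
    """Return a dict of 400 g-gap dipeptide frequencies (hypothetical helper).

    Assumes the feature for pair (a, b) is the number of positions i with
    sequence[i] == a and sequence[i + g + 1] == b, divided by L - g - 1.
    """
    seq = sequence.upper()
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    total = len(seq) - g - 1  # number of residue pairs separated by g positions
    if g < 0 or total <= 0:
        raise ValueError("gap must be non-negative and shorter than the sequence")
    for i in range(total):
        pair = seq[i] + seq[i + g + 1]
        if pair in counts:  # assumption: pairs with non-standard residues are skipped
            counts[pair] += 1
    return {pair: n / total for pair, n in counts.items()}


# Toy peptide, not a real ACP: with g = 0 this is the ordinary dipeptide composition.
features = g_gap_dipeptide_composition("ACACA", g=0)
```

With g = 0 the features reduce to adjacent dipeptide frequencies; larger g values capture correlations between residues that are farther apart in the sequence. How g and the feature subset are optimized in iACP is not described in the abstract and is not sketched here.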

