Open Access, Journal Article

Proposing a highly accurate protein structural class predictor using segmentation-based features.

TLDR
By proposing segmented distribution and segmented auto covariance feature extraction methods to capture local and global discriminatory information from evolutionary profiles and predicted secondary structure of the proteins, this study is able to enhance the protein structural class prediction performance significantly.
Abstract
Prediction of the structural classes of proteins can provide important information about their functionalities as well as their major tertiary structures. It is also considered an important step toward solving the protein structure prediction problem. Despite the efforts made so far, finding a fast and accurate computational approach to protein structural class prediction remains a challenging problem in bioinformatics and computational biology. In this study, we propose segmented distribution and segmented auto covariance feature extraction methods to capture local and global discriminatory information from the evolutionary profiles and predicted secondary structure of proteins. By applying SVM to the extracted features, we raise protein structural class prediction accuracy, for the first time, to over 90% and 85% on two popular low-homology benchmarks that have been widely used in the literature. We report prediction accuracies of 92.2% and 86.3% for the 25PDB and 1189 benchmarks, which are respectively up to 7.9% and 2.8% better than previously reported results for these two benchmarks. Together, the two proposed feature extraction methods enable a significant improvement in protein structural class prediction performance.
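The abstract names segmented auto covariance but does not define it, so the following is only a rough sketch of one plausible reading: split a per-residue profile (for example, a PSSM with one row per residue) into contiguous segments and compute a per-column auto covariance at several lags within each segment. The function names, the equal-length segmentation, and the normalization by the number of residue pairs are all assumptions made for illustration, not the authors' published definition.

```python
from typing import List

# Rows are residues; columns are per-residue scores (e.g. 20 PSSM columns).
Matrix = List[List[float]]


def auto_covariance(segment: Matrix, lag: int) -> List[float]:
    """Per-column auto covariance of one segment at a given lag (assumed form)."""
    n = len(segment)
    if n <= lag:
        raise ValueError("segment must be longer than the lag")
    n_cols = len(segment[0])
    # Column means are taken over the segment, not the whole protein.
    means = [sum(row[j] for row in segment) / n for j in range(n_cols)]
    return [
        sum(
            (segment[i][j] - means[j]) * (segment[i + lag][j] - means[j])
            for i in range(n - lag)
        ) / (n - lag)
        for j in range(n_cols)
    ]


def segmented_auto_covariance(
    profile: Matrix, n_segments: int, max_lag: int
) -> List[float]:
    """Concatenate auto covariance features from contiguous segments.

    Hypothetical helper: segments are equal-length (up to integer rounding),
    and lags 1..max_lag are used inside each segment.
    """
    n = len(profile)
    bounds = [(s * n) // n_segments for s in range(n_segments + 1)]
    features: List[float] = []
    for s in range(n_segments):
        segment = profile[bounds[s]:bounds[s + 1]]
        for lag in range(1, max_lag + 1):
            features.extend(auto_covariance(segment, lag))
    return features


# Toy profile: 8 residues, 2 score columns (made-up numbers, not real data).
toy_profile = [[1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [4.0, 0.0],
               [4.0, 1.0], [3.0, 1.0], [2.0, 1.0], [1.0, 1.0]]
toy_features = segmented_auto_covariance(toy_profile, n_segments=2, max_lag=2)
```

Under these assumptions the feature vector has `n_segments * max_lag * n_columns` entries, so it stays fixed-length regardless of protein length, which is what a classifier such as an SVM requires. Computing the statistics per segment is what would let the features reflect local as well as global sequence properties.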


Citations
Journal Article

Systems biology study of transcriptional and post-transcriptional co-regulatory network sheds light on key regulators involved in important biological processes in Citrus sinensis

TL;DR: Combined systems biology approaches, such as mining the literature and various databases together with network reconstruction, analysis, and visualization tools, were employed to infer and visualize the co-regulatory relationships between miRNAs and transcriptional regulators in Citrus sinensis, shedding light on the key switches of C. sinensis metabolic pathways.
Book Chapter

New Developments in Sugarcane Genetics and Genomics

TL;DR: Among the strategies discussed are the use of single-nucleotide polymorphisms (SNPs) and bacterial artificial chromosome (BAC) libraries and the analysis of the syntenic relationships with related species (maize, sorghum, and rice).
Book Chapter

A Simplified Complex Network-Based Approach to mRNA and ncRNA Transcript Classification

TL;DR: In this paper, a simplified and efficient complex network-based approach for the classification of mRNA and ncRNA sequences was proposed, which achieved an average accuracy of 98% in the classification.
Journal Article

A Review on Saponin Biosynthesis and its Transcriptomic Resources in Medicinal Plants

TL;DR: This review discusses the significance for future studies of unexplored plants that are rich sources of saponins, and the transcriptome resources of important medicinal plants for identifying genes related to saponin biosynthesis.
Book Chapter

Computational Prediction of Lysine Pupylation Sites in Prokaryotic Proteins Using Position Specific Scoring Matrix into Bigram for Feature Extraction

TL;DR: A new predictor, PSSM-PUP, is proposed that uses evolutionary information of amino acids to predict pupylated lysine residues; it shows improved performance over existing predictors on the benchmark dataset from the PupDB database.