Open Access Journal Article

PSSM-based prediction of DNA binding sites in proteins.

TLDR
A neural network based algorithm is implemented to utilize evolutionary information of amino acid sequences in terms of their position specific scoring matrices (PSSMs) for a better prediction of DNA-binding sites.
Abstract
Detection of DNA-binding sites in proteins is of enormous interest for technologies targeting gene regulation and manipulation. We have previously shown that information about a residue and its sequence neighbors can be used to predict DNA-binding candidates in a protein sequence. This sequence-based prediction method is applicable even if no sequence homology with a previously known DNA-binding protein is observed. Here we implement a neural-network-based algorithm that uses evolutionary information of amino acid sequences, in the form of their position-specific scoring matrices (PSSMs), for better prediction of DNA-binding sites. The average of sensitivity and specificity using PSSMs is up to 8.7% better than that of prediction with sequence information only. Much smaller data sets could be used to generate PSSMs with minimal loss of prediction accuracy. One problem with PSSM-derived prediction is the need for lengthy and time-consuming alignments against large sequence databases. To speed up the generation of PSSMs, we tried different reference data sets (sequence space) against which a target protein is scanned in PSI-BLAST iterations. We find that a very small set of proteins can be used as such a reference data set without losing much of the prediction value. This makes the generation of PSSMs very rapid and even amenable to use at a genome level. A web server has been developed to provide these predictions of DNA-binding sites for any new protein from its amino acid sequence. Online predictions based on this method are available at http://www.netasa.org/dbs-pssm/
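
As a rough illustration of the two ideas in the abstract (feeding a residue and its sequence neighbors to a classifier as PSSM rows, and scoring predictions by the average of sensitivity and specificity), the following minimal Python sketch may help. It is not the authors' implementation: the function names, the window half-width, and the zero-padding at the sequence ends are assumptions made here for illustration, and the neural network itself is omitted.

```python
def window_features(pssm, half_width=2):
    """Build one feature vector per residue from its PSSM row and its neighbors' rows.

    `pssm` is a list of rows, one per residue, each holding per-amino-acid scores.
    `half_width` (hypothetical default) is the number of neighbors on each side.
    """
    n_cols = len(pssm[0])
    # Assumption: zero-pad both ends so terminal residues get a full-size window.
    pad = [[0.0] * n_cols for _ in range(half_width)]
    padded = pad + [[float(score) for score in row] for row in pssm] + pad
    size = 2 * half_width + 1
    features = []
    for i in range(len(pssm)):
        window = padded[i:i + size]
        # Flatten the window rows into a single vector for the classifier input.
        features.append([score for row in window for score in row])
    return features


def mean_sensitivity_specificity(y_true, y_pred):
    """Average of sensitivity and specificity for binary labels (1 = binding).

    Assumes both classes occur in `y_true`; otherwise a division by zero occurs.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2
```

Averaging sensitivity and specificity, rather than reporting plain accuracy, keeps the score from being dominated by the non-binding residues, which typically far outnumber binding ones.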



Citations

BindN: a web-based tool for efficient prediction of DNA and RNA binding sites in amino acid sequences.

TL;DR: BindN provides a useful tool for understanding the function of DNA and RNA-binding proteins based on primary sequence data and the SVM models appear to be more accurate and more efficient for online predictions.

DP-Bind: a web server for sequence-based prediction of DNA-binding residues in DNA-binding proteins

TL;DR: DP-Bind is a web server for predicting DNA-binding sites in a DNA-binding protein from its amino acid sequence that implements three machine learning methods: support vector machine, kernel logistic regression and penalized logistic regression.

BindN+ for accurate prediction of DNA and RNA-binding residues from protein sequence features

TL;DR: In this article, several new descriptors of evolutionary information have been developed and evaluated for sequence-based prediction of DNA and RNA-binding residues using support vector machines (SVMs).

BindN+ for accurate prediction of DNA and RNA-binding residues from protein sequence features

TL;DR: A web-based tool is developed to make the SVM classifiers accessible to the research community and suggest that predictions at this level of accuracy may provide useful information for modelling protein-nucleic acid interactions in systems biology studies.

Sequence and structural features of carbohydrate binding in proteins and assessment of predictability using a neural network.

TL;DR: Statistics of amino acid compositions in binding versus non-binding regions, both overall and within each secondary structure conformation, are compiled; neural networks give moderately successful predictions, which are expected to improve when structures of more protein-carbohydrate complexes become available.
References

Basic Local Alignment Search Tool

TL;DR: A new approach to rapid sequence comparison, basic local alignment search tool (BLAST), directly approximates alignments that optimize a measure of local similarity, the maximal segment pair (MSP) score.

Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.

TL;DR: A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original.

The Protein Data Bank

TL;DR: The goals of the PDB, the systems in place for data deposition and access, how to obtain further information, and plans for the future development of the resource are described.

Protein secondary structure prediction based on position-specific scoring matrices

TL;DR: A two-stage neural network has been used to predict protein secondary structure based on the position-specific scoring matrices generated by PSI-BLAST and achieved an average Q3 score of between 76.5% and 78.3%, depending on the precise definition of observed secondary structure used, which is the highest published score for any method to date.

Application of multiple sequence alignment profiles to improve protein secondary structure prediction

TL;DR: Training a neural network secondary structure prediction algorithm with different types of multiple sequence alignment profiles derived from the same sequences is shown to yield accuracies ranging from 70.5% to 76.4%.