PSSM-based prediction of DNA binding sites in proteins.

doi:10.1186/1471-2105-6-33

Open Access Journal Article

Shandar Ahmad, +2 more

- 19 Feb 2005 -

BMC Bioinformatics

- Vol. 6, Iss: 1, pp 33-33

TLDR

A neural-network-based algorithm is implemented that uses the evolutionary information of amino acid sequences, in the form of their position-specific scoring matrices (PSSMs), to improve prediction of DNA-binding sites.
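
As a rough illustration of how PSSM rows might be turned into per-residue inputs for such a network, the sketch below builds a sliding-window feature vector around each residue. The function name `window_features`, the window half-width of 3, and the zero padding at the sequence ends are all assumptions made for this example; the record here does not state the settings used in the paper.

```python
def window_features(pssm, half_window=3):
    """Build one flat feature vector per residue from a PSSM.

    pssm: list of rows, one row of substitution scores per residue
          (typically 20 columns, one per amino acid type).
    half_window: number of neighbors taken on each side (assumed value).
    """
    if not pssm:
        return []
    n_cols = len(pssm[0])
    # Zero-pad both ends so terminal residues also get a full-width window
    # (padding scheme is an assumption of this sketch).
    pad = [[0.0] * n_cols for _ in range(half_window)]
    padded = pad + [list(row) for row in pssm] + pad
    width = 2 * half_window + 1
    features = []
    for i in range(len(pssm)):
        window = padded[i:i + width]
        # Flatten the window rows into a single input vector.
        features.append([score for row in window for score in row])
    return features
```

Each returned vector has `(2 * half_window + 1) * n_cols` entries and could be passed to any binary classifier that labels the central residue as binding or non-binding.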

Abstract:

Detection of DNA-binding sites in proteins is of enormous interest for technologies targeting gene regulation and manipulation. We have previously shown that information about a residue and its sequence neighbors can be used to predict DNA-binding candidates in a protein sequence. This sequence-based prediction method is applicable even when no sequence homology with a previously known DNA-binding protein is observed. Here we implement a neural-network-based algorithm that uses the evolutionary information of amino acid sequences, in the form of their position-specific scoring matrices (PSSMs), to improve prediction of DNA-binding sites. The average of sensitivity and specificity obtained with PSSMs is up to 8.7% better than that obtained with sequence information alone. One problem with PSSM-derived prediction is that it requires lengthy, time-consuming alignments against large sequence databases. To speed up PSSM generation, we tried different reference data sets (sequence spaces) against which a target protein is scanned in PSI-BLAST iterations. We find that much smaller data sets, even a very small set of proteins, can serve as the reference with minimal loss of prediction accuracy. This makes PSSM generation rapid enough to be applied at the genome level. A web server has been developed to provide these predictions of DNA-binding sites for any new protein from its amino acid sequence. Online predictions based on this method are available at http://www.netasa.org/dbs-pssm/
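
The performance figure quoted in the abstract is an average of sensitivity and specificity. A minimal sketch of that calculation follows, assuming the common definitions sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP); the paper may define these terms differently, and the function name `sensitivity_specificity_average` is hypothetical.

```python
def sensitivity_specificity_average(y_true, y_pred):
    """Return (sensitivity, specificity, their mean) for binary labels.

    y_true, y_pred: sequences of 0/1 labels, 1 meaning "DNA-binding residue".
    Definitions used here are the standard ones and are an assumption.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    # Guard against empty classes to avoid division by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity, (sensitivity + specificity) / 2
```

Averaging the two rates, rather than reporting raw accuracy, keeps the score meaningful when binding residues are a small minority of all residues.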

Citations

BindN: a web-based tool for efficient prediction of DNA and RNA binding sites in amino acid sequences.

Liangjiang Wang, +1 more

- 01 Jul 2006 -

Nucleic Acids Research

TL;DR: BindN provides a useful tool for understanding the function of DNA and RNA-binding proteins based on primary sequence data and the SVM models appear to be more accurate and more efficient for online predictions.

DP-Bind: a web server for sequence-based prediction of DNA-binding residues in DNA-binding proteins

Seungwoo Hwang, +2 more

- 10 Feb 2007 -

Bioinformatics

TL;DR: DP-Bind is a web server for predicting DNA-binding sites in a DNA-binding protein from its amino acid sequence. It implements three machine learning methods: support vector machine, kernel logistic regression, and penalized logistic regression.

BindN+ for accurate prediction of DNA and RNA-binding residues from protein sequence features

Liangjiang Wang, +6 more

- 28 May 2010 -

BMC Systems Biology

TL;DR: Several new descriptors of evolutionary information are developed and evaluated for sequence-based prediction of DNA and RNA-binding residues using support vector machines (SVMs). A web-based tool makes the SVM classifiers accessible to the research community, and the authors suggest that predictions at this level of accuracy may provide useful information for modelling protein-nucleic acid interactions in systems biology studies.

Sequence and structural features of carbohydrate binding in proteins and assessment of predictability using a neural network.

Adeel Malik, +1 more

- 03 Jan 2007 -

BMC Structural Biology

TL;DR: Statistics of amino acid composition in binding versus non-binding regions are compiled, both overall and within each secondary structure conformation. Neural networks give moderate prediction success, which is expected to improve as structures of more protein-carbohydrate complexes become available.

References

Basic Local Alignment Search Tool

Stephen F. Altschul, +4 more

- 01 Oct 1990 -

Journal of Molecular Biology

TL;DR: A new approach to rapid sequence comparison, basic local alignment search tool (BLAST), directly approximates alignments that optimize a measure of local similarity, the maximal segment pair (MSP) score.

Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.

Stephen F. Altschul, +6 more

- 01 Sep 1997 -

Nucleic Acids Research

TL;DR: A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original.

The Protein Data Bank

Helen M. Berman, +7 more

- 01 Jan 2000 -

Nucleic Acids Research

TL;DR: Describes the goals of the PDB, the systems in place for data deposition and access, how to obtain further information, and plans for the future development of the resource.

Protein secondary structure prediction based on position-specific scoring matrices

David T. Jones

- 17 Sep 1999 -

Journal of Molecular Biology

TL;DR: A two-stage neural network has been used to predict protein secondary structure based on the position-specific scoring matrices generated by PSI-BLAST. It achieved an average Q3 score of between 76.5% and 78.3%, depending on the precise definition of observed secondary structure used, which was the highest published score for any method at the time.

Application of multiple sequence alignment profiles to improve protein secondary structure prediction

James Cuff, +1 more

- 15 Aug 2000 -

Proteins

TL;DR: Training a neural network secondary structure prediction algorithm with different types of multiple sequence alignment profiles derived from the same sequences is shown to yield accuracies ranging from 70.5% to 76.4%.
