Open Access, Journal Article

Twilight zone of protein sequence alignments.

Burkhard Rost
01 Feb 1999, Vol. 12, Iss. 2, pp. 85-94
Abstract
Sequence alignments unambiguously distinguish between protein pairs of similar and non-similar structure when the pairwise sequence identity is high (>40% for long alignments). The signal gets blurred in the twilight zone of 20-35% sequence identity. Here, more than a million sequence alignments were analysed between protein pairs of known structures to re-define a line distinguishing between true and false positives for low levels of similarity. Four results stood out. (i) The transition from the safe zone of sequence alignment into the twilight zone is described by an explosion of false negatives. More than 95% of all pairs detected in the twilight zone had different structures. More precisely, above a cut-off roughly corresponding to 30% sequence identity, 90% of the pairs were homologous; below 25% less than 10% were. (ii) Whether or not sequence homology implied structural identity depended crucially on the alignment length. For example, if 10 residues were similar in an alignment of length 16 (>60%), structural similarity could not be inferred. (iii) The 'more similar than identical' rule (discarding all pairs for which percentage similarity was lower than percentage identity) reduced false positives significantly. (iv) Using intermediate sequences for finding links between more distant families was almost as successful: pairs were predicted to be homologous when the respective sequence families had proteins in common. All findings are applicable to automatic database searches.
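
The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names are hypothetical, alignment length is assumed to mean the number of columns in which both sequences have a residue, and percentage similarity is taken as a precomputed input because the abstract does not define how it is scored. The last function sketches result (iv) under the assumption that a family is simply a collection of protein identifiers.

```python
from typing import Iterable

GAP = "-"  # assumed gap character in the aligned strings


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of identical residues over columns where both sequences are aligned.

    Assumption: columns containing a gap in either sequence do not count
    towards the alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != GAP and b != GAP]
    if not pairs:
        return 0.0
    identical = sum(1 for a, b in pairs if a == b)
    return 100.0 * identical / len(pairs)


def passes_more_similar_than_identical(pct_identity: float, pct_similarity: float) -> bool:
    """Result (iii): discard a pair when percentage similarity is lower than percentage identity."""
    return pct_similarity >= pct_identity


def families_share_member(family_a: Iterable[str], family_b: Iterable[str]) -> bool:
    """Result (iv): predict homology when two sequence families have a protein in common."""
    return not set(family_a).isdisjoint(family_b)


# The abstract's example: 10 matching residues in an alignment of length 16.
short_alignment_pct = 100.0 * 10 / 16  # 62.5, i.e. ">60%"
```

The short-alignment example shows why a raw percentage is not enough: 62.5% looks high, yet over only 16 aligned residues the abstract states that structural similarity could not be inferred, so any practical cut-off has to depend on alignment length as well.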

