
Showing papers by "Toby J. Gibson" published in 2008


Journal Article
TL;DR: This review summarizes the current state of linear motif biology, in which low-affinity interactions create cooperative, combinatorial and highly dynamic regulatory protein complexes; the discrete properties of these assemblies suggest that systems biology models of cell regulatory networks should not depend too heavily on either stochastic or smooth deterministic approximations.
Abstract: It is now clear that a detailed picture of cell regulation requires a comprehensive understanding of the abundant short protein motifs through which signaling is channeled. The current body of knowledge has slowly accumulated through piecemeal experimental investigation of individual motifs in signaling. Computational methods contributed little to this process. A new generation of bioinformatics tools will aid the future investigation of motifs in regulatory proteins, and the disordered polypeptide regions in which they frequently reside. Allied to high throughput methods such as phosphoproteomics, signaling networks are becoming amenable to experimental deconstruction. In this review, we summarise the current state of linear motif biology, which uses low affinity interactions to create cooperative, combinatorial and highly dynamic regulatory protein complexes. The discrete deterministic properties implicit to these assemblies suggest that models for cell regulatory networks in systems biology should neither be overly dependent on stochastic nor on smooth deterministic approximations.

317 citations


Journal Article
TL;DR: The localization, structure, and binding specificity of this protein, which is named malectin, open the way to studies of its role in the genesis, processing and secretion of N-glycosylated proteins.
Abstract: N-Glycosylation starts in the endoplasmic reticulum (ER) where a 14-sugar glycan composed of three glucoses, nine mannoses, and two N-acetylglucosamines (Glc(3)Man(9)GlcNAc(2)) is transferred to nascent proteins. The glucoses are sequentially trimmed by ER-resident glucosidases. The Glc(3)Man(9)GlcNAc(2) moiety is the substrate for oligosaccharyltransferase; the Glc(1)Man(9)GlcNAc(2) and Man(9)GlcNAc(2) intermediates are signals for glycoprotein folding and quality control in the calnexin/calreticulin cycle. Here, we report a novel membrane-anchored ER protein that is highly conserved in animals and that recognizes the Glc(2)-N-glycan. Structure determination by nuclear magnetic resonance showed that its luminal part is a carbohydrate binding domain that recognizes glucose oligomers. Carbohydrate microarray analyses revealed a uniquely selective binding to a Glc(2)-N-glycan probe. The localization, structure, and binding specificity of this protein, which we have named malectin, open the way to studies of its role in the genesis, processing and secretion of N-glycosylated proteins.

232 citations


Journal Article
TL;DR: The new, greater focus on proteins that are in some way normally unstructured promises to provide a greater understanding of protein function, particularly with respect to protein–protein interactions.

79 citations


Component
TL;DR: This paper reports a novel membrane-anchored endoplasmic reticulum (ER) protein, named malectin, that is highly conserved in animals and recognizes the Glc(2)-N-glycan.

51 citations


Journal Article
TL;DR: The conservation score improves linear motif prediction by discarding matches that are unlikely to be functional because they have not been conserved during the evolution of the protein sequences.
Abstract: The structure of many eukaryotic cell regulatory proteins is highly modular. They are assembled from globular domains, segments of natively disordered polypeptides and short linear motifs. The latter are involved in protein interactions and formation of regulatory complexes. The function of such proteins, which may be difficult to define, is the aggregate of the subfunctions of the modules. It is therefore desirable to efficiently predict linear motifs with some degree of accuracy, yet sequence database searches return results that are not significant.

50 citations


Journal Article
TL;DR: Enrichment of KEN-box matches in proteins annotated with cell cycle Gene Ontology terms suggests that these motifs are collectively functional and may be more common than reported, although it does not prove that any given instance is functional.
Abstract: Motivation: KEN-box-mediated target selection is one of the mechanisms used in the proteasomal destruction of mitotic cell cycle proteins via the APC/C complex. While annotating the Eukaryotic Linear Motif resource (ELM, http://elm.eu.org/), we found that KEN motifs were significantly enriched in human protein entries with cell cycle keywords in the UniProt/Swiss-Prot database—implying that KEN-boxes might be more common than reported. Results: Matches to short linear motifs in protein database searches are not, per se, significant. KEN-box enrichment with cell cycle Gene Ontology terms suggests that collectively these motifs are functional but does not prove that any given instance is so. Candidates were surveyed for native disorder prediction using GlobPlot and IUPred and for motif conservation in homologues. Among >25 strong new candidates, the most notable are human HIPK2, CHFR, CDC27, Dab2, Upf2, kinesin Eg5, DNA Topoisomerase 1 and yeast Cdc5 and Swi5. A similar number of weaker candidates were present. These proteins have yet to be tested for APC/C targeted destruction, providing potential new avenues of research. Contact: toby.gibson@embl.de Supplementary information: Tables of KEN-box candidates and keyword/conservation significance assessments are available as supplementary data at Bioinformatics online.
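The point that matches to short linear motifs are not, per se, significant can be illustrated with a minimal sketch. It assumes the simplest possible pattern (the literal tripeptide K-E-N) and a crude background model of 20 equally frequent, independent residues; neither assumption comes from the paper, and the function names and the toy sequence are hypothetical.

```python
import re

# Assumption: the motif is scanned as the bare tripeptide "KEN".
# A lookahead is used so that every start position is reported.
KEN_PATTERN = re.compile(r"(?=KEN)")


def find_ken_matches(sequence: str) -> list[int]:
    """Return the 0-based start position of every KEN tripeptide."""
    return [match.start() for match in KEN_PATTERN.finditer(sequence.upper())]


def expected_random_matches(length: int, motif_length: int = 3,
                            alphabet_size: int = 20) -> float:
    """Expected number of chance matches to one fixed motif in a random
    sequence, assuming independent and equally frequent residues
    (a deliberate simplification of real amino acid composition)."""
    windows = max(length - motif_length + 1, 0)
    return windows * (1.0 / alphabet_size) ** motif_length


toy_sequence = "MSTKENLLPAAKENQ"  # hypothetical sequence, not a real protein
print(find_ken_matches(toy_sequence))   # start positions of KEN matches
print(expected_random_matches(1000))    # chance matches in 1000 random residues
```

Under this simplified model, roughly one random 1000-residue sequence in eight is expected to contain a KEN match by chance alone, which is why additional evidence such as keyword enrichment, disorder prediction and conservation in homologues is needed before a match is treated as a candidate.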

41 citations


Journal Article
TL;DR: None of the programs currently available is capable of reliably aligning LMs in distantly related sequences and a number of specific problems are highlighted.
Abstract: Linear motifs (LMs) are abundant short regulatory sites used for modulating the functions of many eukaryotic proteins. They play important roles in post-translational modification, cell compartment targeting, docking sites for regulatory complex assembly and protein processing and cleavage. Methods for LM detection are now being developed that are strongly dependent on scores for motif conservation in homologous proteins. However, most LMs are found in natively disordered polypeptide segments that evolve rapidly, unhindered by structural constraints on the sequence. These regions of modular proteins are difficult to align using classical multiple sequence alignment programs that are specifically optimised to align the globular domains. As a consequence, poor motif alignment quality is hindering efforts to detect new LMs. We have developed a new benchmark, as part of the BAliBASE suite, designed to assess the ability of standard multiple alignment methods to detect and align LMs. The reference alignments are organised into different test sets representing real alignment problems and contain examples of experimentally verified functional motifs, extracted from the Eukaryotic Linear Motif (ELM) database. The benchmark has been used to evaluate and compare a number of multiple alignment programs. With distantly related proteins, the worst alignment program correctly aligns 48% of LMs compared to 73% for the best program. However, the performance of all the programs is adversely affected by the introduction of other sequences containing false positive motifs. The ranking of the alignment programs based on LM alignment quality is similar to that observed when considering full-length protein alignments; however, little correlation was observed between LM and overall alignment quality for individual alignment test cases.
We have shown that none of the programs currently available is capable of reliably aligning LMs in distantly related sequences and we have highlighted a number of specific problems. The results of the tests suggest possible ways to improve program accuracy for difficult, divergent sequences.
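The idea of scoring how many LMs an alignment program places correctly can be sketched as follows. This is not the benchmark's actual scoring procedure, which the abstract does not specify; it assumes a simplified criterion in which a motif counts as aligned when its first residue falls in the same alignment column in every sequence. All function names and the toy alignments are hypothetical.

```python
def residue_to_column(aligned_row: str, residue_index: int, gap: str = "-") -> int:
    """Map a 0-based residue index in the ungapped sequence to the
    column it occupies in the gapped alignment row."""
    seen = -1
    for column, char in enumerate(aligned_row):
        if char != gap:
            seen += 1
            if seen == residue_index:
                return column
    raise ValueError("residue index lies beyond the end of the sequence")


def motif_is_aligned(aligned_rows: list[str], motif_starts: list[int]) -> bool:
    """Simplified criterion (an assumption): the motif start of every
    sequence must land in one shared alignment column."""
    columns = {
        residue_to_column(row, start)
        for row, start in zip(aligned_rows, motif_starts)
    }
    return len(columns) == 1


def fraction_aligned(test_cases: list[tuple[list[str], list[int]]]) -> float:
    """Fraction of test cases whose motif is aligned under the criterion."""
    if not test_cases:
        return 0.0
    hits = sum(motif_is_aligned(rows, starts) for rows, starts in test_cases)
    return hits / len(test_cases)


# Toy alignments of the hypothetical sequences MSKENLP and MKENALP;
# the motif starts at residue 2 in the first and residue 1 in the second.
good = (["MSKEN-LP", "M-KENALP"], [2, 1])   # gap placement lines the motifs up
bad = (["MSKENLP-", "MKENALP-"], [2, 1])    # motifs are offset by one column
print(fraction_aligned([good, bad]))
```

A stricter variant would require every motif residue, not only the first, to share a column; the start-only check is used here to keep the sketch short.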

32 citations


Reference Entry
15 Mar 2008
TL;DR: This article covers the modular model of protein function, resources for analyzing globular domains such as SMART, methods for finding protein disorder such as GlobPlot and DisEMBL, and function prediction for nonglobular protein segments with the Eukaryotic Linear Motif resource (ELM).
Abstract: Originally published in: Modular Protein Domains. Edited by Giovanni Cesareni, Mario Gimona, Marius Sudol and Michael Yaffe. Copyright © 2005 Wiley-VCH Verlag GmbH & Co. KGaA Weinheim. Print ISBN: 3-527-30813-2.
The sections in this article are: Introduction; Protein Architecture: Sequence, Structure, and Function; The Modular Model of Protein Function; Partitioning of Protein Space; Analyzing Globular Domains; Globularity of Domains; Resources for Analysis of Globular Domains; SMART: Simple Modular Architecture Research Tool; The SMART Alignment Set; SMART Relational Database System; Web Interface; Application of SMART; Other Features and Resources; Globular Repeats; Domain Interaction Prediction; No Domains? Analyzing Nonglobular Protein Segments; Unstructured Regions: Protein Disorder; What Role Does Protein Disorder Play in Biology?; What is Protein Disorder?; Methods for Finding Protein Disorder; GlobPlotting; Prediction of Multiple Types of Disorder with DisEMBL; Design of Protein Expression Vectors; Function Prediction for Nonglobular Protein Segments; Available Resources; The Eukaryotic Linear Motif Resource: ELM; ELM Annotation – ‘Site seeing’; ELM Resource Architecture; Knowledge-based Decision Support (KBDS): ELM Filtering; Using ELM; URLs; Conclusions; Acknowledgements.
Keywords: modular protein domains; computational analysis; protein architecture; sequence; structure; function; analyzing globular domains; SMART: Simple Modular Architecture Research Tool; analyzing nonglobular domains; URLs

3 citations