Author

Toby J. Gibson

Bio: Toby J. Gibson is an academic researcher from the European Bioinformatics Institute. The author has contributed to research in topics: Short linear motif & Eukaryotic Linear Motif resource. The author has an h-index of 78 and has co-authored 171 publications receiving 167,371 citations. Previous affiliations of Toby J. Gibson include University of Rome Tor Vergata & University College Dublin.


Papers
Journal Article
TL;DR: A project to comprehensively annotate ciliary genes of the laboratory mouse using Gene Ontology (GO) terms to describe their molecular functions, biological roles, and cellular locations to help better understand the similarities and differences between mouse and human in the role of Sonic hedgehog signaling in development.
Abstract: Interest in primary cilia has increased dramatically over the last ten years as it has become clear that ciliopathies are an underlying cause of numerous human diseases including some types of retinitis pigmentosa and polycystic kidney disease. Once thought to be restricted to a few cell types, it is now clear that primary cilia are found on almost all vertebrate cells and are critical to Sonic hedgehog (Shh) signaling. Mouse models play a key role in developing our understanding of the role of primary cilia in control of Shh signaling in development throughout the embryo and in ongoing maintenance of structures such as photoreceptors. To maximize the utility of the wealth of experimental data generated by these mouse ciliopathy models, we have initiated a project to comprehensively annotate ciliary genes of the laboratory mouse using Gene Ontology (GO) terms to describe their molecular functions, biological roles, and cellular locations. We are guided by the SysCilia gold standard of known human ciliary components as a starting point, but will also include additional genes experimentally shown to be involved in ciliary function in the mouse. If needed, we will also update the Gene Ontology to add new terms representing recent advances in our understanding of ciliary biology. Comprehensive GO annotation of ciliary genes in the mouse will be a great resource to those doing high throughput studies or comparative genomic analysis across species, and may help us better understand the similarities and differences between mouse and human in the role of Sonic hedgehog signaling in development. This work is funded by HG 002273 to the Gene Ontology Consortium. 
Figure text (GENEONTOLOGY, Unifying Biology): ID mapping at http://www.uniprot.org/; 307 mouse genes on cilia candidate list; additional research for human genes with no mouse homologs via MouseMine; identification of mouse homologs with MouseMine (http://www.mousemine.org/).

1 citations

Posted Content
21 Apr 2023-bioRxiv
TL;DR: LeishMANIAdb is a database specifically designed to investigate how Leishmania virulence factors may interfere with host proteins; it may provide new insights into the molecular mechanisms of infection and help to identify new therapeutic targets for this neglected disease.
Abstract: Leishmaniasis is a detrimental disease causing serious changes in quality of life and some forms lead to death. The disease is spread by the parasite Leishmania transmitted by sandfly vectors and their primary hosts are vertebrates including humans. The pathogen penetrates host cells and secretes proteins (the secretome) to repurpose cells for pathogen growth and to alter cell signaling via host-pathogen Protein-Protein Interactions (PPIs). Here we present LeishMANIAdb, a database specifically designed to investigate how Leishmania virulence factors may interfere with host proteins. Since the secretomes of different Leishmania species are only partially characterized, we collected various experimental evidence and used computational predictions to identify Leishmania secreted proteins to generate a user-friendly unified web resource allowing users to access all information available on experimental and predicted secretomes. In addition, we manually annotated host-pathogen interactions of 211 proteins, and the localization/function of 3764 transmembrane (TM) proteins of different Leishmania species. We also enriched all proteins with automatic structural and functional predictions that can provide new insights in the molecular mechanisms of infection. Our database, available at https://leishmaniadb.ttk.hu may provide novel insights into Leishmania host-pathogen interactions and help to identify new therapeutic targets for this neglected disease.

1 citations

Journal Article
TL;DR: In this paper, the authors used DisProt to evaluate the transferability of annotation terms to orthologous proteins and built multiple sequence alignments (MSAs) for each protein and its orthologs.
Abstract: Background: DisProt is the primary repository of Intrinsically Disordered Proteins. This database is manually curated and the annotations there have strong experimental support. Currently DisProt contains a relatively small number of proteins, highlighting the importance of transferring verified disorder and other annotations, in such a way as to increase the number of proteins that could benefit from this valuable information. While the principles and practicalities of homology transfer are well-established for globular proteins, these are largely lacking for disordered proteins. Methods: We used DisProt to evaluate the transferability of the annotation terms to orthologous proteins. For each protein, we looked for their orthologs, with the assumption that they will have a similar function. Then, for each protein and their orthologs we made multiple sequence alignments (MSAs). Global and regional quality of the MSAs was evaluated with the NorMD score. Results: We have designed a pipeline to obtain good quality MSAs and to transfer annotations from any protein to their orthologs. Applying the pipeline to DisProt proteins, from the 1931 entries with 5,623 annotations we can reach 97,555 orthologs and transfer a total of 301,190 terms by homology. We also provide a web server for consulting the results of DisProt proteins and executing the pipeline for any other protein. The server Homology Transfer IDP (HoTIDP) is accessible at http://hotidp.leloir.org.ar.

1 citations
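
The homology-transfer idea in the abstract above (carrying a region annotation from a curated protein to an ortholog through aligned columns) can be illustrated with a minimal sketch. This is not the HoTIDP pipeline: the function name `transfer_region`, the gap character, and the rule of keeping only columns where both sequences have a residue are assumptions made for illustration, and no alignment quality check such as NorMD is included.

```python
def transfer_region(aligned_src, aligned_tgt, start, end, gap="-"):
    """Map a 1-based inclusive region of the source sequence onto the target.

    aligned_src and aligned_tgt are two rows taken from the same alignment
    (hypothetical inputs). Returns (target_start, target_end), 1-based and
    inclusive, or None if the region only faces gaps in the target.
    """
    if len(aligned_src) != len(aligned_tgt):
        raise ValueError("aligned rows must have the same length")

    src_pos = 0  # residues of the source seen so far (ungapped coordinate)
    tgt_pos = 0  # residues of the target seen so far (ungapped coordinate)
    mapped = []  # target positions aligned to residues inside the region

    for s, t in zip(aligned_src, aligned_tgt):
        if s != gap:
            src_pos += 1
        if t != gap:
            tgt_pos += 1
        # Keep only columns where both rows carry a residue.
        if s != gap and t != gap and start <= src_pos <= end:
            mapped.append(tgt_pos)

    return (mapped[0], mapped[-1]) if mapped else None


# Source residues 3-5 (T, A, Y) land on target residues 4-5 (T, Y).
print(transfer_region("MK-TAYIA", "MKQT-YLA", 3, 5))
```

A real pipeline would additionally decide when an alignment is too poor to trust the transfer, which is the role the abstract assigns to the NorMD score.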

Journal Article
TL;DR: In this paper, a small conserved motif (ConMot) in the disordered linker of MLH1 was identified; alterations of this motif inactivated mismatch repair and are suggested to be causative for Lynch syndrome.
Abstract: DNA mismatch repair (MMR) is essential for correction of DNA replication errors. Germline mutations of the human MMR gene MLH1 are the major cause of Lynch syndrome, a heritable cancer predisposition. In the MLH1 protein, a non-conserved, intrinsically disordered region connects two conserved, catalytically active structured domains of MLH1. This region has as yet been regarded as a flexible spacer, and missense alterations in this region have been considered non-pathogenic. However, we have identified and investigated a small motif (ConMot) in this linker which is conserved in eukaryotes. Deletion of the ConMot or scrambling of the motif abolished mismatch repair activity. A mutation from a cancer family within the motif (p.Arg385Pro) also inactivated MMR, suggesting that ConMot alterations can be causative for Lynch syndrome. Intriguingly, the mismatch repair defect of the ConMot variants could be restored by addition of a ConMot peptide containing the deleted sequence. This is the first instance of a DNA mismatch repair defect conferred by a mutation that can be overcome by addition of a small molecule. Based on the experimental data and AlphaFold2 predictions, we suggest that the ConMot may bind close to the C-terminal MLH1-PMS2 endonuclease and modulate its activation during the MMR process.
Posted Content
07 Jul 2023-bioRxiv
TL;DR: This article explores three ancient protein-targeting linear motif systems in Leishmania, pathogenic tropical flagellates belonging to an early-branching eukaryotic lineage (Kinetoplastida) with several unique features.
Abstract: The pathogenic tropical flagellates Leishmania belong to an early-branching eukaryotic lineage (Kinetoplastida) with several unique features. Here, we explore three ancient protein targeting linear motif systems and their receptors and demonstrate how they resemble or differ from other eukaryotic organisms, including their hosts. Secretory signal peptides, endoplasmic reticulum (ER) retention motifs (KDEL motifs), and autophagy signals (motifs interacting with ATG8 family members) are essential components of cellular life. Although expected to be conserved, we observe that all three systems show a varying degree of divergence from the eukaryotic version observed in animals, plants, or fungi. We not only describe their behavior but also build predictive models that allow the prediction of localization or function for several proteins in Leishmania species for the first time. Several of these critical protein-protein interactions could serve as targets of selective antimicrobial agents against Leishmaniasis due to their divergence from the host.

Cited by
Journal Article
TL;DR: A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original.
Abstract: The BLAST programs are widely used tools for searching protein and DNA databases for sequence similarities. For protein comparisons, a variety of definitional, algorithmic and statistical refinements described here permits the execution time of the BLAST programs to be decreased substantially while enhancing their sensitivity to weak similarities. A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original. In addition, a method is introduced for automatically combining statistically significant alignments produced by BLAST into a position-specific score matrix, and searching the database using this matrix. The resulting Position-Specific Iterated BLAST (PSI-BLAST) program runs at approximately the same speed per iteration as gapped BLAST, but in many cases is much more sensitive to weak but biologically relevant sequence similarities. PSI-BLAST is used to uncover several new and interesting members of the BRCT superfamily.

70,111 citations

Journal Article
TL;DR: The sensitivity of the commonly used progressive multiple sequence alignment method has been greatly improved and modifications are incorporated into a new program, CLUSTAL W, which is freely available.
Abstract: The sensitivity of the commonly used progressive multiple sequence alignment method has been greatly improved for the alignment of divergent protein sequences. Firstly, individual weights are assigned to each sequence in a partial alignment in order to down-weight near-duplicate sequences and up-weight the most divergent ones. Secondly, amino acid substitution matrices are varied at different alignment stages according to the divergence of the sequences to be aligned. Thirdly, residue-specific gap penalties and locally reduced gap penalties in hydrophilic regions encourage new gaps in potential loop regions rather than regular secondary structure. Fourthly, positions in early alignments where gaps have been opened receive locally reduced gap penalties to encourage the opening up of new gaps at these positions. These modifications are incorporated into a new program, CLUSTAL W, which is freely available.

63,427 citations

Journal Article
TL;DR: CLUSTAL X is a new windows interface for the widely-used progressive multiple sequence alignment program CLUSTAL W, providing an integrated system for performing multiple sequence and profile alignments and analysing the results.
Abstract: CLUSTAL X is a new windows interface for the widely-used progressive multiple sequence alignment program CLUSTAL W. The new system is easy to use, providing an integrated system for performing multiple sequence and profile alignments and analysing the results. CLUSTAL X displays the sequence alignment in a window on the screen. A versatile sequence colouring scheme allows the user to highlight conserved features in the alignment. Pull-down menus provide all the options required for traditional multiple sequence and profile alignment. New features include: the ability to cut-and-paste sequences to change the order of the alignment, selection of a subset of the sequences to be realigned, and selection of a sub-range of the alignment to be realigned and inserted back into the original alignment. Alignment quality analysis can be performed and low-scoring segments or exceptional residues can be highlighted. Quality analysis and realignment of selected residue ranges provide the user with a powerful tool to improve and refine difficult alignments and to trap errors in input sequences. CLUSTAL X has been compiled on SUN Solaris, IRIX5.3 on Silicon Graphics, Digital UNIX on DECstations, Microsoft Windows (32 bit) for PCs, Linux ELF for x86 PCs, and Macintosh PowerMac.

38,522 citations

Journal Article
TL;DR: MUSCLE is a new computer program for creating multiple alignments of protein sequences that includes fast distance estimation using kmer counting, progressive alignment using a new profile function the authors call the log-expectation score, and refinement using tree-dependent restricted partitioning.
Abstract: We describe MUSCLE, a new computer program for creating multiple alignments of protein sequences. Elements of the algorithm include fast distance estimation using kmer counting, progressive alignment using a new profile function we call the log-expectation score, and refinement using tree-dependent restricted partitioning. The speed and accuracy of MUSCLE are compared with T-Coffee, MAFFT and CLUSTALW on four test sets of reference alignments: BAliBASE, SABmark, SMART and a new benchmark, PREFAB. MUSCLE achieves the highest, or joint highest, rank in accuracy on each of these sets. Without refinement, MUSCLE achieves average accuracy statistically indistinguishable from T-Coffee and MAFFT, and is the fastest of the tested methods for large numbers of sequences, aligning 5000 sequences of average length 350 in 7 min on a current desktop computer. The MUSCLE program, source code and PREFAB test data are freely available at http://www.drive5.com/muscle.

37,524 citations
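
The "fast distance estimation using kmer counting" mentioned in the abstract above can be illustrated with a small sketch. This is a generic alignment-free distance, one minus the fraction of shared k-mers, and is only an assumption about the general idea; it is not claimed to be the exact measure MUSCLE uses, and the names `kmer_counts` and `kmer_distance` and the default `k=3` are hypothetical.

```python
from collections import Counter


def kmer_counts(seq, k=3):
    """Count the overlapping k-mers of a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_distance(a, b, k=3):
    """Return 1 - (shared k-mers / k-mers in the shorter sequence).

    Shared k-mers are counted with multiplicity (multiset intersection).
    No alignment is computed, which is what makes this kind of estimate fast.
    """
    denom = min(len(a), len(b)) - k + 1
    if denom <= 0:
        raise ValueError("both sequences must contain at least k residues")
    shared = sum((kmer_counts(a, k) & kmer_counts(b, k)).values())
    return 1.0 - shared / denom


# Identical sequences give 0.0; sequences with no common k-mer give 1.0.
print(kmer_distance("ACDEFGH", "ACDEFGH"))
print(kmer_distance("AAAAAA", "CCCCCC"))
```

Distances of this kind are typically used to build a guide tree before any pairwise alignment is attempted, which is where the speed advantage for large sequence sets comes from.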

Journal Article
TL;DR: Two unusual extensions are presented: Multiscale, which adds the ability to visualize large‐scale molecular assemblies such as viral coats, and Collaboratory, which allows researchers to share a Chimera session interactively despite being at separate locales.
Abstract: The design, implementation, and capabilities of an extensible visualization system, UCSF Chimera, are discussed. Chimera is segmented into a core that provides basic services and visualization, and extensions that provide most higher level functionality. This architecture ensures that the extension mechanism satisfies the demands of outside developers who wish to incorporate new features. Two unusual extensions are presented: Multiscale, which adds the ability to visualize large-scale molecular assemblies such as viral coats, and Collaboratory, which allows researchers to share a Chimera session interactively despite being at separate locales. Other extensions include Multalign Viewer, for showing multiple sequence alignments and associated structures; ViewDock, for screening docked ligand orientations; Movie, for replaying molecular dynamics trajectories; and Volume Viewer, for display and analysis of volumetric data. A discussion of the usage of Chimera in real-world situations is given, along with anticipated future directions. Chimera includes full user documentation, is free to academic and nonprofit users, and is available for Microsoft Windows, Linux, Apple Mac OS X, SGI IRIX, and HP Tru64 Unix from http://www.cgl.ucsf.edu/chimera/.

35,698 citations