Author

Toby J. Gibson

Bio: Toby J. Gibson is an academic researcher from the European Bioinformatics Institute. The author has contributed to research on topics including short linear motifs and the Eukaryotic Linear Motif resource. The author has an h-index of 78 and has co-authored 171 publications receiving 167,371 citations. Previous affiliations of Toby J. Gibson include University of Rome Tor Vergata and University College Dublin.


Papers
Journal Article
TL;DR: This work proposes that each Myb repeat consists of three alpha helices packed over a hydrophobic core which is built around the three highly conserved tryptophan residues.
Abstract: Myb-related proteins from plants to humans are characterized by a DNA-binding domain which contains two to three imperfect repeats of approximately 50 amino acids each. Based on the evolutionary conservation of specific residues, secondary structural predictions suggest an arrangement of alpha helices homologous to that seen in the homeodomains, members of the helix-turn-helix family of DNA-binding proteins. We have used molecular modelling in conjunction with site-directed mutagenesis to test the feasibility of this structure. We propose that each Myb repeat consists of three alpha helices packed over a hydrophobic core which is built around the three highly conserved tryptophan residues. The C-terminal helix forms part of the helix-turn-helix motif and can be positioned into the major groove of B-form DNA, allowing prediction of residues critical for specificity of interaction. Modelling also allowed positioning of adjacent repeats around the major groove over an 8 bp binding site.

80 citations

Journal Article
TL;DR: It is concluded that there is no strong evidence against the octaploidy, provided that consecutive genome duplication was rapid, and the sequence trees are weakly supportive of ancient octaploidy.
Abstract: Vertebrate genomes are larger than those of invertebrates and show evidence of extensive gene duplication, including many collinear chromosomal segments. On the basis of this intra-genomic synteny, it has been proposed that two rounds of whole genome duplication (octaploidy) occurred early in the vertebrate lineage. Recently, this early vertebrate octaploidy has been challenged on the basis of gene trees. We report new linkage groups encompassing the matrilin (MATN), syndecan (SDC), Eyes Absent (EYA), HCK kinase and SRC kinase paralogous gene quartets. In contrast to other studies, the sequence trees are weakly supportive of ancient octaploidy. It is concluded that there is no strong evidence against the octaploidy, provided that consecutive genome duplication was rapid.

79 citations

Journal Article
TL;DR: The guidelines of this collaborative effort, the current status of contributed data, and the PDBe-KB infrastructure, which includes the data exchange format, the deposition system for added value annotations, the distributable database containing the assembled data, and programmatic access endpoints, are described.
Abstract: The Protein Data Bank in Europe-Knowledge Base (PDBe-KB, https://pdbe-kb.org) is a community-driven, collaborative resource for literature-derived, manually curated and computationally predicted structural and functional annotations of macromolecular structure data, contained in the Protein Data Bank (PDB). The goal of PDBe-KB is two-fold: (i) to increase the visibility and reduce the fragmentation of annotations contributed by specialist data resources, and to make these data more findable, accessible, interoperable and reusable (FAIR) and (ii) to place macromolecular structure data in their biological context, thus facilitating their use by the broader scientific community in fundamental and applied research. Here, we describe the guidelines of this collaborative effort, the current status of contributed data, and the PDBe-KB infrastructure, which includes the data exchange format, the deposition system for added value annotations, the distributable database containing the assembled data, and programmatic access endpoints. We also describe a series of novel web pages, the PDBe-KB aggregated views of structure data, which combine information on macromolecular structures from many PDB entries. We have recently released the first set of pages in this series, which provide an overview of available structural and functional information for a protein of interest, referenced by a UniProtKB accession.

79 citations

Journal Article
TL;DR: The new, greater focus on proteins that are in some way normally unstructured promises to provide a greater understanding of protein function, particularly with respect to protein–protein interactions.

79 citations

Journal Article
16 May 2019 - PLOS ONE
TL;DR: It is shown that OSCP1, which has previously been implicated in two distinct non-ciliary processes, causes ciliogenic and ciliopathy-associated tissue phenotypes when depleted in zebrafish.
Abstract: The cilium is an essential organelle at the surface of mammalian cells whose dysfunction causes a wide range of genetic diseases collectively called ciliopathies. The current rate at which new ciliopathy genes are identified suggests that many ciliary components remain undiscovered. We generated and rigorously analyzed genomic, proteomic, transcriptomic and evolutionary data and systematically integrated these using Bayesian statistics into a predictive score for ciliary function. This resulted in 285 candidate ciliary genes. We generated independent experimental evidence of ciliary associations for 24 out of 36 analyzed candidate proteins using multiple cell and animal model systems (mouse, zebrafish and nematode) and techniques. For example, we show that OSCP1, which has previously been implicated in two distinct non-ciliary processes, causes ciliogenic and ciliopathy-associated tissue phenotypes when depleted in zebrafish. The candidate list forms the basis of CiliaCarta, a comprehensive ciliary compendium covering 956 genes. The resource can be used to objectively prioritize candidate genes in whole exome or genome sequencing of ciliopathy patients and can be accessed at http://bioinformatics.bio.uu.nl/john/syscilia/ciliacarta/.

77 citations


Cited by
Journal Article
TL;DR: A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original.
Abstract: The BLAST programs are widely used tools for searching protein and DNA databases for sequence similarities. For protein comparisons, a variety of definitional, algorithmic and statistical refinements described here permits the execution time of the BLAST programs to be decreased substantially while enhancing their sensitivity to weak similarities. A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original. In addition, a method is introduced for automatically combining statistically significant alignments produced by BLAST into a position-specific score matrix, and searching the database using this matrix. The resulting Position-Specific Iterated BLAST (PSI-BLAST) program runs at approximately the same speed per iteration as gapped BLAST, but in many cases is much more sensitive to weak but biologically relevant sequence similarities. PSI-BLAST is used to uncover several new and interesting members of the BRCT superfamily.

70,111 citations

Journal Article
TL;DR: The sensitivity of the commonly used progressive multiple sequence alignment method has been greatly improved and modifications are incorporated into a new program, CLUSTAL W, which is freely available.
Abstract: The sensitivity of the commonly used progressive multiple sequence alignment method has been greatly improved for the alignment of divergent protein sequences. Firstly, individual weights are assigned to each sequence in a partial alignment in order to down-weight near-duplicate sequences and up-weight the most divergent ones. Secondly, amino acid substitution matrices are varied at different alignment stages according to the divergence of the sequences to be aligned. Thirdly, residue-specific gap penalties and locally reduced gap penalties in hydrophilic regions encourage new gaps in potential loop regions rather than regular secondary structure. Fourthly, positions in early alignments where gaps have been opened receive locally reduced gap penalties to encourage the opening up of new gaps at these positions. These modifications are incorporated into a new program, CLUSTAL W, which is freely available.

63,427 citations

Journal Article
TL;DR: CLUSTAL X is a new windows interface for the widely-used progressive multiple sequence alignment program CLUSTAL W, providing an integrated system for performing multiple sequence and profile alignments and analysing the results.
Abstract: CLUSTAL X is a new windows interface for the widely-used progressive multiple sequence alignment program CLUSTAL W. The new system is easy to use, providing an integrated system for performing multiple sequence and profile alignments and analysing the results. CLUSTAL X displays the sequence alignment in a window on the screen. A versatile sequence colouring scheme allows the user to highlight conserved features in the alignment. Pull-down menus provide all the options required for traditional multiple sequence and profile alignment. New features include: the ability to cut-and-paste sequences to change the order of the alignment, selection of a subset of the sequences to be realigned, and selection of a sub-range of the alignment to be realigned and inserted back into the original alignment. Alignment quality analysis can be performed and low-scoring segments or exceptional residues can be highlighted. Quality analysis and realignment of selected residue ranges provide the user with a powerful tool to improve and refine difficult alignments and to trap errors in input sequences. CLUSTAL X has been compiled on SUN Solaris, IRIX5.3 on Silicon Graphics, Digital UNIX on DECstations, Microsoft Windows (32 bit) for PCs, Linux ELF for x86 PCs, and Macintosh PowerMac.

38,522 citations

Journal Article
TL;DR: MUSCLE is a new computer program for creating multiple alignments of protein sequences that includes fast distance estimation using kmer counting, progressive alignment using a new profile function the authors call the log-expectation score, and refinement using tree-dependent restricted partitioning.
Abstract: We describe MUSCLE, a new computer program for creating multiple alignments of protein sequences. Elements of the algorithm include fast distance estimation using kmer counting, progressive alignment using a new profile function we call the log-expectation score, and refinement using tree-dependent restricted partitioning. The speed and accuracy of MUSCLE are compared with T-Coffee, MAFFT and CLUSTALW on four test sets of reference alignments: BAliBASE, SABmark, SMART and a new benchmark, PREFAB. MUSCLE achieves the highest, or joint highest, rank in accuracy on each of these sets. Without refinement, MUSCLE achieves average accuracy statistically indistinguishable from T-Coffee and MAFFT, and is the fastest of the tested methods for large numbers of sequences, aligning 5000 sequences of average length 350 in 7 min on a current desktop computer. The MUSCLE program, source code and PREFAB test data are freely available at http://www.drive5.com/muscle.

37,524 citations

Journal Article
TL;DR: Two unusual extensions are presented: Multiscale, which adds the ability to visualize large‐scale molecular assemblies such as viral coats, and Collaboratory, which allows researchers to share a Chimera session interactively despite being at separate locales.
Abstract: The design, implementation, and capabilities of an extensible visualization system, UCSF Chimera, are discussed. Chimera is segmented into a core that provides basic services and visualization, and extensions that provide most higher level functionality. This architecture ensures that the extension mechanism satisfies the demands of outside developers who wish to incorporate new features. Two unusual extensions are presented: Multiscale, which adds the ability to visualize large-scale molecular assemblies such as viral coats, and Collaboratory, which allows researchers to share a Chimera session interactively despite being at separate locales. Other extensions include Multalign Viewer, for showing multiple sequence alignments and associated structures; ViewDock, for screening docked ligand orientations; Movie, for replaying molecular dynamics trajectories; and Volume Viewer, for display and analysis of volumetric data. A discussion of the usage of Chimera in real-world situations is given, along with anticipated future directions. Chimera includes full user documentation, is free to academic and nonprofit users, and is available for Microsoft Windows, Linux, Apple Mac OS X, SGI IRIX, and HP Tru64 Unix from http://www.cgl.ucsf.edu/chimera/.

35,698 citations