Posted Content

Kincore: a web resource for structural classification of protein kinases and their inhibitors

13 Feb 2021 - bioRxiv (Cold Spring Harbor Laboratory)
TL;DR: Kincore is a web resource that provides conformational assignments for protein kinase structures, based on the authors' earlier clustering, along with labels for the ligand types bound to each kinase chain in the PDB.
Abstract: Protein kinases exhibit significant structural diversity, primarily in the conformation of the activation loop and adjacent components of the active site. We previously clustered the activation loop conformations of all protein kinase structures in the Protein Data Bank (PDB) (Modi and Dunbrack, PNAS, 116:6818-6827, 2019) into 8 classes based on the location of the Phe side chain of the DFG motif at the N-terminus of the activation loop. The classes are determined with a distance metric that measures the difference in the dihedral angles that determine the placement of the Phe side chain (the φ,ψ of X, D, and F of the X-DFG motif and the χ1 of the Phe side chain). The nomenclature is based on the regions of the Ramachandran map occupied by the XDF residues and the χ1 rotamer of the Phe residue. All active structures are "BLAminus", while the most common inactive DFGin conformations are "BLBplus" and "ABAminus". Type 2 inhibitors bind almost exclusively to the DFGout "BBAminus" conformation. In this paper, we present Kincore (http://dunbrack.fccc.edu/kincore), a web resource providing access to the conformational assignments based on our clustering along with labels for ligand types (Type 1, Type 2, etc.) bound to each kinase chain in the PDB. The data are annotated with several properties, including PDB ID, UniProt ID, gene, protein name, phylogenetic group, spatial and dihedral labels for the orientation of the DFG motif residues, C-helix disposition, ligand name, and ligand type. The user can browse and query the database using these attributes individually, or perform an advanced search using a combination of them, such as a phylogenetic group together with a specific conformational label and ligand type. The user can also upload a structure and determine its spatial and dihedral labels using the web server, or use a freely available standalone program. The entire database can be downloaded as text files, and the structures as PyMOL sessions and in mmCIF format. We believe that Kincore will help in understanding the conformational dynamics of these proteins and guide the development of inhibitors targeting specific states.
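The abstract describes a distance metric over seven dihedral angles (the φ,ψ of X, D, and F, plus the χ1 of Phe) but does not give its formula. The sketch below is one plausible way to compare two such angle sets, assuming the per-angle term 2(1 − cos Δθ), a common choice for periodic quantities, averaged over the seven angles. The function names and the example angle values are hypothetical and are not taken from Kincore or its standalone program.

```python
import math


def angular_term(a_deg, b_deg):
    """Periodic difference between two angles given in degrees.

    Assumed form: 2 * (1 - cos(a - b)). It is 0 for identical angles,
    4 for angles 180 degrees apart, and unaffected by 360-degree wraps.
    """
    return 2.0 * (1.0 - math.cos(math.radians(a_deg - b_deg)))


def dihedral_distance(angles_a, angles_b):
    """Average the per-angle terms over two equal-length angle sets.

    For the X-DFG motif the sets would hold seven values:
    phi/psi of X, phi/psi of D, phi/psi of F, and chi1 of F.
    """
    if len(angles_a) != len(angles_b):
        raise ValueError("angle sets must have the same length")
    if not angles_a:
        raise ValueError("angle sets must not be empty")
    total = sum(angular_term(a, b) for a, b in zip(angles_a, angles_b))
    return total / len(angles_a)


# Hypothetical angle sets in degrees (illustrative values only).
structure_1 = [-130.0, 150.0, 60.0, 30.0, -90.0, 10.0, -60.0]
structure_2 = [-125.0, 155.0, 65.0, 25.0, -95.0, 15.0, 60.0]

print(round(dihedral_distance(structure_1, structure_2), 3))
```

Under this assumed form, a structure would be assigned the label of the cluster whose representative angle set lies closest to it. The actual cutoffs and cluster representatives are defined in the cited Modi and Dunbrack paper, not here.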