AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models.
Mihaly Varadi,Stephen Anyango,Mandar Deshpande,Sreenath Nair,Cindy Natassia,Galabina Yordanova,David Yu Yuan,Oana Stroe,Gemma Wood,Agata Laydon,Augustin Žídek,Tim Green,Kathryn Tunyasuvunakool,Stig Petersen,John M. Jumper,Ellen Clancy,Richard E. Green,Ankur Vora,Mira Lutfi,Michael Figurnov,Andrew Cowie,Nicole Hobbs,Pushmeet Kohli,Gerard J. Kleywegt,Ewan Birney,Demis Hassabis,Sameer Velankar +26 more
TLDR
The AlphaFold Protein Structure Database (AlphaFold DB, https://alphafold.ebi.ac.uk) is an openly accessible, extensive database of high-accuracy protein-structure predictions.
Abstract:
The AlphaFold Protein Structure Database (AlphaFold DB, https://alphafold.ebi.ac.uk) is an openly accessible, extensive database of high-accuracy protein-structure predictions. Powered by AlphaFold v2.0 of DeepMind, it has enabled an unprecedented expansion of the structural coverage of the known protein-sequence space. AlphaFold DB provides programmatic access to and interactive visualization of predicted atomic coordinates, per-residue and pairwise model-confidence estimates and predicted aligned errors. The initial release of AlphaFold DB contains over 360,000 predicted structures across 21 model-organism proteomes, which will soon be expanded to cover most of the (over 100 million) representative sequences from the UniRef90 data set.
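As a rough illustration of what working with the per-residue confidence estimates can look like, the sketch below summarizes pLDDT scores from a model file that has already been downloaded. It assumes the file is in PDB format with the pLDDT value stored in the B-factor column, which is the AlphaFold convention but is not stated in the abstract. The confidence bands follow those shown on the AlphaFold DB site, with the boundary handling (>=) chosen here for simplicity. The function names are hypothetical, and fetching the file is deliberately left out.

```python
from collections import Counter

# Confidence bands as displayed on the AlphaFold DB site (an assumption here;
# the abstract does not list them). Boundaries use >= for simplicity.
BANDS = [(90.0, "very high"), (70.0, "confident"), (50.0, "low"), (0.0, "very low")]


def plddt_per_residue(pdb_text):
    """Return {(chain, residue_number): pLDDT} read from C-alpha ATOM records.

    Assumes fixed-column PDB format, with pLDDT stored in the B-factor
    field (columns 61-66), as AlphaFold model files do.
    """
    scores = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            chain = line[21]
            resnum = int(line[22:26])
            scores[(chain, resnum)] = float(line[60:66])
    return scores


def band(score):
    """Map a pLDDT score (0-100) to its confidence band label."""
    for threshold, label in BANDS:
        if score >= threshold:
            return label
    return "very low"


def summarize(pdb_text):
    """Count residues in each confidence band."""
    return Counter(band(s) for s in plddt_per_residue(pdb_text).values())
```

Calling summarize() on the text of a downloaded model gives a quick view of how much of a prediction falls into each band, which is often the first check before using a model downstream.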
Citations
Journal Article
Search and sequence analysis tools services from EMBL-EBI in 2022
Fábio Madeira,Matt Pearce,Adrian Tivey,Prasad Basutkar,Joon Seung Lee,Ossama Edbali,Nandana Madhusoodanan,A. Kolesnikov,Rodrigo Lopez +8 more
TL;DR: Recent improvements to the EBI Search and Job Dispatcher tools frameworks are described, along with updates made to accommodate the increasing data requirements during the COVID-19 pandemic.
Journal Article
OUP accepted manuscript
TL;DR: The EMBL-EBI search and sequence analysis tools frameworks provide integrated access to EMBL-EBI's data resources and core bioinformatics analytical tools, allowing users to interact through user-friendly web applications as well as via RESTful and SOAP-based APIs.
Journal Article
UniProt: the Universal Protein Knowledgebase in 2023
Alex Bateman,Maria Jesus Martin,Sandra Orchard,Michele Magrane,Shadab Ahmad,Emanuele Alpi,Emily H Bowler-Barnett,Ramona Britto,Hema Bye-a-Jee,Austra Cukura,P. Denny,Tunca Doğan,ThankGod Ebenezer,Jun Fan,Penelope Garmiri,Leonardo Jose da Costa Gonzales,Emma Hatton-Ellis,Abdulrahman Hussein,Alexandr Ignatchenko,Giuseppe Insana,Rizwan Ishtiaq,Vishal Joshi,Dushyanth Jyothi,Swaathi Kandasaamy,Antonia Lock,Aurelien Luciani,Marija Lugarić,Jie Luo,Y. Lussi,Alistair MacDougall,Fábio Madeira,Mahdi Mahmoudy,Alok Mishra,Katie Moulang,Andrew Nightingale,Sangya Pundir,Guoying Qi,Shri K. Raman Raj,Pedro Duarte da Silva Fonseca Gândara Raposo,Daniel Rice,Rabie Saidi,Rafael Santos,Elena Speretta,James Stephenson,Prabhat Totoo,Edward Turner,N. Tyagi,Preethi Vasudev,Kate Warner,Xavier Watkins,Rossana Zaru,Hermann Zellner,Alan Bridge,Lucila Aimo,Ghislaine Argoud-Puy,Andrea H. Auchincloss,Kristian B. Axelsen,Parit Bansal,Delphine Baratin,Teresa M Batista Neto,Marie-Claude Blatter,Jerven Bolleman,Emmanuel Boutet,Lionel Breuza,B. Gil,C. Casals-Casas,Kamal Chikh Echioukh,Elisabeth Coudert,Béatrice A. Cuche,Edouard de Castro,Anne Estreicher,Maria Livia Famiglietti,Marc Feuermann,Elisabeth Gasteiger,Pascale Gaudet,Sebastien Gehant,Vivienne Baillie Gerritsen,Arnaud Gos,Nadine M. Gruaz,Chantal Hulo,Nevila Hyka-Nouspikel,Florence Jungo,Arnaud Kerhornou,Philippe Le Mercier,Damien Lieberherr,Patrick Masson,Anne Morgat,Venkatesh Muthukrishnan,Salvo Paesano,Ivo Pedruzzi,Sandrine Pilbout,Lucille Pourcel,Sylvain Poux,Monica Pozzato,Manuela Pruess,Nicole Redaschi,Catherine Rivoire,Christian J. A. Sigrist,K Sonesson,Shyamala Sundaram,Cathy H. Wu,Cecilia N. Arighi,Leslie Arminski,Chuming Chen,Yongxing Chen,Hongzhan Huang,Kati Laiho,Peter B. McGarvey,Darren A. Natale,Karen F. Ross,C. R. Vinayaka,Qinghua Wang,Yuqi Wang,Jian Zhang +113 more
Journal Article
Dali server: structural unification of protein families
TL;DR: Two most recent upgrades to the Dali server for 3D protein structure comparison are reported: the foldomes of key organisms in the AlphaFold Database (version 1) are searchable by Dali, and structural alignments are annotated with protein families.
Journal Article
Evolutionary-scale prediction of atomic level protein structure with a language model
Zeming Lin,Halil Akin,Roshan Ara Rao,Brian Hie,Zhong-li Zhu,Wenting Lu,Nikita Smetanin,Robert Verkuil,Ori Kabeli,Yaniv Shmueli,Allan dos Santos Costa,Maryam Fazel-Zarandi,Tom Sercu,Salvatore Candido,Alexander Rives +14 more
TL;DR: The ESM Metagenomic Atlas is the first large-scale structural characterization of metagenomic proteins, with more than 617 million structures, including more than 225 million high-confidence predictions.
References
Journal Article
Highly accurate protein structure prediction with AlphaFold
John M. Jumper,Richard O. Evans,Alexander Pritzel,Tim Green,Michael Figurnov,Olaf Ronneberger,Kathryn Tunyasuvunakool,Russell Bates,Augustin Žídek,Anna Potapenko,Alex Bridgland,Clemens Meyer,Simon A. A. Kohl,Andrew J. Ballard,Andrew Cowie,Bernardino Romera-Paredes,Stanislav Nikolov,R. D. Jain,Jonas Adler,Trevor Back,Stig Petersen,David Reiman,Ellen Clancy,Michal Zielinski,Martin Steinegger,Michalina Pacholska,Tamas Berghammer,Sebastian Bodenstein,David L. Silver,Oriol Vinyals,Andrew W. Senior,Koray Kavukcuoglu,Pushmeet Kohli,Demis Hassabis +33 more
TL;DR: AlphaFold predicts protein structures with an accuracy competitive with experimental structures in the majority of cases using a novel deep learning architecture, including in cases where no homologous structure is available.
Journal Article
UniProt: the universal protein knowledgebase in 2021
Alex Bateman,Maria Jesus Martin,Sandra Orchard,Michele Magrane,Rahat Agivetova,Shadab Ahmad,Emanuele Alpi,Emily H Bowler-Barnett,Ramona Britto,Borisas Bursteinas,Hema Bye-A-Jee,Ray Coetzee,Austra Cukura,Alan Wilter Sousa da Silva,Paul Denny,Tunca Doğan,ThankGod Ebenezer,Jun Fan,Leyla Jael Garcia Castro,Penelope Garmiri,George Georghiou,Leonardo Gonzales,Emma Hatton-Ellis,Abdulrahman Hussein,Alexandr Ignatchenko,Giuseppe Insana,Rizwan Ishtiaq,Petteri Jokinen,Vishal Joshi,Dushyanth Jyothi,Antonia Lock,Rodrigo Lopez,Aurelien Luciani,Jie Luo,Yvonne Lussi,Alistair MacDougall,Fábio Madeira,Mahdi Mahmoudy,Manuela Menchi,Alok Mishra,Katie Moulang,Andrew Nightingale,Carla Susana Oliveira,Sangya Pundir,Guoying Qi,Shriya Raj,Daniel Rice,Milagros Rodriguez Lopez,Rabie Saidi,Joseph Sampson,Tony Sawford,Elena Speretta,Edward Turner,Nidhi Tyagi,Preethi Vasudev,Vladimir Volynkin,Kate Warner,Xavier Watkins,Rossana Zaru,Hermann Zellner,Alan Bridge,Sylvain Poux,Nicole Redaschi,Lucila Aimo,Ghislaine Argoud-Puy,Andrea H. Auchincloss,Kristian B. Axelsen,Parit Bansal,Delphine Baratin,Marie-Claude Blatter,Jerven Bolleman,Emmanuel Boutet,Lionel Breuza,Cristina Casals-Casas,Edouard de Castro,Kamal Chikh Echioukh,Elisabeth Coudert,Béatrice A. Cuche,M Doche,Dolnide Dornevil,Anne Estreicher,Maria Livia Famiglietti,Marc Feuermann,Elisabeth Gasteiger,Sebastien Gehant,Vivienne Baillie Gerritsen,Arnaud Gos,Nadine Gruaz-Gumowski,Ursula Hinz,Chantal Hulo,Nevila Hyka-Nouspikel,Florence Jungo,Guillaume Keller,Arnaud Kerhornou,Vicente Lara,Philippe Le Mercier,Damien Lieberherr,Thierry Lombardot,Xavier D. Martin,Patrick Masson,Anne Morgat,Teresa Batista Neto,Salvo Paesano,Ivo Pedruzzi,Sandrine Pilbout,Lucille Pourcel,Monica Pozzato,Manuela Pruess,Catherine Rivoire,Christian J. A. Sigrist,K Sonesson,Andre Stutz,Shyamala Sundaram,Michael Tognolli,Laure Verbregue,Cathy H. Wu,Cecilia N. Arighi,Leslie Arminski,Chuming Chen,Yongxing Chen,John S. Garavelli,Hongzhan Huang,Kati Laiho,Peter B. McGarvey,Darren A. Natale,Karen E. Ross,C. R. Vinayaka,Qinghua Wang,Yuqi Wang,Lai-Su L. Yeh,Jian Zhang,Patrick Ruch,Douglas Teodoro +132 more
TL;DR: The UniProtKB responded to the COVID-19 pandemic through expert curation of relevant entries that were rapidly made available to the research community through a dedicated portal and a credit-based publication submission interface was developed.
Journal Article
Pfam: The protein families database in 2021.
Jaina Mistry,Sara Chuguransky,Lowri Williams,Matloob Qureshi,Gustavo A. Salazar,Erik L. L. Sonnhammer,Silvio C. E. Tosatto,Lisanna Paladin,Shriya Raj,Lorna Richardson,Robert D. Finn,Alex Bateman +11 more
TL;DR: The Pfam database is a widely used resource for classifying protein sequences into families and domains. The reintroduced Pfam-B provides an automatically generated supplement to Pfam and contains 136 730 novel clusters of sequences that are not yet matched by a Pfam family.
Journal Article
Accurate prediction of protein structures and interactions using a three-track neural network
Minkyung Baek,Frank DiMaio,Ivan Anishchenko,Justas Dauparas,Sergey Ovchinnikov,Gyu Rie Lee,Jue Wang,Qian Cong,Lisa N. Kinch,R. Dustin Schaeffer,Claudia Millán,Hahnbeom Park,Carson Adams,Caleb R. Glassman,Andy DeGiovanni,Jose Henrique Pereira,Andria V. Rodrigues,Alberdina A. van Dijk,Ana C. Ebrecht,Diederik J. Opperman,Theo Sagmeister,Christoph Buhlheller,Tea Pavkov-Keller,Manoj K. Rathinaswamy,Udit Dalwadi,Calvin K. Yip,John E. Burke,K. Christopher Garcia,Nick V. Grishin,Paul D. Adams,Randy J. Read,David Baker +33 more
TL;DR: In this article, a three-track network is proposed to combine information at the one-dimensional (1D) sequence level, the 2D distance map level, and the 3D coordinate level.
Journal Article
Highly accurate protein structure prediction for the human proteome
Kathryn Tunyasuvunakool,Jonas Adler,Zachary Wu,Tim Green,Michal Zielinski,Augustin Žídek,Alex Bridgland,Andrew Cowie,Clemens Meyer,Agata Laydon,Sameer Velankar,Gerard J. Kleywegt,Alex Bateman,Richard Evans,Alexander Pritzel,Michael Figurnov,Olaf Ronneberger,Russell Bates,Simon A. A. Kohl,Anna Potapenko,Andrew J. Ballard,Bernardino Romera-Paredes,Stanislav Nikolov,R. D. Jain,Ellen Clancy,David Reiman,Stig Petersen,Andrew W. Senior,Koray Kavukcuoglu,Ewan Birney,Pushmeet Kohli,John M. Jumper,Demis Hassabis +32 more
TL;DR: The authors present a large-scale, high-accuracy AlphaFold2 structure-prediction dataset for the human proteome, which can be used to evaluate the structural properties of proteins.
Related Papers (5)
MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets
CDD: a Conserved Domain Database for protein classification
Aron Marchler-Bauer,John B. Anderson,Praveen F. Cherukuri,Carol DeWeese-Scott,Lewis Y. Geer,Marc Gwadz,Siqian He,David I. Hurwitz,John D. Jackson,Zhaoxi Ke,Christopher J. Lanczycki,Cynthia A. Liebert,Chunlei Liu,Fu Lu,Gabriele H. Marchler,Mikhail Mullokandov,Benjamin A. Shoemaker,Vahan Simonyan,James S. Song,Paul A. Thiessen,Roxanne A. Yamashita,Jodie J. Yin,Dachuan Zhang,Stephen H. Bryant +23 more