Author

Justas Dauparas

Other affiliations: University of Cambridge

Bio: Justas Dauparas is an academic researcher at the University of Washington. The author has contributed to research on the topics of biology and medicine, has an h-index of 8, and has co-authored 22 publications receiving 345 citations. Previous affiliations of Justas Dauparas include the University of Cambridge.

Topics: Biology, Medicine, Protein design, Fluid transport, Sequence (biology)

Papers

Journal Article•DOI•

Accurate prediction of protein structures and interactions using a three-track neural network

Minkyung Baek¹, Frank DiMaio¹, Ivan Anishchenko¹, Justas Dauparas¹, Sergey Ovchinnikov², Gyu Rie Lee¹, Jue Wang¹, Qian Cong³, Lisa N. Kinch³, R. Dustin Schaeffer³, Claudia Millán⁴, Hahnbeom Park¹, Carson Adams¹, Caleb R. Glassman⁵, Andy DeGiovanni⁶, Jose Henrique Pereira⁶, Andria V. Rodrigues⁶, Alberdina A. van Dijk⁷, Ana C. Ebrecht⁷, Diederik J. Opperman⁸, Theo Sagmeister⁹, Christoph Buhlheller⁹,¹⁰, Tea Pavkov-Keller⁹, Manoj K. Rathinaswamy¹¹, Udit Dalwadi¹², Calvin K. Yip¹², John E. Burke¹¹, K. Christopher Garcia, Nick V. Grishin³, Paul D. Adams⁶,¹³, Randy J. Read⁴, David Baker¹

University of Washington¹, Harvard University², University of Texas Southwestern Medical Center³, University of Cambridge⁴, Stanford University⁵, Lawrence Berkeley National Laboratory⁶, North-West University⁷, University of the Free State⁸, University of Graz⁹, Medical University of Graz¹⁰, University of Victoria¹¹, University of British Columbia¹², University of California, Berkeley¹³

20 Aug 2021-Science

TL;DR: In this article, a three-track network is proposed to combine information at the one-dimensional (1D) sequence level, the 2D distance map level, and the 3D coordinate level.

Abstract: DeepMind presented notably accurate predictions at the recent 14th Critical Assessment of Structure Prediction (CASP14) conference. We explored network architectures that incorporate related ideas and obtained the best performance with a three-track network in which information at the one-dimensional (1D) sequence level, the 2D distance map level, and the 3D coordinate level is successively transformed and integrated. The three-track network produces structure predictions with accuracies approaching those of DeepMind in CASP14, enables the rapid solution of challenging x-ray crystallography and cryo-electron microscopy structure modeling problems, and provides insights into the functions of proteins of currently unknown structure. The network also enables rapid generation of accurate protein-protein complex models from sequence information alone, short-circuiting traditional approaches that require modeling of individual subunits followed by docking. We make the method available to the scientific community to speed biological research.

1,907 citations

Journal Article•DOI•

Robust deep learning based protein sequence design using ProteinMPNN

Justas Dauparas, Ivan Anishchenko, N. Bennett, Hua Bai, Robert J. Ragotte, Lukas F. Milles, Basile I. M. Wicky, Alexis Courbet, R. de Haas, N. Bethel, P. J. Leung, Timothy Huddy, S.J. Pellock, Doug Tischer, F. Chan, Brian Koepnick, H. Nguyen, Alex Kang, Banumathi Sankaran, Aloke Kumar Bera, Neil P. King, David Baker

04 Jun 2022-Science

TL;DR: The broad utility and high accuracy of ProteinMPNN is demonstrated using X-ray crystallography, cryoEM and functional studies by rescuing previously failed designs, made using Rosetta or AlphaFold, of protein monomers, cyclic homo-oligomers, tetrahedral nanoparticles, and target binding proteins.

Abstract: While deep learning has revolutionized protein structure prediction, almost all experimentally characterized de novo protein designs have been generated using physically based approaches such as Rosetta. Here we describe a deep learning based protein sequence design method, ProteinMPNN, with outstanding performance in both in silico and experimental tests. The amino acid sequence at different positions can be coupled between single or multiple chains, enabling application to a wide range of current protein design challenges. On native protein backbones, ProteinMPNN has a sequence recovery of 52.4%, compared to 32.9% for Rosetta. Incorporation of noise during training improves sequence recovery on protein structure models, and produces sequences which more robustly encode their structures as assessed using structure prediction algorithms. We demonstrate the broad utility and high accuracy of ProteinMPNN using X-ray crystallography, cryoEM and functional studies by rescuing previously failed designs, made using Rosetta or AlphaFold, of protein monomers, cyclic homo-oligomers, tetrahedral nanoparticles, and target binding proteins. One-sentence summary: A deep learning based protein sequence design method is described that is widely applicable to current design challenges and shows outstanding performance in both in silico and experimental tests.

193 citations

Journal Article•DOI•

Improved protein structure refinement guided by deep learning based accuracy estimation.

Naozumi Hiranuma¹, Hahnbeom Park¹, Minkyung Baek¹, Ivan Anishchenko¹, Justas Dauparas¹, David Baker¹,²

University of Washington¹, Howard Hughes Medical Institute²

26 Feb 2021-Nature Communications

TL;DR: DeepAccNet uses 3D convolutions to evaluate local atomic environments, followed by 2D convolutions to provide their global contexts, and outperforms other methods that similarly predict the accuracy of protein structure models.

Abstract: We develop a deep learning framework (DeepAccNet) that estimates per-residue accuracy and residue-residue distance signed error in protein models and uses these predictions to guide Rosetta protein structure refinement. The network uses 3D convolutions to evaluate local atomic environments followed by 2D convolutions to provide their global contexts and outperforms other methods that similarly predict the accuracy of protein structure models. Overall accuracy predictions for X-ray and cryoEM structures in the PDB correlate with their resolution, and the network should be broadly useful for assessing the accuracy of both predicted structure models and experimentally determined structures and identifying specific regions likely to be in error. Incorporation of the accuracy predictions at multiple stages in the Rosetta refinement protocol considerably increased the accuracy of the resulting protein structure models, illustrating how deep learning can improve search for global energy minima of biomolecules. Here the authors present DeepAccNet, a deep learning framework that estimates per-residue accuracy and residue-residue distance signed error in protein models, which are used to guide Rosetta protein structure refinement. Benchmarking suggests an improvement of accuracy prediction and refinement compared to other related state of the art methods.

130 citations

Journal Article•DOI•

Scaffolding protein functional sites using deep learning

Jue Wang, Sidney Lisanza, David Juergens, Doug Tischer, Joseph L. Watson, Karla M Castro, Robert J. Ragotte, Amijai Saragovi, Lukas F. Milles, Minkyung Baek, Ivan Anishchenko, Wei Yang, Derrick R. Hicks, Marc Expòsit, Thomas Schlichthaerle, Jung Ho Chun, Justas Dauparas, N. Bennett, Basile I. M. Wicky, Andrew G. Muenks, Frank DiMaio, Bruno E. Correia, Sergey Ovchinnikov, David Baker

21 Jul 2022-Science

TL;DR: Wang et al. describe two deep learning methods for designing proteins that contain prespecified functional sites, enabling the scaffolding of desired functional residues within a well-folded designed protein.

Abstract: The binding and catalytic functions of proteins are generally mediated by a small number of functional residues held in place by the overall protein structure. Here, we describe deep learning approaches for scaffolding such functional sites without needing to prespecify the fold or secondary structure of the scaffold. The first approach, “constrained hallucination,” optimizes sequences such that their predicted structures contain the desired functional site. The second approach, “inpainting,” starts from the functional site and fills in additional sequence and structure to create a viable protein scaffold in a single forward pass through a specifically trained RoseTTAFold network. We use these two methods to design candidate immunogens, receptor traps, metalloproteins, enzymes, and protein-binding proteins and validate the designs using a combination of in silico and experimental tests. Description: Designing around function. Protein design has had success in finding sequences that fold into a desired conformation, but designing functional proteins remains challenging. Wang et al. describe two deep-learning methods to design proteins that contain prespecified functional sites. In the first, they found sequences predicted to fold into stable structures that contain the functional site. In the second, they retrained a structure prediction network to recover the sequence and full structure of a protein given only the functional site. The authors demonstrate their methods by designing proteins containing a variety of functional motifs (VV). Deep-learning methods enable the scaffolding of desired functional residues within a well-folded designed protein.

118 citations

Posted Content•DOI•

Improved protein structure refinement guided by deep learning based accuracy estimation

Naozumi Hiranuma¹, Hahnbeom Park¹, Minkyung Baek¹, Ivan Anishchanka¹, Justas Dauparas¹, David Baker¹,²

University of Washington¹, Howard Hughes Medical Institute²

04 Nov 2020-bioRxiv

TL;DR: A deep learning framework (DeepAccNet) estimates per-residue accuracy and residue-residue distance signed error in protein models and uses these predictions to guide Rosetta protein structure refinement, which considerably increased the accuracy of the resulting protein structure models, illustrating how deep learning can improve the search for global energy minima of biomolecules.

100 citations

Cited by

Journal Article•DOI•

AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models.

Mihaly Varadi¹, Stephen Anyango¹, Mandar Deshpande¹, Sreenath Nair¹, Cindy Natassia¹, Galabina Yordanova¹, David Yu Yuan¹, Oana Stroe¹, Gemma Wood¹, Agata Laydon, Augustin Žídek, Tim Green, Kathryn Tunyasuvunakool, Stig Petersen, John M. Jumper, Ellen Clancy, Richard E. Green, Ankur Vora, Mira Lutfi, Michael Figurnov, Andrew Cowie, Nicole Hobbs, Pushmeet Kohli, Gerard J. Kleywegt¹, Ewan Birney¹, Demis Hassabis, Sameer Velankar¹

European Bioinformatics Institute¹

17 Nov 2021-Nucleic Acids Research

TL;DR: The AlphaFold Protein Structure Database (AlphaFold DB, https://alphafold.ebi.ac.uk) is an openly accessible, extensive database of high-accuracy protein-structure predictions.

Abstract: The AlphaFold Protein Structure Database (AlphaFold DB, https://alphafold.ebi.ac.uk) is an openly accessible, extensive database of high-accuracy protein-structure predictions. Powered by AlphaFold v2.0 of DeepMind, it has enabled an unprecedented expansion of the structural coverage of the known protein-sequence space. AlphaFold DB provides programmatic access to and interactive visualization of predicted atomic coordinates, per-residue and pairwise model-confidence estimates and predicted aligned errors. The initial release of AlphaFold DB contains over 360,000 predicted structures across 21 model-organism proteomes, which will soon be expanded to cover most of the (over 100 million) representative sequences from the UniRef90 data set.

2,008 citations

Journal Article•DOI•

Accurate prediction of protein structures and interactions using a three-track neural network

Minkyung Baek¹, Frank DiMaio¹, Ivan Anishchenko¹, Justas Dauparas¹, Sergey Ovchinnikov², Gyu Rie Lee¹, Jue Wang¹, Qian Cong³, Lisa N. Kinch³, R. Dustin Schaeffer³, Claudia Millán⁴, Hahnbeom Park¹, Carson Adams¹, Caleb R. Glassman⁵, Andy DeGiovanni⁶, Jose Henrique Pereira⁶, Andria V. Rodrigues⁶, Alberdina A. van Dijk⁷, Ana C. Ebrecht⁷, Diederik J. Opperman⁸, Theo Sagmeister⁹, Christoph Buhlheller⁹,¹⁰, Tea Pavkov-Keller⁹, Manoj K. Rathinaswamy¹¹, Udit Dalwadi¹², Calvin K. Yip¹², John E. Burke¹¹, K. Christopher Garcia, Nick V. Grishin³, Paul D. Adams⁶,¹³, Randy J. Read⁴, David Baker¹

20 Aug 2021-Science

TL;DR: In this article, a three-track network is proposed to combine information at the one-dimensional (1D) sequence level, the 2D distance map level, and the 3D coordinate level.

1,907 citations

Journal Article•DOI•

ColabFold: making protein folding accessible to all

Milot Mirdita¹, Tatiana Valdez Bubnova², Oi Wah Liew³

Seoul National University¹, Harvard University², University of Göttingen³

30 May 2022-Nature Methods

TL;DR: ColabFold combines the fast homology search of MMseqs2 with AlphaFold2 or RoseTTAFold for protein folding, achieving a 40-60-fold faster search and optimized model utilization.

Abstract: ColabFold offers accelerated prediction of protein structures and complexes by combining the fast homology search of MMseqs2 with AlphaFold2 or RoseTTAFold. ColabFold's 40-60-fold faster search and optimized model utilization enables prediction of close to 1,000 structures per day on a server with one graphics processing unit. Coupled with Google Colaboratory, ColabFold becomes a free and accessible platform for protein folding. ColabFold is open-source software available at https://github.com/sokrypton/ColabFold and its novel environmental databases are available at https://colabfold.mmseqs.com.

1,553 citations

Journal Article•DOI•

Improved protein structure prediction using predicted interresidue orientations

Jianyi Yang¹, Ivan Anishchenko², Hahnbeom Park², Zhenling Peng³, Sergey Ovchinnikov⁴, David Baker²

Nankai University¹, University of Washington², Tianjin University³, Harvard University⁴

21 Jan 2020-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: A deep residual network for predicting interresidue orientations, in addition to distances, and a Rosetta-constrained energy-minimization protocol for rapidly and accurately generating structure models guided by these restraints are developed.

Abstract: The prediction of interresidue contacts and distances from coevolutionary data using deep learning has considerably advanced protein structure prediction. Here, we build on these advances by developing a deep residual network for predicting interresidue orientations, in addition to distances, and a Rosetta-constrained energy-minimization protocol for rapidly and accurately generating structure models guided by these restraints. In benchmark tests on 13th Community-Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP13)- and Continuous Automated Model Evaluation (CAMEO)-derived sets, the method outperforms all previously described structure-prediction methods. Although trained entirely on native proteins, the network consistently assigns higher probability to de novo-designed proteins, identifying the key fold-determining residues and providing an independent quantitative measure of the "ideality" of a protein structure. The method promises to be useful for a broad range of protein structure prediction and design problems.

1,026 citations

Posted Content•DOI•

Protein complex prediction with AlphaFold-Multimer

Richard Evans, Michael J. O'Neill, Alexander Pritzel, Natasha Antropova, Andrew W. Senior, Tim Green, Augustin Žídek, Russell Bates, Sam Blackwell, Jason Yim, Olaf Ronneberger, Sebastian Bodenstein, Michal Zielinski, Alex Bridgland, Anna Potapenko, Andrew Cowie, Kathryn Tunyasuvunakool, R. D. Jain, Ellen Clancy, Pushmeet Kohli, John M. Jumper, Demis Hassabis

04 Oct 2021-bioRxiv

TL;DR: In this article, an AlphaFold model trained specifically for multimeric inputs of known stoichiometry is proposed, which significantly increases the accuracy of predicted multimeric interfaces over input-adapted single-chain AlphaFold.

Abstract: While the vast majority of well-structured single protein chains can now be predicted to high accuracy due to the recent AlphaFold [1] model, the prediction of multi-chain protein complexes remains a challenge in many cases. In this work, we demonstrate that an AlphaFold model trained specifically for multimeric inputs of known stoichiometry, which we call AlphaFold-Multimer, significantly increases accuracy of predicted multimeric interfaces over input-adapted single-chain AlphaFold while maintaining high intra-chain accuracy. On a benchmark dataset of 17 heterodimer proteins without templates (introduced in [2]) we achieve at least medium accuracy (DockQ [3] ≥ 0.49) on 14 targets and high accuracy (DockQ ≥ 0.8) on 6 targets, compared to 9 targets of at least medium accuracy and 4 of high accuracy for the previous state of the art system (an AlphaFold-based system from [2]). We also predict structures for a large dataset of 4,433 recent protein complexes, from which we score all non-redundant interfaces with low template identity. For heteromeric interfaces we successfully predict the interface (DockQ ≥ 0.23) in 67% of cases, and produce high accuracy predictions (DockQ ≥ 0.8) in 23% of cases, an improvement of +25 and +11 percentage points over the flexible linker modification of AlphaFold [4] respectively. For homomeric interfaces we successfully predict the interface in 69% of cases, and produce high accuracy predictions in 34% of cases, an improvement of +5 percentage points in both instances.

1,023 citations
