Author

Minkyung Baek

Bio: Minkyung Baek is an academic researcher from the University of Washington. The author has contributed to research in the topics Biology and Medicine, has an h-index of 12, and has co-authored 34 publications that have received 770 citations. Previous affiliations of Minkyung Baek include Seoul National University and the UPRRP College of Natural Sciences.

Papers
Journal Article
20 Aug 2021-Science
TL;DR: In this article, a three-track network is proposed to combine information at the one-dimensional (1D) sequence level, the 2D distance map level, and the 3D coordinate level.
Abstract: DeepMind presented notably accurate predictions at the recent 14th Critical Assessment of Structure Prediction (CASP14) conference. We explored network architectures that incorporate related ideas and obtained the best performance with a three-track network in which information at the one-dimensional (1D) sequence level, the 2D distance map level, and the 3D coordinate level is successively transformed and integrated. The three-track network produces structure predictions with accuracies approaching those of DeepMind in CASP14, enables the rapid solution of challenging x-ray crystallography and cryo-electron microscopy structure modeling problems, and provides insights into the functions of proteins of currently unknown structure. The network also enables rapid generation of accurate protein-protein complex models from sequence information alone, short-circuiting traditional approaches that require modeling of individual subunits followed by docking. We make the method available to the scientific community to speed biological research.

1,907 citations

Journal Article
11 Nov 2021-Science
TL;DR: Protein-protein interactions play critical roles in biology, but the structures of many eukaryotic protein complexes are unknown, and many interactions have likely not yet been identified.
Abstract: Protein-protein interactions play critical roles in biology, but the structures of many eukaryotic protein complexes are unknown, and there are likely many interactions not yet identified. We take ...

215 citations

Journal Article
Marc F. Lensink, Sameer Velankar, Andriy Kryshtafovych, Shen You Huang, Dina Schneidman-Duhovny, Andrej Sali, Joan Segura, Narcis Fernandez-Fuentes, Shruthi Viswanath, Ron Elber, Sergei Grudinin, Petr Popov, Emilie Neveu, Hasup Lee, Minkyung Baek, Sangwoo Park, Lim Heo, Gyu Rie Lee, Chaok Seok, Sanbo Qin, Huan-Xiang Zhou, David W. Ritchie, Bernard Maigret, Marie-Dominique Devignes, Anisah W. Ghoorah, Mieczyslaw Torchala, Raphael A. G. Chaleil, Paul A. Bates, Efrat Ben-Zeev, Miriam Eisenstein, Surendra S. Negi, Zhiping Weng, Thom Vreven, Brian G. Pierce, Tyler M. Borrman, Jinchao Yu, Françoise Ochsenbein, Raphael Guerois, Anna Vangone, João P. G. L. M. Rodrigues, Gydo C. P. van Zundert, Mehdi Nellen, Li C. Xue, Ezgi Karaca, Adrien S. J. Melquiond, Koen M. Visscher, Panagiotis L. Kastritis, Alexandre M. J. J. Bonvin, Xianjin Xu, Liming Qiu, Chengfei Yan, Jilong Li, Zhiwei Ma, Jianlin Cheng, Xiaoqin Zou, Yang Shen, Lenna X. Peterson, Hyung Rae Kim, Amit Roy, Xusi Han, Juan Esquivel-Rodríguez, Daisuke Kihara, Xiaofeng Yu, Neil J. Bruce, Jonathan C. Fuller, Rebecca C. Wade, Ivan Anishchenko, Petras J. Kundrotas, Ilya A. Vakser, Kenichiro Imai, Kazunori D. Yamada, Toshiyuki Oda, Tsukasa Nakamura, Kentaro Tomii, Chiara Pallara, Miguel Romero-Durana, Brian Jiménez-García, Iain H. Moal, Juan Fernández-Recio, Jong Young Joung, Jong Yun Kim, Keehyoung Joo, Jooyoung Lee, Dima Kozakov, Sandor Vajda, Scott E. Mottarella, David R. Hall, Dmitri Beglov, Artem B. Mamonov, Bing Xia, Tanggis Bohnuud, Carlos A. Del Carpio, Eichiro Ichiishi, Nicholas A. Marze, Daisuke Kuroda, Shourya S. Roy Burman, Jeffrey J. Gray, Edrisse Chermak, Luigi Cavallo, Romina Oliva, Andrey Tovchigrechko, Shoshana J. Wodak
01 Jun 2016-Proteins
TL;DR: Results show that predicting homodimer assemblies by homology modeling and docking calculations is quite successful for targets whose subunit interfaces are large enough to represent stable associations, and that docking procedures tend to perform better than standard homology modeling techniques.
Abstract: We present the results for CAPRI Round 30, the first joint CASP-CAPRI experiment, which brought together experts from the protein structure prediction and protein-protein docking communities. The Round comprised 25 targets from amongst those submitted for the CASP11 prediction experiment of 2014. The targets included mostly homodimers, a few homotetramers, and two heterodimers, and comprised protein chains that could readily be modeled using templates from the Protein Data Bank. On average 24 CAPRI groups and 7 CASP groups submitted docking predictions for each target, and 12 CAPRI groups per target participated in the CAPRI scoring experiment. In total more than 9500 models were assessed against the 3D structures of the corresponding target complexes. Results show that the prediction of homodimer assemblies by homology modeling techniques and docking calculations is quite successful for targets featuring large enough subunit interfaces to represent stable associations. Targets with ambiguous or inaccurate oligomeric state assignments, often featuring crystal contact-sized interfaces, represented a confounding factor. For those, a much poorer prediction performance was achieved, while nonetheless often providing helpful clues on the correct oligomeric state of the protein. The prediction performance was very poor for genuine tetrameric targets, where the inaccuracy of the homology-built subunit models and the smaller pair-wise interfaces severely limited the ability to derive the correct assembly mode. Our analysis also shows that docking procedures tend to perform better than standard homology modeling techniques and that highly accurate models of the protein components are not always required to identify their association modes with acceptable accuracy. Proteins 2016; 84(Suppl 1):323-348. © 2016 Wiley Periodicals, Inc.

139 citations

Journal Article
TL;DR: DeepAccNet uses 3D convolutions to evaluate local atomic environments, followed by 2D convolutions to provide their global contexts, and outperforms other methods that similarly predict the accuracy of protein structure models.
Abstract: We develop a deep learning framework (DeepAccNet) that estimates per-residue accuracy and residue-residue distance signed error in protein models and uses these predictions to guide Rosetta protein structure refinement. The network uses 3D convolutions to evaluate local atomic environments followed by 2D convolutions to provide their global contexts and outperforms other methods that similarly predict the accuracy of protein structure models. Overall accuracy predictions for X-ray and cryoEM structures in the PDB correlate with their resolution, and the network should be broadly useful for assessing the accuracy of both predicted structure models and experimentally determined structures and identifying specific regions likely to be in error. Incorporation of the accuracy predictions at multiple stages in the Rosetta refinement protocol considerably increased the accuracy of the resulting protein structure models, illustrating how deep learning can improve search for global energy minima of biomolecules. Here the authors present DeepAccNet, a deep learning framework that estimates per-residue accuracy and residue-residue distance signed error in protein models, which are used to guide Rosetta protein structure refinement. Benchmarking suggests an improvement of accuracy prediction and refinement compared to other related state of the art methods.

130 citations

Journal Article
21 Jul 2022-Science
TL;DR: Wang et al. describe two deep learning methods for designing proteins that contain prespecified functional sites, enabling the scaffolding of desired functional residues within a well-folded designed protein.
Abstract: The binding and catalytic functions of proteins are generally mediated by a small number of functional residues held in place by the overall protein structure. Here, we describe deep learning approaches for scaffolding such functional sites without needing to prespecify the fold or secondary structure of the scaffold. The first approach, “constrained hallucination,” optimizes sequences such that their predicted structures contain the desired functional site. The second approach, “inpainting,” starts from the functional site and fills in additional sequence and structure to create a viable protein scaffold in a single forward pass through a specifically trained RoseTTAFold network. We use these two methods to design candidate immunogens, receptor traps, metalloproteins, enzymes, and protein-binding proteins and validate the designs using a combination of in silico and experimental tests.
Editor's summary: Designing around function. Protein design has had success in finding sequences that fold into a desired conformation, but designing functional proteins remains challenging. Wang et al. describe two deep-learning methods to design proteins that contain prespecified functional sites. In the first, they found sequences predicted to fold into stable structures that contain the functional site. In the second, they retrained a structure prediction network to recover the sequence and full structure of a protein given only the functional site. The authors demonstrate their methods by designing proteins containing a variety of functional motifs. (VV)
One-sentence summary: Deep-learning methods enable the scaffolding of desired functional residues within a well-folded designed protein.

118 citations


Cited by
01 Jan 2011
TL;DR: The sheer volume and scope of genome-wide data pose a significant challenge to developing efficient, intuitive visualization tools that scale to very large data sets and flexibly integrate multiple data types, including clinical data.
Abstract: Rapid improvements in sequencing and array-based platforms are resulting in a flood of diverse genome-wide data, including data from exome and whole-genome sequencing, epigenetic surveys, expression profiling of coding and noncoding RNAs, single nucleotide polymorphism (SNP) and copy number profiling, and functional assays. Analysis of these large, diverse data sets holds the promise of a more comprehensive understanding of the genome and its relation to human disease. Experienced and knowledgeable human review is an essential component of this process, complementing computational approaches. This calls for efficient and intuitive visualization tools able to scale to very large data sets and to flexibly integrate multiple data types, including clinical data. However, the sheer volume and scope of data pose a significant challenge to the development of such tools.

2,187 citations

Journal Article
TL;DR: The AlphaFold Protein Structure Database (AlphaFold DB, https://alphafold.ebi.ac.uk) is an openly accessible, extensive database of high-accuracy protein-structure predictions.
Abstract: The AlphaFold Protein Structure Database (AlphaFold DB, https://alphafold.ebi.ac.uk) is an openly accessible, extensive database of high-accuracy protein-structure predictions. Powered by AlphaFold v2.0 of DeepMind, it has enabled an unprecedented expansion of the structural coverage of the known protein-sequence space. AlphaFold DB provides programmatic access to and interactive visualization of predicted atomic coordinates, per-residue and pairwise model-confidence estimates and predicted aligned errors. The initial release of AlphaFold DB contains over 360,000 predicted structures across 21 model-organism proteomes, which will soon be expanded to cover most of the (over 100 million) representative sequences from the UniRef90 data set.

2,008 citations

Journal Article
TL;DR: This protocol describes the use of the various options, the construction of auxiliary restraints files, the selection of the energy parameters, and the analysis of the results of the ClusPro server.
Abstract: The ClusPro server (https://cluspro.org) is a widely used tool for protein-protein docking. The server provides a simple home page for basic use, requiring only two files in Protein Data Bank (PDB) format. However, ClusPro also offers a number of advanced options to modify the search; these include the removal of unstructured protein regions, application of attraction or repulsion, accounting for pairwise distance restraints, construction of homo-multimers, consideration of small-angle X-ray scattering (SAXS) data, and location of heparin-binding sites. Six different energy functions can be used, depending on the type of protein. Docking with each energy parameter set results in ten models defined by centers of highly populated clusters of low-energy docked structures. This protocol describes the use of the various options, the construction of auxiliary restraints files, the selection of the energy parameters, and the analysis of the results. Although the server is heavily used, runs are generally completed in <4 h.

1,699 citations
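
The ClusPro abstract above says that each run reports ten models defined by the centers of highly populated clusters of low-energy docked structures. A minimal sketch of one way such a selection could work is greedy neighbor-count clustering over a pairwise distance matrix. This is an illustration under stated assumptions, not ClusPro's actual code: the function name `greedy_cluster_centers`, the `radius` parameter, and the toy one-dimensional "poses" are all hypothetical.

```python
def greedy_cluster_centers(dist, radius, max_clusters=10):
    """Return (center_index, cluster_size) pairs, most populated cluster first.

    dist: symmetric square matrix (list of lists) of pairwise distances,
          for example an RMSD between docked poses (assumed input).
    radius: two poses count as neighbors if their distance is <= radius.
    """
    remaining = set(range(len(dist)))
    centers = []
    while remaining and len(centers) < max_clusters:
        # Pick the pose with the most remaining neighbors; iterating in
        # sorted order makes ties resolve to the lowest index.
        best = max(
            sorted(remaining),
            key=lambda i: sum(1 for j in remaining if dist[i][j] <= radius),
        )
        # The cluster is the center plus every remaining pose within radius.
        members = {j for j in remaining if dist[best][j] <= radius}
        centers.append((best, len(members)))
        remaining -= members
    return centers


# Toy data: six "poses" placed on a line, with absolute difference as distance.
positions = [0.0, 1.0, 2.0, 10.0, 11.0, 30.0]
dist = [[abs(a - b) for b in positions] for a in positions]

print(greedy_cluster_centers(dist, radius=1.5))
# prints [(1, 3), (3, 2), (5, 1)]
```

In the toy run, pose 1 has the most neighbors and becomes the first center; its cluster is removed before the next center is chosen, so later clusters never reuse poses already assigned.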

Journal Article
TL;DR: ColabFold combines the fast homology search of MMseqs2 with AlphaFold2 or RoseTTAFold, achieving a 40-60-fold faster search and optimized model utilization for protein structure prediction.
Abstract: ColabFold offers accelerated prediction of protein structures and complexes by combining the fast homology search of MMseqs2 with AlphaFold2 or RoseTTAFold. ColabFold's 40-60-fold faster search and optimized model utilization enables prediction of close to 1,000 structures per day on a server with one graphics processing unit. Coupled with Google Colaboratory, ColabFold becomes a free and accessible platform for protein folding. ColabFold is open-source software available at https://github.com/sokrypton/ColabFold and its novel environmental databases are available at https://colabfold.mmseqs.com.

1,553 citations