Topic

De novo protein structure prediction

About: De novo protein structure prediction is a research topic. Over its lifetime, 109 publications have been published within this topic, receiving 27,246 citations.

Papers published on a yearly basis

Papers

Journal Article

Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features

Wolfgang Kabsch¹, Chris Sander¹•Institutions (1)

Max Planck Society¹

01 Dec 1983-Biopolymers

TL;DR: A set of simple and physically motivated criteria for secondary structure, programmed as a pattern‐recognition process of hydrogen‐bonded and geometrical features extracted from x‐ray coordinates, is developed.

Abstract: For a successful analysis of the relation between amino acid sequence and protein structure, an unambiguous and physically meaningful definition of secondary structure is essential. We have developed a set of simple and physically motivated criteria for secondary structure, programmed as a pattern-recognition process of hydrogen-bonded and geometrical features extracted from x-ray coordinates. Cooperative secondary structure is recognized as repeats of the elementary hydrogen-bonding patterns “turn” and “bridge.” Repeating turns are “helices,” repeating bridges are “ladders,” connected ladders are “sheets.” Geometric structure is defined in terms of the concepts torsion and curvature of differential geometry. Local chain “chirality” is the torsional handedness of four consecutive Cα positions and is positive for right-handed helices and negative for ideal twisted β-sheets. Curved pieces are defined as “bends.” Solvent “exposure” is given as the number of water molecules in possible contact with a residue. The end result is a compilation of the primary structure, including SS bonds, secondary structure, and solvent exposure of 62 different globular proteins. The presentation is in linear form: strip graphs for an overall view and strip tables for the details of each of 10,925 residues. The dictionary is also available in computer-readable form for protein structure prediction work.
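
The "chirality" described in the abstract above is the sign of a torsion angle over four consecutive Cα positions. The following is a minimal sketch of that calculation, not the published program: the function names, the idealized helix parameters (radius 2.3 Å, rise 1.5 Å, 100° per residue), and the choice of the standard signed-dihedral formula are all assumptions made for illustration.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees defined by four points."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    # atan2 form of the dihedral: y carries the sign (handedness).
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def helix_ca_trace(n, radius=2.3, rise=1.5, turn_deg=100.0):
    """Idealized right-handed helix of n points (illustrative parameters)."""
    points = []
    for k in range(n):
        t = math.radians(k * turn_deg)
        points.append((radius * math.cos(t), radius * math.sin(t), rise * k))
    return points

def chirality_sign(ca_quad):
    """Return '+' or '-' for the torsional handedness of four C-alpha atoms."""
    return "+" if dihedral(*ca_quad) > 0 else "-"
```

Under these assumptions, `chirality_sign(helix_ca_trace(4))` is positive, and mirroring the same four points through the xy-plane flips the sign, which is the handedness distinction the abstract relies on.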

14,077 citations

Book Chapter

Protein Structure Prediction Using Rosetta

Carol A. Rohl¹, Charlie E. M. Strauss², Kira M.S. Misura¹, David Baker¹•Institutions (2)

University of Washington¹, Los Alamos National Laboratory²

01 Jan 2004-Methods in Enzymology

TL;DR: This chapter elaborates protein structure prediction using Rosetta, where short fragments of known proteins are assembled by a Monte Carlo strategy to yield native-like protein conformations.

Abstract: This chapter elaborates protein structure prediction using Rosetta. Double-blind assessments of protein structure prediction methods have indicated that the Rosetta algorithm is perhaps the most successful current method for de novo protein structure prediction. In the Rosetta method, short fragments of known proteins are assembled by a Monte Carlo strategy to yield native-like protein conformations. Using only sequence information, successful Rosetta predictions yield models with typical accuracies of 3–6 Å Cα root mean square deviation (RMSD) from the experimentally determined structures for contiguous segments of 60 or more residues. For each structure prediction, many short simulations starting from different random seeds are carried out to generate an ensemble of decoy structures that have both favorable local interactions and protein-like global properties. This set is then clustered by structural similarity to identify the broadest free energy minima. The effectiveness of conformation modification operators for energy function optimization is also described in this chapter.
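
The fragment-assembly idea in the abstract above can be illustrated with a toy Metropolis Monte Carlo loop over backbone torsion angles. This is a sketch under stated assumptions, not Rosetta's implementation: `toy_energy` is a hypothetical stand-in for a real scoring function, the fragment length of 3 and the extended starting chain are arbitrary choices, and all function names are invented for this example.

```python
import math
import random

def metropolis_accept(delta_e, kt, rng):
    """Accept downhill moves always, uphill moves with probability exp(-dE/kT)."""
    if delta_e <= 0:
        return True
    return rng.random() < math.exp(-delta_e / kt)

def insert_fragment(torsions, start, fragment):
    """Copy the chain and overwrite (phi, psi) pairs from position start."""
    new = list(torsions)
    new[start:start + len(fragment)] = fragment
    return new

def toy_energy(torsions):
    """Hypothetical score: squared distance from helical torsions (-60, -45)."""
    return sum((phi + 60.0) ** 2 + (psi + 45.0) ** 2 for phi, psi in torsions) / 100.0

def assemble(n_res, fragments, energy, steps, kt, seed):
    """One short simulation; returns the lowest-energy conformation visited."""
    rng = random.Random(seed)
    frag_len = len(fragments[0])
    current = [(-150.0, 150.0)] * n_res  # extended starting chain (assumption)
    e_cur = energy(current)
    best, e_best = current, e_cur
    for _ in range(steps):
        start = rng.randrange(n_res - frag_len + 1)
        trial = insert_fragment(current, start, rng.choice(fragments))
        e_trial = energy(trial)
        if metropolis_accept(e_trial - e_cur, kt, rng):
            current, e_cur = trial, e_trial
            if e_cur < e_best:
                best, e_best = current, e_cur
    return best, e_best

def decoy_ensemble(n_res, fragments, energy, steps, kt, seeds):
    """Independent runs from different random seeds, one decoy per seed."""
    return [assemble(n_res, fragments, energy, steps, kt, s) for s in seeds]
```

Calling `decoy_ensemble` with several seeds mirrors the "many short simulations starting from different random seeds" step; the clustering of decoys by structural similarity is not shown here.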

1,677 citations

Journal Article

Direct-coupling analysis of residue coevolution captures native contacts across many protein families

Faruck Morcos¹, Andrea Pagnani², Bryan Lunt, Arianna Bertolino³, Debora S. Marks⁴, Chris Sander⁵, Riccardo Zecchina², José N. Onuchic, Terence Hwa, Martin Weigt•Institutions (5)

University of California, San Diego¹, Polytechnic University of Turin², Istituto di Scienza e Tecnologie dell'Informazione³, University of Paris⁴, Kettering University⁵

06 Dec 2011-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: The findings suggest that contacts predicted by DCA can be used as a reliable guide to facilitate computational predictions of alternative protein conformations, protein complex formation, and even the de novo prediction of protein domain structures, contingent on the existence of a large number of homologous sequences which are being rapidly made available due to advances in genome sequencing.

Abstract: The similarity in the three-dimensional structures of homologous proteins imposes strong constraints on their sequence variability. It has long been suggested that the resulting correlations among amino acid compositions at different sequence positions can be exploited to infer spatial contacts within the tertiary protein structure. Crucial to this inference is the ability to disentangle direct and indirect correlations, as accomplished by the recently introduced direct-coupling analysis (DCA). Here we develop a computationally efficient implementation of DCA, which allows us to evaluate the accuracy of contact prediction by DCA for a large number of protein domains, based purely on sequence information. DCA is shown to yield a large number of correctly predicted contacts, recapitulating the global structure of the contact map for the majority of the protein domains examined. Furthermore, our analysis captures clear signals beyond intradomain residue contacts, arising, e.g., from alternative protein conformations, ligand-mediated residue couplings, and interdomain interactions in protein oligomers. Our findings suggest that contacts predicted by DCA can be used as a reliable guide to facilitate computational predictions of alternative protein conformations, protein complex formation, and even the de novo prediction of protein domain structures, contingent on the existence of a large number of homologous sequences which are being rapidly made available due to advances in genome sequencing.
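
Evaluating "the accuracy of contact prediction", as the abstract above puts it, usually comes down to asking what fraction of the top-ranked residue pairs are true contacts. The sketch below shows one way to compute that; the function name, the data layout, and the `min_separation` filter (which excludes pairs close in sequence) are assumptions for illustration, not the paper's exact protocol.

```python
def top_contact_precision(scores, true_contacts, n_top, min_separation=5):
    """Fraction of the n_top highest-scoring pairs that are true contacts.

    scores: dict mapping (i, j) with i < j to a coupling score.
    true_contacts: set of (i, j) pairs observed in a reference structure.
    min_separation: hypothetical cutoff on j - i; nearby pairs are skipped.
    """
    candidates = [
        (score, pair)
        for pair, score in scores.items()
        if pair[1] - pair[0] >= min_separation
    ]
    # Rank by score, highest first, and keep the top n_top pairs.
    candidates.sort(key=lambda item: item[0], reverse=True)
    top_pairs = [pair for _, pair in candidates[:n_top]]
    if not top_pairs:
        return 0.0
    hits = sum(1 for pair in top_pairs if pair in true_contacts)
    return hits / len(top_pairs)
```

For example, with scores `{(0, 10): 0.9, (1, 2): 0.95, (3, 20): 0.8, (4, 15): 0.1}` and true contacts `{(0, 10), (4, 15)}`, the pair `(1, 2)` is filtered out by the separation cutoff and the precision of the top two remaining pairs is 0.5.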

1,319 citations

Journal Article

Protein 3D structure computed from evolutionary sequence variation.

Debora S. Marks¹, Lucy J. Colwell², Robert L. Sheridan³, Thomas A. Hopf¹, Andrea Pagnani, Riccardo Zecchina⁴, Chris Sander³•Institutions (4)

Harvard University¹, Laboratory of Molecular Biology², Memorial Sloan Kettering Cancer Center³, Polytechnic University of Turin⁴

07 Dec 2011-PLOS ONE

TL;DR: Surprisingly, it is found that the strength of these inferred couplings is an excellent predictor of residue-residue proximity in folded structures, and the top-scoring residue couplings are sufficiently accurate and well-distributed to define the 3D protein fold with remarkable accuracy.

Abstract: The evolutionary trajectory of a protein through sequence space is constrained by its function. Collections of sequence homologs record the outcomes of millions of evolutionary experiments in which the protein evolves according to these constraints. Deciphering the evolutionary record held in these sequences and exploiting it for predictive and engineering purposes presents a formidable challenge. The potential benefit of solving this challenge is amplified by the advent of inexpensive high-throughput genomic sequencing. In this paper we ask whether we can infer evolutionary constraints from a set of sequence homologs of a protein. The challenge is to distinguish true co-evolution couplings from the noisy set of observed correlations. We address this challenge using a maximum entropy model of the protein sequence, constrained by the statistics of the multiple sequence alignment, to infer residue pair couplings. Surprisingly, we find that the strength of these inferred couplings is an excellent predictor of residue-residue proximity in folded structures. Indeed, the top-scoring residue couplings are sufficiently accurate and well-distributed to define the 3D protein fold with remarkable accuracy. We quantify this observation by computing, from sequence alone, all-atom 3D structures of fifteen test proteins from different fold classes, ranging in size from 50 to 260 residues, including a G-protein coupled receptor. These blinded inferences are de novo, i.e., they do not use homology modeling or sequence-similar fragments from known structures. The co-evolution signals provide sufficient information to determine accurate 3D protein structure to 2.7–4.8 Å Cα-RMSD error relative to the observed structure, over at least two-thirds of the protein (method called EVfold, details at http://EVfold.org). This discovery provides insight into essential interactions constraining protein evolution and will facilitate a comprehensive survey of the universe of protein structures, new strategies in protein and drug design, and the identification of functional genetic variants in normal and disease genomes.
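
The "maximum entropy model of the protein sequence, constrained by the statistics of the multiple sequence alignment" can be written in the general pairwise form below. This is the generic shape of such a model rather than the paper's exact formulation: the symbols are assumptions chosen here, with \sigma_i the amino acid at alignment column i, L the alignment length, h_i single-site fields, e_{ij} pair couplings, Z the normalization constant, and f_i, f_{ij} the observed single-column and column-pair frequencies.

```latex
P(\sigma_1,\dots,\sigma_L)
  = \frac{1}{Z}\exp\Bigl(\sum_{i=1}^{L} h_i(\sigma_i)
    + \sum_{1 \le i < j \le L} e_{ij}(\sigma_i,\sigma_j)\Bigr),
\qquad
P_i(a) = f_i(a), \quad P_{ij}(a,b) = f_{ij}(a,b)
```

In this reading, the inferred couplings e_{ij} are the quantities whose strength the abstract reports as a predictor of residue-residue proximity.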

1,125 citations

Journal Article

Identification of direct residue contacts in protein-protein interaction by message passing.

Martin Weigt¹, Robert A. White, Hendrik Szurmant, James A. Hoch, Terence Hwa•Institutions (1)

University of California, San Diego¹

06 Jan 2009-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: This work develops a method that combines covariance analysis with global inference analysis; it successfully and robustly identifies residue pairs that are proximal in space without resorting to ad hoc tuning parameters, both for heterointeractions between sensor kinase and response regulator (RR) proteins and for homointeractions between RR proteins.

Abstract: Understanding the molecular determinants of specificity in protein–protein interaction is an outstanding challenge of postgenome biology. The availability of large protein databases generated from sequences of hundreds of bacterial genomes enables various statistical approaches to this problem. In this context covariance-based methods have been used to identify correlation between amino acid positions in interacting proteins. However, these methods have an important shortcoming, in that they cannot distinguish between directly and indirectly correlated residues. We developed a method that combines covariance analysis with global inference analysis, adopted from use in statistical physics. Applied to a set of >2,500 representatives of the bacterial two-component signal transduction system, the combination of covariance with global inference successfully and robustly identified residue pairs that are proximal in space without resorting to ad hoc tuning parameters, both for heterointeractions between sensor kinase (SK) and response regulator (RR) proteins and for homointeractions between RR proteins. The spectacular success of this approach illustrates the effectiveness of the global inference approach in identifying direct interaction based on sequence information alone. We expect this method to be applicable soon to interaction surfaces between proteins present in only 1 copy per genome as the number of sequenced genomes continues to expand. Use of this method could significantly increase the potential targets for therapeutic intervention, shed light on the mechanism of protein–protein interaction, and establish the foundation for the accurate prediction of interacting protein partners.
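
The covariance-based measures that the abstract above contrasts with global inference can be illustrated by the mutual information between two alignment columns. The sketch below is one common local measure, shown under assumptions: the function name and the toy alignment format (a list of equal-length strings) are invented for this example, and no sequence reweighting or pseudocounts are applied.

```python
import math
from collections import Counter

def mutual_information(alignment, i, j):
    """Mutual information (in nats) between columns i and j of an alignment.

    alignment: list of equal-length sequences (toy format, an assumption).
    """
    n = len(alignment)
    freq_i = Counter(seq[i] for seq in alignment)
    freq_j = Counter(seq[j] for seq in alignment)
    freq_ij = Counter((seq[i], seq[j]) for seq in alignment)
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        p_a = freq_i[a] / n
        p_b = freq_j[b] / n
        # Only observed pairs contribute, so p_ab is never zero here.
        mi += p_ab * math.log(p_ab / (p_a * p_b))
    return mi
```

Because the score for a column pair uses only those two columns, a pair linked through a third position scores high as well; this is the shortcoming the abstract describes as being unable to distinguish directly from indirectly correlated residues.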

998 citations

Metrics

Papers: 109
Citations: 30,254

No. of papers in the topic in previous years
Year	Papers
2021	9
2020	7
2019	1
2018	5
2017	7
2016	9
