Protein sequence-to-structure learning: Is this the end(-to-end revolution)?
Citations
Critical assessment of methods of protein structure prediction (CASP)-Round XIV.
On the Potential of Machine Learning to Examine the Relationship Between Sequence, Structure, Dynamics and Function of Intrinsically Disordered Proteins
Protein Design: From the Aspect of Water Solubility and Stability
Protein Design with Deep Learning.
The Transporter-Mediated Cellular Uptake and Efflux of Pharmaceutical Drugs and Biotechnology Products: How and Why Phospholipid Bilayer Transport Is Negligible in Real Biomembranes
References
Highly accurate protein structure prediction with AlphaFold
Sparse inverse covariance estimation with the graphical lasso
A fast algorithm for particle simulations
The energy landscapes and motions of proteins.
Generalized neural-network representation of high-dimensional potential-energy surfaces.
Frequently Asked Questions (15)
Q2. What are the recent efforts to learn a local quality metric?
Recent efforts include the use of spherical convolutions in combination with a residue-level coordinate system to learn a local quality metric [107], and the development of invariant volumetric [101] and equivariant point-cloud representations in 3D [110, 111].
Q3. What is the main reason for the popularity of equivariant architectures?
The authors believe that equivariant architectures for learning from macromolecular structure will grow further in popularity due to their parameter-efficient expressive power and their ability to reason directly about, and also predict, geometric quantities such as vectors.
Q4. How does the method update the frames?
The method updates these frames indirectly by applying an attention mechanism to "3D points" generated from the query sequence embedding.
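As a rough illustration of attention over frame-generated points, the sketch below places one query point and one key point per residue in each residue's local frame, maps them to global coordinates, and derives attention weights from squared distances. It covers only the attention weights, not the frame update itself. The function names, the single point per residue, and the fixed local points (which a real model would produce from the sequence embedding with learned layers) are all assumptions made for illustration, not the method's actual implementation.

```python
import math

def rot_z(angle):
    """3x3 rotation matrix about the z-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matvec(m, v):
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]

def to_global(frame, local_point):
    """Map a point from a residue's local frame (rotation, translation) to global space."""
    rotation, translation = frame
    return [x + t for x, t in zip(matvec(rotation, local_point), translation)]

def point_attention(frames, query_points, key_points):
    """Row i holds the softmax weights of residue i over all residues j."""
    q = [to_global(f, p) for f, p in zip(frames, query_points)]
    k = [to_global(f, p) for f, p in zip(frames, key_points)]
    weights = []
    for qi in q:
        # Closer global points receive larger logits.
        logits = [-math.dist(qi, kj) ** 2 for kj in k]
        top = max(logits)  # subtract the maximum for numerical stability
        exps = [math.exp(value - top) for value in logits]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

def transform_frames(frames, rotation, shift):
    """Apply one global rigid motion to every frame."""
    return [(matmul(rotation, r),
             [x + s for x, s in zip(matvec(rotation, t), shift)])
            for r, t in frames]

# Hypothetical frames and local points for three residues.
frames = [(rot_z(0.0), [0.0, 0.0, 0.0]),
          (rot_z(0.7), [3.8, 0.0, 0.0]),
          (rot_z(1.9), [5.0, 3.5, 1.0])]
query_points = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.5, 0.5, 0.5]]
key_points = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]

weights = point_attention(frames, query_points, key_points)
moved = transform_frames(frames, rot_z(2.3), [10.0, -4.0, 2.5])
weights_moved = point_attention(moved, query_points, key_points)
```

Because the weights depend only on distances between global points, `weights` and `weights_moved` agree up to floating-point error: rotating and translating the whole structure leaves the attention pattern unchanged.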
Q5. What is the popular method for analyzing protein transfer learning?
A2I2Prot and CUTSP leveraged the TAPE initiative [98], which provides data, tasks and benchmarks to facilitate the evaluation of protein transfer learning.
Q6. How many links are necessary to infer a triangle?
When the nodes represent residues and the attention weights can be interpreted in terms of 3D distances or contacts, only 2 links are necessary to infer a triangle (in blue).
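One way to read this statement is through the triangle inequality: if the links i-k and k-j carry distances, the third distance i-j is already bounded. The sketch below is an illustration under that reading; the function names and the 8 Å contact cutoff are assumptions chosen for the example, not values taken from the paper.

```python
def third_side_bounds(d_ik, d_kj):
    """Triangle-inequality bounds on d_ij given the two other side lengths."""
    return abs(d_ik - d_kj), d_ik + d_kj

def may_be_in_contact(d_ik, d_kj, cutoff=8.0):
    """Say what two known links imply about a contact between i and j.

    `cutoff` is an assumed contact threshold in angstroms.
    """
    lower, upper = third_side_bounds(d_ik, d_kj)
    if upper <= cutoff:
        return "yes"            # even the longest possible i-j distance is a contact
    if lower > cutoff:
        return "no"             # even the shortest possible i-j distance is too long
    return "undetermined"       # the two links alone do not settle it

close_pair = may_be_in_contact(3.8, 3.8)
far_pair = may_be_in_contact(20.0, 5.0)
open_case = may_be_in_contact(6.0, 6.0)
```

Two short links force the third pair into contact, one long and one short link rule it out, and intermediate cases stay open until more information is available.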
Q7. What is the role of spherical harmonics in the classical fast multipole method?
Spherical harmonics have played a prominent role in molecular surface representations for several decades [144, 145] and are also at the heart of the classical fast multipole method [146].
Q8. What is the role of QA blocks in a sequence-to-structure prediction process?
QA blocks may be used as an integral part of a sequence-to-structure prediction process, as is the case in DMPfold2 [49] and AlphaFold2 [31].
Q9. What was the goal of the first attempt to train 3D CNNs on a volumetric protein representation?
The first attempt to train 3D CNNs on a volumetric protein representation dates back to CASP12, with the goal of assessing protein model quality [100].
Q10. What is the advantage of deep learning methods compared with traditional machine learning approaches?
One of the advantages of deep learning methods compared with traditional machine learning approaches is [...]
[Figure labels: EMBER-NLP, HMS-Casper-NLP, AlphaFold2, DMPfold2-New, DMPfold2, CUTSP, rawMSA, MSATransformer, DeepPotential, CopulaNet, Pharmulator, TOWER, Kiharalab_Contact, HMS-Casper, ProSPr]
Q11. What is the importance of relative orientation in the cat cartoons?
The importance of relative orientation is also apparent in the cat cartoons: rotating the mouth motif by 180° with respect to the nose turns the happy cat into a sad one.
Q12. What is the way to account for long-range dependencies?
This accounting for long-range dependencies comes at the expense of precision, since it occurs only after a certain depth in the network.
Q13. What is the strategy of state-of-the-art methods to leverage the highly degenerate nature of the sequence-structure relationship?
More commonly, the strategy of state-of-the-art methods is to leverage the highly degenerate nature of the sequence-structure relationship through the use of a multiple sequence alignment (MSA) of evolutionarily related sequences, or a pre-trained protein language model (see below).
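To make concrete what kind of signal an MSA carries, the sketch below computes the mutual information between two alignment columns, a simple classical measure of covariation. It is not what state-of-the-art methods do internally (they feed the alignment to a neural network), and the toy alignment and function names are hypothetical.

```python
import math
from collections import Counter

def column(msa, j):
    """Residues found at alignment position j, one per sequence."""
    return [seq[j] for seq in msa]

def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(msa)
    count_i = Counter(column(msa, i))
    count_j = Counter(column(msa, j))
    count_ij = Counter(zip(column(msa, i), column(msa, j)))
    mi = 0.0
    for (a, b), n_ab in count_ij.items():
        p_ab = n_ab / n
        mi += p_ab * math.log2(p_ab / ((count_i[a] / n) * (count_j[b] / n)))
    return mi

# Toy alignment (hypothetical): columns 1 and 3 vary together, column 0 is conserved.
toy_msa = ["AKLE", "AKIE", "ARLD", "ARVD"]

mi_covarying = mutual_information(toy_msa, 1, 3)
mi_conserved = mutual_information(toy_msa, 0, 1)
mi_partial = mutual_information(toy_msa, 1, 2)
```

Columns that always change together score highest, a fully conserved column scores zero against any other, and partially coupled columns fall in between. With real alignments, raw mutual information is also affected by phylogeny and sampling bias, which this sketch ignores.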
Q14. When did these ideas start to show their full potential?
These ideas started to show their full potential about 10 years ago with the development of efficient methods dealing with large-scale multiple sequence alignments [6, 7, 8].
Q15. What is the role of a network in identifying structural motifs?
Given a protein structure, a network should further be able to identify structural motifs independent of the orientation and position in which they occur.
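A minimal sketch of what independence from orientation and position means in practice: a descriptor built from pairwise distances does not change when the motif is rotated and translated. The coordinates, the rotation about the z-axis, and the function names are hypothetical choices for illustration; learned networks use richer invariant or equivariant features.

```python
import math
from itertools import combinations

def rotate_z(point, angle):
    """Rotate a 3D point about the z-axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)

def translate(point, shift):
    """Shift a 3D point by the vector `shift`."""
    return tuple(p + s for p, s in zip(point, shift))

def distance_signature(points):
    """Sorted pairwise distances: unchanged by rotation and translation."""
    return sorted(math.dist(a, b) for a, b in combinations(points, 2))

# Hypothetical motif: four points standing in for residue positions.
motif = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (5.0, 3.6, 0.0), (2.0, 5.5, 1.5)]
moved = [translate(rotate_z(p, 1.1), (10.0, -4.0, 2.5)) for p in motif]

sig_a = distance_signature(motif)
sig_b = distance_signature(moved)
same = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(sig_a, sig_b))
```

Here `same` is true: the moved copy yields the same signature as the original. A distance-only signature is also unchanged by mirror reflection, so it cannot tell a motif from its mirror image.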