
REVIEW
Protein sequence-to-structure learning: Is this the end(-to-end revolution)?

Elodie Laine¹ | Stephan Eismann² | Arne Elofsson³ | Sergei Grudinin⁴
(Equally contributing authors.)

¹ Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 75005 Paris, France
² Dept. of Computer Science and Applied Physics, Stanford University, Stanford, CA 94305, USA
³ Dept. of Biochemistry and Biophysics and Science for Life Laboratory, Stockholm University, Box 1031, 171 72 Solna, Sweden
⁴ Univ. Grenoble Alpes, CNRS, Grenoble INP, LJK, 38000 Grenoble, France

Correspondence: Sergei Grudinin, Univ. Grenoble Alpes, CNRS, Grenoble INP, LJK, 38000 Grenoble, France. Email: sergei.grudinin@univ-grenoble-alpes.fr

Funding information: EL was funded by the French national research agency grant ANR-17-CE12-0009. SE was supported by a Stanford Bio-X Bowes Fellowship. AE was funded by grants from the Swedish E-science Research Center, the Swedish National Infrastructure for Computing, and the Swedish Natural Science Research Council, No. VR-NT 2016-03798.
The potential of deep learning has been recognized in the protein structure prediction community for some time, and became indisputable after CASP13. In CASP14, deep learning has boosted the field to unanticipated levels reaching near-experimental accuracy. This success comes from advances transferred from other machine learning areas, as well as methods specifically designed to deal with protein sequences and structures, and their abstractions. Novel emerging approaches include (i) geometric learning, i.e. learning on representations such as graphs, 3D Voronoi tessellations, and point clouds; (ii) pre-trained protein language models leveraging attention; (iii) equivariant architectures preserving the symmetry of 3D space; (iv) use of large meta-genome databases; (v) combinations of protein representations; and (vi) finally, truly end-to-end architectures, i.e. differentiable models starting from a sequence and returning a 3D structure. Here, we provide an overview and our opinion of the novel deep learning approaches developed in the last two years and widely used in CASP14.
KEYWORDS: deep learning, protein structure prediction, CASP14, geometric learning, equivariance, end-to-end architectures, protein language models
1 | INTRODUCTION
In December 2020, the fourteenth edition of CASP
marked a major leap in protein three-dimensional (3D)
structure prediction. Indeed, deep learning-powered ap-
proaches have reached unprecedented levels of near-
experimental accuracy. This achievement has been
made possible thanks to the latest improvements in ge-
ometric learning and natural language processing (NLP)
techniques, and to the amounts of sequence and struc-
ture data accessible today. The fundamental basis for
the revolution in structure prediction comes from the
use of co-evolution. While traditional measures of co-variation
in natural sequences led to a few successes
[1, 2, 3], major improvements came from recasting the
problem as an inverse Potts model [4, 5]. These ideas
started to show their full potential about 10 years ago
with the development of efficient methods dealing with
large scale multiple sequence alignments [6, 7, 8]. They
enabled the modelling of 3D structures for large protein
families [9, 10, 11, 12, 13, 14].
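To make these co-variation measures concrete, here is a minimal Python sketch (ours, purely illustrative) that computes the mutual information between two columns of a toy MSA. Such raw covariation scores are the "traditional measures" referred to above; unlike the inverse Potts model, they do not disentangle direct couplings from indirect, transitive ones.

    import numpy as np

    # Toy MSA: rows are homologous sequences, columns are alignment positions.
    msa = np.array([list(s) for s in ["ACDKE", "ACEKD", "GCDRE", "GCERD"]])
    alphabet = sorted(set(msa.ravel()))

    def freqs(col):
        # Per-position amino acid frequencies (one column of a PSSM-like profile).
        return {a: np.mean(col == a) for a in alphabet}

    def mutual_information(i, j):
        # Covariation between columns i and j; a high value hints at a 3D
        # contact but conflates direct and indirect (transitive) couplings.
        fi, fj = freqs(msa[:, i]), freqs(msa[:, j])
        mi = 0.0
        for a in alphabet:
            for b in alphabet:
                pab = np.mean((msa[:, i] == a) & (msa[:, j] == b))
                if pab > 0:
                    mi += pab * np.log(pab / (fi[a] * fj[b]))
        return mi

    # Columns 2 and 4 co-vary perfectly (D <-> E swap): MI = log 2 = 0.69.
    print(round(mutual_information(2, 4), 2))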
Shifting from unsupervised statistical inference to su-
pervised deep learning further boosted the accuracy
of the predicted contacts, and extended the applicabil-
ity of this conceptual framework to families with fewer
sequences [15, 16] and to the prediction of residue-
residue distances [17, 18]. These advances have signif-
icantly increased the protein structure modelling cover-
age of genomes [19, 20, 21], and also of bacterial inter-
actomes [22, 23, 24]. Over the past years, the CASP
community has contributed to these efforts, with an in-
creasing number of teams developing and applying deep
learning approaches.
The emergence of novel deep learning techniques
has inspired a re-visit of the representations best suited
for biological objects (protein sequences and structures).
In particular, advances in the treatment of language
[25] and of 3D geometry [26, 27, 28, 29, 30] by deep
learning architectures have further benefited the field
of protein structure and function prediction. Expanding
on this progress, the DeepMind team demonstrated in
CASP14 that it is possible to produce extremely accu-
rate 3D models of proteins by learning end-to-end from
sequence alignments of related proteins [31]. This im-
plies being able to capture long-range dependencies be-
tween amino acid residues, to transform these depen-
dencies into structural constraints, and to preserve the
symmetry and properties of the 3D space when operat-
ing on protein structures.
This article is a follow-up to Kandathil et al. [32].
It aims to provide CASP participants and observers
with an overview of the recent developments in deep
learning applied to protein structure prediction, and
a comprehensive description of key concepts we
think have contributed to the formidable improvements
we have witnessed in the latest CASP edition. We
then discuss the implications of these improvements,
the next-to-solve problems, and speculate about the fu-
ture of structural (and computational) biology.
2 | END-TO-END LEARNING FOR PROTEIN STRUCTURE PREDICTION
FIGURE 1 Schematic representation of the inputs and outputs of deep learning-based methods in CASP14, excluding pipelines compiling several methods coming from different sources, and methods lacking a clear description. The blue and red lines indicate the input and output levels, respectively. Pretrained: sequence embeddings determined from NLP models pre-trained on huge amounts of sequence data. MSA: raw multiple sequence alignment. MSA-feat: MSA features (such as PSSMs, covariance and precision matrices). Contacts: contact or distance matrix. Geometry: geometrical features, typically including contacts/distances and torsion angles. Structure: 3D coordinates. QA: model quality. In the case of several inputs and/or outputs, we report those closest to the "end". BrainFold is highlighted with a star as it takes only the query sequence as input, without using pre-trained embeddings. This classification is based on available information from CASP abstracts and publications/preprints. See Supplementary Table S1 for more details.

One of the advantages of deep learning methods compared with traditional machine learning approaches is
the ability to automatically extract features from the in-
put data without the need to carefully handcraft them
(and potentially miss salient information). Assuming suf-
ficient training data is available, learned features are ex-
pected to better generalize to heterogeneous or novel
datasets. In addition, it is generally accepted that end-to-
end learning, where the network is trained to produce
the exact desired output and not some sort of heuris-
tic representation of it, is advantageous. Indeed, achiev-
ing a high accuracy on some intermediate result does
not guarantee high accuracy on the final output. For
instance, a learning algorithm may achieve a small loss
on dihedral angles, and yet computing atomic coordi-
nates from the predicted dihedral angles could lead to a
high reconstruction error [33]. Nevertheless, introduc-
ing well-chosen intermediate losses in a so-called "end-
to-end" architecture can help to produce better final
outputs [31]. These auxiliary intermediate losses pro-
vide some guarantee that the method is not only able
to produce an accurate final output (e.g. a protein 3D
structure) but also to accurately model some other prop-
erties of the object under study (e.g. secondary struc-
ture, stereo-chemical quality...), and a means to incorpo-
rate additional domain knowledge. While most protein
structure prediction methods take pre-computed fea-
tures as input and output a contact or distance map, pos-
sibly augmented with other geometrical features (Fig. 1,
see iPhord, ProSPr [34], Kiharalab_Contact [35], Phar-
mulator, DeepPotential, RaptorX [36], Galaxy, Triple-
tRes [37], A2I2Prot, DESTINI2 [38], DeepHelicon [39],
DeepHomo [40], ICOS, PrayogRealDistance [41, 42],
RBO-PSP-CP [43], DeepECA, ropius0 [44], tFOLD, plus
QUARK, Risoluto, Multicom [45] and those from the
Zhang lab), several efforts have been recently engaged
towards developing end-to-end architectures. Here, we
will shortly review these efforts and try to identify the
key components of what represents end-to-end learn-
ing in protein structure prediction (Table 1).
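The reconstruction-error problem mentioned above for dihedral angles is essentially a lever-arm effect, which a few lines of Python make concrete (a toy calculation of ours, not taken from [33]):

    import numpy as np

    # Idealized C-alpha trace: 200 residues on a straight line, 3.8 A apart.
    n = 200
    coords = np.zeros((n, 3))
    coords[:, 0] = 3.8 * np.arange(n)

    # Perturb a single internal rotation near the N-terminus by 2 degrees:
    # rotate everything beyond residue 5 about the z-axis through residue 5.
    t = np.deg2rad(2.0)
    R = np.array([[np.cos(t), -np.sin(t), 0.0],
                  [np.sin(t),  np.cos(t), 0.0],
                  [0.0,        0.0,       1.0]])
    perturbed = coords.copy()
    perturbed[5:] = (perturbed[5:] - coords[5]) @ R.T + coords[5]

    # The C-terminus is displaced by ~26 A despite the tiny angular error.
    print(np.linalg.norm(perturbed[-1] - coords[-1]))

A 2-degree error on one internal coordinate, negligible in angle space, displaces the far end of the chain by about 26 Å; this is why a small loss on dihedral angles does not guarantee a small error on the reconstructed structure.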
Ideally, the ultimate input would be the sequence
of the query protein. So far, only a couple of learning
methods have exploited this information alone and directly
to efficiently fold proteins de novo [46, 47].
They rely on differentiable [46] and neural [47] poten-
tials whose parameters are learnt from conformational
ensembles generated by Langevin dynamics simulations.
More commonly, the strategy of state-of-the-art meth-
ods is to leverage the highly degenerate nature of
the sequence-structure relationship through the use of
a multiple sequence alignment (MSA) of evolutionary-
related sequences, or a pre-trained protein language
model (see below). In this context, methods qualifying
for "end-to-X" learning should take as input raw (pos-
sibly aligned) sequence(s), as opposed to features de-
rived from them such as conservation levels (e.g. stored
in a Position-Specific Scoring Matrix or PSSM) or co-
evolution estimates (e.g. mutual information, direct pair-
wise couplings). One of the first examples of end-to-
X method was rawMSA [48], which leveraged embed-
ding techniques from the field of NLP, to map the amino
acid residues into a continuous space adaptively learned
based on the sequence context (Table 1). In DMP-
fold2 [49, 50], this idea was extended to MSAs of ar-
bitrary lengths by scanning individual columns in the
MSA with stacked Gated Recurrent Unit (GRU) layers.
CopulaNet [51] adopts a query-centered view by ex-
panding the input MSA to a set of query-homolog pair-
wise alignments prior to embedding it. In AlphaFold2
[31], the MSA embedding is obtained through several
rounds of self-attention (see below) applied to the MSA
rows and columns. Beyond computing MSA embed-
dings, rawMSA, CopulaNet and AlphaFold2 add an ex-
plicit step aimed at converting the information they con-
tain into residue-residue pairwise couplings through an
outer product operation on the embedding vectors. Re-
cently, a compromise end-to-X solution where the com-
putation of traditional hand-crafted features takes place
on the GPU and is tightly coupled to the network was im-
plemented in trRosetta [52], allowing gradients to be
backpropagated all the way to the input sequences [53].
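The outer-product step that turns per-residue embeddings into pairwise features is easy to sketch; the toy version below (dimensions arbitrary, loosely mirroring AlphaFold2's "outer product mean") averages over the sequences of the MSA:

    import numpy as np

    S, L, d = 16, 60, 8             # MSA depth, number of residues, embedding size
    emb = np.random.randn(S, L, d)  # toy per-sequence, per-residue embeddings

    # Outer products of the embedding vectors at positions i and j, averaged
    # over the MSA, give an L x L map of d x d pairwise features from which
    # residue-residue couplings can be learned.
    pair = np.einsum('sid,sje->ijde', emb, emb) / S
    print(pair.shape)               # (60, 60, 8, 8)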
At the other end of the spectrum, the ultimate output
is the 3D structure of the query protein. Thus, an "X-
to-end" deep learning architecture should directly pro-
duce 3D coordinates and not some intermediate repre-
sentation such as a contact map. M. AlQuraishi [54] was
among the first to develop such a method in 2019 (Table
1). The model takes as input a PSSM, without account-
ing for any co-evolutionary information, and outputs the
Cartesian coordinates of the protein. The torsion angles
are predicted and used to reconstruct the 3D structure.
Although novel, such an approach has so far not proven
to perform better than earlier methods in CASP. One
well-known problem is that internal coordinates are ex-
tremely sensitive to small deviations as the latter easily
propagate through the protein, generating large errors
in the reconstructed structure [33]. To overcome this
problem, it is possible to efficiently reconstruct Carte-
sian coordinates from a distance matrix by using multi-
dimensional scaling (MDS) or other optimization tech-
niques as in CUTSP [55], DMPfold2 [49], or E2E and
FALCON-geom methods of CASP14. In its classical for-
mulation, used by both DMPfold2 and E2E, MDS ex-
tracts exact 3D coordinates (provided that the distance
matrix is exact) through eigendecomposition of the
doubly centered squared-distance matrix. Nevertheless, one issue with us-
ing MDS as the final layer in the network is that the
output may be a mirror image (chiral version) of the pro-
tein. The most recent version of DMPfold2 (DMPfold2-
new in Table 1 [50]) attempted to resolve this issue by
adding an extra GRU layer. AlphaFold2 takes a differ-
ent route and elegantly solves the 3D reconstruction
and the mirror-image problems jointly by learning spatial
transformations of the local reference frames of each
of the protein residues. Computing the geometric loss
function in the local frames automatically distinguishes
the mirror images, as one of the local axes is a vector
product of the two others. Noticeably, even though X-
to-end approaches generate a 3D structure, the latter is
usually refined afterwards (for example through molec-
ular dynamics simulations). For instance, relaxation of
AlphaFold2’s output with a physical force field is neces-
sary to enforce peptide bond geometry [31].
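Classical MDS admits a compact implementation; the sketch below (ours, with random points standing in for protein atoms) recovers 3D coordinates from an exact distance matrix via eigendecomposition and shows why the mirror image cannot be told apart from distances alone:

    import numpy as np

    def mds_3d(D):
        # Recover 3D coordinates (up to rotation, translation and mirror
        # reflection) from an exact Euclidean distance matrix D.
        n = D.shape[0]
        J = np.eye(n) - np.ones((n, n)) / n   # centering operator
        G = -0.5 * J @ (D ** 2) @ J           # Gram matrix of centered coords
        w, V = np.linalg.eigh(G)
        top = np.argsort(w)[::-1][:3]         # three largest eigenvalues
        return V[:, top] * np.sqrt(np.maximum(w[top], 0.0))

    coords = np.random.randn(50, 3)
    D = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)
    rec = mds_3d(D)
    mirror = rec * np.array([-1.0, 1.0, 1.0])  # reflected (chiral) solution

    dist = lambda x: np.linalg.norm(x[:, None] - x[None, :], axis=-1)
    print(np.allclose(dist(rec), D), np.allclose(dist(mirror), D))  # True True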
Although the protein 3D structure appears as an obvi-
ous and legitimate target, one may wonder whether gen-
erating 3D coordinates confers any advantage, in terms
of problem solving and performance, compared to a per-
fect 2D contact map. First, as mentioned above, effi-
cient methods to use 2D information for generating 3D
models exist [56, 52]. Further, the most popular residue-
or even atom-level loss functions used in deep neural
networks (DNNs) do not depend on the superposition of
the predicted model to the ground-truth structure and
are evaluated using the comparison of distance maps.
The most illustrative example is the local distance dif-
ference test (lDDT) [57], which has been employed as a
target function in CASP14 by some of the best perform-
ers including AlphaFold2 [31] and Rosetta. The value of
this loss would not change if we swap the 3D and 2D
representations. Nevertheless, it is not clear whether
a perfect 2D map can be reached without using some
3D knowledge about the structure. Operating on 3D
representations allows calculating global or local quality
scores reflective of the structural accuracy in a way that
2D distance maps do not, as illustrated by the mirror-
image issue mentioned above. The DNN can then learn
to regress against these quality scores, and iteratively re-
fine a first rough 3D guess by predicting (local) deforma-
tions to arrive at a better structure. However, operating
in 3D poses specific challenges related to the preserva-
tion of symmetries, which we discuss in Section 5. So
far, the only successful example of indisputable improve-
ment of 3D structure representation over 2D maps is
given by AlphaFold2 [31]. Whether similar performance
can be achieved with 2D maps and whether 2D maps
are needed at all in the predictive process remain open
questions.
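To see why such losses are blind to the 2D/3D distinction, consider a simplified global score in the spirit of lDDT [57] (the real metric is computed per residue, excludes sequence-local pairs, and defines the inclusion shell on the reference structure, but the key property is the same):

    import numpy as np

    def lddt_like(d_ref, d_mod, cutoff=15.0, tolerances=(0.5, 1.0, 2.0, 4.0)):
        # Fraction of reference distances within the inclusion radius that the
        # model reproduces within each tolerance, averaged over the four
        # standard tolerances. No superposition is ever computed.
        mask = (d_ref < cutoff) & ~np.eye(d_ref.shape[0], dtype=bool)
        diff = np.abs(d_ref - d_mod)[mask]
        return np.mean([(diff < t).mean() for t in tolerances])

Since only the distance maps d_ref and d_mod enter the computation, the score is identical whether the prediction is delivered as 3D coordinates or as a 2D distance map, and a model and its mirror image score the same.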
Being able to produce 3D models resembling experi-
mental structures implies being able to tell apart "good"
from "bad" models. Hence, protein model quality assess-
ment (MQA or QA), now referred to in CASP as estima-
tion of model accuracy (EMA), has always been an im-
portant step in protein structure prediction pipelines. It
makes it possible, in principle, to choose the best models
(in the case of global QA) and/or to spot inaccuracies in the
proposed models for subsequent refinement (in the case of local QA).
In recent years, a large number of deep learning-based
approaches have been specifically designed for this task.
Classically, they take a 3D model as input and then as-
sess its quality in a stand-alone fashion (Fig. 1). Alter-
natively, some teams proposed integrative approaches.
For example, QDeep QA predictions [58] are based on
distance estimations from DMPfold [21]. In GalaxyRe-
fine2 [59], RefineD [60], and the Baker suite [61], QA is
incorporated into a model refinement pipeline. Finally,
QA blocks may be used as an integral part of a sequence-
to-structure prediction process, as is the case in DMP-
fold2 [49] and AlphaFold2 [31].
3 | THE IMPORTANCE OF DATA AND DATA REPRESENTATIONS
The success of deep-learning methods is heavily
grounded in the availability of large amounts of data,
and the development of suitable representations struc-
turing and expressing the information they contain. The
advent of high throughput sequencing technologies has
widened the gap between the number of known protein
sequences and known protein structures. Genomics
has become pre-eminent in terms of data scale, growing
exponentially [64, 65]. These huge amounts
of data offer unprecedented opportunities to develop
high-capacity models detecting co-variation patterns
and learning the "protein language".
3.1 | Leveraging (meta-)genomics
In the last few years, the accessible resources for unan-
notated sequences coming from metagenomics exper-
iments have multiplied. They include databases like
NCBI GenBank [66], Metaclust [67], BFD [68], MetaEuk
[69], EBI MGnify [70], and IMG/M [71]. Since CASP12,
several teams attempted to exploit this type of data,
mostly to increase the depth of the MSAs and obtain a
more accurate estimation of (co-)evolutionary features.
For example, RaptorX [36], methods from the Yang and
Baker teams [72, 73], Multicom [45], and GALAXY ex-
ploited metagenome data for contact prediction and
distance estimation between residue pairs in combina-
tion with residual convolutional neural networks (resC-
NNs). The HMS-Casper [54, 74], DMPfold2 [49] and
AlphaFold2 methods [31] exploited them directly to
predict 3D structures. Regarding QA, DeepPotential
from the Zhang lab and QDeep [58] leverage MSA pro-
files generated from metagenome databases. To
gather large amounts of sequences, coming from dif-
ferent sources, many teams relied on the DeepMSA al-
gorithm [75]. Most of the time, the sequences were
integrated altogether in a single MSA. However, some
methods proposed to combine several MSAs with dif-
ferent weights (e.g. Kihara’s lab) or to select a few of
them with high depth and/or variability (e.g. DeepPo-
tential). Notably, deep learning is not only used to
exploit sequence alignments, but also to generate them.
For instance, the SAdLSA algorithm improves the qual-
ity of low-sequence identity alignments by learning the
"protein folding code" from structural alignments [76].
NDThreader [77] and ProALIGN [78] are specifically de-
signed to optimally align the query with the template
in template-based modeling. Both methods exploit pre-
dicted or observed inter-residue distances to improve
the sequence alignments, a strategy that proved power-
ful already in CASP13 [72, 79, 80].
3.2 | From MSA to query-specific embeddings
The most traditional way to extract information from an
MSA is to compute a probabilistic profile or a PSSM re-
flecting the abundance of each amino acid at each po-
sition. This type of representation has been very popu-
lar from the very first CASP editions. Over the past 10
years, direct coupling analysis (DCA)-based models [12],
including Potts models and pseudolikelihood maximiza-
tion [8, 81, 82, 83], and graphical lasso-based (low-rank)
models [84, 85, 86] became widespread in the com-
munity. These statistical methods explicitly estimate
residue pairwise couplings as proxies for 3D contacts.
More recently, some meta-models [87, 88], correlation
and precision matrix-based approaches [89, 90, 52], and

TABLE 1 Overview of X-to-end and end-to-X deep learning approaches for protein structure prediction.

End-to-end learning
  AlphaFold2 [31]: The MSA, along with templates, is fed into a translation- and rotation-equivariant transformer architecture, which outputs a 3D structural model.
  DMPfold2 (new) [49, 50]: The MSA, along with the precision matrix, is fed into a GRU, which outputs a 3D structure.

End-to-X learning
  MSA Transformer [62]: Transformer architecture.
  rawMSA [48]: The MSA is fed into a 2D CNN (the first convolutional layer creates an embedding), which outputs a contact map.
  CopulaNet [51]: Extracts all sequence pairs from the MSA and feeds them to a dilated resCNN.
  TOWER: The network is trained with a deep dilated resCNN to predict inter-residue distances directly from the raw MSA.
  trRosetta [52]: Computes traditional MSA features on the fly and passes them to dilated convolutional layers.

X-to-end learning
  NOVA [63]: Adopts DeepFragLib from the same team, which uses Long Short-Term Memory units (LSTMs), to output a 3D structure.
  DMPfold2 [49]: The MSA, along with the precision matrix, is fed into a GRU, which outputs distances and angles (version used in CASP14).
  HMS-Casper [54]: Raw sequences plus PSSMs are given to a "Recurrent Geometrical Network" comprising LSTM and geometric units and outputting a 3D structure.
a variety of deep-learning models [91, 16, 92, 93,
21, 38, 73, 42, 41, 37, 45], including generative adver-
sarial networks for contact map generation and refine-
ment [94, 35], got widely used to capture the same type
of co-evolutionary information. One limitation of these
methods is that they estimate average properties over
an ensemble of sequences representative of a protein
family. Hence, they may miss information specifically
relevant to the protein query. The DeepMind team cir-
cumvented this limitation with AlphaFold2 by comput-
ing embeddings for residue-residue relationships within
the query and sequence-residue relationships between
the sequences in the MSA and the query, and making
the information flow between these two representa-
tions. Alternatively, one may transfer the knowledge ac-
quired on hundreds of millions of natural sequences to
generate query-specific embeddings (Table 2). Several
models developed for NLP, including BERT [95], ELMo
[96], and GPT-2 [97], have been adapted to the "protein
language". During the semi-supervised training phase,
the model attempts to predict a masked token or the next
token [98]. In CASP14, EMBER directly made use of ELMo
and BERT while HMS-Casper [54] used a reformulated
version of the latter, called AminoBert. A2I2Prot and
CUTSP leveraged the TAPE initiative [98], which pro-
vides data, tasks and benchmarks to facilitate the evalu-
ation of protein transfer learning.
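The masked-token objective itself is simple to sketch in PyTorch; the toy model below stands in for the much larger BERT-style protein language models discussed here (positional encodings and other essentials are omitted for brevity):

    import torch
    import torch.nn as nn

    AA = "ACDEFGHIKLMNPQRSTVWY"
    MASK = len(AA)                             # extra token id for [MASK]

    class TinyProtLM(nn.Module):
        # Illustrative stand-in for a BERT-like protein language model.
        def __init__(self, d=64):
            super().__init__()
            self.emb = nn.Embedding(len(AA) + 1, d)
            layer = nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers=2)
            self.head = nn.Linear(d, len(AA))

        def forward(self, tokens):
            return self.head(self.encoder(self.emb(tokens)))

    seq = torch.randint(0, len(AA), (1, 100))  # a toy "protein" of 100 residues
    masked = seq.clone()
    pos = torch.rand(seq.shape) < 0.15         # mask out 15% of the positions
    masked[pos] = MASK

    model = TinyProtLM()
    logits = model(masked)
    # Semi-supervised objective: recover the identity of the masked residues.
    loss = nn.functional.cross_entropy(logits[pos], seq[pos])

The per-residue activations of such a trained model, rather than its output predictions, are what serve as query-specific embeddings downstream.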
3.3 | Representations of protein structure
Sequence-based protein representation may be en-
riched with different levels of structural information, for
example, some prior knowledge about secondary struc-
ture (SS) elements. In principle, some of these elements,
such as alpha helices or beta strands, can be represented
with 3D primitives. An interesting idea that we saw in
CASP14 was the use of a discrete version of Frenet-
Serret frames for the protein backbone parametrization
by HMS-Casper. However, such a representation is very
complex, and a much simpler way would be to abstract
SS primitives with a hydrogen-bond (HB) 2D map. For
example, the ISSEC network was specifically trained to
segment SS elements in 2D contact maps [99]. Similarly,
the protein 3D topology may be abstracted as a 2D con-
tact map, or its probabilistic generalization, e.g. a matrix
filled with continuous probabilities or contact propen-
sities between protein atoms or residues. Beyond 2D
contact maps, richer descriptions of the 3D structures
can be achieved with 2D contact manifolds and protein
surfaces, 3D molecular graphs, point clouds, sets of ori-
ented local frames, volumetric 3D maps, or 3D tessel-
lations, e.g. through Voronoi diagrams (Table 2). These
different levels of protein representations and their ap-
plications in CASP are discussed in more detail below
and schematically shown in Fig. 2.
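As a minimal example of these abstractions, a hard contact map and a probabilistic contact-propensity map can both be derived from C-alpha coordinates in a few lines (the 8 Å cutoff and the sigmoid softening are common but arbitrary choices):

    import numpy as np

    def contact_map(ca, cutoff=8.0):
        # 2D abstraction of a 3D topology: residues i and j are "in contact"
        # if their C-alpha atoms lie within the cutoff.
        d = np.linalg.norm(ca[:, None] - ca[None, :], axis=-1)
        return d < cutoff

    def contact_propensity(ca, cutoff=8.0, slope=1.5):
        # Probabilistic generalization: replace the hard threshold with a
        # smooth propensity that decays with distance.
        d = np.linalg.norm(ca[:, None] - ca[None, :], axis=-1)
        return 1.0 / (1.0 + np.exp(slope * (d - cutoff)))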
3.3.1 | Volumetric protein representations
The first attempt to train 3D CNNs on a volumetric pro-
tein representation dates back to CASP12, with the goal
of assessing protein model quality [100]. The architec-
ture was robust but had two major limitations. Specifi-
cally, it relied on a predefined set of protein atom types, and
the orientation of the protein model given as input had
