
Fitness landscape analysis reveals that the wild type allele is sub-optimal and mutationally robust

TLDR
In this paper, the authors analyzed a previously measured fitness landscape of a yeast tRNA gene and found that the wild type allele is sub-optimal, but is mutationally robust ('flat').
Abstract
Fitness landscape mapping and the prediction of evolutionary trajectories on these landscapes are major tasks in evolutionary biology research. Evolutionary dynamics is tightly linked to the landscape topography, but this relation is not straightforward. Models predict different evolutionary outcomes depending on mutation rates: high-fitness genotypes should dominate the population under low mutation rates and lower-fitness, mutationally robust (also called 'flat') genotypes - at higher mutation rates. Yet, so far, flat genotypes have been demonstrated in very few cases, particularly in viruses. The quantitative conditions for their emergence were studied only in simplified single-locus, two-peak landscapes. In particular, it is unclear whether within the same genome some genes can be flat while the remaining ones are fit. Here, we analyze a previously measured fitness landscape of a yeast tRNA gene. We found that the wild type allele is sub-optimal, but is mutationally robust ('flat'). Using computer simulations, we estimated the critical mutation rate in which transition from fit to flat allele should occur for a gene with such characteristics. We then used a scaling argument to extrapolate this critical mutation rate for a full genome, assuming the same mutation rate for all genes. Finally, we propose that while the majority of genes are still selected to be fittest, there are a few mutation hot-spots like the tRNA, for which the mutationally robust flat allele is favored by selection.


Tzahi Gabzi (1), Yitzhak Pilpel (1) and Tamar Friedlander (2)
(1) Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 7610001, Israel
(2) The Robert H. Smith Institute of Plant Sciences and Genetics in Agriculture
Faculty of Agriculture, Hebrew University of Jerusalem,
P.O. Box 12 Rehovot 7610001, Israel
Correspondence: tamar.friedlander@mail.huji.ac.il, pilpel@weizmann.ac.il.
September 27, 2021
bioRxiv preprint, doi: https://doi.org/10.1101/2021.09.27.461914; this version posted September 27, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

Introduction

Fitness landscape mapping and prediction of evolutionary trajectories on these landscapes are major tasks in evolutionary biology [1]. While evolutionary theory predicts that population mean fitness should increase over time, it offers only few quantitative predictions for the dynamics of evolution and the possible evolutionary trajectories. The main hurdle for generally computing evolutionary trajectories is their dependence on the underlying fitness landscape. Currently available fitness landscapes include between 16 and 100,000 different genotypes (for review see [2, 3]). Yet, even the largest datasets [4, 5, 6, 7] encompass only small fractions of the entire fitness landscape of even a single gene. As detailed fitness measurements have been unavailable until recently, most of the associated theory was developed in isolation from data [8, 9, 10, 11, 12, 13, 14, 15, 16]. Additionally, the development of a general theory is difficult, because fitness landscapes are diverse and differ in details.

Evolutionary dynamics on empirical fitness landscapes was studied in cases in which genotype-phenotype mapping was available, such as folded RNA molecules [17, 18, 19] and transcription-factor binding sites [20, 21, 22] or in computational fitness landscape models closely inspired by particular experimental systems, such as maturation of the immune response [13, 23] and molecular interactions [24, 25]. Evolutionary dynamics on phenotypic fitness landscapes was studied for bacterial metabolic networks [26] and antibiotic resistance [27]. Exploration of empirical fitness landscapes and extraction of their statistical features such as local correlation, epistasis, ruggedness and density of local maxima [8, 28, 2, 6, 29, 30, 31], were pursued in the belief that these statistical hallmarks will aid in translating evolutionary trajectories to more general landscapes [32, 33].

The focus of the studies surveyed above was genotype fitness. Genes however are thought to evolve not only to maximize fitness, but also to reduce crosstalk [34, 35], increase network modularity [36] and allow for desired signaling properties [37, 24]. Mutational robustness - the extent to which fitness changes due to mutations - has been demonstrated to be an additional driver of evolution [38, 39, 19, 40, 41, 42, 43, 44]. The quasi-species framework developed by Manfred Eigen and Peter Schuster [45, 46, 47] is a theoretical framework that describes mutation-selection evolutionary dynamics of a large number of distinct genotypes. This framework is suitable for studying evolution of genetic sequences with a large variety of alleles, as those captured by fitness landscapes. Quasi-species theory is an extension of the simple single-locus systems studied in population genetics [48]. While the above-mentioned models mostly assumed the strong-selection-weak-mutation (SSWM) regime, in which the population is nearly monomorphic, the quasi-species framework allows for high mutation rates such that the population is polymorphic. This theory predicts a failure to adapt (so-called "error catastrophe") if the mutation rate exceeds a threshold value. In intermediate mutation rates, it predicts that populations could (depending on the landscape) favor sub-optimal but mutationally robust genotypes over the fittest ones. This "survival of the flattest" result has been shown theoretically for the simple two-peak landscape case [49, 50]. It was demonstrated in simulations of digital organisms [51] and experimentally in plant viral pathogens [52] and RNA viruses [53].
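
The "survival of the flattest" prediction can be illustrated with a deliberately simplified two-peak heuristic. This sketch is not the model or simulation used in this work: it assumes that each peak is summarized by a replication rate `w` and a neutrality `nu` (the fraction of mutations that leave fitness unchanged), and that its effective growth rate is approximately `w * exp(-U * (1 - nu))` for a genomic mutation rate `U`. All function names and parameter values below are hypothetical.

```python
import math


def effective_growth(w, neutrality, U):
    """Approximate growth rate of a peak: replication rate times the
    probability that an offspring carries no fitness-changing mutation.
    Assumes Poisson-distributed mutations with mean U per replication."""
    return w * math.exp(-U * (1.0 - neutrality))


def critical_mutation_rate(w_fit, nu_fit, w_flat, nu_flat):
    """Mutation rate at which both peaks grow equally fast (toy heuristic)."""
    if not (w_fit > w_flat > 0.0):
        raise ValueError("expected w_fit > w_flat > 0")
    if not (0.0 <= nu_fit < nu_flat <= 1.0):
        raise ValueError("expected 0 <= nu_fit < nu_flat <= 1")
    return math.log(w_fit / w_flat) / (nu_flat - nu_fit)


# Hypothetical parameters: a tall, narrow peak and a lower, flatter one.
W_FIT, NU_FIT = 1.10, 0.2
W_FLAT, NU_FLAT = 1.00, 0.6

U_C = critical_mutation_rate(W_FIT, NU_FIT, W_FLAT, NU_FLAT)

for U in (0.5 * U_C, U_C, 2.0 * U_C):
    g_fit = effective_growth(W_FIT, NU_FIT, U)
    g_flat = effective_growth(W_FLAT, NU_FLAT, U)
    if math.isclose(g_fit, g_flat, rel_tol=1e-9):
        outcome = "tie"
    elif g_fit > g_flat:
        outcome = "fit peak favored"
    else:
        outcome = "flat peak favored"
    print(f"U = {U:.3f}: fit {g_fit:.4f}, flat {g_flat:.4f} -> {outcome}")
```

Under these assumptions the fit peak is favored below the crossover rate and the flat peak above it. The heuristic ignores back-mutation between peaks and finite population size.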

Advances in sequencing technologies now enable the measurement of increasingly large fitness landscape datasets [6, 7]. It is then desirable to predict evolutionary trajectories on these empirical fitness landscapes, using the previously developed theory in this field.

A recent set of experiments characterized the fitness landscape of the $\mathrm{tRNA}^{\mathrm{Arg}}_{\mathrm{CCU}}$ gene of S. cerevisiae. As this gene is relatively short (72 nucleotides), its landscape is significantly smaller than that of a typical protein (average of 1.4 kb in S. cerevisiae). It is a single-copy, non-essential gene, such that many of its mutants are viable. Li et al. measured the growth rates of 23,284 different mutants of this gene in four different growth conditions (23°C, 30°C, 37°C and oxidative stress) [54, 55]. The richness of this dataset renders it a highly valuable case study for analyzing topographic properties and evolutionary trajectories of an empirical landscape and for comparing them with theoretical predictions. Here, we comprehensively analyze this tRNA fitness landscape, in efforts to identify the properties that dictate whether a particular genotype can be the "wild type", namely the extant outcome of the evolutionary dynamics. We found that the wild type was not the fittest genotype, in any of the four conditions measured, nor was it the fittest on average over all, nor a local fitness maximum. We then defined a measure of genotype local flatness with respect to its single-point mutants and found that the wild type was one of the flattest genotypes in the dataset. Stochastic evolutionary simulations over this empirical fitness landscape showed a phase transition at a threshold mutation rate, from a population dominated by a high-fitness (non wild type) genotype at low mutation rates to a collection of many intermediate-fitness genotypes composed of the wild type and other genotypes of similar fitness. To estimate the full-genome mutation rate in which this transition is expected, we used the threshold mutation rate for the tRNA alone, as obtained in the simulations, and applied a scaling argument, assuming equal properties for all loci. Variation in either local mutation rate or gene susceptibility to mutation could however cause hybrid constructs with a mixture of fit and flat genes in the same genome.

Results

The wild type is not the fittest genotype.

Our dataset consists of experimental fitness measurements of 65,000 mutants of the S. cerevisiae $\mathrm{tRNA}^{\mathrm{Arg}}_{\mathrm{CCU}}$ gene collected by Li et al. [54, 55]. Growth rates of 23,284 of these mutants were measured under four different environmental conditions: 23°C, 30°C, 37°C and oxidative stress. In the following, we refer only to the genotypes that were measured under all four conditions. The fitness of each genotype was defined as the base-2 exponent of its relative growth rate with respect to the wild type (see Methods). Hence, by definition the wild type fitness was set to 1, for each condition.

We began by closely examining the dataset of fitness values. Our first remarkable finding was that the wild type was not the genotype with the highest fitness under any of the four conditions, contrary to what one might expect from population-genetic models of single-locus selection for a population at steady state. Under each of the conditions, between 2000 and 2400 mutants (out of the 23,284) exhibited higher fitness than the wild type (Fig. 1b-e)¹. We then analyzed possible sources of measurement error, including read-count variability as a source of inaccuracy in growth rate assessment, and the possibility that the fitness effect was due to independent mutations that fortuitously occurred elsewhere in the genome (SI - Figs. S1-S2). While such error sources did exist, they could not fully account for the wild type's fitness sub-optimality.

¹ 2441 genotypes at 23°C, 2075 genotypes at 30°C, 2008 genotypes at 37°C and 2236 genotypes under oxidative stress.
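
The per-condition counts quoted in the footnote can be converted into fractions of the 23,284 genotypes with a few lines of arithmetic. The sketch below uses only numbers stated in the text; the variable names and condition labels are illustrative.

```python
# Counts of genotypes fitter than the wild type, as quoted in the footnote.
TOTAL_GENOTYPES = 23_284
fitter_than_wt = {
    "23C": 2441,
    "30C": 2075,
    "37C": 2008,
    "oxidative stress": 2236,
}

# Percentage of the measured genotypes that outperform the wild type.
percent_fitter = {
    condition: 100.0 * count / TOTAL_GENOTYPES
    for condition, count in fitter_than_wt.items()
}

for condition, pct in percent_fitter.items():
    print(f"{condition}: {pct:.1f}%")
# Prints 10.5%, 8.9%, 8.6% and 9.6%, in the order listed above.
```

These values are broadly in line with the roughly 8%-10% range quoted in the caption of Fig. 1.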

Figure 1: Empirical fitness landscape of a tRNA gene. (a) A schematic visualization of the experimentally measured tRNA fitness landscape. Each circle represents a genotype. Filled circles represent genotypes whose fitness values (here encoded by different colors) were measured. Empty circles represent genotypes whose fitness values were not measured. We use here a concentric representation of the fitness landscape, centered around the wild type (WT), where the minimal number of steps on the graph between any two genotypes is the number of point mutations separating them. The wild type is then surrounded by expanding circles of its single mutants (denoted by $N_1$), double mutants ($N_2$), etc. The experiment probed all the wild type's single-point mutants, but only progressively smaller proportions of the following mutational neighborhoods, $N_i$. (b-e) The distribution of all fitness values measured under four different conditions (30°C, 23°C, DMSO and 37°C), at semi-log scale. The wild type fitness value is shown in each by the red dotted line. Fitness was defined relative to the wild type's fitness, such that the wild type fitness was set to 1 for each condition. Under each of the conditions tested, 8%-10% of the genotypes in this dataset were fitter than the wild type. The relative weights of different fitness values were biased by the non-uniform sampling of the landscape, with dense sampling close to the wild type, and sparser sampling farther away.
The wild type is not the ttest on average across conditions
102
A possible explanation for the apparent sub-optimality of the wild type could be that while some
103
mutants are tter than the wild type under a specic condition, they are much less t under other
104
conditions, such that,
on average
the wild type is ttest. To test the applicability of this explanation
105
for our case, we checked for each genotype the correlation between its tness values under the various
106
growth conditions. For high-tness genotypes (>1.05
30
C), a high correlation was found between
107
the tness values measured under various conditions,
r
0.75
0.91
between tness values at
30
C
108
and tness values under the other conditions. Namely, most genotypes which are t under one
109
condition are also t under others (see Fig. 2a). In contrast, genotypes with low tness in the range
110
0.6-0.8 at
30
C, showed a much lower correlation between their tness values across conditions,
111
r
0.28
0.49
(see Fig. 2b). These results argue against the possibility that the wild type is the
112
ttest on average, which would imply that genotypes having high-tness under one condition should
113
have low tness under another.
114
To formally compare between tness values averaged over multiple conditions, we considered the
115
geometric mean tness [56],
x
f
i
y p
±
m
f
m
i
q
1
{
M
, where
f
m
i
is the tness value of the
i
-th genotype
116
in the
m
-th condition (out of
M
). The tness values we have are relative to the wild type's, whose
117
tness was dened to be 1 under each of the conditions. Since growth rates dier between conditions,
118
we must rst transform the tness values under the dierent conditions to a common baseline before
119
we can calculate the geometric mean. To do so, we used the wild type growth rates reported by
120
Li
et al.
for each of the conditions (See Methods section for details). Fig. 2c shows a histogram
121
of the geometric mean tness values
x
f
i
y
of all the genotypes in our dataset after transforming the
122
original values. A possible caveat to this calculation is the underlying assumption that all four
123
conditions are equally probable in the organism's natural habitat. Empirical tness values might
124
be inaccurate due to various reasons as discussed in the SI (Section 1). To reduce dependence on
125
tness value inaccuracies, we may also look at the tness ranks: under each condition separately,
126
all the genotypes are ranked according to their tness values in ascending order (lowest tness has
127
5
.CC-BY-NC-ND 4.0 International licenseperpetuity. It is made available under a
preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in
The copyright holder for thisthis version posted September 27, 2021. ; https://doi.org/10.1101/2021.09.27.461914doi: bioRxiv preprint

References

Selforganization of matter and the evolution of biological macromolecules. Journal article.
The hypercycle. A principle of natural self-organization. Part A: Emergence of the hypercycle. Journal article.
Towards a general theory of adaptive walks on rugged landscapes. Journal article.
Evolution of digital organisms at high mutation rates leads to survival of the flattest. Journal article.
Population Genetics: A Concise Guide. Book.