Lawrence Berkeley National Laboratory
Recent Work
Title
Genome evolution in the allotetraploid frog Xenopus laevis.
Permalink
https://escholarship.org/uc/item/8h02q9hf
Journal
Nature, 538(7625)
ISSN
0028-0836
Authors
Session, Adam M
Uno, Yoshinobu
Kwon, Taejoon
et al.
Publication Date
2016-10-01
DOI
10.1038/nature19840
Peer reviewed
Genome evolution in the allotetraploid frog Xenopus laevis
A full list of authors and affiliations appears at the end of the article.
Abstract
To explore the origins and consequences of tetraploidy in the African clawed frog, we sequenced
the Xenopus laevis genome and compared it to the related diploid X. tropicalis genome. We
demonstrate the allotetraploid origin of X. laevis by partitioning its genome into two homeologous
subgenomes, marked by distinct families of “fossil” transposable elements. Based on the activity
of these elements and the age of hundreds of unitary pseudogenes, we estimate that the two diploid
progenitor species diverged ~34 million years ago (Mya) and combined to form an allotetraploid
~17–18 Mya. 56% of all genes are retained in two homeologous copies. Protein function, gene
expression, and the amount of flanking conserved sequence all correlate with retention rates. The
subgenomes have evolved asymmetrically, with one chromosome set more often preserving the
ancestral state and the other experiencing more gene loss, deletion, rearrangement, and reduced
gene expression.
Ancient polyploidization events have shaped diverse eukaryotic genomes [1], including two
rounds of whole genome duplication at the base of the vertebrate radiation [2]. While such
polyploidy is rare in amniotes, presumably due to constraints on sex chromosome dosage [3,4],
it is common in fish [5] and amphibian lineages [6,7], and in plants [8]. Polyploidy provides raw
material for evolutionary diversification, since gene duplicates can support new functions
and networks [9]. However, the component subgenomes of a polyploid must cooperate to
mediate potential incompatibilities of dosage, regulatory controls, protein-protein
interactions, and transposable element activity.
The African clawed frog Xenopus laevis is one of a polyploid series that ranges from diploid
to dodecaploid, and thus is ideal for studying the impact of genome duplication [10], especially
given its status as a premier model for cell and developmental biology [11]. X. laevis has a
chromosome number (2N=36) nearly double that of the Western clawed frog Xenopus (formerly
Silurana) tropicalis (2N=20) and most other diploid frogs [12], and is proposed to be
an allotetraploid that arose via the interspecific hybridization of diploid progenitors with
2N=18, followed by genome doubling to restore meiotic pairing and disomic
inheritance [10,13] (see Supplementary Note 1 and Extended Data Fig. 1 for discussion of the
Xenopus allotetraploidy hypothesis).
Correspondence to: Richard M. Harland; Masanori Taira; Daniel S. Rokhsar.
* equal contribution
Supplementary Information is linked to the online version of the paper. Please see Supplemental Note 15 for funding information and
data deposition information.
HHS Public Access
Author manuscript
Nature. Author manuscript; available in PMC 2017 April 20.
Published in final edited form as:
Nature. 2016 October 20; 538(7625): 336–343. doi:10.1038/nature19840.
Here we provide evidence for the allotetraploid hypothesis by tracing the origins of the
X. laevis genome from its extinct progenitor diploids. The two subgenomes are distinct and
maintain separate recombinational identities. Despite sharing the same nucleus, the subgenomes
have evolved asymmetrically: one of the two subgenomes has experienced more
intrachromosomal rearrangement, gene loss by deletion and pseudogenization, and changes in
levels of gene expression and in histone and DNA methylation. Superimposed on these
global trends are local gene family expansions and alteration of gene expression patterns.
Results
Assembly, annotation, and karyotype
We sequenced the genome of the X. laevis inbred “J” strain by whole genome shotgun
methods in combination with long-insert clone-based end sequencing (Supplementary Note
2), and organized the assembled sequences into chromosomes using fluorescence in situ
hybridization (FISH) of 798 bacterial artificial chromosome clones (BACs) and in vivo and
in vitro chromatin conformation capture analysis (Supplementary Note 3; Online Methods).
These complementary methods produced a high-quality chromosome-scale draft that
includes all previously known X. laevis genes and assigns >91% of the assembled sequence
(and 90% of the predicted protein-coding genes) to a chromosomal location.
We annotated 45,099 protein-coding genes and 342 microRNAs using RNAseq from 14
oocyte/developmental stages and 14 adult tissues and organs (Supplementary Note 4),
analysis of histone marks associated with transcription, and homology with X. tropicalis and
other tetrapods (Supplementary Note 5; Online Methods). 24,419 X. laevis protein-coding
genes can be placed in 2:1 or 1:1 correspondence with 15,613 X. tropicalis genes, defining
8,806 homeologous pairs of X. laevis genes with X. tropicalis orthologs, and 6,807 single
copy orthologs. The remaining genes are members of larger gene families (olfactory receptor
genes, etc.) whose X. tropicalis orthology is more complex.
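The gene counts quoted above are internally consistent, which a few lines of arithmetic can confirm. The sketch below uses only the numbers given in the text; the variable names are illustrative and are not taken from the authors' pipeline.

```python
# Counts quoted in the text above.
homeolog_pairs = 8_806   # X. tropicalis genes with two X. laevis co-orthologs (2:1)
single_copy = 6_807      # X. tropicalis genes with one X. laevis ortholog (1:1)

# Every X. tropicalis gene in the set falls in exactly one of the two classes.
tropicalis_genes = homeolog_pairs + single_copy

# A 2:1 pair contributes two X. laevis genes; a 1:1 ortholog contributes one.
laevis_genes = 2 * homeolog_pairs + single_copy

print(tropicalis_genes)  # 15613
print(laevis_genes)      # 24419
```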
The X. laevis karyotype (Fig. 1a) reveals nine pairs of homeologous chromosomes [1,14,15].
Each of the first eight pairs is co-orthologous to and named for a corresponding X. tropicalis
chromosome, appending an “L” and “S” for the longer and shorter homeologs,
respectively [16]. XLA2L is the Z/W sex chromosome [17], for which we determined a W-specific
sequence in the q-subtelomeric region that includes the sex-determining gene dmw [17], and a
corresponding Z-specific haplotype. The homeologous XLA2Sq, by contrast,
has no such locus, and neither does XTR2 (Extended Data Fig. 2a, Supplemental Note 6).
The ninth pair of homeologs is a q-q fusion of proto-chromosomes homologous to XTR9
and XTR10, which likely occurred prior to allotetraploidization (Extended Data Fig. 2b–d;
Supplementary Note 6). The S chromosomes are on average 13.2% shorter karyotypically [16]
and 17.3% shorter in assembled sequence than their L counterparts. The single nucleotide
polymorphism rate in X. laevis is ~0.4%, far less than the ~6% divergence between
homeologous genes (Extended Data Fig. 1c; Supplementary Note 8.8).
Subgenome identity and timing of allotetraploidization
We reasoned that dispersed relicts of transposable elements specific to each progenitor
would mark the descendent subgenomes in an allotetraploid (Fig. 2c, Extended Data Fig. 1).
Three classes of DNA transposon relicts appear almost exclusively on either the L or S
chromosomes (Supplementary Note 7). Xl-TpL_Harb and Xl-TpS_Harb are novel
subfamilies of miniature inverted-repeat transposable elements (MITEs) of the PIF/Harbinger
superfamily [18,19] whose relicts are almost completely restricted to L or S chromosomes,
respectively (Fig. 1b, Extended Data Fig. 3a). Similarly, sequence relicts of the Tc1/mariner
superfamily member Xl-TpS_Mar (closely related to the fish MMTS subfamily [20]) are found
almost exclusively on the S chromosomes (Fig. 1b), as confirmed by FISH analysis using
Xl-TpS_Mar as a probe (Fig. 1c, Supplemental Note 7.4; see Supplemental Note 7.3 for
details on the rare elements that map to the opposite subgenome).
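The logic of using subgenome-specific relicts as markers can be sketched as a simple vote over the elements found on a chromosome. This is a minimal illustration, not the authors' method: the function name, the 0.9 threshold, and the toy hit lists are assumptions made for the example, and only the three element names come from the text.

```python
from collections import Counter

# Subgenome marked by each relict family named in the text.
MARKER_SUBGENOME = {"Xl-TpL_Harb": "L", "Xl-TpS_Harb": "S", "Xl-TpS_Mar": "S"}

def assign_subgenome(relict_families, min_fraction=0.9):
    """Label a chromosome 'L', 'S', or 'ambiguous' from its marker relicts.

    min_fraction is a hypothetical cutoff: the share of marker hits that must
    agree before a label is assigned.
    """
    votes = Counter(
        MARKER_SUBGENOME[family]
        for family in relict_families
        if family in MARKER_SUBGENOME
    )
    total = sum(votes.values())
    if total == 0:
        return "ambiguous"
    label, count = votes.most_common(1)[0]
    return label if count / total >= min_fraction else "ambiguous"

# Made-up hit lists for two chromosomes, including a few "wrong-side" relicts.
chr_a = ["Xl-TpL_Harb"] * 95 + ["Xl-TpS_Mar"] * 2
chr_b = ["Xl-TpS_Harb"] * 60 + ["Xl-TpS_Mar"] * 38 + ["Xl-TpL_Harb"] * 1

print(assign_subgenome(chr_a))  # L
print(assign_subgenome(chr_b))  # S
```

A threshold below 1.0 tolerates the rare elements that map to the opposite subgenome, while a chromosome with mixed markers is left unlabeled rather than forced into a class.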
The L and S chromosome sets therefore represent the descendants of two distinct diploid
progenitors, confirming the allotetraploid hypothesis even in the absence of extant
progenitor species. Based on analysis of synonymous divergence of protein-coding genes,
the L and S subgenomes diverged from each other ~34 Mya (T₂) and from X. tropicalis ~48
Mya (T₁) (Fig. 2a), consistent with prior gene-by-gene estimates from transcriptomes [21–24]
(Supplementary Note 8, Extended Data Fig. 4; Online Methods). L- and S-specific
transposable elements were active ~18–34 Mya, indicating that the two progenitors were
independently evolving diploids during that period (Fig. 2a; Supplementary Note 7.5;
Extended Data Fig. 3). More recent transposon activity is more uniformly distributed across
the L and S chromosomes (not shown). Finally, consistent with a common origin for
tetraploid Xenopus species, we can clearly identify orthologs of L and S genes in whole
genome sequences of another allotetraploid frog, X. borealis, and estimate the X. laevis–X.
borealis divergence to be ~17 Mya (T₃). These considerations constrain the allotetraploid
event to ~17–18 Mya (T*). This timing is consistent with other estimates of the radiation of
tetraploid Xenopus species, which are presumed to have emerged from the bottleneck of a shared
allotetraploid founder population [23,24].
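The bracketing argument for the timing of allotetraploidization can be written out explicitly. The sketch below only restates the reasoning in the text with the quoted point estimates; the variable names are illustrative, and uncertainties on the estimates are ignored.

```python
# Point estimates quoted in the text, in millions of years ago (Mya).
progenitor_split = 34          # L and S progenitors diverge
specific_te_activity_end = 18  # youngest subgenome-specific transposon activity
laevis_borealis_split = 17     # X. laevis / X. borealis divergence

# While subgenome-specific elements were active the progenitors were still
# separate diploids, so the merger is no older than the end of that activity.
upper_bound = specific_te_activity_end

# Both tetraploid species carry L and S genes, so the merger predates their split.
lower_bound = laevis_borealis_split

# Sanity check on the ordering of events (larger value = further in the past).
assert progenitor_split > upper_bound >= lower_bound

print(f"Allotetraploidization window: {lower_bound}-{upper_bound} Mya")
```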
Karyotype stability
Remarkably, with the exception of the chromosome 9/10 fusion, X. laevis and X. tropicalis
chromosomes have maintained conserved synteny since their divergence ~48 Mya (Fig.
1a,b). The absence of inter-chromosomal rearrangements is consistent with the relative
stability of amphibian and avian karyotypes compared to mammals [25], which typically show
dozens of inter-chromosome rearrangements [26]. It also contrasts with many plant polyploids,
which can show considerable inter-subgenome rearrangement [27]. The distribution of L- and
S-specific repeats along entire chromosomes implies the absence of crossover recombination
between homeologs since allotetraploidization, presumably because the two progenitors
were sufficiently diverged to avoid meiotic pairing between homeologous chromosomes,
though we cannot rule out very limited localized inter-homeolog exchanges (Supplementary
Note 7).
The extensive collinearity between homologous X. laevis L and X. tropicalis chromosomes
(Fig. 1a) implies that they represent the ancestral chromosome organization. In contrast, the
S subgenome shows extensive intra-chromosomal rearrangements, evident in the large
inversions of XLA2S, XLA3S, XLA4S, XLA5S and XLA8S, as well as shorter
rearrangements (Fig. 1a). The S subgenome has also experienced more deletions. For
example, the 45S pre-ribosomal RNA gene cluster is found on X. laevis XLA3Lp, but its
homeologous locus on XLA3Sp is absent (Extended Data Fig. 5a). Extensive small-scale
deletions (Extended Data Fig. 5b) reduce the length of S chromosomes relative to their L and
X. tropicalis counterparts (see below).
Response of subgenomes to allotetraploidy
Redundant functional elements in a polyploid are expected to rapidly revert to single copy
through the fixation of disabling mutations and/or loss [28] unless prevented by
neofunctionalization [8], subfunctionalization [26], or selection for gene dosage [29]. Differential
gene loss between homeologous chromosomes is sometimes referred to as “genome
fractionation” [30–32] (see Supplementary Note 1). At least 56.4% of the protein-coding genes
duplicated by allotetraploidization have been retained in the X. laevis genome
(Supplementary Note 10; 60.2% if genes on unassigned short scaffolds are included).
Previous studies that rely on cDNA [21] and EST surveys [22,33,34] have observed far lower rates
of retention, probably due to sampling biases from gene expression (Supplementary Note
8.2).
Even higher retention rates are found for homeologous microRNAs (156 of 180, 86.7%), as
also found in the salmonid-specific duplication [5], and both primary copies are expressed for
intergenic homeologous microRNAs (Supplementary Note 8.6; Extended Data Fig. 5e).
Pan-vertebrate putatively cis-regulatory conserved non-coding elements (CNEs) [35] are also highly
retained (541 of 550, 98.4%; Supplementary Note 8.7; Table 1). CNEs conserved between
X. laevis and X. tropicalis, however, are retained at a significantly lower rate (49%; Table 1).
Longer genes (by genomic span, exon number, or coding length) are more likely to be
retained (Wilcoxon p-value <= 1E-5; Supplementary Note 10.5; Extended Data Fig. 5h–j),
broadly consistent with the idea that longer genes have more independently mutable
functions and are therefore more susceptible to subfunctionalization and subsequent
retention [36].
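The retention percentages quoted above follow directly from the stated counts. The sketch below recomputes them; the helper name is illustrative. The protein-coding figure is reproduced under an assumption that the text does not state explicitly: that it is the share of the 15,613 X. tropicalis genes in 2:1 or 1:1 correspondence that have two X. laevis co-orthologs (8,806).

```python
def retention_rate(retained, total):
    """Percentage of ancestral loci still present as two homeologous copies."""
    return 100.0 * retained / total

mirna_rate = retention_rate(156, 180)   # homeologous microRNAs
cne_rate = retention_rate(541, 550)     # pan-vertebrate conserved non-coding elements

# Assumption (see lead-in): homeologous pairs over all X. tropicalis genes
# placed in 2:1 or 1:1 correspondence.
gene_rate = retention_rate(8_806, 15_613)

print(f"{mirna_rate:.1f}")  # 86.7
print(f"{cne_rate:.1f}")    # 98.4
print(f"{gene_rate:.1f}")   # 56.4
```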
Genes have been lost asymmetrically between the two subgenomes of X. laevis. Similar
results have been reported for some plant polyploids [30] but not in rainbow trout [5]. For X.
laevis protein-coding genes with clear 1:1 or 2:1 orthologs in X. tropicalis, we find that
significantly more genes are lost on the S subgenome (31.5%) vs. the L subgenome (8.3%;
χ² test p-value=2.23E-50, Supplemental Table 2), with the same trend for other types of
functional elements, such as H3K4me3-enriched promoters and p300-bound enhancers
(Table 1). Across most of the genome, genes appear to be lost independently of their
neighbors, as the lengths of runs of consecutive gene losses are nearly geometrically
distributed (Fig. 3a, right). We do observe some large block deletions (e.g., several olfactory
clusters; Extended Data Fig. 5b) and a few unusually long blocks of functionally unrelated genes
that are retained in two copies without loss (Fig. 3a, left).
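The link between independent loss and a geometric run-length distribution can be illustrated with a small simulation. If each gene along a chromosome is lost independently with probability p, a run of consecutive losses has length k with probability p^(k-1)(1-p). The sketch below is an illustration of that expectation only: the loss probability, the gene count, and the function names are assumptions, not values or code from the paper.

```python
import random
from collections import Counter

def loss_run_lengths(lost_flags):
    """Lengths of maximal runs of consecutive lost genes along a chromosome."""
    runs, current = [], 0
    for lost in lost_flags:
        if lost:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs

def geometric_pmf(k, p):
    """P(run length = k) when each gene is lost independently with probability p."""
    return p ** (k - 1) * (1 - p)

rng = random.Random(0)   # fixed seed so the simulation is repeatable
p_loss = 0.3             # illustrative loss probability, not an estimate from the paper
genes = [rng.random() < p_loss for _ in range(200_000)]

runs = loss_run_lengths(genes)
observed = Counter(runs)

# Compare simulated run-length frequencies with the geometric expectation.
for k in range(1, 6):
    print(k, round(observed[k] / len(runs), 3), round(geometric_pmf(k, p_loss), 3))
```

Large block deletions would show up in such a comparison as an excess of long runs relative to the geometric expectation.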
Many lost genes are simply deleted, as demonstrated by significantly shorter distances
between conserved flanking genes. Both the size and number of deletions are greater on the