
Meta-Analysis of the Dynamics of the Emergence of Mutations and Variants of SARS-CoV-2

08 Mar 2021 - medRxiv (Cold Spring Harbor Laboratory Press)
TL;DR: In this article, the authors analyzed the appearance and prevalence trajectory of mutations that appeared in all SARS-CoV-2 genes from December 2019 to January 2021, and analyzed the structural properties of the spike glycoprotein of the B.1.1.7 (UK), B.1.351 (South Africa) and P.1 (Brazil) variants of concern.
Abstract: The novel Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) emerged in late December 2019 in Wuhan, China, and is the causative agent for the worldwide COVID-19 pandemic. SARS-CoV-2 is a 29,811-nucleotide positive-sense single-stranded RNA virus belonging to the betacoronavirus genus. Due to the inefficient proofreading ability of the viral RNA-dependent polymerase complex, coronaviruses are known to acquire new mutations following replication, which constitutes one of the main factors driving the evolution of their genomes and the emergence of new genetic variants. In the last few months, the identification of the new B.1.1.7 (UK), B.1.351 (South Africa) and P.1 (Brazil) variants of concern (VOCs) highlighted the importance of tracking the emergence of mutations in the SARS-CoV-2 genome and their impact on transmissibility, infectivity, and neutralizing antibody escape capabilities. These VOCs demonstrate increased transmissibility and antibody escape, and reduce current vaccine efficacy. Here we analyzed the appearance and prevalence trajectory of mutations that appeared in all SARS-CoV-2 genes from December 2019 to January 2021. Our goals were to identify which modifications are the most frequent, study the dynamics of their spread, their incorporation into the consensus sequence, and their impact on virus biology. We also analyzed the structural properties of the spike glycoprotein of the B.1.1.7, B.1.351 and P.1 variants. This study offers an integrative view of the emergence, disappearance, and consensus sequence integration of successful mutations that constitute new SARS-CoV-2 variants and their impact on neutralizing antibody therapeutics and vaccines.
IMPORTANCE: SARS-CoV-2 is the etiological agent of COVID-19, which has caused more than 2 million deaths worldwide as of January 2021. Mutations occur in the genome of SARS-CoV-2 during viral replication and affect viral infectivity, transmissibility and virulence. In early March 2020, the D614G mutation in the spike protein emerged, which increased viral transmissibility and is now found in >90% of all SARS-CoV-2 genomic sequences in the GISAID database. Between October and December 2020, the B.1.1.7 (UK), B.1.351 (South Africa) and P.1 (Brazil) variants of concern (VOCs) emerged, which have increased neutralizing antibody escape capabilities because of mutations in the receptor binding domain of the spike protein. Characterizing mutations in these variants is crucial because of their effect on the adaptive immune response, neutralizing antibody therapy, and their impact on vaccine efficacy. Here we tracked and analyzed mutations in SARS-CoV-2 genes over a twelve-month period and investigated functional alterations in the spike of VOCs.

Summary (1 min read)

INTRODUCTION

  • In late December 2019, a new betacoronavirus known as Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) emerged in the city of Wuhan in the province of Hubei, China (1).
  • Coronaviruses are therefore expected to evolve through genetic drift much slower than other RNA viruses that do not have this ability, such as influenza viruses (8,10).
  • Genetic variants are therefore the rare successful offshoots of the reference strain.
  • In addition to the N501Y mutation, both the South African and Brazilian variants possess the RBD mutations K417N(T) and E484K, which are also associated with further increased neutralizing antibody (Nab) escape capabilities (24,25).

MATERIALS AND METHODS

  • Data collection and mutational analysis: Genomes uploaded to the GISAID EpiCoV™ server database were analyzed from December 1st, 2019, to December 31st, 2020, and viral sequences with submission dates from December 1st, 2019 to January 6, 2021 were selected.
  • The authors filtered through 309,962 genomes for the analysis of selected mutations.
  • For the analysis of the mutations in the B.1.1.7, B.1.351, and P.1 variants, the authors used the GISAID EpiCoV™ server database.
  • Only complete SARS-CoV-2 genomes (28 to 30 kb) isolated from human hosts were analyzed.
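The inclusion criterion in the last bullet (complete genomes of 28 to 30 kb from human hosts) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `GenomeRecord` layout, the field names, and the example accessions are hypothetical, and a real GISAID metadata export uses its own column names.

```python
from dataclasses import dataclass

# Hypothetical record layout, for illustration only.
@dataclass
class GenomeRecord:
    accession: str
    host: str
    length: int  # genome length in nucleotides

# Length window stated in the methods summary: 28 to 30 kb.
MIN_LEN = 28_000
MAX_LEN = 30_000

def is_eligible(rec: GenomeRecord) -> bool:
    """Keep only human-host genomes whose length falls in the 28-30 kb window."""
    return rec.host.strip().lower() == "human" and MIN_LEN <= rec.length <= MAX_LEN

def filter_genomes(records):
    """Return the records that pass the host and length criteria."""
    return [r for r in records if is_eligible(r)]

# Toy input: made-up accessions, not real GISAID entries.
records = [
    GenomeRecord("EXAMPLE_1", "Human", 29_811),
    GenomeRecord("EXAMPLE_2", "Mink", 29_800),   # rejected: non-human host
    GenomeRecord("EXAMPLE_3", "Human", 25_000),  # rejected: incomplete genome
]
kept = filter_genomes(records)
```

Treating the window bounds as inclusive is an assumption; the summary does not say how boundary lengths were handled.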

Structural modeling

  • Mutations in the spike protein in complex with hACE2 were analyzed using a mutagenesis tool for PyMOL (PDB: 7A94).
  • Figures and rendering were prepared with PyMOL.

RESULTS

  • Identification of emerging mutations in various SARS-CoV-2 genes.
  • The authors also demonstrate that most genes in SARS-CoV-2 have mutations with overall frequencies lower than 10% (Fig.1).
  • Therefore, mutations presented here vastly underrepresent the global landscape of mutation frequency dynamics.
  • These results allow us to better understand the frequencies, localization, and interactions of mutations in the S protein of the B.1.1.7 variant.

DISCUSSION

  • The emergence of new genetic variants that are more transmissible, virulent and resistant to antibody neutralization has highlighted the importance of studying the function of mutations in the viral genome.
  • The Oxford-AstraZeneca vaccine has displayed compromised efficacy against the B.1.351 variant with only 10% vaccine efficacy (36).
  • The combination of D614G with other mutations in S can enable Nabs escape (19).
  • The global frequency of mutations and variants in the database is therefore biased to represent the genetic landscape of the virus in the countries doing the most testing, sequencing, and data sharing.
  • A global approach to analyzing both transmitted variants and non-transmitted sub-variants and quasispecies could provide a better understanding of the effects of SARS-CoV-2 mutations.

FIGURE LEGENDS

  • Figure 1: Variations in mutations and mutation frequencies in SARS-CoV-2 genes.
  • Graphs were generated using RStudio and Biorender.
  • G) Genome of the SARS-CoV-2 P.1 variant with identified nucleotide substitutions, deletions, and insertions.


Meta-Analysis and Structural Dynamics of the Emergence of
Genetic Variants of SARS-CoV-2
Nicolas Castonguay¹, Wandong Zhang²,³ and Marc-Andre Langlois¹,⁴*
¹ Department of Biochemistry, Microbiology & Immunology, Faculty of Medicine, University of Ottawa, Ontario, Canada K1H 8M5.
² Department of Cellular & Molecular Medicine, Faculty of Medicine, University of Ottawa, Ontario, Canada K1H 8M5.
³ Human Health Therapeutics Research Centre, National Research Council Canada, Ottawa, Canada K1A 0R6.
⁴ uOttawa Center for Infection, Immunity and Inflammation (CI3).
Running title: Genome Evolution of SARS-CoV-2.
*Correspondence should be addressed to: langlois@uottawa.ca
Key words: SARS-CoV-2, COVID-19, Variants of concern, B.1.1.7, B.1.351, P.1, D614G
medRxiv preprint doi: https://doi.org/10.1101/2021.03.06.21252994; this version posted May 15, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
NOTE: This preprint reports new research that has not been certified by peer review and should not be used to guide clinical practice.

ABSTRACT
The novel Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) emerged in late December
2019 in Wuhan, China, and is the causative agent for the worldwide COVID-19 pandemic. SARS-CoV-2
is a positive-sense single-stranded RNA virus belonging to the betacoronavirus genus. Due to the error-
prone nature of the viral RNA-dependent polymerase complex, coronaviruses are known to acquire new
mutations at each cycle of genome replication. This constitutes one of the main factors driving the evolution
of its relatively large genome and the emergence of new genetic variants. In the past few months, the
identification of new B.1.1.7 (UK), B.1.351 (South Africa) and P.1 (Brazil) variants of concern (VOCs) has
highlighted the importance of tracking the emergence of mutations in the SARS-CoV-2 genome that impact
transmissibility, virulence, and immune and neutralizing antibody escape. Here we analyzed the appearance
and prevalence trajectory over time of mutations that appeared in all SARS-CoV-2 genes from December,
2019 to April, 2021. The goal of the study was to identify which genetic modifications are the most frequent
and study the dynamics of their propagation, their incorporation into the consensus sequence, and their
impact on virus biology. We also analyzed the structural properties of the spike glycoprotein of the B.1.1.7,
B.1.351 and P.1 variants for its binding to the host receptor ACE2. This study offers an integrative view of
the emergence, disappearance, and consensus sequence integration of successful mutations that constitute
new SARS-CoV-2 variants and their impact on neutralizing antibody therapeutics and vaccines.

IMPORTANCE
SARS-CoV-2 is the etiological agent of COVID-19, which has caused > 3.4 million deaths worldwide as
of April, 2021. Mutations occur in the genome of SARS-CoV-2 during viral replication and affect viral
infectivity, transmissibility, and virulence. In early March 2020, the D614G mutation in the spike protein
emerged, which increased viral transmissibility and is now found in over 90% of all SARS-CoV-2 genomic
sequences in the GISAID database. Between October and December 2020, B.1.1.7 (UK), B.1.351 (South
Africa) and P.1 (Brazil) variants of concern (VOCs) emerged, which have increased neutralizing antibody
escape capabilities because of mutations in the receptor binding domain of the spike protein. Characterizing
mutations in these variants is crucial because of their effect on adaptive immune responses, neutralizing
antibody therapy, and their impact on vaccine efficacy. Here we tracked and analyzed mutations in SARS-
CoV-2 genes since the beginning of the pandemic and investigated their functional impact on the spike of
these three VOCs.

Graphical Abstract

INTRODUCTION
In late December 2019, a new betacoronavirus known as Severe Acute Respiratory Syndrome Coronavirus
2 (SARS-CoV-2) emerged in the city of Wuhan in the province of Hubei, China (1). SARS-CoV-2 is the
etiological viral agent for the worldwide COVID-19 pandemic resulting in more than 162 million infected
and 3.4 million deaths worldwide as of April, 2021 (2,3). SARS-CoV-2 is an enveloped, positive-sense
single-stranded RNA (+ssRNA) virus with a genome length of 29,811 nucleotides (4). The mutation rates
of RNA viruses are generally higher than those of DNA viruses because of the low fidelity of their viral RNA
polymerases (5,6). Mutations occur when viral replication enzymes introduce errors in the viral genome
resulting in the creation of premature termination codons, deletions and insertions of nucleotides that can
alter open reading frames and result in amino acid substitutions in viral proteins. These mutations combined
with the selective pressure of the human immune system lead to the selection and evolution of viral genomes
(6,7). However, coronaviruses are among the few RNA viruses that possess limited
but measurable proofreading ability via the 3'-to-5' exoribonuclease activity of the non-structural viral
protein 14 (nsp14) (8,9). Coronaviruses are therefore expected to evolve through genetic drift much slower
than other RNA viruses that do not have this ability, such as influenza viruses (8,10). Additionally, SARS-
CoV-2 and other coronaviruses have low known occurrences of recombination between family members
(i.e., genetic shift), and therefore are mostly susceptible to genetic drift (11).
SARS-CoV-2 has reached pandemic status due to its presence on every continent and has since maintained
a high level of transmissibility across hosts of various ethnic and genetic backgrounds (2, 12). Moreover,
SARS-CoV-2 has been reported to naturally infect minks, ferrets, cats, tigers, and dogs, which
allows the virus to replicate in completely new hosts and mutate to produce new variants and possibly new
strains (13,14). In March 2020, the now dominant D614G mutation first emerged in the spike protein (S)
of SARS-CoV-2. The S protein is present as a trimer at the surface of the viral envelope and is responsible
for attachment of the virus to the human angiotensin converting enzyme 2 (hACE2), the entry receptor for
SARS-CoV-2 into human cells (15). Published evidence has now shown that D614G increases viral fitness,
transmissibility and viral load but does not directly affect COVID-19 pathogenicity (16,17,18,19).
Additionally, emerging evidence indicates that D614G may have epistatic interactions that exacerbate the
impact of several other independent mutations (19). Mutations in the S protein, and particularly in the
receptor binding domain (RBD), are of very high concern given that they can directly influence viral
infectivity, transmissibility, and resistance to neutralizing antibodies and T cell responses.
New mutations are frequently and regularly detected in the genome of SARS-CoV-2 through whole genome
sequencing; however, very few of these mutations make it into the transmitted viral consensus sequence.
The reference strain is generally regarded as the dominant transmitted strain at a given time. Its sequence

Citations
Posted ContentDOI
11 Nov 2020-bioRxiv
TL;DR: A mouse-adapted SARS-CoV-2 strain that harbored three amino acid substitutions in the RBD of the S protein and showed 100% mortality in aged, male BALB/c mice was generated, and the molecular mechanism for the rapid adaptation and evolution of SARS-CoV-2 in mice was unveiled.
Abstract: The ongoing SARS-CoV-2 pandemic has brought an urgent need for animal models to study the pathogenicity of the virus. Herein, we generated and characterized a novel mouse-adapted SARS-CoV-2 strain named MASCp36 that causes acute respiratory symptoms and mortality in standard laboratory mice. Particularly, this model exhibits an age- and gender-related skewed distribution of mortality akin to severe COVID-19, and the 50% lethal dose (LD50) of MASCp36 was ~100 PFU in aged, male BALB/c mice. Deep sequencing identified three amino acid mutations, N501Y, Q493H, and K417N, that subsequently emerged at the receptor binding domain (RBD) of MASCp36 and significantly enhanced the binding affinity to its endogenous receptor, mouse ACE2 (mACE2). Cryo-electron microscopy (cryo-EM) analysis of mACE2 in complex with the RBD of MASCp36 at 3.7-angstrom resolution elucidates the molecular basis for the receptor-binding switch driven by amino acid substitutions. Our study not only provides a robust platform for studying the pathogenesis of severe COVID-19 and rapid evaluation of countermeasures against SARS-CoV-2, but also unveils the molecular mechanism for the rapid adaptation and evolution of SARS-CoV-2 in mice.

67 citations

Journal ArticleDOI
TL;DR: In this paper, Qin et al. presented a mouse-adapted SARS-CoV-2 strain, MASCp36, that causes severe respiratory symptoms, and mortality.
Abstract: There is an urgent need for animal models to study SARS-CoV-2 pathogenicity. Here, we generate and characterize a novel mouse-adapted SARS-CoV-2 strain, MASCp36, that causes severe respiratory symptoms and mortality. Our model exhibits age- and gender-related mortality akin to severe COVID-19. Deep sequencing identified three amino acid substitutions, N501Y, Q493H, and K417N, at the receptor binding domain (RBD) of MASCp36, during in vivo passaging. All three RBD mutations significantly enhance binding affinity to its endogenous receptor, ACE2. Cryo-electron microscopy analysis of human ACE2 (hACE2), or mouse ACE2 (mACE2), in complex with the RBD of MASCp36, at 3.1 to 3.7 Å resolution, reveals the molecular basis for the receptor-binding switch. N501Y and Q493H enhance the binding affinity to hACE2, whereas triple mutations at N501Y/Q493H/K417N decrease affinity and reduce infectivity of MASCp36. Our study provides a platform for studying SARS-CoV-2 pathogenesis, and unveils the molecular mechanism for its rapid adaptation and evolution. In this study, Qin et al. present a murine-adapted SARS-CoV-2 strain, MASCp36, as a model for studying the pathogenicity, evolution and adaptation of the virus to human and animal hosts.

58 citations

Journal ArticleDOI
TL;DR: In this article, a normalization of COVID-19 cases was performed to calculate the relative frequency of SARS-CoV-2 mutations and explore their dynamics over time.
Abstract: Coronavirus disease 2019 (COVID-19) is a contagious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). This disease has spread globally, causing more than 161.5 million cases and 3.3 million deaths to date. Surveillance and monitoring of new mutations in the virus' genome are crucial to our understanding of the adaptation of SARS-CoV-2. Moreover, how the temporal dynamics of these mutations is influenced by control measures and non-pharmaceutical interventions (NPIs) is poorly understood. Using 1,058,020 SARS-CoV-2 genomes from sequenced COVID-19 cases from 98 countries (totaling 714 country-month combinations), we perform a normalization by COVID-19 cases to calculate the relative frequency of SARS-CoV-2 mutations and explore their dynamics over time. We found 115 mutations estimated to be present in more than 3% of global COVID-19 cases and determined three types of mutation dynamics: high-frequency, medium-frequency, and low-frequency. Classification of mutations based on temporal dynamics enables us to examine viral adaptation and evaluate the effects of implemented control measures on virus evolution during the pandemic. We showed that medium-frequency mutations are characterized by high prevalence in specific regions and/or by constant competition with other mutations in several regions. Finally, taking the N501Y mutation as representative of high-frequency mutations, we showed that the level of control measure stringency negatively correlates with the effective reproduction number of SARS-CoV-2 carrying high-frequency or not-high-frequency mutations, and that both follow similar trends at different levels of stringency.

10 citations

Journal ArticleDOI
26 Oct 2021-PLOS ONE
TL;DR: In this paper, the authors evaluated the robustness of those reopening plans under a wide range of uncertainties and showed that seemingly sensible re-opening plans can lead to both unnecessary COVID-19 deaths and days of interventions.
Abstract: The COVID-19 pandemic required significant public health interventions from local governments. Although nonpharmaceutical interventions often were implemented as decision rules, few studies evaluated the robustness of those reopening plans under a wide range of uncertainties. This paper uses the Robust Decision Making approach to stress-test 78 alternative reopening strategies, using California as an example. This study uniquely considers a wide range of uncertainties and demonstrates that seemingly sensible reopening plans can lead to both unnecessary COVID-19 deaths and days of interventions. We find that plans using fixed COVID-19 case thresholds might be less effective than strategies with time-varying reopening thresholds. While we use California as an example, our results are particularly relevant for jurisdictions where vaccination roll-out has been slower. The approach used in this paper could also prove useful for other public health policy problems in which policymakers need to make robust decisions in the face of deep uncertainty.

3 citations

Posted ContentDOI
28 Apr 2021-medRxiv
TL;DR: In this paper, the authors use simulation models and the Robust Decision Making (RDM) approach to stress-test California's reopening strategy and other alternatives over a wide range of futures.
Abstract: Amid global scarcity of COVID-19 vaccines and the threat of new variant strains, California and other jurisdictions face the question of when and how to implement and relax COVID-19 Nonpharmaceutical Interventions (NPIs). While policymakers have attempted to balance the health and economic impacts of the pandemic, decentralized decision-making, deep uncertainty, and the lack of widespread use of comprehensive decision support methods can lead to the choice of fragile or inefficient strategies. This paper uses simulation models and the Robust Decision Making (RDM) approach to stress-test California's reopening strategy and other alternatives over a wide range of futures. We find that plans which respond aggressively to initial outbreaks are required to robustly control the pandemic. Further, the best plans adapt to changing circumstances, lowering their stringent requirements to reopen over time or as more constituents are vaccinated. While we use California as an example, our results are particularly relevant for jurisdictions where vaccination roll-out has been slower.

2 citations

References
Journal ArticleDOI
TL;DR: Human airway epithelial cells were used to isolate a novel coronavirus, named 2019-nCoV, which formed a clade within the subgenus sarbecovirus, Orthocoronavirinae subfamily, which is the seventh member of the family of coronaviruses that infect humans.
Abstract: In December 2019, a cluster of patients with pneumonia of unknown cause was linked to a seafood wholesale market in Wuhan, China. A previously unknown betacoronavirus was discovered through the use of unbiased sequencing in samples from patients with pneumonia. Human airway epithelial cells were used to isolate a novel coronavirus, named 2019-nCoV, which formed a clade within the subgenus sarbecovirus, Orthocoronavirinae subfamily. Different from both MERS-CoV and SARS-CoV, 2019-nCoV is the seventh member of the family of coronaviruses that infect humans. Enhanced surveillance and further investigation are ongoing. (Funded by the National Key Research and Development Program of China and the National Major Project for Control and Prevention of Infectious Disease in China.).

21,455 citations

Journal ArticleDOI
TL;DR: A two-dose regimen of BNT162b2 conferred 95% protection against Covid-19 in persons 16 years of age or older and safety over a median of 2 months was similar to that of other viral vaccines.
Abstract: Background Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection and the resulting coronavirus disease 2019 (Covid-19) have afflicted tens of millions of people in a world...

10,274 citations

Journal ArticleDOI
03 Feb 2020-Nature
TL;DR: Phylogenetic and metagenomic analyses of the complete viral genome of a new coronavirus from the family Coronaviridae reveal that the virus is closely related to a group of SARS-like coronaviruses found in bats in China.
Abstract: Emerging infectious diseases, such as severe acute respiratory syndrome (SARS) and Zika virus disease, present a major threat to public health. Despite intense research efforts, how, when and where new diseases appear are still a source of considerable uncertainty. A severe respiratory disease was recently reported in Wuhan, Hubei province, China. As of 25 January 2020, at least 1,975 cases had been reported since the first patient was hospitalized on 12 December 2019. Epidemiological investigations have suggested that the outbreak was associated with a seafood market in Wuhan. Here we study a single patient who was a worker at the market and who was admitted to the Central Hospital of Wuhan on 26 December 2019 while experiencing a severe respiratory syndrome that included fever, dizziness and a cough. Metagenomic RNA sequencing of a sample of bronchoalveolar lavage fluid from the patient identified a new RNA virus strain from the family Coronaviridae, which is designated here ‘WH-Human 1’ coronavirus (and has also been referred to as ‘2019-nCoV’). Phylogenetic analysis of the complete viral genome (29,903 nucleotides) revealed that the virus was most closely related (89.1% nucleotide similarity) to a group of SARS-like coronaviruses (genus Betacoronavirus, subgenus Sarbecovirus) that had previously been found in bats in China. This outbreak highlights the ongoing ability of viral spill-over from animals to cause severe disease in humans. Phylogenetic and metagenomic analyses of the complete viral genome of a new coronavirus from the family Coronaviridae reveal that the virus is closely related to a group of SARS-like coronaviruses found in bats in China.

9,231 citations

Journal ArticleDOI
TL;DR: The outbreak of the 2019 novel coronavirus disease (COVID-19) has induced a considerable degree of fear, emotional stress and anxiety among individuals around the world.
Abstract: The outbreak of the 2019 novel coronavirus disease (COVID-19) has induced a considerable degree of fear, emotional stress and anxiety among individuals around t

8,336 citations

Journal ArticleDOI
16 Apr 2020-Cell
TL;DR: It is demonstrated that cross-neutralizing antibodies targeting conserved S epitopes can be elicited upon vaccination, and it is shown that SARS-CoV-2 S uses ACE2 to enter cells and that the receptor-binding domains of SARS-CoV-2 S and SARS-CoV S bind with similar affinities to human ACE2, correlating with the efficient spread of SARS-CoV-2 among humans.

7,219 citations

Frequently Asked Questions (17)
Q1. What have the authors contributed in "Meta-Analysis and Structural Dynamics of the Emergence of Genetic Variants of SARS-CoV-2"?

In this paper, Castonguay et al. note that published evidence has shown that D614G increases viral fitness, transmissibility and viral load but does not directly affect COVID-19 pathogenicity.

SARS-CoV-2 is the etiological viral agent for the worldwide COVID-19 pandemic resulting in more than 162 million infected and 3.4 million deaths worldwide as of April, 2021 (2,3). 

Mutations occur in the genome of SARS-CoV-2 during viral replication and affect viral infectivity, transmissibility, and virulence. 

Tracking mutations and the evolution of the SARS-CoV-2 genome is critical for the development and deployment of effective treatments and vaccines. 

GISAID is a formidable tool for tracking the emergence of mutations, identifying the geographic region where it emerged, and tracking its spread around the globe. 

Another caveat in analyzing sequences in the GISAID database is that consensus sequences are uploaded, but subsequences and quasispecies are generally not included. 

The K417T mutation reduces interactions with neighboring residues in the C102 Nab and therefore the model predicts, as with K417N in B.1.351, an increased ability to escape neutralization (Fig. 7D).

The recently approved Johnson & Johnson adenovirus-based vaccine only requires a single dose, in comparison to two for the mRNA vaccines and Oxford-AstraZeneca vaccine, and has an efficacy of 66% against the original Wuhan reference strain, 52.0 to 64.0% against the B.1.351 variant and 66% against the P.1 variant (43). 

Antibodies induced by the Pfizer-BioNTech and Moderna vaccines appear to display a 6.7-fold and 4.5-fold decrease, respectively, in neutralization efficacy against the P.1 variant (Table 6) (39).

This mutation is reported to cause increased resistance to Nabs, increased transmissibility, and increased virulence in animal models (27). 

The number of sequenced viral genomes uploaded to the GISAID database grew rapidly from 131,417 at the end of September, 2020 to 451,913 by January 30th, 2021 (28). 

The N501Y mutation is associated with an increased affinity to hACE2, along with an increase in infectivity and virulence (Table 2). 

A new variant was discovered in late 2020 in the UK that displayed increased affinity to hACE2, virulence and Nabs escape capabilities (Fig. 4) (24,25,30,31). 

Here the authors present a retrospective metadata analysis of mutations throughout the SARS-CoV-2 genome that reached at least a 1% worldwide frequency between December 2019 and January 2021. 

The authors also observe the emergence of several mutations in S and N present in the B.1.1.7 variant, which now have a frequency higher than 65% (Fig. 2C, 2D and Table 2). 

Emergence of new variants may therefore go undetected until they leave their point of origin and enter countries with high testing and sequencing rates. 

The authors first selected recurring mutations that were present in more than 500 reported genomes by August 2020, and another selection was made in January 2021 to capture recurring mutations present in more than 4000 reported genomes in GISAID.
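The selection rules described in these answers (mutations present in more than a given number of reported genomes, and reaching at least a 1% worldwide frequency) amount to counting how many genomes carry each mutation and applying two thresholds. The Python sketch below illustrates that arithmetic on toy data; it is not the authors' code, and the function names, mutation labels, and the scaled-down count threshold are assumptions made for illustration.

```python
from collections import Counter

def mutation_counts(genomes):
    """Count, for each mutation label, how many genomes carry it.

    `genomes` is a list with one collection of mutation labels per genome.
    """
    counts = Counter()
    for muts in genomes:
        counts.update(set(muts))  # a mutation counts at most once per genome
    return counts

def select_recurring(genomes, min_genomes, min_frequency=0.01):
    """Return {mutation: frequency} for mutations found in MORE THAN
    `min_genomes` genomes and at a frequency of at least `min_frequency`."""
    total = len(genomes)
    if total == 0:
        return {}
    counts = mutation_counts(genomes)
    return {
        m: c / total
        for m, c in counts.items()
        if c > min_genomes and c / total >= min_frequency
    }

# Toy data set of 10 genomes; the source thresholds (500 and 4000 genomes)
# are scaled down to 5 here purely for illustration.
genomes = (
    [{"S:D614G", "N:R203K"}] * 6
    + [{"S:D614G"}] * 3
    + [{"S:N501Y"}] * 1
)
selected = select_recurring(genomes, min_genomes=5, min_frequency=0.01)
```

With the source's own numbers, the same call would use `min_genomes=500` for the August 2020 selection and `min_genomes=4000` for the January 2021 selection, applied to the full set of filtered genomes.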