Open access | Journal Article | DOI: 10.3390/V13030394

The Mutation Profile of SARS-CoV-2 Is Primarily Shaped by the Host Antiviral Defense.

02 Mar 2021-Viruses (MDPI AG)-Vol. 13, Iss: 3, pp 394
Abstract: Understanding SARS-CoV-2 evolution is a fundamental effort in coping with the COVID-19 pandemic. The virus genomes have been broadly evolving due to the high number of infected hosts world-wide. Mutagenesis and selection are two inter-dependent mechanisms of virus diversification. However, which mechanisms contribute to the mutation profiles of SARS-CoV-2 remain under-explored. Here, we delineate the contribution of mutagenesis and selection to the genome diversity of SARS-CoV-2 isolates. We generated a comprehensive phylogenetic tree with representative genomes. Instead of counting mutations relative to the reference genome, we identified each mutation event at the nodes of the phylogenetic tree. With this approach, we obtained the mutation events that are independent of each other and generated the mutation profile of SARS-CoV-2 genomes. The results suggest that the heterogeneous mutation patterns are mainly reflections of host (i) antiviral mechanisms that are achieved through APOBEC, ADAR, and ZAP proteins, and (ii) probable adaptation against reactive oxygen species.
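
The difference between counting mutations relative to a reference genome and counting independent mutation events at tree nodes can be illustrated with a minimal sketch. It assumes ancestral sequences at internal nodes are already reconstructed; the toy tree, the sequences, and the function names `mutation_events` and `naive_counts` are hypothetical and are not taken from the paper.

```python
from collections import Counter

def mutation_events(parent, seqs):
    """Count substitutions once per branch (parent node -> child node)."""
    events = Counter()
    for child, par in parent.items():
        for a, b in zip(seqs[par], seqs[child]):
            if a != b:
                events[(a, b)] += 1
    return events

def naive_counts(reference, leaf_seqs):
    """Count substitutions in every genome relative to one reference."""
    counts = Counter()
    for s in leaf_seqs:
        for a, b in zip(reference, s):
            if a != b:
                counts[(a, b)] += 1
    return counts

# Hypothetical toy tree: root -> n1 -> (leafA, leafB); root -> leafC
parent = {"n1": "root", "leafA": "n1", "leafB": "n1", "leafC": "root"}
seqs = {
    "root": "ACGT",
    "n1": "ATGT",     # C>T arose once, on the branch root -> n1
    "leafA": "ATGT",
    "leafB": "ATGA",  # T>A arose once, on the branch n1 -> leafB
    "leafC": "ACGT",
}

events = mutation_events(parent, seqs)
naive = naive_counts(seqs["root"], [seqs[k] for k in ("leafA", "leafB", "leafC")])
print(dict(events))
print(dict(naive))
```

In this toy case the inherited C>T change is counted once per event on the tree, but once per descendant genome in the reference-based count, which is the kind of double counting the node-based approach avoids.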

Open access | Journal Article | DOI: 10.1002/RMV.2231
Daniele Focosi, Fabrizio Maggi
Abstract: The Spike protein is the target of both antibody-based therapeutics (convalescent plasma, polyclonal serum, monoclonal antibodies) and vaccines. Mutations in Spike could affect efficacy of those treatments. Hence, monitoring of mutations is necessary to forecast and readapt the inventory of therapeutics. Different phylogenetic nomenclatures have been used for the currently circulating SARS-CoV-2 clades. The Spike protein has different hotspots of mutation and deletion, the most dangerous for immune escape being the ones within the receptor binding domain (RBD), such as K417N/T, N439K, L452R, Y453F, S477N, E484K, and N501Y. Convergent evolution has led to different combinations of mutations among different clades. In this review we focus on the main variants of concern, that is, the so-called UK (B.1.1.7), South African (B.1.351) and Brazilian (P.1) strains.

35 Citations

Open access | Journal Article | DOI: 10.1128/MBIO.01140-21
31 Aug 2021-mBio
Abstract: The recent emergence of multiple variants of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has become a significant concern for public health worldwide. New variants have been classified either as variants of concern (VOCs) or variants of interest (VOIs) by the CDC (USA) and WHO. The VOCs include lineages such as B.1.1.7 (20I/501Y.V1 variant), P.1 (20J/501Y.V3 variant), B.1.351 (20H/501Y.V2 variant), and B.1.617.2. In contrast, the VOI category includes B.1.525, B.1.526, P.2, and B.1.427/B.1.429. The WHO issued an alert for the last two variants (P.2 and B.1.427/B.1.429) and labeled them for further monitoring. According to the WHO, these variants can be reclassified depending on their status at a particular time. Meanwhile, the CDC (USA) has continued to list these two variants as VOIs to date. This article analyzes the evolutionary patterns of all these emerging variants, as well as their geographical distributions and transmission patterns, including the circulating frequency, entropy diversity, and mutational event diversity throughout the genomes of all SARS-CoV-2 lineages. Transmission was highest for the B.1.1.7 lineage. Our frequency evaluation found that this lineage achieved 100% frequency in early October 2020. We also critically evaluated the mutational landscape of the above emerging variants and the significant spike protein mutations (E484K, K417T/N, N501Y, and D614G) impacting public health. Finally, the effectiveness of vaccines against newly emerging SARS-CoV-2 variants was also analyzed.
IMPORTANCE Despite aggressive vaccination drives, the newly emerging SARS-CoV-2 variants are causing havoc in several countries. As per the CDC (USA) and WHO, the VOCs include the B.1.1.7, P.1, B.1.351, and B.1.617.2 lineages, while the VOIs include the B.1.525, B.1.526, P.2, and B.1.427/B.1.429 lineages. This study analyzed the evolutionary patterns, geographical distributions and transmission patterns, circulating frequency, entropy diversity, and mutational event diversity throughout the genome of significant SARS-CoV-2 lineages. A higher transmission pattern was observed for the B.1.1.7 variant. The study also evaluated the mutational landscape and important spike protein mutations (E484K, K417T/N, N501Y, and D614G) of all of the above variants. Finally, the previously published literature was surveyed for the efficacy of vaccines against these variants. The results presented in this article will help design future countrywide pandemic planning strategies for the emerging variants, next-generation vaccine development using alternative wild-type antigens and significant viral antigens, and immediate planning for ongoing vaccination programs worldwide.

7 Citations

Open access | Journal Article | DOI: 10.3390/CELLS10061557
20 Jun 2021-Cells
Abstract: The current SARS-CoV-2 pandemic underscores the importance of understanding the evolution of RNA genomes. While RNA is subject to the formation of lesions similar to those in DNA, the evolutionary and physiological impacts that RNA lesions have on viral genomes are yet to be characterized. Lesions that may drive the evolution of RNA genomes can induce breaks that are repaired by recombination or can cause base substitution mutagenesis, also known as base editing. Over the past decade or so, base editing mutagenesis of DNA genomes has been the subject of many studies, revealing that exposed ssDNA is subject to hypermutation that is involved in the etiology of cancer. However, base editing of RNA genomes has not been studied to the same extent. Recently, hypermutation of single-stranded RNA viral genomes has also been documented, though its role in evolution and population dynamics has yet to be established. Here, we will summarize the current knowledge of key mechanisms and causes of RNA genome instability, covering areas from the RNA world theory to the SARS-CoV-2 pandemic of today. We will also highlight the key questions that remain as they pertain to RNA genome instability, mutation accumulation, and experimental strategies for addressing these questions.

3 Citations

Open access | Journal Article | DOI: 10.1016/J.CRMETH.2021.100093
21 Oct 2021-
Abstract: Low- and middle-income countries (LMICs) are significantly affected by SARS-CoV-2, partially due to their limited capacity for local production and implementation of molecular testing. Here, we provide detailed methods and validation of a molecular toolkit that can be readily produced and deployed using laboratory equipment available in LMICs. Our results show that lab-scale production of enzymes and nucleic acids can supply over 50,000 tests per production batch. The optimized one-step RT-PCR coupled to CRISPR-Cas12a-mediated detection showed a limit of detection of 10^2 ge/μL in a turnaround time of 2 h. The clinical validation indicated an overall sensitivity of 80%-88%, while for middle and high viral load samples (Cq ≤ 31) the sensitivity was 92%-100%. The specificity was 96%-100% regardless of viral load. Furthermore, we show that the toolkit can be used with the mobile laboratory Bento Lab, potentially enabling LMICs to implement detection services in unattended remote regions.
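
For readers unfamiliar with how sensitivity and specificity figures such as those above are derived, here is a minimal sketch using the standard definitions. The counts are made up for illustration and are not the study's data; the function names are likewise hypothetical.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of truly positive samples that the test calls positive."""
    return tp / (tp + fn)

def specificity(tn: int, fp: int) -> float:
    """Fraction of truly negative samples that the test calls negative."""
    return tn / (tn + fp)

# Hypothetical validation counts (NOT from the paper)
tp, fn = 45, 5   # positives by the reference method: detected / missed
tn, fp = 49, 1   # negatives by the reference method: correct / false alarm

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Restricting the positive samples to a viral-load stratum (for example Cq ≤ 31) before counting would give a stratified sensitivity of the kind the abstract reports.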

2 Citations

Open access | Posted Content | DOI: 10.1101/2021.07.05.451089
Xionglei He
05 Jul 2021-bioRxiv
Abstract: The pre-outbreak evolutionary history of SARS-CoV-2 is enigmatic because it shares only ~96% genomic similarity with RaTG13, the closest relative so far found in wild animals (horseshoe bats). Since mutations on single-stranded viral RNA are heavily shaped by host factors, the viral mutation signatures can in turn inform on the host. By comparing publicly available viral genomes, we here inferred the mutations SARS-CoV-2 accumulated before the outbreak and after the split from RaTG13. We found that the mutation spectrum of SARS-CoV-2, which measures the relative rates of 12 mutation types, is 99.9% identical to that of RaTG13. It is also similar to that of two other bat coronaviruses but distinct from that evolved in non-bat hosts. The viral mutation spectrum informed on the activities of a variety of mutation-associated host factors, which were found to be almost identical between SARS-CoV-2 and RaTG13, a pattern difficult to create in the laboratory. All the findings are robust after replacing RaTG13 with RshSTT182, another coronavirus found in horseshoe bats with ~93% similarity to SARS-CoV-2. Our analyses suggest SARS-CoV-2 shared almost the same host environment with RaTG13 and RshSTT182 before the outbreak.
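
A mutation spectrum over the 12 single-nucleotide substitution types can be sketched as follows. The abstract does not say how the similarity between two spectra is scored, so cosine similarity is used here purely as an assumption. The counts are invented, and normalising raw counts is a simplification, since true relative rates would also correct for base composition.

```python
import math
from itertools import permutations

# The 12 directed substitution types between the four RNA bases
TYPES = [a + ">" + b for a, b in permutations("ACGU", 2)]

def spectrum(counts):
    """Normalise raw substitution counts into proportions over the 12 types."""
    total = sum(counts.get(t, 0) for t in TYPES)
    return [counts.get(t, 0) / total for t in TYPES]

def cosine(p, q):
    """Cosine similarity between two spectra (an assumed metric)."""
    dot = sum(x * y for x, y in zip(p, q))
    norm_p = math.sqrt(sum(x * x for x in p))
    norm_q = math.sqrt(sum(y * y for y in q))
    return dot / (norm_p * norm_q)

# Hypothetical counts for two viruses (NOT real data)
virus_a = {"C>U": 400, "U>C": 200, "G>A": 150, "A>G": 150, "G>U": 100}
virus_b = {"C>U": 390, "U>C": 210, "G>A": 140, "A>G": 160, "G>U": 100}

spec_a = spectrum(virus_a)
spec_b = spectrum(virus_b)
similarity = cosine(spec_a, spec_b)
print(f"{similarity:.4f}")
```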

1 Citation


Open access | Book
13 Aug 2009-
Abstract: This book describes ggplot2, a new data visualization package for R that uses the insights from Leland Wilkinson's Grammar of Graphics to create a powerful and flexible system for creating data graphics. With ggplot2, it's easy to: produce handsome, publication-quality plots, with automatic legends created from the plot specification; superpose multiple layers (points, lines, maps, tiles, box plots, to name a few) from different data sources, with automatically adjusted common scales; add customisable smoothers that use the powerful modelling capabilities of R, such as loess, linear models, generalised additive models and robust regression; save any ggplot2 plot (or part thereof) for later modification or reuse; create custom themes that capture in-house or journal style requirements, and that can easily be applied to multiple plots; and approach your graph from a visual perspective, thinking about how each component of the data is represented on the final plot. This book will be useful to everyone who has struggled with displaying their data in an informative and attractive way. You will need some basic knowledge of R (i.e. you should be able to get your data into R), but ggplot2 is a mini-language specifically tailored for producing graphics, and you'll learn everything you need in the book. After reading this book you'll be able to produce graphics customized precisely for your problems, and you'll find it easy to get graphics out of your head and on to the screen or page.

23,839 Citations

Open access | Journal Article | DOI: 10.1056/NEJMOA2001017
Na Zhu, Dingyu Zhang, Wenling Wang, Xingwang Li, +15 more
Abstract: In December 2019, a cluster of patients with pneumonia of unknown cause was linked to a seafood wholesale market in Wuhan, China. A previously unknown betacoronavirus was discovered through the use of unbiased sequencing in samples from patients with pneumonia. Human airway epithelial cells were used to isolate a novel coronavirus, named 2019-nCoV, which formed a clade within the subgenus sarbecovirus, Orthocoronavirinae subfamily. Different from both MERS-CoV and SARS-CoV, 2019-nCoV is the seventh member of the family of coronaviruses that infect humans. Enhanced surveillance and further investigation are ongoing. (Funded by the National Key Research and Development Program of China and the National Major Project for Control and Prevention of Infectious Disease in China.).

15,285 Citations

Open access | Journal Article | DOI: 10.1038/S41586-020-2012-7
Peng Zhou, Xing-Lou Yang, Xian Guang Wang, Ben Hu, +25 more
03 Feb 2020-Nature
Abstract: Since the outbreak of severe acute respiratory syndrome (SARS) 18 years ago, a large number of SARS-related coronaviruses (SARSr-CoVs) have been discovered in their natural reservoir host, bats [1–4]. Previous studies have shown that some bat SARSr-CoVs have the potential to infect humans [5–7]. Here we report the identification and characterization of a new coronavirus (2019-nCoV), which caused an epidemic of acute respiratory syndrome in humans in Wuhan, China. The epidemic, which started on 12 December 2019, had caused 2,794 laboratory-confirmed infections including 80 deaths by 26 January 2020. Full-length genome sequences were obtained from five patients at an early stage of the outbreak. The sequences are almost identical and share 79.6% sequence identity to SARS-CoV. Furthermore, we show that 2019-nCoV is 96% identical at the whole-genome level to a bat coronavirus. Pairwise protein sequence analysis of seven conserved non-structural protein domains shows that this virus belongs to the species of SARSr-CoV. In addition, 2019-nCoV virus isolated from the bronchoalveolar lavage fluid of a critically ill patient could be neutralized by sera from several patients. Notably, we confirmed that 2019-nCoV uses the same cell entry receptor, angiotensin converting enzyme II (ACE2), as SARS-CoV. Characterization of full-length genome sequences from patients infected with a new coronavirus (2019-nCoV) shows that the sequences are nearly identical and indicates that the virus is related to a bat coronavirus.

12,056 Citations

Open access | Journal Article | DOI: 10.1093/BIOINFORMATICS/BTS565
Limin Fu, Beifang Niu, Zhengwei Zhu, Sitao Wu, +1 more
01 Dec 2012-Bioinformatics
Abstract: Summary: CD-HIT is a widely used program for clustering biological sequences to reduce sequence redundancy and improve the performance of other sequence analyses. In response to the rapid increase in the amount of sequencing data produced by the next-generation sequencing technologies, we have developed a new CD-HIT program accelerated with a novel parallelization strategy and some other techniques to allow efficient clustering of such datasets. Our tests demonstrated very good speedup derived from the parallelization for up to ~24 cores and a quasi-linear speedup for up to ~8 cores. The enhanced CD-HIT is capable of handling very large datasets in much shorter time than previous versions. Availability: Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.

3,948 Citations

Open access | Journal Article | DOI: 10.18637/JSS.V035.B01
Abstract: ggplot2: Elegant Graphics for Data Analysis is a new addition to the UseR! series by Springer, probably the fastest expanding source of resources for computational statistics at the current moment. The books in this series are all linked with R, either presenting a new package developed by the book's own authors or describing how to apply statistical techniques with the different packages available in R. ggplot2 is an implementation in R of The Grammar of Graphics (Wilkinson 2005), a systematic approach to the specification of statistical graphics that was introduced in a book previously reviewed in the Journal of Statistical Software by Cox (2007). This implementation has been developed by Hadley Wickham, who is also the author of the book reviewed here.

3,547 Citations