Open access - Posted Content - DOI: 10.1101/2021.02.26.21252227

Sequence Analysis of 20,453 SARS-CoV-2 Genomes from the Houston Metropolitan Area Identifies the Emergence and Widespread Distribution of Multiple Isolates of All Major Variants of Concern

02 Mar 2021-medRxiv (Cold Spring Harbor Laboratory Press)
Abstract: Since the beginning of the SARS-CoV-2 pandemic, there has been international concern about the emergence of virus variants with mutations that increase transmissibility, enhance escape from the human immune response, or otherwise alter biologically important phenotypes. In late 2020, several "variants of concern" emerged globally, including the UK variant (B.1.1.7), South Africa variant (B.1.351), Brazil variants (P.1 and P.2), and two related California "variants of interest" (B.1.429 and B.1.427). These variants are believed to have enhanced transmissibility capacity. For the South Africa and Brazil variants, there is evidence that mutations in spike protein permit it to escape from some vaccines and therapeutic monoclonal antibodies. Based on our extensive genome sequencing program involving 20,453 virus specimens from COVID-19 patients dating from March 2020, we report identification of all important SARS-CoV-2 variants among Houston Methodist Hospital patients residing in the greater metropolitan area. Although these variants are currently at relatively low frequency in the population, they are geographically widespread. Houston is the first city in the United States to have all variants documented by genome sequencing. As vaccine deployment accelerates worldwide, increased genomic surveillance of SARS-CoV-2 is essential to understanding the presence and frequency of consequential variants and their patterns and trajectory of dissemination. This information is critical for medical and public health efforts to effectively address and mitigate this global crisis.

Open access - Posted Content - DOI: 10.1101/2021.03.10.21253321
12 Mar 2021-medRxiv
Abstract: Real-time epidemiological tracking of variants of interest can help limit the spread of more contagious forms of SARS-CoV-2, such as those containing the N501Y mutation. Typically, genetic sequencing is required to be able to track variants of interest in real time. However, sequencing can take time and may not be accessible in all laboratories. Genotyping by RT-ddPCR offers an alternative to sequencing to rapidly detect variants of concern through discrimination of specific mutations such as N501Y, which is associated with increased transmissibility. Here we describe the first cases of the B.1.1.7 lineage of SARS-CoV-2 detected in Washington State by using a combination of RT-PCR, RT-ddPCR, and next-generation sequencing. We screened 1,035 samples positive for SARS-CoV-2 by our CDC-based laboratory-developed assay using ThermoFisher's multiplex RT-PCR COVID-19 assay over four weeks from late December 2020 to early January 2021. S gene dropout candidates were subsequently assayed by RT-ddPCR to confirm four mutations within the S gene associated with the B.1.1.7 lineage: a deletion at amino acid (AA) 69-70 (ACATGT), a deletion at AA 145 (TTA), the N501Y mutation (TAT), and the S982A mutation (GCA). All four targets were detected in two specimens, and follow-up sequencing revealed a total of 10 mutations in the S gene and phylogenetic clustering within the B.1.1.7 lineage. As variants of concern become increasingly prevalent, molecular diagnostic tools like RT-ddPCR can be utilized to quickly, accurately, and sensitively distinguish more contagious lineages of SARS-CoV-2.
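The two-step screen described in this abstract (S gene dropout on RT-PCR, followed by RT-ddPCR confirmation of four S gene targets) can be sketched roughly as below. This is only an illustration: the function name, the target labels, and the dictionary layout are assumptions made for the example and do not reproduce the authors' pipeline.

```python
# Hypothetical sketch; names and data layout are assumptions, not the authors' code.
# The four S gene targets named in the abstract, with illustrative labels.
B117_TARGETS = ("del69-70", "del145", "N501Y", "S982A")


def classify_sample(s_gene_dropout: bool, ddpcr_hits: dict) -> str:
    """Return a screening label for one SARS-CoV-2-positive sample."""
    # Step 1: only S gene dropout samples go on to RT-ddPCR genotyping.
    if not s_gene_dropout:
        return "not a candidate"
    # Step 2: count how many of the four targets RT-ddPCR detected.
    detected = [t for t in B117_TARGETS if ddpcr_hits.get(t, False)]
    if len(detected) == len(B117_TARGETS):
        return "presumptive B.1.1.7 (confirm by sequencing)"
    return "S gene dropout, partial target match"
```

A sample with S gene dropout and all four targets detected would be labeled presumptive B.1.1.7, mirroring the abstract's use of sequencing as the final confirmation step.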

Open access - Posted Content - DOI: 10.1101/2021.03.30.437622
30 Mar 2021-bioRxiv
Abstract: The SARS-CoV-2 spike (S) protein is a critical component of subunit vaccines and a target for neutralizing antibodies. Spike is also undergoing immunogenic selection with clinical variants that increase infectivity and partially escape convalescent plasma. Here, we describe spike display, a high-throughput platform to rapidly characterize glycosylated spike ectodomains across multiple coronavirus-family proteins. We assayed ∼200 variant SARS-CoV-2 spikes for their expression, ACE2 binding, and recognition by thirteen neutralizing antibodies (nAbs). An alanine scan of all five N-terminal domain (NTD) loops highlights a public class of epitopes in the N1, N3, and N5 loops that are recognized by most of the NTD-binding nAbs. Some clinical NTD substitutions abrogate binding to these epitopes but are circulating at low frequencies around the globe. NTD mutations in variants of concern B.1.1.7 (United Kingdom), B.1.351 (South Africa), B.1.1.248 (Brazil), and B.1.427/B.1.429 (California) impact spike expression and escape most NTD-targeting nAbs. However, two classes of NTD nAbs still bind B.1.1.7 spikes and neutralize in pseudoviral assays. B.1.351 and B.1.1.248 include compensatory mutations that either increase spike expression or increase ACE2 binding affinity. Finally, B.1.351 and B.1.1.248 completely escape a potent ACE2 peptide mimic. We anticipate that spike display will accelerate antigen design, deep scanning mutagenesis, and antibody epitope mapping for SARS-CoV-2 and other emerging viral threats.

Open access - Posted Content - DOI: 10.1101/2021.09.27.461949
28 Sep 2021-bioRxiv
Abstract: The ARTIC Network provides a common resource of PCR primer sequences and recommendations for amplifying SARS-CoV-2 genomes. The initial tiling strategy was developed with the reference genome Wuhan-01, and subsequent iterations have addressed areas of low amplification and sequence dropout. Recently, a new version (V4) was released, based on new variant genome sequences, in response to the realization that some V3 primers were located in regions with key mutations. Herein, we compare the performance of the ARTIC V3 and V4 primer sets with a matched set of 663 SARS-CoV-2 clinical samples sequenced with an Illumina NovaSeq 6000 instrument. We observe general improvements in sequencing depth and quality, and improved resolution of the SNP causing the D950N variation in the spike protein. Importantly, we also find nearly universal presence of spike protein substitution G142D in Delta-lineage samples. Due to the prior release and widespread use of the ARTIC V3 primers during the initial surge of the Delta variant, it is likely that the G142D amino acid substitution is substantially underrepresented among early Delta variant genomes deposited in public repositories. In addition to the improved performance of the ARTIC V4 primer set, this study also illustrates the importance of the primer scheme in downstream analyses. Importance: ARTIC Network primers are commonly used by laboratories worldwide to amplify and sequence SARS-CoV-2 present in clinical samples. As new variants have evolved and spread, it was found that the V3 primer set poorly amplified several key mutations. In this report, we compare the results of sequencing a matched set of samples with the V3 and V4 primer sets. We find that adoption of the ARTIC V4 primer set is critical for accurate sequencing of the SARS-CoV-2 spike region. The absence of metadata describing the primer scheme used will negatively impact the downstream use of publicly available SARS-CoV-2 sequencing reads and assembled genomes.
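The problem this abstract describes (a mutation that falls inside a primer binding site can impair amplification of that region) can be illustrated with a simple interval check. The sketch below is a hedged example only: the primer names and coordinates are hypothetical placeholders, not real ARTIC V3 or V4 coordinates.

```python
from typing import List, NamedTuple


class Primer(NamedTuple):
    name: str
    start: int  # 1-based, inclusive genome coordinate
    end: int    # 1-based, inclusive genome coordinate


def primers_hit_by_mutation(primers: List[Primer], position: int) -> List[str]:
    """Return names of primers whose binding site covers the given genome position."""
    return [p.name for p in primers if p.start <= position <= p.end]


# Hypothetical coordinates for illustration only.
scheme = [Primer("example_LEFT", 100, 122), Primer("example_RIGHT", 480, 502)]
print(primers_hit_by_mutation(scheme, 110))  # prints ['example_LEFT']
```

Recording which primer scheme produced a dataset is what makes a check like this possible downstream, which is the point the abstract makes about metadata.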

Open access - Posted Content - DOI: 10.1101/2021.03.16.21253753
24 Mar 2021-medRxiv
Abstract: Genetic variants of the SARS-CoV-2 virus have become of great interest worldwide because they have the potential to detrimentally alter the course of the SARS-CoV-2 pandemic, and disease in individual patients. We recently sequenced 20,453 SARS-CoV-2 genomes from patients with COVID-19 disease in metropolitan Houston (population 7 million), dating from March 2020 to early February 2021. We discovered that all major variants of concern or interest are circulating in the region. To follow up on this discovery, we analyzed 8,857 genome sequences from patients in eight Houston Methodist hospitals dispersed throughout the metroplex diagnosed from January 1, 2021 to March 7, 2021. This sample represents 94% of Houston Methodist cases and 4.8% of all reported cases in metropolitan Houston during this period. We discovered rapid, widespread, and preferential increase of the SARS-CoV-2 UK B.1.1.7 variant throughout the region. The estimated case doubling time in the Houston area is 6.9 days. None of the 648 UK B.1.1.7 samples identified had the E484K change in spike protein that can cause decreased recognition by antibodies.
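To make the reported 6.9-day doubling time concrete, the sketch below converts a doubling time into a daily growth rate and a fold increase. It assumes constant exponential growth, which is a simplification introduced here and not a claim from the abstract; the function names are illustrative.

```python
import math


def growth_rate_from_doubling_time(doubling_days: float) -> float:
    """Exponential growth rate per day, r = ln(2) / T_d."""
    return math.log(2) / doubling_days


def fold_increase(doubling_days: float, elapsed_days: float) -> float:
    """Fold change in case counts after elapsed_days under constant exponential growth."""
    return 2 ** (elapsed_days / doubling_days)


# A 6.9-day doubling time corresponds to roughly 0.10 per day.
r = growth_rate_from_doubling_time(6.9)
print(round(r, 3), round(fold_increase(6.9, 13.8), 1))
```

Under this assumption, two doubling periods (13.8 days) give a fourfold increase.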

Open access - Journal Article - DOI: 10.3389/FPUBH.2021.696664
Huaimin Yi, Jin Wang, Jiong Wang, Yuying Lu, +5 more authors
Abstract: Since severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) began to spread in late 2019, laboratories around the world have widely used whole genome sequencing (WGS) to continuously monitor changes in the viral genes and have discovered multiple subtypes or branches that evolved from SARS-CoV-2. Recently, several novel SARS-CoV-2 variants have been found to be more transmissible. They may affect the immune response caused by vaccines and natural infections and reduce the sensitivity to neutralizing antibodies. We analyze the distribution characteristics of prevalent SARS-CoV-2 variants and the frequency of mutant sites based on the data available from GISAID and PANGO, using R 4.0.2 and ArcGIS 10.2. Our analysis suggests that B.1.1.7, B.1.351, and P.1 spread more easily than other variants, and that the key mutations of the S protein, including N501Y, E484K, and K417N/T, have high mutant frequencies and may have become the main genotypes for the spread of SARS-CoV-2.

Open access - Journal Article - DOI: 10.1093/BIOINFORMATICS/BTP352
Heng Li, Bob Handsaker, Alec Wysoker, T. J. Fennell, +5 more authors
01 Aug 2009-Bioinformatics
Abstract: Summary: The Sequence Alignment/Map (SAM) format is a generic alignment format for storing read alignments against reference sequences, supporting short and long reads (up to 128 Mbp) produced by different sequencing platforms. It is flexible in style, compact in size, efficient in random access and is the format in which alignments from the 1000 Genomes Project are released. SAMtools implements various utilities for post-processing alignments in the SAM format, such as indexing, variant caller and alignment viewer, and thus provides universal tools for processing read alignments.
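As a rough illustration of the format this abstract describes, a SAM alignment record is a tab-separated line with 11 mandatory fields followed by optional tags. The sketch below parses one such line in plain Python; the field names follow the current SAM specification (some were named differently in 2009), the example record is made up, and this is not how SAMtools itself is implemented.

```python
from typing import Dict

# The 11 mandatory fields of a SAM alignment line, in order.
SAM_FIELDS = ["QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
              "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL"]
INT_FIELDS = {"FLAG", "POS", "MAPQ", "PNEXT", "TLEN"}


def parse_sam_line(line: str) -> Dict[str, object]:
    """Split one SAM alignment record into its mandatory fields plus optional tags."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 11:
        raise ValueError("a SAM alignment line needs at least 11 tab-separated fields")
    record: Dict[str, object] = {}
    for name, value in zip(SAM_FIELDS, cols):
        # Numeric fields are converted to int; the rest stay as strings.
        record[name] = int(value) if name in INT_FIELDS else value
    record["TAGS"] = cols[11:]  # optional TAG:TYPE:VALUE fields, left unparsed
    return record


# Made-up record for illustration only.
example = "read1\t0\tref\t7\t30\t8M\t*\t0\t0\tTTAGATAA\t*"
rec = parse_sam_line(example)
print(rec["RNAME"], rec["POS"], rec["CIGAR"])
```

For real data, an established library or SAMtools itself would normally be preferable to hand-rolled parsing.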

Open access - Journal Article - DOI: 10.1016/S0140-6736(20)30183-5
Chaolin Huang, Yeming Wang, Xingwang Li, Lili Ren, +25 more authors
24 Jan 2020-The Lancet
Abstract: A recent cluster of pneumonia cases in Wuhan, China, was caused by a novel betacoronavirus, the 2019 novel coronavirus (2019-nCoV). We report the epidemiological, clinical, laboratory, and radiological characteristics and treatment and clinical outcomes of these patients. All patients with suspected 2019-nCoV were admitted to a designated hospital in Wuhan. We prospectively collected and analysed data on patients with laboratory-confirmed 2019-nCoV infection by real-time RT-PCR and next-generation sequencing. Data were obtained with standardised data collection forms shared by the International Severe Acute Respiratory and Emerging Infection Consortium from electronic medical records. Researchers also directly communicated with patients or their families to ascertain epidemiological and symptom data. Outcomes were also compared between patients who had been admitted to the intensive care unit (ICU) and those who had not.

Open access - Journal Article - DOI: 10.1056/NEJMOA2001017
Na Zhu, Dingyu Zhang, Wenling Wang, Xingwang Li, +15 more authors
Abstract: In December 2019, a cluster of patients with pneumonia of unknown cause was linked to a seafood wholesale market in Wuhan, China. A previously unknown betacoronavirus was discovered through the use of unbiased sequencing in samples from patients with pneumonia. Human airway epithelial cells were used to isolate a novel coronavirus, named 2019-nCoV, which formed a clade within the subgenus sarbecovirus, Orthocoronavirinae subfamily. Different from both MERS-CoV and SARS-CoV, 2019-nCoV is the seventh member of the family of coronaviruses that infect humans. Enhanced surveillance and further investigation are ongoing. (Funded by the National Key Research and Development Program of China and the National Major Project for Control and Prevention of Infectious Disease in China.).

Open access - Journal Article - DOI: 10.1038/S41586-020-2008-3
Fan Wu, Su Zhao, Bin Yu, Yan-Mei Chen, +17 more authors
03 Feb 2020-Nature
Abstract: Emerging infectious diseases, such as severe acute respiratory syndrome (SARS) and Zika virus disease, present a major threat to public health [1–3]. Despite intense research efforts, how, when and where new diseases appear are still a source of considerable uncertainty. A severe respiratory disease was recently reported in Wuhan, Hubei province, China. As of 25 January 2020, at least 1,975 cases had been reported since the first patient was hospitalized on 12 December 2019. Epidemiological investigations have suggested that the outbreak was associated with a seafood market in Wuhan. Here we study a single patient who was a worker at the market and who was admitted to the Central Hospital of Wuhan on 26 December 2019 while experiencing a severe respiratory syndrome that included fever, dizziness and a cough. Metagenomic RNA sequencing [4] of a sample of bronchoalveolar lavage fluid from the patient identified a new RNA virus strain from the family Coronaviridae, which is designated here ‘WH-Human 1’ coronavirus (and has also been referred to as ‘2019-nCoV’). Phylogenetic analysis of the complete viral genome (29,903 nucleotides) revealed that the virus was most closely related (89.1% nucleotide similarity) to a group of SARS-like coronaviruses (genus Betacoronavirus, subgenus Sarbecovirus) that had previously been found in bats in China [5]. This outbreak highlights the ongoing ability of viral spill-over from animals to cause severe disease in humans. Phylogenetic and metagenomic analyses of the complete viral genome of a new coronavirus from the family Coronaviridae reveal that the virus is closely related to a group of SARS-like coronaviruses found in bats in China.

Open access - Journal Article - DOI: 10.1016/S0140-6736(20)30154-9
Jasper Fuk-Woo Chan, Shuofeng Yuan, Kin-Hang Kok, Kelvin K. W. To, +19 more authors
15 Feb 2020-The Lancet
Abstract: Summary. Background: An ongoing outbreak of pneumonia associated with a novel coronavirus was reported in Wuhan city, Hubei province, China. Affected patients were geographically linked with a local wet market as a potential source. No data on person-to-person or nosocomial transmission have been published to date. Methods: In this study, we report the epidemiological, clinical, laboratory, radiological, and microbiological findings of five patients in a family cluster who presented with unexplained pneumonia after returning to Shenzhen, Guangdong province, China, after a visit to Wuhan, and an additional family member who did not travel to Wuhan. Phylogenetic analysis of genetic sequences from these patients was done. Findings: From Jan 10, 2020, we enrolled a family of six patients who travelled to Wuhan from Shenzhen between Dec 29, 2019 and Jan 4, 2020. Of six family members who travelled to Wuhan, five were identified as infected with the novel coronavirus. Additionally, one family member, who did not travel to Wuhan, became infected with the virus after several days of contact with four of the family members. None of the family members had contacts with Wuhan markets or animals, although two had visited a Wuhan hospital. Five family members (aged 36–66 years) presented with fever, upper or lower respiratory tract symptoms, or diarrhoea, or a combination of these 3–6 days after exposure. They presented to our hospital (The University of Hong Kong-Shenzhen Hospital, Shenzhen) 6–10 days after symptom onset. They and one asymptomatic child (aged 10 years) had radiological ground-glass lung opacities. Older patients (aged >60 years) had more systemic symptoms, extensive radiological ground-glass lung changes, lymphopenia, thrombocytopenia, and increased C-reactive protein and lactate dehydrogenase levels. The nasopharyngeal or throat swabs of these six patients were negative for known respiratory microbes by point-of-care multiplex RT-PCR, but five patients (four adults and the child) were RT-PCR positive for genes encoding the internal RNA-dependent RNA polymerase and surface Spike protein of this novel coronavirus, which were confirmed by Sanger sequencing. Phylogenetic analysis of these five patients' RT-PCR amplicons and two full genomes by next-generation sequencing showed that this is a novel coronavirus, which is closest to the bat severe acute respiratory syndrome (SARS)-related coronaviruses found in Chinese horseshoe bats. Interpretation: Our findings are consistent with person-to-person transmission of this novel coronavirus in hospital and family settings, and the reports of infected travellers in other geographical regions. Funding: The Shaw Foundation Hong Kong, Michael Seak-Kan Tong, Respiratory Viral Research Foundation Limited, Hui Ming, Hui Hoy and Chow Sin Lan Charity Fund Limited, Marina Man-Wai Lee, the Hong Kong Hainan Commercial Association South China Microbiology Research Fund, Sanming Project of Medicine (Shenzhen), and High Level-Hospital Program (Guangdong Health Commission).

