Journal ArticleDOI

RDP4: Detection and analysis of recombination patterns in virus genomes.

01 Mar 2015-Virus Evolution (Oxford University Press)-Vol. 1, Iss: 1
TL;DR: The key feature of RDP4 that differentiates it from other recombination detection tools is its flexibility: it can be run either in fully automated mode from the command line interface or with a graphically rich user interface that enables detailed exploration of both individual recombination events and overall recombination patterns.
Abstract: RDP4 is the latest version of recombination detection program (RDP), a Windows computer program that implements an extensive array of methods for detecting and visualising recombination in, and stripping evidence of recombination from, virus genome sequence alignments. RDP4 is capable of analysing twice as many sequences (up to 2,500) that are up to three times longer (up to 10 Mb) than those that could be analysed by older versions of the program. RDP4 is therefore also applicable to the analysis of bacterial full-genome sequence datasets. Other novelties in RDP4 include (1) the capacity to differentiate between recombination and genome segment reassortment, (2) the estimation of recombination breakpoint confidence intervals, (3) a variety of ‘recombination aware’ phylogenetic tree construction and comparison tools, (4) new matrix-based visualisation tools for examining both individual recombination events and the overall phylogenetic impacts of multiple recombination events and (5) new tests to detect the influences of gene arrangements, encoded protein structure, nucleic acid secondary structure, nucleotide composition, and nucleotide diversity on recombination breakpoint patterns. The key feature of RDP4 that differentiates it from other recombination detection tools is its flexibility. It can be run either in fully automated mode from the command line interface or with a graphically rich user interface that enables detailed exploration of both individual recombination events and overall recombination patterns.
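
To make "stripping evidence of recombination" from an alignment concrete, here is a minimal Python sketch. It is not RDP4's own procedure. It assumes the breakpoints are already known and simply overwrites each recombinant tract with gap characters; the function name `mask_recombinant_regions`, the event layout (sequence name, 0-based start, exclusive end), and the toy sequences are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical event layout: (sequence name, 0-based start, exclusive end).
Event = Tuple[str, int, int]


def mask_recombinant_regions(
    alignment: Dict[str, str], events: List[Event], gap: str = "-"
) -> Dict[str, str]:
    """Return a copy of the alignment with each recombinant tract replaced by gaps."""
    masked = dict(alignment)  # leave the caller's alignment untouched
    for name, start, end in events:
        seq = masked[name]
        if not (0 <= start < end <= len(seq)):
            raise ValueError(f"event out of range for {name}: {start}-{end}")
        # Keep alignment columns intact by substituting gaps, not deleting sites.
        masked[name] = seq[:start] + gap * (end - start) + seq[end:]
    return masked


# Toy data, not taken from any real dataset.
toy_alignment = {"A": "ACGTACGT", "B": "ACGTTCGT"}
toy_events = [("B", 2, 5)]
result = mask_recombinant_regions(toy_alignment, toy_events)
print(result["B"])
```

Replacing the tract with gaps rather than deleting columns keeps every sequence the same length, so the masked alignment can still be passed to downstream tree-building tools.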


Citations
Journal ArticleDOI
TL;DR: SARS-CoV-2 itself is not a recombinant of any sarbecoviruses detected to date, and its receptor-binding motif appears to be an ancestral trait shared with bat viruses and not one acquired recently via recombination.
Abstract: There are outstanding evolutionary questions on the recent emergence of human coronavirus SARS-CoV-2 including the role of reservoir species, the role of recombination and its time of divergence from animal viruses. We find that the sarbecoviruses—the viral subgenus containing SARS-CoV and SARS-CoV-2—undergo frequent recombination and exhibit spatially structured genetic diversity on a regional scale in China. SARS-CoV-2 itself is not a recombinant of any sarbecoviruses detected to date, and its receptor-binding motif, important for specificity to human ACE2 receptors, appears to be an ancestral trait shared with bat viruses and not one acquired recently via recombination. To employ phylogenetic dating methods, recombinant regions of a 68-genome sarbecovirus alignment were removed with three independent methods. Bayesian evolutionary rate and divergence date estimates were shown to be consistent for these three approaches and for two different prior specifications of evolutionary rates based on HCoV-OC43 and MERS-CoV. Divergence dates between SARS-CoV-2 and the bat sarbecovirus reservoir were estimated as 1948 (95% highest posterior density (HPD): 1879–1999), 1969 (95% HPD: 1930–2000) and 1982 (95% HPD: 1948–2009), indicating that the lineage giving rise to SARS-CoV-2 has been circulating unnoticed in bats for decades. In this manuscript, the authors address evolutionary questions on the emergence of SARS-CoV-2. They find that SARS-CoV-2 is not a recombinant of any sarbecoviruses detected to date, and that the bat and pangolin sequences most closely related to SARS-CoV-2 probably diverged several decades ago or possibly earlier from human SARS-CoV-2 samples.

716 citations

Journal ArticleDOI
05 Feb 2021-Nature
TL;DR: In this paper, the authors report chronic SARS-CoV-2 with reduced sensitivity to neutralizing antibodies in an immune suppressed individual treated with convalescent plasma, generating whole genome ultradeep sequences over 23 time points spanning 101 days.
Abstract: SARS-CoV-2 Spike protein is critical for virus infection via engagement of ACE2, and is a major antibody target. Here we report chronic SARS-CoV-2 with reduced sensitivity to neutralising antibodies in an immune suppressed individual treated with convalescent plasma, generating whole genome ultradeep sequences over 23 time points spanning 101 days. Little change was observed in the overall viral population structure following two courses of remdesivir over the first 57 days. However, following convalescent plasma therapy we observed large, dynamic virus population shifts, with the emergence of a dominant viral strain bearing D796H in S2 and ΔH69/ΔV70 in the S1 N-terminal domain (NTD) of the Spike protein. As passively transferred serum antibodies diminished, viruses with the escape genotype diminished in frequency, before returning during a final, unsuccessful course of convalescent plasma. In vitro, the Spike escape double mutant bearing ΔH69/ΔV70 and D796H conferred modestly decreased sensitivity to convalescent plasma, whilst maintaining infectivity similar to wild type. D796H appeared to be the main contributor to decreased susceptibility but incurred an infectivity defect. The ΔH69/ΔV70 single mutant had two-fold higher infectivity compared to wild type, possibly compensating for the reduced infectivity of D796H. These data reveal strong selection on SARS-CoV-2 during convalescent plasma therapy associated with emergence of viral variants with evidence of reduced susceptibility to neutralising antibodies.

651 citations

Journal ArticleDOI
TL;DR: Evidence is shown that the novel coronavirus (2019-nCoV) is not a mosaic and that almost half of its genome belongs to a distinct lineage within the betacoronaviruses, suggesting that the hypothesis that 2019-nCoV originated from bats is very likely.

579 citations


Cites methods from "RDP4: Detection and analysis of rec..."

  • ...Methods: Putative recombination was investigated by RDP4 and Simplot v3....

    [...]

  • ...Putative recombination was investigated by RDP4 (Martin, 2015) and Simplot v3....

    [...]

Journal ArticleDOI
07 May 2020-Nature
TL;DR: It is shown that a coronavirus isolated from a Malayan pangolin has 100%, 98.6%, 97.8% and 90.7% amino acid identity with SARS-CoV-2 in the E, M, N and S proteins, respectively, which suggests that the latter may have originated from a recombination event involving SARS-related coronaviruses from bats and pangolins.
Abstract: The current outbreak of coronavirus disease-2019 (COVID-19) poses unprecedented challenges to global health. The new coronavirus responsible for this outbreak, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), shares high sequence identity with SARS-CoV and a bat coronavirus, RaTG13. Although bats may be the reservoir host for a variety of coronaviruses, it remains unknown whether SARS-CoV-2 has additional host species. Here we show that a coronavirus, which we name pangolin-CoV, isolated from a Malayan pangolin has 100%, 98.6%, 97.8% and 90.7% amino acid identity with SARS-CoV-2 in the E, M, N and S proteins, respectively. In particular, the receptor-binding domain of the S protein of pangolin-CoV is almost identical to that of SARS-CoV-2, with one difference in a noncritical amino acid. Our comparative genomic analysis suggests that SARS-CoV-2 may have originated in the recombination of a virus similar to pangolin-CoV with one similar to RaTG13. Pangolin-CoV was detected in 17 out of the 25 Malayan pangolins that we analysed. Infected pangolins showed clinical signs and histological changes, and circulating antibodies against pangolin-CoV reacted with the S protein of SARS-CoV-2. The isolation of a coronavirus from pangolins that is closely related to SARS-CoV-2 suggests that these animals have the potential to act as an intermediate host of SARS-CoV-2. This newly identified coronavirus from pangolins, the most-trafficked mammal in the illegal wildlife trade, could represent a future threat to public health if wildlife trade is not effectively controlled.
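
The amino acid identities quoted above are pairwise percentages over aligned protein sequences. A minimal sketch of one common way to compute such a figure follows. It assumes the two sequences are already aligned to equal length and that columns containing a gap are skipped, which may differ from the convention the authors used. The function name `percent_identity` and the toy sequences are illustrative only.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent of identical residues over aligned columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(seq_a, seq_b):
        if res_a == gap or res_b == gap:
            continue  # assumption: gapped columns do not count towards identity
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments, not real viral protein sequences.
identity = percent_identity("MKT-LV", "MKTALI")
print(f"{identity:.1f}%")
```

Whether gapped columns are skipped or counted as mismatches changes the result, so the convention should be stated whenever identity values are compared across studies.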

570 citations

Journal ArticleDOI
Nuno R. Faria, Joshua Quick, Ingra Morales Claro, Julien Thézé, J G de Jesus, Marta Giovanetti, Moritz U. G. Kraemer, Sarah C. Hill, Allison Black, A. C. da Costa, L. C Franco, Sandro Patroca da Silva, C-H Wu, Jayna Raghwani, Simon Cauchemez, L. du Plessis, M. P Verotti, W. K. de Oliveira, Eduardo Hage Carmo, Giovanini E. Coelho, A. C. F. S Santelli, L. C Vinhal, Cláudio Maierovitch Pessanha Henriques, Jared T. Simpson, Matthew Loose, Kristian G. Andersen, Nathan D. Grubaugh, Sneha Somasekar, Charles Y. Chiu, José Esteban Muñoz-Medina, César González-Bonilla, Carlos F. Arias, Lia Laura Lewis-Ximenez, Sally A. Baylis, Alexandre Otavio Chieppe, Shirlei Ferreira Aguiar, Carlos Fernandes, Poliana da Silva Lemos, B. L. S Nascimento, Hamilton Antônio de Oliveira Monteiro, Isadora Cristina de Siqueira, M. G. de Queiroz, T. R. de Souza, João Felipe Bezerra, M. R Lemos, Gavin Pereira, D Loudal, L. C Moura, Rafael Dhalia, Rafael F. O. França, T Magalhães, Ernesto T. A. Marques, Thomas Jaenisch, Gabriel Luz Wallau, M. C. de Lima, Vitor H. Nascimento, E. M. de Cerqueira, M. M. de Lima, D. L Mascarenhas, J. P Moura Neto, Anna S. Levin, Tania Regina Tozetto-Mendoza, Silvia Nunes Szente Fonseca, Maria Cassia Mendes-Correa, Flavio Augusto de Pádua Milagres, Aluísio Augusto Cotrim Segurado, Edward C. Holmes, Andrew Rambaut, Trevor Bedford, Márcio Roberto Teixeira Nunes, Ester Cerdeira Sabino, Luiz Carlos Junior Alcantara, Nicholas J. Loman, Oliver G. Pybus
15 Jun 2017-Nature
TL;DR: The origin and epidemic history of ZIKV in Brazil and the Americas remain poorly understood, despite the value of this information for interpreting observed trends in reported microcephaly and other birth defects.
Abstract: Transmission of Zika virus (ZIKV) in the Americas was first confirmed in May 2015 in northeast Brazil. Brazil has had the highest number of reported ZIKV cases worldwide (more than 200,000 by 24 December 2016) and the most cases associated with microcephaly and other birth defects (2,366 confirmed by 31 December 2016). Since the initial detection of ZIKV in Brazil, more than 45 countries in the Americas have reported local ZIKV transmission, with 24 of these reporting severe ZIKV-associated disease. However, the origin and epidemic history of ZIKV in Brazil and the Americas remain poorly understood, despite the value of this information for interpreting observed trends in reported microcephaly. Here we address this issue by generating 54 complete or partial ZIKV genomes, mostly from Brazil, and reporting data generated by a mobile genomics laboratory that travelled across northeast Brazil in 2016. One sequence represents the earliest confirmed ZIKV infection in Brazil. Analyses of viral genomes with ecological and epidemiological data yield an estimate that ZIKV was present in northeast Brazil by February 2014 and is likely to have disseminated from there, nationally and internationally, before the first detection of ZIKV in the Americas. Estimated dates for the international spread of ZIKV from Brazil indicate the duration of pre-detection cryptic transmission in recipient regions. The role of northeast Brazil in the establishment of ZIKV in the Americas is further supported by geographic analysis of ZIKV transmission potential and by estimates of the basic reproduction number of the virus.

470 citations

References
Journal ArticleDOI
TL;DR: This work presents some of the most notable new features and extensions of RAxML, such as a substantial extension of substitution models and supported data types, the introduction of SSE3, AVX and AVX2 vector intrinsics, techniques for reducing the memory requirements of the code and a plethora of operations for conducting post-analyses on sets of trees.
Abstract: Motivation: Phylogenies are increasingly used in all fields of medical and biological research. Moreover, because of the next-generation sequencing revolution, datasets used for conducting phylogenetic analyses grow at an unprecedented pace. RAxML (Randomized Axelerated Maximum Likelihood) is a popular program for phylogenetic analyses of large datasets under maximum likelihood. Since the last RAxML paper in 2006, it has been continuously maintained and extended to accommodate the increasingly growing input datasets and to serve the needs of the user community. Results: I present some of the most notable new features and extensions of RAxML, such as a substantial extension of substitution models and supported data types, the introduction of SSE3, AVX and AVX2 vector intrinsics, techniques for reducing the memory requirements of the code and a plethora of operations for conducting postanalyses on sets of trees. In addition, an up-to-date 50-page user manual covering all new RAxML options is available. Availability and implementation: The code is available under GNU

23,838 citations


"RDP4: Detection and analysis of rec..." refers background or methods in this paper

  • ...RDP4 can also be used to directly construct minimum evolution (with FastTree2; Price, Dehal, and Arkin 2010) and maximum-likelihood (with RAxML8; Stamatakis 2014) phylogenetic trees that account for the recombination events that it has detected....

    [...]

  • ...Further, the program can carry out ‘recombination aware’ inferences of ancestral sequences using parsimony (with PHYLIP; Felsenstein 1989), maximum likelihood (with RAxML8; Stamatakis 2014), or Bayesian (with MrBayes3....

    [...]

  • ...Phylogenetic incompatibility visualisations of the overall phylogenetic impacts of recombination within datasets (Fig. 2e; Jakobsen and Easteal 1996; Shimodaira and Hasegawa 2001; Simmonds and Welch 2006; Rousseau et al. 2007; Stamatakis 2014)....

    [...]

Journal ArticleDOI
TL;DR: The new version provides convergence diagnostics and allows multiple analyses to be run in parallel with convergence progress monitored on the fly, and provides more output options than previously, including samples of ancestral states, site rates, site dN/dS ratios, branch rates, and node dates.
Abstract: Since its introduction in 2001, MrBayes has grown in popularity as a software package for Bayesian phylogenetic inference using Markov chain Monte Carlo (MCMC) methods. With this note, we announce the release of version 3.2, a major upgrade to the latest official release presented in 2003. The new version provides convergence diagnostics and allows multiple analyses to be run in parallel with convergence progress monitored on the fly. The introduction of new proposals and automatic optimization of tuning parameters has improved convergence for many problems. The new version also sports significantly faster likelihood calculations through streaming single-instruction-multiple-data extensions (SSE) and support of the BEAGLE library, allowing likelihood calculations to be delegated to graphics processing units (GPUs) on compatible hardware. Speedup factors range from around 2 with SSE code to more than 50 with BEAGLE for codon problems. Checkpointing across all models allows long runs to be completed even when an analysis is prematurely terminated. New models include relaxed clocks, dating, model averaging across time-reversible substitution models, and support for hard, negative, and partial (backbone) tree constraints. Inference of species trees from gene trees is supported by full incorporation of the Bayesian estimation of species trees (BEST) algorithms. Marginal model likelihoods for Bayes factor tests can be estimated accurately across the entire model space using the stepping stone method. The new version provides more output options than previously, including samples of ancestral states, site rates, site d(N)/d(S) ratios, branch rates, and node dates. A wide range of statistics on tree parameters can also be output for visualization in FigTree and compatible software.

18,718 citations

Journal Article

16,851 citations

Journal ArticleDOI
10 Mar 2010-PLOS ONE
TL;DR: Improvements to FastTree are described that improve its accuracy without sacrificing scalability, and FastTree 2 allows the inference of maximum-likelihood phylogenies for huge alignments.
Abstract: Background: We recently described FastTree, a tool for inferring phylogenies for alignments with up to hundreds of thousands of sequences. Here, we describe improvements to FastTree that improve its accuracy without sacrificing scalability.

10,010 citations


"RDP4: Detection and analysis of rec..." refers background in this paper

  • ...RDP4 can also be used to directly construct minimum evolution (with FastTree2; Price, Dehal, and Arkin 2010) and maximum-likelihood (with RAxML8; Stamatakis 2014) phylogenetic trees that account for the recombination events that it has detected....

    [...]

Journal ArticleDOI
TL;DR: BEAST 2 now has a fully developed package management system that allows third party developers to write additional functionality that can be directly installed to the BEAST 2 analysis platform via a package manager without requiring a new software release of the platform.
Abstract: We present a new open source, extensible and flexible software platform for Bayesian evolutionary analysis called BEAST 2. This software platform is a re-design of the popular BEAST 1 platform to correct structural deficiencies that became evident as the BEAST 1 software evolved. Key among those deficiencies was the lack of post-deployment extensibility. BEAST 2 now has a fully developed package management system that allows third party developers to write additional functionality that can be directly installed to the BEAST 2 analysis platform via a package manager without requiring a new software release of the platform. This package architecture is showcased with a number of recently published new models encompassing birth-death-sampling tree priors, phylodynamics and model averaging for substitution models and site partitioning. A second major improvement is the ability to read/write the entire state of the MCMC chain to/from disk allowing it to be easily shared between multiple instances of the BEAST software. This facilitates checkpointing and better support for multi-processor and high-end computing extensions. Finally, the functionality in new packages can be easily added to the user interface (BEAUti 2) by a simple XML template-based mechanism because BEAST 2 has been re-designed to provide greater integration between the analysis engine and the user interface so that, for example BEAST and BEAUti use exactly the same XML file format.

5,183 citations