Author

Rong Zhang

Bio: Rong Zhang is an academic researcher from the University of Auckland. The author has an h-index of 2 and has co-authored 3 publications receiving 15 citations.

Papers
Journal ArticleDOI
TL;DR: In this paper, the internal complexities of the relaxed clock model were explored in order to develop efficient MCMC operators for Bayesian phylogenetic inference, and an adaptive operator was introduced to learn the weights of other operators during MCMC.
Abstract: Relaxed clock models enable estimation of molecular substitution rates across lineages and are widely used in phylogenetics for dating evolutionary divergence times. Under the (uncorrelated) relaxed clock model, tree branches are associated with molecular substitution rates which are independently and identically distributed. In this article we delved into the internal complexities of the relaxed clock model in order to develop efficient MCMC operators for Bayesian phylogenetic inference. We compared three substitution rate parameterisations, introduced an adaptive operator which learns the weights of other operators during MCMC, and we explored how relaxed clock model estimation can benefit from two cutting-edge proposal kernels: the AVMVN and Bactrian kernels. This work has produced an operator scheme that is up to 65 times more efficient at exploring continuous relaxed clock parameters compared with previous setups, depending on the dataset. Finally, we explored variants of the standard narrow exchange operator which are specifically designed for the relaxed clock model. In the most extreme case, this new operator traversed tree space 40% more efficiently than narrow exchange. The methodologies introduced are adaptive and highly effective on short as well as long alignments. The results are available via the open source optimised relaxed clock (ORC) package for BEAST 2 under a GNU licence (https://github.com/jordandouglas/ORC).
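As a rough illustration of the uncorrelated relaxed clock described in the abstract above, the sketch below draws independent and identically distributed branch rates and converts branch durations into expected substitutions. The log-normal rate distribution with mean 1, and the names `sample_branch_rates`, `expected_substitutions`, `sigma` and `clock_rate`, are assumptions made for this example; they are not taken from the paper or from the ORC package.

```python
import random


def sample_branch_rates(n_branches, sigma, rng):
    """Draw one i.i.d. rate per branch (assumed log-normal, mean 1)."""
    # For a log-normal, E[rate] = exp(mu + sigma^2 / 2); this mu gives mean 1.
    mu = -0.5 * sigma ** 2
    return [rng.lognormvariate(mu, sigma) for _ in range(n_branches)]


def expected_substitutions(rates, durations, clock_rate):
    """Expected substitutions per site on each branch: clock_rate * rate * time."""
    return [clock_rate * r * t for r, t in zip(rates, durations)]


if __name__ == "__main__":
    rng = random.Random(1)
    durations = [0.5, 1.2, 0.3, 2.0]  # hypothetical branch durations
    rates = sample_branch_rates(len(durations), sigma=0.5, rng=rng)
    print(expected_substitutions(rates, durations, clock_rate=0.01))
```

With `sigma` set to 0 every rate equals 1, so the sketch reduces to a strict clock in which all branches share `clock_rate`.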

37 citations

Posted ContentDOI
09 Sep 2020-bioRxiv
TL;DR: This article delves into the internal complexities of the relaxed clock model in order to develop efficient MCMC operators for Bayesian phylogenetic inference, and explores how a range of optimisations can improve statistical inference under the relaxed clock.
Abstract: Uncorrelated relaxed clock models enable estimation of molecular substitution rates across lineages and are widely used in phylogenetics for dating evolutionary divergence times. In this article we delved into the internal complexities of the relaxed clock model in order to develop efficient MCMC operators for Bayesian phylogenetic inference. We compared three substitution rate parameterisations, introduced an adaptive operator which learns the weights of other operators during MCMC, and we explored how relaxed clock model estimation can benefit from two cutting-edge proposal kernels: the AVMVN and Bactrian kernels. This work has produced an operator scheme that is up to 65 times more efficient at exploring continuous relaxed clock parameters compared with previous setups, depending on the dataset. Finally, we explored variants of the standard narrow exchange operator which are specifically designed for the relaxed clock model. In the most extreme case, this new operator traversed tree space 40% more efficiently than narrow exchange. The methodologies introduced are adaptive and highly effective on short as well as long alignments. The results are available via the open source optimised relaxed clock (ORC) package for BEAST 2 under a GNU licence (https://github.com/jordandouglas/ORC).

Author summary: Biological sequences, such as DNA, accumulate mutations over generations. By comparing such sequences in a phylogenetic framework, the evolutionary tree of lifeforms can be inferred. With the overwhelming availability of biological sequence data, and the increasing affordability of collecting new data, the development of fast and efficient phylogenetic algorithms is more important than ever. In this article we focus on the relaxed clock model, which is very popular in phylogenetics. We explored how a range of optimisations can improve the statistical inference of the relaxed clock. This work has produced a phylogenetic setup which can infer parameters related to the relaxed clock up to 65 times faster than previous setups, depending on the dataset. The methods introduced adapt to the dataset during computation and are highly efficient when processing long biological sequences.

27 citations

Posted ContentDOI
22 Apr 2021-bioRxiv
TL;DR: This work implements, benchmarks and validates popular phylogenetic models for the study of paleontological and neontological continuous trait data, incorporating these models into the BEAST2 platform, and illustrates and advances the paradigm of Bayesian, probabilistic total evidence.
Abstract: Time-scaled phylogenetic trees are both an ultimate goal of evolutionary biology and a necessary ingredient in comparative studies. While accumulating genomic data has moved the field closer to a full description of the tree of life, the relative timing of certain evolutionary events remains challenging even when this data is abundant, and absolute timing is impossible without external information such as fossil ages and morphology. The field of phylogenetics lacks efficient tools integrating probabilistic models for these kinds of data into unified frameworks for estimating phylogenies. Here, we implement, benchmark and validate popular phylogenetic models for the study of paleontological and neontological continuous trait data, incorporating these models into the BEAST2 platform. Our methods scale well with number of taxa and of characters. We tip-date and estimate the topology of a phylogeny of Carnivora, comparing results from different configurations of integrative models capable of leveraging ages, as well as molecular and continuous morphological data from living and extinct species. Our results illustrate and advance the paradigm of Bayesian, probabilistic total evidence, in which explanatory models are fully defined, and inferential uncertainty in all their dimensions is accounted for.

4 citations


Cited by
Journal ArticleDOI

3,734 citations

Journal Article
TL;DR: A variety of local- and relaxed-clock methods have been proposed and implemented for phylogenetic divergence dating. Local-clock methods permit different molecular clocks in different parts of the phylogenetic tree, thereby retaining the advantages of the classical molecular clock while casting off the restrictive assumption of a single, global rate of substitution.
Abstract: The estimation of phylogenetic divergence times from sequence data is an important component of many molecular evolutionary studies. There is now a general appreciation that the procedure of divergence dating is considerably more complex than that initially described in the 1960s by Zuckerkandl and Pauling (1962, 1965). In particular, there has been much critical attention toward the assumption of a global molecular clock, resulting in the development of increasingly sophisticated techniques for inferring divergence times from sequence data. In response to the documentation of widespread departures from clocklike behavior, a variety of local- and relaxed-clock methods have been proposed and implemented. Local-clock methods permit different molecular clocks in different parts of the phylogenetic tree, thereby retaining the advantages of the classical molecular clock while casting off the restrictive assumption of a single, global rate of substitution (Rambaut and Bromham 1998; Yoder and Yang 2000).

707 citations

01 Jan 1965
TL;DR: Different types of molecules are discussed in relation to their fitness for providing the basis for a molecular phylogeny, i.e. the different types of macromolecules that carry the genetic information or a very extensive translation thereof.
Abstract: Different types of molecules are discussed in relation to their fitness for providing the basis for a molecular phylogeny. Best fit are the “semantides”, i.e. the different types of macromolecules that carry the genetic information or a very extensive translation thereof. The fact that more than one coding triplet may code for a given amino acid residue in a polypeptide leads to the notion of “isosemantic substitutions” in genic and messenger polynucleotides. Such substitutions lead to differences in nucleotide sequence that are not expressed by differences in amino acid sequence. Some possible consequences of isosemanticism are discussed.

117 citations

Posted ContentDOI
09 Jan 2021-bioRxiv
TL;DR: This work leverages 155 genome assemblies, from 149 species, to generate a fossil-calibrated phylogeny and conduct multilocus tests for introgression across nine monophyletic radiations within the genus Drosophila, providing the first evidence of introgression occurring across the evolutionary history of this genus.
Abstract: Genome-scale sequence data has invigorated the study of hybridization and introgression, particularly in animals. However, outside of a few notable cases, we lack systematic tests for introgression at a larger phylogenetic scale across entire clades. Here we leverage 155 genome assemblies, from 149 species, to generate a fossil-calibrated phylogeny and conduct multilocus tests for introgression across 9 monophyletic radiations within the genus Drosophila. Using complementary phylogenomic approaches, we identify widespread introgression across the evolutionary history of Drosophila. Mapping gene-tree discordance onto the phylogeny revealed that both ancient and recent introgression has occurred, with introgression at the base of species radiations being particularly common. Our results provide the first evidence of introgression occurring across the evolutionary history of Drosophila and highlight the need to continue to study the evolutionary consequences of hybridization and introgression in this genus and across the Tree of Life.

90 citations

Journal ArticleDOI
TL;DR: In this paper, the authors leverage 155 genome assemblies from 149 species to generate a fossil-calibrated phylogeny and conduct multilocus tests for introgression across 9 monophyletic radiations within the genus Drosophila.

73 citations