Inferring Species Trees Directly from Biallelic Genetic Markers: Bypassing Gene Trees in a Full Coalescent Analysis
Citations
5,183 citations
Cites methods from "Inferring Species Trees Directly fr..."
...The SNAPP package implements a multi-species coalescent for SNP and AFLP data [33]....
[...]
2,045 citations
908 citations
Cites methods from "Inferring Species Trees Directly fr..."
...The three most common methods in this group, BEST (Liu and Pearl, 2007), *BEAST (Heled and Drummond, 2010), and SNAPP (Bryant et al., 2012), all seek to estimate the posterior distribution for the species tree using Markov chain Monte Carlo (MCMC), but differ in some details of the implementation....
[...]
...SNAPP infers the species tree using the coalescent model and is designed for biallelic data consisting of unlinked SNPs (Bryant et al., 2012)....
[...]
...We also carried out computations in SNAPP (Bryant et al., 2012), which is suitable for the soybean dataset as it consists of SNP (rather than multi-locus) data, to compare the run times....
[...]
...Much recent effort has been devoted to the development of methods to estimate species-level phylogenies from multi-locus data under the coalescent model (Bryant et al., 2012; Heled and Drummond, 2010; Kubatko et al., 2009; Liu and Pearl, 2007; Liu et al., 2009b; Than and Nakhleh, 2009)....
[...]
586 citations
578 citations
Cites methods from "Inferring Species Trees Directly fr..."
...In the MSC model, the quartet topology found in the true species tree has the highest probability of appearing in gene trees (Allman et al. 2011), and the two alternative topologies have identical probabilities....
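The quartet statement above can be made concrete with the standard MSC result for four taxa: if the internal branch of the species tree has length d in coalescent units, the matching gene-tree topology has probability 1 - (2/3)e^{-d} and each of the two alternatives has probability (1/3)e^{-d}. The sketch below evaluates those values; the function name `quartet_probabilities` is hypothetical and is not taken from any of the cited packages.

```python
import math


def quartet_probabilities(d):
    """Return MSC probabilities of the three unrooted quartet topologies.

    d is the internal branch length of the species tree in coalescent units.
    The first entry is the topology that matches the species tree; the other
    two entries are the alternative topologies.
    """
    if d < 0:
        raise ValueError("branch length must be non-negative")
    p_alt = math.exp(-d) / 3.0          # each discordant topology
    p_match = 1.0 - 2.0 * p_alt         # concordant topology
    return (p_match, p_alt, p_alt)


if __name__ == "__main__":
    for d in (0.0, 0.5, 2.0):
        print(d, quartet_probabilities(d))
```

At d = 0 all three topologies are equally likely, and the advantage of the matching topology grows as the branch lengthens, which is why short internal branches need many gene trees to resolve.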
[...]
...On real data, we need to instead estimate gene trees from sequence data, and further, it is not always clear that our sample is unbiased, nor that gene trees are generated by the MSC. Importantly, we further assume that all four clusters around the branch we are scoring are correct....
[...]
...The most scalable family of MSC-based methods are based on a two-step process where gene trees are first estimated independently for each gene and are then combined to build the species tree using a summary method....
[...]
...On the other hand, considering only the MSC and ignoring issues such as long branch attraction, long branches can be easily reconstructed confidently even with few genes....
[...]
...We now conclude Theorem 1. Given (1) a set of n gene trees generated by the MSC on a model species tree generated by the Yule process with rate $\lambda$ and (2) an internal branch represented by a quadripartition Q where the four clusters around Q are each present in the species tree, let $z = (z_1, z_2, z_3)$ be the average quartet frequencies around Q (where $z_1$ corresponds to the topology of Q); the local PP that the species tree has the topology given by Q is:

$$P(Q \mid Z = z) = \frac{h(z_1)}{h(z_1) + 2^{z_2 - z_1}\, h(z_2) + 2^{z_3 - z_1}\, h(z_3)} \qquad (6)$$

for $h(x) = B(x+1,\; n - x + 2\lambda)\,\bigl(1 - I_{1/3}(x+1,\; n - x + 2\lambda)\bigr)$....
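As a rough numerical illustration of the theorem quoted above, the sketch below reads the formula as h(z1) / (h(z1) + 2^(z2 - z1) h(z2) + 2^(z3 - z1) h(z3)). It uses the identity B(a, b)(1 - I_{1/3}(a, b)) = ∫ from 1/3 to 1 of t^(a-1) (1-t)^(b-1) dt, so h can be computed with Simpson's rule and no external libraries. Several things here are assumptions rather than part of the quoted text: the function names, the default Yule rate of 0.5, the restriction n - x + 2·rate ≥ 1 (which avoids a singular integrand at t = 1), and the convention that z1, z2, z3 are on the scale of gene-tree counts with z1 + z2 + z3 = n.

```python
def h(x, n, yule_rate=0.5, steps=20000):
    """Approximate h(x) = B(x+1, n-x+2r) * (1 - I_{1/3}(x+1, n-x+2r)).

    Uses the equivalent integral of t**x * (1-t)**(n-x+2r-1) over [1/3, 1],
    evaluated with composite Simpson's rule (steps must be even).
    """
    power = n - x + 2.0 * yule_rate - 1.0
    if power < 0:
        raise ValueError("this sketch assumes n - x + 2*yule_rate >= 1")
    if steps % 2:
        raise ValueError("steps must be even for Simpson's rule")

    def integrand(t):
        return t ** x * (1.0 - t) ** power

    lo, hi = 1.0 / 3.0, 1.0
    width = (hi - lo) / steps
    total = integrand(lo) + integrand(hi)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight * integrand(lo + i * width)
    return total * width / 3.0


def local_posterior(z1, z2, z3, yule_rate=0.5):
    """Local posterior probability of the topology supported by z1."""
    n = z1 + z2 + z3  # assumption: frequencies are scaled to gene-tree counts
    h1 = h(z1, n, yule_rate)
    h2 = h(z2, n, yule_rate)
    h3 = h(z3, n, yule_rate)
    return h1 / (h1 + 2.0 ** (z2 - z1) * h2 + 2.0 ** (z3 - z1) * h3)


if __name__ == "__main__":
    print(local_posterior(4, 4, 4))    # equal support for all three topologies
    print(local_posterior(6, 2, 2))    # moderate support for the first
    print(local_posterior(10, 0, 0))   # unanimous support for the first
```

Direct integration is only practical for modest n, because the integrand underflows for large counts; a production implementation would more likely work in log space with a library routine for the regularized incomplete beta function.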
[...]
References
15,840 citations
"Inferring Species Trees Directly fr..." refers background or methods in this paper
...This ability represents a qualitative difference between SNAPP and the methods of Nielsen et al. (1998) and RoyChoudhury et al. (2008). A more difficult and complex problem, and one beyond the scope of this paper, would be to properly characterize the situations in which the θ values can be reliably inferred....
[...]
...Early contributions to the development of multispecies models built on the branches of a species tree were made by Hudson (1983), Tajima (1983), Takahata and Nei (1985), Nei (1987), Pamilo and Nei (1988), and Takahata (1989)....
[...]
13,884 citations
13,111 citations
"Inferring Species Trees Directly fr..." refers background or methods in this paper
...See Felsenstein (2004), Degnan and Rosenberg (2009), and Heled and Drummond (2010) for general introductions to the multispecies coalescent. Early contributions to the development of multispecies models built on the branches of a species tree were made by Hudson (1983), Tajima (1983), Takahata and Nei (1985), Nei (1987), Pamilo and Nei (1988), and Takahata (1989). The multispecies coalescent determines a distribution for gene trees and their branch lengths, conditional on a species tree....
[...]
...The algorithm works in a similar manner to Felsenstein’s pruning algorithm (Felsenstein 1981) for computing the likelihood of a gene tree: we define partial likelihoods that focus only on a specific subtree; the partial likelihoods are then computed starting at the leaves (of the species tree), working upward to the root....
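For readers unfamiliar with the pruning idea referred to above, here is a minimal sketch of Felsenstein's classical algorithm on a fixed gene tree for a single biallelic site. It is not SNAPP's species-tree algorithm, only the analogy the passage draws on. The tree encoding (a leaf is an int allele, an internal node is a pair of `(child, branch_length)` tuples), the symmetric two-state substitution model, and all function names are assumptions made for illustration.

```python
import math


def transition_matrix(t, mu=1.0):
    """Symmetric two-state model: probability of each allele after time t."""
    same = 0.5 + 0.5 * math.exp(-2.0 * mu * t)
    return [[same, 1.0 - same], [1.0 - same, same]]


def partial_likelihoods(node, mu=1.0):
    """Return [L(0), L(1)]: likelihood of the data below node given its state."""
    if isinstance(node, int):
        # Leaf: the observed allele has likelihood 1, the other 0.
        return [1.0 if state == node else 0.0 for state in (0, 1)]
    result = [1.0, 1.0]
    for child, branch_length in node:
        p = transition_matrix(branch_length, mu)
        below = partial_likelihoods(child, mu)
        for state in (0, 1):
            # Sum over the child's state, then multiply across children.
            result[state] *= sum(p[state][c] * below[c] for c in (0, 1))
    return result


def site_likelihood(root, mu=1.0):
    """Combine root partials with the stationary distribution (0.5, 0.5)."""
    return sum(0.5 * value for value in partial_likelihoods(root, mu))


if __name__ == "__main__":
    cherry = ((0, 0.1), (1, 0.1))        # two leaves with alleles 0 and 1
    tree = ((cherry, 0.2), (0, 0.3))     # a third leaf joins at the root
    print(site_likelihood(tree))
```

Each node's partial likelihoods depend only on its children, so one pass from the leaves to the root is enough, which is the property the quoted passage carries over to subtrees of the species tree.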
[...]
11,916 citations
"Inferring Species Trees Directly fr..." refers methods in this paper
...Following Drummond and Rambaut (2007), we assume a pure birth (Yule) model for the species tree topology and species divergence times, with a hyperparameter λ equal to the birth rate of the species tree. This hyperparameter is either fixed or allowed to vary with an improper uniform hyperprior. 3. Following Rannala and Yang (2003), we use independent gamma prior distributions for the population size parameters θ....
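The priors described above can be illustrated by drawing from them. The sketch below simulates a pure-birth (Yule) process forward in time, where the time spent with k lineages is exponential with rate k times the birth rate, and draws independent gamma values for the θ parameters. It only illustrates the distributions and is not how SNAPP or BEAST evaluates its priors. The function names, the shape and rate values, and the choice of one θ per branch of a rooted binary tree (2n - 1 including the root population) are all assumptions.

```python
import random


def sample_yule_intervals(n_species, birth_rate, rng):
    """Waiting times while the process has k = 2, ..., n_species - 1 lineages.

    Starts from the two lineages below the root; the final interval leading
    to the present is not simulated in this sketch.
    """
    if n_species < 2 or birth_rate <= 0:
        raise ValueError("need at least two species and a positive rate")
    return [rng.expovariate(k * birth_rate) for k in range(2, n_species)]


def sample_thetas(n_species, shape, rate, rng):
    """Independent gamma draws, one per branch of a rooted binary tree."""
    n_branches = 2 * n_species - 1
    # random.gammavariate takes (shape, scale), so the rate is inverted.
    return [rng.gammavariate(shape, 1.0 / rate) for _ in range(n_branches)]


if __name__ == "__main__":
    rng = random.Random(2012)
    print(sample_yule_intervals(5, birth_rate=2.0, rng=rng))
    print(sample_thetas(5, shape=2.0, rate=100.0, rng=rng))
```

With shape 2 and rate 100 the θ draws have mean 0.02; these values are placeholders and would need to be chosen to suit the data set at hand.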
[...]
...This is the approach taken by BATWING (Wilson et al. 2003), BEST (Liu and Pearl 2007), and STAR-BEAST (Heled and Drummond 2010), among others....
[...]
...SNAPP, which interfaces with the BEAST package (Drummond and Rambaut 2007), takes a range of biallelic data types as input and returns a sample of species trees with (relative) divergence times and population sizes....
[...]
...The SNAPP sampler differs from methods such as BEST (Liu and Pearl 2007) and STAR-BEAST (Heled and Drummond 2010), which sample gene trees explicitly....
[...]
...The MCMC proposal functions implemented in SNAPP are standard and are a subset of those available in BEAST (Drummond and Rambaut 2007) when sampling from molecular clock trees....
[...]