What is the likelihood of a branch length being linked?

Linked branch lengths allow for subset-specific substitution rates, but all subsets share a single set of relative branch lengths.

What is the data set of the ray-finned fish?

This data set comprises ten nuclear protein-coding genes (i.e., 30 data blocks) from 72 ray-finned fish, totaling 7,995 bp (Li et al. 2008).

(Open Access) PartitionFinder: Combined Selection of Partitioning Schemes and Substitution Models for Phylogenetic Analyses (2012) | Robert Lanfear

Q: What are the contributions in "Partitionfinder: combined selection of partitioning schemes and substitution models for phylogenetic analyses" ?

Here, the authors describe two new objective methods for the combined selection of best-fit partitioning schemes and nucleotide substitution models. The authors demonstrate that these methods significantly outperform previous approaches, including both the ad hoc selection of partitioning schemes (e.g., partitioning by gene or codon position) and a recently proposed hierarchical clustering method. The authors hope that PartitionFinder will encourage the objective selection of partitioning schemes and thus lead to improvements in phylogenetic analyses.

HAL Id: lirmm-00705211

https://hal-lirmm.ccsd.cnrs.fr/lirmm-00705211

Submitted on 16 Jun 2021

Distributed under a Creative Commons Attribution 4.0 International License

PartitionFinder: Combined Selection of Partitioning

Schemes and Substitution Models for Phylogenetic

Analyses

Stéphane Guindon, Robert Lanfear, Brett Calcott, Simon Y.W. Ho

To cite this version:

Stéphane Guindon, Robert Lanfear, Brett Calcott, Simon Y.W. Ho. PartitionFinder: Combined Selection of Partitioning Schemes and Substitution Models for Phylogenetic Analyses. Molecular Biology and Evolution, Oxford University Press (OUP), 2012, 29 (6), pp.1695-1701. ⟨10.1093/molbev/mss020⟩. ⟨lirmm-00705211⟩

PartitionFinder: Combined Selection of Partitioning Schemes

and Substitution Models for Phylogenetic Analyses

Robert Lanfear,* Brett Calcott,1,2 Simon Y. W. Ho, and Stephane Guindon

Centre for Macroevolution and Macroecology, Ecology Evolution and Genetics, Research School of Biology, Australian National

University, Canberra, Australian Capital Territory, Australia

Philosophy Program, Research School of Social Sciences, Australian National University, Canberra, Australian Capital Territory,

Australia

School of Biological Sciences, University of Sydney, Sydney, New South Wales, Australia

Department of Statistics, University of Auckland, Auckland, New Zealand

*Corresponding author: E-mail: rob.lanfear@anu.edu.au.

Associate editor: Sudhir Kumar

Abstract

In phylogenetic analyses of molecular sequence data, partitioning involves estimating independent models of molecular

evolution for different sets of sites in a sequence alignment. Choosing an appropriate partitioning scheme is an important

step in most analyses because it can affect the accuracy of phylogenetic reconstruction. Despite this, partitioning schemes

are often chosen without explicit statistical justiﬁcation. Here, we describe two new objective methods for the combined

selection of best-ﬁt partitioning schemes and nucleotide substitution models. These methods allow millions of partitioning

schemes to be compared in realistic time frames and so permit the objective selection of partitioning schemes even for

large multilocus DNA data sets. We demonstrate that these methods signiﬁcantly outperform previous approaches,

including both the ad hoc selection of partitioning schemes (e.g., partitioning by gene or codon position) and a recently

proposed hierarchical clustering method. We have implemented these methods in an open-source program,

PartitionFinder. This program allows users to select partitioning schemes and substitution models using a range of

information-theoretic metrics (e.g., the Bayesian information criterion, Akaike information criterion [AIC], and corrected

AIC). We hope that PartitionFinder will encourage the objective selection of partitioning schemes and thus lead to

improvements in phylogenetic analyses. PartitionFinder is written in Python and runs under Mac OSX 10.4 and above. The

program, source code, and a detailed manual are freely available from www.robertlanfear.com/partitionﬁnder.

Key words: partitioning, AIC, BIC, AICc, model selection, molecular evolution.

Introduction

Molecular phylogenetics provides a wealth of important information

for evolutionary biologists. However, the accuracy

of molecular phylogenetic inference depends on having an

appropriate model of molecular evolution (Sullivan and

Joyce 2005; Simon et al. 2006). Because of this, there is a great

deal of interest in developing methods to select evolutionary

models and assess their adequacy (Ripplinger and Sullivan

2010; Jayaswal et al. 2011; Nguyen et al. 2011). The goal of

model selection is to identify a model that is sufficiently complex

to capture the evolutionary processes that have

occurred but to avoid models with more parameters than

can be reliably estimated from the available data (overparameterization).

One of the most important aspects of

models of molecular evolution is how they account for

variation in evolutionary processes among the sites of an

alignment, because the failure to correctly account for this

variation can seriously mislead phylogenetic analyses

(Buckley et al. 2001; Telford and Copley 2011).

There are two ways to incorporate the variation in

evolutionary processes among different sites using

currently available phylogenetic methods: mixture models

and partitioning. With mixture models, the likelihood of

each site is calculated under more than one substitution

model (e.g., Le et al. 2008). The parameters of these

substitution models, as well as the probability with which

each model applies to each site, can be determined directly

from the data (Pagel and Meade 2004). With partitioning,

the user ﬁrst groups together sites that are assumed to have

evolved under similar processes and then estimates inde-

pendent (i.e., unlinked) substitution models for each group

of sites (e.g., Nylander et al. 2004; Brandley et al. 2005;

McGuire et al. 2007). In contrast to mixture models, par-

titioning requires the a priori deﬁnition of appropriate

groups of sites. Although mixture models are implemented

in an increasing variety of phylogenetic software (e.g., Pagel

and Meade 2004; Stamatakis 2006; Le et al. 2008), partition-

ing remains by far the most common approach to

incorporating heterogeneity in evolutionary processes

among sites (Blair and Murphy 2011).

Choosing an appropriate partitioning scheme is a central

problem for most phylogenetic analyses (Brandley et al.

2005; Shapiro et al. 2006; McGuire et al. 2007; Li et al.

2008; Blair and Murphy 2011). Typically, phylogeneticists

use their biological intuition to group together similar sites

in an alignment into putatively homogeneous data blocks.

Mol. Biol. Evol. 29(6):1695–1701. 2012 doi:10.1093/molbev/mss020 Advance Access publication January 20, 2012

Research article

Downloaded from https://academic.oup.com/mbe/article/29/6/1695/1000514 by Bibliothèque Universitaire de médecine - Nîmes user on 16 June 2021

This often involves deﬁning data blocks on the basis of

genes and codon positions (e.g., Shapiro et al. 2006; Ho

and Lanfear 2010). For example, in an analysis of four

protein-coding genes, one could deﬁne 12 data blocks—

one for each codon position in each gene. This

approach is biologically justiﬁed because differences be-

tween codon positions and genes are expected to account

for much of the heterogeneity in evolutionary processes

among sites (Shapiro et al. 2006). However, many studies

have shown that this approach can lead to overparamete-

rization, and that phylogenetic reconstruction can be

improved by merging certain data blocks together, thus de-

ﬁning a partitioning scheme that requires the estimation of

fewer independent substitution models (Brandley et al.

2005; Brown and Lemmon 2007; McGuire et al. 2007; Li

et al. 2008). For example, the second codon positions in

two similar nuclear genes may experience similar rates

and patterns of substitution and so might be better ana-

lyzed together rather than independently. Of course, it is

not always straightforward to identify which data blocks

should be merged and which should be analyzed indepen-

dently. One solution to this problem is to compare all

possible partitioning schemes for a given data set. However,

this approach is usually computationally intractable

because the number of possible partitioning schemes is

astronomical even for relatively small numbers of data

blocks (Li et al. 2008). As a result, most researchers either

choose a single partitioning scheme a priori or select the

best-ﬁt scheme from a handful of candidate schemes

(Brandley et al. 2005; McGuire et al. 2007). Thus, despite

signiﬁcant advances in phylogenetic methods in recent

years, the accuracy of the inferences we can make from

partitioned phylogenetic analyses remains limited by our

ability to select appropriate partitioning schemes.

In this study, we describe two new methods that solve

many of the problems associated with selecting partition-

ing schemes. These methods increase the efﬁciency of com-

paring partitioning schemes by many orders of magnitude,

allowing many millions of schemes to be compared in re-

alistic time frames. We describe these new methods below

and assess their performance on a range of published data

sets. We show that our methods select signiﬁcantly better

partitioning schemes than previous approaches—including

the ad hoc selection of partitioning schemes and previously

suggested objective approaches. We have implemented

these methods in an open-source program, PartitionFinder.

This program has ﬂexible options and allows users to efﬁ-

ciently and objectively ﬁnd best-ﬁt partitioning schemes

and nucleotide substitution models, even for large data

sets. PartitionFinder, its source code, and a detailed manual

are available from www.robertlanfear.com/partitionﬁnder.

Materials and Methods

We use the following deﬁnitions throughout this article.

We deﬁne a ‘‘data block’’ as a user-deﬁned set of sites

in an alignment; a ‘‘subset’’ as a set of one or more data

blocks; and a partitioning scheme as a set of subsets that

includes all sites in the alignment once and only once. For

clarity, we avoid the use of the term ‘‘partition,’’ as this has

different and potentially very confusing meanings in the

mathematical and molecular phylogenetics literature (in

the mathematical literature, a partition is equivalent to

our use of ‘‘partitioning scheme’’ here, whereas in the

molecular phylogenetics literature, it is equivalent to our

use of ‘‘subset’’ here). In the majority of cases, users will

specify data blocks based on genes and codon positions—

for example, by deﬁning 12 data blocks for an alignment of

four protein-coding genes. The sites in a data block need

not be contiguous in the alignment, but a single site can be

a member of only one data block. A subset can comprise

a single data block (e.g., ﬁrst codon sites from a protein-

coding gene) or multiple data blocks (e.g., ﬁrst and second

codon sites from a protein-coding gene). For example,

consider an alignment of four protein-coding genes for

which the user has deﬁned 12 data blocks, one for each

codon position in each gene. One possible partitioning

scheme for this data set involves treating each codon

position in each gene independently. This partitioning

scheme has 12 subsets, and so 12 unlinked substitution

models would be estimated from the data during the

phylogenetic analysis. Another possible partitioning

scheme involves treating each codon position indepen-

dently but merging the codon positions across genes. This

partitioning scheme has three subsets (one for each codon

position), and so three unlinked substitution models would

be estimated from the data during the phylogenetic anal-

ysis. The challenge is to ﬁnd the best-ﬁt partitioning

scheme for a given nucleotide alignment, given the prede-

ﬁned set of data blocks.
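The definitions above can be made concrete with a small sketch. The block names and the helper `is_valid_scheme` are illustrative assumptions, not part of PartitionFinder's interface; the code only checks that a candidate scheme includes every data block once and only once.

```python
from itertools import chain

# Hypothetical data blocks: one per codon position in each of two genes.
DATA_BLOCKS = frozenset({
    "geneA_pos1", "geneA_pos2", "geneA_pos3",
    "geneB_pos1", "geneB_pos2", "geneB_pos3",
})


def is_valid_scheme(scheme, data_blocks=DATA_BLOCKS):
    """Return True if `scheme` (an iterable of subsets) covers every
    data block exactly once, with no empty subsets."""
    subsets = [frozenset(s) for s in scheme]
    if any(len(s) == 0 for s in subsets):
        return False
    used = list(chain.from_iterable(subsets))
    return len(used) == len(set(used)) and set(used) == set(data_blocks)


# Scheme with six subsets: every data block treated independently.
by_block = [{b} for b in DATA_BLOCKS]

# Scheme with three subsets: codon positions merged across genes.
by_codon_position = [
    {"geneA_pos1", "geneB_pos1"},
    {"geneA_pos2", "geneB_pos2"},
    {"geneA_pos3", "geneB_pos3"},
]

# Not a scheme: geneB_pos3 is missing and geneA_pos1 appears twice.
broken = [
    {"geneA_pos1", "geneB_pos1"},
    {"geneA_pos1", "geneA_pos2"},
    {"geneA_pos3", "geneB_pos2"},
]
```

Here `by_block` and `by_codon_position` are both valid schemes over the same six data blocks, whereas `broken` is rejected.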

The number of possible partitioning schemes for a set of n data blocks is equivalent to the number of ways of putting n different-colored balls into one or more indistinguishable boxes. This number is known as a Bell number (Bell 1934) and can be described by the following relationship, where $B_n$ is the number of possible partitioning schemes given n user-defined data blocks (Li et al. 2008), and the curly brackets define a Stirling number of the second kind:

$$B_n = \sum_{k=0}^{n} \left\{ {n \atop k} \right\}$$

The number of possible partitioning schemes can be astronomical even for relatively modest data sets. For example, in an analysis of four protein-coding genes (4 genes × 3 codon positions = 12 data blocks), there are $B_{12} = 4.2 \times 10^{6}$ possible partitioning schemes, and for an analysis of 20 protein-coding genes (20 genes × 3 codon positions = 60 data blocks), there are $B_{60} = 9.8 \times 10^{59}$ possible partitioning schemes.
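The counts quoted above can be checked with a few lines of code. This is a minimal sketch that computes Bell numbers with the Bell triangle recurrence; the function name `bell_number` is an illustrative choice and is not taken from PartitionFinder.

```python
def bell_number(n):
    """Return the n-th Bell number using the Bell triangle."""
    row = [1]  # first row of the triangle; its first entry is B_0 = 1
    for _ in range(n):
        # Each new row starts with the last entry of the previous row,
        # and every later entry adds the entry above-left of it.
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[0]


b12 = bell_number(12)  # schemes for 12 data blocks (four genes)
b60 = bell_number(60)  # schemes for 60 data blocks (twenty genes)
```

Python integers have arbitrary precision, so `b60` is computed exactly rather than as a floating-point approximation.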

The set of partitioning schemes will be made up of a smaller number of possible subsets because most subsets will be included in many different partitioning schemes. Specifically, the number of possible subsets, $S_n$, that can be created from a set of n user-defined data blocks is the

number of possible nonempty subsets that can be generated from a set of size n:

$$S_n = 2^n - 1.$$

For example, in an analysis of four protein-coding genes (12 data blocks), there are $S_{12} = 4{,}095$ possible subsets, and in an analysis of 20 protein-coding genes (60 data blocks), there are $S_{60} = 1.2 \times 10^{18}$ possible subsets.
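To see why the number of distinct subsets is so much smaller than the number of schemes, it can help to enumerate both for a tiny case. The sketch below assumes five hypothetical data blocks (`b1` to `b5`) and a helper `partitions` written for this illustration; brute-force enumeration like this is only practical for very small n.

```python
def partitions(blocks):
    """Yield every partitioning scheme of `blocks` as a list of frozensets."""
    if not blocks:
        yield []
        return
    first, rest = blocks[0], blocks[1:]
    for scheme in partitions(rest):
        # Put `first` in a subset of its own ...
        yield [frozenset([first])] + scheme
        # ... or add it to each existing subset in turn.
        for i, subset in enumerate(scheme):
            yield scheme[:i] + [subset | {first}] + scheme[i + 1:]


blocks = ["b1", "b2", "b3", "b4", "b5"]
schemes = list(partitions(blocks))
distinct_subsets = {subset for scheme in schemes for subset in scheme}

n_schemes = len(schemes)           # the Bell number for five blocks
n_subsets = len(distinct_subsets)  # 2**5 - 1 nonempty subsets
```

The gap between the two counts widens rapidly as more data blocks are added.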

The PartitionFinder Algorithm

Previous approaches to comparing partitioning schemes

have been both labor-intensive and computationally inten-

sive because they have required a full likelihood or Bayesian

analysis for each partitioning scheme under consideration

(see e.g., McGuire et al. 2007; Li et al. 2008). This has fun-

damentally limited the number of partitioning schemes

that have been compared in most studies, as comparing

large numbers (e.g., hundreds) of partitioning schemes

in this way is simply not feasible for most data sets. This

approach is also highly inefﬁcient because it involves re-

peatedly recalculating the likelihood of every site in the

alignment, despite the fact that the substitution models

applied to those sites will be the same for many partition-

ing schemes. The PartitionFinder algorithm improves the

efﬁciency of ﬁnding best-ﬁt partitioning schemes by calcu-

lating the log likelihood of each subset of sites only once.

The log likelihood of each partitioning scheme is then cal-

culated by summing the log likelihoods of the subsets that

make up that scheme.

An outline of the PartitionFinder algorithm is as follows:

1. Estimate a phylogenetic tree of sequences;

2. Select the best-ﬁt substitution model for each possible

subset;

3. Calculate the log likelihood of each partitioning scheme by

summing the log likelihoods of the subsets that make up

that scheme;

4. Select a partitioning scheme using information-theoretic

metrics.
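The efficiency gain in steps 2 and 3 comes from analysing each subset once and reusing the result in every scheme that contains it. The following is a minimal sketch of that idea, not PartitionFinder's own code: `SubsetCache` and `toy_lnl` are hypothetical names, and `toy_lnl` returns made-up numbers in place of a real likelihood calculation.

```python
class SubsetCache:
    """Memoize per-subset log likelihoods so each subset is analysed once."""

    def __init__(self, subset_lnl):
        # `subset_lnl` is a hypothetical callable: frozenset of data-block
        # names -> log likelihood of the best-fit model for that subset.
        self._subset_lnl = subset_lnl
        self._cache = {}
        self.evaluations = 0  # number of expensive likelihood calls made

    def lnl(self, subset):
        key = frozenset(subset)
        if key not in self._cache:
            self._cache[key] = self._subset_lnl(key)
            self.evaluations += 1
        return self._cache[key]

    def scheme_lnl(self, scheme):
        # Step 3: the scheme's log likelihood is the sum over its subsets.
        return sum(self.lnl(subset) for subset in scheme)


def toy_lnl(subset):
    # Stand-in for a real likelihood calculation; values are made up.
    return -100.0 * len(subset) - 5.0 * (len(subset) - 1)


cache = SubsetCache(toy_lnl)
scheme_a = [{"g1_pos1", "g1_pos2"}, {"g1_pos3"}]
scheme_b = [{"g1_pos1", "g1_pos2", "g1_pos3"}]
scheme_c = [{"g1_pos1"}, {"g1_pos2"}, {"g1_pos3"}]
lnl_a = cache.scheme_lnl(scheme_a)
lnl_b = cache.scheme_lnl(scheme_b)
lnl_c = cache.scheme_lnl(scheme_c)
```

The three schemes contain six subsets in total, but the subset holding only `g1_pos3` is shared by two of them, so only five likelihood evaluations are made.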

All likelihood calculations are performed using a modi-

ﬁed version of PhyML 3.0 (Guindon et al. 2010), available

from the authors and as part of the PartitionFinder pro-

gram. Tree estimation (step 1) is performed using the BioNJ

algorithm implemented in PhyML 3.0 (Guindon et al. 2010),

using the combined data from all of the user-deﬁned data

blocks. PartitionFinder also allows the user to specify a tree

topology for step 1. The tree topology from step 1 is then

ﬁxed for the rest of the analysis. This differs from previous

approaches, which coestimate the tree topology and the

likelihood of each partitioning scheme. This is a computa-

tionally intensive method that has limited the number of

partitioning schemes that can be compared (see above).

Using a ﬁxed tree topology allows likelihoods from different

subsets to be combined, which increases the efﬁciency by

many orders of magnitude and allows many millions of par-

titioning schemes to be compared in a single run. Fixing the

tree topology is unlikely to adversely affect the results of

comparing partitioning schemes, as previous studies have

shown that doing so does not affect the results of model

selection procedures as long as a nonrandom tree topology

is used (Posada and Crandall 2001).

Model selection (step 2) is performed on a user-speciﬁed

set of up to 56 substitution models from the general time

reversible (GTR) family, and our approach is similar to

other model selection algorithms (e.g., Keane et al. 2006;

Posada 2008). During model selection, we ﬁrst calculate

the likelihood of each candidate substitution model,

conditioned on the tree topology from step 1. We then

select the best-ﬁt model according to one of three

user-specified information-theoretic metrics: the Akaike

information criterion (AIC), the corrected AIC (AICc), or

the Bayesian information criterion (BIC) (Sullivan and Joyce

2005). PartitionFinder implements almost all of the models

of nucleotide evolution included in the most commonly

used phylogenetic tree estimation programs such as PhyML

(Guindon et al. 2010), RAxML (Stamatakis 2006), MrBayes

(Ronquist and Huelsenbeck 2003), and BEAST (Drummond

and Rambaut 2007). This means that the output from

PartitionFinder can be used to directly set up a phylogenetic

analysis in any of these programs. However, all of these

models and programs assume that the data evolved under

a time-reversible, stationary, and homogeneous process,

and they should not be used if the data violate any of these

assumptions.
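The three metrics used in step 2 have standard textbook definitions, sketched below. The candidate log likelihoods are made-up numbers, the parameter counts cover substitution-model parameters only, and using the number of alignment sites as the sample size is an assumption of this example; none of these choices is stated in the text above.

```python
import math


def aic(lnl, k):
    """Akaike information criterion for log likelihood `lnl`, `k` parameters."""
    return 2 * k - 2 * lnl


def aicc(lnl, k, n):
    """Corrected AIC; `n` is the sample size (assumed: number of sites)."""
    return aic(lnl, k) + (2 * k * (k + 1)) / (n - k - 1)


def bic(lnl, k, n):
    """Bayesian information criterion."""
    return k * math.log(n) - 2 * lnl


def best_model(candidates, n, metric="bic"):
    """Return the name of the candidate with the lowest score.

    candidates: dict mapping model name -> (lnL, number of free parameters).
    """
    scorers = {
        "aic": lambda lnl, k: aic(lnl, k),
        "aicc": lambda lnl, k: aicc(lnl, k, n),
        "bic": lambda lnl, k: bic(lnl, k, n),
    }
    score = scorers[metric]
    return min(candidates, key=lambda name: score(*candidates[name]))


# Made-up log likelihoods for one subset of 1,000 sites.
candidates = {"JC": (-5200.0, 0), "HKY": (-5100.0, 4), "GTR+G": (-5090.0, 9)}
choice_aic = best_model(candidates, n=1000, metric="aic")
choice_bic = best_model(candidates, n=1000, metric="bic")
```

With these illustrative numbers the AIC prefers the parameter-rich GTR+G model, whereas the BIC, which penalizes extra parameters more heavily at this sample size, prefers HKY.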

PartitionFinder includes an option for either linked or

unlinked branch lengths between subsets. When branch

lengths are linked, step 1 includes the reestimation of

branch lengths on the BioNJ topology using a GTR

substitution model, with a proportion of invariant sites

and gamma distributed rates across sites estimated from

the data. The likelihood of each model for each subset (step

2) is then calculated conditioned on this topology and

these branch lengths, with each model afforded an

independent rate multiplier that can increase or decrease

all branch lengths by the same factor. Thus, linked branch

lengths allow for subset-speciﬁc substitution rates, but all

subsets share a single set of relative branch lengths. By

contrast, when branch lengths are unlinked, model selec-

tion (step 2) is conditioned on the topology from step 1,

but all branch lengths are estimated independently for each

model in each subset.
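One practical consequence of this choice is the number of branch-length parameters a scheme carries. The sketch below shows one plausible way to count them; it is an assumption made for illustration, and the text above does not specify how PartitionFinder itself counts parameters. It takes one rate multiplier per subset in the linked case, following the description above.

```python
def n_branches(n_taxa):
    # An unrooted, fully resolved tree with t taxa has 2t - 3 branches.
    return 2 * n_taxa - 3


def scheme_parameters(model_params, n_taxa, linked):
    """Count free parameters for a partitioning scheme (illustrative only).

    model_params: list with the number of substitution-model parameters
    of the best-fit model for each subset.
    """
    n_subsets = len(model_params)
    if linked:
        # One shared set of branch lengths plus one rate multiplier per subset.
        return sum(model_params) + n_branches(n_taxa) + n_subsets
    # Unlinked: every subset gets its own full set of branch lengths.
    return sum(model_params) + n_subsets * n_branches(n_taxa)


# Three subsets (e.g., GTR+G, HKY, GTR+G) on a 72-taxon tree.
k_linked = scheme_parameters([9, 4, 9], n_taxa=72, linked=True)
k_unlinked = scheme_parameters([9, 4, 9], n_taxa=72, linked=False)
```

Under this counting, unlinking branch lengths adds a full set of branch-length parameters for every additional subset, which is why information-theoretic metrics penalize unlinked schemes much more heavily on trees with many taxa.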

The log likelihood of each partitioning scheme (step 3) is

calculated by summing the log likelihoods of the best-ﬁt

model for each subset in the partitioning scheme. Finally,

the best-ﬁt partitioning scheme is selected (step 4) using

one of three information-theoretic measures: the AIC,

AICc, or BIC.

A Greedy Heuristic Algorithm to Search for

Partitioning Schemes

Even using the algorithm described above, exhaustive

searches on desktop computers are practically limited to

data sets for which 12 or fewer data blocks are deﬁned

(corresponding to data sets with 4.2 million or fewer pos-

sible partitioning schemes). Therefore, heuristic searches

among partitioning schemes are necessary for larger data

sets, even though they cannot be guaranteed to ﬁnd the

optimum partitioning scheme (Li et al. 2008).

The heuristic search algorithm we describe below incor-

porates the increases in efﬁciency described above but

hugely reduces the number of partitioning schemes that

need to be considered for a given data set. Our method

builds on a recently proposed method (Li et al. 2008) that

involves estimating GTR+G model parameters for each

data block and then progressively merging the data blocks

with the most similar parameter estimates using hierarchical

cluster analysis. For a set of n data blocks, the hierarchical

clustering method objectively deﬁnes n partitioning schemes

that range from having n subsets (all data blocks treated in-

dependently) to having a single subset (all data blocks

merged together). The optimal scheme is then selected from

this set of n schemes using an information-theoretic metric

(e.g., the AIC, AICc, or BIC).

Because the hierarchical clustering approach combines

data blocks based on model parameter estimates, it relies

on those parameter estimates being accurate. For many

data blocks, there will be limited information available

for estimating many of the GTR+G model parameters. This

will result in these estimates being associated with high

variance because the value of the parameters will have

very little effect on the overall likelihood score. Since the

subsequent hierarchical clustering method treats all

parameters as equally important, uncertain parameter es-

timates might limit the ability of the hierarchical clustering

approach to ﬁnd optimal partitioning schemes. The

algorithm we propose below overcomes this limitation

by merging data blocks based directly on information-

theoretic comparisons between partitioning schemes.

These metrics are calculated directly from the likelihood

so they implicitly incorporate the relative importance

of different model parameters and so avoid problems

associated with error-prone parameter estimates.

In an analysis with n data blocks, our greedy heuristic algorithm begins by calculating the information-theoretic score (e.g., AIC, AICc, or BIC) of the partitioning scheme with n subsets, that is, the scheme in which each data block is treated independently ($P_{start}$). It then calculates the score of all partitioning schemes with n − 1 subsets, that is, all schemes that merge two subsets of $P_{start}$, and selects the scheme with the best score ($P_{merged}$). If $P_{merged}$ has a better score than $P_{start}$, $P_{merged}$ replaces $P_{start}$, and the algorithm iterates. The algorithm continues until either $P_{merged}$ does not have a better score than $P_{start}$, or until all data blocks have been merged into one subset. This process results in a greedy hill-climbing algorithm that optimizes the information-theoretic score of interest while searching for partitioning schemes.
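The greedy search described above can be sketched in a few lines. This is an illustration under stated assumptions rather than PartitionFinder's implementation: `greedy_search` and `toy_score` are hypothetical names, lower scores are treated as better (as with the AIC, AICc, and BIC), and the toy score simply rewards merging data blocks that share a codon position.

```python
from itertools import combinations


def greedy_search(data_blocks, score):
    """Greedy merging of subsets; lower `score(scheme)` is better.

    `score` is a hypothetical callable that maps a scheme (a frozenset of
    frozensets of data-block names) to an information-theoretic score.
    Returns the best scheme found, its score, and the number of schemes scored.
    """
    current = frozenset(frozenset([b]) for b in data_blocks)  # P_start
    current_score = score(current)
    n_scored = 1
    while len(current) > 1:
        # Score every scheme that merges two subsets of the current scheme.
        candidates = []
        for a, b in combinations(current, 2):
            merged = (current - {a, b}) | {a | b}
            candidates.append((score(merged), merged))
            n_scored += 1
        best_score, best = min(candidates, key=lambda c: c[0])  # P_merged
        if best_score >= current_score:
            break  # no merge improves the score
        current, current_score = best, best_score
    return current, current_score, n_scored


def toy_score(scheme):
    # Made-up score: 10 per subset, plus 100 for every extra codon
    # position mixed into a subset.
    penalty = sum(100 * (len({b[-4:] for b in s}) - 1) for s in scheme)
    return 10 * len(scheme) + penalty


blocks = [f"g{g}_pos{p}" for g in (1, 2) for p in (1, 2, 3)]
best, best_score, n_scored = greedy_search(blocks, toy_score)
```

For simplicity this sketch rescores each candidate scheme from scratch; combining it with a per-subset cache would avoid recomputing subsets that have already been analysed.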

We can calculate the maximum number of partitioning schemes ($P_{n\_greedy}$) that would need to be examined by this algorithm as follows. In addition to the starting scheme, each round of the algorithm involves calculating the likelihood of k choose two schemes, where k is the number of subsets in the best scheme from the previous round. In the worst case, the algorithm has to continue until k = 2, at which point the partitioning scheme under consideration has all data blocks merged into one subset. Thus, in an analysis with n data blocks, the maximum number of partitioning schemes $P_{n\_greedy}$ considered by this algorithm is:

$$P_{n\_greedy} = 1 + \sum_{k=2}^{n} \binom{k}{2} = 1 + \frac{n(n^2 - 1)}{6}$$
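The step from the sum to the closed form is not spelled out above. One way to obtain it, offered here as a supporting derivation rather than as the authors' own argument, is the hockey-stick identity for binomial coefficients:

```latex
\begin{align*}
\sum_{k=2}^{n} \binom{k}{2}
  &= \binom{n+1}{3}
   = \frac{(n+1)\,n\,(n-1)}{6}
   = \frac{n(n^2 - 1)}{6}, \\
P_{n\_greedy}
  &= 1 + \frac{n(n^2 - 1)}{6},
  \qquad\text{e.g. } P_{60\_greedy} = 1 + \frac{60 \cdot 3599}{6} = 35{,}991 .
\end{align*}
```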

The maximum number of subsets that need to be examined by this algorithm ($S_{n\_greedy}$) is smaller than the maximum number of partitioning schemes because many subsets are contained in more than one scheme. $S_{n\_greedy}$ can be calculated as follows. The starting scheme involves examining n subsets. In the next round of the algorithm, we examine all n choose two subsets that merge two data blocks of the starting scheme. In subsequent rounds, we need only examine the k − 2 novel subsets that can be created by merging the most recently created subset with the remaining subsets in the current partitioning scheme. Thus, the maximum number of subsets that need to be considered by this algorithm is:

$$S_{n\_greedy} = n^2 - n + 1.$$
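The closed form can be recovered by adding up the three contributions listed above. The derivation below assumes that the "subsequent rounds" contribute n − 2, n − 3, ..., 1 novel subsets in turn, which is one reading of the text:

```latex
\begin{align*}
S_{n\_greedy}
  &= \underbrace{n}_{\text{starting scheme}}
   + \underbrace{\binom{n}{2}}_{\text{first round of merges}}
   + \underbrace{\sum_{j=1}^{n-2} j}_{\text{subsequent rounds}} \\
  &= n + \frac{n(n-1)}{2} + \frac{(n-1)(n-2)}{2}
   = n + (n-1)^2
   = n^2 - n + 1 .
\end{align*}
```

For n = 60 this gives 3600 − 60 + 1 = 3,541.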

The greedy algorithm can be many orders of magnitude more efficient than an exhaustive search. For instance, a data set with 60 data blocks requires the analysis of $B_{60} = 9.77 \times 10^{59}$ partitioning schemes and $S_{60} = 1.15 \times 10^{18}$ subsets for an exhaustive search, but at most $P_{60\_greedy} = 35{,}991$ partitioning schemes and $S_{60\_greedy} = 3{,}541$ subsets with the heuristic algorithm described here.
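The worst-case figures for the greedy algorithm can be reproduced by counting round by round and comparing against the closed-form expressions. This is a small verification sketch; the function names are illustrative.

```python
from math import comb


def max_schemes_greedy(n):
    # Starting scheme plus C(k, 2) candidate merges for k = n, n-1, ..., 2.
    return 1 + sum(comb(k, 2) for k in range(2, n + 1))


def max_subsets_greedy(n):
    # n single-block subsets, C(n, 2) first-round merges, then the newest
    # subset paired with each remaining subset: n-2, n-3, ..., 1.
    return n + comb(n, 2) + sum(range(1, n - 1))


def n_subsets_exhaustive(n):
    # Every nonempty subset of n data blocks.
    return 2 ** n - 1


p60 = max_schemes_greedy(60)
s60 = max_subsets_greedy(60)
s60_exhaustive = n_subsets_exhaustive(60)
```

Both round-by-round counts agree with the closed forms 1 + n(n² − 1)/6 and n² − n + 1 for every n that is checked.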

Comparing Exhaustive and Heuristic Searches in

PartitionFinder

We tested the ability of our heuristic algorithm to ﬁnd

optimal partitioning schemes for ten data sets obtained

from Data Dryad (www.datadryad.org) and TreeBase

(www.treebase.org; table 1). The data sets we used range

from 13 to 164 taxa, from 1,896 to 9,005 bp, and from 6

to 12 data blocks (table 1). They include a range of introns,

protein-coding genes, and RNA genes from the mitochon-

drial and nuclear genomes and are typical of the multilocus

data sets routinely used for phylogenetic analyses.

For each nucleotide sequence alignment (table 1), we

excluded sites that had been excluded by the authors of

the original study and then deﬁned data blocks based

on genes and codon positions, treating transfer RNAs

(tRNAs) as a single data block. For some data sets, we ex-

cluded certain genes used in the original studies in order to

limit the size of each data set to a maximum of 12 data

blocks, thus permitting an exhaustive search of partitioning

schemes. To ﬁnd the optimal partitioning scheme, we used

the algorithm described above, implemented in Partition-

Finder, to perform an exhaustive search of all possible par-

titioning schemes on each data set. We then used

References

MrBayes 3: Bayesian phylogenetic inference under mixed models

RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models

New Algorithms and Methods to Estimate Maximum-Likelihood Phylogenies: Assessing the Performance of PhyML 3.0

BEAST: Bayesian evolutionary analysis by sampling trees

jModelTest: Phylogenetic Model Averaging
