Author

Daniel P. Faith

Bio: Daniel P. Faith is an academic researcher from the Commonwealth Scientific and Industrial Research Organisation. The author has contributed to research in topics: Phylogenetic diversity & Monophyly. The author has an h-index of 21 and has co-authored 31 publications receiving 6936 citations. Previous affiliations of Daniel P. Faith include Kansas Department of Agriculture, Division of Water Resources.

Papers
Journal ArticleDOI
TL;DR: Calculation of PD for different population subsets shows that protection of populations at either of two extremes of the geographic range of the group can significantly increase the phylogenetic diversity that is protected.

4,085 citations

Book ChapterDOI
TL;DR: In this article, the authors evaluated the robustness of quantitative measures of compositional dissimilarity between sites using extensive computer simulations of species' abundance patterns over one and two dimensional configurations of sample sites in ecological space.
Abstract: The robustness of quantitative measures of compositional dissimilarity between sites was evaluated using extensive computer simulations of species’ abundance patterns over one and two dimensional configurations of sample sites in ecological space. Robustness was equated with the strength, over a range of models, of the linear and monotonic (rank-order) relationship between the compositional dissimilarities and the corresponding Euclidean distances between sites measured in the ecological space. The range of models reflected different assumptions about species’ response curve shape, sampling pattern of sites, noise level of the data, species’ interactions, trends in total site abundance, and beta diversity of gradients.

1,530 citations

Journal ArticleDOI
TL;DR: A means of quantitative evaluation is presented, based on the tree length of the most parsimonious tree, which reflects the degree to which the observed characters co‐vary such that a single tree topology can explain shared character states among the taxa.

495 citations

Journal ArticleDOI
TL;DR: The strengths and limitations of Faith’s measure of ‘Phylogenetic Diversity’ (PD) as a method for predicting from multiple intraspecific phylogeographies the underlying feature diversity represented by combinations of areas are explored.
Abstract: Genetic diversity is recognized as a fundamental component of biodiversity and its protection is incorporated in several conventions and policies. However, neither the concepts nor the methods for assessing conservation value of the spatial distribution of genetic diversity have been resolved. Comparative phylogeography can identify suites of species that have a common history of vicariance. In this study we explore the strengths and limitations of Faith's measure of 'Phylogenetic Diversity' (PD) as a method for predicting from multiple intraspecific phylogeographies the underlying feature diversity represented by combinations of areas. An advantage of the PD approach is that information on the spatial distribution of genetic diversity can be combined across species and expressed in a form that allows direct comparison with patterns of species distributions. It also seeks to estimate the same parameter, feature diversity, regardless of the level of biological organization. We extend the PD approach by using Venn diagrams to identify the components of PD, including those unique to or shared among areas and those which represent homoplasy on an area tree or which are shared across all areas. PD estimation should be complemented by analysis of these components and inspection of the contributing phylogeographies. We illustrate the application of the approach using mtDNA phylogeographies from vertebrates resident in the wet tropical rainforests of north-east Queensland and compare the results to biodiversity assessments based on the distribution of endemic vertebrate species. The genetic vs. species approaches produce different assessments of conservation value, perhaps reflecting differences in the temporal and spatial scale of the determining processes. The two approaches should be seen as complementary and, in this case, conservation planning should incorporate information on both dimensions of biodiversity.
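The PD components described in the abstract above (branch length unique to an area versus shared among areas) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the tree encoding (`child -> (parent, branch_length)`), the function names, the toy tree and the two areas are hypothetical, and PD is taken as the summed length of the branches on the paths from the chosen taxa to the root (a root-inclusive convention). The homoplasy and shared-across-all-areas components mentioned in the abstract are not modelled.

```python
def spanned_branches(tree, taxa):
    """Return the branches (named by their child node) on the paths
    from each taxon up to the root. Assumption: root-inclusive PD."""
    spanned = set()
    for taxon in taxa:
        node = taxon
        # Stop at the root (no parent entry) or at a branch already counted.
        while node in tree and node not in spanned:
            spanned.add(node)
            node = tree[node][0]
    return spanned


def total_length(tree, branches):
    """Sum the lengths of the given branches."""
    return sum(tree[node][1] for node in branches)


def pd(tree, taxa):
    """Phylogenetic diversity of a set of taxa under the convention above."""
    return total_length(tree, spanned_branches(tree, taxa))


def pd_components(tree, area_a, area_b):
    """Split PD into the parts unique to each area and the part they share."""
    a = spanned_branches(tree, area_a)
    b = spanned_branches(tree, area_b)
    return {
        "unique_a": total_length(tree, a - b),
        "unique_b": total_length(tree, b - a),
        "shared": total_length(tree, a & b),
    }


# Hypothetical toy tree: child -> (parent, branch length).
toy_tree = {
    "A": ("X", 1.0),
    "B": ("X", 2.0),
    "X": ("root", 3.0),
    "C": ("root", 4.0),
}
north = {"A", "B"}  # hypothetical area holding lineages A and B
south = {"B", "C"}  # hypothetical area holding lineages B and C

print(pd(toy_tree, north), pd(toy_tree, south))
print(pd_components(toy_tree, north, south))
```

In this toy case the two-area Venn decomposition is just set arithmetic on branches: the unique and shared parts add up to the PD of both areas combined.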

313 citations

Journal ArticleDOI
TL;DR: Application of the bootstrap test revealed significant support for the hypothesis that the thylacine is not an outgroup to the Australian marsupials, and is the sister of the Dasyuridae.
Abstract: Recent work on randomization or permutation tests for cladistic structure (Archie, 1989a, 1989b; Faith, 1990; Faith and Cranston, 1991) has revealed some dramatic cases in which apparently phylogenetically informative data in fact have structure that could easily be matched by chance alone. In these tests a particular null model is used in which each character's states are reassigned randomly to the taxa, so that the resulting randomized data set represents random covariation among the characters. The corresponding null hypothesis is that the observed hierarchical structure could be found for such randomized data. A necessary companion to the random-character-covariation null model is some criterion for evaluating and quantifying hierarchical structure. This measure of structure represents a criterion model that prescribes the manner in which the data are expected to relate to the underlying pattern (Faith and Cranston, 1991). In the examples referred to above, hierarchical structure is measured by the length of the corresponding minimum-length tree (other measures of hierarchical structure have been used in a general version of this test; see Faith [1991]). The choice of the parsimony criterion of cladistics implies that the null hypothesis is evaluated by comparing the observed minimum length to that achieved for many randomized data sets. The proportion of all sets (observed and random) having lengths as short as or shorter than the observed length yields the "cladistic permutation tail probability," or PTP (Faith and Cranston, 1991; the same test was independently proposed by Archie [1989a]). The null hypothesis may be rejected, for example at the usual 0.05 level, if 5% or fewer of the data sets have a length equal to or less than that of the original (PTP < 0.05). One of the initial applications of this test (Faith, 1990) and the reply to it (Thomas et al., 1990) have raised some important points of controversy that will be addressed in this paper.
In an earlier paper, Thomas et al. (1989) used 12S ribosomal RNA gene sequence data to explore the relationships among South American and Australian marsupials, addressing longstanding controversies about the relationship of the presumed-extinct thylacine (Thylacinus) to these other taxa. Based on their cladistic analysis of these data, they argued that the thylacine is not an outgroup to the Australian marsupials, and is the sister of the Dasyuridae. However, application of the PTP test showed that this data set, which consisted of only seven phylogenetically informative characters (Table 1a), did not have significant cladistic structure. Thus, the 12S ribosomal RNA gene sequence data constituted insufficient evidence for the inference of the phylogenetic relationship of the thylacine to these other taxa (Faith, 1990). In their reply, Thomas et al. (1990) argued that although the data set as a whole did not have significant hierarchical structure, this was not directly relevant to their hypothesis of interest, which was represented in their reply as the specific question of the monophyly of thylacines and dasyurids. They claimed that, for the evaluation of such hypotheses of monophyly, the bootstrap (Felsenstein, 1985) was a more powerful test (see also Archie, 1989a). Application of the bootstrap test revealed significant support for their hypothesis of

296 citations
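The PTP procedure described in the abstract above (shuffle each character's states among the taxa, find the minimum tree length, and report the proportion of data sets at least as short as the observed one) can be sketched for a very small data set. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names and the toy matrix are hypothetical, the minimum length is found by exhaustively scoring every rooted binary tree with Fitch parsimony (only feasible for a handful of taxa), and all characters are treated as unordered with equal weight.

```python
import random


def insert_leaf(tree, leaf):
    """Yield every tree obtained by attaching leaf to one edge of tree."""
    yield (leaf, tree)  # attach above the current root
    if isinstance(tree, tuple):
        left, right = tree
        for new_left in insert_leaf(left, leaf):
            yield (new_left, right)
        for new_right in insert_leaf(right, leaf):
            yield (left, new_right)


def all_trees(taxa):
    """Yield every rooted binary tree on the taxa as nested tuples."""
    if len(taxa) == 1:
        yield taxa[0]
        return
    for tree in all_trees(taxa[1:]):
        yield from insert_leaf(tree, taxa[0])


def fitch_length(tree, states):
    """Minimum number of changes for one character on one tree (Fitch)."""
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):
            return {states[node]}
        left, right = visit(node[0]), visit(node[1])
        if left & right:
            return left & right
        changes += 1
        return left | right

    visit(tree)
    return changes


def min_tree_length(matrix, trees):
    """Length of the shortest tree for a taxon -> state-string matrix."""
    n_chars = len(next(iter(matrix.values())))
    columns = [{t: s[i] for t, s in matrix.items()} for i in range(n_chars)]
    return min(sum(fitch_length(tree, col) for col in columns) for tree in trees)


def permute_characters(matrix, rng):
    """Reassign each character's states randomly to the taxa."""
    taxa = list(matrix)
    n_chars = len(matrix[taxa[0]])
    shuffled = []
    for i in range(n_chars):
        column = [matrix[t][i] for t in taxa]
        rng.shuffle(column)
        shuffled.append(column)
    return {t: "".join(col[k] for col in shuffled) for k, t in enumerate(taxa)}


def ptp(matrix, n_random=99, seed=0):
    """Proportion of all sets (observed and random) as short as the observed."""
    rng = random.Random(seed)
    trees = list(all_trees(list(matrix)))
    observed = min_tree_length(matrix, trees)
    as_short = sum(
        min_tree_length(permute_characters(matrix, rng), trees) <= observed
        for _ in range(n_random)
    )
    return observed, (as_short + 1) / (n_random + 1)


# Hypothetical toy matrix: 5 taxa, 9 binary characters with clean hierarchy.
toy_matrix = {
    "A": "111000000",
    "B": "111000000",
    "C": "000000111",
    "D": "000111111",
    "E": "000111111",
}
observed_length, ptp_value = ptp(toy_matrix)
print(observed_length, ptp_value)
```

Counting the observed data set in both numerator and denominator follows the "observed and random" wording of the abstract, so the smallest attainable value with 99 randomizations is 0.01.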


Cited by
Journal ArticleDOI
TL;DR: The neighbor-joining method and Sattath and Tversky's method are shown to be generally better than the other methods for reconstructing phylogenetic trees from evolutionary distance data.
Abstract: A new method called the neighbor-joining method is proposed for reconstructing phylogenetic trees from evolutionary distance data. The principle of this method is to find pairs of operational taxonomic units (OTUs [= neighbors]) that minimize the total branch length at each stage of clustering of OTUs starting with a starlike tree. The branch lengths as well as the topology of a parsimonious tree can quickly be obtained by using this method. Using computer simulation, we studied the efficiency of this method in obtaining the correct unrooted tree in comparison with that of five other tree-making methods: the unweighted pair group method of analysis, Farris's method, Sattath and Tversky's method, Li's method, and Tateno et al.'s modified Farris method. The new, neighbor-joining method and Sattath and Tversky's method are shown to be generally better than the other methods.

57,055 citations

Journal ArticleDOI
TL;DR: Which elements of this often-quoted strategy for graphical representation of multivariate (multi-species) abundance data have proved most useful in practical assessment of community change resulting from pollution impact are identified.
Abstract: In the early 1980s, a strategy for graphical representation of multivariate (multi-species) abundance data was introduced into marine ecology by, among others, Field, et al. (1982). A decade on, it is instructive to: (i) identify which elements of this often-quoted strategy have proved most useful in practical assessment of community change resulting from pollution impact; and (ii) ask to what extent evolution of techniques in the intervening years has added self-consistency and comprehensiveness to the approach. The pivotal concept has proved to be that of a biologically-relevant definition of similarity of two samples, and its utilization mainly in simple rank form, for example ‘sample A is more similar to sample B than it is to sample C’. Statistical assumptions about the data are thus minimized and the resulting non-parametric techniques will be of very general applicability. From such a starting point, a unified framework needs to encompass: (i) the display of community patterns through clustering and ordination of samples; (ii) identification of species principally responsible for determining sample groupings; (iii) statistical tests for differences in space and time (multivariate analogues of analysis of variance, based on rank similarities); and (iv) the linking of community differences to patterns in the physical and chemical environment (the latter also dictated by rank similarities between samples). Techniques are described that bring such a framework into place, and areas in which problems remain are identified. Accumulated practical experience with these methods is discussed, in particular applications to marine benthos, and it is concluded that they have much to offer practitioners of environmental impact studies on communities.

12,446 citations

Journal ArticleDOI
TL;DR: In this article, a non-parametric method for multivariate analysis of variance, based on sums of squared distances, is proposed as an alternative to the traditional multivariate analogues of ANOVA, whose assumptions are too stringent for most ecological multivariate data sets.
Abstract: Hypothesis-testing methods for multivariate data are needed to make rigorous probability statements about the effects of factors and their interactions in experiments. Analysis of variance is particularly powerful for the analysis of univariate data. The traditional multivariate analogues, however, are too stringent in their assumptions for most ecological multivariate data sets. Non-parametric methods, based on permutation tests, are preferable. This paper describes a new non-parametric method for multivariate analysis of variance, after McArdle and Anderson (in press). It is given here, with several applications in ecology, to provide an alternative and perhaps more intuitive formulation for ANOVA (based on sums of squared distances) to complement the description provided by McArdle and Anderson (in press) for the analysis of any linear model. It is an improvement on previous non-parametric methods because it allows a direct additive partitioning of variation for complex models. It does this while maintaining the flexibility and lack of formal assumptions of other non-parametric methods. The test-statistic is a multivariate analogue to Fisher's F-ratio and is calculated directly from any symmetric distance or dissimilarity matrix. P-values are then obtained using permutations. Some examples of the method are given for tests involving several factors, including factorial and hierarchical (nested) designs and tests of interactions.
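As a rough illustration of the idea in the abstract above, the sketch below computes a one-way pseudo-F statistic directly from a symmetric distance matrix, using sums of squared distances, and obtains a P-value by permuting group labels. It is a minimal sketch under assumptions, not the paper's software: only the one-factor design is covered, the function names and the toy data are hypothetical, and the partitioning uses the standard identities SS_total = (1/N) Σ d² over all pairs and SS_within = Σ over groups of (1/n) Σ d² over within-group pairs.

```python
import random
from itertools import combinations


def pseudo_f(dist, labels):
    """One-way pseudo-F from a symmetric distance matrix and group labels."""
    n = len(labels)
    groups = sorted(set(labels))
    # Total sum of squares: squared distances over all pairs, divided by N.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    # Within-group sum of squares: same quantity computed inside each group.
    ss_within = 0.0
    for group in groups:
        members = [i for i, label in enumerate(labels) if label == group]
        ss_within += sum(
            dist[i][j] ** 2 for i, j in combinations(members, 2)
        ) / len(members)
    ss_among = ss_total - ss_within
    a = len(groups)
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permutation_p_value(dist, labels, n_perm=999, seed=0):
    """P-value: share of label permutations with F at least the observed F."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    at_least = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            at_least += 1
    # The observed arrangement counts as one of the permutations.
    return (at_least + 1) / (n_perm + 1)


# Hypothetical toy data: six sites on one axis, two groups of three.
values = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
toy_labels = ["x", "x", "x", "y", "y", "y"]
toy_dist = [[abs(u - v) for v in values] for u in values]

f_observed = pseudo_f(toy_dist, toy_labels)
p_value = permutation_p_value(toy_dist, toy_labels)
print(f_observed, p_value)
```

With Euclidean distances on a single variable, as in this toy case, the pseudo-F coincides with the classical univariate F-ratio, which gives a convenient sanity check; any other symmetric dissimilarity matrix could be passed in the same way.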

12,328 citations

Journal ArticleDOI
22 Apr 2013-PLOS ONE
TL;DR: The phyloseq project for R is a new open-source software package dedicated to the object-oriented representation and analysis of microbiome census data in R, which supports importing data from a variety of common formats, as well as many analysis techniques.
Abstract: Background The analysis of microbial communities through DNA sequencing brings many challenges: the integration of different types of data with methods from ecology, genetics, phylogenetics, multivariate statistics, visualization and testing. With the increased breadth of experimental designs now being pursued, project-specific statistical analyses are often needed, and these analyses are often difficult (or impossible) for peer researchers to independently reproduce. The vast majority of the requisite tools for performing these analyses reproducibly are already implemented in R and its extensions (packages), but with limited support for high throughput microbiome census data. Results Here we describe a software project, phyloseq, dedicated to the object-oriented representation and analysis of microbiome census data in R. It supports importing data from a variety of common formats, as well as many analysis techniques. These include calibration, filtering, subsetting, agglomeration, multi-table comparisons, diversity analysis, parallelized Fast UniFrac, ordination methods, and production of publication-quality graphics; all in a manner that is easy to document, share, and modify. We show how to apply functions from other R packages to phyloseq-represented data, illustrating the availability of a large number of open source analysis techniques. We discuss the use of phyloseq with tools for reproducible research, a practice common in other fields but still rare in the analysis of highly parallel microbiome census data. We have made available all of the materials necessary to completely reproduce the analysis and figures included in this article, an example of best practices for reproducible research. Conclusions The phyloseq project for R is a new open-source software package, freely available on the web from both GitHub and Bioconductor.

11,272 citations

Book
21 Mar 2002
TL;DR: An essential textbook for any student or researcher in biology needing to design experiments, sample programs or analyse the resulting data. It covers both classical and Bayesian philosophies of estimation and hypothesis testing before advancing to the analysis of linear and generalized linear models. Topics covered include linear and logistic regression, simple and complex ANOVA models (for factorial, nested, block, split-plot and repeated measures and covariance designs), and log-linear models. Multivariate techniques, including classification and ordination, are then introduced.
Abstract: An essential textbook for any student or researcher in biology needing to design experiments, sample programs or analyse the resulting data. The text begins with a revision of estimation and hypothesis testing methods, covering both classical and Bayesian philosophies, before advancing to the analysis of linear and generalized linear models. Topics covered include linear and logistic regression, simple and complex ANOVA models (for factorial, nested, block, split-plot and repeated measures and covariance designs), and log-linear models. Multivariate techniques, including classification and ordination, are then introduced. Special emphasis is placed on checking assumptions, exploratory data analysis and presentation of results. The main analyses are illustrated with many examples from published papers and there is an extensive reference list to both the statistical and biological literature. The book is supported by a website that provides all data sets, questions for each chapter and links to software.

9,509 citations