
Showing papers in "Methods in Ecology and Evolution in 2012"


Journal ArticleDOI
TL;DR: A new, multifunctional phylogenetics package, phytools, for the R statistical computing environment is presented, with a focus on methods for phylogenetic comparative biology.
Abstract: Summary 1. Here, I present a new, multifunctional phylogenetics package, phytools, for the R statistical computing environment. 2. The focus of the package is on methods for phylogenetic comparative biology; however, it also includes tools for tree inference, phylogeny input/output, plotting, manipulation and several other tasks. 3. I describe and tabulate the major methods implemented in phytools, and in addition provide some demonstration of its use in the form of two illustrative examples. 4. Finally, I conclude by briefly describing an active web-log that I use to document present and future developments for phytools. I also note other web resources for phylogenetics in the R computational environment.
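
To give a feel for the package, here is a minimal sketch of a phytools comparative workflow on simulated data; pbtree, fastBM, phylosig and contMap are phytools functions, and the parameter values are illustrative only.

```r
# Minimal phytools sketch: simulate a tree and a trait, then analyse them.
library(phytools)

set.seed(1)
tree <- pbtree(n = 50)    # simulate a 50-taxon pure-birth tree
x    <- fastBM(tree)      # simulate a continuous trait under Brownian motion

# Test for phylogenetic signal (Blomberg's K, with a randomization test)
phylosig(tree, x, method = "K", test = TRUE)

# Map the trait's reconstructed history onto the tree
contMap(tree, x)
```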

6,404 citations


Journal ArticleDOI
TL;DR: In this article, the authors conducted a comprehensive comparative analysis based on simple simulated species distributions to propose guidelines on how, where and how many pseudo-absences should be generated to build reliable species distribution models.
Abstract: Summary 1. Species distribution models are increasingly used to address questions in conservation biology, ecology and evolution. The most effective species distribution models require data on both species presence and the available environmental conditions (known as background or pseudo-absence data) in the area. However, there is still no consensus on how and where to sample these pseudo-absences and how many. 2. In this study, we conducted a comprehensive comparative analysis based on simple simulated species distributions to propose guidelines on how, where and how many pseudo-absences should be generated to build reliable species distribution models. Depending on the quantity and quality of the initial presence data (unbiased vs. climatically or spatially biased), we assessed the relative effect of the method for selecting pseudo-absences (random vs. environmentally or spatially stratified) and their number on the predictive accuracy of seven common modelling techniques (regression, classification and machine-learning techniques). 3. When using regression techniques, the method used to select pseudo-absences had the greatest impact on the model’s predictive accuracy. Randomly selected pseudo-absences yielded the most reliable distribution models. Models fitted with a large number of pseudo-absences but equally weighted to the presences (i.e. the weighted sum of presence equals the weighted sum of pseudo-absence) produced the most accurate predicted distributions. For classification and machine-learning techniques, the number of pseudo-absences had the greatest impact on model accuracy, and averaging several runs with fewer pseudo-absences than for regression techniques yielded the most predictive models. 4. Overall, we recommend the use of a large number (e.g. 10 000) of pseudo-absences with equal weighting for presences and absences when using regression techniques (e.g. generalised linear model and generalised additive model); averaging several runs (e.g. 10) with fewer pseudo-absences (e.g. 100) with equal weighting for presences and absences with multiple adaptive regression splines and discriminant analyses; and using the same number of pseudo-absences as available presences (averaging several runs if few pseudo-absences) for classification techniques such as boosted regression trees, classification trees and random forest. In addition, we recommend the random selection of pseudo-absences when using regression techniques and the random selection of geographically and environmentally stratified pseudo-absences when using classification and machine-learning techniques.
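
As a concrete illustration of the recommendation for regression techniques, here is a base-R sketch; the objects `env` (covariates across the study area) and `pres` (covariates at presence records), and the covariate names `temp` and `precip`, are hypothetical.

```r
# Sketch: many random pseudo-absences, weighted so that presences and
# pseudo-absences contribute equally to the likelihood.
n_pa <- 10000
pa   <- env[sample(nrow(env), n_pa, replace = TRUE), ]  # random pseudo-absences

dat <- rbind(cbind(pres, y = 1), cbind(pa, y = 0))

# Equal weighting: weighted sum of presences == weighted sum of pseudo-absences
w <- ifelse(dat$y == 1, 1, nrow(pres) / n_pa)

# A warning about non-integer weights is expected under this weighting scheme
fit <- glm(y ~ temp + precip, family = binomial, data = dat, weights = w)
```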

1,648 citations


Journal ArticleDOI
TL;DR: Betapart as mentioned in this paper is an R package for computing total dissimilarity as Sorensen or Jaccard indices, as well as their respective turnover and nestedness components.
Abstract: Summary 1. Beta diversity, that is, the variation in species composition among sites, can be the result of species replacement between sites (turnover) and species loss from site to site (nestedness). 2. We present betapart, an R package for computing total dissimilarity as Sorensen or Jaccard indices, as well as their respective turnover and nestedness components. 3. betapart allows the assessment of spatial patterns of beta diversity using multiple-site dissimilarity measures accounting for compositional heterogeneity across several sites or pairwise measures providing distance matrices accounting for the multivariate structure of dissimilarity. 4. betapart also allows computing patterns of temporal difference in assemblage composition, and its turnover and nestedness components. 5. Several example analyses are shown, using the data included in the package, to illustrate the relevance of separating the turnover and nestedness components of beta diversity to infer different mechanisms behind biodiversity patterns.
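
A minimal sketch of the package's core calls; `comm` (a sites-by-species presence/absence matrix) is hypothetical.

```r
# betapart: partition beta diversity into turnover and nestedness components.
library(betapart)

# Multiple-site dissimilarity and its turnover / nestedness components
beta.multi(comm, index.family = "sorensen")

# Pairwise distance matrices: turnover, nestedness and total dissimilarity
beta.pair(comm, index.family = "jaccard")
```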

1,429 citations


Journal ArticleDOI
TL;DR: The Standardised Major Axis Tests and Routines (SMATR) software provides tools for estimation and inference about allometric lines, currently widely used in ecology and evolution.
Abstract: Summary 1. The Standardised Major Axis Tests and Routines (SMATR) software provides tools for estimation and inference about allometric lines, currently widely used in ecology and evolution. 2. This paper describes some significant improvements to the functionality of the package, now available for R as smatr version 3. 3. New features include sma and ma functions that accept formula input and perform the key inference tasks; multiple comparisons; graphical methods for visualising data and checking (S)MA assumptions; and robust (S)MA estimation and inference tools.
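
A short sketch of the formula interface on simulated allometric data; the data here are invented for illustration.

```r
# smatr v3: fit standardised major axes via the formula interface.
library(smatr)

set.seed(1)
d <- data.frame(x = rlnorm(60), g = gl(2, 30))
d$y <- d$x^0.75 * rlnorm(60, sdlog = 0.1)

# Fit SMA lines on log scales and test for a common slope between groups
fit <- sma(y ~ x * g, data = d, log = "xy")
summary(fit)
plot(fit)   # visualise the data and fitted (S)MA lines
```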

1,204 citations


Journal ArticleDOI
TL;DR: The mvabund package for R provides tools for model-based analysis of multivariate abundance data in ecology, which includes methods for visualising data, fitting predictive models, checking model assumptions, as well as testing hypotheses about the community–environment association.
Abstract: Summary 1. The mvabund package for R provides tools for model-based analysis of multivariate abundance data in ecology. 2. This includes methods for visualising data, fitting predictive models, checking model assumptions, as well as testing hypotheses about the community–environment association. 3. This paper briefly introduces the package and demonstrates its functionality by example.
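
A minimal sketch of the mvabund workflow, using the hunting-spider data bundled with the package.

```r
# mvabund: model-based analysis of multivariate abundance data.
library(mvabund)

data(spider)
abund <- mvabund(spider$abund)   # abundances of 12 hunting spider species
soil  <- spider$x$soil.dry       # an environmental covariate

fit <- manyglm(abund ~ soil, family = "negative.binomial")

plot(fit)    # residual plots for checking model assumptions
anova(fit)   # resampling-based test of the community-environment association
```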

1,142 citations


Journal ArticleDOI
TL;DR: In this paper, the authors consider what is implicitly assumed about the mean–variance relationship in distance-based analyses and what the effect of any misspecification of that relationship is.
Abstract: Summary 1. A critical property of count data is its mean–variance relationship, yet this is rarely considered in multivariate analysis in ecology. 2. This study considers what is being implicitly assumed about the mean–variance relationship in distance-based analyses – multivariate analyses based on a matrix of pairwise distances – and what the effect is of any misspecification of the mean–variance relationship. 3. It is shown that distance-based analyses make implicit assumptions that are typically out-of-step with what is observed in real data, which has major consequences. 4. Potential consequences of this mean–variance misspecification are: confounding location and dispersion effects in ordinations; misleading results when trying to identify taxa in which an effect is expressed; failure to detect a multivariate effect unless it is expressed in high-variance taxa. 5. Data transformation does not solve the problem. 6. A solution is to use generalised linear models and their recent multivariate generalisations, which is shown here to have desirable properties.
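
The mean–variance problem highlighted here can be inspected directly with mvabund's meanvar.plot, shown below on the spider data bundled with that package.

```r
# Plot per-species variances against means on log-log axes; counts typically
# show variance increasing with the mean, unlike the constant variance
# implicitly assumed by many distance-based analyses.
library(mvabund)

data(spider)
abund <- mvabund(spider$abund)

meanvar.plot(abund)
```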

883 citations


Journal ArticleDOI
TL;DR: The R package ‘diversitree’ contains a number of classical and contemporary comparative phylogenetic methods that are suitable for analysing trait evolution and estimating speciation/extinction rates independently.
Abstract: Summary 1. The R package ‘diversitree’ contains a number of classical and contemporary comparative phylogenetic methods. Key included methods are BiSSE (binary state speciation and extinction), MuSSE (a multistate extension of BiSSE), and QuaSSE (quantitative state speciation and extinction). Diversitree also includes methods for analysing trait evolution and estimating speciation/extinction rates independently. 2. In this note, I describe the features and demonstrate use of the package, using a new method, MuSSE (multistate speciation and extinction), to examine the joint effects of two traits on speciation. 3. Using simulations, I found that MuSSE could reliably detect the effect of a binary trait on speciation rates while simultaneously accounting for additional traits that had no effect on speciation rates. 4. Diversitree is open source and available on the Comprehensive R Archive Network (CRAN). A tutorial and worked examples can be downloaded from http://www.zoology.ubc.ca/prog/diversitree.
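
A minimal BiSSE sketch with diversitree, simulating a tree under known parameters and refitting them by maximum likelihood; the parameter values are illustrative.

```r
library(diversitree)

pars <- c(0.1, 0.2, 0.03, 0.03, 0.01, 0.01)  # lambda0, lambda1, mu0, mu1, q01, q10
set.seed(2)
phy  <- tree.bisse(pars, max.taxa = 100, x0 = 0)  # simulate under BiSSE

lik <- make.bisse(phy, phy$tip.state)             # likelihood function
fit <- find.mle(lik, starting.point.bisse(phy))   # maximum-likelihood fit
round(coef(fit), 3)
```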

807 citations


Journal ArticleDOI
TL;DR: Phylogenetic signal is the tendency of related species to resemble each other more than species drawn at random from the same tree and various indices have been proposed for quantifying it.
Abstract: 1. Phylogenetic signal is the tendency of related species to resemble each other more than species drawn at random from the same tree. This pattern is of considerable interest in a range of ecological and evolutionary research areas, and various indices have been proposed for quantifying it. Unfortunately, these indices often lead to contrasting results, and guidelines for choosing the most appropriate index are lacking. 2. Here, we compare the performance of four commonly used indices using simulated data. Data were generated with numerical simulations of trait evolution along phylogenetic trees under a variety of evolutionary models. We investigated the sensitivity of the approaches to the size of phylogenies, the resolution of tree structure and the availability of branch length information, examining both the response of the selected indices and the power of the associated statistical tests. 3. We found that under a Brownian motion (BM) model of trait evolution, Abouheif's Cmean and Pagel's λ performed well and substantially better than Moran's I and Blomberg's K. Pagel's λ provided a reliable effect size measure and performed better for discriminating between more complex models of trait evolution, but was computationally more demanding than Abouheif's Cmean. Blomberg's K was most suitable to capture the effects of changing evolutionary rates in simulation experiments. 4. Interestingly, sample size influenced not only the uncertainty but also the expected values of most indices, while polytomies and missing branch length information had only negligible impacts. 5. We propose guidelines for choosing among indices, depending on (a) their sensitivity to true underlying patterns of phylogenetic signal, (b) whether a test or a quantitative measure is required and (c) their sensitivities to different topologies of phylogenies. 6. These guidelines aim to better assess phylogenetic signal and distinguish it from random trait distributions. They were developed under the assumption of BM, and additional simulations with more complex trait evolution models show that they are to a certain degree generalizable. They are particularly useful in comparative analyses, when requiring a proxy for niche similarity, and in conservation studies that explore phylogenetic loss associated with extinction risks of specific clades.
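
Three of the compared indices can be computed in R as sketched below (simulated Brownian trait): Pagel's λ and Blomberg's K via phytools, and Moran's I via ape with a phylogenetic proximity matrix; Abouheif's Cmean is available in the adephylo package.

```r
library(phytools)
library(ape)

set.seed(1)
tree <- pbtree(n = 100)
x    <- fastBM(tree)   # trait simulated under Brownian motion

phylosig(tree, x, method = "lambda", test = TRUE)  # Pagel's lambda
phylosig(tree, x, method = "K",      test = TRUE)  # Blomberg's K

w <- 1 / cophenetic(tree)   # proximity = inverse patristic distance
diag(w) <- 0
Moran.I(x[rownames(w)], w)  # Moran's I
```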

744 citations


Journal ArticleDOI
TL;DR: ABC as discussed by the authors is an R package that implements several approximate Bayesian computation (ABC) algorithms for parameter estimation and model selection, in particular the recently developed nonlinear heteroscedastic regression methods for ABC.
Abstract: Summary 1. Many recent statistical applications involve inference under complex models, where it is computationally prohibitive to calculate likelihoods but possible to simulate data. Approximate Bayesian computation (ABC) is devoted to these complex models because it bypasses the evaluation of the likelihood function by comparing observed and simulated data. 2. We introduce the R package ‘abc’ that implements several ABC algorithms for performing parameter estimation and model selection. In particular, the recently developed nonlinear heteroscedastic regression methods for ABC are implemented. The ‘abc’ package also includes a cross-validation tool for measuring the accuracy of ABC estimates and to calculate the misclassification probabilities when performing model selection. The main functions are accompanied by appropriate summary and plotting tools. 3. R is already widely used in bioinformatics and several fields of biology. The R package ‘abc’ will make the ABC algorithms available to a large number of R users. ‘abc’ is a freely available R package under the GPL license, and it can be downloaded at http://cran.r-project.org/web/packages/
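
A sketch of the main calls; `obs_stats` (observed summary statistics), `sim_params` and `sim_stats` (parameters and summary statistics from model simulations) are hypothetical objects the user would supply.

```r
library(abc)

# Parameter estimation with the nonlinear (neural network) regression adjustment
est <- abc(target  = obs_stats,
           param   = sim_params,
           sumstat = sim_stats,
           tol     = 0.05,
           method  = "neuralnet")
summary(est)

# Cross-validation to gauge the accuracy of the ABC estimates
cv <- cv4abc(param = sim_params, sumstat = sim_stats,
             nval = 100, tols = c(0.01, 0.05, 0.1), method = "neuralnet")
summary(cv)
```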

622 citations


Journal ArticleDOI
TL;DR: In this paper, the authors developed a set of interpolated climate surfaces at 10′ and 30′ resolution for global land areas excluding Antarctica, with input data for the baseline climatology gathered from the WorldClim and CRU CL 1·0 and CL 2·0 data sets.
Abstract: Summary 1. Gridded climatologies have become an indispensable component of bioclimatic modelling, with a range of applications spanning conservation and pest management. Such globally conformal data sets of historical and future scenario climate surfaces are required to model species potential ranges under current and future climate scenarios. 2. We developed a set of interpolated climate surfaces at 10′ and 30′ resolution for global land areas excluding Antarctica. Input data for the baseline climatology were gathered from the WorldClim and CRU CL 1·0 and CL 2·0 data sets. A set of future climate scenarios was generated at 10′ resolution. For each of the historical and future scenario data sets, the full set of 35 Bioclim variables was generated. Climate variables (including relative humidity at 0900 and 1500 hours) were also generated.

600 citations


Journal ArticleDOI
TL;DR: It is demonstrated, for the first time, that metabarcoding allows for the precise estimation of pairwise community dissimilarity (beta diversity) and within-community phylogenetic diversity (alpha diversity), despite the inevitable loss of taxonomic information inherent to metabarcoding.
Abstract: Summary 1. Traditional biodiversity assessment is costly in time, money and taxonomic expertise. Moreover, data are frequently collected in ways (e.g. visual bird lists) that are unsuitable for auditing by neutral parties, which is necessary for dispute resolution. 2. We present protocols for the extraction of ecological, taxonomic and phylogenetic information from bulk samples of arthropods. The protocols combine mass trapping of arthropods, mass-PCR amplification of the COI barcode gene, pyrosequencing and bioinformatic analysis, which together we call ‘metabarcoding’. 3. We construct seven communities of arthropods (mostly insects) and show that it is possible to recover a substantial proportion of the original taxonomic information. We further demonstrate, for the first time, that metabarcoding allows for the precise estimation of pairwise community dissimilarity (beta diversity) and within-community phylogenetic diversity (alpha diversity), despite the inevitable loss of taxonomic information inherent to metabarcoding. 4. Alpha and beta diversity metrics are the raw materials of ecology and the environmental sciences, facilitating assessment of the state of the environment with a broad and efficient measure of biodiversity.

Journal ArticleDOI
TL;DR: In this paper, the authors provide guidelines for determining the sample size (number of individuals and number of measurements per individual) required to accurately estimate the intraclass correlation coefficient (ICC).
Abstract: Summary 1. Researchers frequently take repeated measurements of individuals in a sample with the goal of quantifying the proportion of the total variation that can be attributed to variation among individuals vs. variation among measurements within individuals. The proportion of the variation attributed to variation among individuals is known as repeatability and is most frequently estimated as the intraclass correlation coefficient (ICC). The goal of our study is to provide guidelines for determining the sample size (number of individuals and number of measurements per individual) required to accurately estimate the ICC. 2. We report a range of ICCs from the literature and estimate 95% confidence intervals for these estimates. We introduce a predictive equation derived by Bonett (2002), and we test the assumptions of this equation through simulation. Finally, we create an R statistical package for the planning of experiments and estimation of ICCs. 3. Repeatability estimates were reported in 1·5% of the articles published in the journals surveyed. Repeatabilities tended to be highest when the ICC was used to estimate measurement error and lowest when it was used to estimate repeatability of behavioural and physiological traits. Few authors report confidence intervals, but our estimated 95% confidence intervals for published ICCs generally indicated a low level of precision associated with these estimates. This survey demonstrates the need for a protocol to estimate repeatability. 4. Analysis of the predictions from Bonett’s equation over a range of sample sizes, expected repeatabilities and desired confidence interval widths yields both analytical and intuitive guidelines for designing experiments to estimate repeatability. However, we find a tendency for the confidence interval to be underestimated by the equation when ICCs are high and overestimated when ICCs and the number of measurements per individual are low. 5. The sample size to use when estimating repeatability is a question pitting investigator effort against expected precision of the estimate. We offer guidelines that apply over a wide variety of ecological and evolutionary studies estimating repeatability, measurement error or heritability. Additionally, we provide the R package, icc, to facilitate analyses and determine the most economic use of resources when planning experiments to estimate repeatability.
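
A minimal sketch of estimating an ICC with the paper's companion package (published on CRAN as ICC), using simulated repeated measurements: 20 individuals with 4 measurements each.

```r
library(ICC)

set.seed(1)
d <- data.frame(ind  = factor(rep(1:20, each = 4)),
                meas = rep(rnorm(20), each = 4) + rnorm(80, sd = 0.5))

# Point estimate and 95% confidence interval for the ICC
ICCest(ind, meas, data = d)
```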

Journal ArticleDOI
TL;DR: An intuitive method for evaluating transferability based on techniques currently in use in the area of species distribution modelling, which involves cross-validation in which data are assigned non-randomly to groups that are spatially, temporally or otherwise distinct, thus using heterogeneity in the data set as a surrogate for heterogeneity among data sets.
Abstract: Summary 1. Ecologists have long sought to distinguish relationships that are general from those that are idiosyncratic to a narrow range of conditions. Conventional methods of model validation and selection assess in- or out-of-sample prediction accuracy but do not assess model generality or transferability, which can lead to overestimates of performance when predicting in other locations, time periods or data sets. 2. We propose an intuitive method for evaluating transferability based on techniques currently in use in the area of species distribution modelling. The method involves cross-validation in which data are assigned non-randomly to groups that are spatially, temporally or otherwise distinct, thus using heterogeneity in the data set as a surrogate for heterogeneity among data sets. 3. We illustrate the method by applying it to distribution modelling of brook trout (Salvelinus fontinalis Mitchill) and brown trout (Salmo trutta Linnaeus) in the western United States. We show that machine-learning techniques such as random forests and artificial neural networks can produce models with excellent in-sample performance but poor transferability, unless complexity is constrained. In our example, traditional linear models have greater transferability. 4. We recommend the use of a transferability assessment whenever there is interest in making inferences beyond the data set used for model fitting. Such an assessment can be used both for validation and for model selection and provides important information beyond what can be learned from conventional validation and selection techniques.
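
A base-R sketch of the proposed assessment: cross-validation with non-random, spatially distinct folds. The data frame `dat` (with a `basin` grouping column), the response `presence` and the covariate names are hypothetical.

```r
blocks <- unique(dat$basin)
acc <- sapply(blocks, function(b) {
  train <- dat[dat$basin != b, ]      # fit on all other regions
  test  <- dat[dat$basin == b, ]      # predict into the held-out region
  fit   <- glm(presence ~ temp + slope, family = binomial, data = train)
  p     <- predict(fit, newdata = test, type = "response")
  mean((p > 0.5) == test$presence)    # simple classification accuracy
})
mean(acc)  # much lower than random-CV accuracy signals poor transferability
```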

Journal ArticleDOI
TL;DR: In this paper, the authors demonstrate a variety of nonlinear models that are appropriate for modelling plant growth and, for each, show how to calculate function-derived growth rates, which allow unbiased comparisons among species at a common time or size.
Abstract: Summary 1. Plant growth is a fundamental ecological process, integrating across scales from physiology to community dynamics and ecosystem properties. Recent improvements in plant growth modelling have allowed deeper understanding and more accurate predictions for a wide range of ecological issues, including competition among plants, plant–herbivore interactions and ecosystem functioning. 2. One challenge in modelling plant growth is that, for a variety of reasons, relative growth rate (RGR) almost universally decreases with increasing size, although traditional calculations assume that RGR is constant. Nonlinear growth models are flexible enough to account for varying growth rates. 3. We demonstrate a variety of nonlinear models that are appropriate for modelling plant growth and, for each, show how to calculate function-derived growth rates, which allow unbiased comparisons among species at a common time or size. We show how to propagate uncertainty in estimated parameters to express uncertainty in growth rates. Fitting nonlinear models can be challenging, so we present extensive worked examples and practical recommendations, all implemented in R. 4. The use of nonlinear models coupled with function-derived growth rates can facilitate the testing of novel hypotheses in population and community ecology. For example, the use of such techniques has allowed better understanding of the components of RGR, the costs of rapid growth and the linkage between host and parasite growth rates. We hope this contribution will demystify nonlinear modelling and persuade more ecologists to use these techniques.
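
A base-R sketch of the idea: fit a logistic growth curve with nls() and derive RGR from the fitted function. For f(t) = A / (1 + exp((xmid − t)/scal)), the function-derived growth rate is RGR(t) = f′(t)/f(t) = (1 − f(t)/A)/scal, which declines as plants grow; the data are simulated.

```r
set.seed(1)
t <- seq(0, 100, by = 5)
d <- data.frame(t = t,
                mass = 100 / (1 + exp((40 - t) / 10)) * exp(rnorm(21, sd = 0.05)))

# Self-starting logistic model
fit <- nls(mass ~ SSlogis(t, Asym, xmid, scal), data = d)
p   <- coef(fit)

f   <- p["Asym"] / (1 + exp((p["xmid"] - t) / p["scal"]))  # fitted size
rgr <- (1 - f / p["Asym"]) / p["scal"]                     # function-derived RGR
plot(t, rgr, type = "l", xlab = "time", ylab = "RGR")
```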

Journal ArticleDOI
Abstract: Summary 1. paleotree is a library of functions for the R statistical computing environment dedicated to analyses that combine paleontological and phylogenetic data sets, particularly the time-scaling of phylogenetic trees, which include extinct fossil lineages. 2. The functions included in this library focus on simulating paleontological data sets, measuring sampling rates, time-scaling cladograms of fossil taxa and plotting historical diversity curves. 3. I describe the capabilities and analytical basis of the functions in paleotree by presenting two examples. The first example showcases the simulation capabilities and plotting the output as diversity curves. The second example demonstrates time-scaling a cladogram of fossil taxa and estimating sampling rates and completeness from temporal ranges.
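
A brief sketch of the time-scaling step, assuming paleotree's timePaleoPhy() interface; `clad` (a cladogram of fossil taxa) and `ranges` (a matrix of first/last appearance times with rows matching tip labels) are hypothetical.

```r
library(paleotree)

# Time-scale the cladogram using the taxa's stratigraphic ranges
timetree <- timePaleoPhy(clad, ranges, type = "basic")
plot(timetree)
```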

Journal ArticleDOI
TL;DR: In this article, the authors provide a critical review of maxent as applied to species distribution modelling and discuss how it can lead to inferential errors, and demonstrate that maxent produces a number of poorly defined indices that are not directly related to the actual parameter of interest.
Abstract: Summary 1. Understanding the factors affecting species occurrence is a pre-eminent focus of applied ecological research. However, direct information about species occurrence is lacking for many species. Instead, researchers sometimes have to rely on so-called presence-only data (i.e. when no direct information about absences is available), which often results from opportunistic, unstructured sampling. maxent is a widely used software program designed to model and map species distribution using presence-only data. 2. We provide a critical review of maxent as applied to species distribution modelling and discuss how it can lead to inferential errors. A chief concern is that maxent produces a number of poorly defined indices that are not directly related to the actual parameter of interest – the probability of occurrence (ψ). This focus on an index was motivated by the belief that it is not possible to estimate ψ from presence-only data; however, we demonstrate that ψ is identifiable using conventional likelihood methods under the assumptions of random sampling and constant probability of species detection. 3. The model is implemented in a convenient R package which we use to apply the model to simulated data and data from the North American Breeding Bird Survey. We demonstrate that maxent produces extreme under-predictions when compared to estimates produced by logistic regression which uses the full (presence/absence) data set. We note that maxent predictions are extremely sensitive to specification of the background prevalence, which is not objectively estimated using the maxent method. 4. As with maxent, formal model-based inference requires a random sample of presence locations. Many presence-only data sets, such as those based on museum records and herbarium collections, may not satisfy this assumption. However, when sampling is random, we believe that inference should be based on formal methods that facilitate inference about interpretable ecological quantities instead of vaguely defined indices.
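
The abstract does not name the companion package; assuming it is the maxlike package associated with this likelihood approach, a sketch might look like the following, where `covs` (a RasterStack of covariates), `xy` (presence coordinates from random sampling) and the covariate names are hypothetical.

```r
library(maxlike)

# Maximum-likelihood estimation of occurrence probability from presence-only data
fit <- maxlike(~ elev + forest, rasters = covs, points = xy)
summary(fit)          # coefficients on the probability-of-occurrence scale
psi <- predict(fit)   # raster of estimated occurrence probability
```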

Journal ArticleDOI
TL;DR: The background and prevalence of thinning are discussed, its consequences illustrated, and the circumstances when it might be a reasonable option considered; the authors recommend against routine thinning of chains unless necessitated by computer memory limitations.
Abstract: Summary 1. Markov chain Monte Carlo (MCMC) is a simulation technique that has revolutionised the analysis of ecological data, allowing the fitting of complex models in a Bayesian framework. Since 2001, there have been nearly 200 papers using MCMC in publications of the Ecological Society of America and the British Ecological Society, including more than 75 in the journal Ecology and 35 in the Journal of Applied Ecology. 2. We have noted that many authors routinely ‘thin’ their simulations, discarding all but every kth sampled value; of the studies we surveyed with details on MCMC implementation, 40% reported thinning. 3. Thinning is often unnecessary and always inefficient, reducing the precision with which features of the Markov chain are summarised. The inefficiency of thinning MCMC output has been known since the early 1990s, long before MCMC appeared in ecological publications. 4. We discuss the background and prevalence of thinning, illustrate its consequences, discuss circumstances when it might be regarded as a reasonable option and recommend against routine thinning of chains unless necessitated by computer memory limitations.
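
The paper's point can be illustrated with coda: thinning discards information, so the effective sample size of a thinned chain is lower than that of the full chain it came from. An autocorrelated AR(1) series stands in for MCMC output here.

```r
library(coda)

set.seed(1)
chain <- mcmc(as.numeric(arima.sim(list(ar = 0.9), n = 20000)))

effectiveSize(chain)                      # ESS of the full chain
effectiveSize(window(chain, thin = 10))   # smaller ESS after keeping every 10th draw
```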

Journal ArticleDOI
TL;DR: In this paper, the authors investigated the influence of different calibration methods on the accuracy of latitudinal positions from geolocators, and demonstrated the effects of weather, topography and vegetation on the measurement of day and night length, time of solar midnight/noon and the resulting position estimates, using light measurements from stationary geolocators at known places and from geolocators mounted on birds.
Abstract: Summary 1. Geolocation by light allows for tracking animal movements, based on measurements of light intensity over time by a data-logging device (‘geolocator’). Recent developments of ultra-light devices (<2 g) broadened the range of target species and boosted the number of studies using geolocators. However, an inherent problem of geolocators is that any factor or process that changes the natural light intensity pattern also affects the positions calculated from these light patterns. Although the most important factors have been identified, estimation of their effect on the accuracy and precision of positions estimated has been lacking but is very important for the analyses and interpretation of geolocator data. 2. The ‘threshold method’ is mainly used to derive positions by defining sunrise and sunset times from the light intensity pattern for each recorded day. This method requires calibration: a predefined sun elevation angle for estimating latitude by fitting the recorded day/night lengths to theoretical values across latitudes. Therewith, almost constant shading can be corrected for by finding the appropriate sun elevation angle. 3. Weather, topography and vegetation are the most important factors that influence light intensities. We demonstrated their effect on the measurement of day/night length, time of solar midnight/noon and the resulting position estimates using light measurements from stationary geolocators at known places and from geolocators mounted on birds. Furthermore, we investigated the influence of different calibration methods on the accuracy of the latitudinal positions. 4. All three environmental factors can influence the light intensity pattern significantly. Weather and an animal’s behaviour result in increased noise in positioning, whereas topography and vegetation result in systematic shading and biased positions. Calibration can significantly shift the estimated latitudes and potentially increase the accuracy, but detailed knowledge about the particular confounding factors and the behaviour of the studied animal is crucial for the choice of the most appropriate calibration method.
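
The latitude step of the threshold method can be sketched in base R: given a calibrated sun elevation angle and the solar declination, day length fixes latitude through the standard sunrise equation, solved numerically below. The function and its argument names are illustrative, not from any geolocation package.

```r
lat_from_daylength <- function(daylen_h, decl_deg, sun_elev_deg = -6) {
  a <- sun_elev_deg * pi / 180
  d <- decl_deg * pi / 180
  f <- function(phi) {  # day length (h) at latitude phi (radians), minus target
    cosH <- (sin(a) - sin(phi) * sin(d)) / (cos(phi) * cos(d))
    24 * acos(pmin(pmax(cosH, -1), 1)) / pi - daylen_h
  }
  uniroot(f, interval = c(-65, 65) * pi / 180)$root * 180 / pi
}

# Example: a 14-h 'day' at the June solstice (declination +23.44 degrees)
lat_from_daylength(14, 23.44)
```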

Journal ArticleDOI
TL;DR: In this paper, aerial images of canopy gaps are used to assess floristic biodiversity of the forest understorey, an approach that can serve as a coarse filter for conservation in forests wherever light strongly limits regeneration and biodiversity.
Abstract: Summary 1. Structural diversity and niche differences within habitats are important for stabilizing species coexistence. However, land-use change leading to environmental homogenization is a major cause for the dramatic decline of biodiversity under global change. The difficulty in assessing large-scale biodiversity losses urgently requires new technological advances to evaluate land-use impact on diversity timely and efficiently across space. 2. While cost-effective aerial images have been suggested for potential biodiversity assessments in forests, correlation of canopy object variables such as gaps with plant or animal diversity has so far not been demonstrated using these images. 3. Here, we show that aerial images of canopy gaps can be used to assess floristic biodiversity of the forest understorey. This approach is made possible because we employed cutting-edge unmanned aerial vehicles and very high-resolution images (7 cm pixel⁻¹) of the canopy properties. We demonstrate that detailed, spatially implicit information on gap shape metrics is sufficient to reveal strong dependency between disturbance patterns and plant diversity (R² up to 0·74). This is feasible because opposing disturbance patterns such as aggregated and dispersed tree retention directly correspond to different functional and dispersal traits of species and ultimately to different species diversities. 4. Our findings can be used as a coarse-filter approach to conservation in forests wherever light strongly limits regeneration and biodiversity.

Journal ArticleDOI
TL;DR: Douglas Argos-filter can improve data accuracy by 50–90% and is an effective and flexible tool for preparing Argos data for direct biological interpretation or subsequent modelling.
Abstract: Summary The Argos System is used worldwide to satellite-track free-ranging animals, but location errors can range from tens of metres to hundreds of kilometres. Low-quality locations (Argos classes A, 0, B and Z) dominate animal tracking data. Standard-quality animal tracking locations (Argos classes 3, 2 and 1) have larger errors than those reported in Argos manuals. The Douglas Argos-filter (DAF) algorithm flags implausible locations based on user-defined thresholds that allow the algorithm's performance to be tuned to species' movement behaviours and study objectives. The algorithm is available in Movebank – a free online infrastructure for storing, managing, sharing and analysing animal movement data. We compared 21,044 temporally paired global positioning system (GPS) locations with Argos location estimates collected from Argos transmitters on free-ranging waterfowl and condors (13 species, 314 individuals, 54,895 animal-tracking days). The 95th error percentiles for unfiltered Argos locations 0, A, B and Z were within 35·8, 59·6, 163·2 and 220·2 km of the true location, respectively. After applying DAF with liberal thresholds, roughly 20% of the class 0 and A locations and 45% of the class B and Z locations were excluded, and the 95th error percentiles were reduced to 17·2, 15·0, 20·9 and 18·6 km for classes 0, A, B and Z, respectively. As thresholds were applied more conservatively, fewer locations were retained, but they possessed higher overall accuracy. Douglas Argos-filter can improve data accuracy by 50–90% and is an effective and flexible tool for preparing Argos data for direct biological interpretation or subsequent modelling.

Journal ArticleDOI
TL;DR: The R package GeoLight is presented, which provides basic functions for all steps of determining global position and a new approach to analysing movement patterns; the major functions of the package are discussed using example movement data of the European hoopoe.
Abstract: Summary Determining global position by light measurements (‘geolocation’) has revolutionised the methods used to track migratory birds throughout their annual cycle. To date, there is no standard way of analysing geolocator data, making communication of analyses cumbersome and hampering the reproducibility of results. We have, therefore, developed the R package GeoLight, which provides basic functions for all steps of determining global position and a new approach to analysing movement patterns. Here, we briefly introduce and discuss the major functions of this package using example movement data of the European hoopoe (Upupa epops).
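
A minimal sketch assuming GeoLight's twilight-based interface; `twl` is taken to be a data frame of twilight events in the package's format (columns tFirst, tSecond, type), such as twilightCalc() produces from raw light data, and the sun elevation angle is illustrative.

```r
library(GeoLight)

# Positions from consecutive twilights, given a calibrated sun elevation angle
pos <- coord(twl$tFirst, twl$tSecond, twl$type, degElevation = -5.5)

# Detect changes in the twilight pattern indicating movement or stopovers
changeLight(twl$tFirst, twl$tSecond, twl$type, quantile = 0.9)
```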

Journal ArticleDOI
TL;DR: In this paper, the authors present a method to measure metabolites of steroid hormones from faeces, which can be used in wildlife conservation and ecology, without the necessity to capture the animals.
Abstract: Summary 1. Methods to measure metabolites of steroid hormones from faeces have become very popular in wildlife conservation and ecology, because they allow gathering physiological data without the necessity to capture the animals. However, this advantage comes at costs that are particularly relevant when studying free-living animals in their natural environments. Previous methodological reviews have stressed the importance of validations to prove that real metabolites of the hormone in question are measured, but the research community has largely ignored further caveats relating to sex, diet, metabolic rate and individual differences in hormone metabolite formation. 2. Often the sexes differ in how they metabolize hormones. As a consequence, one may not be able to compare hormone metabolite concentrations between males and females of one species. 3. Diet can alter the way hormones are metabolized, and different diets can change the amount of faecal bulk. Both phenomena can result in measurement artefacts that may seriously distort the estimation of hormone metabolite concentrations. As a consequence, comparisons of hormone metabolite concentrations, for example, between seasons or populations, may become problematic. 4. Changes in ambient temperature and food availability may trigger large fluctuations in metabolic rate of free-living animals. These fluctuations may then result in major distortions of faecal hormone metabolite concentrations without any change in bioactive hormone levels. 5. Bacteria metabolize hormones in the gut. Individual differences in bacterial composition can cause differences in how hormones are decomposed. Thus, individuals may differ with regard to what kind of hormone metabolites they form and with regard to the relative composition of these hormone metabolites. As only specific metabolites are measured, differences in metabolism may distort the results. 6. In summary, non-invasive hormone research measures various end products of a hormone after its clearance from the circulation and extensive modification by bacteria. Not only does this increase random variance, it may also generate systematic noise, which may seriously distort the signal (i.e. the hormonal status of the individual) in a non-random manner. Thus, we still need to learn much more about whether this widely used technique reliably measures the physiological status of animals in uncontrolled environments.

Journal ArticleDOI
TL;DR: In this article, the authors discuss the connection between spatial point, count, and presence-absence methods and how their parameter estimates and predictions should be interpreted and illustrate that under certain assumptions, each method can be motivated by the same underlying spatial inhomogeneous Poisson point process (IPP) model in which the intensity function is modelled as a log-linear function of covariates.
Abstract: 1. The need to understand the processes shaping population distributions has resulted in a vast increase in the diversity of spatial wildlife data, leading to the development of many novel analytical techniques that are fit-for-purpose. One may aggregate location data into spatial units (e.g. grid cells) and model the resulting counts or presence–absences as a function of environmental covariates. Alternatively, the point data may be modelled directly, by combining the individual observations with a set of random or regular points reflecting habitat availability, a method known as a use-availability design (or, alternatively, a presence–pseudo-absence or case–control design). 2. Although these spatial point, count and presence–absence methods are widely used, the ecological literature is not explicit about their connections and how their parameter estimates and predictions should be interpreted. The objective of this study is to recapitulate some recent statistical results and illustrate that under certain assumptions, each method can be motivated by the same underlying spatial inhomogeneous Poisson point process (IPP) model in which the intensity function is modelled as a log-linear function of covariates. 3. The Poisson likelihood used for count data is a discrete approximation of the IPP likelihood. Similarly, the presence–absence design will approximate the IPP likelihood, but only when spatial units (i.e. pixels) are extremely small (Electronic Journal of Statistics, 2010, 4, 1151–1201). For larger pixel sizes, presence–absence designs do not differentiate between one or multiple observations within each pixel, hence leading to information loss. 4. Logistic regression is often used to estimate the parameters of the IPP model using point data. Although the response variable is defined as 0 for the availability points, these zeros do not serve as true absences as is often assumed; rather, their role is to approximate the integral of the denominator in the IPP likelihood (The Annals of Applied Statistics, 2010, 4, 1383–1402). Because of this common misconception, the estimated exponential function of the linear predictor (i.e. the resource selection function) is often assumed to be proportional to occupancy. Like IPP and count models, this function is proportional to the expected density of observations. 5. Understanding these (dis-)similarities between different species distribution modelling techniques should improve biological interpretation of spatial models and therefore advance ecological and methodological cross-fertilization.
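
A base-R simulation sketch of the equivalences described above: with intensity log-linear in a covariate, a Poisson GLM on counts in small equal-area cells recovers the IPP coefficients, and a cloglog presence-absence model with an area offset approximates the same fit; the parameter values are illustrative.

```r
set.seed(1)
n   <- 10000            # fine grid of equal-area cells
x   <- runif(n)         # environmental covariate
A   <- 1 / n            # cell area
lam <- exp(6 + 2 * x)   # IPP intensity, log-linear in x

cnt <- rpois(n, lam * A)   # cell counts
coef(glm(cnt ~ x, family = poisson, offset = rep(log(A), n)))  # approx (6, 2)

z <- as.integer(cnt > 0)   # presence-absence per cell
coef(glm(z ~ x, family = binomial(link = "cloglog"),
         offset = rep(log(A), n)))                             # also approx (6, 2)
```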

Journal ArticleDOI
TL;DR: Models of trait macroevolution on trees (MOTMOT) is a new software package that tests for variation in the tempo and mode of continuous character evolution on phylogenetic trees.
Abstract: Summary 1. Models of trait macroevolution on trees (MOTMOT) is a new software package that tests for variation in the tempo and mode of continuous character evolution on phylogenetic trees. MOTMOT provides tools to fit a range of models of trait evolution with emphasis on variation in the rate of evolution between clades and character states. 2. We introduce a new method, trait MEDUSA, to identify the location of major changes in the rate of evolution of continuous traits on phylogenetic trees. We demonstrate trait MEDUSA and the other main functions of MOTMOT, using body size of Anolis lizards. 3. MOTMOT is open source software written in the R language and is freely available from CRAN (http://cran.r-project.org/web/packages/).
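
A sketch of a trait MEDUSA analysis, assuming motmot's transformPhylo.ML interface; `phy` (a phylo object) and `y` (a one-column matrix of trait values with rownames matching tip labels) are hypothetical, as are the tuning values.

```r
library(motmot)

# Search for the single best-supported shift in evolutionary rate ("tm1")
res <- transformPhylo.ML(y, phy = phy, model = "tm1",
                         minCladeSize = 5, nSplits = 1)
res
```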

Journal ArticleDOI
TL;DR: An overview of time-ordered networks is presented, which provide a framework for analysing network dynamics that addresses multiple inferential issues and permits novel types of temporally informed network analyses.
Abstract: Summary 1. Network analysis is widely used in diverse fields and can be a powerful framework for studying the structure of biological systems. Temporal dynamics are a key issue for many ecological and evolutionary questions. These dynamics include both changes in network topology and flow on the network. Network analyses that ignore or do not adequately account for the temporal dynamics can result in inappropriate inferences. 2. We suggest that existing methods are currently under-utilized in many ecological and evolutionary network analyses and that the broader incorporation of these methods will considerably advance the current field. Our goal is to introduce ecologists and evolutionary biologists interested in studying network dynamics to extant ideas and methodological approaches, at a level appropriate for those new to the field. 3. We present an overview of time-ordered networks, which provide a framework for analysing network dynamics that addresses multiple inferential issues and permits novel types of temporally informed network analyses. We review available methods and software, discuss the utility and considerations of different approaches, provide a worked example analysis and highlight new research opportunities in ecology and evolutionary biology.
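
A generic sketch of one time-ordered idea using igraph: slice a time-stamped edge list into windows and track how network structure changes. The edge list is simulated for illustration; this is not the specialised time-ordered software the paper reviews.

```r
library(igraph)

set.seed(1)
ev <- data.frame(from = sample(letters[1:10], 200, replace = TRUE),
                 to   = sample(letters[1:10], 200, replace = TRUE),
                 time = sort(runif(200, 0, 100)))

breaks <- seq(0, 100, by = 20)
dens <- sapply(seq_len(length(breaks) - 1), function(i) {
  e <- ev[ev$time >= breaks[i] & ev$time < breaks[i + 1], c("from", "to")]
  g <- graph_from_data_frame(e, directed = FALSE, vertices = letters[1:10])
  edge_density(simplify(g))   # network density within this time window
})
dens   # a static, time-aggregated network would hide this variation
```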

Journal ArticleDOI
TL;DR: This work presents a novel statistical approach to determine indicators of site groups using species data that allows indicators to be species combinations in addition to single species, and presents a simple algorithm that identifies the set of indicators that show high positive predictive value for the target site group.
Abstract: Summary 1. Indicator species are often determined using an analysis of the relationship between the species occurrence or abundance values from a set of sites and the classification of the same sites into site groups (habitat types, community types, disturbance states, etc.). It may happen, however, that a particular site group has no indicator species even if its sites have a community composition that is clearly distinct from the sites of other site groups. This motivates an exploration of the indicator value of not only individual species but also species combinations. 2. Here, we present a novel statistical approach to determine indicators of site groups using species data. Unlike traditional indicator value analysis, we allow indicators to be species combinations in addition to single species. We require that all the species forming the combination must occur in the site to use the combination as an indicator. We present a simple algorithm that identifies the set of indicators (each one being either a single species or a species combination) that show high positive predictive value for the target site group. Moreover, we demonstrate the use of the percentage of sites of the site group where at least one of its valid indicators occurs to determine whether the group can be reliably predicted throughout its range. 3. Using a simulation study, we show that if two species are not strongly correlated and their frequency in the data set is larger than the frequency of sites belonging to the site group, the joint occurrence of the two species has higher positive predictive value for the site group than the two species taken independently. 4. We illustrate the proposed method by determining which combinations of vascular plants can be used as indicators for 29 shrubland and forest vegetation types of New Zealand. 5. The proposed methodology extends traditional indicator value analyses and will be useful to develop multispecies ecological or environmental indicators. Further, it will allow newly surveyed sites to be reliably assigned to previously defined vegetation types.
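
A sketch assuming the authors' indicspecies package, where the indicators() function implements this analysis; `X` (a sites-by-species abundance matrix), `cl` (a vector of site-group memberships) and the threshold values are hypothetical.

```r
library(indicspecies)

res <- indicators(X, cluster = cl, group = "forest",
                  max.order = 3,        # allow combinations of up to 3 species
                  At = 0.8, Bt = 0.2)   # thresholds: predictive value, coverage
print(res)
```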

Journal ArticleDOI
TL;DR: An open-source application for the storage, pattern extraction and pattern matching of digital images for the purposes of mark–recapture analysis is created and applied to a population of Masai giraffe in the Tarangire Ecosystem in northern Tanzania.
Abstract: Summary 1. Photographic mark–recapture is a cost-effective, non-invasive way to study populations. However, to efficiently apply photographic mark–recapture to large populations, computer software is needed for image manipulation and pattern matching. 2. We created an open-source application for the storage, pattern extraction and pattern matching of digital images for the purposes of mark–recapture analysis. The resulting software package is a stand-alone, multiplatform application implemented in Java. Our program employs the Scale Invariant Feature Transform (SIFT) operator that extracts distinctive features invariant to image scale and rotation. 3. We applied this system to a population of Masai giraffe (Giraffa camelopardalis tippelskirchi) in the Tarangire Ecosystem in northern Tanzania. Over 1200 images were acquired in the field during three primary sampling periods between September 2008 and December 2009. The pattern information in these images was extracted and matched resulting in capture histories for over 600 unique individuals. 4. Estimated error rates of the matching system were low based on a subset of test images that were independently matched by eye. 5. Encounter histories were subsequently analysed with open population models to estimate apparent survival rates and population size. 6. This new open-access tool allowed photographic mark–recapture to be applied successfully to this relatively large population.

Journal ArticleDOI
TL;DR: RNCEP as discussed by the authors is a package of functions in the open-source R language to access, organize and visualise freely available atmospheric data from two long-term high-quality data sets with global coverage.
Abstract: Atmospheric conditions strongly influence ecological systems, and tools that simplify the access and processing of atmospheric data can greatly facilitate ecological research. We have developed RNCEP, a package of functions in the open-source R language, to access, organise and visualise freely available atmospheric data from two long-term high-quality data sets with global coverage. These functions retrieve data, via the Internet, for either a desired spatiotemporal extent or interpolated to a point in space and time. The package also contains functions to temporally aggregate data, producing user-defined variables, and to visualise these data on a map. By making access to atmospheric data and integration with biological data easier and more flexible, we hope to facilitate and encourage the exploration of relationships between biological systems and atmospheric conditions.
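
A sketch of an RNCEP query (it retrieves data over the Internet); the argument values are illustrative only.

```r
library(RNCEP)

# Gather 850-mb air temperature for May-June 2008-2009 over a small extent
tmp <- NCEP.gather(variable = "air", level = 850,
                   months.minmax = c(5, 6), years.minmax = c(2008, 2009),
                   lat.southnorth = c(50, 55), lon.westeast = c(0, 10))

# Interpolate the same variable to a single point in space and time
NCEP.interp(variable = "air", level = 850, lat = 52.5, lon = 5,
            dt = "2008-05-15 12:00:00")
```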

Journal ArticleDOI
Abstract: Summary 1. DNA barcoding studies use Kimura's two-parameter substitution model (K2P) as the de facto standard for constructing genetic distance matrices. Distances generated under this model then provide the basis for most downstream analyses, but uncertainty in model choice is rarely explored and could potentially affect how reliably DNA barcodes discriminate species. 2. Using information-theoretic approaches for a data set comprising 14 472 DNA barcodes from 14 published studies, we tested whether the K2P model was a good fit at the species level and whether applying a better fitting model biased error rates or changed overall identification success. 3. We report that the K2P was a poorly fitting model at the species level; it was never selected as the best model and very rarely selected as a credible alternative model. Despite the lack of support for the K2P model, differences in distance between best model and K2P model estimates were usually minimal, and importantly, identification success rates were largely unaffected by model choice even when interspecific threshold values were reassessed. 4. Although these conclusions may justify using the K2P model for specimen identification purposes, we found simpler metrics such as p distance performed equally well, perhaps obviating the requirement for model correction in DNA barcoding. Conversely, when incorporating genetic distance data into taxonomic studies, we advocate a more thorough examination of model uncertainty.
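
The comparison of K2P and uncorrected p distances can be reproduced in miniature with ape, using its bundled woodmouse cytochrome-b alignment.

```r
library(ape)

data(woodmouse)
d_k2p <- dist.dna(woodmouse, model = "K80")   # Kimura two-parameter
d_p   <- dist.dna(woodmouse, model = "raw")   # uncorrected p distance

# At shallow divergences the two are nearly identical, consistent with the
# finding that model choice matters little for identification success.
plot(as.vector(d_p), as.vector(d_k2p), xlab = "p distance", ylab = "K2P")
abline(0, 1)
```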

Journal ArticleDOI
TL;DR: In this article, the authors demonstrate a fast and easy way of generating standardised DNA templates, which are then used to balance the amplification success for the different targets and to determine the sensitivity of each primer pair in the multiplex PCR.
Abstract: 1. Multiplex PCR is a valuable tool in many biological studies but it is a multifaceted procedure that has to be planned and optimised thoroughly to achieve robust and meaningful results. In particular, primer concentrations have to be adjusted to assure an even amplification of all targeted DNA fragments. Until now, total DNA extracts were used for balancing primer efficiencies; however, the applicability for comparisons between taxa or different multiple-copy genes was limited owing to the unknown number of template molecules present per total DNA. 2. Based on a multiplex system developed to track trophic interactions in high Alpine arthropods, we demonstrate a fast and easy way of generating standardised DNA templates. These were then used to balance the amplification success for the different targets and to subsequently determine the sensitivity of each primer pair in the multiplex PCR. 3. In the current multiplex assay, this approach led to an even amplification success for all seven targeted DNA fragments. Using this balanced multiplex PCR, methodological bias owing to variation in primer efficiency will be avoided when analysing field-derived samples. 4. The approach outlined here allows comparing multiplex PCR sensitivity, independent of the investigated species, genome size or the targeted genes. The application of standardised DNA templates not only makes it possible to optimise primer efficiency within a given multiplex PCR, but it also offers to adjust and/or to compare the sensitivity between different assays. Along with other factors that influence the success of multiplex reactions, and which we discuss here in relation to the presented detection system, the adoption of this approach will allow for direct comparison of multiplex PCR data between systems and studies, enhancing the utility of this assay type.