Author

Jana M. McPherson

Bio: Jana M. McPherson is an academic researcher from Simon Fraser University. The author has contributed to research on the topics Habitat and Population, has an h-index of 18, and has co-authored 26 publications receiving 4,582 citations. Previous affiliations of Jana M. McPherson include the University of Oxford and Dalhousie University.

Papers
Journal Article
TL;DR: In this paper, the authors describe six different statistical approaches to infer correlates of species distributions, for both presence/absence (binary response) and species abundance data (Poisson or normally distributed response), while accounting for spatial autocorrelation in model residuals: autocovariate regression; spatial eigenvector mapping; generalised least squares; (conditional and simultaneous) autoregressive models and generalised estimating equations.
Abstract: Species distributional or trait data based on range map (extent-of-occurrence) or atlas survey data often display spatial autocorrelation, i.e. locations close to each other exhibit more similar values than those further apart. If this pattern remains present in the residuals of a statistical model based on such data, one of the key assumptions of standard statistical analyses, that residuals are independent and identically distributed (i.i.d.), is violated. The violation of the assumption of i.i.d. residuals may bias parameter estimates and can increase type I error rates (falsely rejecting the null hypothesis of no effect). While this is increasingly recognised by researchers analysing species distribution data, there is, to our knowledge, no comprehensive overview of the many available spatial statistical methods to take spatial autocorrelation into account in tests of statistical significance. Here, we describe six different statistical approaches to infer correlates of species’ distributions, for both presence/absence (binary response) and species abundance data (Poisson or normally distributed response), while accounting for spatial autocorrelation in model residuals: autocovariate regression; spatial eigenvector mapping; generalised least squares; (conditional and simultaneous) autoregressive models and generalised estimating equations. A comprehensive comparison of the relative merits of these methods is beyond the scope of this paper. To demonstrate each method’s implementation, however, we undertook preliminary tests based on simulated data. These preliminary tests verified that most of the spatial modeling techniques we examined showed good type I error control and precise parameter estimates, at least when confronted with simplistic simulated data containing
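
A brief illustration of what "spatial autocorrelation in model residuals" means in practice may help here. The sketch below computes Moran's I, a commonly used diagnostic that is not one of the six modelling approaches named in the abstract. The 4 x 4 grid, the rook (shared-edge) neighbour definition, and the function names `rook_neighbours` and `morans_i` are assumptions made for this example only.

```python
from typing import List, Sequence, Tuple


def rook_neighbours(n_rows: int, n_cols: int) -> List[Tuple[int, int]]:
    """Return ordered index pairs (i, j) of grid cells that share an edge."""
    pairs = []
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    pairs.append((i, rr * n_cols + cc))
    return pairs


def morans_i(values: Sequence[float], pairs: Sequence[Tuple[int, int]]) -> float:
    """Moran's I with binary weights: 1 for each neighbour pair, 0 otherwise."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    cross = sum(dev[i] * dev[j] for i, j in pairs)  # neighbour cross-products
    total = sum(d * d for d in dev)                 # overall variation
    return n * cross / (len(pairs) * total)


pairs = rook_neighbours(4, 4)

# Hypothetical residuals that follow a smooth west-to-east gradient:
# neighbouring cells carry similar values, so I is positive.
gradient = [float(c) for r in range(4) for c in range(4)]

# Hypothetical residuals that alternate like a checkerboard:
# neighbouring cells always differ, so I is strongly negative.
checker = [1.0 if (r + c) % 2 == 0 else -1.0 for r in range(4) for c in range(4)]

i_gradient = morans_i(gradient, pairs)
i_checker = morans_i(checker, pairs)
print(f"gradient: {i_gradient:.3f}, checkerboard: {i_checker:.3f}")
```

Values of I near zero are what one would hope to see in the residuals of a model that has accounted for spatial structure; clearly positive values indicate that nearby residuals remain similar.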

2,820 citations

Journal Article
TL;DR: In this article, the authors examined the influence of range size on the sample size and sampling prevalence of data used to train and test distribution models for 32 bird species endemic to South Africa, Lesotho and Swaziland.
Abstract: Summary. 1. Conservation scientists and resource managers increasingly employ empirical distribution models to aid decision-making. However, such models are not equally reliable for all species, and range size can affect their performance. We examined to what extent this effect reflects statistical artefacts arising from the influence of range size on the sample size and sampling prevalence (proportion of samples representing species presence) of data used to train and test models. 2. Our analyses used both simulated data and empirical distribution models for 32 bird species endemic to South Africa, Lesotho and Swaziland. Models were built with either logistic regression or non-linear discriminant analysis, and assessed with four measures of model accuracy: sensitivity, specificity, Cohen's kappa and the area under the curve (AUC) of receiver-operating characteristic (ROC) plots. Environmental indices derived from Fourier-processed satellite imagery served as predictors. 3. We first followed conventional modelling practice to illustrate how range size might influence model performance, when sampling prevalence reflects species’ natural prevalences. We then demonstrated that this influence is primarily artefactual. Statistical artefacts can arise during model assessment, because Cohen's kappa responds systematically to changes in prevalence. AUC, in contrast, is largely unaffected, and thus a more reliable measure of model performance. Statistical artefacts also arise during model fitting. Both logistic regression and discriminant analysis are sensitive to the sample size and sampling prevalence of training data. Both perform best when sample size is large and prevalence intermediate. 4. Synthesis and applications. Species’ ecological characteristics may influence the performance of distribution models. Statistical artefacts, however, can confound results in comparative studies seeking to identify these characteristics. To mitigate artefactual effects, we recommend careful reporting of sampling prevalence, AUC as the measure of accuracy, and fixed, intermediate levels of sampling prevalence in comparative studies.
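
The abstract's point that AUC is largely unaffected by sampling prevalence can be illustrated with a small sketch. It treats AUC as the probability that a randomly chosen presence receives a higher model score than a randomly chosen absence (the rank-based reading of the ROC area). The scores and the function name `auc` are invented for illustration, and replicating the absence records is an idealised way of lowering prevalence without changing the score distributions; real data would not behave this cleanly.

```python
from typing import Sequence


def auc(presence_scores: Sequence[float], absence_scores: Sequence[float]) -> float:
    """Fraction of (presence, absence) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p in presence_scores:
        for a in absence_scores:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(presence_scores) * len(absence_scores))


# Hypothetical model scores for sites where the species is present or absent.
presence = [0.9, 0.8, 0.6, 0.4]
absence = [0.7, 0.5, 0.3, 0.2]

# Sampling prevalence 4 / 8 = 0.5.
auc_balanced = auc(presence, absence)

# Same score distributions, five times as many absences:
# sampling prevalence drops to 4 / 24, roughly 0.17.
auc_low_prevalence = auc(presence, absence * 5)

print(auc_balanced, auc_low_prevalence)
```

Because the statistic only compares presences with absences pairwise, changing how many of each are sampled leaves it unchanged in this idealised setting.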

521 citations

Journal Article
TL;DR: A conceptual and cyber-infrastructure framework for refining species distributional knowledge that is novel in its ability to mobilize and integrate diverse types of data such that their collective strengths overcome individual weaknesses is proposed.
Abstract: Global knowledge about the spatial distribution of species is orders of magnitude coarser in resolution than other geographically-structured environmental datasets such as topography or land cover. Yet such knowledge is crucial in deciphering ecological and evolutionary processes and in managing global change. In this review, we propose a conceptual and cyber-infrastructure framework for refining species distributional knowledge that is novel in its ability to mobilize and integrate diverse types of data such that their collective strengths overcome individual weaknesses. The ultimate aim is a public, online, quality-vetted 'Map of Life' that for every species integrates and visualizes available distributional knowledge, while also facilitating user feedback and dynamic biodiversity analyses. First milestones toward such an infrastructure have now been implemented.

453 citations

Journal Article
TL;DR: None of the ecological traits tested provides an obvious correlate for environmental niche breadth or intra-specific niche differentiation, and these analyses provide conservation scientists and resource managers with a rule of thumb that helps distinguish between species whose occurrence is reliably or less reliably predicted by distribution models.
Abstract: In the face of accelerating biodiversity loss and limited data, species distribution models - which statistically capture and predict species' occurrences based on environmental correlates - are increasingly used to inform conservation strategies. Additionally, distribution models and their fit provide insights on the broad-scale environmental niche of species. To investigate whether the performance of such models varies with species' ecological characteristics, we examined distribution models for 1329 bird species in southern and eastern Africa. The models were constructed at two spatial resolutions with both logistic and autologistic regression. Satellite-derived environmental indices served as predictors, and model accuracy was assessed with three metrics: sensitivity, specificity and the area under the curve (AUC) of receiver operating characteristics plots. We then determined the relationship between each measure of accuracy and ten ecological species characteristics using generalised linear models. Among the ecological traits tested, species' range size, migratory status, affinity for wetlands and endemism proved most influential on the performance of distribution models. The number of habitat types frequented (habitat tolerance), trophic rank, body mass, preferred habitat structure and association with sub-resolution habitats also showed some effect. In contrast, conservation status made no significant impact. These findings did not differ from one spatial resolution to the next. Our analyses thus provide conservation scientists and resource managers with a rule of thumb that helps distinguish, on the basis of ecological traits, between species whose occurrence is reliably or less reliably predicted by distribution models. Reasonably accurate distribution models should, however, be attainable for most species, because the influence ecological traits bore on model performance was only limited. These results suggest that none of the ecological traits tested provides an obvious correlate for environmental niche breadth or intra-specific niche differentiation.

317 citations

Journal Article
30 Dec 2013 - PLOS ONE
TL;DR: The quantitative delineation of biogeographical entities for reef fishes shows a global concordance with recent works based upon endemism, environmental factors, expert knowledge, or their combination and the similarity between the results and those from other phyla suggests that the approach may be of broad utility in describing and understanding global marine biodiversity patterns.
Abstract: Delineating regions is an important first step in understanding the evolution and biogeography of faunas. However, quantitative approaches are often limited at a global scale, particularly in the marine realm. Reef fishes are the most diversified group of marine fishes, and compared to most other phyla, their taxonomy and geographical distributions are relatively well known. Based on 169 checklists spread across all tropical oceans, the present work aims to quantitatively delineate biogeographical entities for reef fishes at a global scale. Four different classifications were used to account for uncertainty related to species identification and the quality of checklists. The four classifications delivered converging results, with biogeographical entities that can be hierarchically delineated into realms, regions and provinces. All classifications indicated that the Indo-Pacific has a weak internal structure, with a high similarity from east to west. In contrast, the Atlantic and the Eastern Tropical Pacific were more strongly structured, which may be related to the higher levels of endemism in these two realms. The “Coral Triangle”, an area of the Indo-Pacific which contains the highest species diversity for reef fishes, was not clearly delineated by its species composition. Our results show a global concordance with recent works based upon endemism, environmental factors, expert knowledge, or their combination. Our quantitative delineation of biogeographical entities, however, tests the robustness of the results and yields easily replicated patterns. The similarity between our results and those from other phyla, such as corals, suggests that our approach may be of broad utility in describing and understanding global marine biodiversity patterns.

183 citations


Cited by
Journal Article

6,278 citations

01 Jan 2016
Modern Applied Statistics with S

5,249 citations

Journal Article
TL;DR: Species distribution models (SDMs) are numerical tools that combine observations of species occurrence or abundance with environmental estimates. They are used to gain ecological and evolutionary insights and to predict distributions across landscapes, sometimes requiring extrapolation in space and time.
Abstract: Species distribution models (SDMs) are numerical tools that combine observations of species occurrence or abundance with environmental estimates. They are used to gain ecological and evolutionary insights and to predict distributions across landscapes, sometimes requiring extrapolation in space and time. SDMs are now widely used across terrestrial, freshwater, and marine realms. Differences in methods between disciplines reflect both differences in species mobility and in “established use.” Model realism and robustness is influenced by selection of relevant predictors and modeling method, consideration of scale, how the interplay between environmental and geographic factors is handled, and the extent of extrapolation. Current linkages between SDM practice and ecological theory are often weak, hindering progress. Remaining challenges include: improvement of methods for modeling presence-only data and for model selection and evaluation; accounting for biotic interactions; and assessing model uncertainty.

5,076 citations

Journal Article
TL;DR: In this article, the authors provide a theoretical explanation for the observed dependence of kappa on prevalence, and introduce an alternative measure of accuracy, the true skill statistic (TSS), which corrects for this dependence while still keeping all the advantages of Kappa.
Abstract: Summary. 1. In recent years the use of species distribution models by ecologists and conservation managers has increased considerably, along with an awareness of the need to provide accuracy assessment for predictions of such models. The kappa statistic is the most widely used measure for the performance of models generating presence–absence predictions, but several studies have criticized it for being inherently dependent on prevalence, and argued that this dependency introduces statistical artefacts to estimates of predictive accuracy. This criticism has been supported recently by computer simulations showing that kappa responds to the prevalence of the modelled species in a unimodal fashion. 2. In this paper we provide a theoretical explanation for the observed dependence of kappa on prevalence, and introduce into ecology an alternative measure of accuracy, the true skill statistic (TSS), which corrects for this dependence while still keeping all the advantages of kappa. We also compare the responses of kappa and TSS to prevalence using empirical data, by modelling distribution patterns of 128 species of woody plant in Israel. 3. The theoretical analysis shows that kappa responds in a unimodal fashion to variation in prevalence and that the level of prevalence that maximizes kappa depends on the ratio between sensitivity (the proportion of correctly predicted presences) and specificity (the proportion of correctly predicted absences). In contrast, TSS is independent of prevalence. 4. When the two measures of accuracy were compared using empirical data, kappa showed a unimodal response to prevalence, in agreement with the theoretical analysis. TSS showed a decreasing linear response to prevalence, a result we interpret as reflecting true ecological phenomena rather than a statistical artefact. This interpretation is supported by the fact that a similar pattern was found for the area under the ROC curve, a measure known to be independent of prevalence. 5. Synthesis and applications. Our results provide theoretical and empirical evidence that kappa, one of the most widely used measures of model performance in ecology, has serious limitations that make it unsuitable for such applications. The alternative we suggest, TSS, compensates for the shortcomings of kappa while keeping all of its advantages. We therefore recommend the TSS as a simple and intuitive measure for the performance of species distribution models when predictions are expressed as presence–absence maps.
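
The contrast between kappa and TSS described in this abstract can be reproduced with a short worked example. The sketch below builds confusion matrices for a hypothetical model whose sensitivity and specificity are both fixed at 0.8, then varies prevalence. It uses the standard definitions of Cohen's kappa and of TSS (sensitivity + specificity - 1); the sample size, the chosen prevalences, and the function names are assumptions made for illustration and are not taken from the paper.

```python
from typing import Dict, Tuple


def confusion_counts(n: float, prevalence: float, sensitivity: float,
                     specificity: float) -> Tuple[float, float, float, float]:
    """Expected (tp, fp, fn, tn) for a classifier with fixed error rates."""
    presences = n * prevalence
    absences = n - presences
    tp = presences * sensitivity
    fn = presences - tp
    tn = absences * specificity
    fp = absences - tn
    return tp, fp, fn, tn


def kappa(tp: float, fp: float, fn: float, tn: float) -> float:
    """Cohen's kappa: agreement beyond what chance alone would produce."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return (observed - expected) / (1.0 - expected)


def tss(tp: float, fp: float, fn: float, tn: float) -> float:
    """True skill statistic: sensitivity + specificity - 1."""
    return tp / (tp + fn) + tn / (tn + fp) - 1.0


# Hold model skill constant and change only the prevalence.
results: Dict[float, Tuple[float, float]] = {}
for prevalence in (0.1, 0.5, 0.9):
    counts = confusion_counts(1000, prevalence, sensitivity=0.8, specificity=0.8)
    results[prevalence] = (kappa(*counts), tss(*counts))

for prevalence, (k, t) in results.items():
    print(f"prevalence={prevalence:.1f}  kappa={k:.3f}  TSS={t:.3f}")
```

With error rates held constant, kappa peaks at intermediate prevalence and falls toward both extremes, while TSS stays at 0.6 throughout, which matches the unimodal response and the prevalence independence the abstract describes.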

3,518 citations
