Journal Article

Predicting species distribution: offering more than simple habitat models.

01 Sep 2005-Ecology Letters (Wiley/Blackwell (10.1111))-Vol. 8, Iss: 9, pp 993-1009
TL;DR: An overview of recent advances in species distribution models, and new avenues for incorporating species migration, population dynamics, biotic interactions and community ecology into SDMs at multiple spatial scales are suggested.
Abstract: In the last two decades, interest in species distribution models (SDMs) of plants and animals has grown dramatically. Recent advances in SDMs allow us to potentially forecast anthropogenic effects on patterns of biodiversity at different spatial scales. However, some limitations still preclude the use of SDMs in many theoretical and practical applications. Here, we provide an overview of recent advances in this field, discuss the ecological principles and assumptions underpinning SDMs, and highlight critical limitations and decisions inherent in the construction and evaluation of SDMs. Particular emphasis is given to the use of SDMs for the assessment of climate change impacts and conservation management issues. We suggest new avenues for incorporating species migration, population dynamics, biotic interactions and community ecology into SDMs at multiple spatial scales. Addressing all these issues requires a better integration of SDMs with ecological theory.
Citations
Journal Article
TL;DR: This work compared 16 modelling methods over 226 species from 6 regions of the world, creating the most comprehensive set of model comparisons to date and found that presence-only data were effective for modelling species' distributions for many species and regions.
Abstract: Prediction of species' distributions is central to diverse applications in ecology, evolution and conservation science. There is increasing electronic access to vast sets of occurrence records in museums and herbaria, yet little effective guidance on how best to use this information in the context of numerous approaches for modelling distributions. To meet this need, we compared 16 modelling methods over 226 species from 6 regions of the world, creating the most comprehensive set of model comparisons to date. We used presence-only data to fit models, and independent presence-absence data to evaluate the predictions. Along with well-established modelling methods such as generalised additive models and GARP and BIOCLIM, we explored methods that either have been developed recently or have rarely been applied to modelling species' distributions. These include machine-learning methods and community models, both of which have features that may make them particularly well suited to noisy or sparse information, as is typical of species' occurrence data. Presence-only data were effective for modelling species' distributions for many species and regions. The novel methods consistently outperformed more established methods. The results of our analysis are promising for the use of data from museums and herbaria, especially as methods suited to the noise inherent in such data improve.

7,589 citations


Cites background from "Predicting species distribution: of..."

  • ...Finally, we stress that modelling can never provide a complete substitute for detailed, ongoing collection of field data, including data on species’ distribution, demography, abundance, and interactions (Guisan and Thuiller 2005)....

    [...]

Journal Article
TL;DR: Species distribution models (SDMs) are numerical tools that combine observations of species occurrence or abundance with environmental estimates. They are used to gain ecological and evolutionary insights and to predict distributions across landscapes, sometimes requiring extrapolation in space and time.
Abstract: Species distribution models (SDMs) are numerical tools that combine observations of species occurrence or abundance with environmental estimates. They are used to gain ecological and evolutionary insights and to predict distributions across landscapes, sometimes requiring extrapolation in space and time. SDMs are now widely used across terrestrial, freshwater, and marine realms. Differences in methods between disciplines reflect both differences in species mobility and in “established use.” Model realism and robustness is influenced by selection of relevant predictors and modeling method, consideration of scale, how the interplay between environmental and geographic factors is handled, and the extent of extrapolation. Current linkages between SDM practice and ecological theory are often weak, hindering progress. Remaining challenges include: improvement of methods for modeling presence-only data and for model selection and evaluation; accounting for biotic interactions; and assessing model uncertainty.

5,076 citations


Cites background from "Predicting species distribution: of..."

  • ...Biotic Interactions: Very few SDM studies explicitly include predictors describing biological interactions (Guisan & Thuiller 2005)....

    [...]

  • ...Reviews of SDM literature include those of Guisan & Zimmermann (2000), Stauffer (2002), Guisan & Thuiller (2005), Richards et al. (2007), and Schröder (2008)....

    [...]

  • ...This typifies the difficulty in making inferences about the relative importance of jointly fitted abiotic and biotic predictors (Guisan & Thuiller 2005), because in most data sets environmental effects are confounded with those of competitors and mutualists....

    [...]

  • ...Typical applications include global analyses of species distributions, mapping within a region for conservation planning or resource management, and identifying suitable habitat for rare species (Guisan & Thuiller 2005)....

    [...]

Journal Article
TL;DR: In this article, the authors provide a theoretical explanation for the observed dependence of kappa on prevalence, and introduce an alternative measure of accuracy, the true skill statistic (TSS), which corrects for this dependence while still keeping all the advantages of kappa.
Abstract: Summary
1. In recent years the use of species distribution models by ecologists and conservation managers has increased considerably, along with an awareness of the need to provide accuracy assessment for predictions of such models. The kappa statistic is the most widely used measure for the performance of models generating presence–absence predictions, but several studies have criticized it for being inherently dependent on prevalence, and argued that this dependency introduces statistical artefacts to estimates of predictive accuracy. This criticism has been supported recently by computer simulations showing that kappa responds to the prevalence of the modelled species in a unimodal fashion.
2. In this paper we provide a theoretical explanation for the observed dependence of kappa on prevalence, and introduce into ecology an alternative measure of accuracy, the true skill statistic (TSS), which corrects for this dependence while still keeping all the advantages of kappa. We also compare the responses of kappa and TSS to prevalence using empirical data, by modelling distribution patterns of 128 species of woody plant in Israel.
3. The theoretical analysis shows that kappa responds in a unimodal fashion to variation in prevalence and that the level of prevalence that maximizes kappa depends on the ratio between sensitivity (the proportion of correctly predicted presences) and specificity (the proportion of correctly predicted absences). In contrast, TSS is independent of prevalence.
4. When the two measures of accuracy were compared using empirical data, kappa showed a unimodal response to prevalence, in agreement with the theoretical analysis. TSS showed a decreasing linear response to prevalence, a result we interpret as reflecting true ecological phenomena rather than a statistical artefact. This interpretation is supported by the fact that a similar pattern was found for the area under the ROC curve, a measure known to be independent of prevalence.
5. Synthesis and applications. Our results provide theoretical and empirical evidence that kappa, one of the most widely used measures of model performance in ecology, has serious limitations that make it unsuitable for such applications. The alternative we suggest, TSS, compensates for the shortcomings of kappa while keeping all of its advantages. We therefore recommend the TSS as a simple and intuitive measure for the performance of species distribution models when predictions are expressed as presence–absence maps.
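
The contrast between kappa and TSS described in this abstract can be illustrated with a short sketch. It assumes the standard definitions (Cohen's kappa as chance-corrected agreement, TSS as sensitivity + specificity − 1) and a hypothetical model whose sensitivity and specificity are both fixed at 0.8; the function names and the numbers are illustrative and are not taken from the paper.

```python
def confusion_from_rates(n, prevalence, sensitivity, specificity):
    """Expected confusion-matrix cells for n sites (hypothetical helper).

    Returns (a, b, c, d) = (true pos., false pos., false neg., true neg.).
    """
    presences = n * prevalence
    absences = n - presences
    a = sensitivity * presences   # presences predicted present
    c = presences - a             # presences predicted absent
    d = specificity * absences    # absences predicted absent
    b = absences - d              # absences predicted present
    return a, b, c, d


def tss(a, b, c, d):
    """True skill statistic: sensitivity + specificity - 1."""
    return a / (a + c) + d / (b + d) - 1.0


def kappa(a, b, c, d):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = a + b + c + d
    observed = (a + d) / n
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / (n * n)
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Same classifier quality at every prevalence; only prevalence changes.
    for prevalence in (0.05, 0.25, 0.50, 0.75, 0.95):
        cells = confusion_from_rates(1000, prevalence, 0.8, 0.8)
        print(f"prevalence={prevalence:.2f}  "
              f"kappa={kappa(*cells):.3f}  TSS={tss(*cells):.3f}")
```

With sensitivity and specificity held constant, TSS stays the same at every prevalence, while kappa is highest at intermediate prevalence and falls toward the extremes, which is the unimodal behaviour the abstract refers to.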

3,518 citations

Journal Article
01 Nov 2007-Ecology
TL;DR: When comparing RF to other common classification methods, high classification accuracy was observed in all applications, as measured by cross-validation and, in the case of the lichen data, by independent test data.
Abstract: Classification procedures are some of the most widely used statistical methods in ecology. Random forests (RF) is a new and powerful statistical classifier that is well established in other disciplines but is relatively unknown in ecology. Advantages of RF compared to other statistical classifiers include (1) very high classification accuracy; (2) a novel method of determining variable importance; (3) ability to model complex interactions among predictor variables; (4) flexibility to perform several types of statistical data analysis, including regression, classification, survival analysis, and unsupervised learning; and (5) an algorithm for imputing missing values. We compared the accuracies of RF and four other commonly used statistical classifiers using data on invasive plant species presence in Lava Beds National Monument, California, USA, rare lichen species presence in the Pacific Northwest, USA, and nest sites for cavity nesting birds in the Uinta Mountains, Utah, USA. We observed high classification accuracy in all applications as measured by cross-validation and, in the case of the lichen data, by independent test data, when comparing RF to other common classification methods. We also observed that the variables that RF identified as most important for classifying invasive plant species coincided with expectations based on the literature.

3,368 citations


Cites background or methods from "Predicting species distribution: of..."

  • ...Key words: additive logistic regression; classification trees; LDA; logistic regression; machine learning; partial dependence plots; random forests; species distribution models....

    [...]

  • ...Classification procedures are among the most widely used statistical methods in ecology, with applications including vegetation mapping by remote sensing (Steele 2000) and species distribution modeling (Guisan and Thuiller 2005)....

    [...]

Journal Article
TL;DR: In this paper, the authors describe six different statistical approaches to infer correlates of species distributions, for both presence/absence (binary response) and species abundance data (Poisson or normally distributed response), while accounting for spatial autocorrelation in model residuals: autocovariate regression; spatial eigenvector mapping; generalised least squares; (conditional and simultaneous) autoregressive models and generalised estimating equations.
Abstract: Species distributional or trait data based on range map (extent-of-occurrence) or atlas survey data often display spatial autocorrelation, i.e. locations close to each other exhibit more similar values than those further apart. If this pattern remains present in the residuals of a statistical model based on such data, one of the key assumptions of standard statistical analyses, that residuals are independent and identically distributed (i.i.d.), is violated. The violation of the assumption of i.i.d. residuals may bias parameter estimates and can increase type I error rates (falsely rejecting the null hypothesis of no effect). While this is increasingly recognised by researchers analysing species distribution data, there is, to our knowledge, no comprehensive overview of the many available spatial statistical methods to take spatial autocorrelation into account in tests of statistical significance. Here, we describe six different statistical approaches to infer correlates of species’ distributions, for both presence/absence (binary response) and species abundance data (Poisson or normally distributed response), while accounting for spatial autocorrelation in model residuals: autocovariate regression; spatial eigenvector mapping; generalised least squares; (conditional and simultaneous) autoregressive models and generalised estimating equations. A comprehensive comparison of the relative merits of these methods is beyond the scope of this paper. To demonstrate each method’s implementation, however, we undertook preliminary tests based on simulated data. These preliminary tests verified that most of the spatial modeling techniques we examined showed good type I error control and precise parameter estimates, at least when confronted with simplistic simulated data containing
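
As a hedged illustration of what "spatial autocorrelation in model residuals" means in practice, the sketch below computes Moran's I, a common diagnostic. Moran's I is not one of the six methods listed in the abstract; it is used here only as an assumed way to check whether nearby residuals are more similar than distant ones. The binary distance-threshold weights, the function name and the 4×4 grid data are all illustrative choices.

```python
def morans_i(values, coords, max_distance):
    """Moran's I with binary weights (assumed weighting scheme).

    w_ij = 1 when sites i and j are distinct and at most max_distance apart,
    otherwise 0. Values near +1 indicate positive autocorrelation, values
    near -1 indicate negative autocorrelation.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    cross = 0.0       # sum over neighbouring pairs of dev_i * dev_j
    weight_sum = 0.0  # total weight W
    limit_sq = max_distance * max_distance
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = coords[i][0] - coords[j][0]
            dy = coords[i][1] - coords[j][1]
            if dx * dx + dy * dy <= limit_sq:
                cross += dev[i] * dev[j]
                weight_sum += 1.0
    variance_sum = sum(d * d for d in dev)
    return (n / weight_sum) * (cross / variance_sum)


if __name__ == "__main__":
    # Hypothetical residuals on a 4 x 4 grid with unit spacing.
    grid = [(x, y) for y in range(4) for x in range(4)]
    gradient = [float(x) for x, _ in grid]             # smooth trend
    checkerboard = [float((x + y) % 2) for x, y in grid]  # alternating
    print(morans_i(gradient, grid, 1.0))
    print(morans_i(checkerboard, grid, 1.0))
```

A clearly positive value for model residuals would suggest that the i.i.d. assumption discussed in the abstract is violated and that one of the spatial methods it lists may be worth considering.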

2,820 citations


Cites background from "Predicting species distribution: of..."

  • ...This phenomenon has been noted before (McCullagh and Nelder 1989), and remains relevant for species distribution models, where the majority of studies are based on the analysis of presence-absence data (Guisan and Zimmermann 2000, Guisan and Thuiller 2005)....

    [...]

References
Journal Article
TL;DR: In this paper, the use of the maximum entropy method (Maxent) for modeling species geographic distributions with presence-only data was introduced, which is a general-purpose machine learning method with a simple and precise mathematical formulation.

13,120 citations

Journal Article
08 Jan 2004-Nature
TL;DR: Estimates of extinction risks for sample regions that cover some 20% of the Earth's terrestrial surface show the importance of rapid implementation of technologies to decrease greenhouse gas emissions and strategies for carbon sequestration.
Abstract: Climate change over the past approximately 30 years has produced numerous shifts in the distributions and abundances of species and has been implicated in one species-level extinction. Using projections of species' distributions for future climate scenarios, we assess extinction risks for sample regions that cover some 20% of the Earth's terrestrial surface. Exploring three approaches in which the estimated probability of extinction shows a power-law relationship with geographical range size, we predict, on the basis of mid-range climate-warming scenarios for 2050, that 15-37% of species in our sample of regions and taxa will be 'committed to extinction'. When the average of the three methods and two dispersal scenarios is taken, minimal climate-warming scenarios produce lower projections of species committed to extinction (approximately 18%) than mid-range (approximately 24%) and maximum-change (approximately 35%) scenarios. These estimates show the importance of rapid implementation of technologies to decrease greenhouse gas emissions and strategies for carbon sequestration.

7,089 citations


"Predicting species distribution: of..." refers background in this paper

  • ...The application of SDMs to climate change analyses was highlighted by a recent, massive study assessing global species extinction risk (Thomas et al. 2004)....

    [...]

  • ...Second, in most projections, species dispersal is inappropriately taken into consideration, relying on either a no-dispersal scenario, an unlimited-dispersal scenario, or both (e.g. Thomas et al. 2004; Thuiller 2004)....

    [...]

Journal Article
TL;DR: A review of predictive habitat distribution modeling is presented, which shows that a wide array of models has been developed to cover aspects as diverse as biogeography, conservation biology, climate change research, and habitat or species management.

6,748 citations


"Predicting species distribution: of..." refers background or methods in this paper

  • ...Environmental predictors can exert direct or indirect effects on species, arranged along a gradient from proximal to distal predictors (Austin 2002), and are optimally chosen to reflect the three main types of influences on the species (modified from Guisan & Zimmermann 2000; Huston 2002; Fig....

    [...]

  • ...Species distribution models are empirical models relating field observations to environmental predictor variables, based on statistically or theoretically derived response surfaces (Guisan & Zimmermann 2000)....

    [...]

  • ...A striking characteristic of SDMs is their reliance on the niche concept (Guisan & Zimmermann 2000)....

    [...]

  • ...For more details on the different steps of SDM building, we refer readers to Guisan & Zimmermann (2000)....

    [...]

  • ...The procedure of SDM building ideally follows six steps (modified from Guisan & Zimmermann 2000; see Table 2): (i) conceptualization, (ii) data preparation, (iii) model fitting, (iv) model evaluation, (v) spatial predictions, and (vi) assessment of model applicability....

    [...]

Journal Article
TL;DR: This work shows that this seemingly mysterious phenomenon of boosting can be understood in terms of well-known statistical principles, namely additive modeling and maximum likelihood, and develops more direct approximations and shows that they exhibit nearly identical results to boosting.
Abstract: Boosting is one of the most important recent developments in classification methodology. Boosting works by sequentially applying a classification algorithm to reweighted versions of the training data and then taking a weighted majority vote of the sequence of classifiers thus produced. For many classification algorithms, this simple strategy results in dramatic improvements in performance. We show that this seemingly mysterious phenomenon can be understood in terms of well-known statistical principles, namely additive modeling and maximum likelihood. For the two-class problem, boosting can be viewed as an approximation to additive modeling on the logistic scale using maximum Bernoulli likelihood as a criterion. We develop more direct approximations and show that they exhibit nearly identical results to boosting. Direct multiclass generalizations based on multinomial likelihood are derived that exhibit performance comparable to other recently proposed multiclass generalizations of boosting in most situations, and far superior in some. We suggest a minor modification to boosting that can reduce computation, often by factors of 10 to 50. Finally, we apply these insights to produce an alternative formulation of boosting decision trees. This approach, based on best-first truncated tree induction, often leads to better performance, and can provide interpretable descriptions of the aggregate decision rule. It is also much faster computationally, making it more suitable to large-scale data mining applications.
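
The reweight-and-vote procedure described in the abstract's second sentence can be sketched as follows. This is a minimal discrete AdaBoost with one-dimensional decision stumps, chosen here for illustration; it is not the additive logistic regression variant that the paper develops, and the function names and the ten-point toy data set are assumptions.

```python
import math


def stump_predict(x, threshold, polarity):
    """Decision stump: returns polarity when x > threshold, else -polarity."""
    return polarity if x > threshold else -polarity


def fit_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the best stump."""
    points = sorted(set(xs))
    thresholds = [points[0] - 1.0]
    thresholds += [(lo + hi) / 2.0 for lo, hi in zip(points, points[1:])]
    best = None
    for threshold in thresholds:
        for polarity in (1, -1):
            error = sum(w for x, y, w in zip(xs, ys, weights)
                        if stump_predict(x, threshold, polarity) != y)
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best


def adaboost(xs, ys, rounds):
    """Fit stumps sequentially on reweighted data; labels must be +1 or -1."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        error, threshold, polarity = fit_stump(xs, ys, weights)
        error = min(max(error, 1e-12), 1.0 - 1e-12)  # avoid log(0)
        alpha = 0.5 * math.log((1.0 - error) / error)
        ensemble.append((alpha, threshold, polarity))
        # Increase the weight of misclassified points, decrease the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Weighted majority vote of all stumps in the ensemble."""
    score = sum(alpha * stump_predict(x, threshold, polarity)
                for alpha, threshold, polarity in ensemble)
    return 1 if score >= 0 else -1


if __name__ == "__main__":
    xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]  # not separable by one stump
    for rounds in (1, 3):
        model = adaboost(xs, ys, rounds)
        errors = sum(predict(model, x) != y for x, y in zip(xs, ys))
        print(f"rounds={rounds}  training errors={errors}")
```

On this toy data no single stump classifies every point correctly, but the weighted vote of three stumps does, which illustrates how combining weak classifiers fitted to reweighted data can improve on any one of them.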

6,598 citations


"Predicting species distribution: of..." refers background or methods in this paper

  • ...Nevertheless, combinations of different modelling approaches can be used to identify significant interactions, as implemented in generalized boosting models (Friedman et al. 2000)....

    [...]

Journal Article
TL;DR: In this paper, a hierarchical modeling framework is proposed through which some of these limitations can be addressed within a broader, scale-dependent framework, and it is proposed that, although the complexity of the natural system presents fundamental limits to predictive modelling, the bioclimate envelope approach can provide a useful first approximation as to the potentially dramatic impact of climate change on biodiversity.
Abstract: Modelling strategies for predicting the potential impacts of climate change on the natural distribution of species have often focused on the characterization of a species’ bioclimate envelope. A number of recent critiques have questioned the validity of this approach by pointing to the many factors other than climate that play an important part in determining species distributions and the dynamics of distribution changes. Such factors include biotic interactions, evolutionary change and dispersal ability. This paper reviews and evaluates criticisms of bioclimate envelope models and discusses the implications of these criticisms for the different modelling strategies employed. It is proposed that, although the complexity of the natural system presents fundamental limits to predictive modelling, the bioclimate envelope approach can provide a useful first approximation as to the potentially dramatic impact of climate change on biodiversity. However, it is stressed that the spatial scale at which these models are applied is of fundamental importance, and that model results should not be interpreted without due consideration of the limitations involved. A hierarchical modelling framework is proposed through which some of these limitations can be addressed within a broader, scale-dependent

3,847 citations


"Predicting species distribution: of..." refers background in this paper

  • ...There is an ongoing debate concerning the inclusion of interspecific interactions into SDMs, particularly in global change and conservation contexts (Davis et al. 1998; Pearson & Dawson 2003)....

    [...]

  • ...At broad extent and coarse resolution, we expect competition or facilitation should have a lesser effect on species distribution than at more local extent and finer resolution (Huston 2002; Pearson & Dawson 2003), although local abundance may still be strongly affected at larger scale....

    [...]

  • ...…correlations between distributions of species and climate seems to be those of Johnston (1924), predicting the invasive spread of a cactus species in Australia, and Hintikka (1963), assessing the climatic determinants of the distribution of several European species (quoted in Pearson & Dawson 2003)....

    [...]