What contributions have the authors mentioned in the paper "Presence-only and presence-absence data for comparing species distribution modeling methods"?

A particularly interesting characteristic of this dataset is that independent presence-absence (PA) survey data are available for evaluation alongside the presence-only (PO) species occurrence data intended for modeling. The authors of the paper are the subset of the NCEAS working group who gathered and processed the data described here, together with the suppliers of those data; they are referred to as “the NCEAS data group.” The data come from six regions of the world (Fig. 1). The authors generated random locations for each study region, referred to as “background” (BG), or elsewhere “pseudo-absence”, samples. The NCEAS working group designed a “baseline” study to compare 16 modeling algorithms (Elith et al. 2006), and also several experimental treatments that manipulated the datasets to explore the effects on model performance of sample size (Wisz et al. 2008), spatial resolution (grain) of environmental data (Guisan et al. 2007), error in PO location (Graham et al. 2008), bias in records (Dudik and Phillips 2009; Phillips et al. 2009) and treatment of BG data (Phillips et al. 2009). The environmental data for PA sites were provided, so modelers could predict environmental suitability for all species at these sites. [...] 2008; Amano and Sutherland 2013; Isaac and Pocock 2015) and thus may not be representative of the species distribution in the study area. [...] wrongly emphasising the suitability of some environments and under-reporting the suitability of others. Some of the data preparation methods were reported in the original baseline modeling paper (Elith et al. 2006), but the authors describe them here in full detail, to gather all the information in one place and to ensure the descriptions are adequate for data re-use.
This manuscript and the accompanying metadata should be treated as the authoritative description of the data supplied here. All datasets were cleaned by JE and CG to these common properties agreed to by the group: (a) all data projected to a common projection for that region; (b) all raster data for a region aligned to the same extent and resolution, and only rasters with close to complete coverage in the region of interest retained; (c) species records reduced to a maximum of one record per raster cell using the following protocol: for PO data, if there is at least one presence record in a cell, retain one presence record for that cell; for PA data, if presence(s) and absence(s) both occur in the same cell, retain one presence; (d) records checked and rectified if necessary to ensure that PO and PA locations do not co-occur in a grid cell; (e) species records from locations with no environmental data removed. Many SDMs contrast the environment at locations of known occurrence of a species to that at a set of random locations in the study region (background, quadrature, or pseudo-absence points: (Phillips et al. [...] The authors sourced datasets from six regions of the world (Figure 1 and Table 1); the regions are hereafter referred to by the initials provided in Figure 1 and in column 1 of Table 1. This provided a diverse and representative data set for the NCEAS studies (Supplementary Information 1), and a benchmark set that the authors anticipate being broadly useful into the future. The different data sources used different sampling designs and methods, which can provide insights into how data quality influences model outcomes and accuracy. These variations are typical of ecological datasets, which further makes this dataset a useful benchmark for SDM modelers.
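
The one-record-per-cell rule in item (c) can be illustrated with a minimal Python sketch. It assumes a north-up raster described by a hypothetical origin (`x_min`, `y_max`) and a square cell size, and records supplied as `(x, y, presence)` tuples; none of these names come from the paper or its R package. The text does not say which record is retained when several of the same kind share a cell, so keeping the first one seen is also an assumption.

```python
import math
from typing import Dict, Iterable, List, Tuple

Record = Tuple[float, float, int]  # (x, y, presence), presence is 1 or 0


def cell_index(x: float, y: float, x_min: float, y_max: float,
               cell: float) -> Tuple[int, int]:
    """Return (row, col) of the raster cell containing the point (x, y)."""
    col = math.floor((x - x_min) / cell)
    row = math.floor((y_max - y) / cell)
    return row, col


def one_record_per_cell(records: Iterable[Record], x_min: float,
                        y_max: float, cell: float) -> List[Record]:
    """Reduce records to at most one per raster cell.

    PO data (all presence == 1): the first record in each cell is kept.
    PA data: if a cell holds both presences and absences, a presence is kept.
    """
    kept: Dict[Tuple[int, int], Record] = {}
    for rec in records:
        x, y, presence = rec
        key = cell_index(x, y, x_min, y_max, cell)
        current = kept.get(key)
        # Keep the first record seen; let a presence replace an absence.
        if current is None or (presence == 1 and current[2] == 0):
            kept[key] = rec
    return list(kept.values())


# Hypothetical PA records on a raster with 100 m cells.
pa_records = [(10.0, 95.0, 0), (40.0, 60.0, 1), (150.0, 95.0, 0)]
reduced = one_record_per_cell(pa_records, x_min=0.0, y_max=100.0, cell=100.0)
print(reduced)
```

In this example the first two records fall in the same cell, so the presence is retained and the absence dropped; the third record sits alone in the neighbouring cell and is kept unchanged.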

What was the purpose of the PA evaluation data?

In the publications shown in the table in Supplementary Information 1, the PA evaluation (test) data were kept independent as a “blind evaluation” set, that is, they were not used to tune models.
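
One way to picture a blind evaluation is that predictions at the PA sites are scored against the observed presences and absences only after model fitting is finished. The sketch below computes the area under the ROC curve (AUC) in its rank-based (Mann-Whitney) form. The choice of AUC as the statistic, the function name, and the example values are assumptions for illustration, not details taken from the text above.

```python
from typing import Sequence


def auc(observed: Sequence[int], predicted: Sequence[float]) -> float:
    """Rank-based AUC: the probability that a randomly chosen presence site
    receives a higher prediction than a randomly chosen absence site,
    with ties counted as one half."""
    presences = [p for o, p in zip(observed, predicted) if o == 1]
    absences = [p for o, p in zip(observed, predicted) if o == 0]
    if not presences or not absences:
        raise ValueError("need at least one presence and one absence")
    wins = 0.0
    # Simple pairwise comparison; fine for a sketch, slow for very large sets.
    for p in presences:
        for a in absences:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(presences) * len(absences))


# Hypothetical held-out PA observations and model predictions at those sites.
observed = [1, 0, 1, 0, 0]
predicted = [0.9, 0.2, 0.4, 0.6, 0.1]
print(round(auc(observed, predicted), 3))
```

Because the PA sites play no part in fitting or tuning, a score computed this way reflects performance on data the model has not seen.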

What is the name of the txt file?

The [...] txt file adds the authors responsible for data preparation, and details of coordinate reference systems, units and raster cell sizes.

What is the purpose of this article?

The authors kindly request that each user (even students within teaching exercises) download the data or R package individually because some data providers would like to track data downloads, to enable reporting on data usage as required by their funding agencies.

(Open Access) Presence-only and presence-absence data for comparing species distribution modeling methods (2020) | Jane Elith

Biodiversity Informatics, 15, 2020, pp. 69-80

PRESENCE-ONLY AND PRESENCE-ABSENCE DATA FOR COMPARING SPECIES DISTRIBUTION MODELING METHODS

School of BioSciences, University of Melbourne, Australia.

Swiss Federal Research Institute WSL, CH-8903 Birmensdorf, Switzerland.

CSIRO Land and Water, Cairns, Queensland, Australia.

CSIRO Land and Water, Canberra, Australian Capital Territory (ACT), Australia.

CSIRO Land and Water, Tropical Forest Research Centre, Atherton, Queensland, Australia.

University of Lausanne, 1015 Lausanne, Switzerland.

University of California, Davis, USA.

EWHALE Lab, Institute of Arctic Biology, Biology & Wildlife Department, University of Alaska Fairbanks, Fairbanks, Alaska 99775, USA.

Universidade de São Paulo, Brazil.

College of Agricultural and Life Sciences, University of Florida, USA.

Research School of Biology & Center for Biodiversity Analysis, Australian National University, Australia.

Manaaki Whenua – Landcare Research, Hamilton, New Zealand (current address: PANTHERA, Floor 18, 8 West 40 St, New York, USA 10018).

Biodiversity Institute, University of Kansas, Lawrence, Kansas 66045, USA.

Center for Biodiversity and Conservation, American Museum of Natural History, New York, USA.

Department of Geography, Planning and Environment, Concordia University, Montreal, Canada.

Centre for Tropical Environmental and Sustainability Science, James Cook University, Townsville, Australia.

Manaaki Whenua – Landcare Research, Lincoln, New Zealand.

*Corresponding Author: j.elith@unimelb.edu.au

Abstract.

[...] environmental data and can be used to predict distri- [...]

Jane Elith et al. – Presence-only and Presence-absence Data for Comparing Species Distribution Modeling Methods

[...] for PA sites were provided, so modelers could predict [...]

[...] (detailed in Supplementary Information 1, [...]

[...] more transparent and repeatable (National Academy [...]

[...] model trained and [...]

[...] none are problem-free, and SDM evaluation remains [...]

[...] in some areas simply leads to a more or less precise [...]

[...] be used to calculate a broader suite of evaluation sta- [...]

[...] spatial resolution (smallest raster cell size) available, [...]

[...] summarised in Table 1, and details of variables in Sup- [...]

[...] Records span species of [...]

[...] to 19 120 PA evaluation sites (Table 1 and detailed [...]

http://hdl.handle.net/1808/30582

[...] below, are available openly [...] OSF data [...]

[...] 561 MB in total and many users will not want to [...]

1. Environmental rasters

[...] /data/Environment folder at [...]

[...] /data/Environment [...] and details of coordinate reference systems, units and [...]

2. Presence-only data: locations and environmental samples

[...] /data/Records/train_po folder [...]

3. Background data: locations and environmental samples

[...] /data/Records/train_bg folder at [...]


Code | Region details | Area ('000 km²) | No. env vars (no. categorical) | Approx. grid cell resolution (m)
AWT | Australian Wet Tropics, Queensland, Australia | 23.97 | 13 (0) |
CAN | Ontario, Canada | 979.34 | 11 (1) | 1 000
NSW | North-east New South Wales, Australia | 76.18 | 13 (1) | 100

(Area location column: red polygons show locations within countries / continents.)

Code | Biological groups & number species | Values for mean no. records per species and no. sites, in source order
AWT | b: birds: 20 | 155 | 340
AWT | p: vascular plants: 20 | 102
CAN | birds: 30 | 253 | 1 282 | 14 571
NSW | ba: bats: 7 | 570
NSW | db: diurnal birds: 8 | 189 | 702
NSW | nb: nocturnal birds: 2 | 134 | 142 | 1 137
NSW | ot: open-forest trees: 8 | 164 | 2 075
NSW | ou: open-forest understorey vascular plants: 8 | 358 | 1 309
NSW | rt: rainforest trees: 7 | 212 | 1 036
NSW | ru: rainforest understorey vascular plants: 6 | 909
NSW | sr: small reptiles: 8 | 1 008




References

The Elements of Statistical Learning: Data Mining, Inference, and Prediction

Novel methods improve prediction of species' distributions from occurrence data

Regression modeling strategies : with applications to linear models, logistic regression, and survival analysis

A statistical explanation of MaxEnt for ecologists

Regression Modeling Strategies: With Applications to Linear Models, Logistic Regression, and Survival Analysis
