Methods to account for spatial autocorrelation in the analysis of species distributional data: a review


Carsten F. Dormann, Jana M. McPherson, Miguel B. Araújo, Roger Bivand, Janine Bolliger,
Gudrun Carl, Richard G. Davies, Alexandre Hirzel, Walter Jetz, W. Daniel Kissling,
Ingolf Kühn, Ralf Ohlemüller, Pedro R. Peres-Neto, Björn Reineking, Boris Schröder,
Frank M. Schurr and Robert Wilson
C. F. Dormann (carsten.dormann@ufz.de), Dept of Computational Landscape Ecology, UFZ Helmholtz Centre for Environmental
Research, Permoserstr. 15, DE-04318 Leipzig, Germany. J. M. McPherson, Dept of Biology, Dalhousie Univ., 1355 Oxford Street
Halifax NS, B3H 4J1 Canada. M. B. Araújo, Dept de Biodiversidad y Biología Evolutiva, Museo Nacional de Ciencias Naturales,
CSIC, C/ Gutiérrez Abascal, 2, ES-28006 Madrid, Spain, and Centre for Macroecology, Inst. of Biology, Universitetsparken 15,
DK-2100 Copenhagen Ø, Denmark. R. Bivand, Economic Geography Section, Dept of Economics, Norwegian School of Economics and
Business Administration, Helleveien 30, NO-5045 Bergen, Norway. J. Bolliger, Swiss Federal Research Inst. WSL, Zürcherstrasse
111, CH-8903 Birmensdorf, Switzerland. G. Carl and I. Kühn, Dept of Community Ecology (BZF), UFZ Helmholtz Centre for
Environmental Research, Theodor-Lieser-Strasse 4, DE-06120 Halle, Germany, and Virtual Inst. Macroecology,
Theodor-Lieser-Strasse 4, DE-06120 Halle, Germany. R. G. Davies, Biodiversity and Macroecology Group, Dept of Animal and Plant Sciences,
Univ. of Sheffield, Sheffield S10 2TN, U.K. A. Hirzel, Ecology and Evolution Dept, Univ. de Lausanne, Biophore Building,
CH-1015 Lausanne, Switzerland. W. Jetz, Ecology Behavior and Evolution Section, Div. of Biological Sciences, Univ. of California, San
Diego, 9500 Gilman Drive, MC 0116, La Jolla, CA 92093-0116, USA. W. D. Kissling, Community and Macroecology Group,
Inst. of Zoology, Dept of Ecology, Johannes Gutenberg Univ. of Mainz, DE-55099 Mainz, Germany, and Virtual Inst. Macroecology,
Theodor-Lieser-Strasse 4, DE-06120 Halle, Germany. R. Ohlemüller, Dept of Biology, Univ. of York, PO Box 373, York YO10
5YW, U.K. P. R. Peres-Neto, Dept of Biology, Univ. of Regina, SK, S4S 0A2 Canada, present address: Dept of Biological Sciences,
Univ. of Quebec at Montreal, CP 8888, Succ. Centre Ville, Montreal, QC, H3C 3P8, Canada. B. Reineking, Forest Ecology, ETH
Zurich CHN G 75.3, Universitätstr. 16, CH-8092 Zürich, Switzerland. B. Schröder, Inst. for Geoecology, Univ. of Potsdam,
Karl-Liebknecht-Strasse 24-25, DE-14476 Potsdam, Germany. F. M. Schurr, Plant Ecology and Nature Conservation, Inst. of
Biochemistry and Biology, Univ. of Potsdam, Maulbeerallee 2, DE-14469 Potsdam, Germany. R. Wilson, Área de Biodiversidad y
Conservación, Escuela Superior de Ciencias Experimentales y Tecnología, Univ. Rey Juan Carlos, Tulipán s/n, Móstoles, ES-28933
Madrid, Spain.
Species distributional or trait data based on range map (extent-of-occurrence) or atlas survey data often display
spatial autocorrelation, i.e. locations close to each other exhibit more similar values than those further apart. If
this pattern remains present in the residuals of a statistical model based on such data, one of the key assumptions
of standard statistical analyses, that residuals are independent and identically distributed (i.i.d.), is violated. The
violation of the assumption of i.i.d. residuals may bias parameter estimates and can increase type I error rates
(falsely rejecting the null hypothesis of no effect). While this is increasingly recognised by researchers analysing
species distribution data, there is, to our knowledge, no comprehensive overview of the many available spatial
statistical methods to take spatial autocorrelation into account in tests of statistical significance. Here, we
describe six different statistical approaches to infer correlates of species’ distributions, for both presence/absence
(binary response) and species abundance data (Poisson or normally distributed response), while accounting for
spatial autocorrelation in model residuals: autocovariate regression; spatial eigenvector mapping; generalised
least squares; (conditional and simultaneous) autoregressive models and generalised estimating equations. A
comprehensive comparison of the relative merits of these methods is beyond the scope of this paper. To
demonstrate each method’s implementation, however, we undertook preliminary tests based on simulated data.
These preliminary tests verified that most of the spatial modeling techniques we examined showed good type I
error control and precise parameter estimates, at least when confronted with simplistic simulated data containing
spatial autocorrelation in the errors. However, we found that for presence/absence data the results and
conclusions were very variable between the different methods. This is likely due to the low information content
of binary maps. Also, in contrast with previous studies, we found that autocovariate methods consistently
underestimated the effects of environmental controls of species distributions. Given their widespread use, in
particular for the modelling of species presence/absence data (e.g. climate envelope models), we argue that this
warrants further study and caution in their use. To aid other ecologists in making use of the methods described,
code to implement them in freely available software is provided in an electronic appendix.
Ecography 30: 609–628, 2007
doi: 10.1111/j.2007.0906-7590.05171.x
© 2007 The Authors. Journal compilation © 2007 Ecography
Subject Editor: Carsten Rahbek. Accepted 3 August 2007
Species distributional data such as species range maps
(extent-of-occurrence), breeding bird surveys and biodiversity
atlases are a common source for analyses of
species-environment relationships. These, in turn, form
the basis for conservation and management plans for
endangered species, for calculating distributions under
future climate and land-use scenarios and other forms
of environmental risk assessment.
The analysis of spatial data is complicated by a phenomenon known as spatial autocorrelation. Spatial
autocorrelation (SAC) occurs when the values of variables sampled at nearby locations are not independent
from each other (Tobler 1970). The causes of spatial autocorrelation are manifold, but three factors are
particularly common (Legendre and Fortin 1989, Legendre 1993, Legendre and Legendre 1998): 1)
biological processes such as speciation, extinction, dispersal or species interactions are distance-related; 2)
non-linear relationships between environment and species are modelled erroneously as linear; 3) the statistical
model fails to account for an important environmental determinant that in itself is spatially structured and thus
causes spatial structuring in the response (Besag 1974). The second and third points are not always referred to as
spatial autocorrelation, but rather spatial dependency (Legendre et al. 2002). Since they also lead to
autocorrelated residuals, these are equally problematic. A fourth source of spatial autocorrelation relates to spatial
resolution, because coarser grains lead to a spatial smoothing of data. In all of these cases, SAC may
confound the analysis of species distribution data.
Spatial autocorrelation may be seen as both an opportunity and a challenge for spatial analysis. It is an
opportunity when it provides useful information for inference of process from pattern (Palma et al. 1999)
by, for example, increasing our understanding of contagious biotic processes such as population growth,
geographic dispersal, differential mortality, social organization or competition dynamics (Griffith and
Peres-Neto 2006). In most cases, however, the presence of spatial autocorrelation is seen as posing a serious
shortcoming for hypothesis testing and prediction (Lennon 2000, Dormann 2007b), because it violates
the assumption of independently and identically distributed (i.i.d.) errors of most standard statistical
procedures (Anselin 2002) and hence inflates type I errors, occasionally even inverting the slope of
relationships from non-spatial analysis (Kühn 2007).
A variety of methods have consequently been developed to correct for the effects of spatial autocorrelation
(partially reviewed in Keitt et al. 2002, Miller et al. 2007, see below), but only a few have made it into the
ecological literature. The aims of this paper are to 1) present and explain methods that account for spatial
autocorrelation in analyses of spatial data; the approaches considered are: autocovariate regression, spatial
eigenvector mapping (SEVM), generalised least squares (GLS), conditional autoregressive models (CAR),
simultaneous autoregressive models (SAR), generalised linear mixed models (GLMM) and generalised estimating
equations (GEE); 2) describe which of these methods can be used for which error distribution, and discuss
potential problems with implementation; 3) illustrate how to implement these methods using simulated data
sets and by providing computing code (Anon. 2005).
Methods for dealing with spatial autocorrelation
Detecting and quantifying spatial autocorrelation
Before considering the use of modelling methods that account for spatial autocorrelation, it is a sensible first
step to check whether spatial autocorrelation is in fact likely to impact the planned analyses, i.e. if model
residuals indeed display spatial autocorrelation. Checking for spatial autocorrelation (SAC) has become a
commonplace exercise in geography and ecology (Sokal and Oden 1978a, b, Fortin and Dale 2005). Established
procedures include (Isaaks and Shrivastava 1989, Perry et al. 2002): Moran’s I plots (also termed Moran’s I
correlogram by Legendre and Legendre 1998), Geary’s c correlograms and semi-variograms. In all three cases a
measure of similarity (Moran’s I, Geary’s c) or variance (variogram) of data points (i and j) is plotted as a
function of the distance between them ($d_{ij}$). Distances are usually grouped into bins. Moran’s I-based
correlograms typically show a decrease from some level of SAC to a value of 0 (or below; expected value in the
absence of SAC: $E(I) = -1/(n-1)$, where $n$ = sample size), indicating no SAC at some distance between locations.
Variograms depict the opposite, with the variance between pairs of points increasing up to a certain
distance, where variance levels off. Variograms are more commonly employed in descriptive geostatistics, while
correlograms are the prevalent graphical presentation in ecology (Fortin and Dale 2005).
Values of Moran’s I are assessed by a test statistic (the Moran’s I standard deviate) which indicates the
statistical significance of SAC in e.g. model residuals. Additionally, model residuals may be plotted as a map
that more explicitly reveals particular patterns of spatial autocorrelation (e.g. anisotropy or non-stationarity of
spatial autocorrelation). For further details and formulae see e.g. Isaaks and Shrivastava (1989) or Fortin
and Dale (2005).
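As a rough illustration of the quantities described above, the following Python sketch computes Moran’s I for all pairs of points within one distance class and assesses it with a simple permutation test. The binary distance-band weights, the function names and the toy gradient data are assumptions made for this example and are not taken from the paper’s appendix; the permutation test is used here in place of the standard-deviate test.

```python
import math
import random

def morans_i(values, coords, d_min, d_max):
    """Moran's I using binary weights: w_ij = 1 if d_min < d_ij <= d_max, else 0."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    num = 0.0     # sum over pairs of w_ij * dev_i * dev_j
    w_sum = 0.0   # sum of all weights
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = math.dist(coords[i], coords[j])
            if d_min < d <= d_max:
                num += dev[i] * dev[j]
                w_sum += 1.0
    denom = sum(e * e for e in dev)
    return (n / w_sum) * (num / denom)

def permutation_p_value(values, coords, d_min, d_max, n_perm=199, seed=1):
    """One-sided p-value for positive SAC, obtained by shuffling values among locations."""
    rng = random.Random(seed)
    observed = morans_i(values, coords, d_min, d_max)
    shuffled = list(values)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if morans_i(shuffled, coords, d_min, d_max) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)

# Hypothetical "residuals": a smooth gradient on a 6 x 6 grid (strongly autocorrelated)
coords = [(x, y) for x in range(6) for y in range(6)]
residuals = [x + y for x, y in coords]

# Correlogram: Moran's I for the distance classes (0,1], (1,2], (2,3], (3,4]
correlogram = [(d, morans_i(residuals, coords, d - 1, d)) for d in (1, 2, 3, 4)]
```

Evaluating `morans_i` over successive distance classes, as in the last line, yields the values plotted in a correlogram; values near $-1/(n-1)$ indicate no spatial autocorrelation in that class.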
Assumptions common to all modelling approaches considered
All methods assume spatial stationarity, i.e. spatial
autocorrelation and effects of environmental correlates
to be constant across the region, and there are very few
methods to deal with non-stationarity (Osborne et al.
2007). Stationarity may or may not be a reasonable
assumption, depending, among other things, on the
spatial extent of the study. If the main cause of spatial
autocorrelation is dispersal (for example in research on
animal distributions), stationarity is likely to be
violated, for example when moving from a floodplain
to the mountains, where movement may be more
restricted. One method able to accommodate spatial
variation in autocorrelation is geographically weighted
regression (Fotheringham et al. 2002), a method not
considered here because of its limited use for hypothesis
testing (coefficient estimates depend on spatial position)
and because it was not designed to remove spatial
autocorrelation (see e.g. Kupfer and Farris 2007, for a
GWR correlogram).
Another assumption is that of isotropic spatial
autocorrelation. This means that the process causing
the spatial autocorrelation acts in the same way in all
directions. Environmental factors that may cause
anisotropic spatial autocorrelation are wind (giving a
wind-dispersed organism a preferential direction), water
currents (e.g. carrying plankton), or directionality in
soil transport (carrying seeds) from mountains to plains.
He et al. (2003) as well as Worm et al. (2005) provide
examples of analyses accounting for anisotropy in
ecological data, and several of the methods described
below can be adapted for such circumstances.
Description of spatial statistical modelling methods
The methods we describe in the following fall broadly into three groups. 1) Autocovariate regression and
spatial eigenvector mapping seek to capture the spatial configuration in additional covariates, which are then
added into a generalised linear model (GLM). 2) Generalised least squares (GLS) methods fit a
variance-covariance matrix based on the non-independence of spatial observations. Simultaneous autoregressive
models (SAR) and conditional autoregressive models (CAR) do the same but in different ways to GLS, and
the generalised linear mixed models (GLMM) we employ for non-normal data are a generalisation of
GLS. 3) Generalised estimating equations (GEE) split the data into smaller clusters before also modelling the
variance-covariance relationship. For comparison, the following non-spatial models were also employed:
simple GLM and trend-surface generalised additive models (GAM: Hastie and Tibshirani 1990, Wood
2006), in which geographical location was fitted using splines as a trend-surface (as a two-dimensional spline
on geographical coordinates). Trend surface GAM does not address the problem of spatial autocorrelation, but
merely accounts for trends in the data across larger geographical distances (Cressie 1993). A promising tool
which became available only recently is the use of wavelets to remove spatial autocorrelation (Carl and
Kühn 2007b). However, the method was published too recently to be included here and hence awaits further
testing.
We also did not include Bayesian spatial models in this review. Several recent publications have employed
this method and provide a good coverage of its implementation (Osborne et al. 2001, Hooten et al.
2003, Thogmartin et al. 2004, Gelfand et al. 2005, Kühn et al. 2006, Latimer et al. 2006). The Bayesian
approach to spatial models used in these studies is based either on a CAR or an autologistic implementation
similar to the one we used as a frequentist method. The Bayesian framework allows for a more flexible
incorporation of other complications (observer bias, missing data, different error distributions) but is much more
computer-intensive than any of the methods presented here.
Beyond the methods mentioned above, there are also those which correct test statistics for spatial
autocorrelation. These include Dutilleul’s modified t-test (Dutilleul 1993) or the CRH-correction for
correlations (Clifford et al. 1989), randomisation tests such as partial Mantel tests (Legendre and Legendre 1998), or
strategies employed by Lennon (2000), Liebhold and Gurevitch (2002) and Segurado et al. (2006) which are
all useful as a robust assessment of correlation between environmental and response variables. As these methods
do not allow a correction of the parameter estimates, however, they are not considered further in this study.
In the following sections we present a detailed description of all methods employed here.

1. Autocovariate models
Autocovariate models address spatial autocorrelation by estimating how much the response variable at any one
site reflects response values at surrounding sites. This is achieved through a simple extension of generalised
linear models by adding a distance-weighted function of neighbouring response values to the model’s
explanatory variables. This extra parameter is known as the autocovariate. The autocovariate is intended to capture
spatial autocorrelation originating from endogenous processes such as conspecific attraction, limited
dispersal, contagious population growth, and movement of censused individuals between sampling sites (Smith
1994, Keitt et al. 2002, Yamaguchi et al. 2003).
Adding the autocovariate transforms the linear predictor of a generalised linear model from its usual
form, $y = X\beta + \varepsilon$, to $y = X\beta + \rho A + \varepsilon$, where $\beta$ is a
vector of coefficients for intercept and explanatory variables $X$; and $\rho$ is the coefficient of the
autocovariate $A$.
$A$ at any site $i$ may be calculated as:
$$A_i = \sum_{j \in k_i} w_{ij} y_j \quad \text{(the weighted sum)}$$
or
$$A_i = \frac{\sum_{j \in k_i} w_{ij} y_j}{\sum_{j \in k_i} w_{ij}} \quad \text{(the weighted average)},$$
where $y_j$ is the response value of $y$ at site $j$ among site $i$’s set of $k_i$ neighbours; and $w_{ij}$ is
the weight given to site $j$’s influence over site $i$ (Augustin et al. 1996, Gumpertz et al. 1997). Usually,
weight functions are related to geographical distance between data points (Augustin et al. 1996, Araújo and
Williams 2000, Osborne et al. 2001, Brownstein et al. 2003) or environmental distance (Augustin et al. 1998,
Ferrier et al. 2002). The weighting scheme and neighbourhood size ($k$) are often chosen arbitrarily, but may be
optimised (by trial and error) to best capture spatial autocorrelation (Augustin et al. 1996). Alternatively, if
the cause of spatial autocorrelation is known (or at least suspected), the choice of neighbourhood configuration
may be informed by biological parameters, such as the species’ dispersal capacity (Knapp et al. 2003).
Autocovariate models can be applied to binomial
data (‘‘autologistic regression’’, Smith 1994, Augustin
et al. 1996, Klute et al. 2002, Knapp et al. 2003), as
well as normally and Poisson-distributed data (Luoto
et al. 2001, Kaboli et al. 2006).
Where spatial autocorrelation is thought to be anisotropic (e.g. because seed dispersal follows prevailing
winds or downstream run-off), multiple autocovariates can be used to capture spatial autocorrelation in
different geographic directions (He et al. 2003).
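The calculation of the autocovariate can be sketched in a few lines of Python. The example below is a minimal illustration that assumes inverse-distance weights ($w_{ij} = 1/d_{ij}$) and a fixed neighbourhood radius; both choices, the function name `autocovariate` and the toy presence/absence data are hypothetical and would need to be adapted to the study system.

```python
import math

def autocovariate(y, coords, radius, weighted_average=True):
    """Autocovariate A_i from all neighbours j within `radius` of site i.

    Assumes inverse-distance weights w_ij = 1 / d_ij and distinct site coordinates.
    """
    A = []
    for i, ci in enumerate(coords):
        num = 0.0       # sum_j w_ij * y_j
        w_total = 0.0   # sum_j w_ij
        for j, cj in enumerate(coords):
            if i == j:
                continue
            d = math.dist(ci, cj)
            if d <= radius:
                w = 1.0 / d
                num += w * y[j]
                w_total += w
        if weighted_average:
            A.append(num / w_total if w_total > 0 else 0.0)
        else:
            A.append(num)
    return A

# Hypothetical presence/absence data on a 4 x 4 grid: the species occupies the left half
coords = [(x, y) for x in range(4) for y in range(4)]
presence = [1 if x < 2 else 0 for x, y in coords]
A = autocovariate(presence, coords, radius=1.0)

# The autocovariate enters the model as one extra column of the design matrix:
# intercept, one made-up environmental covariate, autocovariate
env = [0.1 * x for x, y in coords]
design = [[1.0, e, a] for e, a in zip(env, A)]
```

The resulting `design` rows could then be passed to any GLM routine (for instance a logistic regression for presence/absence data), with the coefficient of the third column playing the role of $\rho$.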
2. Spatial eigenvector mapping (SEVM)
Spatial eigenvector mapping is based on the idea that the spatial arrangement of data points can be translated
into explanatory variables, which capture spatial effects at different spatial resolutions. During the analysis,
those eigenvectors that reduce spatial autocorrelation in the residuals best are chosen explicitly as spatial
predictors. Since each eigenvector represents a particular spatial patterning, SAC is effectively allowed to vary
in space, relaxing the assumption of both spatial isotropy and stationarity. Plotting these eigenvectors
reveals the spatial patterning of the spatial autocorrelation (see Diniz-Filho and Bini 2005, for an example).
This method could thus be very useful for data with SAC stemming from larger scale observation bias.
The method is based on the eigenfunction decomposition of spatial connectivity matrices, a relatively
new and still unfamiliar method for describing spatial patterns in complex data (Griffith 2000b, Borcard and
Legendre 2002, Griffith and Peres-Neto 2006, Dray et al. 2006). A very similar approach, called eigenvector
filtering, was presented by Diniz-Filho and Bini (2005) based on their method to account for phylogenetic
non-independence in biological data (Diniz-Filho et al. 1998). Eigenvectors from these matrices represent the
decompositions of Moran’s I statistic into all mutually orthogonal maps that can be generated from a given
connectivity matrix (Griffith and Peres-Neto 2006). Either binary or distance-based connectivity matrices
can be decomposed, offering a great deal of flexibility regarding topology and transformations. Given the
non-Euclidean nature of the spatial connectivity matrices (i.e. not all sampling units are connected), both
positive and negative eigenvalues are produced. The non-Euclidean part is introduced by the fact that only
certain connections among sampling units, and not all, are considered. Eigenvectors with positive eigenvalues
represent positive autocorrelation, whereas eigenvectors with negative eigenvalues represent negative
autocorrelation. For the sake of presenting a general method that will work for either binary or distance matrices,
we used a distance-based eigenvector procedure (after Dray et al. 2006) which can be summarized as follows:
1) compute a pairwise Euclidean (geographic) distance matrix among sampling units: $D = [d_{ij}]$; 2) choose a
threshold value $t$ and construct a connectivity matrix using the following rule:
$$W = [w_{ij}] = \begin{cases} 0 & \text{if } i = j \\ 0 & \text{if } d_{ij} > t \\ 1 - (d_{ij}/4t)^2 & \text{if } d_{ij} \le t \end{cases}$$
where $t$ is chosen as the maximum distance that maintains connections among all sampling units being
connected using a minimum spanning tree algorithm (Legendre and Legendre 1998). Because the example
data we use represent a regular grid (see below), $t = 1$ and thus $w_{ij}$ is either 0 or $1 - 1/4^2 = 0.9375$
in our analysis. Note that we can change 0.9375 to 1 without affecting eigenvector extraction. This would make
the matrix fully compatible with a binary matrix which is the case for a regular grid. 3) Compute the
eigenvectors of the centred similarity matrix: $(I - \mathbf{1}\mathbf{1}^T/n)\,W\,(I - \mathbf{1}\mathbf{1}^T/n)$,
where $I$ is the identity matrix and $\mathbf{1}$ is a vector of ones. Due to numerical precision regarding
the eigenvector extraction of large matrices (Bai et al. 1996) the method is limited to ca 7000 observations
depending on platform and software (but see Griffith 2000a, for solutions based on large binary connectivity
matrices). 4) Select eigenvectors to be included as spatial predictors in a linear or generalised linear model.
Here, a model selection procedure that minimizes the amount of spatial autocorrelation in residuals was used
(see Griffith and Peres-Neto 2006 and Appendix for computational details). In this approach, eigenvectors are
added to a model until the spatial autocorrelation in the residuals, measured by Moran’s I, is non-significant.
Our selection algorithm considered global Moran’s I (i.e. autocorrelation across all residuals), but could be
easily amended to target spatial autocorrelation within certain distance classes. The significance of Moran’s I
was tested using a permutation test as implemented in Lichstein et al. (2002). This potentially renders the
selection procedure computationally intensive for large data sets (200 or more observations), because a
permutation test must be performed for each new eigenvector entered into the model. Once the
location-dependent, but data-independent eigenvectors are selected, they are incorporated into the ordinary
regression model (i.e. linear or generalized linear model) as covariates. Since their relevance has been assessed
during the filtering process, model simplification is not indicated (although some eigenvectors will not be
significant).
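Steps 1 to 3 of this procedure can be sketched in plain Python as follows. This is only an illustrative sketch: the threshold `t` is supplied by hand rather than derived from a minimum spanning tree, the eigen-decomposition uses a simple Jacobi rotation routine written for this example (any symmetric eigen-solver would serve), and the Moran’s I based selection of step 4 is not reproduced. All function names are hypothetical.

```python
import math

def connectivity(coords, t):
    """Step 2: w_ij = 1 - (d_ij / (4 t))**2 if i != j and d_ij <= t, else 0."""
    n = len(coords)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d = math.dist(coords[i], coords[j])
            if i != j and d <= t:
                W[i][j] = 1.0 - (d / (4.0 * t)) ** 2
    return W

def double_centre(W):
    """Step 3: (I - 11^T/n) W (I - 11^T/n), i.e. remove row and column means."""
    n = len(W)
    row = [sum(r) / n for r in W]
    col = [sum(W[i][j] for i in range(n)) / n for j in range(n)]
    grand = sum(row) / n
    return [[W[i][j] - row[i] - col[j] + grand for j in range(n)] for i in range(n)]

def jacobi_eigen(M, sweeps=50, tol=1e-20):
    """Eigenvalues and eigenvectors (columns of V) of a symmetric matrix."""
    n = len(M)
    A = [r[:] for r in M]
    V = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(A[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(A[p][q]) < 1e-15:
                    continue
                diff = A[q][q] - A[p][p]
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2.0 * A[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # A <- A J (columns p, q) and V <- V J
                    A[k][p], A[k][q] = c * A[k][p] - s * A[k][q], s * A[k][p] + c * A[k][q]
                    V[k][p], V[k][q] = c * V[k][p] - s * V[k][q], s * V[k][p] + c * V[k][q]
                for k in range(n):  # A <- J^T A (rows p, q)
                    A[p][k], A[q][k] = c * A[p][k] - s * A[q][k], s * A[p][k] + c * A[q][k]
    return [A[i][i] for i in range(n)], V

# Hypothetical example: a regular 3 x 3 grid with threshold t = 1
coords = [(x, y) for x in range(3) for y in range(3)]
W = connectivity(coords, t=1.0)
C = double_centre(W)
eigenvalues, V = jacobi_eigen(C)

# Candidate spatial predictors: eigenvectors with positive eigenvalues, largest first
order = sorted(range(len(eigenvalues)), key=lambda k: -eigenvalues[k])
predictors = [[V[i][k] for i in range(len(coords))] for k in order if eigenvalues[k] > 1e-9]
```

Each entry of `predictors` is one candidate covariate with one value per sampling unit; in a full analysis they would be added to the regression one at a time until residual Moran’s I is no longer significant.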
3. Spatial models based on generalised least squares regression
In linear models of normally distributed data, spatial autocorrelation can be addressed by the related
approaches of generalised least squares (GLS) and autoregressive models (conditional autoregressive models
(CAR) and simultaneous autoregressive models (SAR)). GLS directly models the spatial covariance structure in
the variance-covariance matrix $\Sigma$, using parametric functions. CAR and SAR, on the other hand, model
the error generating process and operate with weight matrices that specify the strength of interaction between
neighbouring sites.
Although models based on generalised least squares have been known in the statistical literature for
decades (Besag 1974, Cliff and Ord 1981), their application in ecology has been very limited so far
(Jetz and Rahbek 2002, Keitt et al. 2002, Lichstein et al. 2002, Dark 2004, Tognelli and Kelt 2004). This
is most likely due to the limited availability of appropriate software that easily facilitates the
application of these kinds of models (Lichstein et al. 2002). With the recent development of programs that fit a
variety of GLS (Littell et al. 1996, Pinheiro and Bates 2000, Venables and Ripley 2002) and autoregressive
models (Kaluzny et al. 1998, Bivand 2005, Rangel et al. 2006), however, the range of available tools for
ecologists to analyse spatially autocorrelated normal data has been greatly expanded.
Generalised least squares (GLS)
As before, the underlying model is $Y = X\beta + \varepsilon$, with the error vector
$\varepsilon \sim N(0, \Sigma)$. $\Sigma$ is called the variance-covariance matrix. Instead of fitting
individual values for the variance-covariance matrix $\Sigma$, a parametric correlation function is assumed.
Correlation functions are isotropic, i.e. they depend only on the distance $s_{ij}$ between locations $i$ and
$j$, but not on the direction. Three frequently used examples of correlation functions $C(s)$ also used in this
study are exponential ($C(s) = \sigma^2 \exp(-s/r)$), Gaussian ($C(s) = \sigma^2 \exp(-(s/r)^2)$) and
spherical ($C(s) = \sigma^2 \left(1 - \frac{2}{\pi}\left(\frac{s}{r}\sqrt{1 - s^2/r^2} + \sin^{-1}(s/r)\right)\right)$
for $s \le r$), where $r$ is a scaling factor that is estimated from the data.
Some restrictions are placed upon the resulting variance-covariance matrix $\Sigma$: a) it must be symmetric,
and b) it must be positive definite. This guarantees that the matrix is invertible, which is necessary for
the fitting process (see below). The choice of correlation function is commonly based on a visual investigation of
the semi-variogram or correlogram of the residuals.
Parameter estimation is a two-step process. First, the parameters of the correlation function (i.e. scaling
factor $r$ in the examples used here) are found by optimizing the so called profiled log-likelihood, which
is the log-likelihood where the unknown values for $\beta$ and $\sigma^2$ are replaced by their algebraic maximum
likelihood estimators. Secondly, given the parameterization of the variance-covariance matrix, the values for
$\beta$ and $\sigma^2$ are found by solving a weighted ordinary least squares problem:
$$(\Sigma^{-1/2})^T y = (\Sigma^{-1/2})^T X\beta + (\Sigma^{-1/2})^T \varepsilon$$
where the error term $(\Sigma^{-1/2})^T \varepsilon$ is now normally distributed with mean 0 and variance
$\sigma^2 I$.
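The second estimation step can be illustrated with a short Python sketch. It assumes an exponential covariance of the form $\sigma^2 \exp(-d/r)$ that decays with distance $d$, treats the scaling factor $r$ as already known (the profile-likelihood search of the first step is not shown), and whitens the data with the inverse of the Cholesky factor $L$ of $\Sigma = LL^T$, which is one of several equivalent ways of applying a matrix square root. Function names and the noise-free toy data are hypothetical.

```python
import math

def exp_cov(coords, sigma2, r):
    """Sigma_ij = sigma2 * exp(-d_ij / r): exponential covariance, decaying with distance."""
    return [[sigma2 * math.exp(-math.dist(a, b) / r) for b in coords] for a in coords]

def cholesky(S):
    """Lower-triangular L with S = L L^T; raises if S is not positive definite."""
    n = len(S)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = S[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L

def forward_solve(L, b):
    """Solve L z = b for lower-triangular L (i.e. z = L^{-1} b)."""
    z = []
    for i in range(len(b)):
        z.append((b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    return z

def ols(X, y):
    """Ordinary least squares via normal equations and Gauss-Jordan elimination (small p only)."""
    p = len(X[0])
    M = [[sum(row[a] * row[b] for row in X) for b in range(p)]
         + [sum(row[a] * yi for row, yi in zip(X, y))] for a in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda i: abs(M[i][c]))  # partial pivoting
        M[c], M[piv] = M[piv], M[c]
        for i in range(p):
            if i != c:
                f = M[i][c] / M[c][c]
                M[i] = [u - f * v for u, v in zip(M[i], M[c])]
    return [M[a][p] / M[a][a] for a in range(p)]

def gls(X, y, Sigma):
    """Whiten y and every column of X with L^{-1}, then fit OLS to the whitened data."""
    L = cholesky(Sigma)
    y_w = forward_solve(L, y)
    cols_w = [forward_solve(L, [row[a] for row in X]) for a in range(len(X[0]))]
    X_w = [list(row) for row in zip(*cols_w)]
    return ols(X_w, y_w)

# Hypothetical noise-free data on a 4 x 4 grid: y = 2 + 3 * x-coordinate
coords = [(float(x), float(y)) for x in range(4) for y in range(4)]
X = [[1.0, cx] for cx, cy in coords]
y = [2.0 + 3.0 * cx for cx, cy in coords]
Sigma = exp_cov(coords, sigma2=1.0, r=2.0)
beta = gls(X, y, Sigma)
```

Because the toy response contains no noise, the fitted coefficients should match the generating values whatever covariance is assumed; with real data, the choice of correlation function and of $r$ changes both the estimates and their standard errors.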
Autoregressive models
Both CAR and SAR incorporate spatial autocorrelation
using neighbourhood matrices which specify the
