
Spatial autocorrelation and red herrings in geographical ecology

Abstract
Aim: Spatial autocorrelation in ecological data can inflate Type I errors in statistical analyses. There has also been a recent claim that spatial autocorrelation generates 'red herrings', such that virtually all past analyses are flawed. We consider the origins of this phenomenon, the implications of spatial autocorrelation for macro-scale patterns of species diversity and set out a clarification of the statistical problems generated by its presence.
Location: To illustrate the issues involved, we analyse the species richness of the birds of western/central Europe, north Africa and the Middle East.
Methods: Spatial correlograms for richness and five environmental variables were generated using Moran's I coefficients. Multiple regression, using both ordinary least-squares (OLS) and generalized least squares (GLS) assuming a spatial structure in the residuals, was used to identify the strongest predictors of richness. Autocorrelation analyses of the residuals obtained after stepwise OLS regression were undertaken, and the ranks of variables in the full OLS and GLS models were compared.
Results: Bird richness is characterized by a quadratic north-south gradient. Spatial correlograms usually had positive autocorrelation up to c. 1600 km. Including the environmental variables successively in the OLS model reduced spatial autocorrelation in the residuals to non-detectable levels, indicating that the variables explained all spatial structure in the data. In principle, if residuals are not autocorrelated then OLS is a special case of GLS. However, our comparison between OLS and GLS models including all environmental variables revealed that GLS de-emphasized predictors with strong autocorrelation and long-distance clinal structures, giving more importance to variables acting at smaller geographical scales.
Conclusion: Although spatial autocorrelation should always be investigated, it does not necessarily generate bias. Rather, it can be a useful tool to investigate mechanisms operating on richness at different spatial scales. Claims that analyses that do not take into account spatial autocorrelation are flawed are without foundation.


UC Irvine
UC Irvine Previously Published Works
Title
Spatial autocorrelation and red herrings in geographical ecology
Permalink
https://escholarship.org/uc/item/6jg661wx
Journal
Global Ecology and Biogeography, 12(1)
ISSN
0960-7447
Authors
Diniz, JAF
Bini, L M
Hawkins, Bradford A.
Publication Date
2003
Peer reviewed

RESEARCH PAPER
Global Ecology & Biogeography (2003) 12, 53–64
© 2003 Blackwell Publishing Ltd. http://www.blackwellpublishing.com/journals/geb
Blackwell Science, Ltd
Spatial autocorrelation and red herrings in geographical ecology
JOSÉ ALEXANDRE FELIZOLA DINIZ-FILHO*, LUIS MAURICIO BINI* and BRADFORD A. HAWKINS†
*Departamento de Biologia Geral, ICB, Universidade Federal de Goiás, CP 131, 74 001–970, Goiânia, GO, Brazil; and †Department of Ecology and Evolutionary Biology, University of California, Irvine, CA 92697, U.S.A. E-mail: diniz@icb1.ufg.br; bhawkins@uci.edu
Key words: birds, generalized least squares, latitudinal gradients, multiple regression, Palearctic, spatial autocorrelation, species richness.
INTRODUCTION
The latitudinal gradient in species richness has been known
for almost 200 years (von Humboldt, 1808). Over the years,
many hypotheses have been developed to explain this pattern.
Many are redundant, vague or untestable, and some are
simply not supported by empirical evidence. Consequently, the
focus is now on a much reduced subset of hypotheses (Currie,
1991; O’Brien, 1993, 1998; Rosenzweig, 1995; Hawkins &
Porter, 2001; Rahbek & Graves, 2001).
Tests of mechanisms driving species diversity are usually performed using multiple regression and related statistical approaches (e.g. path analysis), in which species richness is regressed against sets of environmental variables, sometimes at different spatial scales (see Badgley & Fox, 2000; Rahbek & Graves, 2001, for recent examples). However, as is becoming widely appreciated by ecologists, patterns of spatial autocorrelation in data can create false positive results in the analyses. Autocorrelation is the lack of independence between pairs of observations at given distances in time or space and is found commonly in ecological data (Legendre, 1993). Recent papers have discussed the importance of measuring spatial autocorrelation when evaluating problems in geographical ecology, including latitudinal gradients in species richness (Badgley & Fox, 2000; Jetz & Rahbek, 2001; Rahbek & Graves, 2001), the relationship between local and regional richness (Bini et al., 2000; Fox et al., 2000), spatial patterns in community structure (Leduc et al., 1992) and spatial synchrony in population dynamics (Koenig & Knops, 1998; Koenig, 1998, 1999). The problem is also potentially important in metapopulation studies and analyses of species–area relationships.
Correspondence: José Alexandre F. Diniz-Filho, Departamento de Biologia Geral, ICB, Universidade Federal de Goiás, Goiânia, GO 74001–970, Brazil. E-mail: diniz@icbl.ufg.br
When testing statistical hypotheses using standard methods (e.g. ANOVA, correlation and regression), the standard errors are usually underestimated when positive autocorrelation is present and, consequently, Type I errors may be strongly inflated (Legendre, 1993). However, Lennon (2000) recently argued that, beyond difficulties in hypothesis testing due to inflated Type I errors, there would also be a systematic bias toward particular kinds of mechanisms associated with variables that have greater spatial autocorrelation. This is potentially a much more serious issue, and we will return to it later in this paper.
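The inflation of Type I errors can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions, not an analysis from this paper: it uses one-dimensional AR(1) transects as a stand-in for spatially autocorrelated variables, and the function names, the parameter `rho`, the sample size and the number of replicates are all hypothetical choices. Two variables are generated independently of each other, so every 'significant' correlation is a false positive.

```python
import numpy as np

def ar1_series(rho, n, reps, rng):
    """Simulate `reps` independent AR(1) transects of length n (unit variance)."""
    eps = rng.standard_normal((reps, n))
    x = np.empty((reps, n))
    x[:, 0] = eps[:, 0]
    for t in range(1, n):
        # Each site resembles its neighbour with lag-1 autocorrelation rho.
        x[:, t] = rho * x[:, t - 1] + np.sqrt(1.0 - rho**2) * eps[:, t]
    return x

def false_positive_rate(rho, n=50, reps=500, t_crit=2.0106, seed=0):
    """Share of naive correlation tests that reject H0 for unrelated variables.

    t_crit is the two-sided 5% critical value of Student's t with n - 2 = 48
    degrees of freedom; it must be changed if n is changed.
    """
    rng = np.random.default_rng(seed)
    x = ar1_series(rho, n, reps, rng)
    y = ar1_series(rho, n, reps, rng)  # generated independently of x
    xc = x - x.mean(axis=1, keepdims=True)
    yc = y - y.mean(axis=1, keepdims=True)
    r = (xc * yc).sum(axis=1) / np.sqrt((xc**2).sum(axis=1) * (yc**2).sum(axis=1))
    t = r * np.sqrt((n - 2) / (1.0 - r**2))
    return float((np.abs(t) > t_crit).mean())
```

Calling `false_positive_rate(0.0)` should return a value close to the nominal 0.05, whereas `false_positive_rate(0.9)` should return a much larger proportion, because the test treats all 50 sites as independent observations.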
Our goal is to discuss the implications of spatial autocorrelation in geographical ecology, especially when using multiple regression models to choose between alternative mechanisms driving macro-scale patterns of species richness. We then apply spatial analysis to evaluate the role of climate in driving richness gradients, using western Palearctic birds as an example. Finally, we compare results from standard multiple regression analyses and a spatial generalized least squares approach that incorporates autocorrelation in the residuals to illustrate that changes in regression coefficients cannot be considered a ‘red herring’, as argued by Lennon (2000), but simply reflect the well-known scale-dependence of explanations for diversity patterns.
THEORETICAL BACKGROUND
Assessing spatial autocorrelation
Spatial autocorrelation measures the similarity between samples for a given variable as a function of spatial distance (Sokal & Oden, 1978a,b; Griffith, 1987; Legendre, 1993; Rossi & Quénéhervé, 1998). For quantitative or continuous variables, such as species richness, the Moran’s I coefficient is the most commonly used coefficient in univariate autocorrelation analyses and is given as:

\[ I = \left(\frac{n}{S}\right) \frac{\sum_{i}\sum_{j} (y_i - \bar{Y})(y_j - \bar{Y})\, w_{ij}}{\sum_{i} (y_i - \bar{Y})^2} \]

where $n$ is the number of samples (quadrats), $y_i$ and $y_j$ are the values of the species richness in quadrats $i$ and $j$, $\bar{Y}$ is the average of $y$ and $w_{ij}$ is an element of the matrix $W$. In this matrix, $w_{ij} = 1$ if the pair $i, j$ of quadrats is within a given distance class interval (indicating quadrats that are ‘connected’ in this class), and $w_{ij} = 0$ otherwise. $S$ indicates the number of entries (connections) in the $W$ matrix. The value expected under the null hypothesis of the absence of spatial autocorrelation is $-1/(n - 1)$. Detailed computations of the standard error of this coefficient are given in Griffith (1987) and Legendre & Legendre (1998).
Moran’s I usually varies between −1.0 and 1.0 for maximum negative and positive autocorrelation, respectively. Non-zero values of Moran’s I indicate that richness values in quadrats connected at a given geographical distance are more similar (positive autocorrelation) or less similar (negative autocorrelation) than expected for randomly associated pairs of quadrats. The geographical distances can be partitioned into discrete classes, thereby creating successive $W$ matrices and allowing computation of different Moran’s I-values for the same variable. This allows one to evaluate the behaviour of autocorrelation as a function of spatial distance, in a graph called a spatial correlogram, that furnishes a descriptor of the spatial pattern in the data. In this case, the correlogram as a whole can be considered significant at a given significance level $\alpha$ if at least one of its coefficients is significant at $\alpha/k$, where $k$ is the number of distance classes used (Bonferroni criterion; Oden, 1984).
The number and definition of the distance classes to be used in the correlograms is arbitrary, but a general methodological criterion is to try to maximize the similarity in the $S$-values (number of connections) for the different Moran’s I coefficients, so that they are more comparable. The other possible solution is to use constant intervals, but in this case some of the Moran’s I coefficients in the correlograms may be based on a much smaller number of connections, and this can sometimes disturb the interpretation of the entire correlogram (see van Rensburg et al., 2002). The arbitrariness in the number of distance classes is not important in most cases, because the purpose of the analysis is to describe a continuous spatial process.
Three basic correlogram profiles are usually found in ecological data. The first is obtained when there is positive autocorrelation in short distance classes, coupled with negative spatial autocorrelation at large distance classes. In this case, the correlogram profile can be interpreted as a linear gradient at macro-scales. A second common type occurs when only small distance autocorrelation is found, indicating that spatial variation is structured in patches. In this case, the distance up to which spatial autocorrelation is observed can be interpreted as the average patch size in the variable (see Diniz-Filho & Telles, 2002). Thirdly, if no Moran’s I coefficients are significant, there is no spatial pattern in the data. Of course, other correlogram profiles are possible (Legendre & Fortin, 1989; Rossi & Quénéhervé, 1998). For example, clinal patterns can reverse at large geographical distances.
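The Moran's I coefficient and the distance-class correlogram described in this subsection can be sketched in a few lines of NumPy. This is a minimal illustration, not the software used for the analyses: the function names, the use of Euclidean distances between coordinates and the way distance classes are passed (`class_limits`, a list of upper bounds) are assumptions made for the example. The p-values needed for the Bonferroni criterion are taken as given, since the standard-error formulae are in the references cited above.

```python
import numpy as np

def morans_i(y, w):
    """Moran's I for values y and a symmetric 0/1 connection matrix w."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    n = y.size
    s = w.sum()        # S: number of entries (connections) in W
    d = y - y.mean()   # deviations from the mean
    return (n / s) * (d @ w @ d) / (d @ d)

def correlogram(y, coords, class_limits):
    """Moran's I for successive distance classes.

    class_limits holds the upper bound of each class in increasing order;
    a class with no connected pairs gives S = 0 and therefore nan.
    """
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff**2).sum(axis=-1))
    values = []
    lower = 0.0
    for upper in class_limits:
        # w_ij = 1 when the pair falls in this class; dist > lower also
        # excludes the diagonal (i == j) in the first class.
        w = ((dist > lower) & (dist <= upper)).astype(float)
        values.append(morans_i(y, w))
        lower = upper
    return np.array(values)

def correlogram_is_significant(p_values, alpha=0.05):
    """Bonferroni criterion: at least one coefficient significant at alpha / k."""
    p_values = np.asarray(p_values, dtype=float)
    return bool((p_values < alpha / p_values.size).any())
```

For a linear gradient sampled along a transect, `correlogram` returns positive values in the shortest classes and negative values in the longest ones, which is the first of the three profiles described above.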
Ecological interpretations and implications: the scale dependence of species richness–environmental relationships
For species richness measured in a grid system or in latitudinal bands, positive autocorrelation across short distances can originate in two ways. First, it can be a simple consequence of geographical range extension beyond the limits of a single grid cell, so that nearby cells are similar in species richness because they share most of the same species (i.e. low species turnover). Alternatively, the adjacent cells could be similar because the environmental factors that drive diversity are also spatially autocorrelated, but adjacent cells do not necessarily have the same species composition (i.e. high turnover among adjacent cells). Thus, patterns of spatial autocorrelation in species richness at small scales are linked strongly with the statistical distribution of the geographical range sizes in relation to grid cell size. For example, if most of the species have small geographical ranges (i.e. smaller than the size of a single cell), then similarity among adjacent cells must be a function of similarity of environments. In practice, the two causes of spatial pattern are expected to be found simultaneously in most datasets, and both appear as significant correlograms. However, as we will discuss below, they have completely different ecological and statistical implications.
For simplicity, let us assume that a single environmental factor is driving species richness. In the case of low turnover (large geographical range sizes relative to cell sizes), adjacent cells are pseudo-replicated units in space (sensu Hurlbert, 1984) and correlation or regression analyses between species richness and this environmental factor should be, in principle, tested with a reduced number of degrees of freedom. This is necessary because the adjacent cells do not represent independent realizations of the same ecological process, i.e. response of the species richness to variation in the environmental factor. This is equivalent to saying that only more distant cells furnish independent information about the relationship between richness and the driving environmental factor.
More importantly, because this environmental factor varies continuously throughout the geographical space and adjacent cells have very similar species compositions (because geographical ranges of most species extend beyond the cell size), then clearly it cannot explain variation in richness at these very small scales. So, when regressing species richness patterns that are generated under this process against this environmental factor, a positive autocorrelation is expected in the residuals of the fitted model. In contrast, if similar species richness even among adjacent cells is caused by similar responses of different groups of species to the environmental factor studied, this indicates that even these adjacent cells are independent realizations of the same ecological processes of interest, and so the environmental factor does in fact explain species richness between adjacent cells, and no autocorrelation is expected in the residuals after fitting the environmental model. This latter process is the environmental control model recently simulated by Legendre et al. (2002).
We emphasize that it would be difficult to distinguish between the causes of spatial autocorrelation based only on an analysis of the original variable (species richness), even if the statistical distribution of geographical range sizes is known. Positive autocorrelation in the residuals at small distances can also be caused by not taking into account another environmental factor that would explain small-scale variation in species richness if it were included in the model. If different environmental factors act at different spatial scales (see Willis & Whittaker, 2002), the inclusion of the relevant environmental factors acting at each scale in the regression model should be sufficient to completely remove autocorrelation from the residuals at all scales. This is a spatially hierarchical version of Legendre et al.’s (2002) environmental control model.
Thus, this interactive modelling approach, i.e. including environmental variables to explain and evaluate the spatial autocorrelation in the residuals, allows us to treat variation in species richness at all spatial scales and, at the same time, to identify any statistical biases caused by pseudoreplication at smaller distances. By this reasoning, if no spatial autocorrelation is found in the residuals after including environmental factors in the multiple regression model, then there is no statistical bias in the overall regression analysis. Note that this is true independently of the patterns of autocorrelation in the original variables (for both species richness and the environmental factors).
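The modelling approach in the preceding paragraph (add environmental variables to an OLS model and re-examine the autocorrelation of the residuals after each addition) can be sketched as follows. This is an illustrative outline under assumptions: the helper names are hypothetical, predictors are added in the column order supplied rather than by a stepwise selection rule, and a single connection matrix `w` (one distance class) is used instead of a full correlogram.

```python
import numpy as np

def morans_i(values, w):
    """Moran's I of `values` for a 0/1 connection matrix w."""
    d = values - values.mean()
    return (values.size / w.sum()) * (d @ w @ d) / (d @ d)

def ols_residuals(X, y):
    """Residuals of an ordinary least-squares fit of y on X plus an intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return y - X1 @ beta

def residual_autocorrelation_by_step(predictors, y, w):
    """Add predictors one column at a time and return the Moran's I of the
    OLS residuals after each step."""
    steps = []
    for k in range(1, predictors.shape[1] + 1):
        steps.append(morans_i(ols_residuals(predictors[:, :k], y), w))
    return steps
```

If the values returned approach the null expectation of −1/(n − 1) once the relevant predictors are in the model, the residuals no longer carry detectable spatial structure in that distance class; the same check would be repeated for every distance class of the correlogram.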
However, as stressed frequently in the statistical literature (Philippi, 1993), caution is always needed when interpreting the results of any multiple regression. Even if the residuals from the model are not autocorrelated, there may still be potential problems related to multicollinearity among predictor variables, confounding correlation and cause-and-effect, and biasing model parameter estimation. In fact, macro-scale analyses are always correlative and can only suggest potential explanatory factors; they are not strict inferential tests of causality (Levin, 1992). For example, if a given environmental factor is correlated strongly with species richness across all scales (i.e. there is no autocorrelation in the residuals), this does not ensure that the actual causal factor explaining richness has been found. It is possible that the environmental factor is simply correlated with the real causal environmental factor. It must also be remembered that including many highly correlated environmental factors in a multiple regression model hoping to stumble across the ultimate causal effects will cause instability in the estimation of the partial regression coefficients. In this case, a plausible solution would be to include in the model only variables associated with theoretical models predicting species richness. Another approach is to use multivariate techniques (such as principal components analyses) to reduce the dimensionality and, consequently, the collinearity among the predictors (Vetaas, 1997; Badgley & Fox, 2000). The main point is that these problems associated with the interpretation of multiple regression models are inherent to the technique itself and have nothing to do with spatial autocorrelation.
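As an illustration of the principal components option mentioned above, the sketch below replaces a set of correlated predictors by uncorrelated component scores. It is a generic example, not the procedure of the studies cited: the function name is hypothetical and standardizing the predictors before the decomposition (a correlation-matrix PCA) is an assumption.

```python
import numpy as np

def principal_components(X, n_components):
    """Return scores on the first principal components and their share of variance.

    X is an (n_samples, n_predictors) array; predictors are standardized first,
    so the analysis is based on their correlation matrix.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Rows of Vt are the component axes, ordered by decreasing singular value.
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = Z @ Vt[:n_components].T
    explained = s**2 / (s**2).sum()
    return scores, explained[:n_components]
```

The columns of `scores` are mutually uncorrelated, so partial regression coefficients estimated on them are not destabilized by collinearity, at the cost of coefficients that refer to composite axes rather than to the original environmental variables.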
A slightly different problem with observational data is that it may be difficult to know if the environmental factor indeed drives species richness throughout the geographical space or if the two variables (species richness and the environmental factor) are driven independently by a unique, unmeasured spatially patterned factor created, for example, by a dynamic spatial process (e.g. the diffusion of both organisms and environmental components by vectorial processes driven by water flow in aquatic ecosystems; Legendre & Troussellier, 1988; Velho et al., 2001). In this case, ecological interpretations for the spatial correlation could be completely spurious, because there is a strong possibility of no causal link between environment and richness. Thus, macro-scale spatial patterns should be controlled for, or taken into account, since the explanation for the correlation between the variables occurs, in fact, at a local scale. Methods such as partial regression and trend surface analyses, spatial generalized least-squares (GLS) or autoregressive models (Legendre & Legendre, 1998) can then be used and will shift the explanation from macro to local scales (see below).
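To make the spatial GLS option concrete, a minimal sketch is given here. It is an assumption-laden illustration rather than the model fitted in this paper: the covariance among residuals is taken to decay exponentially with distance, the range parameter is supplied by the user instead of being estimated, and the function names are hypothetical.

```python
import numpy as np

def exponential_covariance(coords, range_param):
    """Assumed correlation structure: exp(-distance / range_param)."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff**2).sum(axis=-1))
    return np.exp(-dist / range_param)

def gls_fit(X, y, V):
    """Generalized least squares: beta = (X' V^-1 X)^-1 X' V^-1 y.

    Returns the coefficients (intercept first) and their standard errors.
    """
    X1 = np.column_stack([np.ones(len(y)), X])
    Vinv_X = np.linalg.solve(V, X1)
    Vinv_y = np.linalg.solve(V, y)
    xtvx = X1.T @ Vinv_X
    beta = np.linalg.solve(xtvx, X1.T @ Vinv_y)
    resid = y - X1 @ beta
    # Residual variance scaled by the assumed covariance structure.
    sigma2 = (resid @ np.linalg.solve(V, resid)) / (len(y) - X1.shape[1])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(xtvx)))
    return beta, se
```

When `V` is the identity matrix, `gls_fit` returns the ordinary least-squares coefficients, which is the sense in which OLS is a special case of GLS. Comparing the coefficients and standard errors obtained with the identity matrix and with a spatial `V` shows how the relative weight of predictors changes once residual autocorrelation is modelled.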
When modelling species richness as a function of multiple environmental factors, different combinations of the problems discussed above may appear in a single analysis. If species richness is strongly patterned in space (i.e. possesses a strong pattern of spatial autocorrelation), then the relative importance of the environmental factors in the multiple regression (relative magnitude of partial regression coefficients) could, in principle, be related to the magnitude of spatial autocorrelation in these environmental factors. This could occur because some environmental factors, such as annual temperature (see ‘application’ below) and richness, are usually correlated mainly when dealing with macro-spatial scales (not at local scales). Standard errors of these regression coefficients could be underestimated and, consequently, there would be an increase in the significance level of the t-values associated with partial regression coefficients of these variables, as discussed previously. Therefore, their relative importance in the multiple regression model would be overestimated. This is what Lennon (2000), using a terminology derived from time-series analyses, refers to as a ‘red shift’ toward autocorrelated environmental effects, creating a ‘red herring’ in the interpretation of partial regression coefficients. He pointed out that ‘… the environmental factors selected by many studies as explanations for ecological patterns are “red shifted” relative to the set of potential explanatory factors: environmental factors with less spatial autocorrelation and hence bluer spectra are much more likely to be rejected’.
Thus, on one hand, some environmental factors have only long-distance spatial patterns and can generate spatially autocorrelated residuals at short scales but, at the same time, statistical testing should be based on a reduced number of independent points in the grid (i.e. there is much lower statistical power; see Dutilleul, 1993). On the other hand, other variables that affect species richness only at the local level will explain only short distance variation, and so long-distance structures will not be taken into account (thus creating spatial autocorrelation in the residuals at macro-scales). It is difficult to predict how multiple regression deals with this combination of different spatially structured effects, because coefficients (and standard errors) are all partials. Lennon’s (2000) simulations, although demonstrating this ‘red shift’, were based on independently generated predictors, not dealing with the complications created by strong multicollinearity of real environmental data.
Detection of the ‘red shift’ proposed by Lennon (2000) could be based on bias in the standard errors of partial regression coefficients (indicating a relative bias in the Type I errors of the different environmental factors), and not on changes in the standardized partial coefficients (indicating only a scale shift), after taking into account the spatial structure in the data. As will be demonstrated below, if there is no autocorrelation in the residuals and no large differential underestimation of standard errors (after comparing spatial and nonspatial regression models), Lennon’s (2000) ‘red shift’ reflects only that highly autocorrelated climatic factors may indeed be more important at the overall spatial scale of the study.
Controlling for macro-scale autocorrelation, as suggested by Lennon (2000) and others (see Selmi & Boulinier, 2001), is not necessarily the solution to the problem, because this will shift the explanation towards factors that drive species richness at smaller spatial scales. This should be performed if one has reason to believe that macro-scale correlations between environmental factors and species richness are spurious, and that other ecological processes, acting at small spatial scales, are the ultimate factors driving species richness (such as in the diffusion example mentioned previously). However, our present understanding of the mechanisms driving species richness in terrestrial ecosystems clearly indicates that different environmental factors are involved hierarchically as explanations at different spatial scales (Whittaker et al., 2001; Willis & Whittaker, 2002), and no passive diffusion process could explain the high correlations between climate and species richness.
We now illustrate the ideas developed above to understand
the factors influencing species richness patterns of western
Palearctic birds. This represents part of a larger dataset being

References
Book

SAS System for Mixed Models

Journal ArticleDOI

Pseudoreplication and the Design of Ecological Field Experiments

TL;DR: Suggestions are offered to statisticians and editors of ecological journals as to how ecologists' under- standing of experimental design and statistics might be improved.
Journal ArticleDOI

The Problem of Pattern and Scale in Ecology: The Robert H. MacArthur Award Lecture

TL;DR: The second volume in a series on terrestrial and marine comparisons focusing on the temporal complement of the earlier spatial analysis of patchiness and pattern was published by Levin et al..
Book

Species Diversity in Space and Time

TL;DR: In this article, the authors present a hierarchical dynamic puzzle to understand the relationship between habitat diversity and species diversity and the evolution of the relationships between habitats diversity and diversity in evolutionary time.

Handbook of the birds of Europe, the Middle East and North Africa

Stanley Cramp
TL;DR: The Handbook of the Birds of Europe, the Middle East and North Africa: The Birds of the Western Palearctic by CRAMP, Stanley et al. as mentioned in this paper is a great selection.