
CANOCO - a FORTRAN program for canonical community ordination by [partial] [detrended] [canonical] correspondence analysis, principal components analysis and redundancy analysis (version 2.1)

01 Jan 1988
About: The article was published on 1988-01-01 and is currently open access. It has received 2594 citations till now. The article focuses on the topics: Redundancy (engineering) & Ordination.


Ministerie van Landbouw en Visserij
Directoraat-Generaal Landbouw en Voedselvoorziening
Directie Landbouwkundig Onderzoek
GROEP LANDBOUWWISKUNDE
CANOCO - a FORTRAN program for canonical community ordination by [partial] [detrended] [canonical] correspondence analysis, principal components analysis and redundancy analysis (version 2.1).
Cajo J.F. Ter Braak
Agricultural Mathematics Group
Box 100, 6700 AC Wageningen
The Netherlands
This report is reprinted (with permission, and with corrections and some additions) from the technical report with number 87 ITI A 11 of the TNO Institute of Applied Computer Science, Statistics Department Wageningen, which is the former affiliation of the author.
Technical report: LWA-88-02
January 1988
GLW
Postbus 100
6700 AC Wageningen

Copyright Agricultural Mathematics Group, Wageningen, 1988.
No part of this publication, apart from bibliographic data and brief quotations in critical reviews, may be reproduced, re-recorded or published in any form including print, photocopy, microfilm, electronic or electromagnetic record without written permission from the Agricultural Mathematics Group, P.O. Box 100, 6700 AC Wageningen, The Netherlands.

CONTENT
OVERVIEW
1. INTRODUCTION
1.1 General objective
1.2 Models, methods and algorithm
1.3 Terminology
1.4 CANOCO's efficiency for ordination of community data
1.5 Outline of the manual
2. DATA INPUT
2.1 Cornell condensed format
2.2 Full format
2.3 Presence/absence data and nominal data for ordination
2.4 Linking up samples in different data files
3. TERMINAL DIALOGUE
3.1 How to activate CANOCO
3.2 Input and output
3.3 Ways to answer the questions
3.4 Questions to specify the type of analysis and in- and output files
3.5 Questions to omit samples and to manipulate environmental variables and covariables
3.6 Questions to specify transformation of species data
3.7 Questions to specify the output
3.8 Questions to specify additional analyses
3.9 Example
4. OUTPUT
4.1 Samples and species in the analysis
4.2 Iteration report, eigenvalue and length of gradient
4.3 Correlation matrix, means, standard deviations and inflation factors
4.4 Percentage variance accounted for by first s axes of species-environment biplot
4.5 Species scores
4.6 Samples scores
4.7 Regression/canonical coefficients, t-values and linear combinations of environmental variables
4.8 Inter-set correlations of environmental variables with axes
4.9 Biplot scores of environmental variables
4.10 Centroids of environmental variables in the ordination diagram
4.11 Monte Carlo permutation test
5. NONSTANDARD ANALYSIS
6. EXAMPLES
6.1 Dune meadow data
6.2 Weeds in summer barley
6.3 Gene frequency data
7. MISCELLANEOUS TOPICS
7.1 Percentage data/compositional data
7.2 Nominal response data
7.3 Multiple regression, redundancy analysis, principal components analysis and canonical correlation analysis
7.4 Principal coordinates analysis (PCO)
7.5 Interchanging species and samples; weighted averaging ordination
7.6 Weighting samples and species
7.7 Calibration by CANOCO
7.8 Canonical variates analysis (CVA)
8. ITERATIVE ORDINATION ALGORITHM
9. TECHNICAL DETAILS
9.1 Dimensioning
9.2 Structure of the main program
9.3 Scaling of the axes
9.4 Monte Carlo permutation test
9.5 Some points concerning CVA
10. INSTALLATION NOTES
11. ACKNOWLEDGEMENTS
12. REFERENCES
APPENDIX A: Theorem on the eigenvalue equation solved by CANOCO
APPENDIX B: Constrained principal coordinates analysis
APPENDIX C: Trace and short-cut formulae (4.17) and (4.19)

OVERVIEW
Aim
A common problem in community ecology and ecotoxicology is to discover how a multitude of species respond to external factors such as environmental variables, pollutants and management regime. Data are collected on species composition and the external variables at a number of points in space and time.
Statistical methods available so far to analyse such data either assumed linear relationships or were restricted to regression analysis of the response of each species separately. To analyse the generally non-linear, non-monotone response of a community of species, one had to resort to the data-analytic methods of ordination and cluster analysis - "indirect" methods that are generally less powerful than the "direct" statistical method of regression analysis.
Recently, regression and ordination have been integrated into techniques of multivariate direct gradient analysis, called canonical (or constrained) ordination. The use of canonical ordination greatly improves the power to detect the specific effects one is interested in. One of these techniques, canonical correspondence analysis, escapes the assumption of linearity and is able to detect unimodal relationships between species and external variables.
The computer program CANOCO is designed to make these techniques available to ecologists studying community responses. CANOCO can carry out most of the multivariate techniques described in Ter Braak (1987) and Ter Braak and Prentice (1988) using a general iterative ordination algorithm.
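To give a feel for what an iterative ordination algorithm does, the sketch below implements plain reciprocal averaging (two-way weighted averaging), one of the techniques named in this overview. It is an illustration only, not CANOCO's FORTRAN code or its general algorithm: the function name `reciprocal_averaging`, its parameters and the small example table are hypothetical, and only the first axis is extracted. Sample scores and species scores are alternately replaced by weighted averages of each other; after centring, the factor by which the sample scores shrink in one cycle settles at the eigenvalue of the axis.

```python
import math


def reciprocal_averaging(table, max_iter=500, tol=1e-12):
    """First ordination axis by two-way weighted averaging (hypothetical helper).

    table[i][k] is the abundance of species k in sample i; every row and
    column total must be positive. Returns (eigenvalue, sample_scores,
    species_scores).
    """
    n, m = len(table), len(table[0])
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][k] for i in range(n)) for k in range(m)]
    grand = sum(row_tot)

    def species_from_samples(x):
        # Species score = abundance-weighted average of the sample scores.
        return [sum(table[i][k] * x[i] for i in range(n)) / col_tot[k]
                for k in range(m)]

    x = [float(i) for i in range(n)]  # arbitrary, non-constant starting scores
    eigenvalue = 0.0
    for _ in range(max_iter):
        u = species_from_samples(x)
        # Sample score = abundance-weighted average of the species scores.
        x_new = [sum(table[i][k] * u[k] for k in range(m)) / row_tot[i]
                 for i in range(n)]
        # Centre with sample totals as weights to remove the trivial solution.
        mean = sum(w * v for w, v in zip(row_tot, x_new)) / grand
        x_new = [v - mean for v in x_new]
        # Weighted dispersion; for unit-variance input this is the shrinkage.
        s = math.sqrt(sum(w * v * v for w, v in zip(row_tot, x_new)) / grand)
        if s == 0.0:  # no variation left: the table has no non-trivial axis
            eigenvalue = 0.0
            break
        x = [v / s for v in x_new]  # rescale to weighted variance 1
        converged = abs(s - eigenvalue) < tol
        eigenvalue = s
        if converged:
            break
    return eigenvalue, x, species_from_samples(x)


# Hypothetical 3 samples x 3 species table with one clear gradient.
eig, sample_scores, species_scores = reciprocal_averaging(
    [[3, 1, 0], [1, 2, 1], [0, 1, 3]]
)
print(eig, sample_scores, species_scores)
```

In this sketch the samples and species end up ordered along the same axis, which is the property that makes the scores usable in a joint ordination diagram. Constraining the sample scores to be linear combinations of environmental variables, detrending and further axes are deliberately left out.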
Researchers in other fields may find CANOCO useful as well, for example, to analyse percentage data/compositional data, nominal data or (dis)similarity data in relation to external explanatory variables. Such use is explained in separate sections in the manual. CANOCO is particularly suited if the number of response variables is large compared to the number of objects.
Techniques covered
1. CANOCO is an extension of DECORANA (Hill, 1979). CANOCO formerly stood for canonical correspondence analysis (Ter Braak, 1986a, b) and included weighted averaging, reciprocal averaging/[multiple] correspondence analysis, detrended correspondence analysis and canonical correspondence analysis. The program has been extended to cover also principal components analysis (PCA) and the canonical form of PCA, called redundancy analysis (RDA). Redundancy analysis (Van den Wollenberg, 1977; Israëls, 1984) is also known under the names of reduced-rank regression (Davies and Tso, 1982), PCA of y with respect to x (Robert and Escoufier, 1976) and mode C partial least squares (Wold, 1982). For these linear methods there are options for centring/standardization by species and by sites and for the method of scaling the species and site scores for use in the biplot. The eigenvalues reported in PCA/RDA are fractions of the total variance in the species data (percentage variance accounted for). Principal coordinates analysis and canonical variates analysis are also available.
Citations
Journal Article
TL;DR: A new and simple method to find indicator species and species assemblages characterizing groups of sites, and a new way to present species-site tables, accounting for the hierarchical relationships among species, is proposed.
Abstract: This paper presents a new and simple method to find indicator species and species assemblages characterizing groups of sites. The novelty of our approach lies in the way we combine a species relative abundance with its relative frequency of occurrence in the various groups of sites. This index is maximum when all individuals of a species are found in a single group of sites and when the species occurs in all sites of that group; it is a symmetric indicator. The statistical significance of the species indicator values is evaluated using a randomization procedure. Contrary to TWINSPAN, our indicator index for a given species is independent of the other species relative abundances, and there is no need to use pseudospecies. The new method identifies indicator species for typologies of species relevés obtained by any hierarchical or nonhierarchical classification procedure; its use is independent of the classification method. Because indicator species give ecological meaning to groups of sites, this method provides criteria to compare typologies, to identify where to stop dividing clusters into subsets, and to point out the main levels in a hierarchical classification of sites. Species can be grouped on the basis of their indicator values for each clustering level, the heterogeneous nature of species assemblages observed in any one site being well preserved. Such assemblages are usually a mixture of eurytopic (higher level) and stenotopic species (characteristic of lower level clusters). The species assemblage approach demonstrates the importance of the "sampled patch size," i.e., the diversity of sampled ecological combinations, when we compare the frequencies of core and satellite species. A new way to present species-site tables, accounting for the hierarchical relationships among species, is proposed. A large data set of carabid beetle distributions in open habitats of Belgium is used as a case study to illustrate the new method.

7,449 citations

Book Chapter
TL;DR: In this article, the authors present a theory of gradient analysis, in which the heuristic techniques are integrated with regression, calibration, ordination and constrained ordination as distinct, well-defined statistical problems.
Abstract: Publisher Summary This chapter concerns data analysis techniques that assist the interpretation of community composition in terms of species' responses to environmental gradients in the broadest sense. All species occur in a characteristic, limited range of habitats; and within their range, they tend to be most abundant around their particular environmental optimum. The composition of biotic communities thus changes along environmental gradients. Direct gradient analysis is a regression problem—fitting curves or surfaces to the relation between each species' abundance, probability of occurrence, and one or more environmental variables. Ecologists have independently developed a variety of alternative techniques. Many of these techniques are essentially heuristic, and have a less secure theoretical basis. This chapter presents a theory of gradient analysis, in which the heuristic techniques are integrated with regression, calibration, ordination and constrained ordination as distinct, well-defined statistical problems. The various techniques used for each type of problem are classified in families according to their implicit response model and the method used to estimate parameters of the model. Three such families are considered. The treatment shown here unites such apparently disparate data analysis techniques as linear regression, principal components analysis, redundancy analysis, Gaussian ordination, weighted averaging, reciprocal averaging, detrended correspondence analysis, and canonical correspondence analysis in a single theoretical framework.

2,289 citations


Cites background from "CANOCO - a FORTRAN program for cano..."

  • ...Ordination axes can be considered as latent variables, or hypothetical environmental variables, constructed in such a way as to optimize the fit of the species data to a particular (linear or unimodal) statistical model of how species abundance varies along gradients (Ter Braak, 1985, 1987a)....

  • ...Heiser (1987) and Ter Braak (1985, 1987a) develop rationales for correspondence analysis that are particularly relevant to ecological applications....

  • ...Redundancy analysis can summarize the species–environment relationships in such an informative way, because the gradients are short (< 2 SD: Ter Braak, 1987b)....

  • ...…displays (a) the main patterns of community variations, as far as these reflect environmental variation, and (b) the main pattern in the weighted averages (not correlations as in redundancy analysis) of each of the species with respect to the environmental variables (Ter Braak, 1986, 1987a)....

  • ...Detrending-by-segments does not work very well here for technical reasons; detrending-by-polynomials is better founded and more appropriate (see Appendix and Ter Braak, 1987b)....

Journal Article
01 Sep 2008 - Ecology
TL;DR: This paper proposes a new way of using forward selection of explanatory variables in regression or canonical redundancy analysis, and proposes a two-step procedure to prevent overestimation of the amount of explained variance.
Abstract: This paper proposes a new way of using forward selection of explanatory variables in regression or canonical redundancy analysis. The classical forward selection method presents two problems: a highly inflated Type I error and an overestimation of the amount of explained variance. Correcting these problems will greatly improve the performance of this very useful method in ecological modeling. To prevent the first problem, we propose a two-step procedure. First, a global test using all explanatory variables is carried out. If, and only if, the global test is significant, one can proceed with forward selection. To prevent overestimation of the explained variance, the forward selection has to be carried out with two stopping criteria: (1) the usual alpha significance level and (2) the adjusted coefficient of multiple determination (R²a) calculated using all explanatory variables. When forward selection identifies a variable that brings one or the other criterion over the fixed threshold, that variable is rejected, and the procedure is stopped. This improved method is validated by simulations involving univariate and multivariate response data. An ecological example is presented using data from the Bryce Canyon National Park, Utah, U.S.A.

1,720 citations


Cites methods from "CANOCO - a FORTRAN program for cano..."

  • ...Since the introduction of the canonical ordination program CANOCO (ter Braak 1988), ecologists have used forward selection to choose environmental variables to obtain a parsimonious subset of environmental variables to model multivariate community structure (Legendre and Legendre 1998)....

  • ...CANOCO—a FORTRAN program for canonical community ordination by [partial] [detrended] [canonical] correspondence analysis, principal component analysis and redundancy analysis....

Journal Article
TL;DR: After pointing out the key assumptions underlying CCA, the paper focuses on the interpretation of CCA ordination diagrams and some advanced uses, such as ranking environmental variables in importance and the statistical testing of effects are illustrated on a typical macroinvertebrate data-set.
Abstract: Canonical correspondence analysis (CCA) is a multivariate method to elucidate the relationships between biological assemblages of species and their environment. The method is designed to extract synthetic environmental gradients from ecological data-sets. The gradients are the basis for succinctly describing and visualizing the differential habitat preferences (niches) of taxa via an ordination diagram. Linear multivariate methods for relating two sets of variables, such as two-block Partial Least Squares (PLS2), canonical correlation analysis and redundancy analysis, are less suited for this purpose because habitat preferences are often unimodal functions of habitat variables. After pointing out the key assumptions underlying CCA, the paper focuses on the interpretation of CCA ordination diagrams. Subsequently, some advanced uses, such as ranking environmental variables in importance and the statistical testing of effects, are illustrated on a typical macroinvertebrate data-set. The paper closes with comparisons with correspondence analysis, discriminant analysis, PLS2 and co-inertia analysis. In an appendix a new method, named CCA-PLS, is proposed that combines the strong features of CCA and PLS2.

1,715 citations


Cites background or methods from "CANOCO - a FORTRAN program for cano..."

  • ...This can be achieved by a partial canonical correspondence analysis (partial CCA: ter Braak, 1988 a) with the five class variables representing sampling months as covariables....

  • ...Valid statistical tests can be based on Monte Carlo permutation of sites (instead of individuals) and are standard in the computer program CANOCO (ter Braak, 1988b)....

  • ...For example, if the eigenvalue of canonical correspondence analysis is λ, then the corresponding eigenvalue of the discriminant analysis is λ/(1-λ) (ter Braak, 1988 b: section 9.5); the scores of species and of the sites are linearly related....

  • ...Originally, CCA was derived as an approximation to maximum likelihood Gaussian ordination with linear external constraints (ter Braak, 1986, 1988a)....

  • ...It was the default in older versions of the computer program CANOCO (version 2.1; ter Braak, 1988b)....

Journal Article
TL;DR: This article examined the distribution and abundance of bird species across an urban gradient, and concomitant changes in community structure, by censusing summer resident bird populations at six sites in Santa Clara County, California (all former oak woodlands).
Abstract: I examined the distribution and abundance of bird species across an urban gradient, and concomitant changes in community structure, by censusing summer resident bird populations at six sites in Santa Clara County, California (all former oak woodlands). These sites represented a gradient of urban land use that ranged from relatively undisturbed to highly developed, and included a biological preserve, recreational area, golf course, residential neighborhood, office park, and business district. The composition of the bird community shifted from predominantly native species in the undisturbed area to invasive and exotic species in the business district. Species richness, Shannon diversity, and bird biomass peaked at moderately disturbed sites. One or more species reached maximal densities in each of the sites, and some species were restricted to a given site. The predevelopment bird species (assumed to be those found at the most undisturbed site) dropped out gradually as the sites became more urban. These patterns were significantly related to shifts in habitat structure that occurred along the gradient, as determined by canonical correspondence analysis (CCA) using the environmental variables of percent land covered by pavement, buildings, lawn, grasslands, and trees or shrubs. I compared each formal site to four additional sites with similar levels of development within a two-county area to verify that the bird communities at the formal study sites were representative of their land use category.

1,308 citations