CANOCO - a FORTRAN program for canonical community ordination by [partial] [detrended] [canonical] correspondence analysis, principal components analysis and redundancy analysis (version 2.1)

Ministerie van Landbouw en Visserij
Directoraat-Generaal Landbouw en Voedselvoorziening
Directie Landbouwkundig Onderzoek

GROEP LANDBOUWWISKUNDE

CANOCO - a FORTRAN program for canonical community ordination by [partial] [detrended] [canonical] correspondence analysis, principal components analysis and redundancy analysis (version 2.1).

Cajo J.F. Ter Braak
Agricultural Mathematics Group
Box 100, 6700 AC Wageningen
The Netherlands

This report is reprinted (with permission, and with corrections and some additions) from the technical report with number 87 ITI A 11 of the TNO Institute of Applied Computer Science, Statistics Department Wageningen, which is the former affiliation of the author.

Technical report: LWA-88-02
January 1988

GLW
Postbus 100
6700 AC Wageningen

Copyright Agricultural Mathematics Group, Wageningen, 1988.

No part of this publication, apart from bibliographic data and brief quotations in critical reviews, may be reproduced, re-recorded or published in any form including print, photocopy, microfilm, electronic or electromagnetic record without written permission from the Agricultural Mathematics Group, P.O. Box 100, 6700 AC Wageningen, The Netherlands.


CONTENT

OVERVIEW

1. INTRODUCTION
1.1 General objective
1.2 Models, methods and algorithm
1.3 Terminology
1.4 CANOCO's efficiency for ordination of community data
1.5 Outline of the manual

2. DATA INPUT
2.1 Cornell condensed format
2.2 Full format
2.3 Presence/absence data and nominal data for ordination
2.4 Linking up samples in different data files

3. TERMINAL DIALOGUE
3.1 How to activate CANOCO
3.2 Input and output
3.3 Ways to answer the questions
3.4 Questions to specify the type of analysis and in- and output files
3.5 Questions to omit samples and to manipulate environmental variables and covariables
3.6 Questions to specify transformation of species data
3.7 Questions to specify the output
3.8 Questions to specify additional analyses
3.9 Example

4. OUTPUT
4.1 Samples and species in the analysis
4.2 Iteration report, eigenvalue and length of gradient
4.3 Correlation matrix, means, standard deviations and inflation factors
4.4 Percentage variance accounted for by first s axes of species-environment biplot
4.5 Species scores
4.6 Sample scores
4.7 Regression/canonical coefficients, t-values and linear combinations of environmental variables
4.8 Inter-set correlations of environmental variables with axes
4.9 Biplot scores of environmental variables
4.10 Centroids of environmental variables in the ordination diagram
4.11 Monte Carlo permutation test

5. NONSTANDARD ANALYSIS

6. EXAMPLES
6.1 Dune meadow data
6.2 Weeds in summer barley
6.3 Gene frequency data

7. MISCELLANEOUS TOPICS
7.1 Percentage data/compositional data
7.2 Nominal response data
7.3 Multiple regression, redundancy analysis, principal components analysis and canonical correlation analysis
7.4 Principal coordinates analysis (PCO)
7.5 Interchanging species and samples; weighted averaging ordination
7.6 Weighting samples and species
7.7 Calibration by CANOCO
7.8 Canonical variates analysis (CVA)

8. ITERATIVE ORDINATION ALGORITHM

9. TECHNICAL DETAILS
9.1 Dimensioning
9.2 Structure of the main program
9.3 Scaling of the axes
9.4 Monte Carlo permutation test
9.5 Some points concerning CVA

10. INSTALLATION NOTES

11. ACKNOWLEDGEMENTS

12. REFERENCES

APPENDIX A: Theorem on the eigenvalue equation solved by CANOCO
APPENDIX B: Constrained principal coordinates analysis
APPENDIX C: Trace and short-cut formulae (4.17) and (4.19)


OVERVIEW

Aim

A common problem in community ecology and ecotoxicology is to discover how a multitude of species respond to external factors such as environmental variables, pollutants and management regime. Data are collected on species composition and the external variables at a number of points in space and time. Statistical methods available so far to analyse such data either assumed linear relationships or were restricted to regression analysis of the response of each species separately. To analyse the generally non-linear, non-monotone response of a community of species, one had to resort to the data-analytic methods of ordination and cluster analysis - "indirect" methods that are generally less powerful than the "direct" statistical method of regression analysis.

Recently, regression and ordination have been integrated into techniques of multivariate direct gradient analysis, called canonical (or constrained) ordination. The use of canonical ordination greatly improves the power to detect the specific effects one is interested in. One of these techniques, canonical correspondence analysis, escapes the assumption of linearity and is able to detect unimodal relationships between species and external variables.
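To illustrate why a unimodal response defeats a linear analysis, the following minimal Python sketch simulates one species with a Gaussian response along a single gradient. The gradient values, the curve parameters and all function names are illustrative assumptions and are not part of CANOCO. A straight-line fit sees essentially no trend because the rising and falling halves of the curve cancel, whereas the abundance-weighted average of the gradient values, the basic building block of correspondence analysis methods, recovers the position of the optimum.

```python
import math


def gaussian_response(x, optimum, tolerance, maximum):
    """Expected abundance at gradient value x under a Gaussian (unimodal) curve."""
    return maximum * math.exp(-0.5 * ((x - optimum) / tolerance) ** 2)


def weighted_average(values, weights):
    """Abundance-weighted mean of the gradient values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def linear_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical gradient (for example moisture) sampled evenly at 21 sites.
gradient = [0.5 * i for i in range(21)]  # 0.0, 0.5, ..., 10.0
abundance = [
    gaussian_response(x, optimum=5.0, tolerance=1.5, maximum=40.0)
    for x in gradient
]

slope = linear_slope(gradient, abundance)
wa_optimum = weighted_average(gradient, abundance)

print(f"least-squares slope:      {slope:.6f}")
print(f"weighted-average optimum: {wa_optimum:.3f}")
```

Because the sites in this sketch are placed symmetrically around the optimum, the slope is zero up to rounding error while the weighted average sits at the optimum. With uneven sampling the weighted average would be only an approximation.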

The computer program CANOCO is designed to make these techniques available to ecologists studying community responses. CANOCO can carry out most of the multivariate techniques described in Ter Braak (1987) and Ter Braak and Prentice (1988) using a general iterative ordination algorithm.
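As a rough illustration of what an iterative ordination algorithm looks like, the sketch below implements plain reciprocal averaging (two-way weighted averaging) for the first axis only. It is a simplified stand-in written for this overview, not CANOCO's FORTRAN routine: the data table, the function name and the fixed iteration count are assumptions, and there is no detrending, no constraint by environmental variables and no convergence check.

```python
def reciprocal_averaging(table, iterations=100):
    """First correspondence-analysis axis by two-way weighted averaging.

    table[i][k] is the abundance of species k in sample i. Every sample and
    every species is assumed to have a positive total.
    Returns (sample_scores, species_scores, eigenvalue).
    """
    n_samples, n_species = len(table), len(table[0])
    row_tot = [sum(row) for row in table]
    col_tot = [sum(row[k] for row in table) for k in range(n_species)]
    grand = float(sum(row_tot))

    def standardize(scores):
        # Centre and scale with sample totals as weights; also return the
        # weighted standard deviation the scores had before scaling.
        mean = sum(w * s for w, s in zip(row_tot, scores)) / grand
        centred = [s - mean for s in scores]
        sd = (sum(w * s * s for w, s in zip(row_tot, centred)) / grand) ** 0.5
        return [s / sd for s in centred], sd

    # Arbitrary non-constant starting scores for the samples.
    x, _ = standardize([float(i) for i in range(n_samples)])
    species_scores = [0.0] * n_species
    eigenvalue = 0.0
    for _ in range(iterations):
        # Species scores are weighted averages of the sample scores.
        species_scores = [
            sum(table[i][k] * x[i] for i in range(n_samples)) / col_tot[k]
            for k in range(n_species)
        ]
        # New sample scores are weighted averages of the species scores.
        new_x = [
            sum(table[i][k] * species_scores[k] for k in range(n_species)) / row_tot[i]
            for i in range(n_samples)
        ]
        # The shrinkage of the scores in one cycle estimates the eigenvalue.
        x, eigenvalue = standardize(new_x)
    return x, species_scores, eigenvalue


# Hypothetical table: 4 samples (rows) by 3 species (columns) along one gradient.
example_table = [
    [3, 1, 0],
    [2, 2, 0],
    [0, 2, 2],
    [0, 1, 3],
]
sample_scores, species_scores, eigenvalue = reciprocal_averaging(example_table)
print("sample scores: ", [round(s, 3) for s in sample_scores])
print("species scores:", [round(s, 3) for s in species_scores])
print("eigenvalue:    ", round(eigenvalue, 4))
```

For this small table the samples come out ordered along the gradient and the eigenvalue works out to 0.65. A table whose samples fall into two groups with no species in common would give an eigenvalue of 1.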

Researchers in other fields may find CANOCO useful as well, for example, to analyse percentage data/compositional data, nominal data or (dis)similarity data in relation to external explanatory variables. Such use is explained in separate sections in the manual. CANOCO is particularly well suited when the number of response variables is large compared to the number of objects.

Techniques covered

1. CANOCO is an extension of DECORANA (Hill, 1979). CANOCO formerly stood for canonical correspondence analysis (Ter Braak, 1986a, b) and included weighted averaging, reciprocal averaging/[multiple] correspondence analysis, detrended correspondence analysis and canonical correspondence analysis. The program has been extended to cover also principal components analysis (PCA) and the canonical form of PCA, called redundancy analysis (RDA). Redundancy analysis (Van den Wollenberg, 1977; Israëls, 1984) is also known under the names of reduced-rank regression (Davies and Tso, 1982), PCA of y with respect to x (Robert and Escoufier, 1976) and mode C partial least squares (Wold, 1982). For these linear methods there are options for centring/standardization by species and by sites and for the method of scaling the species and site scores for use in the biplot. The eigenvalues reported in PCA/RDA are fractions of the total variance in the species data (percentage variance accounted for). Principal coordinates analysis and canonical variates analysis are also available.
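The statement that PCA/RDA eigenvalues are fractions of the total variance can be made concrete in the simplest case of a single environmental variable. Redundancy analysis then has only one canonical axis, and its eigenvalue reduces to the sum over species of the regression sums of squares divided by the total sum of squares. The Python sketch below assumes centring by species without standardization. The data and the function name are made up for illustration and do not reproduce CANOCO output.

```python
def rda_fraction_single_variable(species, env):
    """Canonical eigenvalue of an RDA with one explanatory variable,
    expressed as a fraction of the total variance in the species data.

    species[i][k] is the abundance of species k in sample i;
    env[i] is the environmental value of sample i.
    """
    n_samples = len(env)
    n_species = len(species[0])

    env_mean = sum(env) / n_samples
    z = [e - env_mean for e in env]      # centred explanatory variable
    szz = sum(v * v for v in z)

    total_ss = 0.0    # total sum of squares over all species
    fitted_ss = 0.0   # sum of squares of the fitted (regression) values
    for k in range(n_species):
        column = [species[i][k] for i in range(n_samples)]
        mean = sum(column) / n_samples
        y = [v - mean for v in column]   # centring by species
        total_ss += sum(v * v for v in y)
        slope = sum(a * b for a, b in zip(z, y)) / szz
        fitted_ss += slope * slope * szz
    return fitted_ss / total_ss


# Hypothetical data: 4 samples, 2 species, 1 environmental variable.
env_values = [1.0, 2.0, 3.0, 4.0]
species_table = [
    [1.0, 3.0],
    [2.0, 1.0],
    [3.0, 1.0],
    [4.0, 3.0],
]
fraction = rda_fraction_single_variable(species_table, env_values)
print(f"fraction of total variance on the canonical axis: {fraction:.4f}")
```

In this made-up table the first species is exactly linear in the environmental variable and the second is uncorrelated with it, so the canonical axis accounts for 5/9 of the total variance, the share contributed by the first species.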
