
A Baseline Category Logit Model for Assessing Competing Strains of Rhizobium Bacteria

Abstract
In this paper we describe novel methodology for evaluating competition among strains of Rhizobium bacteria which can be found naturally occurring in or can be introduced into soil. Rhizobia can occupy nodules on the roots of legume plants allowing the plant to ‘fix’ atmospheric nitrogen. Our model defines competitive outcomes for a community (the multinomial count of nodules occupied by each strain at the end of a time period) relative to the past state of the community (the proportion of each strain present at the beginning of the time period) and incorporates this prior information in the analysis. Our approach for assessing competition provides an analogy to multivariate methods for continuous responses in competition studies and an alternative to univariate methods for discrete responses that respects the multivariate nature of the data. It can also handle zero values in the multinomial response providing an alternative to compositional data analysis methods, which traditionally have not been able to facilitate zero values. The proposed experimental design is based on the simplex design and the model is an extension of multinomial baseline category logit models that includes multiple offsets and random terms to allow for correlation among clustered responses. Supplemental materials for this article are available from the journal website.


Supplementary materials for this article are available at 10.1007/s13253-011-0058-6.
C. BROPHY, J. CONNOLLY, I. L. FAGERLI, S. DUODU, and M. M. SVENNING
Key Words: Competition with discrete response; Compositional data analysis; Discrete
multivariate analysis; Random effects; Simplex design; Zero values.
C. Brophy is a Lecturer in Statistics, Department of Mathematics & Statistics, National University of
Ireland Maynooth, Maynooth, Co. Kildare, Ireland (E-mail: caroline.brophy@nuim.ie). J. Connolly is Associate
Professor in Statistics, UCD School of Mathematical Sciences, Environmental & Ecological Modelling Group,
University College Dublin, Belfield, Dublin 4, Ireland. I. L. Fagerli was a Research Student and M. M. Svenning
is Professor and Head, Department of Arctic and Marine Biology, University of Tromsø, 9037 Tromsø, Norway.
S. Duodu is a Researcher, National Veterinary Institute, P.O. Box 750, Sentrum, 0106 Oslo, Norway.
© 2011 International Biometric Society
Journal of Agricultural, Biological, and Environmental Statistics, Volume 16, Number 3, Pages 409–421
DOI: 10.1007/s13253-011-0058-6

1. INTRODUCTION
Competition occurs among species when a required resource is limited and the species
‘compete’ to each obtain the resource. Competition has been widely studied experimentally
across many organisms (Nicol and Thornton 1941; Connell 1983; Schoener 1983;
Firbank and Watkinson 1985; Goldberg and Barton 1992; Iwasa, Nakamaru, and Levin
1998). The analytical approaches for assessing effects range from multivariate models for
continuous responses (Connolly and Wayne 2005) to univariate approaches for discrete re-
sponses (May 2001) to compositional methods (Aitchison 1986; Aitchison and Ng 2005).
Here we develop a modeling approach for discrete multinomial response data that extends
the current competition literature in three ways: (1) it is analogous to a competition model
derived for continuous responses by Connolly and Wayne (2005) that defines competitive
outcomes relative to the past state of the community and incorporates this prior information
in the analysis, (2) it allows for the multivariate nature of the response data, (3) it will han-
dle zero response values. Our model is a baseline category logit model extended to include
random effects (Hartzel, Agresti, and Caffo 2001) to allow for correlated responses and
multiple offset terms to allow for initial starting values of species. Offsets have previously
been used with models for discrete responses (logistic regression in Agresti 2002), but
multiple offsets have not been used with multinomial models or for the purpose of assessing
competition among species.
The models developed in this paper are motivated by a study of competition among
strains of rhizobia bacteria, which are found naturally occurring in soil or can be introduced
deliberately into soil. Rhizobia can occupy nodules on the root of legume plant species
resulting in atmospheric nitrogen fixation and thereby supply the host plant with N and
provide additional N in the legume environment. This natural source of N can be beneficial
to the productivity of grassland systems and can reduce the cost of running the system. It
is possible that some strains of rhizobia are superior at occupying nodules and at fixing N.
Does the proportion of strains of rhizobia present in the soil at a given point in time affect
the proportion of nodules that the strain will occupy at a later time? To answer this question
we applied three strains of rhizobia to the roots of a legume species in a range of initial
proportions and after a period of time counted the number of nodules each strain occupied.
There were a limited number of available sites for nodulation and the strains competed to
occupy them. For each community (root section) we have a vector of initial proportions
and a final multinomial response vector. We modeled the change from initial proportion
applied to final proportion of nodules occupied for each strain.
In a community, a good competitor is one that gains proportionately more over time
than other species (Connolly, Wayne, and Bazzaz 2001). Connolly and Wayne (2005) and
Ramseier, Connolly, and Bazzaz (2005) developed a multivariate modeling approach to
assessing the effects of the species identity, environment and species initial relative abun-
dance on the outcome of competition. The continuous and multivariate response measured
was the relative growth rate of each species in a community over a period of time. The
variable(s) modeled were the differences in relative growth rates between pairs of species
in a community, giving the name RGRD (relative growth rate difference) to the models.
The RGRD model does not currently facilitate discrete responses.
When the response for each species in a community is a discrete whole number, each
experimental community provides a multinomial response vector. There is a long history
of modeling approaches to community dynamics for such discrete responses (May
2001) and these can be related to a discrete version of the Lotka–Volterra model
(Leslie 1958). However, these approaches rarely deal with the multivariate nature of these
types of data. Other approaches have been to use compositional data analysis methods
for changing compositions (Aitchison 1986; Aitchison and Ng 2005), but these methods
break down when species with zero compositions occur in the response. Some approaches
that accommodate zeros have been developed (e.g., Aitchison and Kay 2003;
Martín-Fernández, Barceló-Vidal, and Pawlowsky-Glahn 2003; Butler and Glasbey 2008),
but these rely on assumptions about the type of zero or are suited only to analysis for
specific hypotheses, e.g., to compare compositions of different groups.
In a simplex design (Scheffe 1963; Cornell 2002), the initial relative abundances of
competing species are manipulated so that not all experimental communities have all
species equally present to begin with. This design has been used in a range of multi-
species competition studies (e.g. Ramseier, Connolly, and Bazzaz 2005; Kirwan et al. 2007;
Suter et al. 2007) as it allows a broad coverage of the design space and facilitates the
simultaneous assessment of species identity, the effect of species on each other and, if
required, environmental effects (Connolly, Wayne, and Bazzaz 2001). Ideally, in compe-
tition studies, the simplex design would comprise a wide range of compositions in the
simplex space at a number of overall densities (Ramseier, Connolly, and Bazzaz 2005;
Kirwan et al. 2007).
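The idea of covering the simplex space with a range of initial compositions can be sketched in a few lines of Python. This is only an illustration, assuming a regular simplex-lattice grid in which every proportion is a multiple of 1/m; the function name `simplex_lattice`, the grid resolution `m` and the `interior_only` flag are hypothetical and are not taken from the experiment described in this paper.

```python
from fractions import Fraction
from itertools import product


def simplex_lattice(n_species, m, interior_only=False):
    """List compositions whose proportions are multiples of 1/m and sum to 1.

    With interior_only=True, every species has a strictly positive proportion.
    """
    lowest = 1 if interior_only else 0
    points = []
    for counts in product(range(lowest, m + 1), repeat=n_species):
        if sum(counts) == m:
            # Exact fractions avoid floating-point drift in the sum-to-one check.
            points.append(tuple(Fraction(c, m) for c in counts))
    return points


# Hypothetical example: three species on a grid with steps of 1/4.
design = simplex_lattice(3, 4)
interior = simplex_lattice(3, 4, interior_only=True)
```

Each design point would then be crossed with the chosen overall densities. The `interior_only` option is included because any model term involving the logarithm of an initial proportion is undefined when that proportion is zero.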
In this paper we propose an experimental and analytical framework for assessing com-
petition among species where the outcome is discrete. The experimental response in our
motivating example is the number of nodules acquired by each strain of rhizobia in each
community and this is a multinomial vector. We describe a multinomial modeling frame-
work for discrete responses from this multi-strain competition experiment and the experi-
mental design needed to estimate model parameters, and we detail how to predict and test
predictions from the models. The novel features are the marrying of simplex designs with
multinomial responses in a discrete modeling framework that defines competitive outcomes
for a community of species relative to a previous state of the community and incorporates
this prior information in the analysis.
2. METHODS
We propose a multinomial baseline category logit model (Agresti 2002) to measure
the competition between $J$ species (categories) that will allow the assessment of
competitive relationships among species and consequences for community structure. The
categorical response vector is $(y_{i1}, \ldots, y_{iJ})$ for $i = 1, \ldots, c$ (the number of
communities) and $j = 1, \ldots, J$ (the number of species) and represents the number of
‘success counts’ for each species at time $t$, with $\sum_{j=1}^{J} y_{ij} = n_i$ being the total
number of success counts for community $i$. A multinomial baseline category logit model is a
series of $J - 1$ models relating the $j$th to the $J$th species, where the $J$th species is called
the baseline category. The ordering of the $j = 1$ to $J$ species and the use of a particular
species as the ‘baseline’ is arbitrary and independent of interpretation. We can model the
vector of parameters $(\pi_{i1}, \ldots, \pi_{iJ})$, the proportion of success counts for each
species in the $i$th community at time $t$, with $\sum_{j=1}^{J} \pi_{ij} = 1$, as

$$\log\left(\frac{\pi_{ij}}{\pi_{iJ}}\right) = x_i' \beta_j \quad \text{for } j = 1, \ldots, J - 1 \tag{2.1}$$

where $x_i$ denotes the vector of $K$ explanatory variables for the $i$th community, and
$\beta_j$ is the parameter vector of coefficients for the $j$th model and could include abiotic
effects such as an environmental treatment. If $\beta_j = 0$, then $\pi_{ij}/\pi_{iJ} = 1$ and we
conclude that species $j$ and $J$ have the same proportion of success counts at time $t$. While
model (2.1) can assess the proportion of success counts by species relative to the baseline
species at a given point in time ($t$), it cannot address questions of competitive relations or
consequences for community dynamics without incorporating information on the proportions of
each species in the community at time 0 (or some other reference time) (Connolly, Wayne, and
Bazzaz 2001).
If the proportion of each species initially present in the $i$th community at time 0 is given
by the vector $(p_{i1}, \ldots, p_{iJ})$, then we propose the model:

$$\log\left(\frac{\pi_{ij}/p_{ij}}{\pi_{iJ}/p_{iJ}}\right) = x_i' \beta_j \quad \text{for } j = 1, \ldots, J - 1 \tag{2.2}$$

which can be rewritten as

$$\log\left(\frac{\pi_{ij}}{\pi_{iJ}}\right) = x_i' \beta_j + \log\left(\frac{p_{ij}}{p_{iJ}}\right) \quad \text{for } j = 1, \ldots, J - 1 \tag{2.3}$$

where $\log(p_{ij}/p_{iJ})$ is an offset term, i.e. a regression term with known coefficient
equal to 1. If $\beta_j = 0$, it indicates no change in relative abundance from time 0 to time $t$
between the two competing species $j$ and $J$ and implies that the two species are equally
competitive. This model is analogous to the specification of the RGRD model in Connolly and
Wayne (2005, Equation (4)).
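A short worked step may help to show why a zero coefficient vector corresponds to an unchanged composition. The derivation below is our own elaboration of model (2.3), not part of the original specification, and it assumes that all initial proportions are strictly positive and that $\beta_j = 0$ for every $j = 1, \ldots, J - 1$.

$$
\begin{aligned}
\log\left(\frac{\pi_{ij}}{\pi_{iJ}}\right) = \log\left(\frac{p_{ij}}{p_{iJ}}\right)
&\;\Longrightarrow\; \pi_{ij} = \frac{\pi_{iJ}}{p_{iJ}}\, p_{ij} \quad \text{for } j = 1, \ldots, J, \\
1 = \sum_{j=1}^{J} \pi_{ij} = \frac{\pi_{iJ}}{p_{iJ}} \sum_{j=1}^{J} p_{ij} = \frac{\pi_{iJ}}{p_{iJ}}
&\;\Longrightarrow\; \pi_{iJ} = p_{iJ} \;\text{ and hence }\; \pi_{ij} = p_{ij} \text{ for all } j.
\end{aligned}
$$

Under these assumptions the expected final composition equals the initial composition, so any non-zero $\beta_j$ measures a departure from that no-change reference.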
We extend this model to include a community-specific random effect to allow for variation
from community-to-community (Hartzel, Agresti, and Caffo 2001). The model comparing the
$j$th to the $J$th species is

$$\log\left(\frac{\pi_{ij}}{\pi_{iJ}}\right) = x_i' \beta_j + \log\left(\frac{p_{ij}}{p_{iJ}}\right) + z_i' u_{ij} \quad \text{for } j = 1, \ldots, J - 1 \tag{2.4}$$

where $z_i$ denotes the design vector for the random effect for the $i$th community and
$u_{ij}$ is assumed multivariate normal with an unstructured covariance matrix ($\Sigma$) to keep
independence of the choice of baseline category (Hartzel, Agresti, and Caffo 2001).
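We can fit model (2.4) using maximum likelihood. Denoting the linear predictor,
$lp_{ij} = x_i' \beta_j + \log(p_{ij}/p_{iJ}) + z_i' u_{ij}$, the likelihood function for the
$i$th response vector is, integrating out the random effects and omitting a fixed constant:

$$\ell_i \propto \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \left\{ \prod_{j=1}^{J-1} \left[ \frac{\exp(lp_{ij})}{1 + \sum_{h=1}^{J-1} \exp(lp_{ih})} \right]^{y_{ij}} \right\} \left[ \frac{1}{1 + \sum_{h=1}^{J-1} \exp(lp_{ih})} \right]^{y_{iJ}} \times f(u_{ij}; \Sigma) \, du_{ij}. \tag{2.5}$$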
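The integral in (2.5) generally has no closed form, so it has to be approximated numerically. The sketch below uses plain Monte Carlo integration for a single community; this is a swapped-in illustration, not the fitting method used in this paper. It assumes a random intercept (so the random-effect design vector reduces to 1), takes the fixed part of each linear predictor as given, and uses hypothetical names (`community_likelihood_mc`, `eta`, `n_draws`).

```python
import numpy as np


def community_likelihood_mc(y, eta, Sigma, n_draws=20000, seed=0):
    """Monte Carlo approximation of the integral in (2.5) for one community.

    y     : length-J counts, baseline species last
    eta   : length-(J-1) fixed part, x_i' beta_j + log(p_ij / p_iJ)
    Sigma : (J-1) x (J-1) covariance matrix of the random effects
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    eta = np.asarray(eta, dtype=float)
    # Draw random effects u ~ N(0, Sigma); shape (n_draws, J-1).
    u = rng.multivariate_normal(np.zeros(eta.size), Sigma, size=n_draws)
    lp = eta + u
    # log(1 + sum_h exp(lp_h)), computed stably by adding a zero column.
    log_denom = np.logaddexp.reduce(
        np.column_stack([np.zeros(n_draws), lp]), axis=1
    )
    # log of the multinomial kernel: sum_j y_j * lp_j - n_i * log(denominator).
    log_kernel = lp @ y[:-1] - y.sum() * log_denom
    # Average the kernel over the draws to integrate out the random effects.
    return np.exp(log_kernel).mean()


# Hypothetical example: three strains, one nodule each, no fixed effects.
approx = community_likelihood_mc([1, 1, 1], [0.0, 0.0], np.eye(2))
```

For large nodule counts the kernel can underflow, so a practical implementation would work on the log scale throughout, and quadrature would usually be more efficient than simple Monte Carlo.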

We predict (denoted by the ˆ symbol, which is also used to denote the maximum likelihood
estimate of model parameters) the proportion of success counts for the $j$th species from
the model at the median value of the random effect using the equations:

$$\hat{\pi}_{ij} = \frac{\exp\left(x_i' \hat{\beta}_j + \log(p_{ij}/p_{iJ})\right)}{1 + \sum_{h=1}^{J-1} \exp\left(x_i' \hat{\beta}_h + \log(p_{ih}/p_{iJ})\right)} \quad \text{for } j = 1, \ldots, J - 1,$$

$$\hat{\pi}_{iJ} = 1 - \sum_{j=1}^{J-1} \hat{\pi}_{ij} \quad \text{for } J. \tag{2.6}$$
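The prediction equations (2.6) can be sketched directly in code. The function below is a minimal illustration, assuming the fitted fixed-effect parts $x_i' \hat{\beta}_j$ are already available as an array and that every initial proportion is positive. The names `predict_proportions` and `eta_fixed` are hypothetical.

```python
import numpy as np


def predict_proportions(eta_fixed, p):
    """Predicted final proportions from (2.6), with the random effect set to zero.

    eta_fixed : length-(J-1) array of x_i' beta_hat_j
    p         : length-J initial proportions (all > 0), baseline species last
    """
    eta_fixed = np.asarray(eta_fixed, dtype=float)
    p = np.asarray(p, dtype=float)
    # Linear predictor: fixed part plus the offset log(p_ij / p_iJ).
    lp = eta_fixed + np.log(p[:-1] / p[-1])
    expo = np.exp(lp)
    pi_nonbaseline = expo / (1.0 + expo.sum())
    # The baseline proportion is whatever remains.
    pi_baseline = 1.0 - pi_nonbaseline.sum()
    return np.append(pi_nonbaseline, pi_baseline)


# With all fixed effects equal to zero, the prediction reproduces the
# initial composition (hypothetical proportions, three species).
unchanged = predict_proportions([0.0, 0.0], [0.2, 0.3, 0.5])
```

The zero-coefficient case is a useful sanity check: the offsets alone return the initial proportions, which is the no-competition reference implied by model (2.3).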
While this model may be applied to a wide range of count data, it is particularly relevant
to data from experiments based on a simplex design (Scheffe 1963; Cornell 2002) in which
the initial $p_{ij}$ values and overall initial density of species are deliberately manipulated.
The relative abundances of each species at time 0, $(p_{i1}, \ldots, p_{iJ})$, may be important
determinants of species relative competitiveness and hence of the final composition
$(\pi_{i1}, \ldots, \pi_{iJ})$ of the $i$th community. At its simplest, the $x$ matrix in model (2.4)
would include the relative abundances $p_{i1}, \ldots, p_{iJ}$, giving:

$$\log\left(\frac{\pi_{ij}}{\pi_{iJ}}\right) = \sum_{k=1}^{J} \beta_{jk} p_{ik} + \beta_{jD} D_i + \log\left(\frac{p_{ij}}{p_{iJ}}\right) + u_{ij} \quad \text{for } j = 1, \ldots, J - 1 \tag{2.7}$$

where $p_{ik}$ is the initial proportion of the $k$th species for $k = 1, \ldots, J$, $D_i$ is
the total density of the $i$th community and $u_{ij}$ is a random effect with variance
$\sigma_j^2$ and may be correlated with the other $J - 2$ random effects. Interactions among the
$p_{ik}$'s and between the $p_{ik}$'s and other independent variables, such as a treatment factor
or community density ($D$), may also be included in the model specification.
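The right-hand side of model (2.7) can be assembled as follows. This is a sketch under stated assumptions: the coefficients are stored as a $(J-1) \times J$ array for the $\beta_{jk}$ and a length-$(J-1)$ array for the $\beta_{jD}$, no interaction terms are included, and all names (`linear_predictor_27`, `beta_p`, `beta_D`) and numerical values are hypothetical.

```python
import numpy as np


def linear_predictor_27(p, density, beta_p, beta_D, u=None):
    """Right-hand side of (2.7) for one community.

    p       : length-J initial proportions (all > 0), baseline species last
    density : total density D_i of the community
    beta_p  : (J-1) x J array, row j holds beta_j1, ..., beta_jJ
    beta_D  : length-(J-1) array of density coefficients
    u       : optional length-(J-1) random effects (defaults to zero)
    """
    p = np.asarray(p, dtype=float)
    beta_p = np.asarray(beta_p, dtype=float)
    beta_D = np.asarray(beta_D, dtype=float)
    if u is None:
        u = np.zeros(beta_D.shape)
    # Offset term with known coefficient 1.
    offset = np.log(p[:-1] / p[-1])
    # sum_k beta_jk * p_ik + beta_jD * D_i + offset + u_ij, for each j.
    return beta_p @ p + beta_D * density + offset + np.asarray(u, dtype=float)


# Hypothetical coefficients for three species at density 2.
lp = linear_predictor_27(
    p=[0.2, 0.3, 0.5],
    density=2.0,
    beta_p=[[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]],
    beta_D=[0.5, -0.5],
)
```

Interaction terms would enter as extra columns multiplied by their own coefficients in the same way.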
For model (2.7), if $\beta_{jk} = 0$ for all $k = 1, \ldots, J$ and $\beta_{jD} = 0$, then the
relative proportions of the $j$th and $J$th species are the same at times 0 and $t$, and species
$j$ and $J$ are equally competitive, i.e. $\pi_{ij}/\pi_{iJ} = p_{ij}/p_{iJ}$. When these
parameters are not zero and interaction effects are present, the number of competition
coefficients may mean it is difficult to see their combined impact on community relative
composition. To interpret the model, the final proportions of success counts for each species
can be predicted for a range of initial communities and these predictions used to determine the
outcome of competition. Predictions can be displayed graphically using ternary diagrams (where
there are three competing species), and we distinguish between two numerical comparisons.
Compositional change measure (1): $\hat{\pi}_{ij}/p_{ij}$ compares the predicted proportion of
success counts relative to initial proportion present for an individual species. This measure
determines how a species performs relative to its own expectation ($p_{ij}$) but even a species
that performs better than expected may not be the most competitive species. Compositional
change measure (2): $\dfrac{\hat{\pi}_{ij}/p_{ij}}{\hat{\pi}_{ij'}/p_{ij'}}$ for $j \neq j'$,
compares two species and determines which is the more competitive of the two.
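The two compositional change measures are simple ratios and can be computed as in the sketch below. The function names and the numerical values are hypothetical and chosen only to illustrate the distinction drawn above; they are not results from the rhizobia experiment.

```python
import numpy as np


def change_measure_1(pi_hat, p):
    """Measure (1): predicted proportion relative to initial proportion, per species."""
    return np.asarray(pi_hat, dtype=float) / np.asarray(p, dtype=float)


def change_measure_2(pi_hat, p):
    """Measure (2): matrix whose (j, j') entry is measure (1) of j over measure (1) of j'."""
    ratio = change_measure_1(pi_hat, p)
    return ratio[:, None] / ratio[None, :]


# Hypothetical predicted and initial compositions for three species.
pi_hat = [0.45, 0.35, 0.20]
p_init = [0.30, 0.30, 0.40]

m1 = change_measure_1(pi_hat, p_init)
m2 = change_measure_2(pi_hat, p_init)
```

In this made-up example the second species has a measure (1) value above 1, so it does better than its initial proportion would suggest, yet its measure (2) value against the first species is below 1, so it is still the weaker competitor of that pair.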
