A Flexible Parametric Family for the Modeling and Simulation of Yield Distributions

Abstract
The distributions currently used to model and simulate crop yields are unable to accommodate a substantial subset of the theoretically feasible mean-variance-skewness-kurtosis (MVSK) hyperspace. Because these first four central moments are key determinants of shape, the available distributions might not be capable of adequately modeling all yield distributions that could be encountered in practice. This study introduces a system of distributions that can span the entire MVSK space and assesses its potential to serve as a more comprehensive parametric crop yield model, improving the breadth of distributional choices available to researchers and the likelihood of formulating proper parametric models.


Octavio A. Ramirez, Tanya U. McDonald, and Carlos E. Carpio
Key Words:
risk analysis, parametric methods, yield distributions, yield modeling and
simulation, yield nonnormality
JEL Classifications:
C15, C16, C46, C63
Agricultural economists have long recognized
that the choice of an appropriate probability
distribution to represent crop yields is critical
for an accurate measurement of the risks as-
sociated with crop production. Anderson
(1974) first emphasized the importance of ac-
counting for nonnormality in crop yield distri-
butions for the purpose of economic risk anal-
ysis. Since then, numerous authors have
focused on this issue, including Gallagher
(1987), Nelson and Preckel (1989), Moss and
Shonkwiler (1993), Ramirez, Moss, and Boggess
(1994), Coble et al. (1996), and Ramirez (1997).
These authors have provided strong statistical
evidence of nonnormality and heteroskedasticity
in crop yield distributions—specifically, the ex-
istence of kurtosis and negative skewness in
a variety of cases. The possibility of positive
skewness has been documented as well (Ramírez,
Misra, and Field, 2003).
The three general statistical procedures that
have been used for modeling and simulating
crop yield distributions are the parametric,
the nonparametric, and the semiparametric. All have
distinct advantages and disadvantages. The para-
metric procedures assume that the data-generat-
ing process can be adequately represented by a
particular parametric probability distribution
function. For this reason, the main disadvantage
of this method is the potential error resulting
from assuming a probability distribution that is
not flexible enough to properly represent the
data. The main advantage of this method is that,
if the assumed distribution can adequately represent the data-generating process, it performs relatively well even in small sample applications. This is important because crop yield datasets do not often span long periods. Distributions that have been used as a basis for parametric procedures include the Normal, the Log-normal, the Logistic, the Weibull, the Beta, the Gamma, and the Inverse Hyperbolic Sine.

Octavio A. Ramirez is professor and head, Department of Agricultural and Applied Economics, University of Georgia, Athens, GA. Tanya U. McDonald is research specialist, Department of Agricultural Economics and Agricultural Business, New Mexico State University, Las Cruces, NM. Carlos E. Carpio is assistant professor, Department of Applied Economics and Statistics, Clemson University, Clemson, SC.
This study was supported by the National Research Initiative of the Cooperative State Research, Education and Extension Service, USDA, Grant #2004-35400-14194, and by the Agricultural Experiment Stations of the University of Georgia and New Mexico State University.
Journal of Agricultural and Applied Economics, 42,2(May 2010):303–319
© 2010 Southern Agricultural Economics Association

The nonparametric approaches also have
strengths and weaknesses. Because these
methods are free of a functional form assump-
tion, they are generally more flexible. However,
they can be inefficient relative to parametric
procedures under certain conditions. Specifically, according to Ker and Coble (2003), ‘it is possible, perhaps likely, for very small samples such as those corresponding to farm-level yield data, that an incorrect parametric form, say Normal, is more efficient than the standard nonparametric kernel estimator.’ Other authors
cite theoretical complexity and intensive com-
putational requirements as another disadvantage
of nonparametric procedures (Yatchew, 1998).
Semiparametric methods show significant po-
tential because they encapsulate the advantages
of the parametric and nonparametric approaches
while mitigating their disadvantages (Ker and
Coble, 2003; Norwood, Roberts, and Lusk,
2004). Because the semiparametric approach is
based on nonparametrically ‘correcting’ a par-
ticular parametric estimate, the availability of
more flexible distributions such as the ones ad-
vanced in this study should improve the potential
efficiency of semiparametric procedures as well.
Extensive efforts have been devoted to the
issue of the most appropriate probability dis-
tribution to be used as a basis for parametric or
semiparametric methods. Gallagher (1987) used
the well-known Gamma density as a parametric
model for soybean yields. Nelson and Preckel
(1989) proposed a conditional Beta distribution
to model corn yields. Taylor (1990) estimated
multivariate nonnormal densities through a con-
ditional distribution approach based on the
Hyperbolic Tangent transformation. Ramirez (1997) introduced a modified Inverse Hyperbolic Sine transformation (also known in the statistics literature as the $S_U$ distribution) as a possible multivariate nonnormal and heteroskedastic crop yield distribution model. Ker and
Coble (2003) proposed a semiparametric model
based on the Normal and Beta densities. Em-
pirical comparisons of leading parametric
models have been recently attempted (Norwood,
Roberts, and Lusk, 2004). Despite such ad-
vances, the potential of different probability
density functions (pdfs) to serve as suitable yield
and price distribution models has not been
assessed in the context of a rigorous theoretical
framework, and this critical research methods
issue remains unsettled.
According to basic statistical theory (Mood,
Graybill, and Boes, 1974), the first four central
moments of a pdf are the main descriptors of its
shape. Although there are other means for
characterizing and comparing distributions, this
suggests that the flexibility of a pdf to accom-
modate a variety of empirical shapes (i.e., data-
generating processes) should be closely related to
the Mean-Variance-Skewness-Kurtosis (MVSK)
combinations that are allowed by it.
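To make the MVSK notion concrete, the following is a minimal sketch of how the four sample moments might be computed for a yield series. The function name `sample_mvsk` and the yield figures are hypothetical, and the sketch assumes simple 1/n moments with kurtosis reported in excess of the Normal value of 3.

```python
def sample_mvsk(data):
    """Return (mean, variance, skewness, excess kurtosis) of a sample.

    Moments are the simple 1/n (population-style) versions; excess kurtosis
    is the fourth standardized moment minus 3, so a Normal scores 0.
    """
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return mean, m2, m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


# Hypothetical yield series (made-up numbers, for illustration only).
yields = [142.0, 155.0, 98.0, 160.0, 151.0, 120.0, 158.0, 149.0]
print(sample_mvsk(yields))
```

A parametric family is flexible in the sense used here only if some parameter setting can reproduce the (S, K) pair that such a calculation returns.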
Unfortunately, in this regard, all of the pdfs
that have been used as a basis for parametric
methods suffer from two significant range re-
strictions: (1) they can only accommodate lim-
ited subsets of all of the theoretically feasible
Skewness-Kurtosis (SK) combinations and,
therefore, might only be capable of adequately
modeling underlying data distributions where
third and fourth central moments are within that
subset; and (2) their variance, skewness, and
kurtosis are controlled by only two parameters
and, therefore, are arbitrarily interrelated (i.e.,
only two of these three key moments are free to
vary independently). Although the expanded form of the $S_U$ advanced by Ramírez, Misra, and Field (2003) allows for any mean and variance to be freely associated with the SK values permitted by the original parameterization of this family, it is far from encompassing all theoretically feasible SK combinations.
This research contributes to the yield and
price distribution modeling literature through
the following measures:
1. Introduce a family of distributions ($S_B$) that perfectly complements the $S_U$ in its coverage of the SK space, and that spans significant regions of the space not covered by any other distribution that has been used for the modeling and simulation of crop yields.
2. Derive expanded forms of this $S_B$ family and of the Beta distribution, which are analogous to Ramírez, Misra, and Field’s (2003) reparameterization of the $S_U$, so that they, too, can allow for any mean and variance to be freely associated with their possible SK values.
3. Assess the importance of MVSK coverage in
determining a model’s flexibility to adequately
represent a wide range of distributional shapes.
The $S_U$-$S_B$ System
Johnson (1949) introduced the $S_U$ and the $S_B$ families of distributions and showed that, together, they span the entire SK space. Figure 1 is constructed on the basis of the formulas for the skewness and kurtosis of the $S_U$ and $S_B$ distributions, which were also first derived by Johnson (1949). Specifically, SK pairs are computed and plotted for very fine grids of the parameter spaces corresponding to these two distributions. The same procedure is followed for the Beta, Gamma, and Log-normal ($S_L$). It is observed that the lower bound of the $S_B$ is at the boundary for the theoretically feasible SK space ($K = S^2 - 2$). This plotting illustrates Johnson’s claim of a comprehensive coverage of the theoretically feasible SK space by the $S_U$-$S_B$ families.
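The feasibility boundary $K = S^2 - 2$ can be checked with a short sketch. The helper names below are hypothetical. The example relies on a standard result that is not stated in the article: a two-point (Bernoulli) variable sits exactly on the boundary, with K measured as excess kurtosis.

```python
import math


def is_feasible_sk(skew, excess_kurt, tol=1e-12):
    """True if the (S, K) pair satisfies the bound K >= S^2 - 2."""
    return excess_kurt >= skew ** 2 - 2.0 - tol


def bernoulli_sk(p):
    """Skewness and excess kurtosis of a Bernoulli(p) variable, 0 < p < 1."""
    q = 1.0 - p
    skew = (q - p) / math.sqrt(p * q)
    excess_kurt = (1.0 - 6.0 * p * q) / (p * q)
    return skew, excess_kurt


for p in (0.1, 0.3, 0.5):
    s, k = bernoulli_sk(p)
    # The gap K - (S^2 - 2) is zero up to rounding for every p.
    print(p, s, k, k - (s ** 2 - 2.0))
```

Any (S, K) pair that falls below this curve cannot be produced by a probability distribution, so no parametric family needs to cover it.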
However, in the parameterizations proposed
by Johnson, each of those SK combinations is
arbitrarily associated with a fixed set of mean-
variance values. Ramirez and McDonald (2006)
outline a reparameterization technique that
expands any probability distribution by two
parameters that specifically and uniquely con-
trol the mean and variance without affecting the
range of skewness and kurtosis values that can
be accommodated. The expanded distribution
obtained through this reparameterization can
therefore model any conceivable mean and
variance in conjunction with the set of SK
combinations allowed by the original distribution. For the purposes of this study, the technique is first applied to the $S_U$ and $S_B$ families. This yields a system that can model all theoretically feasible MVSK combinations. The reparameterization begins with the original two-parameter families (Johnson, 1949):

$$Z = \gamma + \delta \sinh^{-1}(Y) \quad \text{for the } S_U \text{ distribution} \tag{1}$$

$$Z = \gamma + \delta \ln[Y/(1-Y)] \quad \text{for the } S_B \text{ distribution} \tag{2}$$

where Y is a nonnormally distributed random variable based on a standard normal variable (Z). In other words, the original $S_U$ and $S_B$ distributions are derived from transformations of a standard normal density. Their pdfs, which are also provided by Johnson (1949), are obtained by substituting Z in Equation (1) (for the $S_U$) or Equation (2) (for the $S_B$) into a standard normal density and multiplying the resulting equation by the derivative of Equation (1) (for the $S_U$) or Equation (2) (for the $S_B$) with respect to Y. In the mathematical statistics literature, this is commonly known as the transformation technique for deriving pdfs.

Figure 1. $S_U$, $S_L$, $S_B$, Beta, and Gamma Distributions in the SK Plane; the $S_B$ Distribution Allows all SK Combinations in the Beta and Gamma Areas as Well

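As a minimal sketch of the transformation technique just described, the code below substitutes Z from Equations (1) and (2) into a standard normal density and multiplies by the derivative of Z with respect to Y. The function names, the parameter values, and the midpoint-rule integrator are assumptions made for illustration. The derivatives are worked out by hand here rather than quoted from Johnson (1949).

```python
import math


def std_normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def su_pdf(y, gamma, delta):
    """Density of Y when Z = gamma + delta * asinh(Y) is standard normal."""
    z = gamma + delta * math.asinh(y)
    dz_dy = delta / math.sqrt(1.0 + y * y)  # derivative of Equation (1)
    return std_normal_pdf(z) * abs(dz_dy)


def sb_pdf(y, gamma, delta):
    """Density of Y when Z = gamma + delta * ln(Y / (1 - Y)) is standard normal."""
    if not 0.0 < y < 1.0:
        return 0.0
    z = gamma + delta * math.log(y / (1.0 - y))
    dz_dy = delta / (y * (1.0 - y))  # derivative of Equation (2)
    return std_normal_pdf(z) * abs(dz_dy)


def integrate(f, a, b, steps=20000):
    """Plain midpoint rule, adequate for these smooth densities."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


# Both densities should integrate to approximately one.
print(integrate(lambda y: su_pdf(y, 0.5, 1.5), -200.0, 200.0))
print(integrate(lambda y: sb_pdf(y, 0.5, 1.5), 0.0, 1.0))
```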
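Note that from Equations (1) and (2) it follows that:

$$Y = \sinh\left(\frac{Z-\gamma}{\delta}\right) = \sinh(N) \quad \text{for the } S_U \text{ distribution} \tag{3}$$

$$Y = \frac{\exp\left(\frac{Z-\gamma}{\delta}\right)}{1+\exp\left(\frac{Z-\gamma}{\delta}\right)} = \frac{e^{N}}{1+e^{N}} \quad \text{for the } S_B \text{ distribution} \tag{4}$$
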
where $N = (Z-\gamma)/\delta$ is a normal random variable with mean $-\gamma/\delta$ and variance $1/\delta^2$. The above equations express the $S_U$ and $S_B$ random variables (Y) as a function of a normal, and can be used for simulating draws from their probability distributions. Johnson (1949) also provides the formulas for computing their means and variances, which will be denoted by $F_{SU}$ and $F_{SB}$ (for the means) and $G_{SU}$ and $G_{SB}$ (for the variances). These formulas are simple but lengthy trigonometric functions of $\gamma$ and $\delta$. The skewness and kurtosis of the $S_U$ and $S_B$ distributions are functions of $\gamma$ and $\delta$ as well. All formulas and a Gauss program to compute the first four central moments of both distributions for given values for $\gamma$ and $\delta$ are available from the authors.
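The simulation use of Equations (3) and (4) can be sketched as follows. This is an illustrative Python sketch, not the authors' Gauss program; the function names, seed, and parameter values are assumptions.

```python
import math
import random


def draw_su(gamma, delta, rng):
    """One S_U draw: Y = sinh(N) with N = (Z - gamma) / delta, Equation (3)."""
    n = (rng.gauss(0.0, 1.0) - gamma) / delta
    return math.sinh(n)


def draw_sb(gamma, delta, rng):
    """One S_B draw: Y = e^N / (1 + e^N), Equation (4)."""
    n = (rng.gauss(0.0, 1.0) - gamma) / delta
    return 1.0 / (1.0 + math.exp(-n))  # algebraically equal to e^N / (1 + e^N)


rng = random.Random(2010)  # fixed seed so the run is repeatable
su_sample = [draw_su(0.8, 2.0, rng) for _ in range(5)]
sb_sample = [draw_sb(0.8, 2.0, rng) for _ in range(5)]
print(su_sample)
print(sb_sample)
```

Each draw needs only one standard normal variate, which is what makes the Johnson families convenient for simulation.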
The random variables (Y) corresponding to each of the two distributions given in Equations (3) and (4) are then standardized by subtracting their means and dividing by their standard deviations:

$$Y^{S} = \frac{\sinh(N) - F_{SU}}{G_{SU}^{1/2}} \quad \text{for the } S_U \text{ distribution} \tag{5}$$

$$Y^{S} = \frac{\dfrac{e^{N}}{1+e^{N}} - F_{SB}}{G_{SB}^{1/2}} \quad \text{for the } S_B \text{ distribution} \tag{6}$$

Note that the standardized $S_U$ and $S_B$ variables ($Y^S$) will always have a mean of zero and a variance of one, i.e., the parameters $\gamma$ and $\delta$ no longer affect the mean or the variance of their distributions. However, because standardization only involves subtracting a constant from the original random variables (Y) and dividing the result by a constant, the distributions corresponding to these standardized variables ($Y^S$) can still accommodate the same sets of skewness-kurtosis combinations allowed by the original $S_U$ and $S_B$ families. The final step in the reparameterization process is to expand the $Y^S$ distributions so that, instead of being zero and one, their means and variances can be controlled by parameters or by parametric functions of explanatory variables. This is accomplished by multiplying $Y^S$ by the parameter or parametric function representing the standard deviation and then adding the parametric function representing the mean:

$$Y_t^{F} = \sigma_t Y^{S} + M_t = (Z_t\sigma)\,Y^{S} + (X_t\beta) \tag{7}$$

where $Y^S$ is as defined in Equations (5) and (6) for the $S_U$ and $S_B$ distributions, $t = 1, \ldots, T$ denotes the observations on the explanatory variables, and $Y_t^F$ represents the final random variables of interest. From Equation (7), note that for both reparameterized variables:

$$E[Y_t^{F}] = M_t = X_t\beta \quad \text{and} \quad V[Y_t^{F}] = \sigma_t^2 = (Z_t\sigma)^2 \tag{8}$$

where $X_t$ and $Z_t$ represent vectors of explanatory variables believed to affect the means and variances of the distributions, and $\beta$ and $\sigma$ are conformable parameter vectors. That is, the mean and variance of the reparameterized $S_U$ and $S_B$ random variables ($Y_t^F$) are uniquely controlled by $M_t$ and $\sigma_t^2$, while $\gamma$ and $\delta$ determine their skewness and kurtosis according to the formulas provided by Johnson (1949) for the original $S_U$ and $S_B$ distributions. Therefore, the reparameterized $S_U$-$S_B$ system can accommodate any theoretically possible MVSK combination.
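The standardize-then-rescale steps can be sketched for the $S_U$ case as shown below. The closed forms used for $F_{SU}$ and $G_{SU}$ follow from the moments of $\sinh(N)$ for a normal N; they are worked out here and are not copied from the article. The function names and the target mean and standard deviation (150 and 20) are hypothetical.

```python
import math
import random


def su_mean_var(gamma, delta):
    """Mean (F_SU) and variance (G_SU) of sinh(N), N ~ Normal(-gamma/delta, 1/delta^2)."""
    mu = -gamma / delta
    omega = math.exp(1.0 / delta ** 2)
    f_su = math.sqrt(omega) * math.sinh(mu)
    g_su = 0.5 * (omega - 1.0) * (omega * math.cosh(2.0 * mu) + 1.0)
    return f_su, g_su


def draw_expanded_su(mean_t, sd_t, gamma, delta, rng):
    """One draw of Y_t^F: standardize sinh(N), then rescale and shift."""
    f_su, g_su = su_mean_var(gamma, delta)
    n = rng.gauss(-gamma / delta, 1.0 / delta)
    y_std = (math.sinh(n) - f_su) / math.sqrt(g_su)  # Equation (5)
    return sd_t * y_std + mean_t  # scale by sigma_t, shift by M_t


rng = random.Random(42)
draws = [draw_expanded_su(150.0, 20.0, 0.8, 2.0, rng) for _ in range(20000)]
avg = sum(draws) / len(draws)
var = sum((d - avg) ** 2 for d in draws) / len(draws)
print(avg, math.sqrt(var))  # should land near 150 and 20
```

Changing `gamma` and `delta` alters the skewness and kurtosis of the draws while the sample mean and standard deviation stay near their targets, which is the point of the reparameterization.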
As noted previously, Figure 1 illustrates the SK regions covered by the $S_U$ and $S_B$, as well as three other commonly used distributions. The $S_L$ or Log-normal distribution, which is also a part of the original Johnson system, only spans the curvilinear boundary between the $S_U$ and $S_B$. The Gamma distribution only spans a curvilinear segment on the upper right quadrant of the SK plane as well. Like the $S_L$, the Gamma can be adapted to cover the mirror image of this segment on the upper left quadrant. However, the combinations of SK values allowed by it are still extremely limited.
Although the Beta covers a nonnegligible area of the SK plane, note that the $S_B$ can accommodate all SK combinations allowed by it. In fact, the Beta region is considerably narrower than the $S_B$’s (i.e., the Beta only covers a subset of the SK area spanned by the $S_B$). Thus, in general, one might expect the $S_B$ to be a more applicable model than the Beta. However, because higher-order moments also affect distributional shape, it is possible that, in some applications, a similarly parameterized Beta would provide for a better model than the $S_B$. Another difference that could affect their relative performance in a particular case is the fact that the Beta is a bounded distribution, whereas the $S_B$ is not.
In short, even if the Beta distribution is reparameterized using the previously discussed technique, because it only spans a relatively small subset of the empirically possible SK space, it may not serve as an acceptable model in some cases. However, it is also possible that it would provide for a better model than the $S_B$ and the $S_U$ in other applications. Considering its significant coverage of the SK space and the potential role of the higher-order moments and support characteristics of a distribution on goodness-of-fit, the Beta is selected as the alternative candidate model for the comparative evaluation of the $S_U$-$S_B$ system conducted in this study. The expanded parameterization of the Beta distribution is derived in the following section.
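The limited extent of the Beta region can be illustrated by tracing SK pairs over a grid of shape parameters, in the spirit of the procedure used for Figure 1. The sketch below uses the standard textbook expressions for the skewness and excess kurtosis of a Beta distribution; the grid values and function name are assumptions. The upper limit $K = 1.5S^2$ (the Gamma curve) is a standard result that is not stated in the article.

```python
import math


def beta_sk(a, b):
    """Skewness and excess kurtosis of a Beta(a, b) distribution."""
    s = a + b
    skew = 2.0 * (b - a) * math.sqrt(s + 1.0) / ((s + 2.0) * math.sqrt(a * b))
    num = 6.0 * ((a - b) ** 2 * (s + 1.0) - a * b * (s + 2.0))
    den = a * b * (s + 2.0) * (s + 3.0)
    return skew, num / den


grid = [0.2, 0.5, 1.0, 2.0, 5.0, 20.0, 100.0]  # illustrative shape values
for a in grid:
    for b in grid:
        skew, kurt = beta_sk(a, b)
        lower = skew ** 2 - 2.0   # feasibility boundary K = S^2 - 2
        upper = 1.5 * skew ** 2   # Gamma curve
        print(a, b, round(skew, 3), round(kurt, 3), lower < kurt < upper)
```

Every grid point falls strictly between the two curves, so a target (S, K) pair above the Gamma curve cannot be reached by any Beta parameterization.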
The Expanded Beta Distribution
An expanded parameterization of the Beta distribution that can accommodate any mean and variance in conjunction with all SK combinations allowed by the original Beta is obtained by applying the same technique used for the $S_U$ and $S_B$. In the case of the Beta distribution:

$$E[Y] = \frac{\delta}{\delta+\lambda} = F_B \quad \text{and} \quad V[Y] = \frac{\delta\lambda}{(\delta+\lambda+1)(\delta+\lambda)^2} = G_B \tag{9}$$

Thus, the transformation from the original Beta-distributed variable (Y) into the random variable exhibiting the expanded Beta distribution ($Y_t^F$) is:

$$Y_t^{F} = \sigma_t\,(Y - F_B)/G_B^{1/2} + M_t \tag{10}$$

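A minimal simulation sketch of Equations (9) and (10) follows. It assumes Python's standard `random.betavariate` as the source of Beta draws; the function names and the target mean and standard deviation (150 and 20) are hypothetical.

```python
import math
import random


def beta_mean_var(delta, lam):
    """F_B and G_B from Equation (9)."""
    f_b = delta / (delta + lam)
    g_b = delta * lam / ((delta + lam + 1.0) * (delta + lam) ** 2)
    return f_b, g_b


def draw_expanded_beta(mean_t, sd_t, delta, lam, rng):
    """One draw of Y_t^F from Equation (10)."""
    f_b, g_b = beta_mean_var(delta, lam)
    y = rng.betavariate(delta, lam)
    return sd_t * (y - f_b) / math.sqrt(g_b) + mean_t


rng = random.Random(11)
draws = [draw_expanded_beta(150.0, 20.0, 5.0, 2.0, rng) for _ in range(20000)]
avg = sum(draws) / len(draws)
print(avg, min(draws), max(draws))
```

Because Y lies in (0, 1), every draw falls between $M_t - \sigma_t F_B/G_B^{1/2}$ and $M_t + \sigma_t (1 - F_B)/G_B^{1/2}$, so the expanded Beta keeps a bounded support whose location depends on all four parameters.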
The pdf for $Y_t^F$ is obtained through a straightforward application of the transformation technique, which leads to the following log-likelihood function:

$$LL_B = \sum_{t=1}^{T} \ln\sqrt{\frac{G_B}{\sigma_t^2}} + n\ln\Gamma(\delta+\lambda) - n\ln\Gamma(\delta) - n\ln\Gamma(\lambda) + (\delta-1)\left(\sum_{t=1}^{T}\ln P_t\right) + (\lambda-1)\left(\sum_{t=1}^{T}\ln(1-P_t)\right), \tag{11}$$

where $P_t = \dfrac{\delta}{\delta+\lambda} + (Y_t^F - M_t)\sqrt{G_B/\sigma_t^2}$ and $\Gamma$ represents the Gamma function. As in the case of the expanded $S_U$ and $S_B$, $E[Y_t^F] = M_t$ and $V[Y_t^F] = \sigma_t^2$, and $M_t$ and $\sigma_t^2$ can be specified as linear functions of relevant explanatory variables.
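The expanded Beta log-likelihood can be evaluated as in the sketch below, which is meant as an illustration rather than the authors' estimation code. It assumes the Jacobian term is $G_B^{1/2}/\sigma_t$, as implied by inverting Equation (10), and it returns negative infinity when an observation falls outside the implied support. The function name and argument layout are hypothetical.

```python
import math


def expanded_beta_loglik(y, means, sds, delta, lam):
    """Log-likelihood of observations y under the expanded Beta model.

    y, means, sds are equal-length sequences holding Y_t^F, M_t and sigma_t.
    """
    f_b = delta / (delta + lam)
    g_b = delta * lam / ((delta + lam + 1.0) * (delta + lam) ** 2)
    # Log of the Beta normalizing constant, via the log-Gamma function.
    log_norm = math.lgamma(delta + lam) - math.lgamma(delta) - math.lgamma(lam)
    total = 0.0
    for y_t, m_t, s_t in zip(y, means, sds):
        scale = math.sqrt(g_b) / s_t          # Jacobian of the inverse map
        p_t = f_b + (y_t - m_t) * scale       # original Beta variable
        if not 0.0 < p_t < 1.0:
            return float("-inf")              # outside the bounded support
        total += (math.log(scale) + log_norm
                  + (delta - 1.0) * math.log(p_t)
                  + (lam - 1.0) * math.log(1.0 - p_t))
    return total


# With M_t = F_B and sigma_t = sqrt(G_B) the model collapses to a plain Beta(2, 2).
print(expanded_beta_loglik([0.5], [0.5], [math.sqrt(0.05)], 2.0, 2.0))
```

In practice $M_t$ and $\sigma_t$ would be built from $X_t\beta$ and $Z_t\sigma$, and the function would be passed to a numerical optimizer.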
Estimation of the $S_U$-$S_B$ System
Estimation of the $S_U$-$S_B$ system can also be accomplished by maximum likelihood. Since both originate from normal random variables (N), the transformation technique (Mood, Graybill, and Boes, 1974) can be applied to derive their probability distribution functions. According to this technique, the pdf of the transformed random variable ($Y_t^F$) is given by:

$$P(Y_t^{F}) = \left|\frac{\partial\, q^{-1}(Y_t^{F})}{\partial Y_t^{F}}\right| P\big(q^{-1}(Y_t^{F})\big) = J(Y_t^{F})\, P\big(q^{-1}(Y_t^{F})\big), \tag{12}$$

where $q^{-1}(Y_t^F)$ is the inverse of the transformation of N into $Y_t^F$ (i.e., the function relating N to $Y_t^F$), $P(q^{-1}(Y_t^F))$ is the pdf of an independently and identically distributed normal random variable N with mean $-\gamma/\delta$ and variance $\delta^{-2}$ evaluated at $q^{-1}(Y_t^F)$, and $J(Y_t^F)$ is the Jacobian of the transformation, that is, the derivative of $q^{-1}(Y_t^F)$ with respect to $Y_t^F$ (in absolute value).
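For the $S_U$ case, Equation (12) can be evaluated as in the sketch below. The inverse map is obtained by inverting Equations (5) and (7), and the derivative of the inverse hyperbolic sine is worked out by hand here rather than quoted from the article. The closed forms for $F_{SU}$ and $G_{SU}$ are the standard moments of $\sinh(N)$. All function names and parameter values are assumptions.

```python
import math


def su_mean_var(gamma, delta):
    """Mean and variance of sinh(N), N ~ Normal(-gamma/delta, 1/delta^2)."""
    mu = -gamma / delta
    omega = math.exp(1.0 / delta ** 2)
    f_su = math.sqrt(omega) * math.sinh(mu)
    g_su = 0.5 * (omega - 1.0) * (omega * math.cosh(2.0 * mu) + 1.0)
    return f_su, g_su


def expanded_su_pdf(y, mean_t, sd_t, gamma, delta):
    """Density of Y_t^F: Jacobian times the normal pdf at the inverse map."""
    f_su, g_su = su_mean_var(gamma, delta)
    root_g = math.sqrt(g_su)
    r = (y - mean_t) * root_g / sd_t + f_su   # argument of the inverse sinh
    n = math.asinh(r)                          # N recovered from Y_t^F
    jac = abs(root_g / (sd_t * math.sqrt(1.0 + r * r)))
    mu, s = -gamma / delta, 1.0 / delta
    normal_pdf = math.exp(-0.5 * ((n - mu) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
    return jac * normal_pdf


def integrate(f, a, b, steps=40000):
    """Plain midpoint rule, used only as a numerical check."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


pdf = lambda y: expanded_su_pdf(y, 150.0, 20.0, 0.8, 2.0)
print(integrate(pdf, -650.0, 950.0))                   # close to 1
print(integrate(lambda y: y * pdf(y), -650.0, 950.0))  # close to the target mean
```

Summing the logarithm of this density over observations gives a log-likelihood that could be maximized over the mean, variance, and shape parameters.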
Specifically, for the $S_U$, $N = q_{SU}^{-1}(Y_t^F)$ is found by substituting Equation (5) into Equation (7) and solving for N:

$$N = q_{SU}^{-1}(Y_t^{F}) = \sinh^{-1}\{R_{SUt}\}, \quad \text{where} \quad R_{SUt} = \frac{(Y_t^{F} - X_t\beta)\,G_{SU}^{1/2}}{Z_t\sigma} + F_{SU}, \tag{13}$$

and $F_{SU}$ and $G_{SU}$ are as defined previously. The Jacobian is obtained by taking the absolute value of the derivative of Equation (13) with respect to $Y_t^F$, which yields:
References
Mood, Graybill, and Boes (1974). Introduction to the Theory of Statistics.
Yatchew (1998). Nonparametric Regression Techniques in Economics.