# A Flexible Parametric Family for the Modeling and Simulation of Yield Distributions

Abstract: The distributions currently used to model and simulate crop yields are unable to accommodate a substantial subset of the theoretically feasible mean-variance-skewness-kurtosis (MVSK) hyperspace. Because these first four central moments are key determinants of shape, the available distributions might not be capable of adequately modeling all yield distributions that could be encountered in practice. This study introduces a system of distributions that can span the entire MVSK space and assesses its potential to serve as a more comprehensive parametric crop yield model, improving the breadth of distributional choices available to researchers and the likelihood of formulating proper parametric models.

## Summary

### FSB

- Because standardization only involves subtracting a constant from the original random variables ($Y$) and dividing them by a constant, the distributions corresponding to these standardized variables ($Y^S$) can still accommodate the same sets of skewness-kurtosis combinations allowed by the original SU and SB families.
- From Equation (7), note that for both reparameterized variables: (8) $E[Y_t^F] = M_t = X_t\beta$ and $V[Y_t^F] = \sigma_t^2 = (Z_t\sigma)^2$, where $X_t$ and $Z_t$ represent vectors of explanatory variables believed to affect the means and variances of the distributions, and $\beta$ and $\sigma$ are conformable parameter vectors.
- The Gamma distribution only spans a curvilinear segment on the upper right quadrant of the SK plane as well.
- Note that the SB can accommodate all SK combinations allowed by the Beta.
- Because higher-order moments also affect distributional shape, it is possible that, in some applications, a similarly parameterized Beta would provide a better model than the SB.
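
The standardization and reparameterization described above can be illustrated with a minimal simulation sketch for the SU case. It assumes the reparameterized variable takes the form $Y^F = M + \sigma (Y - F_{SU}) / \sqrt{G_{SU}}$, which is inferred from the bullets above because Equation (7) itself is not reproduced here. The mean and variance expressions are the standard Johnson SU moment formulas, and the function names are hypothetical.

```python
import math
import random

def su_mean_var(gamma, delta):
    """Mean (F_SU) and variance (G_SU) of the original two-parameter SU variable."""
    w = math.exp(delta ** -2)
    mean = -math.sqrt(w) * math.sinh(gamma / delta)
    var = 0.5 * (w - 1.0) * (w * math.cosh(2.0 * gamma / delta) + 1.0)
    return mean, var

def simulate_su_yields(n, m_t, sigma_t, gamma, delta, seed=0):
    """Draw n yields with mean m_t and standard deviation sigma_t.

    Assumed form (not taken verbatim from the paper):
    Y_F = m_t + sigma_t * (Y - F_SU) / sqrt(G_SU),
    where Y = sinh((Z - gamma) / delta) and Z is standard normal.
    """
    rng = random.Random(seed)
    f_su, g_su = su_mean_var(gamma, delta)
    draws = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        y = math.sinh((z - gamma) / delta)       # invert Z = gamma + delta * asinh(Y)
        y_s = (y - f_su) / math.sqrt(g_su)       # standardized: mean 0, variance 1
        draws.append(m_t + sigma_t * y_s)        # rescaled to the target mean and variance
    return draws
```

In this sketch `gamma` and `delta` only shift the skewness and kurtosis of the draws, while `m_t` and `sigma_t` set the mean and standard deviation, which mirrors the separation of roles stated in Equation (8).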

### Estimation of the SU-SB System

- Estimation of the SU-SB system can also be accomplished by maximum likelihood.
- Since both originate from normal random variables (N), the transformation technique (Mood, Graybill, and Boes 1974) can be applied to derive their probability distribution functions.
- The SU pdf is then obtained by substituting Equation (13) into a normal density with mean $(-\gamma/\delta)$ and variance $\delta^{-2}$ and premultiplying the result by Equation (14).
- These programs can be easily translated into MATLAB or SAS/IML.
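
As a rough illustration of the transformation technique and the likelihood it produces, the sketch below writes out a log-density for a reparameterized SU variable and sums it over a sample. It is not the authors' program. It assumes the same inferred form $Y^F = M + \sigma (Y - F_{SU}) / \sqrt{G_{SU}}$ with a constant mean and standard deviation, uses the standard Johnson SU moment formulas, and all function names are hypothetical.

```python
import math

def su_mean_var(gamma, delta):
    """Mean (F_SU) and variance (G_SU) of the original two-parameter SU variable."""
    w = math.exp(delta ** -2)
    mean = -math.sqrt(w) * math.sinh(gamma / delta)
    var = 0.5 * (w - 1.0) * (w * math.cosh(2.0 * gamma / delta) + 1.0)
    return mean, var

def su_log_density(y_f, m_t, sigma_t, gamma, delta):
    """Log-pdf of the reparameterized SU variable (requires sigma_t > 0, delta > 0)."""
    f_su, g_su = su_mean_var(gamma, delta)
    scale = math.sqrt(g_su) / sigma_t
    y = f_su + scale * (y_f - m_t)               # map back to the original SU variable
    z = gamma + delta * math.asinh(y)            # Equation (1): Z is standard normal
    log_phi = -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)
    # Jacobian of the change of variables: dZ/dY * dY/dY_F
    log_jac = math.log(delta) - 0.5 * math.log1p(y * y) + math.log(scale)
    return log_phi + log_jac

def su_log_likelihood(yields, m_t, sigma_t, gamma, delta):
    """Sum of log-densities over a sample with constant mean and variance."""
    return sum(su_log_density(y, m_t, sigma_t, gamma, delta) for y in yields)
```

A function like `su_log_likelihood` could be handed to any general-purpose numerical optimizer to obtain maximum likelihood estimates. Replacing the constants `m_t` and `sigma_t` with $X_t\beta$ and $Z_t\sigma$ would be the natural extension suggested by Equation (8).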

### Validating the Proposed Theoretical Framework

- Ramirez and McDonald (2006) suggest that MVSK space coverage is key to a model’s flexibility to adequately represent a wide range of distributions.
- In the case of the seven sets of models corresponding to the SB-generated datasets (Table 2), the maximum log-likelihood function values (MLLFVs) of the Beta models are relatively close to those of the SB models, with the differences averaging 1.02 units.
- The “true” cdfs are also plotted using the correct distribution and the exact parameters underlying each of the 21 data-generating processes.
- In addition, the results indicate that the SB is a generally better alternative than the Beta for underlying distributions with SK values on the surrounding regions of the SK space because the Beta can only partially cover these remaining regions.
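
To see where a given pair of shape parameters lands on the SK plane, one option is to compute the skewness and kurtosis numerically. The sketch below does this for the SU family by trapezoidal integration over the underlying standard normal variable. It is an illustrative substitute for Johnson's closed-form formulas, not a reproduction of them, and the function name and integration settings are assumptions.

```python
import math

def su_skew_kurt(gamma, delta, step=0.01):
    """Skewness and (non-excess) kurtosis of Y = sinh((Z - gamma) / delta), Z ~ N(0, 1).

    Raw moments are approximated with the trapezoid rule over z. The range is
    widened for small delta because the fourth-moment integrand peaks near
    |z| = 4 / delta.
    """
    z_max = 10.0 + 4.0 / delta
    n = int(round(2.0 * z_max / step))
    raw = [0.0, 0.0, 0.0, 0.0]
    for i in range(n + 1):
        z = -z_max + i * step
        weight = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi) * step
        if i == 0 or i == n:
            weight *= 0.5                         # trapezoid end-point correction
        y = math.sinh((z - gamma) / delta)
        for k in range(4):
            raw[k] += weight * y ** (k + 1)
    m1, m2, m3, m4 = raw
    var = m2 - m1 ** 2
    mu3 = m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3
    mu4 = m4 - 4.0 * m1 * m3 + 6.0 * m1 ** 2 * m2 - 3.0 * m1 ** 4
    return mu3 / var ** 1.5, mu4 / var ** 2
```

Any valid distribution must satisfy kurtosis ≥ skewness² + 1, so the returned pair can be checked against that bound. Sweeping `gamma` and `delta` over a grid gives a rough picture of the region of the SK plane that the SU family covers.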

### Economic Relevance

- A final issue of interest is the economic relevance of using a more suitable probability distribution model for risk management decisions.
- In the case of farm E, for example, the true underlying distribution is Beta and, therefore, the farm manager should choose 66%, 77%, or 88% coverage levels depending on his/her risk tolerance.
- Differences of this magnitude are more likely to cause incorrect coverage selection in some cases and could therefore be considered somewhat important from an economic standpoint.
- In the case of the Beta, the average of the absolute differences is 6.1%.
- If the true distribution underlying the yield data is Beta and management decisions are made on the basis of an estimated SB model, the degree of error and its economic implications are relatively minor.

### Conclusion

- Following the recommended strategy of always considering the SU and SB distributions as potential candidate models could substantially reduce the specification error risk that has long been associated with parametric methods, perhaps to an acceptable level in most applications.
- It is also recognized that the relative complexity of the proposed family versus the most commonly used alternatives could affect its widespread applicability.
- Thus, econometric modeling allowances must be made when working with data discontinuities such as censored yield observations due to droughts or flooding.
- It is also recognized that a statistically reliable use of the procedures discussed in this article requires at least moderate sample sizes (30–50 observations), which are often not available at the individual farm level.

##### Citations

### Cites background or methods from "A Flexible Parametric Family for th..."

...Five of the distributions estimated by Ramirez, McDonald, and Carpio (2010) are chosen for the purposes of this research....

[...]

...Ramirez, McDonald, and Carpio (2010) use this data to estimate models for those 26 yield distributions that are as realistic as possible....

[...]

...This system, which is composed of the SU and the SB families (Johnson 1949), can accommodate any mean-variance-skewness-kurtosis (MVSK) combination that might be encountered in practice (Ramirez, McDonald, and Carpio 2010, Ramirez and McDonald 2006a, Ramirez, Misra, and Field 2003)....

[...]

...Another advantage of using Ramirez, McDonald, and Carpio (2010) results is that they identify a variety of distributional shapes that span over a substantial area of the theoretically feasible skewness-kurtosis (SK) space. A thoughtfully selected subset of these 26 models should, therefore, be…...

[...]

...In this approach, the previously discussed Johnson system and the data corresponding to each unit of analysis are used to estimate the joint yield distributions by maximum likelihood (ML) estimation procedures (Ramirez 1997, Ramirez, McDonald, and Carpio 2010)....

[...]

##### References

### "A Flexible Parametric Family for th..." refers background or methods in this paper

...Figure 1 is constructed on the basis of the formulas for the skewness and kurtosis of the SU and SB distributions, which were also first derived by Johnson (1949)....

[...]

...That is, the mean and variance of the reparameterized SU and SB random variables ($Y_t^F$) are uniquely controlled by $M_t$ and $\sigma_t^2$, while $\gamma$ and $\delta$ determine their skewness and kurtosis according to the formulas provided by Johnson (1949) for the original SU and SB distributions....

[...]

...The reparameterization begins with the original two-parameter families (Johnson, 1949): (1) $Z = \gamma + \delta \sinh^{-1}(Y)$ for the SU distribution and (2) $Z = \gamma + \delta \ln[Y/(1 - Y)]$ for the SB distribution, where $Y$ is a nonnormally distributed random variable based on a standard normal variable ($Z$)....

[...]

...Johnson (1949) also provides the formulas for computing their means and variances, which will be denoted by FSU and FSB (for the means) and GSU and GSB (for the variances)....

[...]

...Their pdfs, which are also provided by Johnson (1949), are obtained by substituting Z in Equation (1) (for the SU) or Equation (2) (for the SB) into a standard normal density and multiplying the resulting equation by the derivative of Equation (1) (for the SU) or Equation (2) (for the SB) with…...

[...]

### "A Flexible Parametric Family for th..." refers background or methods in this paper

...According to basic statistical theory (Mood, Graybill, and Boes, 1974), the first four central moments of a pdf are the main descriptors of its shape....

[...]

...Since both originate from normal random variables (N), the transformation technique (Mood, Graybill, and Boes 1974) can be applied to derive their probability distribution functions....

[...]

### "A Flexible Parametric Family for th..." refers background in this paper

...Other authors cite theoretical complexity and intensive computational requirements as another disadvantage of nonparametric procedures (Yatchew, 1998)....

[...]