Journal ArticleDOI

The Need for More Emphasis on Prediction: A “Nondenominational” Model-Based Approach

20 May 2014-The American Statistician (Taylor & Francis)-Vol. 68, Iss: 2, pp 71-83
TL;DR: It is argued that the performance of a prediction procedure in repeated application is important and should play a significant role in its evaluation.
Abstract: Prediction problems are ubiquitous. In a model-based approach to predictive inference, the values of random variables that are presently observable are used to make inferences about the values of random variables that will become observable in the future, and the joint distribution of the random variables or various of its characteristics are assumed to be known up to the value of a vector of unknown parameters. Such an approach has proved to be highly effective in many important applications. This article argues that the performance of a prediction procedure in repeated application is important and should play a significant role in its evaluation. A “nondenominational” model-based approach to predictive inference is described and discussed; what in a Bayesian approach would be regarded as a prior distribution is simply regarded as part of a model that is hierarchical in nature. Some specifics are given for mixed-effects linear models, and an application to the prediction of the outcomes of basketball or ...
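The mixed-effects prediction the abstract refers to can be made concrete with a toy sketch. The following is a minimal, hypothetical example (not the paper's own code) of the best linear unbiased predictor (BLUP) of a group-level random effect in a one-way random effects model, with the variance components assumed known for simplicity; all names and numbers are illustrative.

```python
import random
import statistics

# One-way random effects model: y_ij = mu + w_i + e_ij,
# with w_i ~ N(0, s2_w) and e_ij ~ N(0, s2_e).  The BLUP of w_i
# shrinks the group-mean deviation toward zero by a factor that
# depends on the variance components (taken as known here).
def blup_random_effect(group_obs, mu, s2_w, s2_e):
    n = len(group_obs)
    shrink = s2_w / (s2_w + s2_e / n)   # in (0, 1): more data, less shrinkage
    return shrink * (statistics.mean(group_obs) - mu)

random.seed(0)
mu, s2_w, s2_e = 10.0, 4.0, 1.0
w = random.gauss(0, s2_w ** 0.5)        # true (unobservable) random effect
obs = [mu + w + random.gauss(0, s2_e ** 0.5) for _ in range(8)]
w_hat = blup_random_effect(obs, mu, s2_w, s2_e)
print(round(w_hat, 3))
```

In repeated application, this shrinkage is exactly what gives the predictor its good frequentist performance: groups with little data are pulled strongly toward the overall mean.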
Citations
Journal ArticleDOI
TL;DR: This is an exciting time to be a statistician: the contribution of the discipline of statistics to scientific knowledge is widely recognized (McNutt 2014), and public perception of the field is increasingly positive.
Abstract: This is an exciting time to be a statistician. The contribution of the discipline of statistics to scientific knowledge is widely recognized (McNutt 2014) with increasingly positive public percepti...

68 citations


Cites background from "The Need for More Emphasis on Predi..."

  • ...…technical areas of the field of statistics [Finzer, 2013] The data science education dilemma [Spiegelhalter, 2014] The future lies in uncertainty [Harville, 2014] The need for more emphasis on prediction: A ‘nondenominational’ model-based approach [Madigan and Wasserstein, 2014] Statistics and…...


Journal ArticleDOI
TL;DR: The authors describe the contributions of analytics and statistical methods to our understanding of insurance operations and markets, review current trends in analytics, and present the foundations of the discipline along with its supporting literature.

27 citations

Journal ArticleDOI
TL;DR: In this analysis, the nonparametric model outperforms the parametric model in predicting costs of both renewal and new business, particularly important as healthcare costs rise around the world.
Abstract: Models commonly employed to fit current claims data and predict future claims are often parametric and relatively inflexible. An incorrect model assumption can cause model misspecification which leads to reduced profits at best and dangerous, unanticipated risk exposure at worst. Even mixture models may not be sufficiently flexible to properly fit the data. Using a Bayesian nonparametric model instead can dramatically improve claim predictions and consequently risk management decisions in group health practices. The improvement is significant in both simulated and real data from a major health insurer’s medium-sized groups. The nonparametric method outperforms a similar Bayesian parametric model, especially when predicting future claims for new business (entire groups not in the previous year’s data). In our analysis, the nonparametric model outperforms the parametric model in predicting costs of both renewal and new business. This is particularly important as healthcare costs rise around the world.
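For readers unfamiliar with the Bayesian nonparametric machinery alluded to here, models of this kind typically rest on a Dirichlet process, which can be sketched via its stick-breaking construction. This is an illustrative sketch only; the paper's actual claims model is considerably richer, and the function name and settings below are assumptions.

```python
import random

# Stick-breaking (Sethuraman) construction of Dirichlet-process
# weights: v_k ~ Beta(1, alpha), pi_k = v_k * prod_{j<k} (1 - v_j).
# The weights sum to (almost) 1 as the number of sticks grows,
# giving a random discrete mixing distribution.
def stick_breaking(alpha, n_sticks, rng):
    weights, remaining = [], 1.0
    for _ in range(n_sticks):
        v = rng.betavariate(1.0, alpha)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights

rng = random.Random(1)
pi = stick_breaking(alpha=2.0, n_sticks=200, rng=rng)
print(round(sum(pi), 6))   # close to 1 for this many sticks
```

The flexibility claimed in the abstract comes from letting the data determine how many of these mixture components carry appreciable weight, rather than fixing a parametric family in advance.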

23 citations


Cites background from "The Need for More Emphasis on Predi..."

  • ...The importance of proper prediction is exemplified and described in both Klinker (2010) and Harville (2014)....


Journal ArticleDOI
TL;DR: Results show that the proposed multi-trait, multi-environment model is an attractive alternative for modeling multiple count traits measured in multiple environments.
Abstract: When a plant scientist wishes to make genomic-enabled predictions of multiple traits measured in multiple individuals in multiple environments, the most common strategy for performing the analysis is to use a single trait at a time taking into account genotype × environment interaction (G × E), because there is a lack of comprehensive models that simultaneously take into account the correlated counting traits and G × E. For this reason, in this study we propose a multiple-trait and multiple-environment model for count data. The proposed model was developed under the Bayesian paradigm for which we developed a Markov Chain Monte Carlo (MCMC) with noninformative priors. This allows obtaining all required full conditional distributions of the parameters leading to an exact Gibbs sampler for the posterior distribution. Our model was tested with simulated data and a real data set. Results show that the proposed multi-trait, multi-environment model is an attractive alternative for modeling multiple count traits measured in multiple environments.
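The exact Gibbs sampler mentioned in the abstract alternates draws from full conditional distributions. A toy two-block Gibbs sampler for a simple normal model (not the paper's multi-trait count model) illustrates the mechanics; the priors, names, and settings below are illustrative assumptions only.

```python
import random

# Two-block Gibbs sampler for y_i ~ N(mu, 1/tau) with semi-conjugate
# priors mu ~ N(0, 100) and tau ~ Gamma(1, 1).  Each sweep draws mu
# from its full conditional given tau, then tau given mu -- the same
# alternating scheme, in miniature, as the sampler the paper derives.
def gibbs(y, n_iter, rng):
    n, mu, tau = len(y), 0.0, 1.0
    draws = []
    for _ in range(n_iter):
        prec = tau * n + 1.0 / 100.0          # posterior precision of mu
        mean = tau * sum(y) / prec
        mu = rng.gauss(mean, prec ** -0.5)
        ss = sum((yi - mu) ** 2 for yi in y)
        # gammavariate takes (shape, scale); rate = 1 + ss/2
        tau = rng.gammavariate(1.0 + n / 2, 1.0 / (1.0 + 0.5 * ss))
        draws.append((mu, tau))
    return draws

rng = random.Random(42)
y = [rng.gauss(5.0, 1.0) for _ in range(50)]
draws = gibbs(y, 2000, rng)
post_mu = sum(m for m, _ in draws[500:]) / len(draws[500:])
print(round(post_mu, 2))   # near the sample mean of y
```

With noninformative or weak priors, as in the paper, the posterior mean tracks the data closely; the value of the Gibbs construction is that the same alternating logic scales to models with many correlated blocks of parameters.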

17 citations


Cites background or methods from "The Need for More Emphasis on Predi..."

  • ...…quantities that are regarded as unobservable at the time of the inferences; the focus is on quantities that will, or could, become observable in the future, or that can be observed at the time of the inferences but only with an unacceptable delay or with undue effort or expense (Harville 2014)....


  • ...These quantities are regarded as the realizations of the elements of an unobservable random vector, say an M-dimensional unobservable random column vector w, and/or as realizations of linear combinations (or other functions) of the elements of w (Harville 2014)....


  • ...Since prediction problems are ubiquitous and of great interest and importance in statistical science, more attention has been given to parametric inference than to predictive inference (Harville 2014)....


  • ...to prediction is a useful tool for predicting future observations, and many linear mixed effects models have been developed for predicting future observations (Harville 2014)....


Posted Content
TL;DR: This paper reviews two main types of prediction interval methods under a parametric framework: methods based on an (approximate) pivotal quantity and those based on a predictive distribution (sometimes derived based on the likelihood).
Abstract: This paper reviews two main types of prediction interval methods under a parametric framework. First, we describe methods based on an (approximate) pivotal quantity. Examples include the plug-in, pivotal, and calibration methods. Then we describe methods based on a predictive distribution (sometimes derived based on the likelihood). Examples include Bayesian, fiducial, and direct-bootstrap methods. Several examples involving continuous distributions along with simulation studies to evaluate coverage probability properties are provided. We provide specific connections among different prediction interval methods for the (log-)location-scale family of distributions. This paper also discusses general prediction interval methods for discrete data, using the binomial and Poisson distributions as examples. We also overview methods for dependent data, with application to time series, spatial data, and Markov random fields, for example.
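Two of the interval types the review contrasts can be made concrete for a single future draw from a normal sample: the plug-in interval, which ignores parameter-estimation uncertainty, and the exact pivotal interval based on (Y - x̄)/(s·sqrt(1 + 1/n)) ~ t_{n-1}. This is a sketch under those assumptions, not code from the paper; the data are made up, and the two quantiles are standard table values.

```python
import math
import statistics

# Plug-in vs. pivotal 95% prediction intervals for one future normal
# observation.  The plug-in interval treats (xbar, s) as the true
# (mu, sigma); the pivotal interval widens by the t quantile and the
# sqrt(1 + 1/n) factor to account for estimation uncertainty.
def prediction_intervals(sample, z, t):
    n = len(sample)
    xbar = statistics.mean(sample)
    s = statistics.stdev(sample)
    plug_in = (xbar - z * s, xbar + z * s)
    half = t * s * math.sqrt(1.0 + 1.0 / n)
    pivotal = (xbar - half, xbar + half)
    return plug_in, pivotal

sample = [4.1, 5.3, 4.8, 5.9, 4.4, 5.1, 4.7, 5.5, 4.9, 5.2,
          4.6, 5.0, 4.3, 5.4, 4.5, 5.6, 4.2, 5.7, 4.0, 5.8]
z_975 = 1.960        # standard normal 97.5% quantile
t_975_19 = 2.093     # t quantile with n - 1 = 19 degrees of freedom
plug_in, pivotal = prediction_intervals(sample, z_975, t_975_19)
print(plug_in, pivotal)
```

The pivotal interval is always wider; the review's point is that the plug-in interval's coverage probability falls below its nominal level in small samples for exactly this reason.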

10 citations


Cites background from "The Need for More Emphasis on Predi..."

  • ...While most statistics textbooks and courses emphasize the explanatory and descriptive roles of statistics, the topic of statistical prediction often receives less attention, despite its practical importance, as noted in recent commentaries (cf. Shmueli 2010 and Harville 2014)....


References
Journal ArticleDOI
TL;DR: A review of N. Cressie's Statistics for Spatial Data (Wiley, 1991), a comprehensive reference for the statistical analysis of spatial data.
Abstract: 5. Statistics for Spatial Data. By N. Cressie. ISBN 0 471 84336 9. Wiley, Chichester, 1991. 900 pp. £71.00.

5,555 citations

Book
02 Sep 2011
TL;DR: This book treats best linear prediction (BLP) for random fields: the role of asymptotics for BLPs, applications of the equivalence and orthogonality of Gaussian measures to linear prediction, and prediction from observations that are not part of a sequence.
Abstract: 1 Linear Prediction.- 1.1 Introduction.- 1.2 Best linear prediction.- Exercises.- 1.3 Hilbert spaces and prediction.- Exercises.- 1.4 An example of a poor BLP.- Exercises.- 1.5 Best linear unbiased prediction.- Exercises.- 1.6 Some recurring themes.- The Matern model.- BLPs and BLUPs.- Inference for differentiable random fields.- Nested models are not tenable.- 1.7 Summary of practical suggestions.- 2 Properties of Random Fields.- 2.1 Preliminaries.- Stationarity.- Isotropy.- Exercise.- 2.2 The turning bands method.- Exercise.- 2.3 Elementary properties of autocovariance functions.- Exercise.- 2.4 Mean square continuity and differentiability.- Exercises.- 2.5 Spectral methods.- Spectral representation of a random field.- Bochner's Theorem.- Exercises.- 2.6 Two corresponding Hilbert spaces.- An application to mean square differentiability.- Exercises.- 2.7 Examples of spectral densities on ℝ.- Rational spectral densities.- Principal irregular term.- Gaussian model.- Triangular autocovariance functions.- Matern class.- Exercises.- 2.8 Abelian and Tauberian theorems.- Exercises.- 2.9 Random fields with nonintegrable spectral densities.- Intrinsic random functions.- Semivariograms.- Generalized random fields.- Exercises.- 2.10 Isotropic autocovariance functions.- Characterization.- Lower bound on isotropic autocorrelation functions.- Inversion formula.- Smoothness properties.- Matern class.- Spherical model.- Exercises.- 2.11 Tensor product autocovariances.- Exercises.- 3 Asymptotic Properties of Linear Predictors.- 3.1 Introduction.- 3.2 Finite sample results.- Exercise.- 3.3 The role of asymptotics.- 3.4 Behavior of prediction errors in the frequency domain.- Some examples.- Relationship to filtering theory.- Exercises.- 3.5 Prediction with the wrong spectral density.- Examples of interpolation.- An example with a triangular autocovariance function.- More criticism of Gaussian autocovariance functions.- Examples of extrapolation.- Pseudo-BLPs with
spectral densities misspecified at high frequencies.- Exercises.- 3.6 Theoretical comparison of extrapolation and interpolation.- An interpolation problem.- An extrapolation problem.- Asymptotics for BLPs.- Inefficiency of pseudo-BLPs with misspecified high frequency behavior.- Presumed mses for pseudo-BLPs with misspecified high frequency behavior.- Pseudo-BLPs with correctly specified high frequency behavior.- Exercises.- 3.7 Measurement errors.- Some asymptotic theory.- Exercises.- 3.8 Observations on an infinite lattice.- Characterizing the BLP.- Bound on fraction of mse of BLP attributable to a set of frequencies.- Asymptotic optimality of pseudo-BLPs.- Rates of convergence to optimality.- Pseudo-BLPs with a misspecified mean function.- Exercises.- 4 Equivalence of Gaussian Measures and Prediction.- 4.1 Introduction.- 4.2 Equivalence and orthogonality of Gaussian measures.- Conditions for orthogonality.- Gaussian measures are equivalent or orthogonal.- Determining equivalence or orthogonality for periodic random fields.- Determining equivalence or orthogonality for nonperiodic random fields.- Measurement errors and equivalence and orthogonality.- Proof of Theorem 1.- Exercises.- 4.3 Applications of equivalence of Gaussian measures to linear prediction.- Asymptotically optimal pseudo-BLPs.- Observations not part of a sequence.- A theorem of Blackwell and Dubins.- Weaker conditions for asymptotic optimality of pseudo-BLPs.- Rates of convergence to asymptotic optimality.- Asymptotic optimality of BLUPs.- Exercises.- 4.4 Jeffreys's law.- A Bayesian version.- Exercises.- 5 Integration of Random Fields.- 5.1 Introduction.- 5.2 Asymptotic properties of simple average.- Results for sufficiently smooth random fields.- Results for sufficiently rough random fields.- Exercises.- 5.3 Observations on an infinite lattice.- Asymptotic mse of BLP.- Asymptotic optimality of simple average.- Exercises.- 5.4 Improving on the sample mean.- Approximating $$\int_0^1 \exp(ivt)\,dt$$.- Approximating $$\int_{[0,1]^d} \exp(i\omega^T x)\,dx$$ in more than one dimension.- Asymptotic properties of modified predictors.- Are centered systematic samples good designs?.- Exercises.- 5.5 Numerical results.- Exercises.- 6 Predicting With Estimated Parameters.- 6.1 Introduction.- 6.2 Microergodicity and equivalence and orthogonality of Gaussian measures.- Observations with measurement error.- Exercises.- 6.3 Is statistical inference for differentiable processes possible?.- An example where it is possible.- Exercises.- 6.4 Likelihood Methods.- Restricted maximum likelihood estimation.- Gaussian assumption.- Computational issues.- Some asymptotic theory.- Exercises.- 6.5 Matern model.- Exercise.- 6.6 A numerical study of the Fisher information matrix under the Matern model.- No measurement error and ν unknown.- No measurement error and ν known.- Observations with measurement error.- Conclusions.- Exercises.- 6.7 Maximum likelihood estimation for a periodic version of the Matern model.- Discrete Fourier transforms.- Periodic case.- Asymptotic results.- Exercises.- 6.8 Predicting with estimated parameters.- Jeffreys's law revisited.- Numerical results.- Some issues regarding asymptotic optimality.- Exercises.- 6.9 An instructive example of plug-in prediction.- Behavior of plug-in predictions.- Cross-validation.- Application of Matern model.- Conclusions.- Exercises.- 6.10 Bayesian approach.- Application to simulated data.- Exercises.- A Multivariate Normal Distributions.- B Symbols.- References.

2,998 citations


"The Need for More Emphasis on Predi..." refers methods in this paper

  • ...Kriging is used in connection with a variable that is defined at multiple geographical locations; the values of the variable at the locations where the values are known are used to “predict” the values at the locations where they are unknown—refer, for example, to Cressie (1993) or Stein (1999)....

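The kriging idea quoted above, "predicting" values at unknown locations from values at known ones, can be sketched in a few lines. The covariance model, sites, and values below are made-up illustrations; real kriging practice (per Cressie 1993 or Stein 1999) involves variogram estimation and far more careful numerics.

```python
import math

# Simple kriging in one dimension with a known mean and an assumed
# exponential covariance model C(d) = sill * exp(-d / range).
def exp_cov(d, sill=1.0, range_=2.0):
    return sill * math.exp(-d / range_)

def solve(A, b):                         # Gauss-Jordan for tiny systems
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def simple_kriging(sites, values, target, mean):
    n = len(sites)
    K = [[exp_cov(abs(sites[i] - sites[j])) for j in range(n)] for i in range(n)]
    k = [exp_cov(abs(s - target)) for s in sites]
    w = solve(K, k)                      # kriging weights: K w = k
    return mean + sum(wi * (vi - mean) for wi, vi in zip(w, values))

sites, values = [0.0, 1.0, 3.0], [2.0, 2.5, 1.5]
pred = simple_kriging(sites, values, target=2.0, mean=2.0)
print(round(pred, 3))
```

With no nugget effect, the predictor interpolates exactly: asked for a site where the value is already known, it returns that value, which is the sense in which kriging "predicts" rather than smooths.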

Journal ArticleDOI
TL;DR: Algorithmic modeling has developed rapidly, in theory and practice, in fields outside statistics; it can be used both on large complex data sets and as a more accurate and informative alternative to data modeling on smaller data sets.
Abstract: There are two cultures in the use of statistical modeling to reach conclusions from data. One assumes that the data are generated by a given stochastic data model. The other uses algorithmic models and treats the data mechanism as unknown. The statistical community has been committed to the almost exclusive use of data models. This commitment has led to irrelevant theory, questionable conclusions, and has kept statisticians from working on a large range of interesting current problems. Algorithmic modeling, both in theory and practice, has developed rapidly in fields outside statistics. It can be used both on large complex data sets and as a more accurate and informative alternative to data modeling on smaller data sets. If our goal as a field is to use data to solve problems, then we need to move away from exclusive dependence on data models and adopt a more diverse set of tools.

2,948 citations


"The Need for More Emphasis on Predi..." refers background in this paper

  • ...In recent years, the algorithmic approach to prediction favored by the data-mining and computer-science communities has received much attention—refer, for example, to Breiman (2001)....


Book
01 Jan 2001
TL;DR: This book presents linear, generalized linear, and mixed models, covering fixed and random effects, estimation by maximum likelihood and REML, and the prediction of random effects.
Abstract: Preface. Preface to the First Edition. 1. Introduction. 1.1 Models. 1.2 Factors, Levels, Cells, Effects And Data. 1.3 Fixed Effects Models. 1.4 Random Effects Models. 1.5 Linear Mixed Models (LMMs). 1.6 Fixed Or Random? 1.7 Inference. 1.8 Computer Software. 1.9 Exercises. 2. One-Way Classifications. 2.1 Normality And Fixed Effects. 2.2 Normality, Random Effects And MLE. 2.3 Normality, Random Effects And REML. 2.4 More On Random Effects And Normality. 2.5 Binary Data: Fixed Effects. 2.6 Binary Data: Random Effects. 2.7 Computing. 2.8 Exercises. 3. Single-Predictor Regression. 3.1 Introduction. 3.2 Normality: Simple Linear Regression. 3.3 Normality: A Nonlinear Model. 3.4 Transforming Versus Linking. 3.5 Random Intercepts: Balanced Data. 3.6 Random Intercepts: Unbalanced Data. 3.7 Bernoulli - Logistic Regression. 3.8 Bernoulli - Logistic With Random Intercepts. 3.9 Exercises. 4. Linear Models (LMs). 4.1 A General Model. 4.2 A Linear Model For Fixed Effects. 4.3 MLE Under Normality. 4.4 Sufficient Statistics. 4.5 Many Apparent Estimators. 4.6 Estimable Functions. 4.7 A Numerical Example. 4.8 Estimating Residual Variance. 4.9 Comments On The 1- And 2-Way Classifications. 4.10 Testing Linear Hypotheses. 4.11 T-Tests And Confidence Intervals. 4.12 Unique Estimation Using Restrictions. 4.13 Exercises. 5. Generalized Linear Models (GLMs). 5.1 Introduction. 5.2 Structure Of The Model. 5.3 Transforming Versus Linking. 5.4 Estimation By Maximum Likelihood. 5.5 Tests Of Hypotheses. 5.6 Maximum Quasi-Likelihood. 5.7 Exercises. 6. Linear Mixed Models (LMMs). 6.1 A General Model. 6.2 Attributing Structure To VAR(y). 6.3 Estimating Fixed Effects For V Known. 6.4 Estimating Fixed Effects For V Unknown. 6.5 Predicting Random Effects For V Known. 6.6 Predicting Random Effects For V Unknown. 6.7 ANOVA Estimation Of Variance Components. 6.8 Maximum Likelihood (ML) Estimation. 6.9 Restricted Maximum Likelihood (REML). 6.10 Notes And Extensions. 6.11 Appendix For Chapter 6. 
6.12 Exercises. 7. Generalized Linear Mixed Models. 7.1 Introduction. 7.2 Structure Of The Model. 7.3 Consequences Of Having Random Effects. 7.4 Estimation By Maximum Likelihood. 7.5 Other Methods Of Estimation. 7.6 Tests Of Hypotheses. 7.7 Illustration: Chestnut Leaf Blight. 7.8 Exercises. 8. Models for Longitudinal data. 8.1 Introduction. 8.2 A Model For Balanced Data. 8.3 A Mixed Model Approach. 8.4 Random Intercept And Slope Models. 8.5 Predicting Random Effects. 8.6 Estimating Parameters. 8.7 Unbalanced Data. 8.8 Models For Non-Normal Responses. 8.9 A Summary Of Results. 8.10 Appendix. 8.11 Exercises. 9. Marginal Models. 9.1 Introduction. 9.2 Examples Of Marginal Regression Models. 9.3 Generalized Estimating Equations. 9.4 Contrasting Marginal And Conditional Models. 9.5 Exercises. 10. Multivariate Models. 10.1 Introduction. 10.2 Multivariate Normal Outcomes. 10.3 Non-Normally Distributed Outcomes. 10.4 Correlated Random Effects. 10.5 Likelihood Based Analysis. 10.6 Example: Osteoarthritis Initiative. 10.7 Notes And Extensions. 10.8 Exercises. 11. Nonlinear Models. 11.1 Introduction. 11.2 Example: Corn Photosynthesis. 11.3 Pharmacokinetic Models. 11.4 Computations For Nonlinear Mixed Models. 11.5 Exercises. 12. Departures From Assumptions. 12.1 Introduction. 12.2 Misspecifications Of Conditional Model For Response. 12.3 Misspecifications Of Random Effects Distribution. 12.4 Methods To Diagnose And Correct For Misspecifications. 12.5 Exercises. 13. Prediction. 13.1 Introduction. 13.2 Best Prediction (BP). 13.3 Best Linear Prediction (BLP). 13.4 Linear Mixed Model Prediction (BLUP). 13.5 Required Assumptions. 13.6 Estimated Best Prediction. 13.7 Henderson's Mixed Model Equations. 13.8 Appendix. 13.9 Exercises. 14. Computing. 14.1 Introduction. 14.2 Computing ML Estimates For LMMs. 14.3 Computing ML Estimates For GLMMs. 14.4 Penalized Quasi-Likelihood And Laplace. 14.5 Exercises. Appendix M: Some Matrix Results. M.1 Vectors And Matrices Of Ones. 
M.2 Kronecker (Or Direct) Products. M.3 A Matrix Notation. M.4 Generalized Inverses. M.5 Differential Calculus. Appendix S: Some Statistical Results. S.1 Moments. S.2 Normal Distributions. S.3 Exponential Families. S.4 Maximum Likelihood. S.5 Likelihood Ratio Tests. S.6 MLE Under Normality. References. Index.

2,742 citations

Journal ArticleDOI
TL;DR: A recently devised method of prediction based on sample reuse techniques that is most useful in low structure data paradigms that involve minimal assumptions is presented.
Abstract: An account is given of a recently devised method of prediction based on sample reuse techniques. It is most useful in low structure data paradigms that involve minimal assumptions. A series of applications demonstrating the technique is presented.
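Geisser's sample-reuse approach is, in modern terms, essentially leave-one-out cross-validation: each observation is predicted from the remaining ones, and competing prediction rules are compared on the resulting scores. A minimal sketch (with made-up data and function names) follows.

```python
import statistics

# Leave-one-out sample reuse: score a prediction rule by how well it
# predicts each held-out observation from the rest of the sample.
# This uses minimal assumptions -- no parametric model is fit.
def loo_squared_error(data, predict):
    total = 0.0
    for i, held_out in enumerate(data):
        rest = data[:i] + data[i + 1:]
        total += (held_out - predict(rest)) ** 2
    return total / len(data)

data = [3.0, 4.0, 100.0, 5.0, 4.5, 3.5, 4.2, 3.8]
score_mean = loo_squared_error(data, statistics.mean)
score_median = loo_squared_error(data, statistics.median)
print(score_mean > score_median)   # the median predicts better here (outlier at 100)
```

This is the "low structure" appeal of the method: the comparison between predictors is driven entirely by predictive performance on the observed sample, not by which model assumptions happen to hold.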

2,278 citations


"The Need for More Emphasis on Predi..." refers background in this paper

  • ...According to Geisser (1975), “the prediction of observables or potential observables is of much greater relevance than the estimation of what are often artificial constructs-parameters.”...



  • ...According to Geisser (1975), “the prediction of observables or potential observables is of much greater relevance than the estimation of what are often artificial constructs-parameters.” And W. E. Deming made some remarks that were interpreted by Wallis (1980) to the effect that “the only useful function of a statistician is to make predictions, and thus to provide a basis for action....
