Journal ArticleDOI

Bayes Estimates for the Linear Model

01 Sep 1972 - Journal of the Royal Statistical Society: Series B (Methodological) (John Wiley & Sons, Ltd) - Vol. 34, Iss. 1, pp. 1-18
About: This article was published in the Journal of the Royal Statistical Society: Series B (Methodological) on 1972-09-01 and has received 1,908 citations to date. The article focuses on the topics: Bayes error rate & Bayes classifier.
Citations
Journal ArticleDOI
TL;DR: In this paper, the authors consider the problem of comparing complex hierarchical models in which the number of parameters is not clearly defined, and derive a measure pD for the effective number of parameters in a model as the difference between the posterior mean of the deviance and the deviance at the posterior means of the parameters of interest; adding pD to the posterior mean deviance gives a deviance information criterion that is related to other information criteria and has an approximate decision-theoretic justification.
Abstract: Summary. We consider the problem of comparing complex hierarchical models in which the number of parameters is not clearly defined. Using an information theoretic argument we derive a measure pD for the effective number of parameters in a model as the difference between the posterior mean of the deviance and the deviance at the posterior means of the parameters of interest. In general pD approximately corresponds to the trace of the product of Fisher's information and the posterior covariance, which in normal models is the trace of the ‘hat’ matrix projecting observations onto fitted values. Its properties in exponential families are explored. The posterior mean deviance is suggested as a Bayesian measure of fit or adequacy, and the contributions of individual observations to the fit and complexity can give rise to a diagnostic plot of deviance residuals against leverages. Adding pD to the posterior mean deviance gives a deviance information criterion for comparing models, which is related to other information criteria and has an approximate decision theoretic justification. The procedure is illustrated in some examples, and comparisons are drawn with alternative Bayesian and classical proposals. Throughout it is emphasized that the quantities required are trivial to compute in a Markov chain Monte Carlo analysis.
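A minimal sketch of how these quantities might be computed from MCMC output, assuming a user-supplied deviance function and an array of posterior draws (the function name `dic` and the input format are illustrative, not from the paper):

```python
import numpy as np

def dic(samples, deviance):
    """Compute pD and DIC from MCMC output.

    samples  : array of shape (n_draws, n_params) holding posterior draws
               of the parameters of interest (illustrative input format).
    deviance : user-supplied callable returning D(theta) = -2 * log-likelihood
               for a single parameter vector theta.
    """
    samples = np.asarray(samples)
    d_bar = np.mean([deviance(theta) for theta in samples])  # posterior mean deviance
    d_hat = deviance(samples.mean(axis=0))                   # deviance at the posterior mean
    p_d = d_bar - d_hat                                      # effective number of parameters
    return p_d, d_bar + p_d                                  # DIC = Dbar + pD
```

As the abstract notes, both ingredients (the posterior mean of the deviance and the deviance at the posterior mean) are trivial to obtain once a chain of posterior draws is available.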

11,691 citations

Book
01 Jan 1987
TL;DR: In this book, the authors present a general classification notation for multilevel models and discuss the general structure and maximum likelihood estimation for a multilevel model, as well as the adequacy of ordinary least squares estimates.
Abstract: Contents Dedication Preface Acknowledgements Notation A general classification notation and diagram Glossary
Chapter 1 An introduction to multilevel models 1.1 Hierarchically structured data 1.2 School effectiveness 1.3 Sample survey methods 1.4 Repeated measures data 1.5 Event history and survival models 1.6 Discrete response data 1.7 Multivariate models 1.8 Nonlinear models 1.9 Measurement errors 1.10 Cross classifications and multiple membership structures 1.11 Factor analysis and structural equation models 1.12 Levels of aggregation and ecological fallacies 1.13 Causality 1.14 The latent normal transformation and missing data 1.15 Other texts 1.16 A caveat
Chapter 2 The 2-level model 2.1 Introduction 2.2 The 2-level model 2.3 Parameter estimation 2.4 Maximum likelihood estimation using Iterative Generalised Least Squares (IGLS) 2.5 Marginal models and Generalized Estimating Equations (GEE) 2.6 Residuals 2.7 The adequacy of Ordinary Least Squares estimates 2.8 A 2-level example using longitudinal educational achievement data 2.9 General model diagnostics 2.10 Higher level explanatory variables and compositional effects 2.11 Transforming to normality 2.12 Hypothesis testing and confidence intervals 2.13 Bayesian estimation using Markov Chain Monte Carlo (MCMC) 2.14 Data augmentation Appendix 2.1 The general structure and maximum likelihood estimation for a multilevel model Appendix 2.2 Multilevel residuals estimation Appendix 2.3 Estimation using profile and extended likelihood Appendix 2.4 The EM algorithm Appendix 2.5 MCMC sampling
Chapter 3. Three level models and more complex hierarchical structures 3.1 Complex variance structures 3.2 A 3-level complex variation model example 3.3 Parameter Constraints 3.4 Weighting units 3.5 Robust (Sandwich) Estimators and Jackknifing 3.6 The bootstrap 3.7 Aggregate level analyses 3.8 Meta analysis 3.9 Design issues
Chapter 4. Multilevel Models for discrete response data 4.1 Generalised linear models 4.2 Proportions as responses 4.3 Examples 4.4 Models for multiple response categories 4.5 Models for counts 4.6 Mixed discrete-continuous response models 4.7 A latent normal model for binary responses 4.8 Partitioning variation in discrete response models Appendix 4.1 Generalised linear model estimation Appendix 4.2 Maximum likelihood estimation for generalised linear models Appendix 4.3 MCMC estimation for generalised linear models Appendix 4.4 Bootstrap estimation for generalised linear models
Chapter 5. Models for repeated measures data 5.1 Repeated measures data 5.2 A 2-level repeated measures model 5.3 A polynomial model example for adolescent growth and the prediction of adult height 5.4 Modelling an autocorrelation structure at level 1 5.5 A growth model with autocorrelated residuals 5.6 Multivariate repeated measures models 5.7 Scaling across time 5.8 Cross-over designs 5.9 Missing data 5.10 Longitudinal discrete response data
Chapter 6. Multivariate multilevel data 6.1 Introduction 6.2 The basic 2-level multivariate model 6.3 Rotation Designs 6.4 A rotation design example using Science test scores 6.5 Informative response selection: subject choice in examinations 6.6 Multivariate structures at higher levels and future predictions 6.7 Multivariate responses at several levels 6.8 Principal Components analysis Appendix 6.1 MCMC algorithm for a multivariate normal response model with constraints
Chapter 7. Latent normal models for multivariate data 7.1 The normal multilevel multivariate model 7.2 Sampling binary responses 7.3 Sampling ordered categorical responses 7.4 Sampling unordered categorical responses 7.5 Sampling count data 7.6 Sampling continuous non-normal data 7.7 Sampling the level 1 and level 2 covariance matrices 7.8 Model fit 7.9 Partially ordered data 7.10 Hybrid normal/ordered variables 7.11 Discussion
Chapter 8. Multilevel factor analysis, structural equation and mixture models 8.1 A 2-stage 2-level factor model 8.2 A general multilevel factor model 8.3 MCMC estimation for the factor model 8.4 Structural equation models 8.5 Discrete response multilevel structural equation models 8.6 More complex hierarchical latent variable models 8.7 Multilevel mixture models
Chapter 9. Nonlinear multilevel models 9.1 Introduction 9.2 Nonlinear functions of linear components 9.3 Estimating population means 9.4 Nonlinear functions for variances and covariances 9.5 Examples of nonlinear growth and nonlinear level 1 variance Appendix 9.1 Nonlinear model estimation
Chapter 10. Multilevel modelling in sample surveys 10.1 Sample survey structures 10.2 Population structures 10.3 Small area estimation
Chapter 11 Multilevel event history and survival models 11.1 Introduction 11.2 Censoring 11.3 Hazard and survival functions 11.4 Parametric proportional hazard models 11.5 The semiparametric Cox model 11.6 Tied observations 11.7 Repeated events proportional hazard models 11.8 Example using birth interval data 11.9 Log duration models 11.10 Examples with birth interval data and children's activity episodes 11.11 The grouped discrete time hazards model 11.12 Discrete time latent normal event history models
Chapter 12. Cross classified data structures 12.1 Random cross classifications 12.2 A basic cross classified model 12.3 Examination results for a cross classification of schools 12.4 Interactions in cross classifications 12.5 Cross classifications with one unit per cell 12.6 Multivariate cross classified models 12.7 A general notation for cross classifications 12.8 MCMC estimation in cross classified models Appendix 12.1 IGLS Estimation for cross classified data
Chapter 13 Multiple membership models 13.1 Multiple membership structures 13.2 Notation and classifications for multiple membership structures 13.3 An example of salmonella infection 13.4 A repeated measures multiple membership model 13.5 Individuals as higher level units 13.5.1 Example of research grant awards 13.6 Spatial models 13.7 Missing identification models Appendix 13.1 MCMC estimation for multiple membership models
Chapter 14 Measurement errors in multilevel models 14.1 A basic measurement error model 14.2 Moment based estimators 14.3 A 2-level example with measurement error at both levels 14.4 Multivariate responses 14.5 Nonlinear models 14.6 Measurement errors for discrete explanatory variables 14.7 MCMC estimation for measurement error models Appendix 14.1 Measurement error estimation Appendix 14.2 MCMC estimation for measurement error models
Chapter 15. Smoothing models for multilevel data 15.1 Introduction 15.2 Smoothing estimators 15.3 Smoothing splines 15.4 Semi parametric smoothing models 15.5 Multilevel smoothing models 15.6 General multilevel semi-parametric smoothing models 15.7 Generalised linear models 15.8 An example 15.9 Conclusions
Chapter 16. Missing data, partially observed data and multiple imputation 16.1 Creating a completed data set 16.2 Joint modelling for missing data 16.3 A two level model with responses of different types at both levels 16.4 Multiple imputation 16.5 A simulation example of multiple imputation for missing data 16.6 Longitudinal data with attrition 16.7 Partially known data values 16.8 Conclusions
Chapter 17 Multilevel models with correlated random effects 17.1 Non-independence of level 2 residuals 17.2 MCMC estimation for non-independent level 2 residuals 17.3 Adaptive proposal distributions in MCMC estimation 17.4 MCMC estimation for non-independent level 1 residuals 17.5 Modelling the level 1 variance as a function of explanatory variables with random effects 17.6 Discrete responses with correlated random effects 17.7 Calculating the DIC statistic 17.8 A growth data set 17.9 Conclusions
Chapter 18. Software for multilevel modelling
References Author index Subject index

5,839 citations


Cites background or methods from "Bayes Estimates for the Linear Model"

  • ...2), and more general extensions, as a Bayesian linear model (Lindley and Smith, 1972) where the β_j are assumed to be exchangeable and to have a prior distribution with variance σ²_u0.... (A sketch of this exchangeable prior follows this list.)


  • ...For the first order approximation the procedure outlined here is closely related to that given by Lindstrom and Bates (1990) for 2-level repeated measures data who consider a first order expansion about the unit-specific predicted values....

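As a point of reference for the first quotation above, a minimal sketch of this kind of exchangeable formulation, in the spirit of the two-stage normal linear model of Lindley and Smith (1972); the first-stage regression and the residual-variance symbol σ²_e are illustrative assumptions, and only the exchangeable prior with variance σ²_u0 is taken from the quotation:

```latex
\begin{aligned}
y_{ij} &= \mathbf{x}_{ij}^{\top}\boldsymbol{\beta}_j + e_{ij},
  & e_{ij} &\sim N(0, \sigma^2_e) \quad \text{(assumed first stage)},\\
\boldsymbol{\beta}_j &\overset{\text{iid}}{\sim} N(\boldsymbol{\beta},\, \sigma^2_{u0} I)
  & &\text{(exchangeable second-stage prior)}.
\end{aligned}
```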

Journal ArticleDOI
TL;DR: The pooled mean group (PMG) estimator, as discussed by the authors, constrains long-run coefficients to be identical but allows short-run coefficients and error variances to differ across groups.
Abstract: It is now quite common to have panels in which both T, the number of time series observations, and N, the number of groups, are quite large and of the same order of magnitude. The usual practice is either to estimate N separate regressions and calculate the coefficient means, which we call the mean group (MG) estimator, or to pool the data and assume that the slope coefficients and error variances are identical. In this article we propose an intermediate procedure, the pooled mean group (PMG) estimator, which constrains long-run coefficients to be identical but allows short-run coefficients and error variances to differ across groups. We consider both the case where the regressors are stationary and the case where they follow unit root processes, and for both cases derive the asymptotic distribution of the PMG estimators as T tends to infinity. We also provide two empirical applications: Aggregate consumption functions for 24 Organization for Economic Cooperation and Development economies over th...
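For concreteness, a hedged sketch of the kind of specification involved: in an error-correction form of a panel ARDL model (the lag orders p and q and the notation below are illustrative, not quoted from the article), the PMG restriction is that the long-run coefficient vector θ is common to all groups, while the adjustment coefficient, the short-run coefficients and the error variance are group-specific:

```latex
\Delta y_{it} = \phi_i \bigl( y_{i,t-1} - \boldsymbol{\theta}^{\top} \mathbf{x}_{it} \bigr)
  + \sum_{j=1}^{p-1} \lambda_{ij}\, \Delta y_{i,t-j}
  + \sum_{j=0}^{q-1} \boldsymbol{\delta}_{ij}^{\top}\, \Delta \mathbf{x}_{i,t-j}
  + \mu_i + \varepsilon_{it},
\qquad \varepsilon_{it} \sim (0, \sigma_i^{2}).
```

Here θ is shared across i (the long-run homogeneity restriction), whereas φ_i, λ_ij, δ_ij and σ²_i vary by group; the MG estimator instead averages fully unrestricted group-by-group estimates.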

4,592 citations

Journal ArticleDOI
TL;DR: In this paper, exact Bayesian methods for modeling categorical response data are developed using the idea of data augmentation, which can be summarized as follows: the probit regression model for binary outcomes is seen to have an underlying normal regression structure on latent continuous data, and values of the latent data can be simulated from suitable truncated normal distributions.
Abstract: A vast literature in statistics, biometrics, and econometrics is concerned with the analysis of binary and polychotomous response data. The classical approach fits a categorical response regression model using maximum likelihood, and inferences about the model are based on the associated asymptotic theory. The accuracy of classical confidence statements is questionable for small sample sizes. In this article, exact Bayesian methods for modeling categorical response data are developed using the idea of data augmentation. The general approach can be summarized as follows. The probit regression model for binary outcomes is seen to have an underlying normal regression structure on latent continuous data. Values of the latent data can be simulated from suitable truncated normal distributions. If the latent data are known, then the posterior distribution of the parameters can be computed using standard results for normal linear models. Draws from this posterior are used to sample new latent data, and t...
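A minimal sketch of the data-augmentation Gibbs sampler the abstract describes, written here with a flat prior on the coefficients (the prior choice, function name and defaults are assumptions made for illustration):

```python
import numpy as np
from scipy.stats import truncnorm

def probit_gibbs(X, y, n_iter=2000, seed=0):
    """Data-augmentation Gibbs sampler for Bayesian probit regression.

    X : (n, k) design matrix; y : (n,) array of 0/1 responses.
    A flat prior on beta is used for simplicity.
    """
    rng = np.random.default_rng(seed)
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)            # posterior covariance of beta given z
    chol = np.linalg.cholesky(XtX_inv)
    beta = np.zeros(k)
    draws = np.empty((n_iter, k))
    for t in range(n_iter):
        # Step 1: draw latent z_i ~ N(x_i'beta, 1), truncated to z_i > 0 if y_i = 1
        #         and to z_i < 0 if y_i = 0.
        mu = X @ beta
        lower = np.where(y == 1, -mu, -np.inf)  # bounds for the standardized draw z - mu
        upper = np.where(y == 1, np.inf, -mu)
        z = mu + truncnorm.rvs(lower, upper, random_state=rng)
        # Step 2: draw beta from its normal posterior N((X'X)^{-1} X'z, (X'X)^{-1}).
        beta = XtX_inv @ (X.T @ z) + chol @ rng.standard_normal(k)
        draws[t] = beta
    return draws
```

Each sweep alternates the two steps summarized in the abstract: truncated-normal draws of the latent data given the current coefficients, then a normal draw of the coefficients given the latent data.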

3,272 citations


Cites methods from "Bayes Estimates for the Linear Model"

  • ...The normal regression structure on Z also motivates the consideration of normal hierarchical models as presented in Lindley and Smith (1972). Given a particular probit model and regression parameter β of dimension k, one may suspect that β lies on a linear subspace Aβ_0, where β_0 is p-dimensional, where p < k....


  • ...One can use standard theory for the normal hierarchical model (Lindley and Smith 1972) to obtain the posterior distributions of β and σ² conditional on the latent data Z....


Journal ArticleDOI
TL;DR: Recent developments in maximum likelihood (ML) estimation of variance components are reviewed, including the restricted maximum likelihood (REML) approach of Patterson and Thompson (1971), which takes into account the loss in degrees of freedom resulting from estimating fixed effects, and iterative algorithms for computing the ML or REML estimates.
Abstract: Recent developments promise to increase greatly the popularity of maximum likelihood (ML) as a technique for estimating variance components. Patterson and Thompson (1971) proposed a restricted maximum likelihood (REML) approach which takes into account the loss in degrees of freedom resulting from estimating fixed effects. Miller (1973) developed a satisfactory asymptotic theory for ML estimators of variance components. There are many iterative algorithms that can be considered for computing the ML or REML estimates. The computations on each iteration of these algorithms are those associated with computing estimates of fixed and random effects for given values of the variance components.
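For reference, one standard textbook form of the restricted log-likelihood for y = Xβ + ε with ε ~ N(0, V(θ)), written up to an additive constant (this particular parameterization is a common convention, not a quotation from the paper):

```latex
\ell_R(\boldsymbol{\theta}) = -\tfrac{1}{2}\Bigl[\, \log\lvert V \rvert
  + \log\bigl\lvert X^{\top} V^{-1} X \bigr\rvert
  + \mathbf{y}^{\top} P\, \mathbf{y} \,\Bigr],
\qquad
P = V^{-1} - V^{-1} X \bigl(X^{\top} V^{-1} X\bigr)^{-1} X^{\top} V^{-1}.
```

The log|X′V⁻¹X| term is where the adjustment for the degrees of freedom absorbed by the fixed effects enters, which is the feature of REML highlighted above.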

2,440 citations

References
Book
01 Jan 1965
TL;DR: Algebra of Vectors and Matrices, Probability Theory, Tools and Techniques, and Continuous Probability Models.
Abstract: Algebra of Vectors and Matrices. Probability Theory, Tools and Techniques. Continuous Probability Models. The Theory of Least Squares and Analysis of Variance. Criteria and Methods of Estimation. Large Sample Theory and Methods. Theory of Statistical Inference. Multivariate Analysis. Publications of the Author. Author Index. Subject Index.

8,300 citations

Journal ArticleDOI
TL;DR: In this paper, an estimation procedure based on adding small positive quantities to the diagonal of X′X is proposed, together with the ridge trace, a method for showing in two dimensions the effects of nonorthogonality.
Abstract: In multiple regression it is shown that parameter estimates based on minimum residual sum of squares have a high probability of being unsatisfactory, if not incorrect, if the prediction vectors are not orthogonal. Proposed is an estimation procedure based on adding small positive quantities to the diagonal of X′X. Introduced is the ridge trace, a method for showing in two dimensions the effects of nonorthogonality. It is then shown how to augment X′X to obtain biased estimates with smaller mean square error.
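A minimal sketch of the estimator described, computing the coefficient paths over a grid of ridge constants (the function name and the grid are illustrative):

```python
import numpy as np

def ridge_path(X, y, ks):
    """Ridge estimates beta(k) = (X'X + k I)^{-1} X'y for each k in ks.

    Plotting each coefficient against k gives the ridge trace described
    in the abstract.
    """
    XtX = X.T @ X
    Xty = X.T @ y
    p = X.shape[1]
    return np.array([np.linalg.solve(XtX + k * np.eye(p), Xty) for k in ks])

# Example usage with an illustrative grid of small positive constants:
# betas = ridge_path(X, y, ks=np.linspace(0.0, 1.0, 51))
```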

8,091 citations

Journal ArticleDOI
TL;DR: In this paper, the use of ridge regression methods is discussed and recommendations are made for obtaining a better regression equation than that given by ordinary least squares estimation. But the authors focus on the RIDGE TRACE which is a two-dimensional graphical procedure for portraying the complex relationships in multifactor data.
Abstract: This paper is an exposition of the use of ridge regression methods. Two examples from the literature are used as a base. Attention is focused on the RIDGE TRACE which is a two-dimensional graphical procedure for portraying the complex relationships in multifactor data. Recommendations are made for obtaining a better regression equation than that given by ordinary least squares estimation.

2,345 citations

Journal ArticleDOI
Donald W. Marquardt
TL;DR: In this article, the authors discuss a class of biased linear estimators employing generalized inverses and establish a unifying perspective on nonlinear estimation from nonorthogonal data.
Abstract: A principal objective of this paper is to discuss a class of biased linear estimators employing generalized inverses. A second objective is to establish a unifying perspective. The paper exhibits theoretical properties shared by generalized inverse estimators, ridge estimators, and corresponding nonlinear estimation procedures. From this perspective it becomes clear why all these methods work so well in practical estimation from nonorthogonal data.
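As one concrete instance of such an estimator (a hedged sketch, not necessarily the paper's exact formulation): writing the singular value decomposition X = Σ_i d_i u_i v_iᵀ, an assigned-rank generalized-inverse estimator keeps only the r largest singular values,

```latex
\hat{\boldsymbol{\beta}}_{(r)} = \sum_{i=1}^{r} \frac{\mathbf{u}_i^{\top}\mathbf{y}}{d_i}\, \mathbf{v}_i ,
```

i.e. the least squares solution computed with the rank-r generalized inverse of X. It is biased when r is below the rank of X but, like the ridge estimator, can have much smaller mean square error when the prediction vectors are nearly collinear.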

1,828 citations

Book
01 Jan 1957

1,670 citations