Posted Content

Bayesian Model Averaging for Generalized Linear Models with Missing Covariates

01 May 2013 - Research Papers in Economics (Einaudi Institute for Economics and Finance, EIEF)
TL;DR: In this article, the authors address the problem of estimating generalized linear models (GLMs) when the outcome of interest is always observed, the values of some covariates are missing for some observations, but imputations are available to fill in the missing values.
Abstract: We address the problem of estimating generalized linear models (GLMs) when the outcome of interest is always observed, the values of some covariates are missing for some observations, but imputations are available to fill in the missing values. Under certain conditions on the missing-data mechanism and the imputation model, this situation generates a trade-off between bias and precision in the estimation of the parameters of interest. The complete cases are often too few, so precision is lost, but simply filling in the missing values with the imputations may lead to bias when the imputation model is either incorrectly specified or uncongenial. Following the generalized missing-indicator approach originally proposed by Dardanoni et al. (2011) for linear regression models, we characterize this bias-precision trade-off in terms of model uncertainty regarding which covariates should be dropped from an augmented GLM for the full sample of observed and imputed data. This formulation is attractive because model uncertainty can then be handled very naturally through Bayesian model averaging (BMA). In addition to applying the generalized missing-indicator method to the wider class of GLMs, we make two extensions. First, we propose a block-BMA strategy that incorporates information on the available missing-data patterns and has the advantage of being computationally simple. Second, we allow the observed outcome to be multivariate, thus covering the case of seemingly unrelated regression equations models, and ordered, multinomial or conditional logit and probit models. Our approach is illustrated through an empirical application using the first wave of the Survey of Health, Ageing and Retirement in Europe (SHARE).
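
As a concrete illustration of the idea, here is a minimal sketch of BIC-based Bayesian model averaging for a logit model with one imputed covariate, in the spirit of the generalized missing-indicator formulation described above. The simulated data, variable names, and three-model candidate set are illustrative assumptions, not the authors' code, which additionally organizes the augmentation by missing-data pattern (block-BMA) and covers multivariate outcomes.

```python
# Sketch: BMA over augmented GLMs with an imputed covariate (illustrative only).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)                      # covariate subject to missingness
z = rng.normal(size=n)                      # always-observed covariate
y = rng.binomial(1, 1 / (1 + np.exp(-(0.5 * x + 0.3 * z))))
miss = rng.random(n) < 0.3                  # missing-data indicator
# fill in missing values with (imperfect) imputations
x_fill = np.where(miss, 0.9 * x + rng.normal(scale=0.5, size=n), x)

# Augmented design: filled-in covariate, missing indicator, interaction.
base = np.column_stack([np.ones(n), z])
candidates = {
    "filled-in only": np.column_stack([base, x_fill]),
    "plus indicator": np.column_stack([base, x_fill, miss]),
    "fully augmented": np.column_stack([base, x_fill, miss, miss * x_fill]),
}

fits, bic = {}, {}
for name, X in candidates.items():
    res = sm.GLM(y, X, family=sm.families.Binomial()).fit()
    fits[name] = res
    bic[name] = res.bic_llf if hasattr(res, "bic_llf") else res.bic

# exp(-BIC/2), normalized, approximates posterior model probabilities
b = np.array(list(bic.values()))
w = np.exp(-0.5 * (b - b.min()))
w /= w.sum()

# BMA point estimate of the coefficient on x (column 2 in every design)
beta_x = sum(wi * fits[name].params[2] for name, wi in zip(bic, w))
print(dict(zip(bic, np.round(w, 3))), "averaged beta_x:", round(beta_x, 3))
```

Because BIC approximates minus twice the log posterior model probability up to a constant, the exponentiated and normalized BIC differences serve as posterior model weights, and the weighted coefficient is a simple BMA point estimate.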
Citations
Journal ArticleDOI
TL;DR: In this article, the author describes the imputation methodology used in the first two waves of SHARE, namely the fully conditional specification approach of van Buuren, Brand, Groothuis-Oudshoorn, and Rubin (2006).
Abstract: The Survey of Health, Ageing and Retirement in Europe (SHARE), like all large household surveys, suffers from the problem of item non-response, and hence the need to impute missing values arises. In this paper I describe the imputation methodology used in the first two waves of SHARE, which is the fully conditional specification approach of van Buuren, Brand, Groothuis-Oudshoorn, and Rubin (2006). Methods for assessing the convergence of the imputation process are also discussed. Finally, I give details on numerous issues affecting the implementation of the imputation process that are particular to SHARE.

74 citations
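
For readers unfamiliar with fully conditional specification, the sketch below shows the idea on simulated data using scikit-learn's IterativeImputer, which follows the same regress-each-incomplete-variable-on-the-others cycle; this is an illustration under our own assumptions, not SHARE's actual imputation code.

```python
# Sketch: chained-equations (fully conditional specification) imputation.
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(1)
cov = [[1.0, 0.5, 0.3], [0.5, 1.0, 0.4], [0.3, 0.4, 1.0]]
X = rng.multivariate_normal([0, 0, 0], cov, size=200)
X_miss = X.copy()
X_miss[rng.random(X.shape) < 0.2] = np.nan  # 20% of entries missing at random

# Each variable with missing values is regressed on the others in turn,
# cycling until the imputations stabilize.
imputer = IterativeImputer(max_iter=10, sample_posterior=True, random_state=0)
X_imputed = imputer.fit_transform(X_miss)
print("remaining NaNs:", np.isnan(X_imputed).sum())
```

With sample_posterior=True the imputer draws from the predictive distribution at each step rather than plugging in the mean; rerunning with different seeds yields multiple completed datasets, as proper multiple imputation requires.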

Book ChapterDOI
01 Jan 2018
TL;DR: In this chapter, the authors provide an overview of frequentist model averaging, covering methods for selecting the model weights, including those based on AIC, bagging, weighted AIC, stacking and focussed methods.
Abstract: We provide an overview of frequentist model averaging. For point estimation, we consider different methods for selecting the model weights, including those based on AIC, bagging, weighted AIC, stacking and focussed methods. For interval estimation, we consider Wald, MATA and percentile-bootstrap intervals. Use of the methods is illustrated by examples involving real data.

11 citations
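
A minimal sketch of one of the surveyed schemes, AIC-based weighting, on simulated data; the OLS setup, candidate set, and variable names are our own illustrative assumptions, not the chapter's examples.

```python
# Sketch: frequentist model averaging with Akaike weights.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 200
x1, x2 = rng.normal(size=(2, n))
y = 1.0 + 0.8 * x1 + 0.1 * x2 + rng.normal(size=n)

designs = {
    "x1 only": sm.add_constant(x1),
    "x1 and x2": sm.add_constant(np.column_stack([x1, x2])),
}
fits = {name: sm.OLS(y, X).fit() for name, X in designs.items()}

# Akaike weights: w_m proportional to exp(-0.5 * (AIC_m - AIC_min)).
aic = np.array([f.aic for f in fits.values()])
w = np.exp(-0.5 * (aic - aic.min()))
w /= w.sum()

# Model-averaged estimate of the coefficient on x1 (index 1 in both designs).
beta1 = sum(wi * f.params[1] for wi, f in zip(w, fits.values()))
print(dict(zip(fits, np.round(w, 3))), "averaged beta1:", round(beta1, 3))
```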

References
Journal ArticleDOI
TL;DR: In this paper, the problem of selecting one of a number of models of different dimensions is treated by finding its Bayes solution, and evaluating the leading terms of its asymptotic expansion.
Abstract: The problem of selecting one of a number of models of different dimensions is treated by finding its Bayes solution, and evaluating the leading terms of its asymptotic expansion. These terms are a valid large-sample criterion beyond the Bayesian context, since they do not depend on the a priori distribution.

38,681 citations
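
The leading terms referred to in the abstract give the familiar Schwarz criterion; in standard notation (ours, not the paper's), model $j$ is scored by

```latex
\mathrm{BIC}_j = -2 \log \hat{L}_j + k_j \log n ,
```

where $\hat{L}_j$ is the maximized likelihood, $k_j$ the number of free parameters, and $n$ the sample size; minimizing BIC approximates the Bayes solution independently of the prior distribution.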

Book
19 Jun 2013
TL;DR: The second edition of this book is unique in that it focuses on methods for making formal statistical inference from all the models in an a priori set (Multi-Model Inference).
Abstract: Introduction * Information and Likelihood Theory: A Basis for Model Selection and Inference * Basic Use of the Information-Theoretic Approach * Formal Inference From More Than One Model: Multi-Model Inference (MMI) * Monte Carlo Insights and Extended Examples * Statistical Theory and Numerical Results * Summary

36,993 citations

Book
01 Jan 2001
TL;DR: This is the essential companion to Jeffrey Wooldridge's widely used graduate text Econometric Analysis of Cross Section and Panel Data (MIT Press, 2001).
Abstract: The second edition of this acclaimed graduate text provides a unified treatment of two methods used in contemporary econometric research, cross section and panel data methods. By focusing on assumptions that can be given behavioral content, the book maintains an appropriate level of rigor while emphasizing intuitive thinking. The analysis covers both linear and nonlinear models, including models with dynamics and/or individual heterogeneity. In addition to general estimation frameworks (particularly method of moments and maximum likelihood), specific linear and nonlinear methods are covered in detail, including probit and logit models and their multivariate extensions, Tobit models, models for count data, censored and missing data schemes, causal (or treatment) effects, and duration analysis. Econometric Analysis of Cross Section and Panel Data was the first graduate econometrics text to focus on microeconomic data structures, allowing assumptions to be separated into population and sampling assumptions. This second edition has been substantially updated and revised. Improvements include a broader class of models for missing data problems; more detailed treatment of cluster problems, an important topic for empirical researchers; expanded discussion of "generalized instrumental variables" (GIV) estimation; new coverage (based on the author's own recent research) of inverse probability weighting; a more complete framework for estimating treatment effects with panel data; and a firmly established link between econometric approaches to nonlinear panel data and the "generalized estimating equation" literature popular in statistics and other fields. New attention is given to explaining when particular econometric methods can be applied; the goal is not only to tell readers what does work, but why certain "obvious" procedures do not. The numerous included exercises, both theoretical and computer-based, allow the reader to extend methods covered in the text and discover new insights.

28,298 citations

Book
01 Jan 1983
TL;DR: In this paper, a generalization of the analysis of variance is given for generalized linear models using log-likelihoods, illustrated by examples relating to four distributions: the Normal, Binomial (probit analysis, etc.), Poisson (contingency tables), and gamma (variance components).
Abstract: The technique of iterative weighted linear regression can be used to obtain maximum likelihood estimates of the parameters with observations distributed according to some exponential family and systematic effects that can be made linear by a suitable transformation. A generalization of the analysis of variance is given for these models using log-likelihoods. These generalized linear models are illustrated by examples relating to four distributions; the Normal, Binomial (probit analysis, etc.), Poisson (contingency tables) and gamma (variance components).

23,215 citations
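
The iterative weighted linear regression scheme the abstract describes is now known as iteratively reweighted least squares (IRLS). Below is a minimal sketch specialized to the Binomial family with a logit link, on simulated data; the function name, setup, and notation are ours, for illustration only.

```python
# Sketch: IRLS for a Binomial GLM with logit link (illustrative only).
import numpy as np

def irls_logit(X, y, n_iter=25, tol=1e-10):
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta                       # linear predictor
        mu = 1 / (1 + np.exp(-eta))          # mean under the logit link
        W = mu * (1 - mu)                    # iterative weights = Var(y_i)
        z = eta + (y - mu) / W               # working response
        # solve the weighted least-squares normal equations X'WX b = X'Wz
        beta_new = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * z))
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta

rng = np.random.default_rng(3)
X = np.column_stack([np.ones(300), rng.normal(size=300)])
y = rng.binomial(1, 1 / (1 + np.exp(-(0.5 + 1.2 * X[:, 1]))))
print(irls_logit(X, y))  # maximum likelihood estimates of (intercept, slope)
```

Each iteration solves a weighted least-squares problem with weights mu(1 - mu) and a linearized working response; other exponential families change only the link and variance function, which is what makes the scheme general.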