Author

Peter Congdon

Bio: Peter Congdon is an academic researcher from Queen Mary University of London. The author has contributed to research in topics: Population & Life expectancy. The author has an h-index of 30 and has co-authored 215 publications receiving 5418 citations. Previous affiliations of Peter Congdon include Harold Wood Hospital & John Wiley & Sons.


Papers
Book
Peter Congdon
02 May 2001
TL;DR: Congdon's Bayesian Statistical Modelling serves as a reference source for Bayesian models and literature, reviewing a large number of models, e.g. for standard distributions, classification, regression, hierarchical pooling of information, missing data, correlated data, multivariate data, time series, spatial data, longitudinal data, measurement error, life tables and survival analysis.
Abstract: Peter Congdon's Bayesian Statistical Modelling is not a teaching textbook or introduction to Bayesian statistical modelling. Although the basics of Bayesian theory and Markov Chain Monte Carlo (MCMC) methods are briefly reviewed in the book, I think that one should already be familiar with those topics before using the book. Given that, the book can be very helpful to an applied statistician, as it is an excellent reference source for Bayesian models and literature. Using nearly 200 worked examples, with data and computer code available via the World Wide Web, the book reviews a large number of models, e.g., for standard distributions, classification, regression, hierarchical pooling of information, missing data, correlated data, multivariate data, time series, spatial data, longitudinal data, measurement error, life tables and survival analysis. Each chapter starts with an introduction to the model family and then continues by describing variations on the basic models, with advice on model identification, prior selection, interpretation of findings, and computing choices and strategies. The last chapter also briefly reviews Bayesian model assessment. With 500 pages in the book, there are about 2.5 pages per example, and consequently I believe that in most cases it would be necessary to also read some of the references in order to fully benefit from the models described. Although the data examples are mainly from medical science, public health and the social sciences, the book should be interesting to any applied statistician seeking new possibilities in data analysis. Aki Vehtari
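The review above mentions MCMC methods as a prerequisite for the book. As an illustrative sketch only (not code from the book), a minimal random-walk Metropolis sampler for the posterior of a normal mean with known variance and a flat prior looks like this; all data and tuning values here are invented for the example:

```python
import numpy as np

# Minimal random-walk Metropolis sketch: posterior of a normal mean with
# known variance and a flat prior, so the exact posterior is N(ybar, s^2/n).
rng = np.random.default_rng(42)
y = rng.normal(loc=2.0, scale=1.0, size=100)  # simulated data
sigma = 1.0

def log_post(mu):
    # Log posterior up to an additive constant (flat prior + normal likelihood).
    return -0.5 * np.sum((y - mu) ** 2) / sigma**2

mu = 0.0
draws = []
for _ in range(5000):
    prop = mu + rng.normal(scale=0.3)           # random-walk proposal
    if np.log(rng.uniform()) < log_post(prop) - log_post(mu):
        mu = prop                               # accept; otherwise keep mu
    draws.append(mu)

post_mean = np.mean(draws[1000:])               # discard burn-in
```

After burn-in, the chain's mean should sit close to the sample mean of the simulated data, which is what the exact posterior dictates here.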

814 citations

Journal ArticleDOI
TL;DR: In this paper, a method for studying DIF is demonstrated that can be used with either dichotomous or polytomous items, and the method is shown to be valid for data that follow a partial credit IRT model.
Abstract: In this paper a method for studying DIF is demonstrated that can be used with either dichotomous or polytomous items. The method is shown to be valid for data that follow a partial credit IRT model. It is also shown that logistic regression gives results equivalent to those of the proposed method. In a simulation study, positively biased type 1 error rates of the method are shown to be in accord with results from previous studies; however, the size of the bias in the log odds is moderate. Finally, it is demonstrated how these statistics can be used to study DIF variability with the method of Longford, Holland, & Thayer (1993). Much work has been done in recent years in the area of differential item functioning (DIF). Identifying items which exhibit DIF is a preliminary step in the current practice of assessing item and test bias. The ultimate rationale is that removal or modification of biased items will improve the validity of a test, and in conjunction with more direct assessments of validity, will result in a test that is fair to all groups of examinees. This approach to DIF has an intrinsic statistical problem: because many analyses have been based on estimation and testing for individual items, characteristics of clusters of items or of the test as a whole may go unnoticed. Rubin (1988) suggested that in addition to a DIF estimate for each item, a measure of the variability of DIF across items would be desirable for addressing the question of whether or not the measure of DIF in any single item was important when compared to the overall test. He proposed that methods be devised to handle all items simultaneously. A number of statistical methods have been devised to address DIF more holistically than the one-item-at-a-time approach. These methods fall into a category termed differential test functioning (DTF) to distinguish them from differential item functioning (DIF). Within the DTF category, three major subdivisions can be identified.
First, DTF may be obtained as the expected signed or unsigned difference between two test (or subtest) response functions. Signed DIF may accumulate across a selected group of items; for example, if all items
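The abstract notes that logistic regression gives results equivalent to the proposed DIF method. A hedged sketch of that logistic-regression screen on synthetic data (everything here is invented for illustration, and the logistic fit is a hand-rolled Newton-Raphson so the example is self-contained): regress the item response on a matching score plus a group indicator; a nonzero group coefficient flags uniform DIF.

```python
import numpy as np

# Synthetic DIF example: an item whose log-odds are shifted by -0.8 for
# the focal group at equal ability. Logistic regression of the response on
# the matching score plus a group dummy recovers that shift.
rng = np.random.default_rng(0)
n = 2000
ability = rng.normal(size=n)
group = rng.integers(0, 2, size=n)              # 0 = reference, 1 = focal
logit = ability - 0.8 * group                   # uniform DIF of -0.8
item = (rng.uniform(size=n) < 1 / (1 + np.exp(-logit))).astype(float)

X = np.column_stack([np.ones(n), ability, group])
beta = np.zeros(3)
for _ in range(25):                             # Newton-Raphson for the MLE
    p = 1 / (1 + np.exp(-X @ beta))
    W = p * (1 - p)
    grad = X.T @ (item - p)
    hess = (X * W[:, None]).T @ X
    beta = beta + np.linalg.solve(hess, grad)

dif_log_odds = beta[2]                          # group coefficient = DIF estimate
```

With 2000 simulated examinees the estimated group coefficient lands near the injected -0.8; in practice the matching variable would be the observed total score rather than true ability.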

620 citations

Book
01 Jan 2003
TL;DR: The basis for, and advantages of, Bayesian model estimation via repeated sampling are explained, and models for spatial outcomes and geographical association are described.
Abstract: Preface. The Basis for, and Advantages of, Bayesian Model Estimation via Repeated Sampling. Hierarchical Mixture Models. Regression Models. Analysis of Multi-Level Data. Models for Time Series. Analysis of Panel Data. Models for Spatial Outcomes and Geographical Association. Structural Equation and Latent Variable Models. Survival and Event History Models. Modelling and Establishing Causal Relations: Epidemiological Methods and Models. Index.

596 citations

Book
11 Jul 2005
TL;DR: In this book, the author presents Bayesian approaches to model comparison and choice and to regression for metric, binary, count and ordinal outcomes, building on the normal linear model and generalized linear models.
Abstract: Preface. Chapter 1 Principles of Bayesian Inference. 1.1 Bayesian updating. 1.2 MCMC techniques. 1.3 The basis for MCMC. 1.4 MCMC sampling algorithms. 1.5 MCMC convergence. 1.6 Competing models. 1.7 Setting priors. 1.8 The normal linear model and generalized linear models. 1.9 Data augmentation. 1.10 Identifiability. 1.11 Robustness and sensitivity. 1.12 Chapter themes. References. Chapter 2 Model Comparison and Choice. 2.1 Introduction: formal methods, predictive methods and penalized deviance criteria. 2.2 Formal Bayes model choice. 2.3 Marginal likelihood and Bayes factor approximations. 2.4 Predictive model choice and checking. 2.5 Posterior predictive checks. 2.6 Out-of-sample cross-validation. 2.7 Penalized deviances from a Bayes perspective. 2.8 Multimodel perspectives via parallel sampling. 2.9 Model probability estimates from parallel sampling. 2.10 Worked example. References. Chapter 3 Regression for Metric Outcomes. 3.1 Introduction: priors for the linear regression model. 3.2 Regression model choice and averaging based on predictor selection. 3.3 Robust regression methods: models for outliers. 3.4 Robust regression methods: models for skewness and heteroscedasticity. 3.5 Robustness via discrete mixture models. 3.6 Non-linear regression effects via splines and other basis functions. 3.7 Dynamic linear models and their application in non-parametric regression. Exercises. References. Chapter 4 Models for Binary and Count Outcomes. 4.1 Introduction: discrete model likelihoods vs. data augmentation. 4.2 Estimation by data augmentation: the Albert-Chib method. 4.3 Model assessment: outlier detection and model checks. 4.4 Predictor selection in binary and count regression. 4.5 Contingency tables. 4.6 Semi-parametric and general additive models for binomial and count responses. Exercises. References. Chapter 5 Further Questions in Binomial and Count Regression. 5.1 Generalizing the Poisson and binomial: overdispersion and robustness. 
5.2 Continuous mixture models. 5.3 Discrete mixtures. 5.4 Hurdle and zero-inflated models. 5.5 Modelling the link function. 5.6 Multivariate outcomes. Exercises. References. Chapter 6 Random Effect and Latent Variable Models for Multicategory Outcomes. 6.1 Multicategory data: level of observation and relations between categories. 6.2 Multinomial models for individual data: modelling choices. 6.3 Multinomial models for aggregated data: modelling contingency tables. 6.4 The multinomial probit. 6.5 Non-linear predictor effects. 6.6 Heterogeneity via the mixed logit. 6.7 Aggregate multicategory data: the multinomial-Dirichlet model and extensions. 6.8 Multinomial extra variation. 6.9 Latent class analysis. Exercises. References. Chapter 7 Ordinal Regression. 7.1 Aspects and assumptions of ordinal data models. 7.2 Latent scale and data augmentation. 7.3 Assessing model assumptions: non-parametric ordinal regression and assessing ordinality. 7.4 Location-scale ordinal regression. 7.5 Structural interpretations with aggregated ordinal data. 7.6 Log-linear models for contingency tables with ordered categories. 7.7 Multivariate ordered outcomes. Exercises. References. Chapter 8 Discrete Spatial Data. 8.1 Introduction. 8.2 Univariate responses: the mixed ICAR model and extensions. 8.3 Spatial robustness. 8.4 Multivariate spatial priors. 8.5 Varying predictor effect models. Exercises. References. Chapter 9 Time Series Models for Discrete Variables. 9.1 Introduction: time dependence in observations and latent data. 9.2 Observation-driven dependence. 9.3 Parameter-driven dependence via DLMs. 9.4 Parameter-driven dependence via autocorrelated error models. 9.5 Integer autoregressive models. 9.6 Hidden Markov models. Exercises. References. Chapter 10 Hierarchical and Panel Data Models. 10.1 Introduction: clustered data and general linear mixed models. 10.2 Hierarchical models for metric outcomes. 10.3 Hierarchical generalized linear models. 10.4 Random effects for crossed factors.
10.5 The general linear mixed model for panel data. 10.6 Conjugate panel models. 10.7 Growth curve analysis. 10.8 Multivariate panel data. 10.9 Robustness in panel and clustered data analysis. 10.10 APC and spatio-temporal models. 10.11 Space-time and spatial APC models. Exercises. References. Chapter 11 Missing-Data Models. 11.1 Introduction: types of missing data. 11.2 Density mechanisms for missing data. 11.3 Auxiliary variables. 11.4 Predictors with missing values. 11.5 Multiple imputation. 11.6 Several responses with missing values. 11.7 Non-ignorable non-response models for survey tabulations. 11.8 Recent developments. Exercises. References. Index.

172 citations

Journal ArticleDOI
TL;DR: The results suggest generally higher levels of ill health for individuals who are older, not married, in a semi/unskilled manual social class, and socioeconomically deprived (as measured by a composite deprivation score).
Abstract: STUDY OBJECTIVE: To assess the nature of the relation between health and social factors at both the aggregated scale of geographical areas and the individual scale. DESIGN AND SETTING: The individual data are derived from the sample of anonymised records (SAR) from the census of 1991 in Great Britain, and are combined with area data from this census. The ecological setting (context) was defined using multivariate methods to classify the 278 districts of residence identifiable in the SAR. The outcome health variable is the 1991 census long-term limiting illness question. Health variations were analysed by multilevel logistic regression to examine the compositional variation (at the level of the individual) and the contextual variation (variability operating at the level of districts) in reported illness. PARTICIPANTS: 10 per cent randomised subsample of the SAR who are aged 16+ and are resident in households. MAIN RESULTS: The multi-level modelling revealed that area factors have a significant association with individual health outcome but their effect is smaller than that of individual attributes. The results show evidence for both compositional and contextual effects in the pattern of variation in propensity to report illness. CONCLUSIONS: The results suggest generally higher levels of ill health for individuals who are older, not married, in a semi/unskilled manual social class, and socioeconomically deprived (as measured by a composite deprivation score). All individuals living in areas with high levels of illness (which tend to be more deprived areas) show greater morbidity, even after allowing for their individual characteristics. However, within affluent areas, where morbidity was generally lower, the health inequality (health gradient) between rich and poor individuals was particularly strong. We consider the implications of these findings for health and resource allocation policy.
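The multilevel logistic model above partitions variation in reported illness into a compositional (individual) part and a contextual (district) part. A hedged sketch of one common way to summarise the contextual share (not the paper's own code): for a random-intercept logistic model, the latent-scale intraclass correlation is ICC = sigma_u^2 / (sigma_u^2 + pi^2/3), where pi^2/3 is the variance of the standard logistic distribution; the variance value used below is illustrative, not an estimate from the study.

```python
import math

# Latent-scale intraclass correlation for a random-intercept logistic model:
# the share of latent variation operating at the cluster (district) level.
def latent_icc(sigma_u_sq: float) -> float:
    return sigma_u_sq / (sigma_u_sq + math.pi**2 / 3)

# Illustrative value: a modest district-level variance of 0.15 implies only
# a few per cent of latent variation is contextual, consistent with area
# effects being real but smaller than individual effects.
icc = latent_icc(0.15)
```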

171 citations


Cited by
Journal ArticleDOI
TL;DR: In this article, the authors make a case for the importance of reporting variance explained (R2) as a relevant summarizing statistic of mixed-effects models; such reporting is rare, even though R2 is routinely reported for linear models and generalized linear models (GLM).
Abstract: Summary: The use of both linear and generalized linear mixed-effects models (LMMs and GLMMs) has become popular not only in social and medical sciences, but also in biological sciences, especially in the field of ecology and evolution. Information criteria, such as Akaike Information Criterion (AIC), are usually presented as model comparison tools for mixed-effects models. The presentation of ‘variance explained’ (R2) as a relevant summarizing statistic of mixed-effects models, however, is rare, even though R2 is routinely reported for linear models (LMs) and also generalized linear models (GLMs). R2 has the extremely useful property of providing an absolute value for the goodness-of-fit of a model, which cannot be given by the information criteria. As a summary statistic that describes the amount of variance explained, R2 can also be a quantity of biological interest. One reason for the under-appreciation of R2 for mixed-effects models lies in the fact that R2 can be defined in a number of ways. Furthermore, most definitions of R2 for mixed-effects models have theoretical problems (e.g. decreased or negative R2 values in larger models) and/or their use is hindered by practical difficulties (e.g. implementation). Here, we make a case for the importance of reporting R2 for mixed-effects models. We first provide the common definitions of R2 for LMs and GLMs and discuss the key problems associated with calculating R2 for mixed-effects models. We then recommend a general and simple method for calculating two types of R2 (marginal and conditional R2) for both LMMs and GLMMs, which are less susceptible to common problems. This method is illustrated by examples and can be widely employed by researchers in any field of research, regardless of software packages used for fitting mixed-effects models. The proposed method has the potential to facilitate the presentation of R2 for a wide range of circumstances.
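The two quantities the abstract recommends can be sketched directly from fitted variance components. In this hedged illustration the variance values are made-up numbers, not estimates from any model: marginal R2 is the share of total variance explained by the fixed effects alone, while conditional R2 also credits the random effects.

```python
# Marginal and conditional R^2 for a mixed-effects model, computed from
# (illustrative) variance components:
#   R2_marginal    = var_fixed / (var_fixed + var_random + var_resid)
#   R2_conditional = (var_fixed + var_random) / (same denominator)
def r2_mixed(var_fixed: float, var_random: float, var_resid: float):
    total = var_fixed + var_random + var_resid
    return var_fixed / total, (var_fixed + var_random) / total

# Example components: fixed effects explain half the variance, the random
# effects a further quarter, so marginal R2 = 0.5 and conditional R2 = 0.75.
r2_marg, r2_cond = r2_mixed(var_fixed=2.0, var_random=1.0, var_resid=1.0)
```

By construction the conditional R2 can never be smaller than the marginal R2, which is one of the properties that makes the pair easy to interpret.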

7,749 citations

Journal ArticleDOI

6,278 citations

Journal ArticleDOI
TL;DR: As an example of how the current "war on terrorism" could generate a durable civic renewal, Putnam points to the burst in civic practices that occurred during and after World War II, which he says "permanently marked" the generation that lived through it and had a "terrific effect on American public life over the last half-century."
Abstract: The present historical moment may seem a particularly inopportune time to review Bowling Alone, Robert Putnam's latest exploration of civic decline in America. After all, the outpouring of volunteerism, solidarity, patriotism, and self-sacrifice displayed by Americans in the wake of the September 11 terrorist attacks appears to fly in the face of Putnam's central argument: that "social capital", defined as "social networks and the norms of reciprocity and trustworthiness that arise from them" (p. 19), has declined to dangerously low levels in America over the last three decades. However, Putnam is not fazed in the least by the recent effusion of solidarity. Quite the contrary, he sees in it the potential to "reverse what has been a 30- to 40-year steady decline in most measures of connectedness or community." As an example of how the current "war on terrorism" could generate a durable civic renewal, Putnam points to the burst in civic practices that occurred during and after World War II, which he says "permanently marked" the generation that lived through it and had a "terrific effect on American public life over the last half-century." If Americans can follow this example and channel their current civic

5,309 citations

Journal ArticleDOI
TL;DR: This work considers approximate Bayesian inference in a popular subset of structured additive regression models, latent Gaussian models, in which the latent field is Gaussian, controlled by a few hyperparameters, and paired with non-Gaussian response variables; it shows that very accurate approximations to the posterior marginals can be computed directly.
Abstract: Structured additive regression models are perhaps the most commonly used class of models in statistical applications. It includes, among others, (generalized) linear models, (generalized) additive models, smoothing spline models, state space models, semiparametric regression, spatial and spatiotemporal models, log-Gaussian Cox processes and geostatistical and geoadditive models. We consider approximate Bayesian inference in a popular subset of structured additive regression models, latent Gaussian models, where the latent field is Gaussian, controlled by a few hyperparameters and with non-Gaussian response variables. The posterior marginals are not available in closed form owing to the non-Gaussian response variables. For such models, Markov chain Monte Carlo methods can be implemented, but they are not without problems, in terms of both convergence and computational time. In some practical applications, the extent of these problems is such that Markov chain Monte Carlo sampling is simply not an appropriate tool for routine analysis. We show that, by using an integrated nested Laplace approximation and its simplified version, we can directly compute very accurate approximations to the posterior marginals. The main benefit of these approximations is computational: where Markov chain Monte Carlo algorithms need hours or days to run, our approximations provide more precise estimates in seconds or minutes. Another advantage with our approach is its generality, which makes it possible to perform Bayesian analysis in an automatic, streamlined way, and to compute model comparison criteria and various predictive measures so that models can be compared and the model under study can be challenged.
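The core idea behind the integrated nested Laplace approximation, stripped to its simplest form, is to replace a posterior with a Gaussian centred at its mode, with variance from the curvature there. This hedged sketch is not the INLA algorithm itself, only that one building block, demonstrated on a conjugate Gamma-Poisson model (with invented data) where the exact posterior is known, so the approximation error can be checked directly.

```python
import math

# Gamma(a, b) prior on a Poisson rate, with the exact conjugate posterior
# Gamma(alpha, beta) available for comparison against the Laplace Gaussian.
a, b = 2.0, 1.0                          # prior hyperparameters (illustrative)
y = [4, 6, 5, 7, 5, 6, 4, 5]             # simulated observed counts
alpha, beta = a + sum(y), b + len(y)     # exact posterior: Gamma(alpha, beta)

# Laplace approximation: Gaussian at the posterior mode.
# log density (up to a constant): (alpha-1)*log(lam) - beta*lam
mode = (alpha - 1) / beta                # argmax of the log density
curv = (alpha - 1) / mode**2             # negative second derivative at mode
laplace_mean, laplace_sd = mode, math.sqrt(1 / curv)

exact_mean = alpha / beta
rel_err = abs(laplace_mean - exact_mean) / exact_mean
```

With even a handful of counts the Laplace mean sits within a few per cent of the exact posterior mean; the accuracy (and speed, since no sampling is involved) of such approximations is what the abstract's hours-versus-seconds comparison rests on.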

4,164 citations

Journal ArticleDOI
TL;DR: A review of D. B. Rubin's Multiple Imputation for Nonresponse in Surveys, which presents multiple imputation as a principled method for handling nonresponse in surveys.
Abstract: 25. Multiple Imputation for Nonresponse in Surveys. By D. B. Rubin. ISBN 0 471 08705 X. Wiley, Chichester, 1987. 258 pp. £30.25.

3,216 citations