Author

John Hinde

Bio: John Hinde is an academic researcher from the National University of Ireland, Galway. He has contributed to research on topics including overdispersion and count data, has an h-index of 28, and has co-authored 92 publications receiving 4,964 citations. Previous affiliations of John Hinde include Lancaster University and the University of Exeter.


Papers
Book
16 Mar 1989
TL;DR: This book introduces statistical modelling in GLIM 3, covering normal regression and analysis of variance, binomial, multinomial and Poisson response data, and survival data, with appendices on the GLIM directives, system defined structures, datasets and macros.
Abstract: Part 1 Introducing GLIM 3: getting started in GLIM 3. Part 2 Statistical modelling and statistical inference: the Bernoulli distribution for binary data, types of variables, definition of a statistical model, model criticism, likelihood-based confidence intervals. Part 3 Normal regression and analysis of variance: the normal distribution and the Box-Cox transformation family, link functions and transformations, regression models for prediction, the use of regression models for calibration, factorial designs, missing data. Part 4 Binomial response data: binary responses, transformations and link functions, contingency table construction from binary data, multidimensional contingency tables with a binary response. Part 5: Multinomial and Poisson response data. Part 6 Survival data: probability plotting with censored data - the Kaplan-Meier estimator, the Weibull distribution, the Cox proportional hazards model and the piecewise exponential distribution, the logistic and log-logistic distributions, time-dependent explanatory variables. Appendices: discussion of GLIM directives, system defined structures in GLIM, datasets and macros.

742 citations
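GLIM fitted all of these models by iteratively reweighted least squares (IRLS). As a rough illustration of the machinery behind the directives the book documents, here is a minimal IRLS sketch for a Poisson log-linear model; this is my own Python/NumPy sketch on invented data, not code from the book.

```python
import numpy as np

def irls_poisson(X, y, n_iter=25, tol=1e-8):
    """Fit a Poisson GLM with log link by iteratively reweighted
    least squares, the algorithm GLIM used for its model families."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta                  # linear predictor
        mu = np.exp(eta)                # inverse of the log link
        # Working response z and weights W for the log link:
        # z = eta + (y - mu)/mu,  W = diag(mu)
        z = eta + (y - mu) / mu
        W = mu
        beta_new = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * z))
        if np.max(np.abs(beta_new - beta)) < tol:
            beta = beta_new
            break
        beta = beta_new
    return beta

# Toy data: counts rising with a single covariate.
rng = np.random.default_rng(1)
x = rng.uniform(0, 2, size=200)
y = rng.poisson(np.exp(0.5 + 0.8 * x))
X = np.column_stack([np.ones_like(x), x])
print(irls_poisson(X, y))   # roughly [0.5, 0.8]
```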

Journal ArticleDOI
TL;DR: This work covers the GLIM directives, system defined structures, datasets and macros, and topics ranging from regression models for prediction and calibration to factorial designs and missing data.

620 citations

Journal ArticleDOI
TL;DR: In this article, different formulations of the overdispersion mechanism are shown to lead to different variance functions, and a range of estimation methods, including maximum likelihood, moment methods, extended quasi-likelihood, pseudo-likelihood and non-parametric maximum likelihood, is placed within a general framework.

463 citations
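As a minimal illustration of the moment-method end of that family, a dispersion factor can be estimated from the Pearson statistic of a fitted Poisson model. The sketch below uses statsmodels on simulated data; it is my own example, not code from the paper.

```python
import numpy as np
import statsmodels.api as sm

# Simulate overdispersed counts: a negative binomial matches the
# Poisson in the mean but has variance mu + mu^2/size.
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 500)
mu = np.exp(1.0 + 1.5 * x)
y = rng.negative_binomial(n=2.0, p=2.0 / (2.0 + mu))

X = sm.add_constant(x)
fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()

# Moment estimate of the dispersion factor: phi = X^2_Pearson / (n - p).
# Values well above 1 signal overdispersion relative to the Poisson.
phi = fit.pearson_chi2 / fit.df_resid
print(f"dispersion estimate: {phi:.2f}")
```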

Journal ArticleDOI
01 Jul 1981

423 citations

01 Jan 1998
TL;DR: In this paper, the authors propose models for count data that allow for excess zeros, drawing the distinction between structural zeros, which are inevitable, and sampling zeros, which occur by chance.
Abstract: Poisson regression models provide a standard framework for the analysis of count data. In practice, however, count data are often overdispersed relative to the Poisson distribution. One frequent manifestation of overdispersion is that the incidence of zero counts is greater than expected for the Poisson distribution and this is of interest because zero counts frequently have special status. For example, in counting disease lesions on plants, a plant may have no lesions either because it is resistant to the disease, or simply because no disease spores have landed on it. This is the distinction between structural zeros, which are inevitable, and sampling zeros, which occur by chance. In recent years there has been considerable interest in models for count data that allow for excess zeros, particularly in the econometric literature. These models complement more conventional models for overdispersion that concentrate on modelling the variance-mean relationship correctly. Application areas are diverse and have included manufacturing defects (Lambert, 1992), patent applications (Crepon & Duguet, 1997), road safety (Miaou, 1994), species abundance (Welsh et al., 1996; Faddy, 1998), medical consultations

411 citations
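The structural/sampling-zero distinction leads to the zero-inflated Poisson model, with P(Y=0) = pi + (1-pi)e^{-lambda} and P(Y=y) = (1-pi)e^{-lambda}lambda^y/y! for y > 0. Below is a minimal maximum-likelihood sketch for the intercept-only case, on simulated data; this is my own illustration, not code from the paper.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln, expit

def zip_negloglik(params, y):
    """Negative log-likelihood of an intercept-only zero-inflated
    Poisson: pi is the structural-zero probability, lam the Poisson
    mean for the 'at risk' part of the population."""
    pi = expit(params[0])            # logit scale keeps pi in (0, 1)
    lam = np.exp(params[1])          # log scale keeps lam positive
    zero = y == 0
    ll_zero = np.log(pi + (1 - pi) * np.exp(-lam))
    ll_pos = np.log1p(-pi) - lam + y * np.log(lam) - gammaln(y + 1)
    return -(zero.sum() * ll_zero + ll_pos[~zero].sum())

# Simulated data: 30% structural zeros, Poisson(2.5) otherwise.
rng = np.random.default_rng(42)
n = 1000
structural = rng.random(n) < 0.3
y = np.where(structural, 0, rng.poisson(2.5, n))

res = minimize(zip_negloglik, x0=[0.0, 0.0], args=(y,))
print("pi  =", expit(res.x[0]))   # approx 0.3
print("lam =", np.exp(res.x[1]))  # approx 2.5
```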


Cited by
Journal ArticleDOI
TL;DR: In this article, the authors provide an introduction to mixed-effects models for the analysis of repeated measurement data with subjects and items as crossed random effects, together with a worked example of how to use recent software for mixed-effects modelling.

6,853 citations
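For orientation, a crossed random-effects model of the kind the paper advocates, with subjects s and items i, can be sketched as follows (notation mine, not the authors'):

```latex
% Subject and item effects are crossed: every subject can meet every item
y_{si} = \beta_0 + \mathbf{x}_{si}^{\top}\boldsymbol{\beta} + u_s + w_i + \varepsilon_{si},
\qquad u_s \sim N(0,\sigma_u^2), \quad w_i \sim N(0,\sigma_w^2), \quad \varepsilon_{si} \sim N(0,\sigma^2)
```

Because u_s and w_i are crossed rather than nested, by-subject and by-item variability are estimated simultaneously instead of in separate analyses.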

Book
01 Jan 1987
TL;DR: This book presents a general classification notation and diagram for multilevel models, and discusses the general structure and maximum likelihood estimation of a multilevel model, as well as the adequacy of ordinary least squares estimates.
Abstract: Contents Dedication Preface Acknowledgements Notation A general classification notation and diagram Glossary
Chapter 1 An introduction to multilevel models: 1.1 Hierarchically structured data 1.2 School effectiveness 1.3 Sample survey methods 1.4 Repeated measures data 1.5 Event history and survival models 1.6 Discrete response data 1.7 Multivariate models 1.8 Nonlinear models 1.9 Measurement errors 1.10 Cross classifications and multiple membership structures 1.11 Factor analysis and structural equation models 1.12 Levels of aggregation and ecological fallacies 1.13 Causality 1.14 The latent normal transformation and missing data 1.15 Other texts 1.16 A caveat
Chapter 2 The 2-level model: 2.1 Introduction 2.2 The 2-level model 2.3 Parameter estimation 2.4 Maximum likelihood estimation using Iterative Generalised Least Squares (IGLS) 2.5 Marginal models and Generalized Estimating Equations (GEE) 2.6 Residuals 2.7 The adequacy of Ordinary Least Squares estimates 2.8 A 2-level example using longitudinal educational achievement data 2.9 General model diagnostics 2.10 Higher level explanatory variables and compositional effects 2.11 Transforming to normality 2.12 Hypothesis testing and confidence intervals 2.13 Bayesian estimation using Markov Chain Monte Carlo (MCMC) 2.14 Data augmentation Appendix 2.1 The general structure and maximum likelihood estimation for a multilevel model Appendix 2.2 Multilevel residuals estimation Appendix 2.3 Estimation using profile and extended likelihood Appendix 2.4 The EM algorithm Appendix 2.5 MCMC sampling
Chapter 3 Three-level models and more complex hierarchical structures: 3.1 Complex variance structures 3.2 A 3-level complex variation model example 3.3 Parameter constraints 3.4 Weighting units 3.5 Robust (sandwich) estimators and jackknifing 3.6 The bootstrap 3.7 Aggregate level analyses 3.8 Meta analysis 3.9 Design issues
Chapter 4 Multilevel models for discrete response data: 4.1 Generalised linear models 4.2 Proportions as responses 4.3 Examples 4.4 Models for multiple response categories 4.5 Models for counts 4.6 Mixed discrete-continuous response models 4.7 A latent normal model for binary responses 4.8 Partitioning variation in discrete response models Appendix 4.1 Generalised linear model estimation Appendix 4.2 Maximum likelihood estimation for generalised linear models Appendix 4.3 MCMC estimation for generalised linear models Appendix 4.4 Bootstrap estimation for generalised linear models
Chapter 5 Models for repeated measures data: 5.1 Repeated measures data 5.2 A 2-level repeated measures model 5.3 A polynomial model example for adolescent growth and the prediction of adult height 5.4 Modelling an autocorrelation structure at level 1 5.5 A growth model with autocorrelated residuals 5.6 Multivariate repeated measures models 5.7 Scaling across time 5.8 Cross-over designs 5.9 Missing data 5.10 Longitudinal discrete response data
Chapter 6 Multivariate multilevel data: 6.1 Introduction 6.2 The basic 2-level multivariate model 6.3 Rotation designs 6.4 A rotation design example using Science test scores 6.5 Informative response selection: subject choice in examinations 6.6 Multivariate structures at higher levels and future predictions 6.7 Multivariate responses at several levels 6.8 Principal components analysis Appendix 6.1 MCMC algorithm for a multivariate normal response model with constraints
Chapter 7 Latent normal models for multivariate data: 7.1 The normal multilevel multivariate model 7.2 Sampling binary responses 7.3 Sampling ordered categorical responses 7.4 Sampling unordered categorical responses 7.5 Sampling count data 7.6 Sampling continuous non-normal data 7.7 Sampling the level 1 and level 2 covariance matrices 7.8 Model fit 7.9 Partially ordered data 7.10 Hybrid normal/ordered variables 7.11 Discussion
Chapter 8 Multilevel factor analysis, structural equation and mixture models: 8.1 A 2-stage 2-level factor model 8.2 A general multilevel factor model 8.3 MCMC estimation for the factor model 8.4 Structural equation models 8.5 Discrete response multilevel structural equation models 8.6 More complex hierarchical latent variable models 8.7 Multilevel mixture models
Chapter 9 Nonlinear multilevel models: 9.1 Introduction 9.2 Nonlinear functions of linear components 9.3 Estimating population means 9.4 Nonlinear functions for variances and covariances 9.5 Examples of nonlinear growth and nonlinear level 1 variance Appendix 9.1 Nonlinear model estimation
Chapter 10 Multilevel modelling in sample surveys: 10.1 Sample survey structures 10.2 Population structures 10.3 Small area estimation
Chapter 11 Multilevel event history and survival models: 11.1 Introduction 11.2 Censoring 11.3 Hazard and survival functions 11.4 Parametric proportional hazard models 11.5 The semiparametric Cox model 11.6 Tied observations 11.7 Repeated events proportional hazard models 11.8 Example using birth interval data 11.9 Log duration models 11.10 Examples with birth interval data and children's activity episodes 11.11 The grouped discrete time hazards model 11.12 Discrete time latent normal event history models
Chapter 12 Cross classified data structures: 12.1 Random cross classifications 12.2 A basic cross classified model 12.3 Examination results for a cross classification of schools 12.4 Interactions in cross classifications 12.5 Cross classifications with one unit per cell 12.6 Multivariate cross classified models 12.7 A general notation for cross classifications 12.8 MCMC estimation in cross classified models Appendix 12.1 IGLS estimation for cross classified data
Chapter 13 Multiple membership models: 13.1 Multiple membership structures 13.2 Notation and classifications for multiple membership structures 13.3 An example of salmonella infection 13.4 A repeated measures multiple membership model 13.5 Individuals as higher level units 13.5.1 Example of research grant awards 13.6 Spatial models 13.7 Missing identification models Appendix 13.1 MCMC estimation for multiple membership models
Chapter 14 Measurement errors in multilevel models: 14.1 A basic measurement error model 14.2 Moment based estimators 14.3 A 2-level example with measurement error at both levels 14.4 Multivariate responses 14.5 Nonlinear models 14.6 Measurement errors for discrete explanatory variables 14.7 MCMC estimation for measurement error models Appendix 14.1 Measurement error estimation Appendix 14.2 MCMC estimation for measurement error models
Chapter 15 Smoothing models for multilevel data: 15.1 Introduction 15.2 Smoothing estimators 15.3 Smoothing splines 15.4 Semi-parametric smoothing models 15.5 Multilevel smoothing models 15.6 General multilevel semi-parametric smoothing models 15.7 Generalised linear models 15.8 An example 15.9 Conclusions
Chapter 16 Missing data, partially observed data and multiple imputation: 16.1 Creating a completed data set 16.2 Joint modelling for missing data 16.3 A two level model with responses of different types at both levels 16.4 Multiple imputation 16.5 A simulation example of multiple imputation for missing data 16.6 Longitudinal data with attrition 16.7 Partially known data values 16.8 Conclusions
Chapter 17 Multilevel models with correlated random effects: 17.1 Non-independence of level 2 residuals 17.2 MCMC estimation for non-independent level 2 residuals 17.3 Adaptive proposal distributions in MCMC estimation 17.4 MCMC estimation for non-independent level 1 residuals 17.5 Modelling the level 1 variance as a function of explanatory variables with random effects 17.6 Discrete responses with correlated random effects 17.7 Calculating the DIC statistic 17.8 A growth data set 17.9 Conclusions
Chapter 18 Software for multilevel modelling
References Author index Subject index

5,839 citations
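The basic 2-level model that anchors the book (Chapter 2) is worth writing out; in standard notation (mine, not quoted from the text), with units i nested in groups j:

```latex
% Level-1 units i (e.g. students) nested in level-2 units j (e.g. schools)
y_{ij} = \beta_0 + \beta_1 x_{ij} + u_j + e_{ij},
\qquad u_j \sim N(0, \sigma_u^2), \quad e_{ij} \sim N(0, \sigma_e^2)
```

The group-level random intercept u_j induces a within-group correlation of sigma_u^2/(sigma_u^2 + sigma_e^2); IGLS and MCMC, both covered in Chapter 2, are alternative routes to estimating the fixed and random parameters.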

Journal ArticleDOI
TL;DR: In this paper, approximation of the marginal quasi-likelihood of a generalized linear mixed model (GLMM) by Laplace's method leads to penalized quasi-likelihood (PQL) estimating equations for the mean parameters and pseudo-likelihood for the variances; for time series, spatial aggregation and smoothing problems, the dispersion matrix may be specified in terms of a rank-deficient inverse covariance matrix.
Abstract: Statistical approaches to overdispersion, correlated errors, shrinkage estimation, and smoothing of regression relationships may be encompassed within the framework of the generalized linear mixed model (GLMM). Given an unobserved vector of random effects, observations are assumed to be conditionally independent with means that depend on the linear predictor through a specified link function and conditional variances that are specified by a variance function, known prior weights and a scale factor. The random effects are assumed to be normally distributed with mean zero and dispersion matrix depending on unknown variance components. For problems involving time series, spatial aggregation and smoothing, the dispersion may be specified in terms of a rank-deficient inverse covariance matrix. Approximation of the marginal quasi-likelihood using Laplace's method leads eventually to estimating equations based on penalized quasi-likelihood (PQL) for the mean parameters and pseudo-likelihood for the variances. Im...

4,317 citations
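In symbols, the model the abstract describes is the following (standard GLMM notation, not quoted from the paper):

```latex
% Conditional mean of y_i given the random effects b, through link g:
g\big( E[y_i \mid b] \big) = x_i^{\top}\beta + z_i^{\top}b,
\qquad b \sim N\big( 0,\, D(\theta) \big)
% PQL: maximize jointly in (beta, b) the penalized quasi-likelihood
\sum_i q_i(\beta, b) \;-\; \tfrac{1}{2}\, b^{\top} D^{-1} b
```

Here q_i is the quasi-likelihood contribution of observation i; the quadratic penalty in b is what Laplace's method contributes after integrating over the random effects.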

BookDOI
01 Jan 2006
TL;DR: Regression models are frequently used to develop diagnostic, prognostic, and health resource utilization models in clinical, health services, outcomes, pharmacoeconomic, and epidemiologic research, and in a multitude of non-health-related areas.
Abstract: Regression models are frequently used to develop diagnostic, prognostic, and health resource utilization models in clinical, health services, outcomes, pharmacoeconomic, and epidemiologic research, and in a multitude of non-health-related areas. Regression models are also used to adjust for patient heterogeneity in randomized clinical trials, in order to obtain tests that are more powerful and valid than unadjusted treatment comparisons.

4,211 citations

Journal ArticleDOI
TL;DR: This article synthesizes recent developments in capture-recapture models for estimating survival, with an emphasis on flexibility in modeling, model selection, and the analysis of multiple data sets.
Abstract: The understanding of the dynamics of animal populations and of related ecological and evolutionary issues frequently depends on a direct analysis of life history parameters. For instance, examination of trade-offs between reproduction and survival usually relies on individually marked animals, for which the exact time of death is most often unknown, because marked individuals cannot be followed closely through time. Thus, the quantitative analysis of survival studies and experiments must be based on capture-recapture (or resighting) models which consider, besides the parameters of primary interest, recapture or resighting rates that are nuisance parameters. Capture-recapture models oriented to estimation of survival rates are the result of a recent change in emphasis from earlier approaches in which population size was the most important parameter, survival rates having been first introduced as nuisance parameters. This emphasis on survival rates in capture-recapture models developed rapidly in the 1980s and used as a basic structure the Cormack-Jolly-Seber survival model applied to a homogeneous group of animals, with various kinds of constraints on the model parameters. These approaches are conditional on first captures; hence they do not attempt to model the initial capture of unmarked animals as functions of population abundance in addition to survival and capture probabilities. This paper synthesizes, using a common framework, these recent developments together with new ones, with an emphasis on flexibility in modeling, model selection, and the analysis of multiple data sets. The effects on survival and capture rates of time, age, and categorical variables characterizing the individuals (e.g., sex) can be considered, as well as interactions between such effects. This "analysis of variance" philosophy emphasizes the structure of the survival and capture process rather than the technical characteristics of any particular model. The flexible array of models encompassed in this synthesis uses a common notation. As a result of the great level of flexibility and relevance achieved, the focus is changed from fitting a particular model to model building and model selection. The following procedure is recommended: (1) start from a global model compatible with the biology of the species studied and with the design of the study, and assess its fit; (2) select a more parsimonious model using Akaike's Information Criterion to limit the number of formal tests; (3) test for the most important biological questions by comparing this model with neighboring ones using likelihood ratio tests; and (4) obtain maximum likelihood estimates of model parameters with estimates of precision. Computer software is critical, as few of the models now available have parameter estimators that are in closed form. A comprehensive table of existing computer software is provided. We used RELEASE for data summary and goodness-of-fit tests and SURGE for iterative model fitting and the computation of likelihood ratio tests. Five increasingly complex examples are given to illustrate the theory. The first, using two data sets on the European Dipper (Cinclus cinclus), tests for sex-specific parameters,

4,038 citations
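As a concrete illustration of the Cormack-Jolly-Seber structure the survey builds on, here is a minimal sketch of the likelihood for a time-constant model with survival phi and recapture probability p, conditional on first capture. This is my own Python/NumPy sketch with invented capture histories, not the Dipper data or code from the paper.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

def cjs_negloglik(params, histories):
    """Time-constant Cormack-Jolly-Seber negative log-likelihood.
    Each history is a 0/1 vector over capture occasions; the model
    conditions on the first capture of each animal."""
    phi, p = expit(params)           # keep both probabilities in (0, 1)
    T = histories.shape[1]
    # chi[t] = Pr(never seen again after occasion t):
    # chi[T-1] = 1,  chi[t] = (1 - phi) + phi * (1 - p) * chi[t+1]
    chi = np.ones(T)
    for t in range(T - 2, -1, -1):
        chi[t] = 1 - phi + phi * (1 - p) * chi[t + 1]
    ll = 0.0
    for h in histories:
        seen = np.flatnonzero(h)
        first, last = seen[0], seen[-1]
        # Survived each interval between first and last sighting,
        # and was either recaptured (p) or missed (1 - p) each time...
        for t in range(first + 1, last + 1):
            ll += np.log(phi) + np.log(p if h[t] else 1 - p)
        # ...then never seen again after the last sighting.
        ll += np.log(chi[last])
    return -ll

# Tiny invented data set: rows are animals, columns capture occasions.
histories = np.array([
    [1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 1],
])
res = minimize(cjs_negloglik, x0=[0.0, 0.0], args=(histories,))
print("phi =", expit(res.x[0]), " p =", expit(res.x[1]))
```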