Book

Generalized Linear Models

01 Jan 1983
TL;DR: In this paper, a generalization of the analysis of variance is given for these models using log-likelihoods, illustrated by examples relating to four distributions: the Normal, Binomial (probit analysis, etc.), Poisson (contingency tables), and gamma (variance components).
Abstract: The technique of iterative weighted linear regression can be used to obtain maximum likelihood estimates of the parameters with observations distributed according to some exponential family and systematic effects that can be made linear by a suitable transformation. A generalization of the analysis of variance is given for these models using log-likelihoods. These generalized linear models are illustrated by examples relating to four distributions: the Normal, Binomial (probit analysis, etc.), Poisson (contingency tables), and gamma (variance components).
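As a concrete illustration of the iterative weighted least-squares fit described above, the following Python sketch implements the iteration for the Poisson (log-link) case; the function name irls_poisson and the simulated data are illustrative, not taken from the book.

    import numpy as np

    def irls_poisson(X, y, n_iter=25, tol=1e-8):
        """Fit a Poisson GLM with log link by iteratively reweighted least squares.
        Minimal sketch of the iteration, not the authors' implementation."""
        beta = np.zeros(X.shape[1])
        for _ in range(n_iter):
            eta = X @ beta                      # linear predictor
            mu = np.exp(eta)                    # inverse link
            W = mu                              # working weights for the log link
            z = eta + (y - mu) / mu             # adjusted (working) response
            XtW = X.T * W                       # equivalent to X.T @ diag(W)
            beta_new = np.linalg.solve(XtW @ X, XtW @ z)
            if np.max(np.abs(beta_new - beta)) < tol:
                return beta_new
            beta = beta_new
        return beta

    # toy usage: simulate counts and recover the coefficients
    rng = np.random.default_rng(0)
    X = np.column_stack([np.ones(500), rng.normal(size=500)])
    y = rng.poisson(np.exp(X @ np.array([0.5, 0.3])))
    print(irls_poisson(X, y))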
Citations
Journal ArticleDOI
TL;DR: Mortality decreased as volume increased for all 14 types of procedures, but the relative importance of volume varied markedly according to the type of procedure.
Abstract: Background Although numerous studies suggest that there is an inverse relation between hospital volume of surgical procedures and surgical mortality, the relative importance of hospital volume in various surgical procedures is disputed. Methods Using information from the national Medicare claims data base and the Nationwide Inpatient Sample, we examined the mortality associated with six different types of cardiovascular procedures and eight types of major cancer resections between 1994 and 1999 (total number of procedures, 2.5 million). Regression techniques were used to describe relations between hospital volume (total number of procedures performed per year) and mortality (in-hospital or within 30 days), with adjustment for characteristics of the patients. Results Mortality decreased as volume increased for all 14 types of procedures, but the relative importance of volume varied markedly according to the type of procedure. Absolute differences in adjusted mortality rates between very-low-volume hospitals and very-high-volume hospitals ranged from over 12 percent (for pancreatic resection, 16.3 percent vs. 3.8 percent) to only 0.2 percent (for carotid endarterectomy, 1.7 percent vs. 1.5 percent). The absolute differences in adjusted mortality rates between very-low-volume hospitals and very-high-volume hospitals were greater than 5 percent for esophagectomy and pneumonectomy, 2 to 5 percent for gastrectomy, cystectomy, repair of a nonruptured abdominal aneurysm, and replacement of an aortic or mitral valve, and less than 2 percent for coronary-artery bypass grafting, lower-extremity bypass, colectomy, lobectomy, and nephrectomy. Conclusions In the absence of other information about the quality of surgery at the hospitals near them, Medicare patients undergoing selected cardiovascular or cancer procedures can significantly reduce their risk of operative death by selecting a high-volume hospital.
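The adjustment described in the Methods amounts to a patient-level regression of operative death on hospital volume plus case-mix covariates. Below is a hedged sketch of that kind of model using statsmodels; the column names (log_volume, age, comorbidity) and the simulated data are hypothetical, not the study's variables.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    # hypothetical patient-level data; the real study used Medicare claims
    rng = np.random.default_rng(1)
    n = 2000
    df = pd.DataFrame({
        "log_volume":  rng.uniform(1.0, 6.0, n),     # log annual procedure volume
        "age":         rng.normal(74, 6, n),
        "comorbidity": rng.poisson(1.5, n),
    })
    p = 1 / (1 + np.exp(-(-2.0 - 0.4 * df.log_volume
                          + 0.03 * (df.age - 74) + 0.3 * df.comorbidity)))
    df["died"] = rng.binomial(1, p)

    # adjusted association between (log) hospital volume and operative death
    fit = smf.logit("died ~ log_volume + age + comorbidity", data=df).fit(disp=0)
    print(fit.params)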

4,363 citations

Journal ArticleDOI
TL;DR: In this paper, the generalized linear mixed model (GLMM) framework is developed; Laplace approximation of the marginal quasi-likelihood leads to penalized quasi-likelihood estimating equations for the mean parameters and pseudo-likelihood estimates for the variance components, and for time series, spatial aggregation, and smoothing problems the dispersion may be specified through a rank-deficient inverse covariance matrix.
Abstract: Statistical approaches to overdispersion, correlated errors, shrinkage estimation, and smoothing of regression relationships may be encompassed within the framework of the generalized linear mixed model (GLMM). Given an unobserved vector of random effects, observations are assumed to be conditionally independent with means that depend on the linear predictor through a specified link function and conditional variances that are specified by a variance function, known prior weights and a scale factor. The random effects are assumed to be normally distributed with mean zero and dispersion matrix depending on unknown variance components. For problems involving time series, spatial aggregation and smoothing, the dispersion may be specified in terms of a rank-deficient inverse covariance matrix. Approximation of the marginal quasi-likelihood using Laplace's method leads eventually to estimating equations based on penalized quasi-likelihood (PQL) for the mean parameters and pseudo-likelihood for the variances. ...
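A minimal sketch of the penalized quasi-likelihood (PQL) idea for a random-intercept Poisson model, assuming a fixed variance component: fixed and random effects are obtained by maximizing the conditional log-likelihood minus a Gaussian penalty on the random effects. The outer pseudo-likelihood update of the variance component described in the abstract is omitted, and the data and function names are illustrative.

    import numpy as np
    from scipy.optimize import minimize

    def penalized_nll(params, X, y, groups, sigma2):
        p = X.shape[1]
        beta, b = params[:p], params[p:]
        eta = X @ beta + b[groups]              # linear predictor with random intercepts
        # negative Poisson log-likelihood (log link) plus Gaussian penalty on b
        return np.sum(np.exp(eta) - y * eta) + 0.5 * np.sum(b**2) / sigma2

    def fit_pql_inner(X, y, groups, sigma2):
        n_groups = groups.max() + 1
        x0 = np.zeros(X.shape[1] + n_groups)
        res = minimize(penalized_nll, x0, args=(X, y, groups, sigma2), method="BFGS")
        return res.x[:X.shape[1]], res.x[X.shape[1]:]

    # toy data: 30 groups of 20 observations, random intercepts with variance 0.25
    rng = np.random.default_rng(2)
    groups = np.repeat(np.arange(30), 20)
    X = np.column_stack([np.ones(groups.size), rng.normal(size=groups.size)])
    b_true = rng.normal(0, 0.5, 30)
    y = rng.poisson(np.exp(X @ np.array([0.2, 0.5]) + b_true[groups]))
    beta_hat, b_hat = fit_pql_inner(X, y, groups, sigma2=0.25)
    print(beta_hat)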

4,317 citations

Journal ArticleDOI
TL;DR: This article discusses extensions of generalized linear models for the analysis of longitudinal data in which heterogeneity in regression parameters is explicitly modelled and uses a generalized estimating equation approach to fit both classes of models for discrete and continuous outcomes.
Abstract: This article discusses extensions of generalized linear models for the analysis of longitudinal data. Two approaches are considered: subject-specific (SS) models in which heterogeneity in regression parameters is explicitly modelled; and population-averaged (PA) models in which the aggregate response for the population is the focus. We use a generalized estimating equation approach to fit both classes of models for discrete and continuous outcomes. When the subject-specific parameters are assumed to follow a Gaussian distribution, simple relationships between the PA and SS parameters are available. The methods are illustrated with an analysis of data on mother's smoking and children's respiratory disease.
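The population-averaged approach described here corresponds to fitting a marginal model by generalized estimating equations. Below is a sketch using statsmodels' GEE with an exchangeable working correlation, on simulated data loosely patterned on the smoking/respiratory-disease illustration; the variable names are invented.

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    # hypothetical longitudinal data: repeated binary outcomes within children
    rng = np.random.default_rng(3)
    n_kids, n_visits = 200, 4
    child = np.repeat(np.arange(n_kids), n_visits)
    smoke = np.repeat(rng.binomial(1, 0.4, n_kids), n_visits)   # mother smokes
    age = np.tile(np.arange(n_visits), n_kids)
    lin = -1.0 + 0.6 * smoke - 0.1 * age + np.repeat(rng.normal(0, 0.8, n_kids), n_visits)
    wheeze = rng.binomial(1, 1 / (1 + np.exp(-lin)))
    df = pd.DataFrame({"wheeze": wheeze, "smoke": smoke, "age": age, "child": child})

    # population-averaged (marginal) logistic model via GEE with an
    # exchangeable working correlation within each child
    model = sm.GEE.from_formula("wheeze ~ smoke + age", groups="child", data=df,
                                family=sm.families.Binomial(),
                                cov_struct=sm.cov_struct.Exchangeable())
    print(model.fit().summary())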

4,303 citations

Journal ArticleDOI
TL;DR: A flexible statistical framework is developed for the analysis of read counts from RNA-Seq gene expression studies, and parallel computational approaches are developed to make non-linear model fitting faster and more reliable, making the application of GLMs to genomic data more convenient and practical.
Abstract: A flexible statistical framework is developed for the analysis of read counts from RNA-Seq gene expression studies. It provides the ability to analyse complex experiments involving multiple treatment conditions and blocking variables while still taking full account of biological variation. Biological variation between RNA samples is estimated separately from the technical variation associated with sequencing technologies. Novel empirical Bayes methods allow each gene to have its own specific variability, even when there are relatively few biological replicates from which to estimate such variability. The pipeline is implemented in the edgeR package of the Bioconductor project. A case study analysis of carcinoma data demonstrates the ability of generalized linear model methods (GLMs) to detect differential expression in a paired design, and even to detect tumour-specific expression changes. The case study demonstrates the need to allow for gene-specific variability, rather than assuming a common dispersion across genes or a fixed relationship between abundance and variability. Genewise dispersions de-prioritize genes with inconsistent results and allow the main analysis to focus on changes that are consistent between biological replicates. Parallel computational approaches are developed to make non-linear model fitting faster and more reliable, making the application of GLMs to genomic data more convenient and practical. Simulations demonstrate the ability of adjusted profile likelihood estimators to return accurate estimators of biological variability in complex situations. When variation is gene-specific, empirical Bayes estimators provide an advantageous compromise between the extremes of assuming common dispersion or separate genewise dispersion. The methods developed here can also be applied to count data arising from DNA-Seq applications, including ChIP-Seq for epigenetic marks and DNA methylation analyses.
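The core modelling step, a genewise negative-binomial GLM with a moderated dispersion, can be caricatured as follows. This sketch uses statsmodels with a crude method-of-moments dispersion estimate and an ad hoc shrinkage toward the common value; it only illustrates the idea and is not edgeR's adjusted profile likelihood / empirical Bayes procedure.

    import numpy as np
    import statsmodels.api as sm

    # toy two-group RNA-Seq comparison: 100 genes, 3 vs 3 samples
    rng = np.random.default_rng(4)
    n_genes = 100
    group = np.array([0, 0, 0, 1, 1, 1])
    X = sm.add_constant(group)
    counts = rng.negative_binomial(10, 0.3, size=(n_genes, group.size))

    def genewise_dispersion(y, X):
        # method-of-moments NB dispersion from a Poisson fit (illustrative only)
        mu = sm.GLM(y, X, family=sm.families.Poisson()).fit().fittedvalues
        excess = ((y - mu) ** 2 - mu) / mu**2
        return max(np.mean(excess), 1e-4)

    raw_disp = np.array([genewise_dispersion(counts[g], X) for g in range(n_genes)])
    shrunk = 0.7 * raw_disp + 0.3 * raw_disp.mean()   # toy shrinkage toward common value

    # refit each gene with its moderated dispersion and test the group coefficient
    pvals = []
    for g in range(n_genes):
        fam = sm.families.NegativeBinomial(alpha=shrunk[g])
        fit = sm.GLM(counts[g], X, family=fam).fit()
        pvals.append(fit.pvalues[1])
    print(np.sort(pvals)[:5])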

4,127 citations

Journal ArticleDOI
TL;DR: A recent survey of capture-recapture models can be found in this article, with an emphasis on flexibility in modeling, model selection, and the analysis of multiple data sets.
Abstract: The understanding of the dynamics of animal populations and of related ecological and evolutionary issues frequently depends on a direct analysis of life history parameters. For instance, examination of trade-offs between reproduction and survival usually relies on individually marked animals, for which the exact time of death is most often unknown, because marked individuals cannot be followed closely through time. Thus, the quantitative analysis of survival studies and experiments must be based on capture-recapture (or resighting) models which consider, besides the parameters of primary interest, recapture or resighting rates that are nuisance parameters. Capture-recapture models oriented to estimation of survival rates are the result of a recent change in emphasis from earlier approaches in which population size was the most important parameter, survival rates having been first introduced as nuisance parameters. This emphasis on survival rates in capture-recapture models developed rapidly in the 1980s and used as a basic structure the Cormack-Jolly-Seber survival model applied to a homogeneous group of animals, with various kinds of constraints on the model parameters. These approaches are conditional on first captures; hence they do not attempt to model the initial capture of unmarked animals as functions of population abundance in addition to survival and capture probabilities. This paper synthesizes, using a common framework, these recent developments together with new ones, with an emphasis on flexibility in modeling, model selection, and the analysis of multiple data sets. The effects on survival and capture rates of time, age, and categorical variables characterizing the individuals (e.g., sex) can be considered, as well as interactions between such effects. This "analysis of variance" philosophy emphasizes the structure of the survival and capture process rather than the technical characteristics of any particular model. The flexible array of models encompassed in this synthesis uses a common notation. As a result of the great level of flexibility and relevance achieved, the focus is changed from fitting a particular model to model building and model selection. The following procedure is recommended: (1) start from a global model compatible with the biology of the species studied and with the design of the study, and assess its fit; (2) select a more parsimonious model using Akaike's Information Criterion to limit the number of formal tests; (3) test for the most important biological questions by comparing this model with neighboring ones using likelihood ratio tests; and (4) obtain maximum likelihood estimates of model parameters with estimates of precision. Computer software is critical, as few of the models now available have parameter estimators that are in closed form. A comprehensive table of existing computer software is provided. We used RELEASE for data summary and goodness-of-fit tests and SURGE for iterative model fitting and the computation of likelihood ratio tests. Five increasingly complex examples are given to illustrate the theory. The first, using two data sets on the European Dipper (Cinclus cinclus), tests for sex-specific parameters, ...
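For the simplest case of the Cormack-Jolly-Seber structure, constant survival (phi) and recapture (p) probabilities, maximum likelihood estimation and AIC computation can be sketched directly. The toy capture histories below are invented, and this is a bare illustration, not the RELEASE/SURGE software cited in the abstract.

    import numpy as np
    from scipy.optimize import minimize

    def cjs_neg_loglik(params, histories):
        phi, p = 1 / (1 + np.exp(-params))        # logit-scale parameters
        T = histories.shape[1]
        # chi[t] = P(never seen again | alive at occasion t)
        chi = np.ones(T)
        for t in range(T - 2, -1, -1):
            chi[t] = (1 - phi) + phi * (1 - p) * chi[t + 1]
        ll = 0.0
        for h in histories:                        # condition on first capture
            seen = np.flatnonzero(h)
            first, last = seen[0], seen[-1]
            for t in range(first + 1, last + 1):
                ll += np.log(phi) + (np.log(p) if h[t] else np.log(1 - p))
            ll += np.log(chi[last])
        return -ll

    # toy capture histories: rows are animals, columns are occasions
    histories = np.array([
        [1, 1, 0, 1, 0],
        [1, 0, 1, 1, 1],
        [0, 1, 1, 0, 0],
        [1, 1, 1, 0, 0],
        [0, 0, 1, 1, 0],
    ])
    res = minimize(cjs_neg_loglik, x0=np.zeros(2), args=(histories,), method="BFGS")
    phi_hat, p_hat = 1 / (1 + np.exp(-res.x))
    aic = 2 * res.fun + 2 * 2                      # 2 * negloglik + 2 * n_params
    print(phi_hat, p_hat, aic)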

4,038 citations

References
01 Jan 1972
TL;DR: The drum mallets disclosed in this article are adjustable, by the percussion player, as to balance, overall weight, head characteristics and tone production of the mallet, whereby the adjustment can be readily obtained.
Abstract: The drum mallets disclosed are adjustable, by the percussion player, as to weight and/or balance and/or head characteristics, so as to vary the "feel" of the mallet, and thus also the tonal effect obtainable when playing upon kettle-drums, snare-drums, and other percussion instruments; and, typically, the mallet has frictionally slidable, removable and replaceable, external balancing mass means, positionable to serve as the striking head of the mallet, whereby the adjustment as to balance, overall weight, head characteristics and tone production may be readily obtained. In some forms, the said mass means regularly serves as a removable and replaceable striking head; while in other forms, the mass means comprises one or more thin elongated tubes having a frictionally-gripping fit on an elongated mallet body, so as to be manually slidable thereon but tight enough to avoid dislodgment under normal playing action; and such a tubular member may be slidable to the head-end of the mallet to serve as a striking head or it may be slidable to a position to serve as a hand grip; and one or more such tubular members may be placed in various positions along the length of the mallet. The mallet body may also have a tapered element at the head-end to assure retention of mass members especially of enlarged-head types; and the disclosure further includes such heads embodying a relatively hard inner portion and a relatively soft outer covering.

10,148 citations

Journal ArticleDOI
01 May 1972
TL;DR: In this paper, the authors used iterative weighted linear regression to obtain maximum likelihood estimates of the parameters with observations distributed according to some exponential family and systematic effects that can be made linear by a suitable transformation.
Abstract: The technique of iterative weighted linear regression can be used to obtain maximum likelihood estimates of the parameters with observations distributed according to some exponential family and systematic effects that can be made linear by a suitable transformation. A generalization of the analysis of variance is given for these models using log-likelihoods. These generalized linear models are illustrated by examples relating to four distributions: the Normal, Binomial (probit analysis, etc.), Poisson (contingency tables), and gamma (variance components). The implications of the approach in designing statistics courses are discussed.
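In the usual notation for these models (the symbols below follow the standard GLM presentation rather than necessarily the paper's own), each observation has an exponential-family log-likelihood and the iterative weighted least-squares step regresses an adjusted dependent variable on the covariates with appropriate weights:

    \ell(\theta_i; y_i) = \frac{y_i \theta_i - b(\theta_i)}{a(\phi)} + c(y_i, \phi),
    \qquad \mu_i = b'(\theta_i), \qquad g(\mu_i) = \eta_i = x_i^\top \beta

    z_i = \eta_i + (y_i - \mu_i)\, g'(\mu_i), \qquad
    w_i = \left[ g'(\mu_i)^2\, V(\mu_i) \right]^{-1}, \qquad
    \beta^{(t+1)} = (X^\top W X)^{-1} X^\top W z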

8,793 citations

Book
01 Jan 1959

7,235 citations

Book
01 Jan 1959
TL;DR: The General Decision Problem, the Probability Background, Uniformly Most Powerful Tests, Unbiasedness: Theory and First Applications, Unbiasedness: Applications to Normal Distributions, Invariance, and Linear Hypotheses, as discussed by the authors.
Abstract: The General Decision Problem.- The Probability Background.- Uniformly Most Powerful Tests.- Unbiasedness: Theory and First Applications.- Unbiasedness: Applications to Normal Distributions.- Invariance.- Linear Hypotheses.- The Minimax Principle.- Multiple Testing and Simultaneous Inference.- Conditional Inference.- Basic Large Sample Theory.- Quadratic Mean Differentiable Families.- Large Sample Optimality.- Testing Goodness of Fit.- General Large Sample Methods.

6,480 citations

Journal ArticleDOI

6,420 citations