Showing papers on "Poisson regression" published in 2003


Journal ArticleDOI
TL;DR: Cox or Poisson regression with robust variance and log-binomial regression provide correct estimates and are a better alternative for the analysis of cross-sectional studies with binary outcomes than logistic regression, since the prevalence ratio is more interpretable and easier to communicate to non-specialists than the odds ratio.
Abstract: Cross-sectional studies with binary outcomes analyzed by logistic regression are frequent in the epidemiological literature. However, the odds ratio can importantly overestimate the prevalence ratio, the measure of choice in these studies. Also, controlling for confounding is not equivalent for the two measures. In this paper we explore alternatives for modeling data of such studies with techniques that directly estimate the prevalence ratio. We compared Cox regression with constant time at risk, Poisson regression and log-binomial regression against the standard Mantel-Haenszel estimators. Models with robust variance estimators in Cox and Poisson regressions and variance corrected by the scale parameter in Poisson regression were also evaluated. Three outcomes, from a cross-sectional study carried out in Pelotas, Brazil, with different levels of prevalence were explored: weight-for-age deficit (4%), asthma (31%) and mother in a paid job (52%). Unadjusted Cox/Poisson regression and Poisson regression with scale parameter adjusted by deviance performed worst in terms of interval estimates. Poisson regression with scale parameter adjusted by χ2 showed variable performance depending on the outcome prevalence. Cox/Poisson regression with robust variance, and log-binomial regression performed equally well when the model was correctly specified. Cox or Poisson regression with robust variance and log-binomial regression provide correct estimates and are a better alternative for the analysis of cross-sectional studies with binary outcomes than logistic regression, since the prevalence ratio is more interpretable and easier to communicate to non-specialists than the odds ratio. However, precautions are needed to avoid estimation problems in specific situations.

3,455 citations
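
The gap between the odds ratio and the prevalence ratio described in the abstract above can be illustrated with plain 2x2 arithmetic. The following is a minimal sketch, not the authors' analysis: the two tables are hypothetical counts chosen so that the prevalence ratio is 2 in both, once with a rare outcome and once with a common one.

```python
def prevalence_ratio(a, b, c, d):
    """Prevalence ratio for a 2x2 table.

    a, b: exposed subjects with / without the outcome
    c, d: unexposed subjects with / without the outcome
    """
    exposed_prevalence = a / (a + b)
    unexposed_prevalence = c / (c + d)
    return exposed_prevalence / unexposed_prevalence


def odds_ratio(a, b, c, d):
    """Odds ratio (cross-product ratio) for the same 2x2 table."""
    return (a * d) / (b * c)


# Hypothetical counts, for illustration only.
rare_outcome = (8, 192, 4, 196)       # prevalence 4% vs 2%
common_outcome = (120, 80, 60, 140)   # prevalence 60% vs 30%

for label, table in (("rare", rare_outcome), ("common", common_outcome)):
    print(label, round(prevalence_ratio(*table), 3), round(odds_ratio(*table), 3))
```

With these made-up counts the prevalence ratio is 2.0 in both tables, while the odds ratio is about 2.04 for the rare outcome and 3.5 for the common one, which is the kind of overestimation the abstract warns about.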


Journal ArticleDOI
TL;DR: In this paper, a bivariate Poisson model and its extensions are proposed to model the number of goals of two competing teams in a football game. The models allow for correlation between the two scores, which is a plausible assumption in sports with two opposing teams competing against each other.
Abstract: Summary. Models based on the bivariate Poisson distribution are used for modelling sports data. Independent Poisson distributions are usually adopted to model the number of goals of two competing teams. We replace the independence assumption by considering a bivariate Poisson model and its extensions. The models proposed allow for correlation between the two scores, which is a plausible assumption in sports with two opposing teams competing against each other. The effect of introducing even slight correlation is discussed. Using just a bivariate Poisson distribution can improve model fit and prediction of the number of draws in football games. The model is extended by considering an inflation factor for diagonal terms in the bivariate joint distribution. This inflation improves in precision the estimation of draws and, at the same time, allows for overdispersed, relative to the simple Poisson distribution, marginal distributions. The properties of the models proposed as well as interpretation and estimation procedures are provided. An illustration of the models is presented by using data sets from football and water-polo.

412 citations
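
One common way to construct a bivariate Poisson distribution is to let both scores share a common Poisson component. The sketch below assumes that construction (X = W1 + W3, Y = W2 + W3 with independent Poisson W's); the abstract does not spell out the parameterization, and the rates used here are hypothetical. It compares the draw probability with and without the shared component while keeping the marginal means fixed.

```python
import math


def poisson_pmf(n, lam):
    """P(N = n) for a Poisson variable with mean lam."""
    return math.exp(-lam) * lam**n / math.factorial(n)


def bivariate_poisson_pmf(x, y, lam1, lam2, lam3):
    """P(X = x, Y = y) where X = W1 + W3 and Y = W2 + W3.

    W1, W2, W3 are independent Poisson variables with means lam1, lam2, lam3,
    so cov(X, Y) = lam3 and the marginal means are lam1 + lam3 and lam2 + lam3.
    """
    total = 0.0
    for k in range(min(x, y) + 1):  # k is the value of the shared component W3
        total += poisson_pmf(k, lam3) * poisson_pmf(x - k, lam1) * poisson_pmf(y - k, lam2)
    return total


def draw_probability(lam1, lam2, lam3, max_goals=40):
    """P(X = Y), truncating the sum at max_goals (negligible tail for small rates)."""
    return sum(bivariate_poisson_pmf(g, g, lam1, lam2, lam3) for g in range(max_goals))


# Hypothetical scoring rates: marginal means 1.5 and 1.2 in both cases.
draw_independent = draw_probability(1.5, 1.2, 0.0)
draw_correlated = draw_probability(1.3, 1.0, 0.2)
print(draw_independent, draw_correlated)
```

Under these assumed rates the correlated version assigns more probability to draws than the independent one, in line with the abstract's remark that even slight correlation affects the predicted number of draws.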


Journal ArticleDOI
TL;DR: The random effect negative binomial (RENB) model is applied to investigate the relationship between accident occurrence and the geometric, traffic and control characteristics of signalized intersections in Singapore and showed that 11 variables significantly affected the safety at the intersections.

391 citations


Journal ArticleDOI
TL;DR: In this article, the conditional logit model based on random utility maximization has been used to model firm location decisions and the Poisson regression has been proposed as a tractable solution to handle complex choice scenarios with a large number of spatial alternatives.
Abstract: The conditional logit model based on random utility maximization has provided an adequate framework to model firm location decisions. However, in practice, the implementation of this methodology presents problems when one has to handle complex choice scenarios with a large number of spatial alternatives. We posit the Poisson regression as a tractable solution to these problems. We demonstrate that by taking advantage of an equivalence relation between the likelihood function of the conditional logit and the Poisson regression we can, under certain circumstances, easily estimate a conditional logit model regardless of the number of choices. This insight should be particularly useful for studies of economic location.

329 citations
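
The likelihood equivalence mentioned in the abstract above can be checked numerically in a simplified setting. The sketch below is an illustration under stated assumptions, not the paper's derivation: it uses a single hypothetical covariate per location, made-up counts of firms per location, and a Poisson intercept that is profiled out at its maximum-likelihood value. Under those assumptions the two log-likelihoods differ only by a constant that does not depend on the slope.

```python
import math


def conditional_logit_loglik(counts, x, beta):
    """Grouped conditional logit log-likelihood with one covariate per alternative."""
    log_denominator = math.log(sum(math.exp(beta * xj) for xj in x))
    return sum(n * (beta * xj - log_denominator) for n, xj in zip(counts, x))


def poisson_profile_loglik(counts, x, beta):
    """Poisson log-likelihood with mean exp(alpha + beta * x_j), alpha profiled out."""
    total = sum(counts)
    # For fixed beta, the MLE of alpha makes the fitted means sum to the total count.
    alpha = math.log(total / sum(math.exp(beta * xj) for xj in x))
    loglik = 0.0
    for n, xj in zip(counts, x):
        mu = math.exp(alpha + beta * xj)
        loglik += n * math.log(mu) - mu - math.lgamma(n + 1)
    return loglik


# Hypothetical data: number of firms choosing each of five locations.
firm_counts = [12, 7, 30, 4, 19]
location_x = [0.2, -0.5, 1.1, -1.3, 0.6]

gap_a = poisson_profile_loglik(firm_counts, location_x, 0.3) - conditional_logit_loglik(firm_counts, location_x, 0.3)
gap_b = poisson_profile_loglik(firm_counts, location_x, -1.2) - conditional_logit_loglik(firm_counts, location_x, -1.2)
print(gap_a, gap_b)
```

Because the gap is the same for every slope value, both likelihoods are maximized at the same slope, which is why a Poisson fit on location counts can stand in for the conditional logit in this simplified case.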


Journal ArticleDOI
TL;DR: In this paper, a zero-inflated negative binomial mixed regression model is presented to analyze a set of pancreas disorder length of stay (LOS) data that comprised mainly same-day separations.
Abstract: In many biometrical applications, the count data encountered often contain extra zeros relative to the Poisson distribution. Zero-inflated Poisson regression models are useful for analyzing such data, but parameter estimates may be seriously biased if the nonzero observations are over-dispersed and simultaneously correlated due to the sampling design or the data collection procedure. In this paper, a zero-inflated negative binomial mixed regression model is presented to analyze a set of pancreas disorder length of stay (LOS) data that comprised mainly same-day separations. Random effects are introduced to account for inter-hospital variations and the dependency of clustered LOS observations. Parameter estimation is achieved by maximizing an appropriate log-likelihood function using an EM algorithm. Alternative modeling strategies, namely the finite mixture of Poisson distributions and the non-parametric maximum likelihood approach, are also considered. The determination of pertinent covariates would assist hospital administrators and clinicians to manage LOS and expenditures efficiently.

286 citations
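
The zero-inflated negative binomial distribution at the core of the model above mixes a point mass at zero with an ordinary negative binomial count. The sketch below shows only that probability mass function, assuming the mean/dispersion parameterization of the negative binomial; the random effects and the EM estimation described in the abstract are omitted, and the parameter values are hypothetical.

```python
import math


def negbin_pmf(y, mu, k):
    """Negative binomial pmf with mean mu and dispersion k (variance mu + mu**2 / k)."""
    log_p = (
        math.lgamma(y + k) - math.lgamma(k) - math.lgamma(y + 1)
        + k * math.log(k / (k + mu))
        + y * math.log(mu / (k + mu))
    )
    return math.exp(log_p)


def zinb_pmf(y, pi_zero, mu, k):
    """Zero-inflated negative binomial pmf.

    With probability pi_zero the count is a structural zero; otherwise it is
    drawn from the negative binomial distribution.
    """
    base = (1.0 - pi_zero) * negbin_pmf(y, mu, k)
    return pi_zero + base if y == 0 else base


# Hypothetical parameters: 40% structural zeros, NB mean 3, dispersion 1.5.
probabilities = [zinb_pmf(y, 0.4, 3.0, 1.5) for y in range(500)]
print(probabilities[0], sum(probabilities))
```

The mean of this mixture is (1 - pi_zero) * mu, so the zero-inflation parameter shifts both the share of zeros and the overall mean.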


Journal ArticleDOI
TL;DR: The negative binomial model provides an alternative approach for the analysis of discrete data where overdispersion is a problem, provided that the model is correctly specified and adequately fits the data.

180 citations


Journal ArticleDOI
TL;DR: In this paper, a fully parametric approach is taken and a marginal distribution for the counts is specified, where conditional on past observations the mean is autoregressive. A variety of models based on the double Poisson distribution of Efron (1986) are introduced, which first add a dispersion parameter and then make this dispersion parameter time-varying.
Abstract: This paper introduces and evaluates new models for time series count data. The Autoregressive Conditional Poisson model (ACP) makes it possible to deal with issues of discreteness, overdispersion (variance greater than the mean) and serial correlation. A fully parametric approach is taken and a marginal distribution for the counts is specified, where conditional on past observations the mean is autoregressive. This makes it possible to attain improved inference on coefficients of exogenous regressors relative to static Poisson regression, which is the main concern of the existing literature, while modelling the serial correlation in a flexible way. A variety of models based on the double Poisson distribution of Efron (1986) are introduced, which in a first step introduce an additional dispersion parameter and in a second step make this dispersion parameter time-varying. All models are estimated using maximum likelihood, which makes the usual tests available. In this framework autocorrelation can be tested with a straightforward likelihood ratio test, whose simplicity is in sharp contrast with test procedures in the latent variable time series count model of Zeger (1988). The models are applied to the time series of monthly polio cases in the U.S. between 1970 and 1983 as well as to the daily number of price change durations of $0.75 on the IBM stock. A $0.75 price-change duration is defined as the time it takes the stock price to move by at least $0.75. The variable of interest is the daily number of such durations, which is a measure of intradaily volatility, since the more volatile the stock price is within a day, the larger the counts will be. The ACP models provide good density forecasts of this measure of volatility.

160 citations
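
An autoregressive conditional mean for counts can be sketched with a GARCH-style recursion. The form used below, mu_t = omega + alpha * y_(t-1) + beta * mu_(t-1), is an assumption made for illustration since the abstract does not state the recursion. The sketch also uses a plain Poisson likelihood rather than the double Poisson extension, and the counts and parameter values are hypothetical.

```python
import math


def acp_conditional_means(counts, omega, alpha, beta):
    """Conditional means mu_t = omega + alpha * y_(t-1) + beta * mu_(t-1).

    The recursion starts at the unconditional mean omega / (1 - alpha - beta),
    which requires alpha + beta < 1.
    """
    if alpha + beta >= 1.0:
        raise ValueError("alpha + beta must be below 1 for a finite unconditional mean")
    mu = omega / (1.0 - alpha - beta)
    means = []
    for y in counts:
        means.append(mu)                     # mean used for the current observation
        mu = omega + alpha * y + beta * mu   # update for the next observation
    return means


def acp_poisson_loglik(counts, omega, alpha, beta):
    """Poisson log-likelihood of the counts given the recursive conditional means."""
    means = acp_conditional_means(counts, omega, alpha, beta)
    return sum(y * math.log(mu) - mu - math.lgamma(y + 1) for y, mu in zip(counts, means))


# Hypothetical short count series.
series = [2, 0, 5]
print(acp_conditional_means(series, 1.0, 0.3, 0.5))
print(acp_poisson_loglik(series, 1.0, 0.3, 0.5))
```

Setting alpha and beta to zero collapses the recursion to a constant mean, so a likelihood ratio comparison against that restricted fit is one way to read the autocorrelation test mentioned in the abstract.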


Journal ArticleDOI
TL;DR: It is indicated that ambient particles have effects on mortality among the elderly, with relative risks comparable or slightly higher than those observed for total mortality and similar effect modification patterns.
Abstract: Within the framework of the APHEA2 (Air Pollution on Health: a European Approach) project, the effects of ambient particles on mortality among persons ≥65 yrs were investigated. Daily measurements for particles with a 50% cut-off aerodynamic diameter of 10 µm (PM10) and black smoke (BS), as well as the daily number of deaths among persons ≥65 yrs of age, from 29 European cities, have been collected. Data on other pollutants and meteorological variables, to adjust for confounding effects and data on city characteristics, to investigate potential effect modification, were also recorded. For individual city analysis, generalised additive models extending Poisson regression, using a locally weighted regression (LOESS) smoother to control for seasonal effects, were applied. To combine individual city results and explore effect modification, second stage regression models were applied. The per cent increase (95% confidence intervals), associated with a 10 µg·m⁻³ increase in PM10, in the elderly daily number of deaths was 0.8% (0.7-0.9%) and the corresponding number for BS was 0.6% (0.5-0.8%). The effect size was modified by the long-term average levels of nitrogen dioxide (higher levels were associated with larger effects), temperature (larger effects were observed in warmer countries), and by the proportion of the elderly in each city (a larger proportion was associated with higher effects). These results indicate that ambient particles have effects on mortality among the elderly, with relative risks comparable or slightly higher than those observed for total mortality and similar effect modification patterns. The effects among the older persons are of particular importance, since the attributable number of events will be much larger, compared to the number of deaths among the younger population.

130 citations


Journal ArticleDOI
TL;DR: The Poisson regression model is frequently used to analyze count data, but data are often over- or sometimes even underdispersed as compared to the standard Poisson model, so the definition of Poisson R-squared measures can be applied in these situations, albeit with bias adjustments accordingly adapted.

108 citations


Journal ArticleDOI
TL;DR: In this paper, a method is presented to derive point and interval estimates of the total number of individuals in a heterogeneous Poisson population based on the Horvitz-Thompson approach.
Abstract: A method is presented to derive point and interval estimates of the total number of individuals in a heterogeneous Poisson population. The method is based on the Horvitz-Thompson approach. The zero-truncated Poisson regression model is fitted and results are used to obtain point and interval estimates for the total number of individuals in the population. The method is assessed by performing a simulation experiment computing coverage probabilities of Horvitz-Thompson confidence intervals for cases with different sample sizes and Poisson parameters. We illustrate our method using capture-recapture data from the police registration system providing information on illegal immigrants in four large cities in the Netherlands.

92 citations
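
The Horvitz-Thompson idea in the abstract above weights each observed individual by the inverse of the probability of being observed at least once. The sketch below covers only the homogeneous special case, with one Poisson rate shared by everyone and no covariates, whereas the paper fits a zero-truncated Poisson regression. The capture counts are hypothetical and the helper names are made up for illustration.

```python
import math


def fit_truncated_poisson_rate(counts, iterations=200):
    """MLE of the rate of a homogeneous zero-truncated Poisson distribution.

    Solves rate / (1 - exp(-rate)) = sample mean by bisection; the left-hand
    side is increasing in the rate, so the root is unique when the mean exceeds 1.
    """
    mean = sum(counts) / len(counts)
    if mean <= 1.0:
        raise ValueError("sample mean must exceed 1 for a positive rate estimate")
    low, high = 1e-9, mean  # the truncated mean exceeds the rate, so the root is below the mean
    for _ in range(iterations):
        mid = 0.5 * (low + high)
        if mid / (1.0 - math.exp(-mid)) < mean:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


def horvitz_thompson_total(rates):
    """Population size estimate: sum over observed individuals of 1 / P(count > 0)."""
    return sum(1.0 / (1.0 - math.exp(-rate)) for rate in rates)


# Hypothetical capture counts for 100 observed individuals.
captures = [1] * 60 + [2] * 30 + [3] * 10
rate_hat = fit_truncated_poisson_rate(captures)
population_hat = horvitz_thompson_total([rate_hat] * len(captures))
print(rate_hat, population_hat)
```

In the regression setting each individual would receive its own fitted rate from the covariates, and the same sum of inverse observation probabilities would give the population estimate.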


Journal ArticleDOI
TL;DR: It is suggested that marijuana use may be independently associated with increased risk of hospitalized injury, and the results must be viewed cautiously.

Journal ArticleDOI
TL;DR: In several real-life examples one encounters count data where the number of zeros is such that the usual Poisson distribution does not fit the data. In such cases a zero-inflated generalized Poisson model can be considered and a Bayesian analysis can be carried out.

Journal ArticleDOI
TL;DR: In this paper, a truncated Poisson regression model is used to arrive at point and interval estimates of the size of two offender populations, i.e. drunk drivers and persons who illegally possess firearms.
Abstract: The truncated Poisson regression model is used to arrive at point and interval estimates of the size of two offender populations, i.e. drunk drivers and persons who illegally possess firearms. The dependent capture‐recapture variables are constructed from Dutch police records and are counts of individual arrests for both violations. The population size estimates are derived assuming that each count is a realization of a Poisson distribution, and that the Poisson parameters are related to covariates through the truncated Poisson regression model. These assumptions are discussed in detail, and the tenability of the second assumption is assessed by evaluating the marginal residuals and performing tests on overdispersion. For the firearms example, the second assumption seems to hold well, but for the drunk drivers example there is some overdispersion. It is concluded that the method is useful, provided it is used with care.

Journal Article
TL;DR: It is indicated that temperature has an effect on daily mortality in Shanghai, and the time-series approach is a useful tool for studying the temperature-mortality association.

Journal ArticleDOI
TL;DR: The rate of acute HB infection was significantly associated with year, urban region and lower vaccine uptake, and there was an interaction between region and vaccine uptake such that higher vaccine uptake appeared more protective in rural than in urban regions.
Abstract: Background: British Columbia introduced a preadolescent hepatitis B (HB) immunization program in 1992. This study documents trends in the reported rate of acute HB disease since 1992 and examines factors bearing on the rate of infection throughout the period of program implementation. Methods: All Grade 6 students were eligible for immunization. Vaccine uptake was reported annually for every school. Acute HB infections were reported by physicians and by biomedical laboratories. Year-to-year trends were analyzed overall and by age group using the electronic public health information system and S-plus. Likelihood ratio tests were used to establish whether a variable was associated with the rate of acute HB in a given cohort. Poisson regression was applied to determine which variables were independently associated with the rate of acute HB. Results: Immunization coverage ranged between 90 and 93% for each year between 1993 and 2001. The overall rate of reported acute HB declined from 7 per 100 000 to just more than 2 per 100 000, whereas that in 12- to 21-year-olds declined from 17 to 0 per 100 000 over this one decade period. In the final Poisson regression model, the rate of acute HB infection was significantly associated with year, urban region and lower vaccine uptake. There was an interaction between region and vaccine uptake such that higher vaccine uptake appeared more protective in rural than in urban regions. Conclusions: Acute HB has been eliminated in the immunized adolescent cohort. A higher carrier rate in urban regions most likely explains the apparent difference in program effectiveness between urban and rural regions.

Journal ArticleDOI
TL;DR: The elevated mortality from external causes among Finnish building/ground construction workers was probably due to living conditions and related lifestyles, and some evidence was found for a risk of lung cancer due to occupational exposure.
Abstract: Background: This study, a component of the International Agency for Research on Cancer (IARC) Multicentric Study on Cancer Risk Among European Asphalt Workers, aimed at identifying major mortality risks among workers in Finnish road paving companies. Methods: The Finnish cohort was comprised of 9,643 men and women from six road paving companies. The mortality of men employed during at least one season (5,676) was followed up from 1964 until end of 1994; an average of 17 years. Standardized mortality ratios (SMR) and relative risks (RR), the latter based on multivariate Poisson regression models, were estimated by occupational group and by various metrics of occupational exposures. Results: All-cause mortality was elevated (SMR 1.11, 95% confidence interval, CI 1.03–1.20), mainly due to excesses in accidents, poisonings, and violence (1.29; CI 1.12–1.49), and lung cancer (1.38; 1.03–1.81). Workers exposed to bitumen fumes had a slightly elevated mortality from lung cancer (1.16; 0.69–1.83). Multivariate Poisson regression models with 15-year lag period suggested trends by cumulative exposure to coal tar, organic vapors, silica dust, diesel exhaust, and bitumen fume. Conclusions: The elevated mortality from external causes among Finnish building/ground construction workers was probably due to living conditions and related lifestyles. Some evidence was found for a risk of lung cancer due to occupational exposure, but the confirmation of these findings would require a longer follow-up and improved control for confounding. Am. J. Ind. Med. 43:49–57, 2003. © 2003 Wiley-Liss, Inc.

Book ChapterDOI
01 Jan 2003
TL;DR: This chapter notes that least squares regression is based on a signal-plus-noise model using an underlying Gaussian distribution, whereas the correct analysis uses distributions appropriate for categorical data, and it presents results for a very general regression model, the generalized linear model.
Abstract: In Chapters 2 and 3 we explored the use of least squares regression to analyze and understand the relationship between a target variable and at least one predicting variable. While it was possible to model the number of deaths monthly from the number of killer tornadoes, there were clear problems in that model fitting, including negative estimated tornado-related deaths, an apparent nonlinear relationship between the target and the predictor, and heteroscedasticity in the residuals from the model. As was noted on page 25, the problem is that least squares is based on a "signal-plus-noise" model using an underlying Gaussian distribution, when the correct analysis uses distributions appropriate for categorical data, such as those discussed in Chapter 4. Before discussing specific models, we present results for a very general regression model, the generalized linear model. In later sections and chapters we will see how these general results apply to specific models for distributions such as the Poisson and binomial.

Journal Article
TL;DR: Wang et al. used Poisson regression to evaluate the relationship between cause-specific deaths and air pollutants, considering potential confounding factors such as seasonal and long-term patterns and meteorological factors (air temperature, air humidity), as well as adjusting for the influence of flu epidemics in the winter of 1998.
Abstract: The aim was to quantitatively evaluate the associations between ambient air pollutants and daily mortality in Beijing and to supply a scientific basis for formulating control measures. The air pollutants considered included CO, SO2, NOx, TSP and PM10. Time-series Poisson regression was used to evaluate the relationship between cause-specific deaths and air pollutants, considering potential confounding factors such as seasonal and long-term patterns and meteorological factors (air temperature, air humidity), as well as adjusting for the influence of flu epidemics in the winter of 1998. The results showed that, in single-factor Poisson regression analysis, there was a significant positive correlation between the four pollutants and daily mortality, except for the relationship between TSP and coronary heart disease deaths. In multi-factor Poisson regression analysis, each 100 µg/m3 increase in SO2 was associated with increases of 4.21%, 3.97%, 10.68% and 19.22% in respiratory deaths, cardiovascular and cerebrovascular deaths, coronary heart disease deaths and chronic obstructive pulmonary disease deaths, respectively. Meanwhile, each 100 µg/m3 increase in TSP was associated with a 3.19% increase in respiratory deaths and a 0.62% increase in cardiovascular and cerebrovascular deaths. It is suggested that air pollution is a risk factor for health and that an increase in air pollution level might lead to a rise in daily mortality.

Journal ArticleDOI
TL;DR: The findings provide new evidence of a downward trend in incidence rates of this disease in China over a period of 20 years, although the observed decline is relatively small and inconsistent across sex and age groups.
Abstract: The objective of this study was to describe trends in the incidence rates of primary liver cancer in a geographically defined Chinese population. Primary liver cancer cases (N=13 685) were diagnosed between 1981 and 2000 and identified by the Tianjin Cancer Registry. Age-adjusted and age-specific incidence rates were examined in both males and females. Poisson regression was employed to assess the incidence rate trends. Crude and age-adjusted incidence rates in the study period were: 27.4/100 000 and 16.4/100 000 in males and 11.5/100 000 and 6.4/100 000 in females, respectively. While the results from Poisson regression analyses suggest statistically significant trends of declining incidence rates of primary liver cancer overall, trends were not consistent across age and sex groups. The decline in incidence was observed, for the most part, in the 40-69 age group, with a greater decrease in males. Our findings provide new evidence of a downward trend in incidence rates of this disease in China over a period of 20 years. As the observed decline is relatively small and inconsistent across sex and age groups, continued epidemiological observation of this condition is required.

Journal ArticleDOI
TL;DR: In this paper, three model types were considered (Poisson regression, negative binomial regression, and nonlinear regression), and the results were compared based on magnitudes and signs of model parameter estimates and t-statistics.
Abstract: Incident prediction models are presented for the Interstate 80/Interstate 94 (Borman Expressway in northwestern Indiana) and Interstate 465 (northeastern Indianapolis, Indiana) freeway sections developed as a function of traffic volume, truck percentage, and weather. Separate models were developed for all incidents and noncrash incidents. Three model types were considered (Poisson regression, negative binomial regression, and nonlinear regression), and the results were compared based on magnitudes and signs of model parameter estimates and t-statistics. Least-squares estimation and maximum-likelihood methods were used to estimate the model parameters. Data from the Indiana Department of Transportation and the Indiana Climatology Database were used to establish the relationships. For a given session and incident category, the results from the Poisson and negative binomial models were found to be consistent. It was observed that, unlike section length, traffic volume is nonlinearly related to incidents, and therefore these two variables have to be considered as separate terms in the modeling process. Truck percentage was found to be a statistically significant factor affecting incident occurrence. It was also found that the weather variable (rain and snow) was negatively correlated to incidents. The freeway incident models developed constitute a useful decision support tool for implementation of new freeway patrol systems or for expansion of existing ones. They are also useful for simulating incident occurrences with a view to identifying elements of cost-effective freeway patrol strategies (patrol deployment policies, fleet size, crew size, and beat routes).

Journal ArticleDOI
TL;DR: The findings highlight that the modified ZIP approach provided a satisfactory fit to the data, and that the manual handling WRATS intervention was associated with a reduction in the proportion of cleaners injured at work.

Journal Article
TL;DR: The results show the presence of a mortality gradient for both material and social forms of deprivation, where the relative risks of mortality of the most disadvantaged group and the most advantaged group are, respectively, 1.34 and 1.35.
Abstract: Cerebrovascular accidents (CVAs) constitute an important cause of disability and death in Quebec. Among the primary CVA risk factors, certain socioeconomic characteristics of individuals and living environments appear to play a central role. The purpose of this article is to examine the links between material/social forms of deprivation and CVA mortality in a group of 4,339 individuals aged 25 to 74 years who died between 1994 and 1998. The socioeconomic profile of these persons was estimated on the basis of the enumeration area in which they resided. The Poisson regression technique was used to estimate the relative risk (RR) of mortality by deprivation level. Our results show the presence of a mortality gradient for both material and social forms of deprivation, where the relative risks of mortality of the most disadvantaged group and the most advantaged group are, respectively, 1.34 and 1.35. Despite the existence of a system of universal health care, inequalities in mortality persist and need to be taken into account when implementing intervention programs.

Posted Content
TL;DR: The Multivariate Autoregressive Conditional Poisson model (MACP) is a multivariate model for time series count data that makes it possible to deal with issues of discreteness, overdispersion and both auto- and cross-correlation.
Abstract: This paper introduces a new multivariate model for time series count data. The Multivariate Autoregressive Conditional Poisson model (MACP) makes it possible to deal with issues of discreteness, overdispersion (variance greater than the mean) and both auto- and cross-correlation. We model counts as Poisson or double Poisson and assume that conditionally on past observations the means follow a Vector Autoregression. We use a copula to introduce contemporaneous correlation between the series. An important advantage of this model is that it can accommodate both positive and negative correlation among variables. As a feasible alternative to multivariate duration models, the model is applied to the submission of market orders and quote revisions on IBM on the New York Stock Exchange. We show that a single factor cannot explain the dynamics of the market process, which confirms that time deformation, taken as meaning that all market events should accelerate or slow down proportionately, does not hold. We advocate the use of the Multivariate Autoregressive Conditional Poisson model for the study of multivariate point processes in finance, when the number of variables considered simultaneously exceeds two and looking at durations becomes too difficult.


Journal ArticleDOI
TL;DR: This study investigates the trends in incidence and mortality and estimates survival for women diagnosed with ovarian cancer in Western Australia.

Journal ArticleDOI
TL;DR: In this paper, a generalization of the simple Poisson process is discussed and illustrated with an analysis of some longitudinal count data on frequencies of epileptic fits, showing that some covariates have a more significant effect using this modelling than from using mixed Poisson models.
Abstract: Models based on a generalization of the simple Poisson process are discussed and illustrated with an analysis of some longitudinal count data on frequencies of epileptic fits. The models enable a broad class of discrete distributions to be constructed, which cover a variety of dispersion properties that can be characterized in an intuitive and appealing way by a simple parameterization. This class includes the Poisson and negative binomial distributions as well as other distributions with greater dispersion than Poisson, and also distributions underdispersed relative to the Poisson distribution. Comparing a number of analyses of the data shows that some covariates have a more significant effect using this modelling than from using mixed Poisson models. It is argued that this could be due to the mixed Poisson models used in the other analyses not providing an appropriate description of the residual variation, with the greater flexibility of the generalized Poisson modelling generally enabling more critical...

Journal ArticleDOI
TL;DR: In this article, a generalized quasilikelihood approach is proposed to analyze the repeated familial data based on the familial structure caused by gamma random effects, which provides estimates of the regression parameters and the variance component of the random effects after taking the longitudinal correlations of the data into account.

Journal Article
TL;DR: The present study showed that there was a rapid increase in the rate of statins newly dispensed to elderly patients in Ontario, among whom estimates of safety, efficacy and cost effectiveness are not well quantified.