
Showing papers on "Negative binomial distribution" published in 2002


Journal ArticleDOI
TL;DR: The authors showed that the conditional negative binomial model for panel data, proposed by Hausman, Hall, and Griliches (1984), is not a true fixed-effects method.
Abstract: This paper demonstrates that the conditional negative binomial model for panel data, proposed by Hausman, Hall, and Griliches (1984), is not a true fixed-effects method. This method—which has been ...

990 citations


Journal ArticleDOI
TL;DR: It is shown that foetal growth and postnatal somatic growth may have different implications for imitation and fine motor dexterity.
Abstract: Poisson regression is widely used in medical studies, and can be extended to negative binomial regression to allow for heterogeneity. When there is an excess number of zero counts, a useful approach is to use a mixture model with a proportion P of subjects not at risk, and a proportion of 1-P at-risk subjects who take on outcome values following a Poisson or negative binomial distribution. Covariate effects can be incorporated into both components of the models. In child assessment, fine motor development is often measured by test items that involve a process of imitation and a process of fine motor exercise. One such developmental milestone is ‘building a tower of cubes’. This study analyses the impact of foetal growth and postnatal somatic growth on this milestone, operationalized as the number of cubes and measured around the age of 22 months. It is shown that the two aspects of early growth may have different implications for imitation and fine motor dexterity. The usual approach of recording and analysing the milestone as a binary outcome, such as whether the child can build a tower of three cubes, may leave out important information. Copyright © 2002 John Wiley & Sons, Ltd.

247 citations
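
The mixture described in the abstract above (a proportion P of subjects not at risk, and 1-P following a Poisson distribution) can be sketched as a probability mass function. This is only a minimal illustration: the function name `zip_pmf` and the parameter values are assumptions chosen for the example, not values or code from the study, and no covariates are included.

```python
import math


def zip_pmf(k: int, p_not_at_risk: float, lam: float) -> float:
    """P(Y = k) under a zero-inflated Poisson mixture.

    p_not_at_risk is the proportion of subjects who can only produce a zero;
    the remaining subjects follow a Poisson distribution with mean lam.
    """
    poisson = math.exp(-lam) * lam**k / math.factorial(k)
    if k == 0:
        # Zeros come from both components of the mixture.
        return p_not_at_risk + (1.0 - p_not_at_risk) * poisson
    return (1.0 - p_not_at_risk) * poisson


# Illustrative values only (hypothetical, not taken from the paper).
P, LAM = 0.3, 2.0
probs = [zip_pmf(k, P, LAM) for k in range(60)]
total = sum(probs)                      # close to 1 (tail beyond 59 is negligible)
plain_poisson_zero = math.exp(-LAM)     # zero probability without inflation
```

With these values the probability of a zero count is larger than under a plain Poisson model with the same rate, which is the excess-zero behaviour the mixture is meant to capture.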


Journal ArticleDOI
TL;DR: Researchers should consider generalized linear models with normal, Poisson, or negative binomial distributions for predicting length of stay following CABG surgery.
Abstract: Investigators in clinical research are often interested in determining the association between patient characteristics and post-operative length of stay (LOS). We examined the relative performance of seven different statistical strategies for analyzing LOS in a cohort of patients undergoing CABG surgery. We compared linear regression; linear regression with log-transformed length of stay; generalized linear models with the following distributions: Poisson, negative binomial, normal, and gamma; and semi-parametric survival models. Nine of twenty patient characteristics were found to be significantly associated with increased LOS in all models. The models disagreed upon the statistical significance of the association between the remaining patient characteristics and increased LOS. Generalized linear models with Poisson, negative binomial, and gamma distributions, and the Cox regression model demonstrated the greatest consistency. With the exception of Cox regression, all models had similar ability to predict length of stay in the actual data. However, the generalized linear models tended to have marginally lower prediction error than the linear models. Using four measures of prediction error, Cox regression had substantially higher prediction error than the other models. Generalized linear models were best able to predict patient length of stay in Monte Carlo simulations that were performed. Researchers should consider generalized linear models with normal, Poisson, or negative binomial distributions for predicting length of stay following CABG surgery. Post-operative length of stay is a complex phenomenon that is difficult to incorporate into a simple parametric model due to a small proportion of patients having very long lengths of stay.

131 citations


Journal ArticleDOI
TL;DR: The authors propose a general score test for the null hypothesis that the variance components associated with random effects in zero-inflated Poisson and binomial regression models are zero; for a zero-inflated Poisson model with random intercept, it reduces to an alternative to the overdispersion test of Ridout, Demétrio, and Hinde (2001).
Abstract: Hall (2000) has described zero-inflated Poisson and binomial regression models that include random effects to account for excess zeros and additional sources of heterogeneity in the data. The authors of the present paper propose a general score test for the null hypothesis that variance components associated with these random effects are zero. For a zero-inflated Poisson model with random intercept, the new test reduces to an alternative to the overdispersion test of Ridout, Demétrio & Hinde (2001). The authors also examine their general test in the special case of the zero-inflated binomial model with random intercept and propose an overdispersion test in that context which is based on a beta-binomial alternative. Tests scores d'hétérogénéité et de surdispersion dans des modèles de régression binomiaux et de Poisson avec surplus de zéros

67 citations


Journal ArticleDOI
TL;DR: A replicated laboratory study of soil mites, Sancassania berlesei, kept in controlled environments but with food supplied randomly, with different synchrony and variances, while maintaining the same mean rate is reported.
Abstract: Summary 1. Variation in an organism’s environment can influence its life history and therefore its population size. Understanding the interplay between noise and population dynamics is of considerable importance, especially for prescribing management of economically important or threatened species. The impact of noise on population biology has been the subject of frequent theoretical investigations and discussed in post hoc analyses of time-series. However, there is a dearth of experimental investigations. 2. Here we report results from a replicated laboratory study of soil mites, Sancassania berlesei, kept in controlled environments but with food supplied randomly, with different synchrony and variances, while maintaining the same mean rate. 3. Increasing environmental variance increases population variance, but decreases mean population size, therefore changing the observed shape of the distribution of population sizes: the shape becomes more skewed with more environmental variance. 4. The distributions of population sizes are best described by negative binomial or gamma distributions. Log-normal and normal distributions rarely fit and Poisson distributions never fit. 5. The correlation between populations is sensitive to the correlation in environmental noise but insensitive to its variance. 6. Different life stages (eggs, juveniles and adults) respond differently to noise, in that there are significant differences in the way that environmental variation changes the mean, variance, shape of distribution and relationship between environmental and population synchrony.

55 citations


Journal ArticleDOI
TL;DR: This paper analyzed the distributions of the number of goals scored by home teams, away teams, and the total scored in the match, in domestic football games from 169 countries between 1999 and 2001.
Abstract: We analyse the distributions of the number of goals scored by home teams, away teams, and the total scored in the match, in domestic football games from 169 countries between 1999 and 2001. The probability density functions (PDFs) of goals scored are too heavy-tailed to be fitted over their entire ranges by Poisson or negative binomial distributions which would be expected for uncorrelated processes. Log-normal distributions cannot include zero scores and here we find that the PDFs are consistent with those arising from extremal statistics. In addition, we show that it is sufficient to model English top division and FA Cup matches in the seasons of 1970/71–2000/01 on Poisson or negative binomial distributions, as reported in analyses of earlier seasons, and that these are not consistent with extremal statistics.

54 citations


Journal ArticleDOI
TL;DR: In this paper, the authors compare the performance of two relatively new "semiparametric" approaches with two flexible parametric approaches in analysing two patent data sets, whose long upper tails are inadequately modeled by standard Poisson and negative binomial regression models.
Abstract: This article explores alternative approaches to modeling the relationship between the number of patents and research and development expenditure. Patent counts typically exhibit long upper tails that are inadequately modeled by standard Poisson and negative binomial regression models. We compare the performance of two relatively new "semiparametric" approaches with two flexible parametric approaches in analysing two patent data sets. Copyright 2002 by Blackwell Publishing Ltd

50 citations


Journal ArticleDOI
TL;DR: In this article, the authors describe vector generalized linear and additive models (VGLMs and VGAMs) and illustrate their use on a few data sets using several statistical models, which include the negative binomial, beta distribution and the bivariate logistic model.

48 citations


Journal ArticleDOI
01 Sep 2002-Extremes
TL;DR: In this paper, the authors examined the limiting behavior of (Mn−b(n))/a(n) as n → ∞ for different families of discrete distributions.
Abstract: If X1, X2, ..., Xn are independent and identically distributed discrete random variables and Mn = max(X1, ..., Xn), we examine the limiting behavior of (Mn−b(n))/a(n) as n → ∞. It is well known that for discrete distributions such as Poisson and geometric no non-degenerate limiting distribution exists. However, by tuning the parameters of the discrete distribution to vary as n → ∞, it is possible to obtain non-degenerate limits for (Mn−b(n))/a(n). We consider four families of discrete distributions and show how this can be done.

36 citations


Journal ArticleDOI
TL;DR: In this paper, a new formulation of the bivariate binomial distribution is proposed, in which marginally each of the two random variables has a binomial distribution and they have some nonzero correlation in the joint distribution.

35 citations


Journal ArticleDOI
Jie Q. Guo, Tong Li
TL;DR: In this article, a new type of corrected score estimator is proposed assuming that the distribution of the latent variables is known, and the consistency and asymptotic normality of the proposed estimator are established.

Proceedings ArticleDOI
19 May 2002
TL;DR: It is shown that the negative binomial distribution fits well to the distribution of size data such as the number of methods per class and number of lines of code per method and can be effectively used to trace software evolution processes.
Abstract: Size data of software systems are constantly collected but so far there have been no studies of applying statistical distribution models to analyze and interpret those data. In this paper, we show that the negative binomial distribution fits well to the distribution of size data such as the number of methods per class and number of lines of code per method and can be effectively used to trace software evolution processes.

Journal ArticleDOI
TL;DR: This paper compares the practical performance of alternative goodness-of-fit techniques for count data models in the context of a study of the determinants of demand for dental care in Spain and implements recently proposed specification tests consistent in the direction of general nonparametric alternatives.
Abstract: This paper compares the practical performance of alternative goodness-of-fit techniques for count data models in the context of a study of the determinants of demand for dental care in Spain. We apply alternative goodness-of-fit techniques to different specifications. In particular, we implement recently proposed specification tests which are consistent in the direction of general nonparametric alternatives. The analysis suggests that a negative binomial model is an appropriate specification for dental care demand. Dental health and income are identified as important predictors of individuals' behavior.

Journal ArticleDOI
08 Feb 2002
TL;DR: This note studies the problem of finding the set of p > 0 such that $\mu^{*p}$ exists, where μ is the distribution of the sum of independent Bernoulli and negative binomial random variables; this set is a finite union of intervals.
Abstract: For c > 0, this note computes essentially the set of $(x, y)$ in $[0, +\infty)^2$ such that the power series in z defined by $(1 + z/c)^x (1 - z)^{-y}$ has all its coefficients non-negative. If X and Y are independent random variables which have respectively Bernoulli and negative binomial distributions, denote by μ the distribution of X + Y. The above problem is equivalent to finding the set of p > 0 such that $\mu^{*p}$ exists; this set is a finite union of intervals and may be the first example of this type in the literature. This gives the final touch to the classification of the natural exponential families with variance functions of Babel type, i.e. of the form aR(m) + (bm+c)R(m), where R is a polynomial with degree < 2.

Journal ArticleDOI
TL;DR: In this paper, it was shown that the hypergeometric generalized negative binomial distribution has moments of all positive orders, is overdispersed, skewed to the right, and leptokurtic.
Abstract: It is shown that the hypergeometric generalized negative binomial distribution has moments of all positive orders, is overdispersed, skewed to the right, and leptokurtic. Also, a three-term recurrence relation for computing probabilities from the considered distribution is given. Application of the distribution to entomological field data is given and its goodness-of-fit is demonstrated.

Journal ArticleDOI
TL;DR: The main conclusion is that an initial assumption of a negative binomial model is the conservative approach to chromosome dosimetry for high LET radiations.
Abstract: The usual assumption of a Poisson model for the number of chromosome aberrations in controlled calibration experiments implies variance equal to the mean. However, it is known that chromosome aberration data from experiments involving high linear energy transfer radiations can be overdispersed, i.e. the variance is greater than the mean. Present methods for dealing with overdispersed chromosome data rely on frequentist statistical techniques. In this paper, the problem of overdispersion is considered from a Bayesian standpoint. The Bayes Factor is used to compare Poisson and negative binomial models for two previously published calibration data sets describing the induction of dicentric chromosome aberrations by high doses of neutrons. Posterior densities for the model parameters, which characterise dose response and overdispersion, are calculated and graphed. Calibrative densities are derived for unknown neutron doses from hypothetical radiation accident data to determine the impact of different model assumptions on dose estimates. The main conclusion is that an initial assumption of a negative binomial model is the conservative approach to chromosome dosimetry for high LET radiations.
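
The contrast drawn in the abstract above, a Poisson model with variance equal to the mean versus an overdispersed negative binomial model, can be checked numerically. The sketch below is not the Bayesian analysis of the paper; it only assumes the common mean/dispersion parameterisation of the negative binomial (mean mu, shape k, variance mu + mu²/k), and the names `nb_pmf`, `poisson_pmf` and `moments` as well as the parameter values are hypothetical.

```python
import math


def nb_pmf(y: int, mu: float, k: float) -> float:
    """Negative binomial pmf with mean mu and dispersion (shape) parameter k."""
    log_p = (
        math.lgamma(y + k) - math.lgamma(k) - math.lgamma(y + 1)
        + k * math.log(k / (k + mu))
        + y * math.log(mu / (k + mu))
    )
    return math.exp(log_p)


def poisson_pmf(y: int, mu: float) -> float:
    """Poisson pmf, computed on the log scale to avoid overflow."""
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))


def moments(pmf, upper: int = 2000):
    """Mean and variance of a count distribution, truncated at `upper`."""
    mean = sum(y * pmf(y) for y in range(upper))
    var = sum((y - mean) ** 2 * pmf(y) for y in range(upper))
    return mean, var


# Illustrative values only (hypothetical, not from the calibration data).
mu, k = 3.0, 1.5
pois_mean, pois_var = moments(lambda y: poisson_pmf(y, mu))
nb_mean, nb_var = moments(lambda y: nb_pmf(y, mu, k))
# Theory: Poisson variance = mu; negative binomial variance = mu + mu**2 / k.
```

Both distributions share the same mean here, but the negative binomial variance exceeds it, and the excess shrinks as k grows.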


01 Jan 2002
TL;DR: A class of infinitely divisible distributions on {0,1,2,…} is defined by requiring the (discrete) Levy function to be equal to the probability function except for a very simple factor.
Abstract: A class of infinitely divisible distributions on {0,1,2,…} is defined by requiring the (discrete) Levy function to be equal to the probability function except for a very simple factor. These distributions turn out to be special cases of the total offspring distributions in (sub)critical branching processes and can also be interpreted as first passage times in certain random walks. There are connections with Lambert's W function and generalized negative binomial convolutions.

Book ChapterDOI
25 Sep 2002

Journal ArticleDOI
TL;DR: In this paper, a multifractal negative binomial distribution derived from the nonlinear Markov process has been used to describe the multiplicity distributions in hadron−hadron and hadron-nucleus interactions at √s > 10 GeV.
Abstract: Multifractal negative binomial distribution derived from the nonlinear Markov process has been used to describe the multiplicity distributions in hadron–hadron and hadron–nucleus interactions at √s > 10 GeV. The analysis aims at the study of intermittent structure of hadron production in h–h and h–A interactions in terms of bunching parameters and a comparison with the conventional negative binomial approach. This is in continuation of our earlier analysis of e+e− and p interactions.

01 Jan 2002
TL;DR: In this paper, an extension of the Cox-Ross-Rubinstein (CRR) model is proposed to evaluate barrier options with exponential boundaries, based on the construction of a binomial tree for the underlying asset price dynamics, characterized by sets of nodes that mirror the barriers evolution.
Abstract: It is a common belief that the standard binomial algorithm of Cox-Ross-Rubinstein (CRR) cannot be used to deal with barrier options with multiple or time-varying boundaries. We propose an extension of the CRR model to evaluate options with exponential boundaries. The essence of the extended binomial model relies upon the construction of a binomial tree for the underlying asset price dynamics, characterized by sets of nodes that mirror the barriers evolution. As a result, a very easy algorithm is derived that produces accurate prices with respect to the corresponding continuous time values. Moreover, numerical results show that the performance of the extended binomial algorithm is superior to that of the trinomial algorithms usually employed to price these options.
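
For orientation, the standard Cox-Ross-Rubinstein binomial algorithm that the abstract above takes as its starting point can be sketched for a plain European call. This is the textbook baseline only, under the usual CRR choices u = exp(σ√Δt) and d = 1/u; it does not implement the barrier extension proposed in the paper, and the function name and argument names are assumptions made for the example.

```python
import math


def crr_european_call(s0: float, strike: float, rate: float,
                      sigma: float, maturity: float, steps: int) -> float:
    """Price a European call on a standard CRR binomial tree."""
    dt = maturity / steps
    u = math.exp(sigma * math.sqrt(dt))       # up factor
    d = 1.0 / u                               # down factor
    q = (math.exp(rate * dt) - d) / (u - d)   # risk-neutral up probability
    disc = math.exp(-rate * dt)               # one-step discount factor

    # Terminal payoffs; j counts the number of up moves.
    values = [max(s0 * u**j * d**(steps - j) - strike, 0.0)
              for j in range(steps + 1)]

    # Backward induction: discounted risk-neutral expectation at each node.
    for _ in range(steps):
        values = [disc * (q * values[j + 1] + (1.0 - q) * values[j])
                  for j in range(len(values) - 1)]
    return values[0]


# Illustrative inputs only (hypothetical).
price = crr_european_call(100.0, 100.0, 0.05, 0.2, 1.0, 500)
```

As the number of steps grows, the tree price approaches the corresponding continuous-time (Black-Scholes) value, which is the sense of accuracy the abstract refers to.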

Journal ArticleDOI
G. R. Wood
TL;DR: In this paper, the authors explore why chi-square approximations for Poisson goodness-of-fit statistics are poor when the mean is low and present a practical resolution of the problem, in both single distribution and regression contexts; an extension to negative binomial models is also given.
Abstract: Testing goodness of fit for a Poisson distribution is routine when the mean is sufficiently large; the scaled deviance G² and Pearson's X² statistic follow approximate chi-square distributions and perform the task well. When the mean is low, typically less than one, the approximations to chi-square distributions are poor. In this paper we explore the underlying reasons for this behaviour and present a practical resolution of the problem, in both single distribution and regression contexts. An extension to negative binomial models is also given. This research is motivated by a real example drawn from road accident modelling.

Journal ArticleDOI
TL;DR: In this article, all the troughs under a certain threshold level are considered in deriving the probability distribution of annual minima through the total probability theorem, which can be used for modeling the minimum flows in streams which do not have zero flows.


Journal ArticleDOI
TL;DR: In this article, a model for the lifetime of a system is considered in which the system is susceptible to simultaneous failures of two or more components, the failures having a common external cause.
Abstract: A model for the lifetime of a system is considered in which the system is susceptible to simultaneous failures of two or more components, the failures having a common external cause. Three sets of discrete failure data from the US nuclear industry are examined to motivate and illustrate the model derivation: they are for motor-operated valves, cooling fans and emergency diesel generators. To achieve target reliabilities, these components must be placed in systems that have built-in redundancy. Consequently, multiple failures due to a common cause are critical in the risk of core meltdown. Vesely has offered a simple methodology for inference, called the binomial failure rate model: external events are assumed to be governed by a Poisson shock model in which resulting shocks kill X out of m system components, X having a binomial distribution with parameters (m, p), 0 < p < 1. In many applications the binomial failure rate model fits failure data poorly, and the model has not typically been applied to probabilistic risk assessments in the nuclear industry. We introduce a realistic generalization of the binomial failure rate model by assigning a mixing distribution to the unknown parameter p. The distribution is generally identifiable, and its unique nonparametric maximum likelihood estimator can be obtained by using a simple iterative scheme.

Journal ArticleDOI
TL;DR: New evidence is provided suggesting that radioassay data are frequently overdispersed with respect to the Poisson distribution, attributed mostly to the excess fluctuations of the detection systems or, in 2 cases, sequential radioactive decay.
Abstract: New evidence is provided suggesting that radioassay data are frequently overdispersed with respect to the Poisson distribution. Twelve cases of radioassay data were measured using commonly available detection systems. The data were analyzed using a limited version of the overdispersion model developed earlier. In that limit, the relationships between three overdispersed distributions were derived and discussed: beta-Poisson, negative binomial, and overdispersed Gaussian. Out of a total of 13 cases studied (12 measured plus one from the literature), 4 were consistent with the Poisson statistics at 90% confidence level while the remaining 9 were found overdispersed. This shows that the overdispersion is rather prevalent in radioassay. All three overdispersed distributions fitted the data very well. The overdispersion was attributed mostly to the excess fluctuations of the detection systems or, in 2 cases, sequential radioactive decay.

Journal ArticleDOI
TL;DR: In most instances, the β–binomial captures the observed heterogeneity in the incidence of seed-borne fungi, and there was greater variability in seed infection than expected for a binomial (i.e., random) distribution.
Abstract: A theoretical probability distribution conveys more information than the mean in summarizing data. We investigated the ability of two discrete probability distributions, binomial and β–binomial, to describe the incidence (proportion) of seed-borne fungi among seed lots. The fit of the distributions to 185 data sets was assessed by either χ2 analysis or a dithered Kolmogorov-Smirnov goodness-of-fit test. The data sets represented a range of fungi, crops, and geographic regions. The binomial distribution was an adequate fit to only 36% of the data sets, whereas the β–binomial distribution adequately fit 85% of the data sets (P > 0.05). The β–binomial was a better fit than the binomial in 72% of data sets (P < 0.01) based on the likelihood-ratio test, indicating that there was greater variability in seed infection than expected for a binomial (i.e., random) distribution. For a subset of 25 data sets on wheat-seed infection by Fusarium graminearum Schwabe, a binary power law analysis indicated that heterogene...
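
The extra variability that the abstract above attributes to the β–binomial can be illustrated by comparing its variance with that of a binomial distribution having the same mean. This is a minimal sketch under assumed parameter values; the function names and the choice n = 20, a = 2, b = 6 are hypothetical and unrelated to the 185 data sets analysed in the paper.

```python
import math


def log_beta(a: float, b: float) -> float:
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_pmf(k: int, n: int, a: float, b: float) -> float:
    """P(K = k) when K | p ~ Binomial(n, p) and p ~ Beta(a, b)."""
    log_choose = (math.lgamma(n + 1) - math.lgamma(k + 1)
                  - math.lgamma(n - k + 1))
    return math.exp(log_choose + log_beta(k + a, n - k + b) - log_beta(a, b))


# Illustrative values only (hypothetical).
n, a, b = 20, 2.0, 6.0
pmf = [beta_binomial_pmf(k, n, a, b) for k in range(n + 1)]
bb_mean = sum(k * p for k, p in enumerate(pmf))
bb_var = sum((k - bb_mean) ** 2 * p for k, p in enumerate(pmf))

p_bar = a / (a + b)                       # mean incidence
binomial_var = n * p_bar * (1.0 - p_bar)  # variance if incidence were constant
```

Letting the incidence p vary between sampling units according to a beta distribution inflates the variance of the count by the factor (a + b + n) / (a + b + 1) relative to the binomial case.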

Journal ArticleDOI
TL;DR: In this paper, an improvement of the known Major result concerning the accuracy of the Poisson approximation for the binomial distribution is obtained as an application for a class of functionals on the real line.
Abstract: This work is devoted to the Monge-Kantorovich problem for a class of functionals on the real line. An improvement of the known Major result concerning the accuracy of the Poisson approximation for the binomial distribution is obtained as an application.

DOI
01 Jan 2002
TL;DR: In this article, the authors propose easily interpretable discrepancy measures that quantify overdispersion effects when comparing a negative binomial regression to a Poisson regression, which can lead to a validation of the Poisson regression or a discrimination of the Poisson regression with respect to the negative binomial regression.
Abstract: The Poisson regression model is often used as a first model for count data with covariates. Since this model is a GLM with canonical link, regression parameters can be easily fitted using standard software. However, the model requires equidispersion, which might not be valid for the data set under consideration. There have been many models proposed in the literature to allow for overdispersion. One such model is the negative binomial regression model. In addition, score tests have been commonly used to detect overdispersion in the data. However, these tests do not allow one to quantify the effects of overdispersion. In this paper we propose easily interpretable discrepancy measures that quantify the overdispersion effects when comparing a negative binomial regression to Poisson regression. We propose asymptotic $\alpha$-level tests for testing the size of overdispersion effects in terms of the developed discrepancy measures. A graphical display of p-value curves can then be used to allow for an exact quantification of the overdispersion effects. This can lead to a validation of the Poisson regression or a discrimination of the Poisson regression with respect to the negative binomial regression. The proposed asymptotic tests are investigated in small samples using simulation and applied to two examples.

Journal ArticleDOI
TL;DR: In this paper, a characterization of the trivariate binomial distribution is established based on the distribution of the sum of two trivariate random vectors, and the results are shown to extend to the multidimensional case in a natural way.
Abstract: By considering a trivariate binomial distribution, the regression equations are obtained and a set of necessary and sufficient conditions are given for the regression to be linear. Under the limiting conditions, the expressions approach those of trivariate Poisson distribution derived by Mahamunulu (1967). A characterization of the trivariate binomial distribution is also established based on the distribution of the sum of two trivariate random vectors. The results are shown to extend to the multidimensional case in a natural way.