
Showing papers on "Poisson distribution published in 2006"


Proceedings ArticleDOI
26 Jun 2006
TL;DR: Surface reconstruction from oriented points is cast as a spatial Poisson problem whose solution reduces to a well-conditioned sparse linear system, solved by a spatially adaptive multiscale algorithm with time and space complexities proportional to the size of the reconstructed model.
Abstract: We show that surface reconstruction from oriented points can be cast as a spatial Poisson problem. This Poisson formulation considers all the points at once, without resorting to heuristic spatial partitioning or blending, and is therefore highly resilient to data noise. Unlike radial basis function schemes, our Poisson approach allows a hierarchy of locally supported basis functions, and therefore the solution reduces to a well conditioned sparse linear system. We describe a spatially adaptive multiscale algorithm whose time and space complexities are proportional to the size of the reconstructed model. Experimenting with publicly available scan data, we demonstrate reconstruction of surfaces with greater detail than previously achievable.
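The core of the formulation can be written compactly. The reconstruction looks for an indicator function χ (1 inside the model, 0 outside) whose gradient matches the vector field V obtained by smoothing the oriented sample normals; the notation below is the conventional presentation of this idea rather than text from the paper:

$$ \min_{\chi} \left\| \nabla \chi - \vec{V} \right\|^{2} \;\;\Longrightarrow\;\; \Delta \chi \equiv \nabla \cdot \nabla \chi = \nabla \cdot \vec{V} $$

Discretizing this Poisson equation over a hierarchy of locally supported basis functions is what yields the well-conditioned sparse linear system mentioned in the abstract.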

2,712 citations


Book
01 Dec 2006
TL;DR: Crash courses on regular variation and weak convergence are presented, with a focus on the Poisson process and its role in the statistical analysis of heavy-tailed data.
Abstract: Crash Courses.- Crash Course I: Regular Variation.- Crash Course II: Weak Convergence Implications for Heavy-Tail Analysis.- Statistics.- Dipping a Toe in the Statistical Water.- Probability.- The Poisson Process.- Multivariate Regular Variation and the Poisson Transform.- Weak Convergence and the Poisson Process.- Applied Probability Models and Heavy Tails.- More Statistics.- Additional Statistics Topics.- Appendices.- Notation and Conventions.- Software.

1,082 citations


Journal ArticleDOI
TL;DR: An integer-valued analogue of the classical GARCH model with Poisson deviates is proposed, a condition for the existence of such a process is given, and the model is applied to a real time series with a numerical example.
Abstract: An integer-valued analogue of the classical GARCH$(p,q)$ model with Poisson deviates is proposed and a condition for the existence of such a process is given. For the case $p=1$, $q=1$, it is explicitly shown that an integer-valued GARCH process is a standard ARMA$(1, 1)$ process. The problem of maximum likelihood estimation of parameters is treated. An application of the model to a real time series with a numerical example is given.
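For readers unfamiliar with the model, the INGARCH(1,1) recursion is conventionally written as follows; the symbols (δ, α, β) are the standard ones and are assumed here rather than quoted from the paper:

$$ X_t \mid \mathcal{F}_{t-1} \sim \mathrm{Poisson}(\lambda_t), \qquad \lambda_t = \delta + \alpha \lambda_{t-1} + \beta X_{t-1}, \quad \delta > 0,\; \alpha, \beta \ge 0 $$

The existence condition referred to in the abstract is of the form α + β < 1, under which the count process is stationary and shares the second-order (ARMA(1,1)-type) autocorrelation structure noted above.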

432 citations


Journal ArticleDOI
TL;DR: In this article, five regression models were used to assess the relationship between the abundance of a vulnerable plant species, Leionema ralstonii, and the environment and their predictive performance was evaluated with correlation, calibration and error statistics calculated within a bootstrap evaluation procedure that simulated performance for independent data.

354 citations


Journal ArticleDOI
TL;DR: In this article, a series of Poisson-gamma distributions were simulated using different values of the mean, the dispersion parameter, and the sample size, and Poisson-gamma models were also fitted to crash data collected in Toronto, Ont., characterized by a low sample mean and small sample size.

318 citations


01 Jan 2006
TL;DR: The study shows that a low sample mean combined with a small sample size can seriously affect the estimation of the dispersion parameter, no matter which estimator is used within the estimation process.
Abstract: Considerable research has been conducted on the development of statistical models for predicting crashes on highway facilities. Despite numerous advancements made for improving the estimation tools of statistical models, the most common probabilistic structure used for modeling motor vehicle crashes remains the traditional Poisson and Poisson-gamma (or Negative Binomial) distribution. When crash data exhibit over-dispersion, the Poisson-gamma model is usually the model of choice most favored by transportation safety modelers. Crash data collected for safety studies often have the unusual attributes of being characterized by low sample mean values. Studies have shown that the goodness-of-fit of statistical models produced from such datasets can be significantly affected. This issue has been defined as the “low mean problem” (LMP). Despite recent developments on methods to circumvent the LMP and test the goodness-of-fit of models developed using such datasets, no work has so far examined how the LMP affects the fixed dispersion parameter of Poisson-gamma models used for modeling motor vehicle crashes. The dispersion parameter plays an important role in many types of safety studies. The primary objective of this research project was to verify whether the LMP affects the estimation of the dispersion parameter and, if so, to determine the magnitude of the problem. The secondary objective consisted of determining the effects of a mis-specified dispersion parameter on common analyses performed in highway safety studies. To accomplish the objectives of the study, a series of Poisson-gamma distributions were simulated using different values describing the mean, the dispersion parameter, and the sample size. Three estimators commonly used for estimating the dispersion parameter of Poisson-gamma models of motor vehicle crashes were evaluated: the method of moments (MM), the weighted regression (WR) and the Maximum Likelihood method (ML). To complement the outcome of the simulation study, Poisson-gamma models were fitted to crash data collected in Toronto, Ont., characterized by a low sample mean and small sample size. The study shows that a low sample mean combined with a small sample size can seriously affect the estimation of the dispersion parameter, no matter which estimator is used within the estimation process. The probability that the dispersion parameter is mis-specified increases significantly as the sample mean and sample size decrease. Consequently, the results show that a mis-specified dispersion parameter can significantly undermine empirical Bayes (EB) estimates as well as the estimation of confidence intervals for the gamma mean and predicted response. The paper ends with recommendations about minimizing the likelihood of producing Poisson-gamma models with a mis-specified dispersion parameter for modeling motor vehicle crashes.
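As an illustration of the kind of simulation the abstract describes, the short sketch below draws Poisson-gamma (negative binomial) counts with a low mean and estimates the dispersion parameter by the method of moments for increasing sample sizes; it is only a minimal stand-in for the study's design (the weighted-regression and maximum-likelihood estimators are not shown), and all variable names and parameter values are illustrative assumptions:

import numpy as np

def simulate_poisson_gamma(mu, alpha, n, rng):
    # NB2 parameterization: E[Y] = mu, Var[Y] = mu + alpha * mu**2
    lam = rng.gamma(shape=1.0 / alpha, scale=alpha * mu, size=n)
    return rng.poisson(lam)

rng = np.random.default_rng(42)
mu_true, alpha_true = 0.5, 1.0          # low sample mean, "true" dispersion
for n in (50, 100, 1000):               # small to large sample sizes
    y = simulate_poisson_gamma(mu_true, alpha_true, n, rng)
    ybar, s2 = y.mean(), y.var(ddof=1)
    # method of moments: solve s2 = ybar + alpha * ybar**2 for alpha
    alpha_mm = max((s2 - ybar) / ybar**2, 0.0)
    print(n, round(ybar, 3), round(alpha_mm, 3))

Running the loop illustrates (without reproducing the study's numbers) how unstable the moment estimate becomes when both the mean and the sample size are small.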

313 citations


Proceedings ArticleDOI
20 Aug 2006
TL;DR: The experimental results indicate that the proposed time-varying Poisson model provides a robust and accurate framework for adaptively and autonomously learning how to separate unusual bursty events from traces of normal human activity.
Abstract: Time-series of count data are generated in many different contexts, such as web access logging, freeway traffic monitoring, and security logs associated with buildings. Since this data measures the aggregated behavior of individual human beings, it typically exhibits a periodicity in time on a number of scales (daily, weekly, etc.) that reflects the rhythms of the underlying human activity and makes the data appear non-homogeneous. At the same time, the data is often corrupted by a number of bursty periods of unusual behavior such as building events, traffic accidents, and so forth. The data mining problem of finding and extracting these anomalous events is made difficult by both of these elements. In this paper we describe a framework for unsupervised learning in this context, based on a time-varying Poisson process model that can also account for anomalous events. We show how the parameters of this model can be learned from count time series using statistical estimation techniques. We demonstrate the utility of this model on two datasets for which we have partial ground truth in the form of known events, one from freeway traffic data and another from building access data, and show that the model performs significantly better than a non-probabilistic, threshold-based technique. We also describe how the model can be used to investigate different degrees of periodicity in the data, including systematic day-of-week and time-of-day effects, and make inferences about the detected events (e.g., popularity or level of attendance). Our experimental results indicate that the proposed time-varying Poisson model provides a robust and accurate framework for adaptively and autonomously learning how to separate unusual bursty events from traces of normal human activity.
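A heavily simplified version of the underlying idea can be sketched as follows: estimate a periodic (weekday-by-hour) Poisson rate from historical counts and flag time slots whose observed count has a negligible Poisson tail probability. This is only a thresholding caricature of the paper's full probabilistic model, which treats anomalous events as latent variables; the synthetic data, array shapes, and threshold are assumptions:

import numpy as np
from scipy.stats import poisson

# counts[d, h]: observed count for day index d and hour-of-day h over several weeks
rng = np.random.default_rng(0)
weeks, hours = 8, 24
base = 5 + 10 * np.abs(np.sin(np.arange(hours) * np.pi / 24))   # daily rhythm
counts = rng.poisson(base, size=(weeks * 7, hours)).astype(float)
counts[20, 9] += 40                                              # injected bursty event

day_of_week = np.arange(weeks * 7) % 7
lam = np.empty_like(counts)
for d in range(7):                       # periodic rate: mean count per (weekday, hour) cell
    lam[day_of_week == d] = counts[day_of_week == d].mean(axis=0)

p_upper = poisson.sf(counts - 1, lam)    # P(N >= observed) under the periodic Poisson rate
anomalies = np.argwhere(p_upper < 1e-4)
print(anomalies)                         # should include the injected slot (20, 9)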

262 citations


Journal ArticleDOI
TL;DR: Several modeling strategies for vaccine adverse event count data in which the data are characterized by excess zeroes and heteroskedasticity are compared, illustrating that the ZINB and NBH models are preferred but these models are indistinguishable with respect to fit.
Abstract: We compared several modeling strategies for vaccine adverse event count data in which the data are characterized by excess zeroes and heteroskedasticity. Count data are routinely modeled using Poisson and Negative Binomial (NB) regression but zero-inflated and hurdle models may be advantageous in this setting. Here we compared the fit of the Poisson, Negative Binomial (NB), zero-inflated Poisson (ZIP), zero-inflated Negative Binomial (ZINB), Poisson Hurdle (PH), and Negative Binomial Hurdle (NBH) models. In general, for public health studies, we may conceptualize zero-inflated models as allowing zeroes to arise from at-risk and not-at-risk populations. In contrast, hurdle models may be conceptualized as having zeroes only from an at-risk population. Our results illustrate, for our data, that the ZINB and NBH models are preferred but these models are indistinguishable with respect to fit. Choosing between the zero-inflated and hurdle modeling framework, assuming Poisson and NB models are inadequate because of excess zeroes, should generally be based on the study design and purpose. If the study's purpose is inference, then the modeling framework should be chosen to match the study design. For example, if the study design leads to count endpoints with both structural and sample zeroes, then generally the zero-inflated modeling framework is more appropriate, while in contrast, if the endpoint of interest, by design, only exhibits sample zeroes (e.g., at-risk participants), then the hurdle model framework is generally preferred. Conversely, if the study's primary purpose is to develop a prediction model, then both the zero-inflated and hurdle modeling frameworks should be adequate.
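The structural distinction the authors draw can be made explicit with the two probability mass functions; π denotes the extra-zero (or hurdle) probability and λ the Poisson mean, notation assumed here rather than taken from the article. For the zero-inflated Poisson, zeroes come from both the not-at-risk and the at-risk population:

$$ P(Y=0) = \pi + (1-\pi)e^{-\lambda}, \qquad P(Y=k) = (1-\pi)\frac{e^{-\lambda}\lambda^{k}}{k!}, \quad k \ge 1 $$

For the Poisson hurdle model, all zeroes come from a single binary process and the positive counts follow a zero-truncated Poisson:

$$ P(Y=0) = \pi_0, \qquad P(Y=k) = (1-\pi_0)\,\frac{e^{-\lambda}\lambda^{k}/k!}{1-e^{-\lambda}}, \quad k \ge 1 $$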

212 citations


Journal ArticleDOI
TL;DR: A class of multi-level ZIP regression model with random effects is presented to account for the preponderance of zero counts and the inherent correlation of observations and application to the analysis of correlated count data from a longitudinal infant feeding study illustrates the usefulness of the approach.
Abstract: Count data with excess zeros relative to a Poisson distribution are common in many biomedical applications. A popular approach to the analysis of such data is to use a zero-inflated Poisson (ZIP) regression model. Often, because of the hierarchical study design or the data collection procedure, zero-inflation and lack of independence may occur simultaneously, which render the standard ZIP model inadequate. To account for the preponderance of zero counts and the inherent correlation of observations, a class of multi-level ZIP regression model with random effects is presented. Model fitting is facilitated using an expectation-maximization algorithm, whereas variance components are estimated via residual maximum likelihood estimating equations. A score test for zero-inflation is also presented. The multi-level ZIP model is then generalized to cope with a more complex correlation structure. Application to the analysis of correlated count data from a longitudinal infant feeding study illustrates the usefulness of the approach.

200 citations


Journal ArticleDOI
TL;DR: In this article, the viscoelastic Poisson ratio has a different time dependence depending on the test modality chosen; interrelations are developed between Poisson's ratios in creep and relaxation.
Abstract: Poisson's ratio in viscoelastic solids is in general a time dependent (in the time domain) or a complex frequency dependent quantity (in the frequency domain). We show that the viscoelastic Poisson's ratio has a different time dependence depending on the test modality chosen; interrelations are developed between Poisson's ratios in creep and relaxation. The difference, for a moderate degree of viscoelasticity, is minor. Correspondence principles are derived for the Poisson's ratio in transient and dynamic contexts. The viscoelastic Poisson's ratio need not increase with time, and it need not be monotonic with time. Examples are given of material microstructures which give rise to designed time dependent Poisson's ratios.

182 citations


Journal ArticleDOI
TL;DR: In this paper, a flexible class of zero-inflated models, such as the zero-inflated Poisson (ZIP) model, is introduced as an alternative to traditional maximum likelihood-based methods to analyze defect counts.

Journal ArticleDOI
TL;DR: In this paper, a multivariate Poisson specification that simultaneously models injuries by severity is presented, and parameter estimation is performed within the Bayesian paradigm with a Gibbs sampler for crashes on Washington State highways.
Abstract: In practice, crash and injury counts are modeled by using a single equation or a series of independently specified equations, which may neglect shared information in unobserved error terms, reduce efficiency in parameter estimates, and lead to biases in sample databases. This paper offers a multivariate Poisson specification that simultaneously models injuries by severity. Parameter estimation is performed within the Bayesian paradigm with a Gibbs sampler for crashes on Washington State highways. Parameter estimates and goodness-of-fit measures are compared with a series of independent Poisson equations, and a cost-benefit analysis of a 10-mph speed limit change is provided as an example application.
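A common way to write such a specification is the common-shock (shared-component) construction, shown here for two severity levels; whether the paper uses exactly this parameterization is an assumption, the notation being generic:

$$ Y_1 = Z_0 + Z_1, \quad Y_2 = Z_0 + Z_2, \qquad Z_i \sim \mathrm{Poisson}(\theta_i) \ \text{independent} $$

Each margin is then Poisson (e.g., $Y_1 \sim \mathrm{Poisson}(\theta_0 + \theta_1)$) and the shared term induces $\mathrm{Cov}(Y_1, Y_2) = \theta_0 \ge 0$. A Gibbs sampler for such a model typically alternates between imputing the latent $Z$ counts and updating the regression parameters that enter the $\theta$'s.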

Journal ArticleDOI
TL;DR: In this article, it was shown that in the formal neighborhood of a closed point in some stratum, the singularity is a product of the stratum and a transversal slice and the product decomposition is compatible with natural Poisson structures.
Abstract: We consider symplectic singularities in the sense of A. Beauville as examples of Poisson schemes. Using Poisson methods, we prove that a symplectic singularity admits a finite stratification with smooth symplectic strata. We also prove that in the formal neighborhood of a closed point in some stratum, the singularity is a product of the stratum and a transversal slice. The transversal slice is also a symplectic singularity, and the product decomposition is compatible with natural Poisson structures. Moreover, we prove that the transversal slice admits a $C^*$-action dilating the symplectic form.

Journal ArticleDOI
TL;DR: Various models for time series of counts which can account for discreteness, overdispersion and serial correlation are compared, including observation- and parameter-driven models based upon corresponding conditional Poisson distributions.

Journal ArticleDOI
TL;DR: A generalization of Poisson kriging is presented whereby the size and shape of administrative units, as well as the population density, is incorporated into the filtering of noisy mortality rates and the creation of isopleth risk maps to facilitate the analysis of relationships between health data and putative covariates that are typically measured over different spatial supports.
Abstract: Geostatistical techniques that account for spatially varying population sizes and spatial patterns in the filtering of choropleth maps of cancer mortality were recently developed. Their implementation was facilitated by the initial assumption that all geographical units are the same size and shape, which allowed the use of geographic centroids in semivariogram estimation and kriging. Another implicit assumption was that the population at risk is uniformly distributed within each unit. This paper presents a generalization of Poisson kriging whereby the size and shape of administrative units, as well as the population density, is incorporated into the filtering of noisy mortality rates and the creation of isopleth risk maps. An innovative procedure to infer the point-support semivariogram of the risk from aggregated rates (i.e. areal data) is also proposed. The novel methodology is applied to age-adjusted lung and cervix cancer mortality rates recorded for white females in two contrasted county geographies: 1) state of Indiana that consists of 92 counties of fairly similar size and shape, and 2) four states in the Western US (Arizona, California, Nevada and Utah) forming a set of 118 counties that are vastly different geographical units. Area-to-point (ATP) Poisson kriging produces risk surfaces that are less smooth than the maps created by a naive point kriging of empirical Bayesian smoothed rates. The coherence constraint of ATP kriging also ensures that the population-weighted average of risk estimates within each geographical unit equals the areal data for this unit. Simulation studies showed that the new approach yields more accurate predictions and confidence intervals than point kriging of areal data where all counties are simply collapsed into their respective polygon centroids. Its benefit over point kriging increases as the county geography becomes more heterogeneous. A major limitation of choropleth maps is the common biased visual perception that larger rural and sparsely populated areas are of greater importance. The approach presented in this paper allows the continuous mapping of mortality risk, while accounting locally for population density and areal data through the coherence constraint. This form of Poisson kriging will facilitate the analysis of relationships between health data and putative covariates that are typically measured over different spatial supports.

Journal ArticleDOI
TL;DR: In this article, a geostatistical model with the Poisson distribution was used to model both spatial variation and the discrete observation process to obtain accurate maps of the relative abundance of fin whales.

Journal ArticleDOI
TL;DR: In this article, a Gibbsian transition kernel is proposed for auxiliary mixture sampling of time series of counts, where the observations are assumed to arise from a Poisson distribution with a mean changing over time according to a latent process.
Abstract: We consider parameter-driven models of time series of counts, where the observations are assumed to arise from a Poisson distribution with a mean changing over time according to a latent process. Estimation of these models is carried out within a Bayesian framework using data augmentation and Markov chain Monte Carlo methods. We suggest a new auxiliary mixture sampler, which possesses a Gibbsian transition kernel, where we draw from full conditional distributions belonging to standard distribution families only. Emphasis lies on application to state space modelling of time series of counts, but we show that auxiliary mixture sampling may be applied to a wider range of parameter-driven models, including random-effects models and panel data models based on the Poisson distribution.
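The parameter-driven state space form referred to here is usually written as below; the notation is generic and assumed, not quoted from the article:

$$ y_t \mid \lambda_t \sim \mathrm{Poisson}(\lambda_t), \qquad \log \lambda_t = x_t^{\top}\beta + z_t, \qquad z_t = \phi z_{t-1} + \varepsilon_t, \quad \varepsilon_t \sim N(0, \sigma^2) $$

Roughly speaking, auxiliary mixture sampling augments the data with the inter-arrival times of the latent Poisson process and approximates the distribution of their (negative) logarithms by a finite mixture of normal distributions, so that conditional on the mixture indicators the model becomes linear and Gaussian and every full conditional in the Gibbs sweep has a standard form.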

Journal ArticleDOI
TL;DR: In this paper, the authors characterize all two-parameter count distributions (satisfying very general conditions) that are partially closed under addition and find those for which the maximum likelihood estimator of the population mean is the sample mean.
Abstract: In this article we characterize all two-parameter count distributions (satisfying very general conditions) that are partially closed under addition. We also find those for which the maximum likelihood estimator of the population mean is the sample mean. Mixed Poisson models satisfying these properties are completely determined. Among these models are the negative binomial, Poisson-inverse Gaussian, and other known distributions. New count distributions can also be constructed using these characterizations. Three examples of application are given.

Journal ArticleDOI
TL;DR: Using the noise scale factor to estimate random errors in lidar measurements due to shot noise provides a significant advantage over the conventional error estimation techniques, in that with the NSF, uncertainties can be reliably calculated from or for a single data sample.
Abstract: We discuss the estimation of random errors due to shot noise in backscatter lidar observations that use either photomultiplier tube (PMT) or avalanche photodiode (APD) detectors. The statistical characteristics of photodetection are reviewed, and photon count distributions of solar background signals and laser backscatter signals are examined using airborne lidar observations at 532 nm with a photon-counting mode APD. Both distributions appear to be Poisson, indicating that the arrival at the photodetector of photons for these signals is a Poisson stochastic process. For Poisson-distributed signals, a proportional, one-to-one relationship is known to exist between the mean of a distribution and its variance. Although the multiplied photocurrent no longer follows a strict Poisson distribution in analog-mode APD and PMT detectors, the proportionality still exists between the mean and the variance of the multiplied photocurrent. We make use of this relationship by introducing the noise scale factor (NSF), which quantifies the constant of proportionality that exists between the root mean square of the random noise in a measurement and the square root of the mean signal. Using the NSF to estimate random errors in lidar measurements due to shot noise provides a significant advantage over the conventional error estimation techniques, in that with the NSF, uncertainties can be reliably calculated from or for a single data sample. Methods for evaluating the NSF are presented. Algorithms to compute the NSF are developed for the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations lidar and tested using data from the Lidar In-space Technology Experiment.
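The proportionality described above lends itself to a small numerical illustration: estimate the noise scale factor from the mean-variance relationship of a (simulated) multiplied background signal, then convert a single measured sample into a shot-noise uncertainty. The gain value, units, and variable names are assumptions for the sketch, not values from the paper:

import numpy as np

rng = np.random.default_rng(1)
gain = 30.0                                          # assumed analog multiplication factor
background = gain * rng.poisson(200.0, size=5000)    # multiplied photocurrent samples

# NSF: constant of proportionality between RMS noise and sqrt(mean signal)
nsf = background.std(ddof=1) / np.sqrt(background.mean())
print("estimated NSF:", round(nsf, 2))               # ~ sqrt(gain) for this toy model

single_sample = 9000.0                               # one lidar range-bin measurement (same units)
shot_noise_sigma = nsf * np.sqrt(single_sample)
print("shot-noise uncertainty:", round(shot_noise_sigma, 1))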

Journal ArticleDOI
TL;DR: In this paper, a zero-inflated Poisson regression model is developed, under which the species range is determined by a spatial probit model, including physical variables as covariates.
Abstract: Ecological counts data are often characterized by an excess of zeros and spatial dependence. Excess zeros can occur in regions outside the range of the distribution of a given species. A zero-inflated Poisson regression model is developed, under which the species range is determined by a spatial probit model, including physical variables as covariates. Within that range, species counts are independently drawn from a Poisson distribution whose mean depends on biotic variables. Bayesian inference for this model is illustrated using data on oak seedling counts.

Jeong Han Kim
01 Jan 2006
TL;DR: In this paper, the authors introduced the Poisson cloning model GPC(n, p) for random graphs in which the degrees are i.i.d. Poisson random variables with mean λ := p(n − 1).
Abstract: In the random graph G(n, p) with pn bounded, the degrees of the vertices are almost i.i.d. Poisson random variables with mean λ := p(n − 1). Motivated by this fact, we introduce the Poisson cloning model GPC(n, p) for random graphs in which the degrees are i.i.d. Poisson random variables with mean λ. We first establish a theorem that shows that the new model is equivalent to the classical model G(n, p) in an asymptotic sense. Next, we introduce a useful algorithm to generate the random graph GPC(n, p), called the cut-off line algorithm. Then GPC(n, p) equipped with the cut-off line algorithm enables us to very precisely analyze the sizes of the largest component and the t-core of G(n, p). This new approach for the problems yields not only elegant proofs but also improved bounds that are essentially best possible. We also consider the Poisson cloning model for random hypergraphs and the t-core problem for random hypergraphs.
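To make the construction concrete, the sketch below builds a multigraph in the spirit of the Poisson cloning model: every vertex receives an i.i.d. Poisson(λ) number of clones (half-edges) with λ = p(n − 1), and the clones are then matched uniformly at random. This is a plain configuration-style matching written for illustration; it is not Kim's cut-off line algorithm, and the parity fix-up is an ad hoc assumption:

import numpy as np

def poisson_cloning_graph(n, p, rng):
    lam = p * (n - 1)
    clones = rng.poisson(lam, size=n)            # i.i.d. Poisson degrees
    if clones.sum() % 2:                          # need an even number of clones to pair them
        clones[rng.integers(n)] += 1
    stubs = np.repeat(np.arange(n), clones)       # one entry per clone, labelled by its vertex
    rng.shuffle(stubs)
    return list(zip(stubs[0::2], stubs[1::2]))    # uniform random perfect matching of clones

rng = np.random.default_rng(7)
edges = poisson_cloning_graph(n=1000, p=2.0 / 1000, rng=rng)  # mean degree ~ 2
print(len(edges), edges[:5])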

Journal ArticleDOI
TL;DR: In this article, the maximum and minimum values of Poisson's ratio ν for materials with cubic symmetry were investigated for cubic crystal data, and it was shown that large values of |ν| occur in directions at which the Young modulus is approximately equal to one half of its ⟨111⟩ value.
Abstract: Expressions are given for the maximum and minimum values of Poisson's ratio ν for materials with cubic symmetry. Values less than -1 occur if and only if the maximum shear modulus is associated with the cube axis and is at least 25 times the value of the minimum shear modulus. Large values of |ν| occur in directions at which the Young modulus is approximately equal to one half of its ⟨111⟩ value. Such directions, by their nature, are very close to ⟨111⟩. Application to data for cubic crystals indicates that certain Indium Thallium alloys simultaneously exhibit Poisson's ratio less than -1 and greater than +2.

Journal ArticleDOI
TL;DR: This paper assumes that the correlated paired count data follow a bivariate Poisson distribution in order to derive the distribution of their difference, and removes correlation, which naturally exists in paired data, and improves the quality of the inference by using exact distributions instead of normal approximations.
Abstract: Paired count data usually arise in medicine when before and after treatment measurements are considered. In the present paper we assume that the correlated paired count data follow a bivariate Poisson distribution in order to derive the distribution of their difference. The derived distribution is shown to be the same as the one derived for the difference of the independent Poisson variables, thus recasting interest on the distribution introduced by Skellam. Using this distribution we remove correlation, which naturally exists in paired data, and we improve the quality of our inference by using exact distributions instead of normal approximations. The zero-inflated version is considered to account for an excess of zero counts. Bayesian estimation and hypothesis testing for the models considered are discussed. An example from dental epidemiology is used to illustrate the proposed methodology.
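For reference, the distribution of the difference has a closed form. For two independent counts $Y_1 \sim \mathrm{Poisson}(\lambda_1)$ and $Y_2 \sim \mathrm{Poisson}(\lambda_2)$, the difference $D = Y_1 - Y_2$ follows the Skellam distribution

$$ P(D = k) = e^{-(\lambda_1+\lambda_2)} \left(\frac{\lambda_1}{\lambda_2}\right)^{k/2} I_{|k|}\!\left(2\sqrt{\lambda_1\lambda_2}\right), \qquad k \in \mathbb{Z} $$

where $I_{|k|}$ is the modified Bessel function of the first kind (standard notation, assumed here). As the abstract notes, the difference of correlated bivariate Poisson counts has the same form, because the common correlation-inducing component cancels in the difference.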

Journal ArticleDOI
TL;DR: It is demonstrated that suicide rate fluctuations as large as 20-40% in any year may be attributed to random error, and a simple methodology for the determination of statistically derived thresholds for detecting significant rate changes is developed.
Abstract: The objectives of this study were to generate precise estimates of suicide rates in the military while controlling for factors contributing to rate variability such as demographic differences and classification bias, and to develop a simple methodology for the determination of statistically derived thresholds for detecting significant rate changes. Suicide rate estimates were calculated for the military population and each service branch over 11 years, directly standardized to the 2000 U.S. population. Military rates were highly comparable across branches and were approximately 20% lower than the civilian rate. Direct adjustment essentially controlled for the demographic confounds in this sample. Applying the Poisson-based method, we demonstrate that suicide rate fluctuations as large as 20-40% in any year may be attributed to random error.
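A generic version of such a Poisson-based threshold can be computed directly: given a baseline rate and a population at risk, Poisson quantiles for the expected annual count translate into the range of observed rates attributable to random error alone. The rate, population size, and confidence level below are illustrative assumptions, not figures from the study:

from scipy.stats import poisson

baseline_rate = 11.0 / 100_000      # assumed standardized suicide rate per person-year
population = 1_400_000              # assumed population at risk
expected = baseline_rate * population

# approximate 95% interval for the annual count under the baseline rate
lo, hi = poisson.ppf(0.025, expected), poisson.ppf(0.975, expected)
print(f"expected {expected:.0f} deaths; counts in [{lo:.0f}, {hi:.0f}] are consistent with chance")
print(f"i.e. rate swings of roughly -{1 - lo / expected:.0%} to +{hi / expected - 1:.0%}")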

Journal ArticleDOI
TL;DR: In this paper, a generalized risk model driven by a nondecreasing Levy process is presented. Unlike the classical case, which models the individual claim size distribution and obtains from it the aggregate claims distribution, here the aggregate claims distribution is known in closed form: it is simply the one-dimensional distribution of a subordinator.
Abstract: Dufresne et al. (1991) introduced a general risk model defined as the limit of compound Poisson processes. Such a model is either a compound Poisson process itself or a process with an infinite number of small jumps. Later, in a series of now classical papers, the joint distribution of the time of ruin, the surplus before ruin, and the deficit at ruin was studied (Gerber and Shiu 1997, 1998a, 1998b; Gerber and Landry 1998). These works use the classical and the perturbed risk models and hint that the results can be extended to gamma and inverse Gaussian risk processes. In this paper we work out this extension to a generalized risk model driven by a nondecreasing Levy process. Unlike the classical case that models the individual claim size distribution and obtains from it the aggregate claims distribution, here the aggregate claims distribution is known in closed form. It is simply the one-dimensional distribution of a subordinator. Embedded in this wide family of risk models we find the gamma, in...

Journal ArticleDOI
TL;DR: Excess zeros and variance heterogeneity are common data phenomena in insect counts, and if not properly modelled, these properties can invalidate normal distribution assumptions, resulting in biased estimation of ecological effects and jeopardizing the integrity of the scientific inferences.
Abstract: Researchers and regulatory agencies often make statistical inferences from insect count data using modelling approaches that assume homogeneous variance. Such models do not allow for formal appraisal of variability which in its different forms is the subject of interest in ecology. Therefore, the objectives of this paper were to (i) compare models suitable for handling variance heterogeneity and (ii) select optimal models to ensure valid statistical inferences from insect count data. The log-normal, standard Poisson, Poisson corrected for overdispersion, zero-inflated Poisson, the negative binomial distribution and zero-inflated negative binomial models were compared using six count datasets on foliage-dwelling insects and five families of soil-dwelling insects. Akaike's and Schwarz Bayesian information criteria were used for comparing the various models. Over 50% of the counts were zeros even in locally abundant species such as Ootheca bennigseni Weise, Mesoplatys ochroptera Stal and Diaecoderus spp. The Poisson model after correction for overdispersion and the standard negative binomial distribution model provided better description of the probability distribution of seven out of the 11 insects than the log-normal, standard Poisson, zero-inflated Poisson or zero-inflated negative binomial models. It is concluded that excess zeros and variance heterogeneity are common data phenomena in insect counts. If not properly modelled, these properties can invalidate the normal distribution assumptions resulting in biased estimation of ecological effects and jeopardizing the integrity of the scientific inferences. Therefore, it is recommended that statistical models appropriate for handling these data properties be selected using objective criteria to ensure efficient statistical inference.
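An information-criterion comparison of the kind described can be run with standard libraries; the sketch below fits Poisson, negative binomial, zero-inflated Poisson, and zero-inflated negative binomial models to simulated zero-heavy counts and ranks them by AIC. It assumes a recent statsmodels release that ships the zero-inflated count models, and the simulated data-generating values are arbitrary:

import numpy as np
import statsmodels.api as sm
from statsmodels.discrete.discrete_model import Poisson, NegativeBinomial
from statsmodels.discrete.count_model import (ZeroInflatedPoisson,
                                              ZeroInflatedNegativeBinomialP)

rng = np.random.default_rng(3)
n = 500
x = sm.add_constant(rng.normal(size=n))                   # intercept + one covariate
mu = np.exp(0.2 + 0.5 * x[:, 1])
y = rng.poisson(mu * rng.gamma(0.8, 1 / 0.8, size=n))     # overdispersed counts
y[rng.random(n) < 0.4] = 0                                 # add structural zeros

const = np.ones((n, 1))                                    # intercept-only inflation part
fits = {
    "Poisson": Poisson(y, x).fit(disp=0),
    "NegBin": NegativeBinomial(y, x).fit(disp=0),
    "ZIP": ZeroInflatedPoisson(y, x, exog_infl=const).fit(disp=0, maxiter=500),
    "ZINB": ZeroInflatedNegativeBinomialP(y, x, exog_infl=const).fit(disp=0, maxiter=500),
}
for name, res in sorted(fits.items(), key=lambda kv: kv[1].aic):
    print(f"{name:7s} AIC = {res.aic:.1f}")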

Journal ArticleDOI
TL;DR: In this article, the authors analyze eruption catalogs from volcanoes worldwide in order to find "universal" relationships and peculiarities linked to different eruptive styles, and build general probabilistic models for volcanic hazard assessment of open and closed conduit systems.
Abstract: The modeling of the statistical distribution of eruptive frequency and volume provides basic information to assess volcanic hazard and to constrain the physics of the eruptive process. We analyze eruption catalogs from volcanoes worldwide in order to find “universal” relationships and peculiarities linked to different eruptive styles. In particular, we test (1) the Poisson process hypothesis in the time domain, looking for significant clustering of events or the presence of almost regular recurrence times, (2) the relationship between the time to the next eruption and the size of the previous event (the “time predictable” model), and (3) the relationship between the size of an event and the previous repose time (the “size predictable” model). The results indicate different behavior for volcanoes with “open” conduit regimes compared to those with “closed” conduit regimes. Open conduit systems follow a time predictable model, with a marked time clustering of events; closed conduit systems have no significant tendency toward a size or a time predictable model, and the eruptions follow mostly a Poisson distribution. These results are used to build general probabilistic models for volcanic hazard assessment of open and closed conduit systems.
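One simple way to examine the Poisson-process hypothesis in the time domain is to check whether inter-event (repose) times are consistent with an exponential distribution; the sketch below does this with a Kolmogorov-Smirnov test and also reports the coefficient of variation of the gaps (values near 1 are Poisson-like, values well above 1 suggest clustering). The eruption dates are invented for illustration, and estimating the mean from the same data makes the KS p-value only approximate:

import numpy as np
from scipy import stats

# hypothetical eruption onset times (decimal years); a real catalog would go here
onsets = np.array([1903.2, 1905.1, 1905.8, 1921.4, 1922.0, 1922.3,
                   1944.7, 1960.2, 1961.0, 1961.4, 1985.9, 2001.3])
gaps = np.diff(onsets)

cv = gaps.std(ddof=1) / gaps.mean()              # ~1 for a Poisson process
ks = stats.kstest(gaps, "expon", args=(0, gaps.mean()))
print(f"coefficient of variation of repose times = {cv:.2f}")
print(f"KS test against exponential: D = {ks.statistic:.2f}, p = {ks.pvalue:.3f}")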

Journal ArticleDOI
TL;DR: Five regression models are fitted to data assessing predictors of vigorous physical activity (VPA) among Latina women; the ZIP model fit best, and its intensity portion suggested that increasing days of VPA were associated with more education and marginally associated with increasing age.
Abstract: Counting outcomes such as days of physical activity or servings of fruits and vegetables often have distributions that are highly skewed toward the right with a preponderance of zeros, posing analytical challenges. This paper demonstrates how such outcomes may be analyzed with several modifications to Poisson regression. Five regression models, 1) Poisson, 2) overdispersed Poisson, 3) negative binomial, 4) zero-inflated Poisson (ZIP), and 5) zero-inflated negative binomial (ZINB), are fitted to data assessing predictors of vigorous physical activity (VPA) among Latina women. The models are described, and analytical and graphical approaches are discussed to aid in model selection. Poisson regression provided a poor fit where 82% of the subjects reported no days of VPA. The fit improved considerably with the negative binomial and ZIP models. There was little difference in fit between the ZIP and ZINB models. Overall, the ZIP model fit best. No days of VPA were associated with poorer self-reported health and less assimilation to Anglo culture, and marginally associated with increasing BMI. The intensity portion of the model suggested that increasing days of VPA were associated with more education, and marginally associated with increasing age. These underutilized models provide useful approaches for handling counting outcomes.

Journal ArticleDOI
TL;DR: The objective is to design an alarm time that is adapted to the history of the arrival process and detects the disorder time as soon as possible; in this paper the new arrival rate after the disorder is assumed to be a random variable.
Abstract: We study the quickest detection problem of a sudden change in the arrival rate of a Poisson process from a known value to an unknown and unobservable value at an unknown and unobservable disorder time. Our objective is to design an alarm time which is adapted to the history of the arrival process and detects the disorder time as soon as possible. In previous solvable versions of the Poisson disorder problem, the arrival rate after the disorder has been assumed a known constant. In reality, however, we may at most have some prior information about the likely values of the new arrival rate before the disorder actually happens, and insufficient estimates of the new rate after the disorder happens. Consequently, we assume in this paper that the new arrival rate after the disorder is a random variable. The detection problem is shown to admit a finite-dimensional Markovian sufficient statistic, if the new rate has a discrete distribution with finitely many atoms. Furthermore, the detection problem is cast as a discounted optimal stopping problem with running cost for a finite-dimensional piecewise-deterministic Markov process. This optimal stopping problem is studied in detail in the special case where the new arrival rate has Bernoulli distribution. This is a nontrivial optimal stopping problem for a two-dimensional piecewise-deterministic Markov process driven by the same point process. Using a suitable single-jump operator, we solve it fully, describe the analytic properties of the value function and the stopping region, and present methods for their numerical calculation. We provide a concrete example where the value function does not satisfy the smooth-fit principle on a proper subset of the connected, continuously differentiable optimal stopping boundary, whereas it does on the complement of this set.

Journal ArticleDOI
TL;DR: Using the form of the distribution of the interarrival times of the process N under the Palm distribution, an exploratory statistical analysis of simulated data and of Internet packet arrivals to a server is conducted.
Abstract: In this paper we consider a Poisson cluster process N as a generating process for the arrivals of packets to a server. This process generalizes in a more realistic way the infinite source Poisson model which has been used for modeling teletraffic for a long time. At each Poisson point ? j , a flow of packets is initiated which is modeled as a partial iid sum process $$\Gamma_j+\sum_i=1^kX_ji, k\le K_j$$ , with a random limit K j which is independent of (X ji ) and the underlying Poisson points (? j ). We study the covariance structure of the increment process of N. In particular, the covariance function of the increment process is not summable if the right tail P(K j > x) is regularly varying with index ?? (1, 2), the distribution of the X ji 's being irrelevant. This means that the increment process exhibits long-range dependence. If var(K j ) < ? long-range dependence is excluded. We study the asymptotic behavior of the process (N(t)) t? 0 and give conditions on the distribution of K j and X ji under which the random sums $$\sum_{i=1}^{K_j}X_{ji}$$ have a regularly varying tail. Using the form of the distribution of the interarrival times of the process N under the Palm distribution, we also conduct an exploratory statistical analysis of simulated data and of Internet packet arrivals to a server. We illustrate how the theoretical results can be used to detect distribution al characteristics of K j , X ji , and of the Poisson process.