
Showing papers on "Poisson distribution published in 2021"


Journal ArticleDOI
TL;DR: In this article, a zero-inflated generalized Poisson (ZIGP) regression model is proposed to model domestic violence data with too many zeros; the generalized Poisson model is a good competitor to the negative binomial regression model when the count data are over-dispersed.
Abstract: The generalized Poisson regression model has been used to model dispersed count data. It is a good competitor to the negative binomial regression model when the count data are over-dispersed. Zero-inflated Poisson and zero-inflated negative binomial regression models have been proposed for situations where the data-generating process results in too many zeros. In this paper, we propose a zero-inflated generalized Poisson (ZIGP) regression model to model domestic violence data with too many zeros. Estimation of the model parameters using the method of maximum likelihood is provided. A score test is presented to test whether the number of zeros is too large for the generalized Poisson model to adequately fit the domestic violence data.
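As a rough illustration of the model structure (a sketch, not the authors' estimation code), the ZIGP log-likelihood can be written down directly, assuming Consul's parametrization of the generalized Poisson pmf; theta (rate), alpha (dispersion) and omega (zero-inflation weight) are illustrative names.

```python
import numpy as np
from scipy.special import gammaln

def gp_logpmf(y, theta, alpha):
    # Generalized Poisson (Consul): P(Y=y) = theta*(theta+alpha*y)**(y-1)
    #                                        * exp(-theta-alpha*y) / y!
    return (np.log(theta) + (y - 1) * np.log(theta + alpha * y)
            - (theta + alpha * y) - gammaln(y + 1))

def zigp_loglik(y, theta, alpha, omega):
    # Zero-inflated GP: a point mass at zero (weight omega) mixed with a GP.
    log_zero = np.log(omega + (1 - omega) * np.exp(gp_logpmf(0, theta, alpha)))
    log_pos = np.log(1 - omega) + gp_logpmf(y, theta, alpha)
    return np.sum(np.where(y == 0, log_zero, log_pos))
```

Maximizing this over (theta, alpha, omega), e.g. with scipy.optimize.minimize, corresponds to the maximum likelihood step described in the abstract.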

229 citations


Journal ArticleDOI
TL;DR: A growing interest in non-Gaussian time series, particularly in series comprised of nonnegative integers (counts), is taking place in today's statistics literature, as discussed by the authors; such count series naturally arise in ...
Abstract: A growing interest in non-Gaussian time series, particularly in series comprised of nonnegative integers (counts), is taking place in today’s statistics literature. Count series naturally arise in ...

43 citations


Journal ArticleDOI
TL;DR: In this paper, an analytical model that predicts Poisson's ratios is derived, from which the crushing strength in terms of Poisson's ratios is further obtained using a one-dimensional shock model.
Abstract: Owing to their topological diversity, in-plane hexagonal honeycombs show some interesting mechanical properties which can be related to Poisson's ratios. This work tries to understand how Poisson's ratios affect the crashworthiness of in-plane honeycombs. Based on standard beam theory, an analytical model that predicts Poisson's ratios is derived, from which the crushing strength in terms of Poisson's ratios is further proposed using a one-dimensional shock model. Combining analytical and numerical methods, the effect of Poisson's ratios on critical crushing characteristics such as plateau stress, densification strain, and Specific Energy Absorption (SEA) has been investigated at crushing speeds ranging from 5 m/s to 150 m/s, consistent with previous studies of in-plane honeycomb crashworthiness. By observing the wave-trapping behavior of the honeycombs, some typical deformation modes are identified and a deformation-mode map is generated by considering critical wave-trapping speeds and Poisson's ratio.

41 citations


Journal ArticleDOI
TL;DR: In health psychology, dependent variables are often counts, for example, of a behaviour or the number of engagements with an intervention, as discussed by the authors, and these counts can be very strongly skewed, and/or co...
Abstract: Background: Dependent variables in health psychology are often counts, for example, of a behaviour or number of engagements with an intervention. These counts can be very strongly skewed, and/or co...

39 citations


Journal ArticleDOI
01 Jun 2021
TL;DR: This paper gives a review of concentration inequalities which are widely employed in non-asymptotic analyses in mathematical statistics, in a wide range of settings: from distribution-free to distribution-dependent, from sub-Gaussian to sub-exponential, sub-Gamma, and sub-Weibull random variables, and from the mean to the maximum concentration.
Abstract: This paper gives a review of concentration inequalities which are widely employed in non-asymptotic analyses in mathematical statistics, in a wide range of settings: from distribution-free to distribution-dependent, from sub-Gaussian to sub-exponential, sub-Gamma, and sub-Weibull random variables, and from the mean to the maximum concentration. This review presents results in these settings together with some new ones. Given the increasing popularity of high-dimensional data and inference, results in the context of high-dimensional linear and Poisson regressions are also provided. We aim to illustrate the concentration inequalities with known constants and to improve existing bounds with sharper constants.
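As a small concrete instance of the Poisson-flavored bounds such a review covers, the classical Chernoff tail bound P(X >= x) <= exp(-lam) * (e*lam/x)**x for x >= lam can be checked numerically (a sketch, not a result taken from the paper):

```python
import numpy as np
from scipy.stats import poisson

lam = 10.0
for x in [12, 15, 20, 30]:
    exact = poisson.sf(x - 1, lam)                   # P(X >= x)
    bound = np.exp(-lam + x - x * np.log(x / lam))   # exp(-lam)*(e*lam/x)**x
    print(f"x={x:3d}  P(X>=x)={exact:.3e}  bound={bound:.3e}")
```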

39 citations


Journal ArticleDOI
TL;DR: In this article, two-dimensional δ-phase carbon monochalcogenides were proposed for nanomechanical applications using first-principles calculations.
Abstract: Auxetic materials (negative Poisson’s ratio) are of exceptional importance for nanomechanical applications. Using first-principles calculations, we propose two-dimensional δ-phase carbon monochalco...

38 citations



Journal ArticleDOI
TL;DR: In this paper, the authors presented an evaluation framework to assess the suitability of applying the Poisson, NB, GP, ZIP and ZIGP regression models to counts of C. caretta hatchlings.
Abstract: Recently, count regression models have been used to model over-dispersed and zero-inflated count response variables that are affected by one or more covariates. Generalized Poisson (GP) and negative binomial (NB) regression models have been suggested to deal with over-dispersion. Zero-inflated count regression models such as the zero-inflated Poisson (ZIP), zero-inflated negative binomial (ZINB) and zero-inflated generalized Poisson (ZIGP) regression models have been used to handle count data with many zeros. The aim of this study is to model the number of C. caretta hatchlings dying from exposure to the sun. We present an evaluation framework to assess the suitability of applying the Poisson, NB, GP, ZIP and ZIGP models to a zoological data set where the count data may exhibit evidence of many zeros and over-dispersion. Estimation of the model parameters using the method of maximum likelihood (ML) is provided. Based on the score test and the goodness-of-fit measure for the zoological data, the GP regression model performs better than the other count regression models.
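All five candidate models are available in statsmodels, so the comparison step can be sketched on synthetic zero-heavy data (illustrative data, not the C. caretta set; the paper's model choice also uses a score test rather than AIC alone):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.discrete.discrete_model import Poisson, NegativeBinomial, GeneralizedPoisson
from statsmodels.discrete.count_model import ZeroInflatedPoisson, ZeroInflatedGeneralizedPoisson

rng = np.random.default_rng(0)
n = 500
X = sm.add_constant(rng.normal(size=(n, 2)))
mu = np.exp(X @ np.array([0.5, 0.3, -0.2]))
y = np.where(rng.random(n) < 0.3, 0, rng.poisson(mu))   # inject extra zeros

fits = {
    "Poisson": Poisson(y, X).fit(disp=0),
    "NB":      NegativeBinomial(y, X).fit(disp=0),
    "GP":      GeneralizedPoisson(y, X).fit(disp=0),
    "ZIP":     ZeroInflatedPoisson(y, X).fit(disp=0),
    "ZIGP":    ZeroInflatedGeneralizedPoisson(y, X).fit(disp=0),
}
for name, res in fits.items():
    print(f"{name:7s} AIC = {res.aic:.1f}")   # lower is better
```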

30 citations


Journal ArticleDOI
TL;DR: Wang et al. as mentioned in this paper considered the maximum likelihood estimate of the Poisson distribution, and utilized the Kullback-Leibler divergence as the data-fitting term to measure the fit between the observations and the underlying tensor.
Abstract: Poisson observations for videos are important models in computer vision. In this paper, we study the third-order tensor completion problem with Poisson observations. The main aim is to recover a tensor based on a small number of its Poisson observation entries. An existing matrix-based method may be applied to this problem via the matricized version of the tensor. However, this method does not leverage the global low-rankness of a tensor and may be substantially suboptimal. Our approach is to consider the maximum likelihood estimate of the Poisson distribution, and to utilize the Kullback-Leibler divergence as the data-fitting term to measure the fit between the observations and the underlying tensor. Moreover, we propose to employ a transformed tensor nuclear norm (TTNN) ball constraint and a bounded constraint on each entry, where the TTNN is used to obtain a tensor of lower transformed multi-rank under a suitable unitary transformation. We show that the upper bound on the estimator of the proposed model is less than that of the existing matrix-based method. A lower error bound is also established. An alternating direction method of multipliers is developed to solve the resulting model. Extensive numerical experiments are presented to demonstrate the effectiveness of our proposed model.
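On the observed entries, the Poisson maximum-likelihood fitting term reduces (up to constants in the observations) to a KL-type divergence; a minimal sketch with illustrative names:

```python
import numpy as np

def poisson_kl_fit(x_obs, y_obs, eps=1e-10):
    # Negative Poisson log-likelihood sum_i (x_i - y_i*log(x_i)), dropping the
    # log(y_i!) terms that do not depend on the estimated tensor entries x.
    x = np.maximum(x_obs, eps)   # entries kept positive / bounded below
    return np.sum(x - y_obs * np.log(x))
```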

30 citations


Journal ArticleDOI
TL;DR: In this paper, the authors introduce the degenerate zero-truncated Poisson random variables, whose probability mass functions are a natural extension of the zero-truncated Poisson distributions, and investigate various properties of those random variables.
Abstract: Recently, the degenerate Poisson random variable with parameter $$\alpha > 0$$ , whose probability mass function is given by $$P_{\lambda}(i) = e_{\lambda}^{-1} (\alpha) \frac{\alpha^{i}}{i!} (1)_{i,\lambda}$$ $$(i = 0,1,2,\dots)$$ , was studied. In probability theory, the zero-truncated Poisson distributions are certain discrete probability distributions whose supports are the set of positive integers. These distributions are also known as the conditional Poisson distributions or the positive Poisson distributions. In this paper, we introduce the degenerate zero-truncated Poisson random variables whose probability mass functions are a natural extension of the zero-truncated Poisson distributions, and investigate various properties of those random variables.
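Writing out the pmf from the abstract makes the construction concrete: $e_{\lambda}(t) = (1+\lambda t)^{1/\lambda}$ is the degenerate exponential and $(1)_{i,\lambda} = \prod_{k<i}(1 - k\lambda)$ the generalized falling factorial; zero-truncation renormalizes over $i \ge 1$. A sketch for small lambda (illustrative, not the authors' code):

```python
import math
import numpy as np

def degenerate_poisson_pmf(i, alpha, lam):
    falling = np.prod([1 - k * lam for k in range(i)])   # (1)_{i,lambda}
    e_inv = (1 + lam * alpha) ** (-1.0 / lam)            # e_lambda^{-1}(alpha)
    return e_inv * alpha ** i / math.factorial(i) * falling

def degenerate_ztp_pmf(i, alpha, lam):
    p0 = degenerate_poisson_pmf(0, alpha, lam)           # = e_lambda^{-1}(alpha)
    return degenerate_poisson_pmf(i, alpha, lam) / (1 - p0)   # for i >= 1
```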

26 citations


Journal ArticleDOI
14 Apr 2021-PLOS ONE
TL;DR: In this article, a Poisson distribution for the daily incidence number is used to predict the number of new SARS-CoV-2 infections during the current COVID-19 pandemic, assuming that the transmission rate will stay the same or change by a certain degree.
Abstract: Background Prediction of the dynamics of new SARS-CoV-2 infections during the current COVID-19 pandemic is critical for public health planning of efficient health care allocation and monitoring the effects of policy interventions. We describe a new approach that forecasts the number of incident cases in the near future given past occurrences using only a small number of assumptions. Methods Our approach to forecasting future COVID-19 cases involves 1) modeling the observed incidence cases using a Poisson distribution for the daily incidence number, and a gamma distribution for the serial interval; 2) estimating the effective reproduction number assuming its value stays constant during a short time interval; and 3) drawing future incidence cases from their posterior distributions, assuming that the current transmission rate will stay the same, or change by a certain degree. Results We apply our method to predicting the number of new COVID-19 cases in a single state in the U.S. and for a subset of counties within the state to demonstrate the utility of this method at varying scales of prediction. Our method produces reasonably accurate results when the effective reproduction number is distributed similarly in the future as in the past. Large deviations from the predicted results can imply that a change in policy or some other factors have occurred that have dramatically altered the disease transmission over time. Conclusion We presented a modelling approach that we believe can be easily adopted by others, and immediately useful for local or state planning.
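A minimal sketch of the forecasting loop in the spirit of steps 1-3, with illustrative serial-interval parameters and a point estimate of R standing in for the full posterior draws:

```python
import numpy as np
from scipy.stats import gamma

rng = np.random.default_rng(1)

def serial_interval_weights(mean=5.2, sd=2.8, max_days=20):
    shape, scale = (mean / sd) ** 2, sd ** 2 / mean       # gamma serial interval
    w = np.diff(gamma.cdf(np.arange(max_days + 1), shape, scale=scale))
    return w / w.sum()

def infectiousness(I, w):
    recent = np.asarray(I[-len(w):][::-1], dtype=float)   # I[t-1], I[t-2], ...
    return float(w[:len(recent)] @ recent)

def forecast(incidence, horizon=14, window=7, n_sims=1000):
    w = serial_interval_weights()
    # Constant R over the recent window, estimated as cases / infectiousness.
    lam_sum = sum(infectiousness(incidence[:t], w)
                  for t in range(len(incidence) - window, len(incidence)))
    R_hat = sum(incidence[-window:]) / max(lam_sum, 1e-9)
    sims = np.empty((n_sims, horizon), dtype=int)
    for s in range(n_sims):
        I = list(incidence)
        for h in range(horizon):
            I.append(rng.poisson(R_hat * infectiousness(I, w)))  # Poisson draw
            sims[s, h] = I[-1]
    return sims   # summarize, e.g., np.percentile(sims, [2.5, 50, 97.5], axis=0)
```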

Journal ArticleDOI
TL;DR: In this article, the authors derived energy-conserving time-discretizations for finite element particle-in-cell discretizations of the Vlasov-Maxwell system.

Journal ArticleDOI
TL;DR: The theoretical foundation for a number of model selection criteria is established in the context of inhomogeneous point processes and under various asymptotic settings: infill, increasing domain and combinations of these.
Abstract: The theoretical foundation for a number of model selection criteria is established in the context of inhomogeneous point processes and under various asymptotic settings: infill, increasing domain, and combinations of these. For inhomogeneous Poisson processes we consider the Akaike information criterion and the Bayesian information criterion, and in particular we identify the point process analogue of the sample size needed for the Bayesian information criterion. Considering general inhomogeneous point processes we derive new composite likelihood and composite Bayesian information criteria for selecting a regression model for the intensity function. The proposed model selection criteria are evaluated using simulations of Poisson processes and cluster point processes.

Journal ArticleDOI
TL;DR: The proposed zero-inflated Poisson factor analysis model provides valuable insights into the relation between the subgingival microbiome and periodontal disease, and an efficient and robust expectation-maximization algorithm is developed for parameter estimation.
Abstract: Dimension reduction of high-dimensional microbiome data facilitates subsequent analysis such as regression and clustering. Most existing reduction methods cannot fully accommodate the special features of the data such as count-valued and excessive zero reads. We propose a zero-inflated Poisson factor analysis model in this paper. The model assumes that microbiome read counts follow zero-inflated Poisson distributions with library size as offset and Poisson rates negatively related to the inflated zero occurrences. The latent parameters of the model form a low-rank matrix consisting of interpretable loadings and low-dimensional scores that can be used for further analyses. We develop an efficient and robust expectation-maximization algorithm for parameter estimation. We demonstrate the efficacy of the proposed method using comprehensive simulation studies. The application to the Oral Infections, Glucose Intolerance, and Insulin Resistance Study provides valuable insights into the relation between subgingival microbiome and periodontal disease.

Journal ArticleDOI
TL;DR: In this article, the authors argue that leveraging the Poisson distribution would be more appropriate and use simulations to show that bivariate Poisson regression (Karlis and Ntzoufras in J R Stat Soc Ser D Stat 52(3):381-393, 2003) reduces absolute bias when estimating the home advantage benefit in a single season of soccer games, relative to linear regression, by almost 85%.
Abstract: In the wake of the Covid-19 pandemic, 2019-2020 soccer seasons across the world were postponed and eventually made up during the summer months of 2020. Researchers from a variety of disciplines jumped at the opportunity to compare the rescheduled games, played in front of empty stadia, to previous games, played in front of fans. To date, most of this post-Covid soccer research has used linear regression models, or versions thereof, to estimate potential changes to the home advantage. However, we argue that leveraging the Poisson distribution would be more appropriate and use simulations to show that bivariate Poisson regression (Karlis and Ntzoufras in J R Stat Soc Ser D Stat 52(3):381-393, 2003) reduces absolute bias when estimating the home advantage benefit in a single season of soccer games, relative to linear regression, by almost 85%. Next, with data from 17 professional soccer leagues, we extend bivariate Poisson models to estimate the change in home advantage due to games being played without fans. In contrast to current research that suggests a drop in the home advantage, our findings are mixed; in some leagues, evidence points to a decrease, while in others, the home advantage may have risen. Altogether, this suggests a more complex causal mechanism for the impact of fans on sporting events.
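The bivariate Poisson construction at the heart of the argument is easy to simulate: home and away scores share a common Poisson component, which induces their correlation, and the home advantage enters the home rate on the log scale. The rates below are illustrative, not estimates from the paper:

```python
import numpy as np

rng = np.random.default_rng(7)

def simulate_match(mu_home, mu_away, lam_shared, home_adv):
    lam1 = np.exp(np.log(mu_home) + home_adv)    # home rate with advantage term
    w1, w2, w3 = rng.poisson([lam1, mu_away, lam_shared])
    return w1 + w3, w2 + w3                      # (home goals, away goals)

scores = [simulate_match(1.4, 1.1, 0.2, home_adv=0.25) for _ in range(380)]
home, away = map(np.array, zip(*scores))
print(home.mean(), away.mean(), np.corrcoef(home, away)[0, 1])  # cov = lam_shared
```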

Journal ArticleDOI
TL;DR: In this article, a series of cylindrical metastructures with programmable Poisson's ratio were devised via the curling of planar metamaterials, and the relationship of the deformation characteristics and Poisson's ratio between the cylindrical structures and the planar metamaterials was systematically analyzed.

Journal ArticleDOI
TL;DR: In this paper, the authors propose a conceptual framework for understanding classical (dynamical) r-matrices, quasi-Poisson groupoids and so on, and also propose a notion of a symplectic realization of shifted Poisson structures.

Journal ArticleDOI
TL;DR: In this article, the authors obtain Poisson equations satisfied by elliptic modular graph functions with four links, which leads to a non-trivial algebraic relation between the various graphs.

Journal ArticleDOI
TL;DR: A machine learning approach to zero-inflated Poisson (ZIP) regression is introduced to address a common difficulty arising from imbalanced financial data, and demonstrates a significant improvement in performance relative to other popular alternatives, which justifies the modeling techniques.
Abstract: A machine learning approach to zero-inflated Poisson (ZIP) regression is introduced to address a common difficulty arising from imbalanced financial data. The suggested ZIP can be interpreted as an adaptive weight adjustment procedure that removes the need for post-modeling re-calibration and results in a substantial enhancement of predictive accuracy. Notwithstanding the increased complexity due to the expanded parameter set, we utilize a cyclic coordinate descent optimization to implement the ZIP regression, with adjustments made to address saddle points. We also study how various approaches alleviate the potential drawbacks of incomplete exposures in insurance applications. The procedure is tested on real-life data. We demonstrate a significant improvement in performance relative to other popular alternatives, which justifies our modeling techniques.

Proceedings ArticleDOI
15 Jun 2021
TL;DR: In this article, the authors considered a bipartite graph with offline vertices on one side and i.i.d. online vertices on the other side, and gave a 0.711-competitive online algorithm, improving on the best previous ratio of 0.706.
Abstract: We study the online stochastic matching problem. Consider a bipartite graph with offline vertices on one side, and with i.i.d. online vertices on the other side. The offline vertices and the distribution of online vertices are known to the algorithm beforehand. The realization of the online vertices, however, is revealed one at a time, upon which the algorithm immediately decides how to match it. For maximizing the cardinality of the matching, we give a 0.711-competitive online algorithm, which improves the best previous ratio of 0.706. When the offline vertices are weighted, we introduce a 0.7009-competitive online algorithm for maximizing the total weight of the matched offline vertices, which improves the best previous ratio of 0.662. Conceptually, we find that the analysis of online algorithms simplifies if the online vertices follow a Poisson process, and establish an approximate equivalence between this Poisson arrival model and online stochastic matching. Technically, we propose a natural linear program for the Poisson arrival model, and demonstrate how to exploit its structure by introducing a converse of Jensen's inequality. Moreover, we design an algorithmic amortization to replace the analytic one in previous work, and as a result get the first vertex-weighted online stochastic matching algorithm that improves the results in the weaker random arrival model.
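The approximate equivalence can be seen in a toy experiment: run the same greedy matching with exactly n i.i.d. arrivals and with a Poisson(n) number of arrivals, and compare matching sizes. The random graph and greedy rule below are illustrative and far simpler than the paper's algorithm:

```python
import numpy as np

rng = np.random.default_rng(3)

def greedy_matching(neighbors_of_type, arrivals, n_offline):
    free = np.ones(n_offline, dtype=bool)
    matched = 0
    for t in arrivals:                     # match each online arrival greedily
        for v in neighbors_of_type[t]:
            if free[v]:
                free[v] = False
                matched += 1
                break
    return matched

n, n_types = 100, 100
neighbors = [rng.choice(n, size=3, replace=False) for _ in range(n_types)]
iid, poi = [], []
for _ in range(2000):
    iid.append(greedy_matching(neighbors, rng.integers(n_types, size=n), n))
    poi.append(greedy_matching(neighbors,
                               rng.integers(n_types, size=rng.poisson(n)), n))
print(np.mean(iid), np.mean(poi))          # close in expectation
```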

Journal ArticleDOI
TL;DR: GPcounts as mentioned in this paper is a GP regression method for counts data using a negative binomial likelihood function, which can be used to model temporal and spatial counts data in cases where simpler Gaussian and Poisson likelihoods are unrealistic.
Abstract: Motivation The negative binomial distribution has been shown to be a good model for counts data from both bulk and single-cell RNA-sequencing (RNA-seq). Gaussian process (GP) regression provides a useful non-parametric approach for modeling temporal or spatial changes in gene expression. However, currently available GP regression methods that implement negative binomial likelihood models do not scale to the increasingly large datasets being produced by single-cell and spatial transcriptomics. Results The GPcounts package implements GP regression methods for modelling counts data using a negative binomial likelihood function. Computational efficiency is achieved through the use of variational Bayesian inference. The GP function models changes in the mean of the negative binomial likelihood through a logarithmic link function and the dispersion parameter is fitted by maximum likelihood. We validate the method on simulated time course data, showing better performance in identifying changes in over-dispersed counts data than methods based on Gaussian or Poisson likelihoods. To demonstrate temporal inference, we apply GPcounts to single-cell RNA-seq datasets after pseudotime and branching inference. To demonstrate spatial inference, we apply GPcounts to data from the mouse olfactory bulb to identify spatially variable genes and compare to two published GP methods. We also provide the option of modelling additional dropout using a zero-inflated negative binomial. Our results show that GPcounts can be used to model temporal and spatial counts data in cases where simpler Gaussian and Poisson likelihoods are unrealistic. Availability GPcounts is implemented using the GPflow library in Python and is available at https://github.com/ManchesterBioinference/GPcounts along with the data, code and notebooks required to reproduce the results presented here. The version used for this paper is archived at https://doi.org/10.5281/zenodo.5027066. Supplementary information Supplementary data are available at Bioinformatics online.

Journal ArticleDOI
01 Feb 2021
TL;DR: In this paper, a new three-parameter model called the Alpha power Gompertz is derived, studied and proposed for modeling lifetime Poisson processes; it has a left-skewed, decreasing, unimodal density with a bathtub-shaped hazard rate function.
Abstract: A new three-parameter model called the Alpha power Gompertz is derived, studied and proposed for modeling lifetime Poisson processes. The advantage of the new model is that it has a left-skewed, decreasing, unimodal density with a bathtub-shaped hazard rate function. The statistical structural properties of the proposed model such as probability weighted moments, moments, order statistics, entropies, hazard rate, survival, quantile, odd, reversed hazard, moment generating and cumulative functions are investigated. The new proposed model is expressed as a linear mixture of Gompertz densities. The parameters of the proposed model were obtained using the maximum likelihood method. The behaviour of the new density is examined through simulation. The proposed model was applied to two real-life data sets to demonstrate its flexibility. The newly proposed density provides a better fit when compared with other existing models and can serve as an alternative model in the literature.
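For reference, the alpha-power construction of Mahdavi and Kundu applies G(x) = (alpha**F(x) - 1)/(alpha - 1) to a baseline CDF; a sketch with one common Gompertz parametrization (the paper's exact parametrization may differ):

```python
import numpy as np

def gompertz_cdf(x, b, c):
    # One common Gompertz parametrization; b, c > 0.
    return 1.0 - np.exp(-(b / c) * (np.exp(c * x) - 1.0))

def alpha_power_gompertz_cdf(x, alpha, b, c):
    # Alpha-power transform of the baseline; alpha > 0, alpha != 1
    # (alpha -> 1 recovers the baseline Gompertz).
    F = gompertz_cdf(x, b, c)
    return (alpha ** F - 1.0) / (alpha - 1.0)
```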

Journal ArticleDOI
TL;DR: In this article, the relationship of the likelihood function and parameter estimation between conditional Poisson regression models and Cox's proportional hazards models in SCCS and matched cohort studies was demonstrated.
Abstract: The self-controlled case series (SCCS) and the matched cohort are two frequently used study designs to adjust for known and unknown confounding effects in epidemiological studies. Count data arising from these two designs may not be independent. While conditional Poisson regression models have been used to take into account the dependence of such data, these models have not been available in some standard statistical software packages (e.g., SAS). This article demonstrates 1) the relationship of the likelihood function and parameter estimation between conditional Poisson regression models and Cox's proportional hazards models in SCCS and matched cohort studies; and 2) that it is possible to fit conditional Poisson regression models with procedures (e.g., PHREG in SAS) using Cox's partial likelihood model. We tested both conditional Poisson likelihood and Cox's partial likelihood models on data from studies using either SCCS or a matched cohort design. For the SCCS study, we fitted both parametric and semi-parametric models to model age effects, and described a simple way to apply the parametric and complex semi-parametric analysis to case series data.
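The link exploited here can be made concrete: conditional on a stratum's total count, the counts are multinomial with probabilities proportional to exp(x'beta) (times exposure), which gives the likelihood its partial-likelihood-like form. A minimal sketch with illustrative names, not the article's SAS code:

```python
import numpy as np

def conditional_poisson_loglik(beta, strata):
    # strata: list of (y, X, log_offset) per matched set / case series;
    # y are counts, X covariates, log_offset log-exposures.
    ll = 0.0
    for y, X, log_offset in strata:
        eta = X @ beta + log_offset              # log rates up to a constant
        ll += y @ eta - y.sum() * np.log(np.sum(np.exp(eta)))
    return ll
```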

Posted Content
TL;DR: The approach generalizes and simplifies standard compartmental models that use high-dimensional systems of ordinary differential equations (ODEs) to account for disease complexity, and shows that such models can always be rewritten in the framework, thus providing a low-dimensional yet equivalent representation of these complex models.
Abstract: We present a unifying, tractable approach for studying the spread of viruses causing complex diseases that need to be modeled with a large number of types (e.g., infective stage, clinical state, risk factor class). We show that recording each infected individual's infection age, i.e., the time elapsed since infection, is sufficient, because: 1. The age distribution $n(t, a)$ of the population at time $t$ can be described by means of a first-order, one-dimensional partial differential equation (PDE) known as the McKendrick-von Foerster equation. 2. The frequency of type $i$ at time $t$ is simply obtained by integrating the probability $p(a, i)$ of being in state $i$ at age $a$ against the age distribution $n(t, a)$. The advantage of this approach is three-fold. First, regardless of the number of types, macroscopic observables (e.g., incidence or prevalence of each type) rely only on a one-dimensional PDE "decorated" with types. This representation induces a simple methodology based on the McKendrick-von Foerster PDE with Poisson sampling to infer and forecast the epidemic. We illustrate this technique using French data from the COVID-19 epidemic. Second, our approach generalizes and simplifies standard compartmental models that use high-dimensional systems of ordinary differential equations (ODEs) to account for disease complexity. We show that such models can always be rewritten in our framework, thus providing a low-dimensional yet equivalent representation of these complex models. Third, beyond the simplicity of the approach, we show that our population model naturally appears as a universal scaling limit of a large class of fully stochastic individual-based epidemic models, where the initial condition of the PDE emerges as the limiting age structure of an exponentially growing population starting from a single individual.
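On a daily grid, the transport part of the McKendrick-von Foerster equation simply shifts the age profile, with new infections entering at age zero; a minimal sketch with an illustrative infectiousness kernel and the Poisson sampling mentioned above:

```python
import numpy as np

rng = np.random.default_rng(5)
max_age, days = 30, 120
ages = np.arange(max_age)
tau = 0.3 * np.exp(-0.5 * ((ages - 5) / 2.0) ** 2)   # infectiousness by age (illustrative)

n = np.zeros(max_age)
n[0] = 1.0                                 # a single fresh infection
observed = []
for t in range(days):
    births = tau @ n                       # new infections from the age profile
    n[1:] = n[:-1].copy()                  # transport: everyone ages one day
    n[0] = births                          # boundary condition at age 0
    observed.append(rng.poisson(births))   # Poisson sampling of reported cases
```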

Journal ArticleDOI
TL;DR: A penalization-based regression is applied to model the impact of weather conditions on pedestrian injury in the presence of a high level of collinearity among these conditions, and it is revealed that the weather conditions involved in this study have no significant impact on pedestrian injury counts.
Abstract: Statistical models for measuring the impact of adverse weather conditions on pedestrian injuries are of great importance for enhancing road safety measures. The development of these models in the presence of high collinearity among the weather conditions poses a real challenge in practice. The collinearity among these conditions may result in underestimation of the regression coefficients of the regression model, and hence inconsistency regarding the impact of the weather conditions on pedestrian injury counts. This paper presents a methodology through which penalization-based regression is applied to model the impact of weather conditions on pedestrian injury in the presence of a high level of collinearity among these conditions. More specifically, the methodology integrates the least absolute shrinkage and selection operator (Lasso) with a cross-validation approach. The statistical performance of the proposed methodology is assessed through an analytical comparison involving the standard Poisson regression, the Poisson generalized linear model (Poisson-GzLM), and the Ridge penalized regression model. The mean squared error (MSE) was used as the criterion of comparison. In terms of the MSE, the Lasso-based Poisson generalized linear model (Lasso-GzLM) revealed an advantage over the other regression models. Moreover, the study revealed that the weather conditions involved in this study have no significant impact on pedestrian injury counts.
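The Lasso-plus-cross-validation step can be sketched with statsmodels' L1-penalized Poisson fit; the synthetic data below (with deliberately collinear columns) stand in for the weather covariates:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.discrete.discrete_model import Poisson

rng = np.random.default_rng(2)
n, p = 400, 8
Z = rng.normal(size=(n, p))
Z[:, 1] = Z[:, 0] + 0.05 * rng.normal(size=n)   # induce strong collinearity
X = sm.add_constant(Z)
y = rng.poisson(np.exp(0.3 + 0.4 * Z[:, 0]))

folds = np.array_split(rng.permutation(n), 5)   # 5-fold cross-validation

def cv_loglik(alpha):
    ll = 0.0
    for test in folds:
        train = np.setdiff1d(np.arange(n), test)
        res = Poisson(y[train], X[train]).fit_regularized(method="l1",
                                                          alpha=alpha, disp=0)
        mu = np.exp(X[test] @ res.params)
        ll += np.sum(y[test] * np.log(mu) - mu)  # held-out Poisson log-lik
    return ll

best = max([0.1, 1.0, 5.0, 10.0], key=cv_loglik)
print("selected penalty:", best)
```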


Journal ArticleDOI
TL;DR: In this article, the authors prove a local limit theorem for the ratio of the Poisson distribution to the Gaussian distribution with the same mean and variance, using only elementary methods (Taylor expansions and Stirling's formula).
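The statement is easy to probe numerically: the ratio of the Poisson(lam) pmf at k to the Gaussian density with matching mean and variance approaches 1 near the mode as lam grows (a quick check, not the paper's proof):

```python
import numpy as np
from scipy.stats import poisson, norm

for lam in [10, 100, 1000]:
    k = np.arange(int(lam - 2 * np.sqrt(lam)), int(lam + 2 * np.sqrt(lam)))
    ratio = poisson.pmf(k, lam) / norm.pdf(k, loc=lam, scale=np.sqrt(lam))
    print(f"lam={lam:5d}  ratio range: {ratio.min():.4f} - {ratio.max():.4f}")
```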

Journal ArticleDOI
04 Jul 2021
TL;DR: In this paper, the existence and stability of solutions of random impulsive stochastic functional differential equations (ISFDEs) driven by Poisson jumps with finite delays were studied.
Abstract: In this article, we study the existence of solutions and their stability for random impulsive stochastic functional differential equations (ISFDEs) driven by Poisson jumps with finite delays. Initially, we prove the existence of mild solutions to the equations by using the Banach fixed point theorem. Then, we study the stability of random ISFDEs through the continuous dependence of solutions on the initial condition. Next, we investigate Hyers-Ulam stability results under the Lipschitz condition on a bounded and closed interval. Finally, an example is presented to illustrate our results.

Journal ArticleDOI
TL;DR: In this paper, the authors review the most commonly used regression model in general insurance pricing, the compound Poisson model with gamma claim sizes, which has two different parametrizations: the Poisson-gamma parametrization and Tweedie's compound Poisson parametrization.
Abstract: The most commonly used regression model in general insurance pricing is the compound Poisson model with gamma claim sizes. There are two different parametrizations for this model: the Poisson-gamma parametrization and Tweedie's compound Poisson parametrization. The insurance industry typically prefers the Poisson-gamma parametrization. We review both parametrizations, provide new results that help to lower the computational cost of Tweedie's compound Poisson parameter estimation within generalized linear models, and provide evidence supporting the industry preference for the Poisson-gamma parametrization.
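The two views are related by a closed-form reparametrization; a sketch of the standard mapping for 1 < p < 2 (illustrative function name):

```python
def tweedie_to_poisson_gamma(mu, phi, p):
    # Standard mapping: Tweedie(mu, phi, p) equals a compound Poisson-gamma
    # with Poisson rate lam and gamma(shape a, scale s) claim sizes.
    assert 1 < p < 2
    lam = mu ** (2 - p) / (phi * (2 - p))   # expected number of claims
    a = (2 - p) / (p - 1)                   # gamma shape per claim
    s = phi * (p - 1) * mu ** (p - 1)       # gamma scale per claim
    return lam, a, s                        # sanity check: lam * a * s == mu

print(tweedie_to_poisson_gamma(mu=100.0, phi=2.0, p=1.5))
```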

Proceedings ArticleDOI
19 Mar 2021
TL;DR: A basic understanding of probability, of its key mathematical features as well as the characteristics it presents within specific circumstances, is the aim of this paper, as discussed by the authors; the action of probability is related to the characteristics of the phenomena that we can forecast.
Abstract: A basic understanding of probability, of its key mathematical features as well as the characteristics it presents within specific circumstances, is the aim of this paper. The action of probability is related to the characteristics of the phenomena that we can forecast. This relation can be described as a distribution of probability. Once the nature of the phenomenon (which can also be described by variables) is identified, the probability distribution is defined. In the majority of cases, the likelihood for categorical (or discrete) variables can be represented by a binomial or Poisson distribution. Distributions of probability are briefly defined, along with some examples of their potential use. The Poisson distribution is a discrete probability distribution that is mostly utilized to model count data within a given time span, such as the number of traffic accidents or the number of phone calls received. Each entry starts with a definition and explanation of the properties of the Poisson distribution, which is followed by a discussion of how to obtain or estimate the Poisson distribution. Finally, every entry provides a discussion of applications that use Python libraries.
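In that spirit, a minimal Python example using scipy.stats (the rate of 4 calls per day is illustrative):

```python
from scipy.stats import poisson

lam = 4  # average number of calls per day
print(poisson.pmf(2, lam))                        # P(exactly 2 calls)
print(poisson.cdf(6, lam))                        # P(at most 6 calls)
print(poisson.mean(lam), poisson.var(lam))        # both equal lam
print(poisson.rvs(lam, size=10, random_state=0))  # simulated daily counts
```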