
Showing papers in "Communications in Statistics - Theory and Methods" (2012)


Journal ArticleDOI
TL;DR: In this article, a two-parameter discrete gamma distribution is derived from the continuous two-parameter gamma distribution using the general approach for discretization of continuous probability distributions, and a few important distributional and reliability properties of the proposed distribution are examined.
Abstract: A two-parameter discrete gamma distribution is derived corresponding to the continuous two-parameter gamma distribution using the general approach for discretization of continuous probability distributions. The one-parameter discrete gamma distribution is obtained as a particular case. A few important distributional and reliability properties of the proposed distribution are examined. Parameter estimation by different methods is discussed, and the performance of the different estimation methods is compared through simulation. Data fitting is carried out to investigate the suitability of the proposed distribution for modeling discrete failure time data and other count data.
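
The discretization device the abstract refers to is easy to state: if S is the survival function of the continuous distribution, the discrete analog has pmf P(X = k) = S(k) − S(k+1) on the non-negative integers. A minimal sketch (scipy's shape/scale parameterization; parameter values are illustrative):

```python
# Minimal sketch of the general discretization approach: for a continuous survival
# function S, define P(X = k) = S(k) - S(k + 1), k = 0, 1, ...
# scipy's shape/scale parameterization; parameter values are illustrative.
import numpy as np
from scipy.stats import gamma

def discrete_gamma_pmf(k, shape, scale=1.0):
    k = np.asarray(k)
    return gamma.sf(k, shape, scale=scale) - gamma.sf(k + 1, shape, scale=scale)

k = np.arange(10)
pmf = discrete_gamma_pmf(k, shape=2.5, scale=1.2)
print(pmf, pmf.sum())  # the partial sum approaches 1 as the support grows
```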

98 citations


Journal ArticleDOI
TL;DR: In this article, a measure of distance between two distributions that is similar to the Kullback–Leibler divergence, but uses the distribution function rather than the density function, is introduced.
Abstract: Testing exponentiality has long been an interesting issue in statistical inference. In this article, we introduce a new measure of distance between two distributions that is similar to the Kullback–Leibler divergence but uses the distribution function rather than the density function. This new measure is based on the cumulative residual entropy. Based on it, a consistent test statistic for testing the hypothesis of exponentiality against some alternatives is developed. Critical values for various sample sizes, determined by means of Monte Carlo simulations, are presented for the test statistics. Also, by means of Monte Carlo simulations, the power of the proposed test under various alternatives is compared with that of other tests. We find that the power differences between the proposed test and other tests are not remarkable. The use of the proposed test is shown in an illustrative example.
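
For orientation, the cumulative residual entropy (CRE) replaces the density in Shannon's entropy with the survival function, CRE(X) = −∫ S(x) log S(x) dx. A hedged sketch of its empirical version, which is the building block of the test rather than the paper's exact statistic:

```python
# Empirical cumulative residual entropy, -∫ S(x) log S(x) dx, computed from the
# empirical survival function of a nonnegative sample. This is the building block
# of the test, not the paper's exact exponentiality statistic.
import numpy as np

def empirical_cre(x):
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    s = 1.0 - np.arange(1, n) / n       # S_n on [x_(i), x_(i+1)), i = 1, ..., n-1
    return -np.sum(s * np.log(s) * np.diff(x))

rng = np.random.default_rng(0)
sample = rng.exponential(scale=2.0, size=1000)
print(empirical_cre(sample))            # for Exp with mean 2, the CRE equals 2
```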

87 citations


Journal ArticleDOI
TL;DR: “A Dendrite Method for Cluster Analysis” by Tadeusz Caliński and Joachim Harabasz is a classic paper from the early Communications in Statistics, and it remains one of the most cited articles across the journal's two later parts, Theory & Methods and Simulation & Computation, second only to Kulldorff's 1997 article.
Abstract: “A Dendrite Method for Cluster Analysis” was a classical work by Tadeusz Calinski and Joachim Harabasz, published in early Communications in Statistics (Calinski and Harabasz, 1974). Their method i...
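
The Caliński–Harabasz criterion selects the number of clusters by maximizing the ratio of between-cluster to within-cluster dispersion; it is exposed directly in scikit-learn, so a usage sketch is short (the synthetic data are illustrative):

```python
# The Caliński–Harabasz (1974) criterion: ratio of between- to within-cluster
# dispersion; scikit-learn exposes it as calinski_harabasz_score.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import calinski_harabasz_score

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, size=(100, 2)) for c in (0.0, 4.0, 8.0)])

for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    print(k, calinski_harabasz_score(X, labels))   # pick k with the largest score
```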

66 citations


Journal ArticleDOI
TL;DR: In this article, a regression estimator is introduced that performs better than the ratio estimator even for modest correlation between the primary and the auxiliary variables, and better than the ordinary RRT mean estimator, which does not utilize the auxiliary information.
Abstract: Sousa et al. (2010) introduced a ratio estimator for the mean of a sensitive variable and showed that this estimator performs better than the ordinary mean estimator based on a randomized response technique (RRT). In this article, we introduce a regression estimator that performs better than the ratio estimator even for modest correlation between the primary and the auxiliary variables. The underlying assumption is that the primary variable is sensitive in nature but a nonsensitive auxiliary variable exists that is positively correlated with the primary variable. Expressions for the bias and mean squared error (MSE) are derived based on a first-order approximation. It is shown that the proposed regression estimator performs better than the ratio estimator and the ordinary RRT mean estimator (which does not utilize the auxiliary information). We also consider a generalized regression-cum-ratio estimator that has even smaller MSE. An extensive simulation study is presented to evaluate the performances o...
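
A hedged simulation sketch of the setting may help: under an additive RRT, the sensitive Y is observed only as Z = Y + S with a scrambling noise S, and the three estimators follow the usual survey-sampling templates. All names and parameters below are illustrative, not the paper's notation:

```python
# Additive RRT setting: the sensitive Y is observed only as Z = Y + S, with S a
# scrambling noise; X is a nonsensitive auxiliary variable with known population
# mean mu_x. All names and parameters are illustrative.
import numpy as np

rng = np.random.default_rng(1)
N, n, mu_x = 100_000, 500, 10.0
x = rng.normal(mu_x, 2.0, N)
y = 5.0 + 0.8 * (x - mu_x) + rng.normal(0.0, 1.0, N)   # sensitive study variable

idx = rng.choice(N, n, replace=False)
z = y[idx] + rng.normal(0.0, 1.0, n)                   # scrambled responses
xs = x[idx]

mean_rrt = z.mean()                                    # ordinary RRT mean estimator
ratio = z.mean() * mu_x / xs.mean()                    # ratio estimator
b = np.cov(z, xs)[0, 1] / xs.var(ddof=1)
regression = z.mean() + b * (mu_x - xs.mean())         # regression estimator
print(mean_rrt, ratio, regression, "true mean:", y.mean())
```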

52 citations


Journal ArticleDOI
TL;DR: Hierarchical CUB models, as discussed in this paper, are a generalization of CUB models in which the parameters are allowed to be random; the main feature that distinguishes this proposal from the standard one is the modeling of variation among groups.
Abstract: Hierarchical CUB models are a generalization of CUB models in which parameters are allowed to be random. The main feature that distinguishes such proposal from the standard one is the modeling of variation among groups. We illustrate the usefulness of these hierarchical structures by discussing model specification, inferential issues, and empirical results with reference to a real data set.

46 citations


Journal ArticleDOI
TL;DR: In this paper, a class of self-exciting threshold integer-valued autoregressive models driven by independent Poisson-distributed random variables is introduced and parameter estimation is also addressed.
Abstract: In this article, we introduce a class of self-exciting threshold integer-valued autoregressive models driven by independent Poisson-distributed random variables. Basic probabilistic and statistical properties of this class of models are discussed. Moreover, parameter estimation is also addressed. Specifically, the methods of estimation under analysis are the least squares-type and likelihood-based ones. Their performance is compared through a simulation study.
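
For concreteness, a two-regime special case can be simulated in a few lines: the thinning probability switches according to whether the previous count exceeds a threshold, with binomial thinning and Poisson innovations. A hedged sketch (the regime structure and parameters are illustrative):

```python
# Two-regime self-exciting threshold INAR(1) with binomial thinning:
#   X_t = a1 ∘ X_{t-1} + Z_t  if X_{t-1} <= r,  else  X_t = a2 ∘ X_{t-1} + Z_t,
# with Z_t i.i.d. Poisson(lam). Regime structure and parameters are illustrative.
import numpy as np

def simulate_setinar(T, a1, a2, r, lam, seed=0):
    rng = np.random.default_rng(seed)
    x = np.zeros(T, dtype=int)
    for t in range(1, T):
        a = a1 if x[t - 1] <= r else a2
        x[t] = rng.binomial(x[t - 1], a) + rng.poisson(lam)  # thinning + innovation
    return x

path = simulate_setinar(1000, a1=0.3, a2=0.7, r=5, lam=2.0)
print(path[:20], path.mean())
```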

45 citations


Journal ArticleDOI
TL;DR: In this article, a conditional maximum likelihood approach based on the upper-order statistics was used to estimate the tail of a power law probability distribution with exponential tempering, using simulated data from a tempered stable distribution and for several data sets from geophysics and finance.
Abstract: Tail estimates are developed for power law probability distributions with exponential tempering, using a conditional maximum likelihood approach based on the upper-order statistics. Tempered power law distributions are intermediate between heavy power-law tails and Laplace or exponential tails, and are sometimes called “semi-heavy” tailed distributions. The estimation method is demonstrated on simulated data from a tempered stable distribution, and for several data sets from geophysics and finance that show a power law probability tail with some tempering.

44 citations


Journal ArticleDOI
TL;DR: In this article, a discrete generalized exponential (DGE(α, p)) distribution is introduced, which can be viewed as another generalization of the geometric distribution and is more flexible in data modeling.
Abstract: In this article, we attempt to introduce a discrete analog of the generalized exponential distribution of Gupta and Kundu (1999). This new discrete generalized exponential (DGE(α, p)) distribution can be viewed as another generalization of the geometric distribution and it is more flexible in data modeling. We shall first study some basic distributional and moment properties of this family of new distributions. Then, we will reveal their structural properties and applications and also investigate estimation of their parameters. Finally, we shall discuss their convolution properties and arrive at some characterizations in the special cases DGE(2, p) and DGE(3, p).
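
A minimal sketch of the construction, assuming the usual F(k+1) − F(k) discretization of the Gupta–Kundu cdf F(x) = (1 − e^{−λx})^α: with p = e^{−λ}, the DGE(α, p) pmf on {0, 1, ...} is (1 − p^{k+1})^α − (1 − p^k)^α.

```python
# DGE(α, p) pmf on {0, 1, ...}, assuming the usual F(k+1) - F(k) discretization of
# the Gupta–Kundu cdf F(x) = (1 - exp(-λx))^α, with p = exp(-λ).
import numpy as np

def dge_pmf(k, alpha, p):
    k = np.asarray(k)
    return (1 - p ** (k + 1)) ** alpha - (1 - p ** k) ** alpha

print(dge_pmf(np.arange(15), alpha=2.0, p=0.5))   # alpha = 1 recovers the geometric
```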

42 citations


Journal ArticleDOI
TL;DR: In this article, one- and two-sample Bayesian prediction intervals based on Type-II hybrid censored data are derived; a Gibbs sampling procedure is used to draw Markov chain Monte Carlo (MCMC) samples, which are in turn used to compute the approximate predictive survival function, and the corresponding numerical results are presented.
Abstract: In this article, one- and two-sample Bayesian prediction intervals based on Type-II hybrid censored data are derived. For illustration of the developed results, the Exponential(θ) and Pareto(α, β) distributions are used as examples. The one-sample Bayesian predictive survival function cannot be obtained in closed form. A Gibbs sampling procedure is therefore used to draw Markov chain Monte Carlo (MCMC) samples, which are in turn used to compute the approximate predictive survival function; the corresponding numerical results are presented.

39 citations


Journal ArticleDOI
TL;DR: In this paper, a stationary integer-valued autoregressive process of the first order with negative binomial marginals is considered; a set of estimators is studied and their asymptotic distributions are derived.
Abstract: The authors consider a stationary integer-valued autoregressive process of the first order with negative binomial marginals (NBINAR(1)). A set of estimators are considered and their asymptotic distributions are derived. Some numerical results of the estimates are presented. Also, the authors discuss a possible application of the process.

38 citations


Journal ArticleDOI
TL;DR: In this article, the authors derived analytic expressions for the biases of the maximum likelihood estimators of the scale parameter in the half-logistic distribution with known location, and of the location parameter when the latter is unknown.
Abstract: We derive analytic expressions for the biases of the maximum likelihood estimators of the scale parameter in the half-logistic distribution with known location, and of the location parameter when the latter is unknown. Using these expressions to bias-correct the estimators is highly effective, without adverse consequences for estimation mean squared error. The overall performance of the first of these bias-corrected estimators is slightly better than that of a bootstrap bias-corrected estimator. The bias-corrected estimator of the location parameter significantly outperforms its bootstrap-based counterpart. Taking computational costs into account, the analytic bias corrections clearly dominate the use of the bootstrap.
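
The analytic corrections themselves are derived in the paper; the bootstrap comparator they are benchmarked against is generic and easy to sketch. The snippet below uses scipy's halflogistic with the location fixed at zero, as an illustration rather than the paper's exact experiment:

```python
# Generic bootstrap bias correction for the scale MLE with known location, the
# comparator the analytic corrections are benchmarked against. Uses scipy's
# halflogistic with loc fixed at 0; an illustration, not the paper's experiment.
import numpy as np
from scipy.stats import halflogistic

rng = np.random.default_rng(0)
data = halflogistic.rvs(scale=2.0, size=30, random_state=rng)
_, scale_mle = halflogistic.fit(data, floc=0)     # MLE of the scale, location known

B = 500
boot = np.empty(B)
for b in range(B):
    resample = rng.choice(data, size=len(data), replace=True)
    _, boot[b] = halflogistic.fit(resample, floc=0)

corrected = 2 * scale_mle - boot.mean()           # theta_hat minus estimated bias
print(scale_mle, corrected)
```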

Journal ArticleDOI
TL;DR: In this article, the authors extended the idea of Carot et al. (2002) to combine double sampling and variable sampling interval s charts (DSVSI s chart) for improving efficiency in detection of small standard deviation shifts.
Abstract: This study extended the idea of Carot et al. (2002), combining double sampling and variable sampling interval s charts (the DSVSI s chart) to improve efficiency in detecting small standard deviation shifts. The performance of the DSVSI s chart in detecting small shifts was measured and compared with double sampling, variable sampling interval, EWMA, and CUSUM control charts. This comparative study found the DSVSI s chart to be efficient in detecting small shifts.

Journal ArticleDOI
TL;DR: In this paper, a numerical method for solving the continuous-time Markov chain (CTMC) model for reliability evaluation of phased-mission systems is presented, which generates the infinitesimal generator matrix based on the statistical independence of the subsystem failure and repair processes.
Abstract: This article presents a numerical method for solving the continuous-time Markov chain (CTMC) model for reliability evaluation of phased-mission systems (PMS). The method generates the infinitesimal generator matrix based on the statistical independence of the subsystem failure and repair processes. The infinitesimal generator matrix is stored using sparse-matrix compressed storage schemes, and the transient solution of the CTMC model is obtained using three methods, namely the uniformization method, the forward Euler method, and the Runge-Kutta method, all of which take advantage of the sparseness of the infinitesimal generator matrix. An example PMS is used to compare the preconditioning methods for sparse matrices and the numerical methods. Experimental results show that the compressed row storage (CRS) scheme saves more storage than other formats, and that the uniformization method combined with CRS achieves the best efficiency and accuracy.
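
Of the three transient solvers, uniformization is the most distinctive: it rewrites π(t) = π(0)e^{Qt} as a Poisson-weighted sum of powers of the uniformized DTMC P = I + Q/q with q ≥ max_i |Q_ii|. A sketch on a toy sparse generator (the 3-state chain is illustrative):

```python
# Uniformization for the CTMC transient solution:
#   pi(t) = sum_k e^{-qt} (qt)^k / k! * pi(0) P^k,  P = I + Q/q,  q >= max_i |Q_ii|.
# Toy 3-state sparse generator; the Poisson tail is truncated at tolerance tol.
import numpy as np
from scipy.sparse import csr_matrix, identity
from scipy.stats import poisson

Q = csr_matrix(np.array([[-2.0,  2.0,  0.0],
                         [ 1.0, -3.0,  2.0],
                         [ 0.0,  1.0, -1.0]]))
q = np.abs(Q.diagonal()).max()
P = identity(Q.shape[0], format="csr") + Q / q

def transient(pi0, t, tol=1e-12):
    K = int(poisson.isf(tol, q * t)) + 1            # Poisson tail truncation point
    w = poisson.pmf(np.arange(K + 1), q * t)
    v, out = np.asarray(pi0, float), np.zeros(len(pi0))
    for k in range(K + 1):
        out += w[k] * v
        v = P.T @ v                                 # v <- v P (row-vector update)
    return out

print(transient(np.array([1.0, 0.0, 0.0]), t=0.5))  # entries sum to ~1
```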

Journal ArticleDOI
TL;DR: In this paper, a class of goodness-of-fit tests for the gamma distribution that utilizes the empirical Laplace transform was proposed, and the consistency of the tests as well as their asymptotic distribution under the null hypothesis were investigated.
Abstract: We propose a class of goodness-of-fit tests for the gamma distribution that utilizes the empirical Laplace transform. The consistency of the tests as well as their asymptotic distribution under the null hypothesis are investigated. As the decay of the weight function tends to infinity, the test statistics approach limit values related to the first nonzero component of Neyman's smooth test for the gamma law. The new tests are compared with other omnibus tests for the gamma distribution.
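
A hedged sketch of the ingredient such tests are built from: the empirical Laplace transform L_n(t) = n^{−1} Σ_j e^{−tX_j} is compared with the fitted gamma transform (1 + θt)^{−k}, aggregated by a weighted L2 distance. The exponential weight and the method-of-moments fit below are illustrative, not the paper's statistic:

```python
# Empirical Laplace transform L_n(t) = mean(exp(-t X)) vs the fitted gamma
# transform (1 + θt)^{-k}, aggregated by a weighted L2 distance with weight
# exp(-a t). Weight and method-of-moments fit are illustrative, not the paper's
# exact statistic.
import numpy as np
from scipy.integrate import quad

rng = np.random.default_rng(0)
x = rng.gamma(shape=2.0, scale=1.5, size=200)

theta_hat = x.var() / x.mean()          # method-of-moments gamma fit
k_hat = x.mean() / theta_hat

def sq_diff(t, a=1.0):
    Ln = np.mean(np.exp(-t * x))                    # empirical Laplace transform
    Lg = (1.0 + theta_hat * t) ** (-k_hat)          # fitted gamma Laplace transform
    return (Ln - Lg) ** 2 * np.exp(-a * t)

stat, _ = quad(sq_diff, 0.0, np.inf)
print(len(x) * stat)     # large values signal departure from the gamma law
```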

Journal ArticleDOI
TL;DR: In this paper, a functional form of the generalized Poisson regression model is developed that parametrically nests the Poisson model and the two well-known generalized Poisson regression models (GP-1 and GP-2).
Abstract: This article develops a functional form of the generalized Poisson regression model that parametrically nests the Poisson model and the two well-known generalized Poisson regression models (GP-1 and GP-2). The proposed model is applied to Malaysian motor insurance claim count data.

Journal ArticleDOI
TL;DR: In this article, the authors show that the confidence posterior probability of an interval hypothesis is suitable as an estimator of the indicator of hypothesis truth since it converges to 1 if the hypothesis is true or to 0 otherwise.
Abstract: By representing fair betting odds according to one or more pairs of confidence set estimators, dual parameter distributions called confidence posteriors secure the coherence of actions without any prior distribution. This theory reduces to the maximization of expected utility when the pair of posteriors is induced by an exact or approximate confidence set estimator or when a reduction rule is applied to the pair. Unlike the p-value, the confidence posterior probability of an interval hypothesis is suitable as an estimator of the indicator of hypothesis truth since it converges to 1 if the hypothesis is true or to 0 otherwise.

Journal ArticleDOI
TL;DR: In this article, the authors present various distributional properties and application to reliability analysis of the Govindarajulu distribution, and make a comparative study with other competing models with reference to real data.
Abstract: In this article, we present various distributional properties and application to reliability analysis of the Govindarajulu distribution. A quantile-based analysis is performed as the distribution function is not analytically tractable. The properties of the distribution like percentiles, L-moments, L-skewness, and kurtosis and order statistics are presented. Various reliability characteristics are derived along with some characterization theorems by relationship between reliability measures. We also make a comparative study with other competing models with reference to real data.
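
Since the model is specified through its quantile function, sampling and percentile computations go through Q directly. A minimal sketch, assuming the quantile form Q(u) = θ + σ((β + 1)u^β − βu^{β+1}) used in the quantile-based literature on this distribution:

```python
# Inverse-transform sampling through the quantile function, assuming the form
#   Q(u) = theta + sigma * ((beta + 1) * u**beta - beta * u**(beta + 1))
# reported in the quantile-based literature on this distribution.
import numpy as np

def govindarajulu_q(u, sigma, beta, theta=0.0):
    u = np.asarray(u)
    return theta + sigma * ((beta + 1) * u ** beta - beta * u ** (beta + 1))

rng = np.random.default_rng(0)
sample = govindarajulu_q(rng.uniform(size=10_000), sigma=2.0, beta=3.0)
print(np.percentile(sample, [25, 50, 75]))
print(govindarajulu_q([0.25, 0.5, 0.75], sigma=2.0, beta=3.0))   # should agree
```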

Journal ArticleDOI
TL;DR: A new method of text clustering is proposed, considering links between terms and documents, and using centrality measures to assess word/text importance in a corpus and to sequentially classify documents.
Abstract: Text clustering is an unsupervised process of classifying texts and words into different groups. In the literature, many algorithms use a bag-of-words model to represent texts and classify contents. The bag-of-words model assumes that word order has no significance. The aim of this article is to propose a new method of text clustering that considers links between terms and documents. We use centrality measures to assess word/text importance in a corpus and to sequentially classify documents.
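
A hedged sketch of the representation this builds on: a bipartite term-document graph in place of the bag of words, with centrality scores ranking words and texts alike (the tiny corpus and the choice of degree centrality are illustrative):

```python
# Bipartite term-document graph in place of the bag of words; centrality scores
# rank words and texts alike. Tiny corpus and degree centrality are illustrative.
import networkx as nx

corpus = {
    "doc1": ["cluster", "distance", "algorithm"],
    "doc2": ["cluster", "centroid"],
    "doc3": ["graph", "centrality", "cluster"],
}

G = nx.Graph()
for doc, terms in corpus.items():
    for term in terms:
        G.add_edge(doc, term)                  # link each document to its terms

centrality = nx.degree_centrality(G)           # importance of words and texts
print(sorted(centrality.items(), key=lambda kv: -kv[1])[:5])   # "cluster" tops
```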

Journal ArticleDOI
TL;DR: In this paper, the authors propose a robust model for parameter estimation in the presence of latent population heterogeneity, where the data are a mixture of subgroups.
Abstract: This article proposes a robust model for parameter estimation when modeling in the presence of latent heterogeneity of the population. Mixtures of regression models are used when the data come from a mixture of subgroups, to allow for heterogeneity of the population. Parameter estimates from standard mixtures of linear regression models are sensitive to atypical observations. To study mixtures of linear regression models, we introduce a class of robust estimators, called S-estimators, and investigate their breakdown point in mixtures of linear regression models. It is expected that the robust S-estimators can achieve a high breakdown point in contaminated data from heterogeneous populations. This model presents a unified, robust framework, and parameter estimation is achieved via an expectation-conditional maximization (ECM) algorithm. This new family of robust mixture models is validated through Monte Carlo simulations. The application to real life data sets has shown that use of robust S-estimator...

Journal ArticleDOI
TL;DR: In this paper, the authors focus on the general k-step step-stress accelerated life tests with Type-I censoring for two-parameter Weibull distributions based on the tampered failure rate (TFR) model.
Abstract: In this article, we focus on the general k-step step-stress accelerated life tests with Type-I censoring for two-parameter Weibull distributions based on the tampered failure rate (TFR) model. We get the optimum design for the tests under the criterion of the minimization of the asymptotic variance of the maximum likelihood estimate of the pth percentile of the lifetime under the normal operating conditions. Optimum test plans for the simple step-stress accelerated life tests under Type-I censoring are developed for the Weibull distribution and the exponential distribution in particular. Finally, an example is provided to illustrate the proposed design and a sensitivity analysis is conducted to investigate the robustness of the design.

Journal ArticleDOI
TL;DR: This article investigates how the use of permutation tests instead of parametric ones affects the performance of Bayesian network structure learning from discrete data.
Abstract: In the literature there are several studies on the performance of Bayesian network structure learning algorithms. The focus of these studies is almost always the heuristics the learning algorithms are based on, i.e., the maximization algorithms (in score-based algorithms) or the techniques for learning the dependencies of each variable (in constraint-based algorithms). In this article, we investigate how the use of permutation tests instead of parametric ones affects the performance of Bayesian network structure learning from discrete data. Shrinkage tests are also covered to provide a broad overview of the techniques developed in the current literature.
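
The idea being benchmarked can be sketched on a single marginal independence test: replace the asymptotic chi-squared null with a permutation null, as a constraint-based learner would when deciding whether to keep an edge (an illustration, not the article's experimental setup):

```python
# Permutation null for a marginal independence test, in place of the asymptotic
# chi-squared null a constraint-based learner would normally use. Illustrative.
import numpy as np
from scipy.stats import chi2_contingency

def contingency(x, y, k):
    table = np.zeros((k, k))
    np.add.at(table, (x, y), 1)                # k x k counts for data coded 0..k-1
    return table

def perm_independence_pvalue(x, y, k=3, n_perm=2000, seed=0):
    rng = np.random.default_rng(seed)
    stat = chi2_contingency(contingency(x, y, k))[0]
    exceed = sum(
        chi2_contingency(contingency(x, rng.permutation(y), k))[0] >= stat
        for _ in range(n_perm)
    )
    return (exceed + 1) / (n_perm + 1)         # permutation p-value

rng = np.random.default_rng(1)
x = rng.integers(0, 3, 200)
y = (x + rng.integers(0, 2, 200)) % 3          # y depends on x
print(perm_independence_pvalue(x, y))          # small p-value flags dependence
```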

Journal ArticleDOI
TL;DR: In this article, the authors augment Box-Behnken designs with combinations of factorial points, axial points, and complementary design points to estimate the parameters of a third-order response surface model, making the most of a situation in which the experiment was designed for an inadequate model but can still provide useful information.
Abstract: Box-Behnken designs are popular with experimenters who wish to estimate a second-order model, due to their having three levels, their simplicity and their high efficiency for the second-order model. However, there are situations in which the model is inadequate due to lack of fit caused by higher-order terms. These designs have little ability to estimate third-order terms. Using combinations of factorial points, axial points, and complementary design points, we augment these designs and develop catalogues of third-order designs for 3–12 factors. These augmented designs can be used to estimate the parameters of a third-order response surface model. Since the aim is to make the most of a situation in which the experiment was designed for an inadequate model, the designs are clearly suboptimal and not rotatable for the third-order model, but can still provide useful information.

Journal ArticleDOI
TL;DR: In this article, a variable two-stage acceptance sampling plan is developed when the quality characteristic is evaluated through a process loss function, and the plan parameters of the proposed plan are determined by using the two-point approach and tabulated according to various quality levels.
Abstract: In this article, a variable two-stage acceptance sampling plan is developed when the quality characteristic is evaluated through a process loss function. The plan parameters of the proposed plan are determined by using the two-point approach and tabulated according to various quality levels. Two cases are discussed when the process mean lies at the target value and when it does not, respectively. Extensive tables are provided for both cases and the results are explained with examples. The advantage of the proposed plan is compared with the existing variable single acceptance sampling plan using the process loss function.

Journal ArticleDOI
TL;DR: In this paper, the authors provide a better foundation for some properties of the beta normal distribution and an analytical study of its bimodality, and derive explicit expressions for moments, the generating function, mean deviations (using a power series expansion for the quantile function), and the Shannon entropy.
Abstract: The beta normal distribution is a generalization of both the normal distribution and the normal order statistics. Some of its mathematical properties and a few applications have been studied in the literature. We provide a better foundation for some properties and an analytical study of its bimodality. The hazard rate function and the limiting behavior are examined. We derive explicit expressions for moments, generating function, mean deviations using a power series expansion for the quantile function, and Shannon entropy.
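
For reference, the beta normal density is f(x) = φ(x)Φ(x)^{a−1}(1 − Φ(x))^{b−1}/B(a, b). A quick numerical scan exhibits the bimodality studied in the article, which occurs for sufficiently small a and b (the parameter values are illustrative):

```python
# Beta normal density f(x) = phi(x) Phi(x)^(a-1) (1-Phi(x))^(b-1) / B(a, b);
# a numerical scan exhibits bimodality for sufficiently small a and b.
import numpy as np
from scipy.stats import norm
from scipy.special import beta as beta_fn

def beta_normal_pdf(x, a, b, mu=0.0, sigma=1.0):
    z = (x - mu) / sigma
    F = norm.cdf(z)
    return norm.pdf(z) * F ** (a - 1) * (1 - F) ** (b - 1) / (beta_fn(a, b) * sigma)

x = np.linspace(-6, 6, 4001)
f = beta_normal_pdf(x, a=0.1, b=0.1)
modes = np.sum((f[1:-1] > f[:-2]) & (f[1:-1] > f[2:]))   # interior local maxima
print(modes)                                             # 2 here: bimodal
```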

Journal ArticleDOI
TL;DR: In this article, the estimation of a Kronecker-structured covariance matrix of order three, the so-called double separable covariance matrix, is studied; the suggested estimation generalizes the procedure proposed by Srivastava et al. (2008).
Abstract: In this article, the multivariate normal distribution with a Kronecker product structured covariance matrix is studied. Particular focus is placed on the estimation of a Kronecker-structured covariance matrix of order three, the so-called double separable covariance matrix. The suggested estimation generalizes the procedure proposed by Srivastava et al. (2008) for a separable covariance matrix. The restrictions imposed by separability and double separability are also discussed.

Journal ArticleDOI
TL;DR: In this article, a copula-based method is proposed for analyzing the reliability of supply chains, by introducing the model of k-out-of-n: G system into the studies of supply chain, an evaluation method is suggested and the reliability indexes are obtained.
Abstract: Supply chain management has received considerable attention in the literature and it is meaningful and important to be able to measure the reliability of supply chains. In the article, the suppliers in the supply chain systems are not independent of each other and the dependency relation may be either linear or nonlinear correlation. From the view of the distribution service process, a copula-based method is proposed for analyzing the reliability of supply chains. In this article, by introducing the model of k-out-of-n: G system into the studies of supply chains, an evaluation method is suggested and the reliability indexes are obtained. Finally, a numerical example is presented to illustrate the results obtained in this article.
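
A hedged Monte Carlo sketch of the modeling idea: n dependent supplier "up" indicators coupled through a Gaussian copula feed a k-out-of-n: G system, which operates iff at least k suppliers operate. The copula family, marginals, and parameters are illustrative; the article derives the reliability indexes analytically:

```python
# Monte Carlo sketch: n dependent supplier "up" indicators coupled through a
# Gaussian copula feed a k-out-of-n: G system (works iff >= k suppliers work).
# Copula family, marginals, and parameters are illustrative.
import numpy as np
from scipy.stats import norm

def kofn_reliability(k, n, p_up, rho, n_sim=200_000, seed=0):
    rng = np.random.default_rng(seed)
    cov = np.full((n, n), rho) + (1 - rho) * np.eye(n)   # exchangeable correlation
    z = rng.multivariate_normal(np.zeros(n), cov, size=n_sim)
    up = norm.cdf(z) < p_up                              # marginal P(up) = p_up
    return np.mean(up.sum(axis=1) >= k)

print(kofn_reliability(k=2, n=4, p_up=0.9, rho=0.0))     # independent baseline
print(kofn_reliability(k=2, n=4, p_up=0.9, rho=0.6))     # positive dependence
```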

Journal ArticleDOI
TL;DR: In this article, a structural form of an M-Wright distributed random variable is derived, and the resulting mixture representation leads to a random number generation algorithm.
Abstract: In this article, a structural form of an M-Wright distributed random variable is derived. The mixture representation then leads to a random number generation algorithm. A formal parameter estimation procedure is also proposed; this procedure is needed to make the M-Wright function usable in practice. The asymptotic normality of the estimator is established as well. The estimator and the random number generation algorithm are then tested on synthetic data.

Journal ArticleDOI
TL;DR: The Max-GWMA chart, proposed in this paper, is based on the maximum of the absolute values of two generally weighted moving average (GWMA) statistics, one for controlling the mean and the other the variance, and it outperforms the combined GWMA chart in terms of the average run length (ARL), the standard deviation of the run length (SDRL), and diagnostic abilities.
Abstract: Two generally weighted moving average (GWMA) charts are usually used concurrently for simultaneous monitoring of the process mean and process variance. In this article, we propose a new GWMA chart, called the Max-GWMA chart, which uses a single statistic for simultaneous monitoring of the process mean and variance. The statistic of the Max-GWMA chart is based on the maximum of the absolute values of two GWMA statistics, one for controlling the mean and the other the variance. We show that the Max-GWMA chart outperforms the combined GWMA chart in terms of the average run length (ARL), the standard deviation of the run length (SDRL), and diagnostic abilities. The combined GWMA chart consists of two GWMA charts that are run concurrently, one for monitoring the mean and the other the variance.

Journal ArticleDOI
TL;DR: A coefficient, called the random effects coefficient of determination, is developed that estimates the proportion of the conditional variance of the dependent variable explained by random effects; it takes values from 0 to 1 and indicates how strong the random effects are.
Abstract: The key feature of a mixed model is the presence of random effects. We have developed a coefficient, called the random effects coefficient of determination, that estimates the proportion of the conditional variance of the dependent variable explained by random effects. This coefficient takes values from 0 to 1 and indicates how strong the random effects are. The difference from the earlier suggested fixed effects coefficient of determination is emphasized. If the coefficient is close to 0, there is weak support for random effects in the model because the reduction of the variance of the dependent variable due to random effects is small; consequently, random effects may be ignored and the model simplifies to standard linear regression. A value apart from 0 indicates evidence of the variance reduction in support of the mixed model. If the random effects coefficient of determination is close to 1, the variance of the random effects is very large and the random effects turn into free fixed effects; the model can then be estimated using the dummy-variable approach. We derive explicit formulas for the coefficient in three special cases: the random intercept model, the growth curve model, and the meta-analysis model. The theoretical results are illustrated with three mixed-model examples: (1) travel time to the nearest cancer center for women with breast cancer in the U.S., (2) cumulative time watching alcohol-related scenes in movies among young U.S. teens, as a risk factor for early drinking onset, and (3) the classic example of the meta-analysis model for combining 13 studies on tuberculosis vaccine.

Journal ArticleDOI
TL;DR: This work develops a testing procedure for a class of mixture models with covariates (defined as CUB models), proposed by Piccolo (2003) and D'Elia and Piccolo (2005) and generally developed in a parametric context, and proposes a nonparametric solution to perform inference on CUB models, specifically on the coefficients of the covariates.
Abstract: In statistical surveys, people are often asked to express evaluations on several topics or to make an ordered arrangement in a list of objects (items, services, sentences, etc.); thus, the analysis of ratings and rankings is receiving a growing interest in many fields. In this framework, we develop a testing procedure for a class of mixture models with covariates (defined as CUB models), proposed by Piccolo (2003) and D'Elia and Piccolo (2005) and generally developed in a parametric context. Instead, we propose a nonparametric solution to perform inference on CUB models, specifically on the coefficients of the covariates. A simulation study proves that this approach is more appropriate in some specific data settings, mostly for small sample sizes.
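
For readers new to CUB models: the response over m ordered categories is a mixture of a shifted binomial ("feeling") and a discrete uniform ("uncertainty"), P(R = r) = π·C(m−1, r−1)(1−ξ)^{r−1}ξ^{m−r} + (1−π)/m. A minimal pmf sketch; covariates enter by linking π and ξ to predictors, and the parameter values below are illustrative:

```python
# CUB pmf over m ordered categories: shifted binomial ("feeling") mixed with a
# discrete uniform ("uncertainty"). Parameter values are illustrative.
import numpy as np
from scipy.stats import binom

def cub_pmf(m, pi, xi):
    r = np.arange(1, m + 1)
    return pi * binom.pmf(r - 1, m - 1, 1 - xi) + (1 - pi) / m

pmf = cub_pmf(m=7, pi=0.7, xi=0.3)
print(pmf, pmf.sum())    # sums to 1
```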