Showing papers in "Communications in Statistics - Theory and Methods" in 2016


Journal ArticleDOI
TL;DR: In this paper, the authors introduce a new family of continuous distributions generated from a logistic random variable called the logistic-X family, which can be expressed as a linear combination of exponentiated densities based on the same baseline distribution.
Abstract: The logistic distribution has a prominent role in the theory and practice of statistics. We introduce a new family of continuous distributions generated from a logistic random variable called the logistic-X family. Its density function can be symmetrical, left-skewed, right-skewed, and reversed-J shaped, and its hazard rate can be increasing, decreasing, bathtub shaped, and upside-down bathtub shaped. Further, the density can be expressed as a linear combination of exponentiated densities based on the same baseline distribution. We derive explicit expressions for the ordinary and incomplete moments, quantile and generating functions, Bonferroni and Lorenz curves, Shannon entropy, and order statistics. The model parameters are estimated by the method of maximum likelihood and the observed information matrix is determined. We also investigate the properties of one special model, the logistic-Fréchet distribution, and illustrate its importance by means of two applications to real data sets.

129 citations


Journal ArticleDOI
TL;DR: In this article, an adaptive method for the automatic scaling of random-walk Metropolis-Hastings algorithms is presented, which quickly and robustly identifies the scaling factor that yields a specified overall sampler acceptance probability.
Abstract: We present an adaptive method for the automatic scaling of random-walk Metropolis–Hastings algorithms, which quickly and robustly identifies the scaling factor that yields a specified overall sampler acceptance probability. Our method relies on the use of the Robbins–Monro search process, whose performance is determined by an unknown steplength constant. Based on theoretical considerations we give a simple estimator of this constant for Gaussian proposal distributions. The effectiveness of our method is demonstrated with both simulated and real data examples.

73 citations
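
The adaptive scaling idea in the abstract above can be illustrated with a minimal sketch. It assumes a one-dimensional target, a Gaussian random-walk proposal, and a Robbins-Monro update applied to the log of the proposal scale. The function name `adaptive_rwm` and the value `step_constant=3.0` are hypothetical choices made for this illustration; in particular, the constant is not the steplength estimator derived in the paper.

```python
import math
import random

def adaptive_rwm(log_target, x0, target_accept=0.44, n_iter=20000,
                 step_constant=3.0, seed=0):
    """Random-walk Metropolis with Robbins-Monro scaling (illustrative sketch).

    step_constant is an arbitrary value chosen for this example,
    not the estimator proposed in the paper.
    """
    rng = random.Random(seed)
    x = x0
    log_sigma = 0.0          # start with proposal scale sigma = 1
    accepts = 0
    for n in range(1, n_iter + 1):
        sigma = math.exp(log_sigma)
        proposal = x + rng.gauss(0.0, sigma)
        log_ratio = log_target(proposal) - log_target(x)
        # Metropolis acceptance step for a symmetric proposal
        accepted = rng.random() < math.exp(min(0.0, log_ratio))
        if accepted:
            x = proposal
            accepts += 1
        # Robbins-Monro step: raise the scale after an acceptance,
        # lower it after a rejection, with step size shrinking as 1/n
        indicator = 1.0 if accepted else 0.0
        log_sigma += step_constant * (indicator - target_accept) / n
    return math.exp(log_sigma), accepts / n_iter

def std_normal_log_density(x):
    # Log density of N(0, 1) up to an additive constant
    return -0.5 * x * x

scale, rate = adaptive_rwm(std_normal_log_density, x0=0.0)
print(f"final scale {scale:.2f}, overall acceptance rate {rate:.2f}")
```

Updating the log of the scale keeps the scale positive without any clipping, and the 1/n step size is the classical Robbins-Monro choice.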


Journal ArticleDOI
TL;DR: Goal oriented sensitivity indices are defined, which quantify the importance of each variable Xi with respect to a parameter of interest of the distribution of Y, and it is shown that Sobol indices are the sensitivity indices associated to a particular characteristic of that distribution.
Abstract: In a model of the form Y = h(X1, …, Xd) where the goal is to estimate a parameter of the probability distribution of Y, we define new sensitivity indices which quantify the importance of each variable Xi with respect to this parameter of interest. The aim of this paper is to define goal oriented sensitivity indices and we will show that Sobol indices are sensitivity indices associated to a particular characteristic of the distribution of Y. We name the framework we present as Goal Oriented Sensitivity Analysis (GOSA).

71 citations


Journal ArticleDOI
TL;DR: In this paper, a two-parameter generalized inverse Lindley distribution capable of modeling an upside-down bathtub-shaped hazard rate function is introduced, and some statistical properties of the proposed distribution are explicitly derived.
Abstract: In this article, a two-parameter generalized inverse Lindley distribution capable of modeling an upside-down bathtub-shaped hazard rate function is introduced. Some statistical properties of the proposed distribution are explicitly derived here. The methods of maximum likelihood, least squares, and maximum product spacings are used for estimating the unknown model parameters and are also compared through a simulation study. The approximate confidence intervals, based on a normal and a log-normal approximation, are also computed. Two algorithms are proposed for generating a random sample from the proposed distribution. A real data set is modeled to illustrate its applicability, and it is shown that our distribution fits much better than some other existing inverse distributions.

71 citations


Journal ArticleDOI
TL;DR: In this article, an interpretation of the Pearson correlation coefficient as the negative association between linear regression residuals is used to develop asymmetric formulas, which allow researchers to decide upon directional dependence.
Abstract: An interpretation of the Pearson correlation coefficient as the negative association between linear regression residuals is used to develop asymmetric formulas, which allow researchers to decide upon directional dependence. Model selection based on residuals extends direction dependence methodology (originally proposed for non-normal variables) to normally distributed predictors. Simulation results on the robustness of the methods and an empirical example are presented. We discuss potential advantages of a change in perspective in which non-normality is not treated as a source of bias, but as a valuable characteristic of variables, which can be used to gain further insights into bi- and multivariate relations.

46 citations


Journal ArticleDOI
TL;DR: In this article, the asymptotic distribution for the difference between two independent regression coefficients was established, and the proposed method was used to derive the confidence set for the difference between coefficients and a hypothesis test for the equality of the two regression models.
Abstract: In some situations, for example, in biology or psychology studies, we wish to determine whether the linear relationship between response variable and predictor variables differs in two populations. The analysis of covariance (ANCOVA) or, equivalently, the partial F-test approach is the commonly used method. In this study, the asymptotic distribution for the difference between two independent regression coefficients was established. The proposed method was used to derive the asymptotic confidence set for the difference between coefficients and a hypothesis test for the equality of the two regression models. Then a simulation study was conducted to compare the proposed method with the partial F method. The performance of the new method was comparable with that of the partial F method.

37 citations


Journal ArticleDOI
TL;DR: In this paper, the authors derived analytic expressions for the biases, to O(n^{-1}), of the maximum likelihood estimators of the parameters of the generalized Pareto distribution, and used these expressions to bias-correct the estimators in a selective manner.
Abstract: We derive analytic expressions for the biases, to O(n^{-1}), of the maximum likelihood estimators of the parameters of the generalized Pareto distribution. Using these expressions to bias-correct the estimators in a selective manner is found to be extremely effective in terms of bias reduction, and can also result in a small reduction in relative mean squared error (MSE). In terms of remaining relative bias, the analytic bias-corrected estimators are somewhat less effective than their counterparts obtained by using a parametric bootstrap bias correction. However, the analytic correction out-performs the bootstrap correction in terms of remaining %MSE. It also performs credibly relative to other recently proposed estimators for this distribution. Taking into account the relative computational costs, this leads us to recommend the selective use of the analytic bias adjustment for most practical situations.

36 citations


Journal ArticleDOI
TL;DR: In this paper, a new discrete probability distribution with integer support on (−∞, ∞) is proposed as a discrete analog of the continuous logistic distribution and its relationship with some known distributions is discussed.
Abstract: A new discrete probability distribution with integer support on (−∞, ∞) is proposed as a discrete analog of the continuous logistic distribution. Some of its important distributional and reliability properties are established. Its relationship with some known distributions is discussed. Parameter estimation by the maximum-likelihood method is presented. A simulation is carried out to investigate the properties of the maximum-likelihood estimators. A real-life application of the proposed distribution as an empirical model is considered by conducting a comparative data fitting with the Skellam distribution, Kemp's discrete normal, Roy's discrete normal, and the discrete Laplace distribution.

32 citations


Journal ArticleDOI
TL;DR: In this article, the authors proposed an unbiased sampling scheme, named paired double ranked set sampling (PDRSS), for estimating the population mean based on perfect and imperfect rankings and showed that for perfect ranking, the variance of the mean estimator under PDRSS is always less than the variance based on simple random sampling, paired RSS and RSS.
Abstract: In environmental monitoring and assessment, the main focus is to achieve observational economy and to collect data with unbiased, efficient and cost-effective sampling methods. Ranked set sampling (RSS) is one traditional method that is mostly used for accomplishing observational economy. In this article, we propose an unbiased sampling scheme, named paired double RSS (PDRSS) for estimating the population mean. We study the performance of the mean estimators under PDRSS based on perfect and imperfect rankings. It is shown that, for perfect ranking, the variance of the mean estimator under PDRSS is always less than the variance of mean estimator based on simple random sampling, paired RSS and RSS. The mean estimators under RSS, median RSS, PDRSS, and double RSS are also compared with the regression estimator of population mean based on SRS. The procedure is also illustrated with a case study using a real data set.

32 citations
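
As background for the abstract above, the following is a minimal sketch of basic ranked set sampling (RSS) with perfect ranking. It is not the paired double RSS scheme proposed in the paper. The function name `ranked_set_sample` and the normal population with mean 10 and standard deviation 2 are assumptions made only for this illustration.

```python
import random

def ranked_set_sample(draw, set_size, cycles, rng):
    """Basic RSS with perfect ranking (illustrative sketch).

    In each cycle, draw `set_size` sets of `set_size` units, rank each set,
    and keep (measure) only the i-th smallest unit of the i-th set.
    """
    sample = []
    for _ in range(cycles):
        for i in range(set_size):
            ranked = sorted(draw(rng) for _ in range(set_size))
            sample.append(ranked[i])
    return sample

def draw_unit(rng):
    # Hypothetical population: normal with mean 10 and standard deviation 2
    return rng.gauss(10.0, 2.0)

rng = random.Random(1)
rss = ranked_set_sample(draw_unit, set_size=3, cycles=2000, rng=rng)
rss_mean = sum(rss) / len(rss)
print(f"{len(rss)} measured units, RSS mean estimate {rss_mean:.3f}")
```

Each cycle inspects set_size squared units but measures only set_size of them, which is the observational economy the abstract refers to.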


Journal ArticleDOI
TL;DR: In this paper, a simple proof of Chebyshev's inequality for random vectors is given, which provides a lower bound for the percentage of the population of an arbitrary random vector X with finite mean μ = E(X) and a positive definite covariance matrix V = Cov(X) whose Mahalanobis distance with respect to V to the mean μ is less than a fixed value.
Abstract: In this short note, a very simple proof of Chebyshev's inequality for random vectors is given. This inequality provides a lower bound for the percentage of the population of an arbitrary random vector X with finite mean μ = E(X) and a positive definite covariance matrix V = Cov(X) whose Mahalanobis distance with respect to V to the mean μ is less than a fixed value. The main advantage of the proof is that it is a simple exercise for a first year probability course. An alternative proof based on principal components is also provided. This proof can be used to study the case of a singular covariance matrix V.

31 citations
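
For readers who want to see the inequality from the abstract above written out, here is one standard argument based on Markov's inequality. It is offered as an illustration and may differ from the proof given in the paper. The symbols k (the dimension of X), ε > 0 (the fixed bound on the squared Mahalanobis distance), and Z are notation introduced here, not taken from the abstract.

```latex
\begin{align*}
Z &= (X-\mu)^{\top} V^{-1} (X-\mu) \ge 0, \\
E[Z] &= E\!\left[\operatorname{tr}\!\left(V^{-1}(X-\mu)(X-\mu)^{\top}\right)\right]
      = \operatorname{tr}\!\left(V^{-1}V\right) = k, \\
P(Z \ge \varepsilon) &\le \frac{E[Z]}{\varepsilon} = \frac{k}{\varepsilon}
  \quad\Longrightarrow\quad
  P\!\left((X-\mu)^{\top} V^{-1} (X-\mu) < \varepsilon\right) \ge 1 - \frac{k}{\varepsilon}.
\end{align*}
```

For k = 1 and ε = t²/σ² this reduces to the familiar univariate statement P(|X − μ| < t) ≥ 1 − σ²/t².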


Journal ArticleDOI
TL;DR: The asymptotic properties of a new class of LN kernel estimators based on the idea of weighted distributions are studied, and numerical studies based on both simulated and real data sets are presented.
Abstract: The log-normal (LN) kernel estimator of a density with support [0, ∞) was discussed by Jin and Kawczak (2003). The contribution of this paper is to suggest a new class of LN kernel estimators using the idea of weighted distribution. The asymptotic properties of the new class of estimators are studied. Also, numerical studies based on both simulated and real data sets are presented.

Journal ArticleDOI
TL;DR: In this paper, the authors give characterizations of discrete compound Poisson distributions under some conditions and apply them to probabilistic number theory, where the strongly additive function converges to a discrete compound Poisson in distribution.
Abstract: The aim of this paper is to give some new characterizations of discrete compound Poisson distributions. Firstly, we give a characterization by the Lévy–Khintchine formula of infinitely divisible distributions under some conditions. The second characterization is presented via row sums of random triangular arrays that converge in distribution. We also give an application in probabilistic number theory, with the strongly additive function converging to a discrete compound Poisson in distribution. The next characterization is an extension of Watanabe’s theorem characterizing the homogeneous Poisson process. The last characterization is illustrated by waiting time distributions, especially the matrix-exponential representation.

Journal ArticleDOI
TL;DR: In this article, four goodness-of-fit measures for the Gompertz distribution (the Anderson-Darling statistic, the correlation coefficient test, a statistic using moments, and a nested test against the generalized extreme value distributions) are discussed, along with an application to laboratory rat data; critical values calculated from the empirical distribution of the test statistics are also presented.
Abstract: While the Gompertz distribution is often fitted to lifespan data, testing whether the fit satisfies theoretical criteria is being neglected. Here four goodness-of-fit measures – the Anderson–Darling statistic, the correlation coefficient test, a statistic using moments, and a nested test against the generalized extreme value distributions – are discussed. Along with an application to laboratory rat data, critical values calculated by the empirical distribution of the test statistics are also presented.

Journal ArticleDOI
TL;DR: In this paper, the authors proposed a new model based on the negative binomial Birnbaum-Saunders distribution for the number of competing causes of an event and the time to the event of interest.
Abstract: We propose a cure rate survival model by assuming that the number of competing causes of the event of interest follows the negative binomial distribution and the time to the event of interest has the Birnbaum-Saunders distribution. Further, the new model includes as special cases some well-known cure rate models published recently. We consider a frequentist analysis for parameter estimation of the negative binomial Birnbaum-Saunders model with cure rate. Then, we derive the appropriate matrices for assessing local influence on the parameter estimates under different perturbation schemes. We illustrate the usefulness of the proposed model in the analysis of a real data set from the medical area.

Journal ArticleDOI
TL;DR: A Marshall-Olkin variant of the Provost type gamma-Weibull probability distribution is introduced in this paper, which provides a better fit than related distributions as measured by the Anderson-Darling and Cramér-von Mises statistics.
Abstract: A Marshall–Olkin variant of the Provost type gamma–Weibull probability distribution is introduced in this paper. Some of its statistical functions and numerical characteristics, among others the characteristic function, the moment generating function, and central moments of real order, are derived in computational series expansion form, and various illustrative special cases are discussed. This density function is utilized to model two real data sets. The new distribution provides a better fit than related distributions as measured by the Anderson–Darling and Cramér–von Mises statistics. The proposed distribution could find applications, for instance, in the physical and biological sciences, hydrology, medicine, meteorology, engineering, etc.

Journal ArticleDOI
TL;DR: In this paper, the shrinkage ridge estimator and its positive part are defined for the regression coefficient vector in a partial linear model, and the differencing approach is used to simplify parameter estimation after removing the nonparametric part of the model.
Abstract: In this paper, the shrinkage ridge estimator and its positive part are defined for the regression coefficient vector in a partial linear model. The differencing approach is used to simplify parameter estimation after removing the nonparametric part of the model. The exact risk expressions in addition to biases are derived for the estimators under study and the region of optimality of each estimator is exactly determined. The performance of the estimators is evaluated by simulated as well as real data sets.

Journal ArticleDOI
TL;DR: In this article, the authors investigated the small-sample quality of the maximum likelihood estimators (MLEs) of the parameters of the zero-inflated Poisson distribution and determined the finite-sample biases to O(n^{-1}) using an analytic bias reduction methodology based on the work of Cox and Snell (1968) and Cordeiro and Klein (1994).
Abstract: We investigate the small-sample quality of the maximum likelihood estimators (MLEs) of the parameters of the zero-inflated Poisson distribution. The finite-sample biases are determined to O(n^{-1}) using an analytic bias reduction methodology based on the work of Cox and Snell (1968) and Cordeiro and Klein (1994). Monte Carlo simulations show that the MLEs have very small percentage biases for this distribution, but the analytic bias reduction methods essentially eliminate the bias without adversely affecting the mean squared errors of the estimators. The analytic adjustment compares favourably with the parametric bootstrap bias-corrected estimator, in terms of bias reduction itself, as well as with respect to mean squared error and Pitman’s nearness measure.

Journal ArticleDOI
TL;DR: In this article, a Markov chain approach for computing the ARL, percentiles of the run length (RL) distribution and SDRL, for the two runs rules schemes of Abbas et al. was presented.
Abstract: Runs rules are usually used with Shewhart-type charts to enhance the charts' sensitivities toward small and moderate shifts. Abbas et al. in 2011 took it a step further by proposing two runs rules schemes, applied to the exponentially weighted moving average (EWMA) chart and evaluated their average run length (ARL) performances using simulation. They showed that the proposed schemes are superior to the classical EWMA chart and other schemes being investigated. Besides pointing out some erroneous ARL and standard deviation of the run length (SDRL) computations in Abbas et al., this paper presents a Markov chain approach for computing the ARL, percentiles of the run length (RL) distribution and SDRL, for the two runs rules schemes of Abbas et al. Using Markov chain, we also propose two combined runs rules EWMA schemes to quicken the two schemes of Abbas et al. in responding to large shifts. The runs rules (basic and combined rules) EWMA schemes will be compared with some existing control charting me...

Journal ArticleDOI
TL;DR: In this article, the bimodal skew-symmetric normal (BSSN) distribution was proposed to capture bimodality, excess kurtosis, and skewness.
Abstract: We introduce a new parsimonious bimodal distribution, referred to as the bimodal skew-symmetric Normal (BSSN) distribution, which is potentially effective in capturing bimodality, excess kurtosis, and skewness. Explicit expressions for the moment-generating function, mean, variance, skewness, and excess kurtosis were derived. The shape properties of the proposed distribution were investigated in regard to skewness, kurtosis, and bimodality. Maximum likelihood estimation was considered and an expression for the observed information matrix was provided. Illustrative examples using medical and financial data as well as simulated data from a mixture of normal distributions were worked.

Journal ArticleDOI
TL;DR: In this paper, an adaptive multivariate cumulative sum (CUSUM) statistical process control chart was proposed for signaling a range of location shifts, where the reference value was changed adaptively in each run, with the current mean shift estimated by exponentially weighted moving average (EWMA) statistic.
Abstract: In this work, we proposed an adaptive multivariate cumulative sum (CUSUM) statistical process control chart for signaling a range of location shifts. This method was based on the multivariate CUSUM control chart proposed by Pignatiello and Runger (1990), but we adopted the adaptive approach similar to that discussed by Dai et al. (2011), which was based on a different CUSUM method introduced by Crosier (1988). The reference value in this proposed procedure was changed adaptively in each run, with the current mean shift estimated by an exponentially weighted moving average (EWMA) statistic. By specifying the minimal magnitude of the mean shift, our proposed control chart achieved a good overall performance for detecting a range of shifts rather than a single value. We compared our adaptive multivariate CUSUM method with that of Dai et al. (2011) and the non-adaptive versions of these two methods, by evaluating both the steady state and zero state average run length (ARL) values. The detection efficien...

Journal ArticleDOI
TL;DR: In this article, the authors investigate the performance of Shewhart-type control charts for zero-inflated processes with estimated parameters and propose guidelines for the statistical design of the examined charts, when the size of the preliminary sample is predetermined.
Abstract: Zero-inflated probability models are used to model count data that have an excessive number of zeros. Shewhart-type control charts have been proposed for the monitoring of zero-inflated processes. Usually their performance is evaluated under the assumption of known process parameters. However, in practice, their values are rarely known and they have to be estimated from an in-control historical Phase I sample. In the present paper, we investigate the performance of Shewhart-type control charts for zero-inflated processes with estimated parameters and propose practical guidelines for the statistical design of the examined charts, when the size of the preliminary sample is predetermined.
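
To make the notion of a zero-inflated model concrete, here is a minimal sketch of the zero-inflated Poisson (ZIP) probability mass function, one common member of this class. The function name `zip_pmf` and the parameter values `pi = 0.3` and `lam = 2.0` are assumptions chosen for illustration; the control chart design studied in the paper is not reproduced here.

```python
import math

def zip_pmf(k, pi, lam):
    """Zero-inflated Poisson pmf (illustrative sketch).

    With probability pi the count is a structural zero; otherwise it is
    drawn from a Poisson distribution with mean lam.
    """
    if k < 0:
        return 0.0
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    extra_zero_mass = pi if k == 0 else 0.0
    return extra_zero_mass + (1.0 - pi) * poisson

pi, lam = 0.3, 2.0   # hypothetical parameter values
total = sum(zip_pmf(k, pi, lam) for k in range(60))
mean = sum(k * zip_pmf(k, pi, lam) for k in range(60))
print(f"P(X = 0) = {zip_pmf(0, pi, lam):.4f}, total mass = {total:.6f}, mean = {mean:.4f}")
```

The mean of this distribution is (1 − pi) · lam, which is smaller than the Poisson mean because of the extra mass at zero.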

Journal ArticleDOI
TL;DR: In this paper, a bivariate integer-valued autoregressive time series model is presented, which is based on binomial thinning, and the unconditional and conditional first and second moments are considered.
Abstract: A bivariate integer-valued autoregressive time series model is presented. The model structure is based on binomial thinning. The unconditional and conditional first and second moments are considered. Correlation structure of marginal processes is shown to be analogous to the ARMA(2, 1) model. Some estimation methods such as the Yule–Walker and conditional least squares are considered and the asymptotic distributions of the obtained estimators are derived. Comparison between bivariate model with binomial thinning and bivariate model with negative binomial thinning is given.
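
The binomial thinning operator on which the model above is built can be sketched in a few lines. The example below simulates a univariate INAR(1) process with Poisson innovations, which is a simpler setting than the bivariate model in the paper. The function names, the Poisson innovation choice, and the values `alpha = 0.5` and `lam = 2.0` are assumptions made for illustration.

```python
import math
import random

def binomial_thinning(alpha, x, rng):
    """alpha o x: each of the x units survives independently with probability alpha."""
    return sum(1 for _ in range(x) if rng.random() < alpha)

def poisson_draw(lam, rng):
    """Poisson variate by Knuth's multiplication method (adequate for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def simulate_inar1(alpha, lam, n, rng, x0=0):
    """X_t = alpha o X_{t-1} + e_t with e_t ~ Poisson(lam), univariate sketch."""
    path, x = [], x0
    for _ in range(n):
        x = binomial_thinning(alpha, x, rng) + poisson_draw(lam, rng)
        path.append(x)
    return path

rng = random.Random(42)
path = simulate_inar1(alpha=0.5, lam=2.0, n=20000, rng=rng)
print(f"sample mean {sum(path) / len(path):.3f}, stationary mean {2.0 / (1 - 0.5):.3f}")
```

Thinning plays the role that scalar multiplication plays in a Gaussian AR(1) model while keeping the process integer valued, and the stationary mean of this univariate sketch is lam / (1 − alpha).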

Journal ArticleDOI
TL;DR: In this paper, new invariant and consistent goodness-of-fit tests for multivariate normality are introduced based on the Karhunen-Loève transformation of a multidimensional sample from a population.
Abstract: New invariant and consistent goodness-of-fit tests for multivariate normality are introduced. Tests are based on the Karhunen–Loève transformation of a multidimensional sample from a population. A comparison of simulated powers of tests and other well-known tests with respect to some alternatives is given. The simulation study demonstrates that power of the proposed McCull test almost does not depend on the number of grouping cells. The test shows an advantage over other chi-squared type tests. However, averaged over all of the simulated conditions examined in this article, the Anderson–Darling type and the Cramér–von Mises type tests seem to be the best.

Journal ArticleDOI
TL;DR: In this article, a new clustering technique applicable to large data set has been used to cluster the spectra of 702248 galaxies and quasars having 1540 points in wavelength range imposed by the instrument.
Abstract: Cluster analysis is the distribution of objects into different groups or more precisely the partitioning of a data set into subsets (clusters) so that the data in subsets share some common trait according to some distance measure. Unlike classification, in clustering one has to first decide the optimum number of clusters and then assign the objects into different clusters. Solution of such problems for a large number of high dimensional data points is quite complicated and most of the existing algorithms will not perform properly. In the present work a new clustering technique applicable to large data set has been used to cluster the spectra of 702248 galaxies and quasars having 1540 points in wavelength range imposed by the instrument. The proposed technique has successfully discovered five clusters from this 702248 × 1540 data matrix.

Journal ArticleDOI
TL;DR: In this paper, the authors applied the Cox proportional hazards model into their credit scoring models, predicting the time when a customer is most likely to default, to solve the credit risk assessment problem.
Abstract: Traditional credit risk assessment models do not consider the time factor; they only consider whether a customer will default, not when. The result does not help a manager make the profit-maximizing decision. In fact, even if a customer defaults, the financial institution can still make a profit under some conditions. Nowadays, most research has applied the Cox proportional hazards model in credit scoring, predicting the time when a customer is most likely to default, to solve the credit risk assessment problem. However, in order to fully utilize the dynamic capability of the Cox proportional hazards model, time-varying macroeconomic variables are required, which involve more advanced data collection. Since short-term default cases are the ones that bring a great loss for a financial institution, instead of predicting when a loan will default, a loan manager is more interested in identifying those applications which may default within a short period of time when app...

Journal ArticleDOI
TL;DR: Calculating Markov kernels of copulas allows not only for a precise description of the way Bertino- and diagonal copulas distribute mass, but also enables a simple proof of the fact that, for certain diagonals, both may degenerate to proper generalized shuffles of the minimum copula.
Abstract: Calculating Markov kernels of copulas allows not only for a precise description of the way Bertino- and diagonal copulas distribute mass, but also enables a simple proof of the fact that, for certain diagonals, both may degenerate to proper generalized shuffles of the minimum copula. After extending the kernel approach to the case of the maximum quasi-copula Aδ with given diagonal δ, a conjecture on singularity of Aδ by Nelsen et al. (2008) is established and an alternative simple and short proof of the result by Úbeda-Flores (2008) characterizing diagonals for which Aδ is a copula is given.

Journal ArticleDOI
TL;DR: In this article, the authors introduce a new class of lifetime distributions, which includes several previously known distributions such as those of Chahkandi and Ganjali (2009), Mahmoudi and Jafari (2012), and Nadarajah et al. (2012).
Abstract: In this article, we introduce a new class of lifetime distributions. This new class includes several previously known distributions such as those of Chahkandi and Ganjali (2009), Mahmoudi and Jafari (2012), and Nadarajah et al. (2012). This new class of four-parameter distributions allows for flexible failure rate behavior. Indeed, the failure rate function here can be increasing, decreasing, bathtub-shaped or upside-down bathtub-shaped. Several distributional properties of the new class including moments, quantiles and order statistics are studied. An EM algorithm for computing the estimates of the parameters involved is proposed and some maximum entropy characterizations are discussed. Finally, to show the flexibility and potential of the new class of distributions, applications to two real data sets are provided.

Journal ArticleDOI
TL;DR: In this article, the run length properties of the Run Rules Phase II c and np charts with estimated process parameters are derived, particularly focusing on the ARL, SDRL, and the 0.05, 0.5, and 0.95 quantiles of the run length distribution.
Abstract: The performance of attributes control charts is usually evaluated under the assumption of known process parameters (i.e., the nominal proportion of nonconforming units or the nominal average number of nonconformities). However, in practice, these process parameters are rarely known and have to be estimated from an in-control Phase I data set. The major contributions of this paper are (a) the derivation of the run length properties of the Run Rules Phase II c and np charts with estimated parameters, particularly focusing on the ARL, SDRL, and 0.05, 0.5, and 0.95 quantiles of the run length distribution; (b) the investigation of the number m of Phase I samples that is needed by these charts in order to obtain similar in-control ARLs to the known parameters case; and (c) the proposition of new specific chart parameters that allow these charts to have approximately the same in-control ARLs as the ones obtained in the known parameters case.

Journal ArticleDOI
TL;DR: In this article, the properties of nonparametric estimation of the expectation of g(X) (any function of X), by using a judgment poststratification sample (JPS), have been discussed.
Abstract: In this paper, some of the properties of nonparametric estimation of the expectation of g(X) (any function of X), by using a judgment poststratification sample (JPS), have been discussed. A class of estimators (including the standard JPS estimator and a JPS estimator proposed by Frey and Feeman (2012, Comput. Stat. Data An.)) is considered. The paper provides the mean and variance of the members of this class, and examines their consistency and asymptotic distribution. Specifically, the results are for the estimation of population mean, population variance, and cumulative distribution function. We show that any estimator of the class may be less efficient than the simple random sampling (SRS) estimator for small sample sizes. We prove that the relative efficiency of some estimators in the class with respect to the balanced ranked set sampling (BRSS) estimator tends to 1 as the sample size goes to infinity. Furthermore, the standard JPS mean estimator, and Frey–Feeman JPS mean estimator are specifically studi...

Journal ArticleDOI
TL;DR: In this article, the authors introduced the Gompertz power series (GPS) class of distributions, which is obtained by compounding Gompertz and power series distributions, and obtained several properties of the GPS distribution such as its probability density function, failure rate function, Shannon entropy, mean residual life function, quantiles, and moments.
Abstract: In this article, we introduce the Gompertz power series (GPS) class of distributions which is obtained by compounding Gompertz and power series distributions. This distribution contains several lifetime models such as Gompertz-geometric (GG), Gompertz-Poisson (GP), Gompertz-binomial (GB), and Gompertz-logarithmic (GL) distributions as special cases. Sub-models of the GPS distribution are studied in detail. The hazard rate function of the GPS distribution can be increasing, decreasing, and bathtub-shaped. We obtain several properties of the GPS distribution such as its probability density function, failure rate function, Shannon entropy, mean residual life function, quantiles, and moments. The maximum likelihood estimation procedure via an EM algorithm is presented, and simulation studies are performed for evaluation of this estimation for complete data, and the MLE of parameters for censored data. At the end, a real example is given.