
Showing papers in "Brazilian Journal of Probability and Statistics in 2012"


Journal ArticleDOI
TL;DR: In this paper, the authors studied the beta power distribution and derived explicit expressions for the moments, probability weighted moments, moment generating function, mean deviations, Bonferroni and Lorenz curves, moments of order statistics, entropy and reliability.
Abstract: The power distribution is defined as the inverse of the Pareto distribution. We study in full detail the so-called beta power distribution. We obtain analytical forms for its probability density and hazard rate functions. Explicit expressions are derived for the moments, probability weighted moments, moment generating function, mean deviations, Bonferroni and Lorenz curves, moments of order statistics, entropy and reliability. We estimate the parameters by maximum likelihood. The practicability of the model is illustrated in two applications to real data.

79 citations
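
For orientation, here is a minimal Python sketch assuming the usual beta-G construction with baseline power cdf G(x) = x^c on (0, 1); the parameter names a, b, c and both function names are illustrative, not the paper's notation.

```python
import numpy as np
from scipy.stats import beta

# Beta-G construction with baseline power distribution on (0, 1):
# G(x) = x**c, g(x) = c * x**(c - 1).  If Y ~ Beta(a, b), then
# X = G^{-1}(Y) = Y**(1/c) follows the beta power distribution.

def beta_power_pdf(x, a, b, c):
    """Density f(x) = g(x) * BetaPDF(G(x); a, b) on (0, 1)."""
    return c * x**(c - 1) * beta.pdf(x**c, a, b)

def beta_power_rvs(a, b, c, size, seed=None):
    """Draw by applying the baseline inverse cdf to Beta draws."""
    rng = np.random.default_rng(seed)
    return rng.beta(a, b, size) ** (1.0 / c)

x = np.linspace(0.05, 0.95, 5)
print(beta_power_pdf(x, a=2.0, b=3.0, c=1.5).round(3))
print(beta_power_rvs(2.0, 3.0, 1.5, size=5, seed=42).round(3))
```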


Journal ArticleDOI
TL;DR: Bayesian modelling with heavy-tailed distributions is shown to produce more reasonable conflict resolution, typically by favouring one source of information over the other.
Abstract: We review a substantial literature, spanning 50 years, concerning the resolution of conflicts using Bayesian heavy-tailed models. Conflicts arise when different sources of information about the model parameters (e.g., prior information, or the information in individual observations) suggest quite different plausible regions for those parameters. Traditional Bayesian models based on normal distributions or other conjugate structures typically resolve conflicts by centring the posterior at some compromise position, but this is not a realistic resolution when it means that the posterior is then in conflict with the different information sources. Bayesian modelling with heavy-tailed distributions has been shown to produce more reasonable conflict resolution, typically by favouring one source of information over the other. The less favoured source is ultimately wholly or partially rejected as the conflict becomes increasingly extreme. The literature reviewed here provides formal proofs of conflict resolution by asymptotic rejection of some information sources. Results are given for a variety of models, from the simplest case of a single observation relating to a single location parameter up to models with many location parameters, location and scale parameters, or other kinds of parameters. However, these results do not begin to address models of the kind of complexity that are routinely used in practical Bayesian modelling. In addition to reviewing the available theory, we also identify clearly the gaps in the literature that need to be filled in order for modellers to be able to develop applications with appropriate “built-in robustness.”

58 citations
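
The asymptotic-rejection behaviour described above is easy to see numerically. The following sketch (a toy single-observation location model; all values are illustrative) compares the posterior mean under a normal prior, which compromises, with a Cauchy prior, which increasingly defers to the data as the conflict grows.

```python
import numpy as np
from scipy.stats import norm, cauchy

# One observation y ~ N(theta, 1), prior centred at 0.  Under conflict
# (large y) the normal prior forces a compromise near y/2, while the
# heavy-tailed Cauchy prior lets the likelihood dominate.
theta = np.linspace(-20, 40, 20001)   # uniform grid over the parameter

def posterior_mean(prior_pdf, y):
    post = norm.pdf(y, loc=theta) * prior_pdf     # unnormalised posterior
    return (theta * post).sum() / post.sum()      # grid spacing cancels

for y in (2.0, 10.0, 20.0):
    print(f"y={y:5.1f}  normal prior: {posterior_mean(norm.pdf(theta), y):6.2f}"
          f"  Cauchy prior: {posterior_mean(cauchy.pdf(theta), y):6.2f}")
```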


Journal ArticleDOI
TL;DR: In this article, the authors review the predictive construction of priors for Bayesian nonparametric inference, for exchangeable and partially exchangeable sequences, revisiting some results to shed light on theoretical connections among them.
Abstract: The characterization of models and priors through a predictive approach is a fundamental problem in Bayesian statistics. In the last decades, it has received renewed interest, as the basis of important developments in Bayesian nonparametrics and in machine learning. In this paper, we review classical and recent work based on the predictive approach in these areas. Our focus is on the predictive construction of priors for Bayesian nonparametric inference, for exchangeable and partially exchangeable sequences. Some results are revisited to shed light on theoretical connections among them.

28 citations


Journal ArticleDOI
TL;DR: In this article, the authors derive a perturbation expansion for general self-interacting random walks, where steps are made on the basis of the history of the path, and show that the expansion gives rise to useful formulae for the speed and variance of the random walk, when these quantities are known to exist.
Abstract: We derive a perturbation expansion for general self-interacting random walks, where steps are made on the basis of the history of the path. Examples of models where this expansion applies are reinforced random walk, excited random walk, the true (weakly) self-avoiding walk, loop-erased random walk, and annealed random walk in random environment. In this paper we show that the expansion gives rise to useful formulae for the speed and variance of the random walk, when these quantities are known to exist. The results and formulae of this paper have been used elsewhere by the authors to prove monotonicity properties for the speed (in high dimensions) of excited random walk and related models, and certain models of random walk in random environment. We also derive a law of large numbers and central limit theorem (with explicit error terms) directly from this expansion, under strong assumptions on the expansion coefficients. The assumptions are shown to be satisfied by excited random walk in high dimensions with small excitation parameter, a model of reinforced random walk with underlying drift and small reinforcement parameter, and certain models of random walk in random environment under strong ellipticity conditions.

20 citations
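
One of the models covered by the expansion, the (once-)excited random walk, is simple to simulate. The sketch below estimates the speed empirically; the tilting of the first-coordinate step at freshly visited sites is one standard parameterisation, and all numbers are illustrative.

```python
import numpy as np

def erw_speed(d=5, eps=0.5, n_steps=200_000, seed=0):
    """Once-excited random walk on Z^d: at a site visited for the first
    time, the +e1/-e1 step probabilities are tilted to (1 +/- eps)/(2d);
    at previously visited sites the walk is simple.  Returns X_n[0] / n."""
    rng = np.random.default_rng(seed)
    pos = np.zeros(d, dtype=np.int64)
    visited = set()
    for _ in range(n_steps):
        here = tuple(pos)
        fresh = here not in visited
        visited.add(here)
        axis = rng.integers(d)
        p_plus = (1 + eps) / 2 if (fresh and axis == 0) else 0.5
        pos[axis] += 1 if rng.random() < p_plus else -1
    return pos[0] / n_steps

print(erw_speed())   # positive in high dimensions; stabilises as n grows
```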


Journal ArticleDOI
TL;DR: A full Bayesian method utilizing data augmentation and Gibbs sampling algorithms is presented for analyzing non-ignorable missing data, and simulation results demonstrate that the simplified selection model can recover regression model parameters under both correctly specified situations and many mis-specified situations.
Abstract: A full Bayesian method utilizing data augmentation and Gibbs sampling algorithms is presented for analyzing non-ignorable missing data. The discussion focuses on a simplified selection model for regression analysis. Regardless of the missing mechanism, it is assumed that missingness depends only on the missing variable itself. Simulation results demonstrate that the simplified selection model can recover regression model parameters under both correctly specified situations and many mis-specified situations. The method is also applied to analyzing a training intervention data set with missing data.

11 citations
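
A stripped-down version of the data-augmentation idea can be sketched as follows. Here the missingness probability is logistic in y with known coefficients (the paper's full model would also estimate these), and a Metropolis step imputes the missing responses; everything below is an illustrative toy, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: y = b0 + b1*x + e, with P(y missing) depending on y itself
# (non-ignorable).  The missingness coefficients gamma are held fixed
# at their true values to keep the sketch short.
n, true_b = 500, np.array([1.0, 2.0])
gamma = np.array([-3.0, 1.0])                  # intercept, slope in y
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = X @ true_b + rng.normal(size=n)
miss = rng.random(n) < 1 / (1 + np.exp(-(gamma[0] + gamma[1] * y)))

def log_sel(v):                                # log P(missing | y = v)
    return -np.log1p(np.exp(-(gamma[0] + gamma[1] * v)))

b, s2 = np.zeros(2), 1.0
y_imp, keep = y.copy(), []
y_imp[miss] = y[~miss].mean()                  # crude starting imputation
for it in range(3000):
    # 1. Data augmentation: Metropolis update of each missing y from
    #    p(y*) proportional to N(y*; x'b, s2) * P(missing | y*)
    mu, cur = (X @ b)[miss], y_imp[miss]
    prop = cur + 0.7 * rng.normal(size=cur.size)
    log_ratio = (-0.5 * (prop - mu) ** 2 / s2 + log_sel(prop)
                 + 0.5 * (cur - mu) ** 2 / s2 - log_sel(cur))
    y_imp[miss] = np.where(np.log(rng.random(cur.size)) < log_ratio, prop, cur)
    # 2. Conjugate (flat-prior) draws for b and s2 given completed data
    b_hat = np.linalg.solve(X.T @ X, X.T @ y_imp)
    rss = np.sum((y_imp - X @ b_hat) ** 2)
    s2 = rss / rng.chisquare(n - 2)
    b = rng.multivariate_normal(b_hat, s2 * np.linalg.inv(X.T @ X))
    if it >= 1000:
        keep.append(b)

print(np.mean(keep, axis=0))                   # approximately true_b
```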


Journal ArticleDOI
TL;DR: In this paper, the authors develop a Bayesian analysis based on the Jeffreys prior for the hyperbolic family of distributions and show through a simulation study that it provides reliable point and interval estimators; under the absolute loss function, the Bayesian estimators compare favorably to maximum likelihood estimators.
Abstract: In this work, we develop Bayesian analysis based on the Jeffreys prior for the hyperbolic family of distributions. It is usually difficult to estimate the four parameters in this class: to be reliable, the maximum likelihood estimator typically requires large sample sizes of the order of thousands of observations. Moreover, improper prior distributions may lead to improper posterior distributions, whereas proper prior distributions may dominate the analysis. Here, we show through a simulation study that Bayesian methods based on the Jeffreys prior provide reliable point and interval estimators. Moreover, this simulation study shows that for the absolute loss function Bayesian estimators compare favorably to maximum likelihood estimators. Finally, we illustrate with an application to real data that our methodology allows for parameter estimation with remarkably good properties even for a small sample size.

10 citations


Journal ArticleDOI
TL;DR: In this paper, the prior and posterior predictive distributions in a semi-conjugate family of Poisson mixtures are derived for the "two Poisson samples" model, which contains in particular the reference prior when the parameter of interest is the ratio of the two Poisson rates.
Abstract: We mainly investigate certain mixtures of Poisson distributions with a scale parameter in the mixing distribution. They help us to derive the bivariate Poisson mixtures arising from the prior and posterior predictive distributions in the semi-conjugate family defined by Laurent and Legrand (ESAIM Probab. Stat. (2011) DOI:10.1051/ps/2010018) for the "two Poisson samples" model, which contains in particular the reference prior when the parameter of interest is the ratio of the two Poisson rates.

10 citations


Journal ArticleDOI
TL;DR: In this paper, the authors develop generalized method of moments (GMM) and generalized quasi-likelihood (GQL) estimation approaches for linear dynamic mixed models and demonstrate that, as for binary and count panel data models, the GQL approach is more efficient than the GMM approach.
Abstract: Linear dynamic mixed models are commonly used for continuous panel data analysis in economic statistics. There exist generalized method of moments (GMM) and generalized quasi-likelihood (GQL) inferences for binary and count panel data models, the GQL estimation approach being more efficient than the GMM approach. The GMM and GQL estimating equations for the linear dynamic mixed model cannot, however, be obtained from the respective estimating equations under the nonlinear models for binary and count data. In this paper, we develop the GMM and GQL estimation approaches for the linear dynamic mixed models and demonstrate that the GQL approach is more efficient than the GMM approach also under such linear models. This makes the GQL approach uniformly more efficient than the GMM approach in estimating the parameters of both linear and nonlinear dynamic mixed models.

10 citations


Journal ArticleDOI
TL;DR: A stochastic volatility in mean (SVM) model using scale mixtures of normal (SMN) distributions is introduced and estimated by Bayesian MCMC, and the SVM model with the slash distribution is found to improve both fit and prediction for the IBOVESPA data over the usual normal model.
Abstract: A stochastic volatility in mean (SVM) model using the class of symmetric scale mixtures of normal (SMN) distributions is introduced in this article. The SMN distributions form a class of symmetric thick-tailed distributions that includes the normal one as a special case, providing a robust alternative to estimation in SVM models in the absence of normality. A Bayesian method via Markov chain Monte Carlo (MCMC) techniques is used to estimate parameters. The deviance information criterion (DIC) and the Bayesian predictive information criterion (BPIC) are calculated to compare the fit of distributions. The method is illustrated by analyzing daily stock return data from the Sao Paulo Stock, Mercantile & Futures Exchange index (IBOVESPA). According to both model selection criteria as well as out-of-sample forecasting, we found that the SVM model with the slash distribution provides a significant improvement in model fit as well as prediction for the IBOVESPA data over the usual normal model.

10 citations
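
For readers unfamiliar with the slash distribution singled out above, it is the simplest SMN member to simulate: a normal draw divided by a uniform draw raised to 1/ν. A small sketch (the parameter value is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def slash_rvs(nu, size):
    """Standard slash draws as a scale mixture of normals:
    X = Z / U**(1/nu) with Z ~ N(0,1), U ~ Uniform(0,1).
    Large nu approaches the normal; small nu thickens the tails."""
    return rng.normal(size=size) / rng.random(size=size) ** (1.0 / nu)

x = slash_rvs(nu=5.0, size=200_000)
print("excess kurtosis:", np.mean(x**4) / np.mean(x**2) ** 2 - 3.0)  # > 0
```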


Journal ArticleDOI
TL;DR: In this paper, the identifiability of a parametric zero-inflated Poisson (ZIP) model is investigated under a number of different assumptions, including the assumption that x is a continuous covariate taking values in a closed interval.
Abstract: Zero-inflated Poisson (ZIP) models, which are mixture models, have been popularly used for count data that often contain large numbers of zeros, but their identifiability has not yet been thoroughly explored. In this work, we systematically investigate the identifiability of the ZIP models under a number of different assumptions. More specifically, we show the identifiability of a parametric ZIP model in which the incidence probability p(x) and Poisson mean λ(x) are modeled parametrically as p(x) = exp(β0 + β1x)/[1 + exp(β0 + β1x)] and λ(x) = exp(α0+α1x) for x being a continuous covariate in a closed interval. A semiparametric ZIP regression model is shown to be identifiable in which (i) p(x) = exp(β0 + β1x)/[1 + exp(β0 + β1x)] and λ(x) = exp[s(x)], (ii) p(x) = exp[r(x)]/{1 + exp[r(x)]} and λ(x) = exp(α0 + α1x), or (iii) p(x) = exp[r(x)]/{1 + exp[r(x)]} and λ(x) = exp[s(x)] for r(x) and s(x) being unspecified smooth functions.

10 citations
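
The parametric specification quoted above translates directly into a probability mass function. In this sketch p(x) is read as the zero-inflation probability, so P(Y = y) = p(x)·1{y = 0} + (1 − p(x))·Poisson(y; λ(x)); all coefficient values are illustrative.

```python
import numpy as np
from scipy.stats import poisson

def zip_pmf(y, x, beta0, beta1, alpha0, alpha1):
    """ZIP pmf with logistic p(x) and log-linear lambda(x)."""
    p = 1 / (1 + np.exp(-(beta0 + beta1 * x)))    # zero-inflation prob.
    lam = np.exp(alpha0 + alpha1 * x)
    base = (1 - p) * poisson.pmf(y, lam)
    return np.where(y == 0, p + base, base)

y = np.arange(5)
pmf = zip_pmf(y, x=0.5, beta0=-1.0, beta1=0.8, alpha0=0.2, alpha1=0.5)
print(pmf)
print(pmf.sum())   # < 1: the remaining mass sits in the tail y > 4
```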


Journal ArticleDOI
TL;DR: In this article, the adaptive group Lasso is applied to select the important groups, using spline bases to approximate the nonparametric components and the group Lasso to obtain an initial consistent estimator.
Abstract: We consider the problem of simultaneous variable selection and estimation in partially linear additive models with a large number of grouped variables in the linear part and a large number of nonparametric components. In our problem, the number of grouped variables may be larger than the sample size, but the number of important groups is “small” relative to the sample size. We apply the adaptive group Lasso to select the important groups, using spline bases to approximate the nonparametric components and the group Lasso to obtain an initial consistent estimator. Under appropriate conditions, it is shown that the group Lasso selects a number of groups comparable with the number of underlying important groups and is estimation consistent, and that the adaptive group Lasso selects the correct important groups with probability converging to one as the sample size increases and is therefore selection consistent. The results of simulation studies show that the adaptive group Lasso procedure works well with samples of moderate size. A real example is used to illustrate the application of the proposed penalized method.
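The two-stage scheme (group Lasso initial estimator, then adaptive group Lasso with data-driven weights) can be sketched with a plain proximal-gradient solver; the solver, penalty levels and data below are illustrative, not the paper's implementation.

```python
import numpy as np

def group_lasso(X, y, groups, lam, weights=None, n_iter=500):
    """Proximal-gradient solver for
    (1/2n)||y - Xb||^2 + lam * sum_g w_g ||b_g||_2."""
    n, p = X.shape
    gids = np.unique(groups)
    w = np.ones(len(gids)) if weights is None else weights
    b = np.zeros(p)
    step = n / np.linalg.norm(X, 2) ** 2          # 1/L for the smooth part
    for _ in range(n_iter):
        z = b - step * (X.T @ (X @ b - y) / n)
        for j, g in enumerate(gids):              # blockwise soft-threshold
            idx = groups == g
            nrm = np.linalg.norm(z[idx])
            scale = max(0.0, 1 - step * lam * w[j] / nrm) if nrm > 0 else 0.0
            b[idx] = scale * z[idx]
    return b

rng = np.random.default_rng(0)
n, groups = 100, np.repeat(np.arange(10), 3)      # 10 groups of 3 columns
X = rng.normal(size=(n, 30))
b_true = np.zeros(30)
b_true[:6] = [2, -1, 1.5, 0, 1, -2]               # groups 0 and 1 active
y = X @ b_true + 0.5 * rng.normal(size=n)

b_init = group_lasso(X, y, groups, lam=0.3)       # stage 1: group Lasso
w_adapt = 1 / np.maximum([np.linalg.norm(b_init[groups == g])
                          for g in range(10)], 1e-8)
b_adapt = group_lasso(X, y, groups, lam=0.1, weights=w_adapt)  # stage 2
print([g for g in range(10)
       if np.linalg.norm(b_adapt[groups == g]) > 1e-6])        # -> [0, 1]
```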

Journal ArticleDOI
TL;DR: This paper develops a simulation-based approach to sequential inference in Bayesian statistics that provides a simple yet powerful framework for the construction of alternative posterior sampling strategies for a variety of commonly used models.
Abstract: In this paper we develop a simulation-based approach to sequential inference in Bayesian statistics. Our resampling–sampling perspective provides draws from posterior distributions of interest by exploiting the sequential nature of Bayes theorem. Predictive inferences are a direct byproduct of our analysis as are marginal likelihoods for model assessment. We illustrate our approach in a hierarchical normal-means model and in a sequential version of Bayesian lasso. This approach provides a simple yet powerful framework for the construction of alternative posterior sampling strategies for a variety of commonly used models.
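A toy version of the resample-sample recipe for a conjugate normal-means model is sketched below. In this conjugate toy the sample (rejuvenation) step is exact, so the resample step is shown only to exhibit the structure of the scheme; the running product of average weights delivers the marginal likelihood as the advertised byproduct. All settings are illustrative.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)

# Model: theta ~ N(0, tau2), y_t | theta ~ N(theta, 1).
tau2, n_part, T, theta_true = 10.0, 5000, 50, 2.0
y = theta_true + rng.normal(size=T)

theta = rng.normal(0, np.sqrt(tau2), n_part)     # particles from the prior
log_ml, s = 0.0, 0.0
for t in range(T):
    w = norm.pdf(y[t], loc=theta)                # predictive weights
    log_ml += np.log(w.mean())                   # marginal-likelihood term
    theta = theta[rng.choice(n_part, n_part, p=w / w.sum())]  # resample
    s += y[t]                                    # sufficient statistic
    v = 1.0 / (1.0 / tau2 + t + 1)               # sample step: exact,
    theta = rng.normal(v * s, np.sqrt(v), n_part)  # conjugate rejuvenation

print(f"posterior mean ~ {theta.mean():.3f}, log marginal lik ~ {log_ml:.1f}")
```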


Journal ArticleDOI
TL;DR: In this paper, two testing procedures are introduced that are based on pooling the posterior evidence for the null hypothesis provided by the full Bayesian significance test and the posterior probability for the null hypothesis.
Abstract: We introduce two procedures for testing which are based on pooling the posterior evidence for the null hypothesis provided by the full Bayesian significance test and the posterior probability for the null hypothesis. Although the proposed procedures can be used in more general situations, we focus attention in tests for a precise null hypothesis. We prove that the proposed procedure based on the linear operator is a Bayes rule. We also verify that it does not lead to the Jeffreys–Lindley paradox. For a precise null hypothesis, we prove that the procedure based on the logarithmic operator is a generalization of Jeffreys test. We apply the results to some well-known probability families. The empirical results show that the proposed procedures present good performances. As a by-product we obtain tests for normality under the skew-normal one.
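For context, the e-value of the full Bayesian significance test mentioned above is the posterior mass lying outside the tangential set of the null. A grid sketch for a precise null in a conjugate normal model (the data and prior variance are illustrative):

```python
import numpy as np
from scipy.stats import norm

# FBST e-value sketch for the precise null H0: theta = 0, in a conjugate
# normal model theta ~ N(0, tau2), y_i | theta ~ N(theta, 1).
# ev(H0) = 1 - posterior probability of the "tangential set" where the
# posterior density exceeds its value at theta = 0.
y = np.array([0.8, 1.1, 0.4, 1.3, 0.9])
tau2 = 10.0
post_var = 1 / (1 / tau2 + y.size)
post = norm(post_var * y.sum(), np.sqrt(post_var))

theta = np.linspace(post.mean() - 8, post.mean() + 8, 200_001)
dens = post.pdf(theta)
dtheta = theta[1] - theta[0]
tangential = dens[dens > post.pdf(0.0)].sum() * dtheta
print("e-value for H0: theta = 0 ->", 1 - tangential)
```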

Journal ArticleDOI
TL;DR: In this paper, a new long-term survival model is proposed for analyzing time-to-event data with longterm survivors fraction, where a fraction of systems is subject to failure from independent competing causes of failure, while the remaining proportion p is cured or has not presented the event of interest during the time period of the study.
Abstract: Long-term survival models have historically been considered for analyzing time-to-event data with a long-term survivor fraction. However, situations in which a fraction (1 − p) of systems is subject to failure from independent competing causes of failure, while the remaining proportion p is cured or has not presented the event of interest during the time period of the study, have not been fully considered in the literature. In order to accommodate such situations, we present in this paper a new long-term survival model. The maximum likelihood estimation procedure is discussed, as well as interval estimation and hypothesis tests. A real dataset illustrates the methodology.
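For orientation, a standard way to write the structure described above (a generic form, not necessarily the paper's exact model) is via the population survival function of a mixture of cured and susceptible subjects, with the susceptible survival built from the independent competing causes:

```latex
S_{\mathrm{pop}}(t) = p + (1 - p)\,S_0(t),
\qquad
S_0(t) = \prod_{j=1}^{m} S_j(t),
```

where p is the cured proportion and S_1, ..., S_m are the survival functions of the m independent competing causes of failure acting on the susceptible fraction.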

Journal ArticleDOI
TL;DR: A nonparametric, fuzzy set estimator of the Nadaraya–Watson type is defined for the regression function with independent pairs of data, and almost sure convergence, convergence in law, and uniform convergence over compact subsets are established.
Abstract: In this paper, we define a nonparametric, fuzzy set estimator of the Nadaraya–Watson type for the regression function with independent pairs of data, and we establish almost sure convergence, convergence in law, and uniform convergence over compact subsets.
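As a point of reference for the estimator above, the classical (non-fuzzy) Nadaraya–Watson form is a kernel-weighted local average; the paper's fuzzy set version replaces the kernel weights with fuzzy membership functions. A minimal sketch of the classical form, on simulated data:

```python
import numpy as np

def nadaraya_watson(x0, x, y, h):
    """Classical Nadaraya-Watson estimator with a Gaussian kernel:
    m_hat(x0) = sum_i K((x0 - x_i)/h) y_i / sum_i K((x0 - x_i)/h)."""
    k = np.exp(-0.5 * ((x0[:, None] - x[None, :]) / h) ** 2)
    return (k * y).sum(axis=1) / k.sum(axis=1)

rng = np.random.default_rng(0)
x = rng.uniform(0, 2 * np.pi, 300)
y = np.sin(x) + 0.3 * rng.normal(size=300)
grid = np.linspace(0, 2 * np.pi, 9)
print(np.round(nadaraya_watson(grid, x, y, h=0.3), 2))  # ~ sin(grid)
```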

Journal ArticleDOI
TL;DR: In this paper, a heteroscedastic Von Bertalanffy growth model is proposed and fitted to chicken corporeal weight data using a sampling-based approach, which allows prior information to be incorporated with low computational effort.
Abstract: In this work, we propose a heteroscedastic Von Bertalanffy growth model considering a multiplicative heteroscedastic dispersion matrix. All estimates were obtained using a sampling-based approach, which allows prior information to be incorporated with low computational effort. Simulations were carried out in order to verify some frequentist properties of the estimation procedure in the presence of small and moderate sample sizes. The methodology is illustrated on a real Kubbard female chicken corporeal weight dataset.
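One common weight parameterisation of the Von Bertalanffy curve, with multiplicative noise standing in for the heteroscedastic dispersion, can be fitted in a few lines. The least-squares fit below is only a frequentist stand-in for the paper's sampling-based Bayesian approach, and all parameter values are illustrative.

```python
import numpy as np
from scipy.optimize import curve_fit

# Von Bertalanffy weight curve (one common parameterisation):
#   W(t) = A * (1 - b * exp(-k * t))**3
def vb(t, A, b, k):
    return A * (1 - b * np.exp(-k * t)) ** 3

rng = np.random.default_rng(0)
t = np.linspace(1, 60, 80)
w = vb(t, A=2500, b=0.9, k=0.06) * (1 + 0.05 * rng.normal(size=t.size))
# multiplicative noise: the dispersion grows with the mean weight

popt, _ = curve_fit(vb, t, w, p0=[2000, 0.8, 0.05])
print(np.round(popt, 3))        # ~ [2500, 0.9, 0.06]
```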

Journal ArticleDOI
TL;DR: In this article, the authors adapt and implement residuals considered in the literature for the probit, logistic and skew-probit links under binary regression, and detect the presence of outliers using these residuals for different models in a simulated dataset and a real medical dataset.
Abstract: Model diagnostics is an integral part of model determination and an important part of the model diagnostics is residual analysis. We adapt and implement residuals considered in the literature for the probit, logistic and skew-probit links under binary regression. New latent residuals for the skew-probit link are proposed here. We have detected the presence of outliers using the residuals proposed here for different models in a simulated dataset and a real medical dataset.
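Two of the classical residuals the paper builds on, the Pearson and deviance residuals of a logistic regression, can be computed directly; the skew-probit latent residuals proposed in the paper go beyond this sketch. Data here are simulated for illustration.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 300
x = rng.normal(size=n)
X = sm.add_constant(x)
y = rng.binomial(1, 1 / (1 + np.exp(-(0.5 + 1.2 * x))))

fit = sm.GLM(y, X, family=sm.families.Binomial()).fit()
p_hat = fit.fittedvalues                        # fitted probabilities

pearson = (y - p_hat) / np.sqrt(p_hat * (1 - p_hat))
deviance = np.sign(y - p_hat) * np.sqrt(
    -2 * (y * np.log(p_hat) + (1 - y) * np.log(1 - p_hat)))
print(np.abs(pearson).max(), np.abs(deviance).max())   # screen for outliers
```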

Journal ArticleDOI
TL;DR: In this article, the important problem of the ratio of gamma and beta distributed random variables is considered, and exact expressions are derived for the probability density function, cumulative distribution function, hazard rate function, shape characteristics, moments, factorial moments, variance, skewness, kurtosis, conditional moments, L moments, characteristic function, mean deviation about the mean, mean deviation about the median, Bonferroni curve, Lorenz curve, percentiles, order statistics and the asymptotic distribution of the extreme values.
Abstract: The important problem of the ratio of gamma and beta distributed random variables is considered. Six motivating applications (from efficiency modeling, income modeling, clinical trials, hydrology, reliability and modeling of infectious diseases) are discussed. Exact expressions are derived for the probability density function, cumulative distribution function, hazard rate function, shape characteristics, moments, factorial moments, variance, skewness, kurtosis, conditional moments, L moments, characteristic function, mean deviation about the mean, mean deviation about the median, Bonferroni curve, Lorenz curve, percentiles, order statistics and the asymptotic distribution of the extreme values. Estimation procedures by the methods of moments and maximum likelihood are provided and their performances compared by simulation. For maximum likelihood estimation, the Fisher information matrix is derived and the case of censoring is considered. Finally, an application is discussed for efficiency of warning-time systems.
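A quick Monte Carlo check of one such moment expression is easy, since moments of the ratio factor into gamma moments and inverse-beta moments; the parameter values below are illustrative.

```python
import numpy as np
from scipy.special import beta as B

# R = X / Y with X ~ Gamma(a, 1) and Y ~ Beta(c, d) independent.
# E[R] = E[X] * E[1/Y] = a * B(c - 1, d) / B(c, d), finite for c > 1.
rng = np.random.default_rng(0)
a, c, d = 3.0, 5.0, 2.0
r = rng.gamma(a, size=1_000_000) / rng.beta(c, d, size=1_000_000)

print("Monte Carlo mean:", r.mean())
print("exact mean      :", a * B(c - 1, d) / B(c, d))   # = 4.5 here
```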

Journal ArticleDOI
TL;DR: In this paper, Fuquene et al. propose an analysis based on the Cauchy prior for the natural parameter in exponential families, showing that the Cauchy/Poisson posterior is a robust model for count data in contrast with the usual conjugate Gamma/Poisson model.
Abstract: The usual Bayesian approach for count data is Gamma/Poisson conjugate analysis. However, in this conjugate analysis the influence of the prior distribution can be dominant even when prior and likelihood are in conflict. Our proposal is an analysis based on the Cauchy prior for the natural parameter in exponential families. In this work, we show that the Cauchy/Poisson posterior model is a robust model for count data, in contrast with the usual conjugate Bayesian Gamma/Poisson model. We use the polynomial tails comparison theorem given in Fuquene, J. A., Cook, J. D. and Pericchi, L. R. (2009), which gives easy-to-check conditions to ensure prior robustness. In short, this means that when the location of the prior and the bulk of the mass of the likelihood get further apart (a situation of conflict between prior and likelihood information), Bayes' theorem will cause the posterior distribution to discount the prior information. Finally, we analyze artificial data sets to investigate the robustness of the Cauchy/Poisson model.
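The discounting behaviour is visible in a small grid computation: with the prior located near λ = 1 and data with mean around 20, the conjugate Gamma prior drags the posterior mean to a compromise, while under the Cauchy prior on the natural parameter η = log λ the prior is largely ignored. All settings below are illustrative.

```python
import numpy as np
from scipy.stats import cauchy, gamma, poisson

y = np.array([18, 22, 19, 21, 20])        # data in conflict with the prior
eta = np.linspace(-5, 6, 4001)            # natural parameter, log(lambda)
lam = np.exp(eta)
loglik = poisson.logpmf(y[:, None], lam[None, :]).sum(axis=0)

def post_mean(log_prior):
    logw = loglik + log_prior
    w = np.exp(logw - logw.max())         # stabilised unnormalised posterior
    return (lam * w).sum() / w.sum()      # uniform grid: spacing cancels

# Gamma(10, 10) prior on lambda (mean 1); as a density in eta it picks up
# the Jacobian factor lambda, i.e. + eta on the log scale.
print("Gamma prior :", post_mean(gamma.logpdf(lam, 10.0, scale=0.1) + eta))
print("Cauchy prior:", post_mean(cauchy.logpdf(eta)))   # ~ the data mean
```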

Journal ArticleDOI
TL;DR: In this article, it was established that if a time series satisfies the Berman condition, and another related (summability) condition, the result of filtering that series through a certain type of filter also satisfies the two conditions.
Abstract: It is established that if a time series satisfies the Berman condition, and another related (summability) condition, the result of filtering that series through a certain type of filter also satisfies the two conditions. In particular it follows that if Xt satisfies the two conditions and if Xt and at are related by an invertible ARMA model, then the at satisfy the two conditions.

Journal ArticleDOI
TL;DR: In this paper, it is shown that all multivariate extreme value distributions have the same copula, the so-called K-extremal copula, and this copula and its density are described through exact expressions.
Abstract: We show that all multivariate extreme value distributions, which are the possible weak limits of the K largest order statistics of i.i.d. samples, have the same copula, the so-called K-extremal copula. This copula and its density are described through exact expressions. We also study measures of dependence, obtain a weak convergence result and propose a simulation algorithm for the K-extremal copula.

Journal ArticleDOI
TL;DR: In this article, more flexible misclassification models for the number of peaks in a polymorphic locus of trisomic individuals are considered and compared to some others proposed in the literature.
Abstract: Trisomies are numerical chromosomal anomalies (aneuploidies) which are common causes of mental retardation, pregnancy losses and fetal death. The determination of the meiosis I nondisjunction fraction plays an important role in the identification of possible factors which could generate such aneuploidies. In this article, more flexible misclassification models for the number of peaks in a polymorphic locus of trisomic individuals are considered. They are compared to some others proposed in the literature. Estimation and tests for the nondisjunction fraction in meiosis I and for the misclassification errors are introduced, extending previous works. Using the decision theory approach, we also build a criterion for making decisions under the Jeffreys and Pereira–Stern tests. We apply the results to data on Down syndrome, the most prevalent trisomy in humans.

Journal ArticleDOI
TL;DR: The authors give Cornish–Fisher expansions for general smooth functions of the sample cross-moments of a stationary linear process.
Abstract: We give Cornish–Fisher expansions for general smooth functions of the sample cross-moments of a stationary linear process. Examples include the distributions of the sample mean, the sample autocovariance and the sample autocorrelation.
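For orientation, the classical univariate Cornish–Fisher expansion corrects the standard normal quantile z_p using the skewness κ₃ and excess kurtosis κ₄ of the statistic; the paper derives expansions of this type for smooth functions of sample cross-moments:

```latex
w_p \;\approx\; z_p
  + \frac{\kappa_3}{6}\bigl(z_p^2 - 1\bigr)
  + \frac{\kappa_4}{24}\bigl(z_p^3 - 3 z_p\bigr)
  - \frac{\kappa_3^2}{36}\bigl(2 z_p^3 - 5 z_p\bigr).
```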

Journal ArticleDOI
TL;DR: In this article, it is shown that selection of a linear summary measure on the basis of inspection of the total sample of response curves leads to valid F-tests in the subsequent analysis of variance.
Abstract: Univariate analysis of variance of a good summary measure, or two, may provide a simple and effective way of analyzing repeated measurements. It is shown here that selection of a linear summary measure on the basis of inspection of the total sample of response curves, leads to valid F-tests in the subsequent analysis of variance. The selection may also be based on residuals from a base model, rather than on the raw data. The treatments should, however, be blinded in this summary measure selection step, that is, the inspection of the sample of curves (or residuals) and the selection of the summary measure may not rely on which responses stem from which treatment groups. It is advocated as a convenient and often effective method to use the first principal component from the total sample of curves as the first summary measure. The main mathematical result of the paper is a simple proof of the validity of the F-tests for linear summary measures selected in this way, provided data are multivariate normally distributed. Alternatively, permutation tests may be used to provide a distribution free reference distribution for the F-statistic. Two examples illustrate the method.
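The blinded selection step described above is short to implement: compute the first principal component of the pooled curves ignoring group labels, project each curve onto it, and run the ANOVA F-test on the resulting scalar scores. The data below are simulated for illustration.

```python
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 20)
curves_a = 1.0 * t + 0.3 * rng.normal(size=(15, 20))   # group A curves
curves_b = 1.6 * t + 0.3 * rng.normal(size=(15, 20))   # group B curves

# Blinded step: the first PC is computed from the pooled sample,
# without reference to the group labels.
pooled = np.vstack([curves_a, curves_b])
centred = pooled - pooled.mean(axis=0)
_, _, vt = np.linalg.svd(centred, full_matrices=False)
pc1 = vt[0]                                            # first PC loading

scores_a = curves_a @ pc1                              # summary measures
scores_b = curves_b @ pc1
print(f_oneway(scores_a, scores_b))                    # the valid F-test
```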