Showing papers by "Igor Prünster" published in 2009


Journal ArticleDOI
TL;DR: A comprehensive Bayesian non-parametric analysis of random probabilities obtained by normalizing random measures with independent increments (NRMI). The results yield a generalized Blackwell–MacQueen sampling scheme, which is then adapted to also cover mixture models driven by general NRMIs.
Abstract: One of the main research areas in Bayesian Nonparametrics is the proposal and study of priors which generalize the Dirichlet process. In this paper, we provide a comprehensive Bayesian non-parametric analysis of random probabilities which are obtained by normalizing random measures with independent increments (NRMI). Special cases of these priors have already been shown to be useful for statistical applications such as mixture models and species sampling problems. However, in order to fully exploit these priors, the derivation of the posterior distribution of NRMIs is crucial: here we achieve this goal and, indeed, provide explicit and tractable expressions suitable for practical implementation. The posterior distribution of an NRMI turns out to be a mixture with respect to the distribution of a specific latent variable. The analysis is completed by the derivation of the corresponding predictive distributions and by a thorough investigation of the marginal structure. These results allow one to derive a generalized Blackwell–MacQueen sampling scheme, which is then adapted to also cover mixture models driven by general NRMIs.

211 citations
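
For orientation, the classical Blackwell–MacQueen urn that the scheme above generalizes can be sketched for the Dirichlet process, the best-known special case of an NRMI. This is a minimal illustration and not the paper's generalized sampler: the function name, the `theta` concentration parameter, and the standard normal base measure are assumptions chosen for the example.

```python
import random


def blackwell_macqueen_urn(n, theta, base_sampler, rng):
    """Draw n exchangeable values from the Polya urn of a Dirichlet process.

    theta > 0 is the total mass; base_sampler(rng) draws from the base measure.
    """
    draws = []
    for i in range(n):
        # With probability theta / (theta + i) draw a fresh value from the
        # base measure; otherwise repeat one of the i previous draws,
        # chosen uniformly at random.
        if rng.random() < theta / (theta + i):
            draws.append(base_sampler(rng))
        else:
            draws.append(rng.choice(draws))
    return draws


rng = random.Random(0)
sample = blackwell_macqueen_urn(
    50, theta=1.0, base_sampler=lambda r: r.gauss(0.0, 1.0), rng=rng
)
print(len(sample), "draws,", len(set(sample)), "distinct values")
```

Ties among the draws are what induce the random partition; for a general NRMI the predictive weights additionally depend on a latent variable, as the abstract notes, which this sketch does not model.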


Journal ArticleDOI
TL;DR: A Bayesian non-parametric methodology has been proposed for prediction within species sampling problems, which concern the evaluation, conditional on a sample of size n, of the species variety featured by an additional sample of size m. This paper focuses on the two-parameter Poisson-Dirichlet model.
Abstract: A Bayesian non-parametric methodology has been recently proposed to deal with the issue of prediction within species sampling problems. Such problems concern the evaluation, conditional on a sample of size n, of the species variety featured by an additional sample of size m. Genomic applications pose the additional challenge of having to deal with large values of both n and m. In such a case the computation of the Bayesian non-parametric estimators is cumbersome and prevents their implementation. We focus on the two-parameter Poisson–Dirichlet model and provide completely explicit expressions for the corresponding estimators, which can be easily evaluated for any sizes of n and m. We also study the asymptotic behaviour of the number of new species conditionally on the observed sample: such an asymptotic result, combined with a suitable simulation scheme, allows us to derive asymptotic highest posterior density intervals for the estimates of interest. Finally, we illustrate the implementation of the proposed methodology by the analysis of five expressed sequence tags data sets.

88 citations
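
To illustrate what an explicit estimator of this kind can look like, the sketch below computes the expected number of new species in m further draws, given j distinct species among the first n, under the two-parameter Poisson–Dirichlet model with parameters sigma in (0, 1) and theta > -sigma. The closed form is derived here from the model's standard predictive rule (a new species appears with probability (theta + k·sigma)/(theta + N) when k species have been seen in N draws); it is an assumption that it matches the paper's own expression, and all function names are hypothetical.

```python
from math import exp, lgamma


def log_rising(a, m):
    """Log of the rising factorial (a)_m = a (a+1) ... (a+m-1), for a > 0."""
    return lgamma(a + m) - lgamma(a)


def expected_new_species(n, j, m, sigma, theta):
    """Closed form: (j + theta/sigma) * ((theta+n+sigma)_m / (theta+n)_m - 1)."""
    ratio = exp(log_rising(theta + n + sigma, m) - log_rising(theta + n, m))
    return (j + theta / sigma) * (ratio - 1.0)


def expected_new_species_recursive(n, j, m, sigma, theta):
    """Same quantity, accumulated one draw at a time from the predictive rule."""
    expected = 0.0
    for i in range(m):
        # Probability that draw n + i + 1 is a new species, averaged over
        # the number of new species seen so far (the rule is linear in it).
        expected += (theta + (j + expected) * sigma) / (theta + n + i)
    return expected


closed = expected_new_species(n=1000, j=300, m=1000, sigma=0.5, theta=10.0)
stepwise = expected_new_species_recursive(n=1000, j=300, m=1000, sigma=0.5, theta=10.0)
print(closed, stepwise)
```

Working with log-gamma values keeps the ratio of rising factorials numerically stable for the large n and m that the abstract mentions, whereas the step-by-step version costs O(m) and is included only as a cross-check.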


Journal Article
TL;DR: In this article, a Bayesian non-parametric methodology has been proposed to deal with the issue of prediction within species sampling problems, where the authors focus on the two-parameter Poisson-Dirichlet model and provide completely explicit expressions for the corresponding estimators, which can be easily evaluated for any sizes of n and m.
Abstract: A Bayesian non-parametric methodology has been recently proposed to deal with the issue of prediction within species sampling problems. Such problems concern the evaluation, conditional on a sample of size n, of the species variety featured by an additional sample of size m. Genomic applications pose the additional challenge of having to deal with large values of both n and m. In such a case the computation of the Bayesian non-parametric estimators is cumbersome and prevents their implementation. We focus on the two-parameter Poisson-Dirichlet model and provide completely explicit expressions for the corresponding estimators, which can be easily evaluated for any sizes of n and m. We also study the asymptotic behaviour of the number of new species conditionally on the observed sample: such an asymptotic result, combined with a suitable simulation scheme, allows us to derive asymptotic highest posterior density intervals for the estimates of interest. Finally, we illustrate the implementation of the proposed methodology by the analysis of five expressed sequence tags data sets.

67 citations


Journal ArticleDOI
TL;DR: A review of the results concerning distributional properties of means of random probability measures can be found in this paper, where the main focus is on means of the Dirichlet process and the connections with the moment problem, combinatorics, special functions, and stochastic processes.
Abstract: The present paper provides a review of the results concerning distributional properties of means of random probability measures. Our interest in this topic has originated from inferential problems in Bayesian Nonparametrics. Nonetheless, it is worth noting that these random quantities play an important role in seemingly unrelated areas of research. In fact, there is a wealth of contributions both in the statistics and in the probability literature that we try to summarize in a unified framework. Particular attention is devoted to means of the Dirichlet process given the relevance of the Dirichlet process in Bayesian Nonparametrics. We then present a number of recent contributions concerning means of more general random probability measures and highlight connections with the moment problem, combinatorics, special functions, excursions of stochastic processes and statistical physics.

29 citations


Journal ArticleDOI
TL;DR: In this paper, the authors provide a comprehensive analysis of the asymptotic behavior of Bayesian nonparametric models in which hazard rates are kernel mixtures with respect to a completely random measure, and derive fixed sample size central limit theorems for both linear and quadratic functionals of the posterior hazard rate.
Abstract: An important issue in survival analysis is the investigation and the modeling of hazard rates. Within a Bayesian nonparametric framework, a natural and popular approach is to model hazard rates as kernel mixtures with respect to a completely random measure. In this paper we provide a comprehensive analysis of the asymptotic behavior of such models. We investigate consistency of the posterior distribution and derive fixed sample size central limit theorems for both linear and quadratic functionals of the posterior hazard rate. The general results are then specialized to various specific kernels and mixing measures yielding consistency under minimal conditions and neat central limit theorems for the distribution of functionals.

29 citations
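
As a concrete picture of a hazard rate written as a kernel mixture, the sketch below evaluates h(t) = Σ_i J_i k(t | x_i) for a purely discrete mixing measure with jumps J_i at locations x_i. The indicator kernel k(t | x) = 1 if x ≤ t and 0 otherwise is one common choice assumed here for illustration; the fixed jumps and locations stand in for a realization of a completely random measure and are hypothetical values, not taken from the paper.

```python
from math import exp


def hazard(t, jumps, locations):
    """h(t) = sum of the jumps whose location is at or before t."""
    return sum(J for J, x in zip(jumps, locations) if x <= t)


def cumulative_hazard(t, jumps, locations):
    """H(t) = integral of h over [0, t] = sum_i J_i * max(t - x_i, 0)."""
    return sum(J * max(t - x, 0.0) for J, x in zip(jumps, locations))


def survival(t, jumps, locations):
    """S(t) = exp(-H(t))."""
    return exp(-cumulative_hazard(t, jumps, locations))


# Hypothetical realization of the mixing measure: two atoms.
jumps = [0.5, 1.0]
locations = [0.0, 2.0]
for t in (0.0, 1.0, 3.0):
    print(t, hazard(t, jumps, locations), survival(t, jumps, locations))
```

With this kernel the hazard is a non-decreasing step function; other kernels give other shapes, and the asymptotic results in the abstract are stated for several such choices.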


Journal ArticleDOI
TL;DR: In this article, the authors provide a comprehensive analysis of the asymptotic behavior of Bayesian nonparametric models in which hazard rates are kernel mixtures with respect to a completely random measure, and derive fixed sample size central limit theorems for both linear and quadratic functionals of the posterior hazard rate.
Abstract: An important issue in survival analysis is the investigation and the modeling of hazard rates. Within a Bayesian nonparametric framework, a natural and popular approach is to model hazard rates as kernel mixtures with respect to a completely random measure. In this paper we provide a comprehensive analysis of the asymptotic behavior of such models. We investigate consistency of the posterior distribution and derive fixed sample size central limit theorems for both linear and quadratic functionals of the posterior hazard rate. The general results are then specialized to various specific kernels and mixing measures yielding consistency under minimal conditions and neat central limit theorems for the distribution of functionals.

24 citations


Journal Article
TL;DR: In this paper, a sensitivity analysis for a wide class of Bayesian nonparametric density estimators, including the mixture of Dirichlet process and the recently proposed mixture of normalized inverse Gaussian process, is performed.
Abstract: Bayesian nonparametric methods have recently gained popularity in the context of density estimation. In particular, the density estimator arising from the mixture of Dirichlet process is now commonly exploited in practice. In this paper we perform a sensitivity analysis for a wide class of Bayesian nonparametric density estimators, including the mixture of Dirichlet process and the recently proposed mixture of normalized inverse Gaussian process. Whereas previous studies focused only on the tuning of prior parameters, our approach consists of perturbing the prior itself by means of a suitable function. In order to carry out the sensitivity analysis we derive representations for posterior quantities and develop an algorithm for drawing samples from mixtures with a perturbed nonparametric component. Our results bring out some clear evidence for Bayesian nonparametric density estimators, and we provide a heuristic explanation for the neutralization of the perturbation in the posterior distribution.

22 citations


Posted Content
01 Jan 2009
TL;DR: A review of Bayesian nonparametric models that go beyond the Dirichlet process, motivated by statistical applications for which the Dirichlet process is not an adequate prior choice.
Abstract: Bayesian nonparametric inference is a relatively young area of research and it has recently undergone a strong development. Most of its success can be explained by the considerable degree of flexibility it ensures in statistical modelling, if compared to parametric alternatives, and by the emergence of new and efficient simulation techniques that make nonparametric models amenable to concrete use in a number of applied statistical problems. Since its introduction in 1973 by T. S. Ferguson, the Dirichlet process has emerged as a cornerstone in Bayesian nonparametrics. Nonetheless, in some cases of interest for statistical applications the Dirichlet process is not an adequate prior choice and alternative nonparametric models need to be devised. In this paper we provide a review of Bayesian nonparametric models that go beyond the Dirichlet process.

8 citations


Posted Content
TL;DR: In this article, a Bayesian nonparametric methodology has been proposed to deal with the issue of prediction within species sampling problems, where the authors focus on the two-parameter Poisson-Dirichlet model and provide completely explicit expressions for the corresponding estimators, which can be easily evaluated for any sizes of n and m.
Abstract: A Bayesian nonparametric methodology has been recently proposed in order to deal with the issue of prediction within species sampling problems. Such problems concern the evaluation, conditional on a sample of size n, of the species variety featured by an additional sample of size m. Genomic applications pose the additional challenge of having to deal with large values of both n and m. In such a case the computation of the Bayesian nonparametric estimators is cumbersome and prevents their implementation. In this paper we focus on the two-parameter Poisson-Dirichlet model and provide completely explicit expressions for the corresponding estimators, which can be easily evaluated for any sizes of n and m. We also study the asymptotic behaviour of the number of new species conditionally on the observed sample: such an asymptotic result, combined with a suitable simulation scheme, allows us to derive asymptotic highest posterior density intervals for the estimates of interest. Finally, we illustrate the implementation of the proposed methodology by the analysis of five Expressed Sequence Tags (EST) datasets.

6 citations