A Product Partition Model With Regression on Covariates.
TLDR
A model-based clustering algorithm that exploits available covariates is developed, suitable for any combination of continuous, categorical, count, and ordinal covariates; posterior predictive inference in this model formalizes the desired prediction.
Abstract:
We propose a probability model for random partitions in the presence of covariates. In other words, we develop a model-based clustering algorithm that exploits available covariates. The motivating application is predicting time to progression for patients in a breast cancer trial. We proceed by reporting a weighted average of the responses of clusters of earlier patients. The weights should be determined by the similarity of the new patient's covariate with the covariates of patients in each cluster. We achieve the desired inference by defining a random partition model that includes a regression on covariates. Patients with similar covariates are a priori more likely to be clustered together. Posterior predictive inference in this model formalizes the desired prediction. We build on product partition models (PPM). We define an extension of the PPM to include a regression on covariates by including in the cohesion function a new factor that increases the probability of experimental units with similar covariates to be included in the same cluster. We discuss implementations suitable for any combination of continuous, categorical, count, and ordinal covariates. An implementation of the proposed model as an R package is available for download.
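The covariate-dependent partition prior described above factors over clusters as a cohesion term times a similarity term. The following is a minimal sketch of that idea, not the paper's R package: it assumes a Dirichlet-process-style cohesion c(S) = M·(|S|−1)! and, for a single continuous covariate, a Gaussian marginal-likelihood similarity; all names and the choice of similarity are illustrative assumptions.

```python
import math

M = 1.0  # assumed DP-style total-mass parameter

def cohesion(cluster):
    # DP-style cohesion: c(S) = M * (|S| - 1)!
    return M * math.factorial(len(cluster) - 1)

def similarity(xs):
    # Similarity g(x*): marginal likelihood of the cluster's covariates
    # under N(x_i | mu, 1) with a N(0, 1) prior on mu. Clusters whose
    # covariates sit close together get a larger g.
    n = len(xs)
    s = sum(xs)
    ss = sum(x * x for x in xs)
    log_g = (-0.5 * n * math.log(2 * math.pi)
             - 0.5 * math.log(n + 1)
             - 0.5 * ss
             + 0.5 * s * s / (n + 1))
    return math.exp(log_g)

def ppmx_weight(partition, x):
    # Unnormalized prior probability of a partition (a list of index
    # lists): the product over clusters of cohesion times similarity.
    w = 1.0
    for cluster in partition:
        w *= cohesion(cluster) * similarity([x[i] for i in cluster])
    return w

x = [0.1, 0.2, 3.0]  # one continuous covariate per experimental unit
together = ppmx_weight([[0, 1], [2]], x)  # similar units co-clustered
apart = ppmx_weight([[0, 2], [1]], x)     # dissimilar units co-clustered
# Units with similar covariates are a priori more likely to co-cluster:
assert together > apart
```

Mixed covariate types would be handled, as the abstract notes, by choosing an appropriate similarity factor per covariate (e.g. a multinomial-Dirichlet marginal for categorical covariates) and multiplying the factors within each cluster.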
Citations
Journal ArticleDOI
How to find an appropriate clustering for mixed-type variables with application to socio-economic stratification
Christian Hennig,Tim Futing Liao +1 more
TL;DR: The application of a philosophy of cluster analysis to economic data from the 2007 US Survey of Consumer Finances demonstrates techniques and decisions required to obtain an interpretable clustering, and the clustering is shown to be significantly more structured than a suitable null model.
Journal ArticleDOI
Bayesian Nonparametric Inference – Why and How
Peter Müller,Riten Mitra +1 more
TL;DR: Inference under models with nonparametric Bayesian (BNP) priors is reviewed for density estimation, clustering, regression and for mixed effects models with random effects distributions.
Journal ArticleDOI
Mixture Models With a Prior on the Number of Components
TL;DR: The most commonly used method of inference for MFMs is reversible jump Markov chain Monte Carlo, but it can be nontrivial to design good reversible jump moves, especially in high-dimensional spaces.
Posted Content
Mixture models with a prior on the number of components
TL;DR: It turns out that many of the essential properties of DPMs are also exhibited by MFMs, and the MFM analogues are simple enough that they can be used much like the corresponding DPM properties; this simplifies the implementation of MFMs and can substantially improve mixing.
Journal ArticleDOI
PReMiuM: An R Package for Profile Regression Mixture Models Using Dirichlet Processes
TL;DR: PReMiuM is a recently developed R package for Bayesian clustering using a Dirichlet process mixture model, which allows binary, categorical, count, and continuous responses, as well as continuous and discrete covariates.
References
Journal ArticleDOI
A Bayesian Analysis of Some Nonparametric Problems
TL;DR: A class of prior distributions, called Dirichlet process priors, is proposed, under which many nonparametric statistical problems can be treated, yielding results comparable to the classical theory.
Journal ArticleDOI
Model-Based Clustering, Discriminant Analysis, and Density Estimation
Chris Fraley,Adrian E. Raftery +1 more
TL;DR: This work reviews a general methodology for model-based clustering that provides a principled statistical approach to important practical questions that arise in cluster analysis, such as how many clusters are there, which clustering method should be used, and how should outliers be handled.
Journal ArticleDOI
Hierarchical mixtures of experts and the EM algorithm
TL;DR: An Expectation-Maximization (EM) algorithm is presented for adjusting the parameters of the tree-structured architecture for supervised learning, along with an on-line learning algorithm in which the parameters are updated incrementally.
Journal ArticleDOI
Model-based Gaussian and non-Gaussian clustering
TL;DR: The classification maximum likelihood approach is sufficiently general to encompass many current clustering algorithms, including those based on the sum of squares criterion and on the criterion of Friedman and Rubin (1967), but it is restricted to Gaussian distributions and it does not allow for noise.
Journal ArticleDOI
On Bayesian Analysis of Mixtures with an Unknown Number of Components (with discussion)
TL;DR: A hierarchical prior model is proposed that deals with weak prior information while avoiding the mathematical pitfalls of using improper priors in the mixture context, and that serves as a basis for a thorough presentation of many aspects of the posterior distribution.