Merging Mixture Components for Cell Population Identification in Flow Cytometry
TL;DR: The cluster merging algorithm under this framework improves model fit and provides a better estimate of the number of distinct cell subpopulations than either Gaussian mixture models or flowClust, especially for complicated flow cytometry data distributions.
Abstract:
We present a framework for identifying cell subpopulations in flow cytometry (FCM) data by merging mixture components fitted with the flowClust methodology. We show that the cluster merging algorithm under our framework improves model fit and provides a better estimate of the number of distinct cell subpopulations than either Gaussian mixture models or flowClust, especially for complicated flow cytometry data distributions. Our framework selects the number of distinct cell subpopulations automatically, and we are able to identify cases where the algorithm fails, which makes it suitable for use in a high-throughput FCM analysis pipeline. Furthermore, we demonstrate a method for summarizing complex merged cell subpopulations in a simple manner that integrates with the existing flowClust framework and enables downstream data analysis. We demonstrate the performance of our framework on simulated and real FCM data. The software is available in the flowMerge package through the Bioconductor project.
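The idea of merging mixture components can be illustrated with a small sketch. The abstract does not state the merging criterion, so the code below assumes one common choice: greedily merge the pair of components whose union gives the lowest classification entropy. The function names (`entropy`, `merge_columns`, `merge_once`, `merge_path`) and the toy posterior matrix are hypothetical and are not the flowMerge API.

```python
import math

def entropy(post):
    """Classification entropy: -sum over cells and components of p * log(p)."""
    return -sum(p * math.log(p) for row in post for p in row if p > 0)

def merge_columns(post, a, b):
    """Return a new posterior matrix with components a and b summed into one."""
    merged = []
    for row in post:
        new_row = [p for k, p in enumerate(row) if k not in (a, b)]
        new_row.append(row[a] + row[b])  # merged component goes last
        merged.append(new_row)
    return merged

def merge_once(post):
    """Try every pair; keep the merge with the lowest resulting entropy."""
    k = len(post[0])
    best = None
    for a in range(k):
        for b in range(a + 1, k):
            cand = merge_columns(post, a, b)
            e = entropy(cand)
            if best is None or e < best[0]:
                best = (e, (a, b), cand)
    return best

def merge_path(post):
    """Merge down to one component, recording (n_components, entropy)."""
    path = [(len(post[0]), entropy(post))]
    while len(post[0]) > 1:
        e, _, post = merge_once(post)
        path.append((len(post[0]), e))
    return path

# Toy posteriors (assumed data): components 0 and 1 overlap, 2 is separate.
post = [
    [0.5, 0.5, 0.0],
    [0.6, 0.4, 0.0],
    [0.4, 0.6, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0],
]
path = merge_path(post)
```

In this toy case the two overlapping components are merged first, after which the entropy stops decreasing. A flattening of this kind is one signal that could be used to choose the number of merged populations, though the abstract itself does not describe the selection rule.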
Citations
Proceedings Article
A clustering hybrid method to identify cellular populations and their phenotypic signatures
TL;DR: A hybrid clustering algorithm is presented that generates a 2-dimensional distillation of flow cytometry data and then automatically extracts the subtypes and their phenotypic signatures based on the markers' distribution.
Computational exploratory analysis of high-dimensional Flow Cytometry data for diagnosis and biomarker discovery
TL;DR: This thesis presents three computational tools that once merged together provide a complete pipeline for analysis and visualization of FCM data, and demonstrates the utility of this approach in a large (n = 466), retrospective, 14-parameter PFC study of early HIV infection.
Dissertation
Machine learning for flow cytometry data analysis
Clayton Scott, Gyemin Lee
TL;DR: This thesis presents novel algorithms for fitting multivariate Gaussian mixture models to data that is truncated, censored, or truncated and censored and proposes a transfer learning technique combined with the low-density separation principle.
Uncertainty Quantification in Multivariate Mixed Models for Mass Cytometry Data
Christof Seiler, Lisa M. Kronstad, Laura J. Simpson, Mathieu Le Gars, Elena Vendrame, Catherine A. Blish, Susan Holmes
TL;DR: This article proposes two complementary models, a multivariate Poisson log-normal mixed model and a logistic linear mixed model, either of which can account for different confounders, and uses Hamiltonian Monte Carlo to provide Bayesian uncertainty quantification.
Journal Article
Clinical Outcome Prediction Using Single-Cell Data
TL;DR: A hybrid learning approach is presented for predicting clinical outcome from samples' single-cell FCM data; the method is robust, and the experimental results indicate promising performance.
References
Journal Article
Estimating the Dimension of a Model
TL;DR: In this paper, the problem of selecting one of a number of models of different dimensions is treated by finding its Bayes solution, and evaluating the leading terms of its asymptotic expansion.
Journal Article
Bioconductor: open software development for computational biology and bioinformatics
Robert Gentleman, Vincent J. Carey, Douglas M. Bates, Benjamin M. Bolstad, Marcel Dettling, Sandrine Dudoit, Byron Ellis, Laurent Gautier, Yongchao Ge, Jeff Gentry, Kurt Hornik, Torsten Hothorn, Wolfgang Huber, Stefano Maria Iacus, Rafael A. Irizarry, Friedrich Leisch, Cheng Li, Martin Maechler, A. J. Rossini, Günther Sawitzki, Colin A. Smith, Gordon K. Smyth, Luke Tierney, Jean Yang, Jianhua Zhang
TL;DR: Details of the aims and methods of Bioconductor, the collaborative creation of extensible software for computational biology and bioinformatics, and current challenges are described.
Journal Article
On Bayesian Analysis of Mixtures with an Unknown Number of Components (with discussion)
TL;DR: In this paper, a hierarchical prior model is proposed to deal with weak prior information while avoiding the mathematical pitfalls of using improper priors in the mixture context, which can be used as a basis for a thorough presentation of many aspects of the posterior distribution.
Journal Article
Assessing a mixture model for clustering with the integrated completed likelihood
TL;DR: A method is presented for assessing a mixture model in a cluster analysis setting using the integrated completed likelihood; it appears to be more robust to violations of some of the mixture model assumptions, and it can select a number of clusters leading to a sensible partitioning of the data.