Merging Mixture Components for Cell Population Identification in Flow Cytometry

doi:10.1155/2009/247646

Journal Article•DOI•

Greg Finak, Ali Bashashati, Ryan R. Brinkman, Raphael Gottardo¹•Institutions (1)

Université de Montréal¹

12 Nov 2009-Advances in Bioinformatics (Hindawi Publishing Corporation)-Vol. 2009, Article ID 247646

TL;DR: The cluster merging algorithm under this framework improves model fit and provides a better estimate of the number of distinct cell subpopulations than either Gaussian mixture models or flowClust, especially for complicated flow cytometry data distributions.

Abstract: We present a framework for the identification of cell subpopulations in flow cytometry data based on merging mixture components using the flowClust methodology. We show that the cluster merging algorithm under our framework improves model fit and provides a better estimate of the number of distinct cell subpopulations than either Gaussian mixture models or flowClust, especially for complicated flow cytometry data distributions. Our framework allows the automated selection of the number of distinct cell subpopulations and we are able to identify cases where the algorithm fails, thus making it suitable for application in a high throughput FCM analysis pipeline. Furthermore, we demonstrate a method for summarizing complex merged cell subpopulations in a simple manner that integrates with the existing flowClust framework and enables downstream data analysis. We demonstrate the performance of our framework on simulated and real FCM data. The software is available in the flowMerge package through the Bioconductor project.
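
The abstract does not state the criterion used to decide which mixture components to merge. As an illustration only, the sketch below assumes a classification-entropy criterion: starting from soft cluster memberships produced by some fitted mixture model, it repeatedly pools the pair of components whose merge lowers the entropy the most. All function names and the toy membership matrix are hypothetical, and this is not the flowMerge implementation.

```python
import math
from itertools import combinations


def entropy(z):
    """Classification entropy of a soft assignment matrix (each row sums to 1)."""
    return -sum(p * math.log(p) for row in z for p in row if p > 0.0)


def merge_columns(z, j, k):
    """Return a new matrix in which components j and k are pooled into one."""
    merged = []
    for row in z:
        kept = [p for idx, p in enumerate(row) if idx not in (j, k)]
        merged.append(kept + [row[j] + row[k]])
    return merged


def merge_once(z):
    """Merge the pair of components that gives the lowest entropy afterwards."""
    n_components = len(z[0])
    best = min(
        combinations(range(n_components), 2),
        key=lambda pair: entropy(merge_columns(z, *pair)),
    )
    return merge_columns(z, *best), best


def merge_path(z):
    """Merge down to one component, recording (n_components, entropy) per step."""
    path = [(len(z[0]), entropy(z))]
    while len(z[0]) > 1:
        z, _ = merge_once(z)
        path.append((len(z[0]), entropy(z)))
    return path


# Toy memberships (made up): components 0 and 1 overlap, component 2 is distinct.
z = [
    [0.50, 0.48, 0.02],
    [0.45, 0.53, 0.02],
    [0.01, 0.02, 0.97],
    [0.02, 0.01, 0.97],
]
_, best_pair = merge_once(z)
path = merge_path(z)
print(best_pair)
for n_components, ent in path:
    print(n_components, round(ent, 3))
```

On this toy input the two overlapping components are pooled first. Choosing where to stop along the merge path (the number of distinct subpopulations) is a separate step that this sketch leaves out.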

Citations

Journal Article•DOI•

FlowSOM: Using self-organizing maps for visualization and interpretation of cytometry data.

Sofie Van Gassen¹,², Britt Callebaut¹, Mary J. van Helden², Bart N. Lambrecht², Piet Demeester¹, Tom Dhaene¹, Yvan Saeys²•Institutions (2)

Ghent University¹, Ghent University Hospital²

01 Jul 2015-Cytometry Part A

TL;DR: A new visualization technique, FlowSOM, analyzes flow or mass cytometry data with a Self-Organizing Map; two-level clustering and star charts give a clear overview of how all markers behave on all cells and help detect subsets that might otherwise be missed.

Abstract: The number of markers measured in both flow and mass cytometry keeps increasing steadily. Although this provides a wealth of information, it becomes infeasible to analyze these datasets manually. When using 2D scatter plots, the number of possible plots increases exponentially with the number of markers and therefore, relevant information that is present in the data might be missed. In this article, we introduce a new visualization technique, called FlowSOM, which analyzes Flow or mass cytometry data using a Self-Organizing Map. Using a two-level clustering and star charts, our algorithm helps to obtain a clear overview of how all markers are behaving on all cells, and to detect subsets that might be missed otherwise. R code is available at https://github.com/SofieVG/FlowSOM and will be made available at Bioconductor.

1,109 citations

Journal Article•DOI•

Guidelines for the use of flow cytometry and cell sorting in immunological studies (second edition)

Andrea Cossarizza¹, Hyun-Dong Chang, Andreas Radbruch, Andreas Acs² +459 more•Institutions (160)

01 Oct 2019-European Journal of Immunology

TL;DR: These guidelines are a consensus work of a considerable number of members of the immunology and flow cytometry community, providing the theory and key practical aspects of flow cytometry and enabling immunologists to avoid the common errors that often undermine immunological data.

Abstract: These guidelines are a consensus work of a considerable number of members of the immunology and flow cytometry community. They provide the theory and key practical aspects of flow cytometry enabling immunologists to avoid the common errors that often undermine immunological data. Notably, there are comprehensive sections of all major immune cell types with helpful Tables detailing phenotypes in murine and human cells. The latest flow cytometry techniques and applications are also described, featuring examples of the data that can be generated and, importantly, how the data can be analysed. Furthermore, there are sections detailing tips, tricks and pitfalls to avoid, all written and peer-reviewed by leading experts in the field, making this an essential research companion.

698 citations

Journal Article•DOI•

Critical assessment of automated flow cytometry data analysis techniques

Nima Aghaeepour¹, Greg Finak², Holger H. Hoos³, Tim R. Mosmann⁴, Ryan R. Brinkman³, Raphael Gottardo², Richard H. Scheuermann⁵•Institutions (5)

BC Cancer Agency¹, Fred Hutchinson Cancer Research Center², University of British Columbia³, University of Rochester⁴, J. Craig Venter Institute⁵

01 Mar 2013-Nature Methods

TL;DR: Several methods performed well as compared to manual gating or external variables using statistical performance measures, which suggests that automated methods have reached a sufficient level of maturity and accuracy for reliable use in FCM data analysis.

Abstract: Traditional methods for flow cytometry (FCM) data processing rely on subjective manual gating. Recently, several groups have developed computational methods for identifying cell populations in multidimensional FCM data. The Flow Cytometry: Critical Assessment of Population Identification Methods (FlowCAP) challenges were established to compare the performance of these methods on two tasks: (i) mammalian cell population identification, to determine whether automated algorithms can reproduce expert manual gating and (ii) sample classification, to determine whether analysis pipelines can identify characteristics that correlate with external variables (such as clinical outcome). This analysis presents the results of the first FlowCAP challenges. Several methods performed well as compared to manual gating or external variables using statistical performance measures, which suggests that automated methods have reached a sufficient level of maturity and accuracy for reliable use in FCM data analysis.

562 citations

Journal Article•DOI•

Guidelines for the use of flow cytometry and cell sorting in immunological studies

Andrea Cossarizza¹, Hyun-Dong Chang, Andreas Radbruch, Mübeccel Akdis² +243 more•Institutions (111)

01 Oct 2017-European Journal of Immunology

TL;DR: As of July 2017, a PubMed search for "flow cytometry immunology" yields more than 68 000 articles, the first of which is not about lymphocytes.

Abstract: The marriage between immunology and cytometry is one of the most stable and productive in the recent history of science. A rapid search in PubMed shows that, as of July 2017, using “flow cytometry immunology” as a search term yields more than 68 000 articles, the first of which, interestingly, is not about lymphocytes. It might be stated that, after a short engagement, the exchange of the wedding rings between immunology and cytometry officially occurred when the idea to link fluorochromes to monoclonal antibodies came about. After this, recognizing different types of cells became relatively easy and feasible not only by using a simple fluorescence microscope, but also by a complex and sometimes esoteric instrument, the flow cytometer that is able to count hundreds of cells in a single second, and can provide repetitive results in a tireless manner. Given this, the possibility to analyse immune phenotypes in a variety of clinical conditions has changed the use of the flow cytometer, which was incidentally invented in the late 1960s to measure cellular DNA by using intercalating dyes, such as ethidium bromide. The epidemics of HIV/AIDS in the 1980s then gave a dramatic impulse to the technology of counting specific cells, since it became clear that the quantification of the number of peripheral blood CD4+ T cells was crucial to follow the course of the infection, and eventually for monitoring the therapy. As a consequence, the development of flow cytometers that had to be easy-to-use in all clinical laboratories helped to widely disseminate this technology. Nowadays, it is rare to find an immunological paper or read a conference abstract in which the authors did not use flow cytometry as the main tool to dissect the immune system and identify its fine and complex functions. 
Of note, recent developments have created the sophisticated technology of mass cytometry, which is able to simultaneously identify dozens of molecules at the single cell level and allows us to better understand the complexity and beauty of the immune system.

454 citations

Guidelines for the use of flow cytometry and cell sorting in immunological studies

Andrea Cossarizza, Yvonne Samstag

01 Jan 2017

TL;DR: It is rare to find an immunological paper or read a conference abstract in which the authors did not use flow cytometry as the main tool to dissect the immune system and identify its fine and complex functions, and recent developments have created the sophisticated technology of mass cytometry.

423 citations

References

Journal Article•DOI•

Estimating the Dimension of a Model

Gideon Schwarz

01 Mar 1978-Annals of Statistics

TL;DR: In this paper, the problem of selecting one of a number of models of different dimensions is treated by finding its Bayes solution, and evaluating the leading terms of its asymptotic expansion.

Abstract: The problem of selecting one of a number of models of different dimensions is treated by finding its Bayes solution, and evaluating the leading terms of its asymptotic expansion. These terms are a valid large-sample criterion beyond the Bayesian context, since they do not depend on the a priori distribution.

38,681 citations

Estimating the dimension of a model

Gideon Schwarz

01 Jan 2005

36,760 citations

"Merging Mixture Components for Cell..." refers methods in this paper

...These model-based gating methods effectively amount to clustering of the data and generally employ likelihood-based measures such as the Bayesian information criterion (BIC) or Akaike information criterion (AIC) to select an appropriate model (number of clusters) from a range of possibilities [10]....
[...]
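
The excerpt above describes choosing the number of clusters by scoring a range of candidate models with BIC or AIC. A minimal sketch of that selection step follows, assuming the common "lower is better" convention (BIC = k ln n - 2 ln L, AIC = 2k - 2 ln L); some packages report the negated quantity instead. The function names and the log-likelihood values are made up for illustration and do not come from the paper.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion (lower is better in this convention)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion (lower is better in this convention)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


def select_model(fits, n_obs, criterion="bic"):
    """Pick the cluster count with the lowest score.

    fits maps number of clusters -> (maximized log-likelihood, free parameters).
    """
    scores = {}
    for k, (ll, n_params) in fits.items():
        scores[k] = bic(ll, n_params, n_obs) if criterion == "bic" else aic(ll, n_params)
    return min(scores, key=scores.get), scores


# Hypothetical fits for 1 to 4 clusters on 1000 events (illustrative numbers only).
fits = {1: (-5200.0, 5), 2: (-4700.0, 11), 3: (-4650.0, 17), 4: (-4640.0, 23)}
best_bic, _ = select_model(fits, n_obs=1000, criterion="bic")
best_aic, _ = select_model(fits, n_obs=1000, criterion="aic")
print(best_bic, best_aic)
```

With these toy numbers the two criteria disagree: BIC penalizes each extra parameter by ln(1000), roughly 6.9, whereas AIC penalizes it by 2, so AIC accepts the small likelihood gain of a fourth cluster and BIC does not.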

Journal Article•DOI•

Bioconductor: open software development for computational biology and bioinformatics

Robert Gentleman¹, Vincent J. Carey², Douglas M. Bates³, Benjamin M. Bolstad⁴, Marcel Dettling, Sandrine Dudoit⁴, Byron Ellis¹, Laurent Gautier⁵, Yongchao Ge⁶, Jeff Gentry¹, Kurt Hornik⁷, Torsten Hothorn⁸, Wolfgang Huber⁹, Stefano Maria Iacus¹⁰, Rafael A. Irizarry¹¹, Friedrich Leisch⁷, Cheng Li¹, Martin Maechler, A. J. Rossini¹², Günther Sawitzki, Colin A. Smith¹³, Gordon K. Smyth¹⁴, Luke Tierney¹⁵, Jean Yang, Jianhua Zhang¹•Institutions (15)

Harvard University¹, Brigham and Women's Hospital², University of Wisconsin-Madison³, University of California, Berkeley⁴, Technical University of Denmark⁵, Icahn School of Medicine at Mount Sinai⁶, Vienna University of Technology⁷, University of Erlangen-Nuremberg⁸, German Cancer Research Center⁹, University of Milan¹⁰, Johns Hopkins University¹¹, University of Washington¹², Scripps Research Institute¹³, Walter and Eliza Hall Institute of Medical Research¹⁴, University of Iowa¹⁵

15 Sep 2004-Genome Biology

TL;DR: Details of the aims and methods of Bioconductor, the collaborative creation of extensible software for computational biology and bioinformatics, and current challenges are described.

Abstract: The Bioconductor project is an initiative for the collaborative creation of extensible software for computational biology and bioinformatics. The goals of the project include: fostering collaborative development and widespread use of innovative software, reducing barriers to entry into interdisciplinary scientific research, and promoting the achievement of remote reproducibility of research results. We describe details of our aims and methods, identify current challenges, compare Bioconductor to other open bioinformatics projects, and provide working examples.

12,142 citations

"Merging Mixture Components for Cell..." refers methods in this paper

...We embed the cluster merging algorithm within the flowClust framework available in BioConductor [6, 14]....
[...]

Journal Article•DOI•

On Bayesian Analysis of Mixtures with an Unknown Number of Components (with discussion)

Sylvia Richardson¹, Peter H.R. Green²•Institutions (2)

French Institute of Health and Medical Research¹, University of Bristol²

01 Jan 1997-Journal of The Royal Statistical Society Series B-statistical Methodology

TL;DR: In this paper, a hierarchical prior model is proposed to deal with weak prior information while avoiding the mathematical pitfalls of using improper priors in the mixture context, which can be used as a basis for a thorough presentation of many aspects of the posterior distribution.

Abstract: New methodology for fully Bayesian mixture analysis is developed, making use of reversible jump Markov chain Monte Carlo methods that are capable of jumping between the parameter subspaces corresponding to different numbers of components in the mixture. A sample from the full joint distribution of all unknown variables is thereby generated, and this can be used as a basis for a thorough presentation of many aspects of the posterior distribution. The methodology is applied here to the analysis of univariate normal mixtures, using a hierarchical prior model that offers an approach to dealing with weak prior information while avoiding the mathematical pitfalls of using improper priors in the mixture context.

2,018 citations

Journal Article•DOI•

Assessing a mixture model for clustering with the integrated completed likelihood

Christophe Biernacki, Gilles Celeux¹, Gérard Govaert²•Institutions (2)

French Institute for Research in Computer Science and Automation¹, University of Technology of Compiègne²

01 Jul 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A method for assessing mixture models in a cluster analysis setting using the integrated completed likelihood (ICL), which appears to be more robust than BIC to violation of some of the mixture model assumptions and can select a number of clusters leading to a sensible partitioning of the data.

Abstract: We propose an assessing method of mixture model in a cluster analysis setting with integrated completed likelihood. For this purpose, the observed data are assigned to unknown clusters using a maximum a posteriori operator. Then, the integrated completed likelihood (ICL) is approximated using the Bayesian information criterion (BIC). Numerical experiments on simulated and real data of the resulting ICL criterion show that it performs well both for choosing a mixture model and a relevant number of clusters. In particular, ICL appears to be more robust than BIC to violation of some of the mixture model assumptions and it can select a number of clusters leading to a sensible partitioning of the data.
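
The recipe in this abstract (assign each observation to its maximum a posteriori cluster, then apply a BIC-style approximation to the completed likelihood) can be sketched as follows. For a mixture, the completed log-likelihood under MAP assignment equals the observed log-likelihood plus the sum over observations of the log of the largest posterior membership probability, so the ICL is the BIC plus an extra penalty that grows as clusters overlap. The sketch assumes the "lower is better" sign convention; the function names and the toy posteriors are hypothetical.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """BIC in the lower-is-better convention: k ln n - 2 ln L."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def icl(log_likelihood, n_params, posteriors):
    """BIC-style approximation to the ICL using MAP cluster assignments.

    posteriors is a list of rows; row i holds the posterior membership
    probabilities of observation i over the mixture components.
    """
    n_obs = len(posteriors)
    # log of the MAP membership probability, summed over observations (<= 0)
    map_term = sum(math.log(max(row)) for row in posteriors)
    return bic(log_likelihood, n_params, n_obs) - 2.0 * map_term


# Toy posteriors (made up): well-separated versus heavily overlapping clusters.
crisp = [[0.99, 0.01], [0.98, 0.02], [0.01, 0.99], [0.03, 0.97]]
fuzzy = [[0.55, 0.45], [0.60, 0.40], [0.45, 0.55], [0.50, 0.50]]

ll, n_params = -10.0, 5  # hypothetical fit summary shared by both cases
print(icl(ll, n_params, crisp) - bic(ll, n_params, len(crisp)))
print(icl(ll, n_params, fuzzy) - bic(ll, n_params, len(fuzzy)))
```

The extra penalty is close to zero when assignments are nearly certain and large when components overlap, which is why ICL tends to select fewer, better-separated clusters than BIC.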

1,418 citations

"Merging Mixture Components for Cell..." refers methods in this paper

...An alternative measure recently proposed for model selection is the Integrated Complete Likelihood (ICL) [11]....
[...]
...BIC favors models with more mixture components in order to provide a better fit to the data distribution [11]....
[...]