Journal ArticleDOI

Merging Mixture Components for Cell Population Identification in Flow Cytometry

12 Nov 2009 - Advances in Bioinformatics (Hindawi Publishing Corporation) - Vol. 2009, pp 247646-247646
TL;DR: The cluster merging algorithm under this framework improves model fit and provides a better estimate of the number of distinct cell subpopulations than either Gaussian mixture models or flowClust, especially for complicated flow cytometry data distributions.
Abstract: We present a framework for the identification of cell subpopulations in flow cytometry data based on merging mixture components using the flowClust methodology. We show that the cluster merging algorithm under our framework improves model fit and provides a better estimate of the number of distinct cell subpopulations than either Gaussian mixture models or flowClust, especially for complicated flow cytometry data distributions. Our framework allows the automated selection of the number of distinct cell subpopulations and we are able to identify cases where the algorithm fails, thus making it suitable for application in a high throughput FCM analysis pipeline. Furthermore, we demonstrate a method for summarizing complex merged cell subpopulations in a simple manner that integrates with the existing flowClust framework and enables downstream data analysis. We demonstrate the performance of our framework on simulated and real FCM data. The software is available in the flowMerge package through the Bioconductor project.
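
The abstract does not say which criterion drives the merging, so the following is only a minimal sketch under an assumption: components are merged greedily, choosing the pair whose merge most reduces the entropy of the soft cluster assignments (a criterion used elsewhere in the mixture-merging literature). The function names and the toy posterior matrix are hypothetical and are not taken from the flowMerge package.

```python
import math

def entropy(post):
    """Total entropy of soft assignments: -sum_i sum_k t_ik * log(t_ik)."""
    return -sum(t * math.log(t) for row in post for t in row if t > 0.0)

def merge_columns(post, a, b):
    """Merge components a and b by adding their posterior probabilities."""
    merged = []
    for row in post:
        new_row = [t for k, t in enumerate(row) if k not in (a, b)]
        new_row.append(row[a] + row[b])  # merged component goes last
        merged.append(new_row)
    return merged

def best_merge(post):
    """Return (a, b, merged_posteriors) for the pair whose merge gives the lowest entropy."""
    n_components = len(post[0])
    best = None
    for a in range(n_components):
        for b in range(a + 1, n_components):
            candidate = merge_columns(post, a, b)
            e = entropy(candidate)
            if best is None or e < best[0]:
                best = (e, a, b, candidate)
    _, a, b, candidate = best
    return a, b, candidate

# Hypothetical posteriors for 4 cells and 3 fitted components (assumed values):
# components 0 and 1 overlap heavily, component 2 is well separated.
post = [
    [0.50, 0.50, 0.00],
    [0.40, 0.60, 0.00],
    [0.00, 0.00, 1.00],
    [0.05, 0.00, 0.95],
]
a, b, merged = best_merge(post)
```

Repeating `best_merge` on its own output yields a sequence of solutions with successively fewer clusters; how to pick the stopping point is a separate choice that this sketch does not address.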


Citations
Journal ArticleDOI
TL;DR: A new visualization technique is introduced, called FlowSOM, which analyzes Flow or mass cytometry data using a Self‐Organizing Map, using a two‐level clustering and star charts, to obtain a clear overview of how all markers are behaving on all cells, and to detect subsets that might be missed otherwise.
Abstract: The number of markers measured in both flow and mass cytometry keeps increasing steadily. Although this provides a wealth of information, it becomes infeasible to analyze these datasets manually. When using 2D scatter plots, the number of possible plots increases exponentially with the number of markers and therefore, relevant information that is present in the data might be missed. In this article, we introduce a new visualization technique, called FlowSOM, which analyzes Flow or mass cytometry data using a Self-Organizing Map. Using a two-level clustering and star charts, our algorithm helps to obtain a clear overview of how all markers are behaving on all cells, and to detect subsets that might be missed otherwise. R code is available at https://github.com/SofieVG/FlowSOM and will be made available at Bioconductor.

1,109 citations

Journal ArticleDOI
Andrea Cossarizza, Hyun-Dong Chang, Andreas Radbruch, Andreas Acs, +459 more; Institutions (160)
TL;DR: These guidelines are a consensus work of a considerable number of members of the immunology and flow cytometry community, providing the theory and key practical aspects of flow cytometry and enabling immunologists to avoid the common errors that often undermine immunological data.
Abstract: These guidelines are a consensus work of a considerable number of members of the immunology and flow cytometry community. They provide the theory and key practical aspects of flow cytometry enabling immunologists to avoid the common errors that often undermine immunological data. Notably, there are comprehensive sections of all major immune cell types with helpful Tables detailing phenotypes in murine and human cells. The latest flow cytometry techniques and applications are also described, featuring examples of the data that can be generated and, importantly, how the data can be analysed. Furthermore, there are sections detailing tips, tricks and pitfalls to avoid, all written and peer-reviewed by leading experts in the field, making this an essential research companion.

698 citations

Journal ArticleDOI
TL;DR: Several methods performed well as compared to manual gating or external variables using statistical performance measures, which suggests that automated methods have reached a sufficient level of maturity and accuracy for reliable use in FCM data analysis.
Abstract: Traditional methods for flow cytometry (FCM) data processing rely on subjective manual gating. Recently, several groups have developed computational methods for identifying cell populations in multidimensional FCM data. The Flow Cytometry: Critical Assessment of Population Identification Methods (FlowCAP) challenges were established to compare the performance of these methods on two tasks: (i) mammalian cell population identification, to determine whether automated algorithms can reproduce expert manual gating and (ii) sample classification, to determine whether analysis pipelines can identify characteristics that correlate with external variables (such as clinical outcome). This analysis presents the results of the first FlowCAP challenges. Several methods performed well as compared to manual gating or external variables using statistical performance measures, which suggests that automated methods have reached a sufficient level of maturity and accuracy for reliable use in FCM data analysis.

562 citations

Journal ArticleDOI
TL;DR: As of July 2017, a PubMed search for "flow cytometry immunology" yields more than 68 000 articles, the first of which is not about lymphocytes.
Abstract: The marriage between immunology and cytometry is one of the most stable and productive in the recent history of science. A rapid search in PubMed shows that, as of July 2017, using “flow cytometry immunology” as a search term yields more than 68 000 articles, the first of which, interestingly, is not about lymphocytes. It might be stated that, after a short engagement, the exchange of the wedding rings between immunology and cytometry officially occurred when the idea to link fluorochromes to monoclonal antibodies came about. After this, recognizing different types of cells became relatively easy and feasible not only by using a simple fluorescence microscope, but also by a complex and sometimes esoteric instrument, the flow cytometer that is able to count hundreds of cells in a single second, and can provide repetitive results in a tireless manner. Given this, the possibility to analyse immune phenotypes in a variety of clinical conditions has changed the use of the flow cytometer, which was incidentally invented in the late 1960s to measure cellular DNA by using intercalating dyes, such as ethidium bromide. The epidemics of HIV/AIDS in the 1980s then gave a dramatic impulse to the technology of counting specific cells, since it became clear that the quantification of the number of peripheral blood CD4+ T cells was crucial to follow the course of the infection, and eventually for monitoring the therapy. As a consequence, the development of flow cytometers that had to be easy-to-use in all clinical laboratories helped to widely disseminate this technology. Nowadays, it is rare to find an immunological paper or read a conference abstract in which the authors did not use flow cytometry as the main tool to dissect the immune system and identify its fine and complex functions. 
Of note, recent developments have created the sophisticated technology of mass cytometry, which is able to simultaneously identify dozens of molecules at the single cell level and allows us to better understand the complexity and beauty of the immune system.

454 citations

01 Jan 2017
TL;DR: It is rare to find an immunological paper or read a conference abstract in which the authors did not use flow cytometry as the main tool to dissect the immune system and identify its fine and complex functions, and recent developments have created the sophisticated technology of mass cytometry.

423 citations

References
Journal ArticleDOI
TL;DR: In this paper, the problem of selecting one of a number of models of different dimensions is treated by finding its Bayes solution, and evaluating the leading terms of its asymptotic expansion.
Abstract: The problem of selecting one of a number of models of different dimensions is treated by finding its Bayes solution, and evaluating the leading terms of its asymptotic expansion. These terms are a valid large-sample criterion beyond the Bayesian context, since they do not depend on the a priori distribution.

38,681 citations

01 Jan 2005
TL;DR: In this paper, the problem of selecting one of a number of models of different dimensions is treated by finding its Bayes solution, and evaluating the leading terms of its asymptotic expansion.
Abstract: The problem of selecting one of a number of models of different dimensions is treated by finding its Bayes solution, and evaluating the leading terms of its asymptotic expansion. These terms are a valid large-sample criterion beyond the Bayesian context, since they do not depend on the a priori distribution.

36,760 citations


"Merging Mixture Components for Cell..." refers methods in this paper

  • ...These model-based gating methods effectively amount to clustering of the data and generally employ likelihood-based measures such as the Bayesian information criterion (BIC) or Akaike information criterion (AIC) to select an appropriate model (number of clusters) from a range of possibilities [10]....

    [...]
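
As a rough illustration of the likelihood-based selection described in the excerpt above, the sketch below scores candidate models by BIC and AIC. It assumes the "larger is better" log-likelihood scale (BIC = log L - (k/2) log n, AIC = log L - k); other texts multiply both by -2 and minimize instead. The log-likelihoods and parameter counts are made-up values, not results from the paper.

```python
import math

def bic(log_lik, n_params, n_obs):
    """Schwarz criterion on the log-likelihood scale (larger is better)."""
    return log_lik - 0.5 * n_params * math.log(n_obs)

def aic(log_lik, n_params):
    """Akaike criterion on the same scale (larger is better)."""
    return log_lik - n_params

def select_by_bic(candidates, n_obs):
    """candidates maps number of clusters -> (log_lik, n_params)."""
    return max(candidates, key=lambda k: bic(*candidates[k], n_obs))

def select_by_aic(candidates):
    """Same as select_by_bic, but using the AIC penalty."""
    return max(candidates, key=lambda k: aic(*candidates[k]))

# Hypothetical fits on 1000 events (assumed numbers for illustration only).
candidates = {
    1: (-1500.0, 5),
    2: (-1300.0, 11),
    3: (-1290.0, 17),
}
chosen_bic = select_by_bic(candidates, n_obs=1000)
chosen_aic = select_by_aic(candidates)
```

With these assumed numbers the two criteria disagree: the heavier log n penalty in BIC rejects the third component, while AIC accepts it.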

Journal ArticleDOI
TL;DR: Details of the aims and methods of Bioconductor, the collaborative creation of extensible software for computational biology and bioinformatics, and current challenges are described.
Abstract: The Bioconductor project is an initiative for the collaborative creation of extensible software for computational biology and bioinformatics. The goals of the project include: fostering collaborative development and widespread use of innovative software, reducing barriers to entry into interdisciplinary scientific research, and promoting the achievement of remote reproducibility of research results. We describe details of our aims and methods, identify current challenges, compare Bioconductor to other open bioinformatics projects, and provide working examples.

12,142 citations


"Merging Mixture Components for Cell..." refers methods in this paper

  • ...We embed the cluster merging algorithm within the flowClust framework available in BioConductor [6, 14]....

    [...]

Journal ArticleDOI
TL;DR: In this paper, a hierarchical prior model is proposed to deal with weak prior information while avoiding the mathematical pitfalls of using improper priors in the mixture context, which can be used as a basis for a thorough presentation of many aspects of the posterior distribution.
Abstract: New methodology for fully Bayesian mixture analysis is developed, making use of reversible jump Markov chain Monte Carlo methods that are capable of jumping between the parameter subspaces corresponding to different numbers of components in the mixture. A sample from the full joint distribution of all unknown variables is thereby generated, and this can be used as a basis for a thorough presentation of many aspects of the posterior distribution. The methodology is applied here to the analysis of univariate normal mixtures, using a hierarchical prior model that offers an approach to dealing with weak prior information while avoiding the mathematical pitfalls of using improper priors in the mixture context.

2,018 citations

Journal ArticleDOI
TL;DR: A method for assessing mixture models in a cluster analysis setting using the integrated completed likelihood (ICL), which appears to be more robust than BIC to violation of some of the mixture model assumptions and can select a number of clusters leading to a sensible partitioning of the data.
Abstract: We propose a method for assessing mixture models in a cluster analysis setting with integrated completed likelihood. For this purpose, the observed data are assigned to unknown clusters using a maximum a posteriori operator. Then, the integrated completed likelihood (ICL) is approximated using the Bayesian information criterion (BIC). Numerical experiments on simulated and real data of the resulting ICL criterion show that it performs well both for choosing a mixture model and a relevant number of clusters. In particular, ICL appears to be more robust than BIC to violation of some of the mixture model assumptions and it can select a number of clusters leading to a sensible partitioning of the data.

1,418 citations


"Merging Mixture Components for Cell..." refers methods in this paper

  • ...An alternative measure recently proposed for model selection is the Integrated Complete Likelihood (ICL) [11]....

    [...]

  • ...BIC favors models with more mixture components in order to provide a better fit to the data distribution [11]....

    [...]
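
The relationship between ICL and BIC mentioned in the excerpts above can be sketched as follows, assuming the BIC-style approximation with maximum a posteriori (MAP) assignments described in the ICL abstract and the "larger is better" log-likelihood scale. Under those assumptions the completed log-likelihood equals the observed log-likelihood plus the sum over observations of the log posterior probability of the MAP cluster, so ICL is BIC plus a non-positive term that penalizes poorly separated clusters. The function name and inputs are hypothetical.

```python
import math

def icl_bic(log_lik, n_params, posteriors):
    """BIC-style approximation to ICL using MAP cluster assignments.

    posteriors[i][k] is the posterior probability that observation i
    belongs to component k (each row sums to 1).
    """
    n_obs = len(posteriors)
    bic = log_lik - 0.5 * n_params * math.log(n_obs)
    # log posterior of the MAP cluster for each observation (always <= 0)
    map_term = sum(math.log(max(row)) for row in posteriors)
    return bic + map_term

# Two toy posterior matrices (assumed values): crisp vs. fully ambiguous.
crisp = [[1.0, 0.0], [0.0, 1.0]]
fuzzy = [[0.5, 0.5], [0.5, 0.5]]
icl_crisp = icl_bic(-10.0, 3, crisp)
icl_fuzzy = icl_bic(-10.0, 3, fuzzy)
```

When every observation is assigned with certainty the extra term vanishes and ICL coincides with BIC; overlapping components lower the score, which is why ICL tends to select fewer components than BIC.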