Journal ArticleDOI

flowCore: a Bioconductor package for high throughput flow cytometry.

TL;DR: flowCore, an R package of flexible open source computational tools, constitutes a shared and extensible research platform that enables collaboration between bioinformaticians, computer scientists, statisticians, biologists and clinicians, and is intended to foster the development of novel analytic methods for flow cytometry.
Abstract: Background: Recent advances in automation technologies have enabled the use of flow cytometry for high throughput screening, generating large, complex data sets, often in clinical trials or drug discovery settings. However, data management and data analysis methods have not advanced sufficiently far from the initial small-scale studies to support modeling in the presence of multiple covariates. Results: We developed a set of flexible open source computational tools in the R package flowCore to facilitate the analysis of these complex data. A key component is a set of suitable data structures that support the application of similar operations to a collection of samples or a clinical cohort. In addition, our software constitutes a shared and extensible research platform that enables collaboration between bioinformaticians, computer scientists, statisticians, biologists and clinicians. This platform will foster the development of novel analytic methods for flow cytometry. Conclusion: The software has been applied in the analysis of various data sets, and its data structures have proven to be highly efficient in capturing and organizing the analytic work flow. Finally, a number of additional Bioconductor packages successfully build on the infrastructure provided by flowCore, opening new avenues for flow data analysis.
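
The idea of a data structure that applies the same operation to every sample in a collection can be sketched roughly as follows. This is an illustrative Python sketch only, not flowCore's R API: the names `SampleSet`, `apply`, and `log10_transform`, the channel labels, and the toy values are all hypothetical.

```python
import math
from typing import Callable, Dict, List

# One row per cell (event), one column per measured channel.
Events = List[List[float]]


class SampleSet:
    """Hypothetical container for named samples that share the same channels."""

    def __init__(self, channels: List[str], samples: Dict[str, Events]):
        # Reject any sample whose rows do not match the channel list.
        for name, events in samples.items():
            for row in events:
                if len(row) != len(channels):
                    raise ValueError(f"sample {name!r} has a row with the wrong width")
        self.channels = channels
        self.samples = samples

    def apply(self, func: Callable[[Events], Events]) -> "SampleSet":
        """Return a new SampleSet with func applied to every sample."""
        return SampleSet(
            self.channels,
            {name: func(events) for name, events in self.samples.items()},
        )


def log10_transform(events: Events) -> Events:
    """Log-transform every measurement (assumes strictly positive values)."""
    return [[math.log10(value) for value in row] for row in events]


# Toy cohort with two samples and two channels (made-up numbers).
cohort = SampleSet(
    ["FSC", "SSC"],
    {"patient_1": [[100.0, 10.0]], "patient_2": [[1000.0, 1.0]]},
)
transformed = cohort.apply(log10_transform)
```

Returning a new container from `apply` keeps the original cohort untouched, so several processing steps can be chained or compared without reloading the data.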


Citations
Journal ArticleDOI
TL;DR: Several methods performed well as compared to manual gating or external variables using statistical performance measures, which suggests that automated methods have reached a sufficient level of maturity and accuracy for reliable use in FCM data analysis.
Abstract: Traditional methods for flow cytometry (FCM) data processing rely on subjective manual gating. Recently, several groups have developed computational methods for identifying cell populations in multidimensional FCM data. The Flow Cytometry: Critical Assessment of Population Identification Methods (FlowCAP) challenges were established to compare the performance of these methods on two tasks: (i) mammalian cell population identification, to determine whether automated algorithms can reproduce expert manual gating and (ii) sample classification, to determine whether analysis pipelines can identify characteristics that correlate with external variables (such as clinical outcome). This analysis presents the results of the first FlowCAP challenges. Several methods performed well as compared to manual gating or external variables using statistical performance measures, which suggests that automated methods have reached a sufficient level of maturity and accuracy for reliable use in FCM data analysis.

562 citations

Journal ArticleDOI
TL;DR: Figure layouts created on Cytobank are designed to allow transparent access to the underlying experiment annotation and data processing steps, and can be viewed or edited by anyone with the proper permission, from any computer with Internet access.
Abstract: Cytobank is a Web-based application for storage, analysis, and sharing of flow cytometry experiments. Researchers use a Web browser to log in and use a wide range of tools developed for basic and advanced flow cytometry. In addition to providing access to standard cytometry tools from any computer, Cytobank creates a platform and community for developing new analysis and publication tools. Figure layouts created on Cytobank are designed to allow transparent access to the underlying experiment annotation and data processing steps. Since all flow cytometry files and analysis data are stored on a central server, experiments and figures can be viewed or edited by anyone with the proper permission, from any computer with Internet access. Once a primary researcher has performed the initial analysis of the data, collaborators can engage in experiment analysis and make their own figure layouts using the gated, compensated experiment files. Cytobank is available to the scientific community at http://www.cytobank.org.

465 citations

Journal ArticleDOI
TL;DR: This Review provides non-experts with a broad and practical overview of the many recent developments in computational flow cytometry.
Abstract: Recent advances in flow cytometry allow scientists to measure an increasing number of parameters per cell, generating huge and high-dimensional datasets. To analyse, visualize and interpret these data, newly available computational techniques should be adopted, evaluated and improved upon by the immunological community. Computational flow cytometry is emerging as an important new field at the intersection of immunology and computational biology; it allows new biological knowledge to be extracted from high-throughput single-cell data. This Review provides non-experts with a broad and practical overview of the many recent developments in computational flow cytometry.

393 citations

Journal ArticleDOI
TL;DR: The phenotypic signature of huMG was identified, which was distinct from peripheral myeloid cells but was comparable to fresh huMG, and microglia regional heterogeneity was identified.
Abstract: Microglia, the specialized innate immune cells of the CNS, play crucial roles in neural development and function. Different phenotypes and functions have been ascribed to rodent microglia, but little is known about human microglia (huMG) heterogeneity. Difficulties in procuring huMG and their susceptibility to cryopreservation damage have limited large-scale studies. Here we applied multiplexed mass cytometry for a comprehensive characterization of postmortem huMG (10^3 - 10^4 cells). We determined expression levels of 57 markers on huMG isolated from up to five different brain regions of nine donors. We identified the phenotypic signature of huMG, which was distinct from peripheral myeloid cells but was comparable to fresh huMG. We detected microglia regional heterogeneity using a hybrid workflow combining Cytobank and R/Bioconductor for multidimensional data analysis. Together, these methodologies allowed us to perform high-dimensional, large-scale immunophenotyping of huMG at the single-cell level, which facilitates their unambiguous profiling in health and disease.

262 citations

References
Journal ArticleDOI
TL;DR: Details of the aims and methods of Bioconductor, the collaborative creation of extensible software for computational biology and bioinformatics, and current challenges are described.
Abstract: The Bioconductor project is an initiative for the collaborative creation of extensible software for computational biology and bioinformatics. The goals of the project include: fostering collaborative development and widespread use of innovative software, reducing barriers to entry into interdisciplinary scientific research, and promoting the achievement of remote reproducibility of research results. We describe details of our aims and methods, identify current challenges, compare Bioconductor to other open bioinformatics projects, and provide working examples.

12,142 citations


"flowCore: a Bioconductor package fo..." refers methods in this paper

  • ...The R software environment is freely available at http://www.r-project.org. flowCore and its dependencies (flowQ, flowViz) are available on the Bioconductor project website http://bioconductor.org as freely distributed and open source software packages with an Artistic license....

    [...]

  • ...In addition to the flowCore package that offers basic infrastructure, we have implemented a range of additional Bioconductor packages that are dedicated to more specific tasks of FCM data analysis....

    [...]

  • ...The computational tools we have developed are distributed in the R software language [6] as the Bioconductor [7] package flowCore....

    [...]

  • ...They are fully integrated into the R/Bioconductor environment for statistical computing and bioinformatics and run on operating systems Windows, Mac OS X, and Unix....

    [...]

  • ...The flowCore package and its associated packages are part of the R/Bioconductor project, an environment for statistical computing and bioinformatics....

    [...]

Book
12 Mar 2008
TL;DR: This book describes Lattice, a powerful and elegant high level data visualization system that is sufficient for most everyday graphics needs, yet flexible enough to be easily extended to handle demands of cutting edge research.
Abstract: R is rapidly growing in popularity as the environment of choice for data analysis and graphics both in academia and industry. Lattice brings the proven design of Trellis graphics (originally developed for S by William S. Cleveland and colleagues at Bell Labs) to R, considerably expanding its capabilities in the process. Lattice is a powerful and elegant high level data visualization system that is sufficient for most everyday graphics needs, yet flexible enough to be easily extended to handle demands of cutting edge research. Written by the author of the lattice system, this book describes it in considerable depth, beginning with the essentials and systematically delving into specific low-level details as necessary. No prior experience with lattice is required to read the book, although basic familiarity with R is assumed. The book contains close to 150 figures produced with lattice. Many of the examples emphasize principles of good graphical design; almost all use real data sets that are publicly available in various R packages. All code and figures in the book are also available online, along with supplementary material covering more advanced topics.

1,093 citations

Journal ArticleDOI

931 citations


"flowCore: a Bioconductor package fo..." refers methods in this paper

  • ...As exemplified in the previous section, the flowViz package [12] provides sophisticated data visualization tools that make use of multivariate trellis plotting [15]....

    [...]

Journal ArticleDOI
TL;DR: An account of points of consensus and discord, including the relative heterogeneity of T cell subpopulations during infections with distinct pathogens, the relationship between phenotypic and functional T cell attributes, and the pathway(s) of T cell differentiation, is provided.
Abstract: In recent years, a tremendous effort has been devoted to the detailed characterization of the phenotype and function of distinct T cell subpopulations in humans, as well as to their pathway(s) of differentiation and role in immune responses. But these studies seem to have generated more questions than definitive answers. To clarify issues related to the function and differentiation of T cell subsets, one session of the MASIR 2008 conference was dedicated to this topic. Several points of consensus and discord were highlighted in the work presented during this session. We provide here an account of these points, including the relative heterogeneity of T cell subpopulations during infections with distinct pathogens, the relationship between phenotypic and functional T cell attributes, and the pathway(s) of T cell differentiation. Finally, we discuss the problems which still limit general agreement.

711 citations

Journal ArticleDOI
TL;DR: A flexible statistical model‐based clustering approach for identifying cell populations in flow cytometry data based on t‐mixture models with a Box–Cox transformation, which generalizes the popular Gaussian mixture models to account for outliers and allow for nonelliptical clusters.
Abstract: The capability of flow cytometry to offer rapid quantification of multidimensional characteristics for millions of cells has made this technology indispensable for health research, medical diagnosis, and treatment. However, the lack of statistical and bioinformatics tools to parallel recent high-throughput technological advancements has hindered this technology from reaching its full potential. We propose a flexible statistical model-based clustering approach for identifying cell populations in flow cytometry data based on t-mixture models with a Box–Cox transformation. This approach generalizes the popular Gaussian mixture models to account for outliers and allow for nonelliptical clusters. We describe an Expectation-Maximization (EM) algorithm to simultaneously handle parameter estimation and transformation selection. Using two publicly available datasets, we demonstrate that our proposed methodology provides enough flexibility and robustness to mimic manual gating results performed by an expert researcher. In addition, we present results from a simulation study, which show that this new clustering framework gives better results in terms of robustness to model misspecification and estimation of the number of clusters, compared to the popular mixture models. The proposed clustering methodology is well adapted to automated analysis of flow cytometry data. It tends to give more reproducible results, and helps reduce the significant subjectivity and human time cost encountered in manual gating analysis. © 2008 International Society for Analytical Cytology
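
As a hedged illustration of the Box-Cox transformation mentioned in the abstract above, the following Python sketch implements the standard textbook form, (x^lambda - 1) / lambda, with the natural log as the limit at lambda = 0. It is not the flowClust implementation, which may use a different variant and which selects the transformation parameter inside the EM algorithm; the function name `box_cox` and the sample values are assumptions made for illustration.

```python
import math
from typing import List


def box_cox(values: List[float], lam: float) -> List[float]:
    """Apply the textbook Box-Cox transform to strictly positive values.

    lam == 0 uses the natural log (the limiting case);
    otherwise each value x maps to (x**lam - 1) / lam.
    """
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


# Made-up, right-skewed intensities; lam = 0.5 is a square-root-like transform
# that compresses large values more than small ones.
intensities = [1.0, 4.0, 100.0]
print(box_cox(intensities, 0.5))
print(box_cox(intensities, 0.0))
```

A parameter below 1 pulls in the long right tail, which is one reason a transformation step can make clusters closer to elliptical before a mixture model is fitted.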

281 citations


"flowCore: a Bioconductor package fo..." refers background or methods in this paper

  • ...Data transformation is essential for both data visualization and modeling [11]....

    [...]

  • ...This design allows for the straightforward extension of flowCore's capabilities, and has already fostered the development of a number of valuable add-ons [11,12]....

    [...]

  • ...More recently, [11] have developed an automatic gating approach via robust model-based clustering using flowCore's data model and infrastructure which is implemented in the Bioconductor package flowClust....

    [...]

  • ...Automated or data-driven gating has the potential to estimate the gating regions from the underlying data, thus providing a fast objective solution to the analysis of potentially very large and diverse data sets [11]....

    [...]
