A rule-based data-informed cellular consensus map of the human mononuclear phagocyte cell space

03 Jun 2019-bioRxiv (Cold Spring Harbor Laboratory)-pp 658179

TL;DR: A rule-based data-informed approach to build next generation cellular consensus maps, using the human dendritic-cell and monocyte compartment in peripheral blood as an example, and providing a generalizable method for building consensus maps for the life sciences.

Abstract: Single-cell genomic techniques are opening new avenues to understand the basic units of life. Large international efforts, such as those to derive a Human Cell Atlas, are driving progress in this area; here, cellular map generation is key. To expedite the inevitable iterations of these underlying maps, we have developed a rule-based data-informed approach to build next generation cellular consensus maps. Using the human dendritic-cell and monocyte compartment in peripheral blood as an example, we performed computational integration of previous, partially overlapping maps using an approach we termed ‘backmapping’, combined with multi-color flow-cytometry and index sorting-based single-cell RNA-sequencing. Our general strategy can be applied to any atlas generation for humans and other species.

Graphical Highlights

  • Defining a consensus of the human myeloid cell compartment in peripheral blood
  • 3 monocyte subsets, pDC, cDC1, DC2, DC3 and precursor DC make up the compartment
  • Distinguish myeloid cell compartment from other cell spaces, e.g. the NK cell space
  • Providing a generalizable method for building consensus maps for the life sciences

Summary

Introduction

  • Such single-cell technologies allow for a fully data-driven analysis to establish cell maps of an organism, such as those proposed by the Human Cell Atlas consortium (Rozenblatt-Rosen et al., 2017).
  • Reliable consensus maps are a prerequisite to reconcile conflicting data that might have been generated based on different data generating approaches (Edney, 2019; Monmonier, 2015).
  • To establish a consensus map of the human mononuclear myeloid cell compartment, the authors integrate prior knowledge by defining a priori criteria for the cellular compartment under study, which increases resolution and allows a consensus map to be built.

Results

  • Integrated phenotypic characterization of the myeloid cell compartment in human peripheral blood.
  • The copyright holder for this preprint (which was not certified by peer review; this version posted June 3, 2019) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
  • To integrate the identified DC subsets in map 1 and map 2 with each other, the authors computed a UMAP topology from the original map 1 single-cell transcriptome data comprising the DC cell space and overlaid the signatures of the map 2 DC subsets (pDC, cDC1, cDC2, pre-DC).
  • This analysis showed that if the totality of the Lin-CD16+ compartment is mapped back onto the Lin- UMAP topology, NK cells (CD56+), monocytes (CD56-CD16+/-) and granulocyte fractions (CD16high) are included in this cellular compartment.

Discussion

  • Consensus maps are an important instrument within an iterative process of producing cellular maps of all organs and tissues in different species, including humans.
  • Because the authors propose to include prior knowledge in the respective scientific field into the algorithm for generating such consensus maps, they define the overall strategy as being ‘data-informed’, combining prior knowledge and data-driven technologies including single-cell omics.
  • [...] providing the next iteration of this particular subspace in the myeloid cell map of human peripheral blood.

Acknowledgments:

  • The authors thank Jessica Tamanini for critical review and editing of the manuscript.
  • This work was supported by the German Research Foundation to JLS (GRK 2168, INST 217/577-1, EXC2151/1), by the HGF grant sparse2big to JLS, the FASTGenomics grant of the German Federal Ministry for Economic Affairs and Energy to JLS and the EU project SYSCID under grant number 733100.
  • F.G. is an EMBO YIP awardee and is supported by Singapore Immunology Network (SIgN) and Shanghai Institute of Immunology core funding.
  • The authors declare that there are no competing interests.

Figure Legends

  • Generating a new consensus map of the mononuclear myeloid cell compartment in human peripheral blood.
  • (B) Visualization of ~1.4 million live CD45+Lin(CD3, CD19, CD20, CD56)- cells after UMAP dimensionality reduction of the flow cytometry panel introduced in A (left panel), mononuclear myeloid cell compartment (second panel), overlay of index-sorted cells (third panel), UMAP topology of the index-sorted cells based on the single-cell transcriptome data.
  • (B) Heatmap of 10 most significant marker genes for each of the 11 clusters identified and visualized in Figure 2A.
  • (G) UMAP topology of scRNA-seq data derived from the map1 DC and mono subsets (left panel) and overlay of the NK cell signature onto this UMAP topology.

Tables S1:

  • Cell types classified in the respective studies.
  • Data Table S1 (Data Table S1.csv): Gene signatures of the 11 clusters identified in their new scRNA-seq consensus map.

Data Table S2:

  • Cell types classified in the respective studies.

EXPERIMENTAL MODEL AND SUBJECT DETAILS

  • Peripheral blood mononuclear cells (PBMC): Buffy coats or venipuncture blood were obtained from healthy donors (University hospital Bonn, local ethics vote 203/09) after written consent was given according to the Declaration of Helsinki.
  • Peripheral blood mononuclear cells (PBMC) were isolated by Pancoll (PAN-Biotech) density centrifugation from buffy coats.

METHOD DETAILS

  • Whole blood or buffy coat was diluted in room temperature PBS (1:2 or 1:5, respectively) and layered onto polysucrose solution (Pancoll; PAN Biotech, Germany) for the enrichment of mononuclear cells by density gradient centrifugation according to the manufacturer's instructions.
  • Washed cells were incubated with L/D Marker DRAQ7 (BioLegend, USA) for 5 min at room temperature before acquisition and sorting of the cells using a BD FACSAria III (BD Biosciences, USA).
  • The authors' new index-sorted single-cell transcriptome dataset was based on the Smart-Seq2 protocol (Picelli et al., 2013).
  • cDNA was diluted to an average of 200 pg/µl, and 100 pg cDNA from each cell was tagmented by adding 1 µl TD and 0.5 µl ATM from a Nextera XT DNA Library Preparation Kit to 0.5 µl diluted cDNA in each well of a fresh 384-well plate.

Cytospin preparation and May-Grünwald/Giemsa staining

  • Cell populations of interest were sorted into 1.5 ml reaction tubes containing 200 µl FACS-buffer using a BD FACSAria III (BD Biosciences, USA).
  • Whole blood was diluted in room temperature PBS (1:2) and layered onto polysucrose solution (Pancoll; PAN Biotech, Germany) for the enrichment of mononuclear cells by density gradient centrifugation according to the manufacturer's instructions.
  • Sequenced single-cell data was demultiplexed using bcl2fastq2 v2.20.
  • Based on the pseudoalignment estimated by Kallisto, transcript levels were quantified as transcripts per million reads (TPM).
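
The TPM quantification mentioned above can be illustrated with a minimal Python sketch. It uses the standard TPM definition (length-normalized counts rescaled to sum to one million); the function name `tpm` and the toy input values are hypothetical and are not taken from the authors' pipeline, which relied on kallisto's own output.

```python
def tpm(est_counts, eff_lengths):
    """Convert estimated counts to transcripts per million (TPM).

    est_counts:  estimated read counts per transcript
    eff_lengths: effective transcript lengths, in the same order
    """
    # Length-normalized rate per transcript; guard against zero lengths.
    rates = [c / l if l > 0 else 0.0 for c, l in zip(est_counts, eff_lengths)]
    total = sum(rates)
    if total == 0:
        return [0.0 for _ in rates]
    # Rescale so that all values sum to one million.
    return [r / total * 1e6 for r in rates]


# Hypothetical toy input: three transcripts, the third without reads.
values = tpm([100.0, 200.0, 0.0], [1000.0, 1000.0, 500.0])
```

Because TPM values always sum to one million within a cell, they are comparable as relative abundances across cells of different sequencing depth.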

Quality control

  • For their new index-sorted, Smart-Seq2-based single-cell transcriptome dataset, the authors applied the following quality control scheme, using various meta information, to obtain high-quality transcriptome data: 1) genes detected in fewer than 6 cells (0.2 percent of cells) were removed, and 2) cells with fewer than 1,000 uniquely detected genes were removed.
  • Next, the authors filtered further outlier cells with 3) fewer than 50,000 unique reads, 4) less than 30% pseudoalignment of reads to the transcriptome, 5) an endogenous-to-mitochondrial count ratio lower than 2, 6) [...]
  • To reduce the influence of variation in sequencing depth among samples, the authors applied a log-normalization to the data and scaled each cell's gene expression profile to a total count of 10,000.
  • The residuals of this regression are scaled and centered and used for further downstream analysis.
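
The first two filtering steps and the log-normalization can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' Seurat code: the function names are hypothetical, the detected genes of a cell are counted after the gene filter (the source does not specify the order), and the normalization follows the common `log1p(count / total * 10,000)` form.

```python
import math


def filter_counts(counts, min_cells=6, min_genes=1000):
    """Filter a count table given as dict: cell -> dict: gene -> count.

    Step 1 keeps genes detected (count > 0) in at least `min_cells` cells.
    Step 2 keeps cells with at least `min_genes` detected genes
    (assumption: counted among the genes that survived step 1).
    """
    cells_per_gene = {}
    for profile in counts.values():
        for gene, n in profile.items():
            if n > 0:
                cells_per_gene[gene] = cells_per_gene.get(gene, 0) + 1
    kept_genes = {g for g, k in cells_per_gene.items() if k >= min_cells}

    filtered = {}
    for cell, profile in counts.items():
        kept = {g: n for g, n in profile.items() if g in kept_genes and n > 0}
        if len(kept) >= min_genes:
            filtered[cell] = kept
    return filtered


def log_normalize(profile, scale=10_000):
    """Scale one cell's counts to `scale` total counts, then apply log1p."""
    total = sum(profile.values())
    if total == 0:
        return {}
    return {g: math.log1p(n / total * scale) for g, n in profile.items()}


# Hypothetical toy data with small thresholds so the effect is visible.
toy = {"c1": {"A": 5, "B": 4, "C": 1}, "c2": {"A": 3, "B": 2}, "c3": {"A": 1}}
kept = filter_counts(toy, min_cells=2, min_genes=2)
normalized = {cell: log_normalize(p) for cell, p in kept.items()}
```

The remaining per-cell thresholds (unique reads, pseudoalignment rate, endogenous-to-mitochondrial ratio) would be applied the same way, as simple comparisons on per-cell metadata.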

Dimensionality reduction and clustering

  • This resulted in a total of 2491 genes, which were used as input for a principal component (PC) analysis.
  • To test for cellular heterogeneity, the authors used a shared nearest neighbor (SNN)-graph based clustering algorithm implemented in the Seurat package.
  • The authors used the first 10 principal components for constructing the SNN-graph and set the resolution to 1.
  • Monocle was used to infer differentiation trajectories by using the Louvain clustering method, UMAP dimensionality reduction and the SimplePPT algorithm (Qiu et al., 2017).
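
To make the shared nearest neighbor (SNN) idea concrete, the following Python sketch builds an SNN graph from low-dimensional coordinates (for example, the first principal components). It is a simplified illustration, not Seurat's implementation: the function names are hypothetical, neighbors are found by brute force, and edge weights are the Jaccard overlap of the two cells' neighbor sets. Cluster detection on the resulting graph (modularity optimization with a resolution parameter) is not shown.

```python
import math


def knn(points, k):
    """Return, for each point, the set of indices of itself and its k nearest neighbors."""
    neighbors = []
    for p in points:
        order = sorted(range(len(points)), key=lambda j: math.dist(p, points[j]))
        neighbors.append(set(order[:k + 1]))  # the point itself has distance 0
    return neighbors


def snn_graph(points, k):
    """Build an SNN graph as dict: (i, j) -> Jaccard overlap of neighbor sets."""
    nn = knn(points, k)
    edges = {}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            shared = len(nn[i] & nn[j])
            if shared:
                edges[(i, j)] = shared / len(nn[i] | nn[j])
    return edges


# Hypothetical toy coordinates: two well-separated groups of three cells.
cells = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
graph = snn_graph(cells, k=2)
```

In this toy example, cells within a group share all neighbors and receive an edge weight of 1, while cells from different groups share none and receive no edge.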

Additional analysis

  • Differentially expressed (DE) genes were defined using a Wilcoxon-based test for differential gene expression built in the Seurat pipeline (v.2.3.4) (Data Table S1).
  • The top 10 DE genes were visualized using a heatmap of hierarchically clustered gene expression profiles.
  • Gene signature enrichment analysis: Single-cell RNA-Seq data are inherently sparse, and the high dropout rate limits the use of single marker genes to identify cell populations.
  • To increase the power, the authors use both up- and downregulated gene signatures for the calculation of the gene expression scores.
  • The difference between these two is scaled and visualized.
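
One plausible reading of this scoring scheme is sketched below in Python. The exact score definition is not given in this summary, so the sketch makes assumptions: each signature is summarized by the mean expression of its genes, the score is the up-signature mean minus the down-signature mean, and "scaled" is taken to mean z-scaling across cells. The function name `signature_scores` is hypothetical.

```python
from statistics import mean, pstdev


def signature_scores(expr, up_genes, down_genes):
    """Score cells by (mean of up genes) - (mean of down genes), z-scaled.

    expr: dict: cell -> dict: gene -> normalized expression.
    Genes missing from a cell's profile are treated as not expressed (0.0).
    """
    raw = {}
    for cell, profile in expr.items():
        up = mean(profile.get(g, 0.0) for g in up_genes)
        down = mean(profile.get(g, 0.0) for g in down_genes)
        raw[cell] = up - down
    mu = mean(raw.values())
    sd = pstdev(raw.values())
    # Center and scale across cells; a constant score maps to 0.
    return {c: (v - mu) / sd if sd > 0 else 0.0 for c, v in raw.items()}


# Hypothetical toy data: cell "a" expresses the up gene, cell "b" the down gene.
toy_expr = {"a": {"X": 2.0, "Y": 0.0}, "b": {"X": 0.0, "Y": 2.0}}
scores = signature_scores(toy_expr, up_genes=["X"], down_genes=["Y"])
```

Averaging over a gene set rather than relying on one marker makes the score less sensitive to dropout in any single gene.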

To assess the single-cell RNA-Seq data of human dendritic cells and monocytes publicly available

  • Under the Gene Expression Omnibus accession number GSE94820, the authors applied the processing [...]
  • Next, the authors followed the general data analysis scheme described at the Seurat package webpage (https://satijalab.org/seurat/get_started_v1_4.html).
  • Briefly, the authors used the filtered cell-gene matrix provided by 10x Genomics and imported the data and performed the analysis with the Seurat package.

Backmapping

  • In order to compare the transcriptome profiles of monocytes isolated from the dataset derived from GSE94820 (Villani et al., 2017) with the comprehensive PBMC dataset, the authors used the previously introduced canonical correlation alignment to combine datasets (Butler et al., 2018).
  • The authors determined the mutual highly variable genes as the overlap of the 4,000 genes from each dataset with highest dispersion.
  • The authors treated the different batches of the HCA dataset as individual datasets and normalized them and the expression table of the consensus map.
  • First, the authors repeated the steps above but without integration of the new consensus map data.
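
The selection of mutual highly variable genes can be sketched in Python as below. This is a simplified illustration: dispersion is taken as the plain variance-to-mean ratio, which is an assumption (Seurat's own dispersion estimate involves additional binning and scaling), and the function names are hypothetical. The canonical correlation alignment itself is not shown.

```python
from statistics import mean, pvariance


def top_dispersion_genes(expr, n_top):
    """Return the n_top genes with the highest variance-to-mean ratio.

    expr: dict: gene -> list of expression values across cells.
    """
    dispersion = {}
    for gene, values in expr.items():
        m = mean(values)
        if m > 0:  # genes that are never expressed have no defined dispersion
            dispersion[gene] = pvariance(values) / m
    ranked = sorted(dispersion, key=dispersion.get, reverse=True)
    return set(ranked[:n_top])


def mutual_hvgs(dataset_a, dataset_b, n_top=4000):
    """Overlap of the top-dispersion genes of two datasets."""
    return top_dispersion_genes(dataset_a, n_top) & top_dispersion_genes(dataset_b, n_top)


# Hypothetical toy datasets with three genes and two cells each.
data_a = {"G1": [0, 10], "G2": [5, 5], "G3": [0, 4]}
data_b = {"G1": [1, 9], "G2": [0, 8], "G3": [3, 3]}
shared = mutual_hvgs(data_a, data_b, n_top=2)
```

Restricting the alignment to genes that are variable in both datasets keeps dataset-specific noise from dominating the shared correlation structure.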

Data visualization

  • In general, the ggplot2 package was used to generate figures (Wickham, 2016).

QUANTIFICATION AND STATISTICAL ANALYSIS

  • Statistical analysis was performed using the R programming language.
  • Statistical tests used are described in the figure legend or methods part, respectively.
  • Differentially expressed genes have been identified using a Wilcoxon-based test for differential gene expression.
  • Unless otherwise stated, a significance level of 0.1 was applied to adjusted p-values (Benjamini-Hochberg).
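
The Benjamini-Hochberg adjustment with the 0.1 threshold can be illustrated with a short Python sketch. The authors worked in R; this is only a language-neutral illustration of the standard step-up procedure, with a hypothetical function name and made-up p-values, and it assumes a strict "less than 0.1" comparison.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values from four tests.
p = [0.01, 0.04, 0.03, 0.5]
adj = benjamini_hochberg(p)
significant = [a < 0.1 for a in adj]  # significance level of 0.1
```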

DATA AND SOFTWARE AVAILABILITY

  • Processed and raw scRNA-seq datasets are available through the Gene Expression Omnibus (GSE126422).
  • Additional data tables are provided in the form of Excel tables (Data S1, S2). Data Table S1 (Data Table S1.csv): Gene signatures of the 11 clusters identified in their new scRNA-seq consensus map.

ADDITIONAL RESOURCES

  • In addition, the authors provide an interactive web tool to visualize the single-cell RNA-Seq data together with the flow cytometry data at https://paguen.shinyapps.io/DC_MONO/ (external database S1).

Citations

Journal ArticleDOI
TL;DR: An insight is provided into the contribution of human monocytes to the progression of these diseases, and their candidacy as potential therapeutic cell targets is highlighted.
Abstract: Human monocytes are divided in three major populations; classical (CD14+CD16-), non-classical (CD14dimCD16+), and intermediate (CD14+CD16+). Each of these subsets is distinguished from each other by the expression of distinct surface markers and by their functions in homeostasis and disease. In this review, we discuss the most up-to-date phenotypic classification of human monocytes that has been greatly aided by the application of novel single-cell transcriptomic and mass cytometry technologies. Furthermore, we shed light on the role of these plastic immune cells in already recognized and emerging human chronic diseases, such as obesity, atherosclerosis, chronic obstructive pulmonary disease, lung fibrosis, lung cancer, and Alzheimer's disease. Our aim is to provide an insight into the contribution of human monocytes to the progression of these diseases and highlight their candidacy as potential therapeutic cell targets.

168 citations


Journal ArticleDOI
17 Sep 2019-Immunity
TL;DR: High-dimensional single-cell protein and RNA expression data is integrated to identify distinct markers to delineate monocytes from conventional DC2 (cDC2s), further unravel the heterogeneity of DC subpopulations in health and disease and may pave the way for the identification of specific DC subset-targeting therapies.
Abstract: Summary Human mononuclear phagocytes comprise phenotypically and functionally overlapping subsets of dendritic cells (DCs) and monocytes, but the extent of their heterogeneity and distinct markers for subset identification remains elusive. By integrating high-dimensional single-cell protein and RNA expression data, we identified distinct markers to delineate monocytes from conventional DC2 (cDC2s). Using CD88 and CD89 for monocytes and HLA-DQ and FceRIα for cDC2s allowed for their specific identification in blood and tissues. We also showed that cDC2s could be subdivided into phenotypically and functionally distinct subsets based on CD5, CD163, and CD14 expression, including a distinct subset of circulating inflammatory CD5−CD163+CD14+ cells related to previously defined DC3s. These inflammatory DC3s were expanded in systemic lupus erythematosus patients and correlated with disease activity. These findings further unravel the heterogeneity of DC subpopulations in health and disease and may pave the way for the identification of specific DC subset-targeting therapies.

151 citations


Journal ArticleDOI
Abstract: Glioblastomas are aggressive primary brain cancers that recur as therapy-resistant tumors. Myeloid cells control glioblastoma malignancy, but their dynamics during disease progression remain poorly understood. Here, we employed single-cell RNA sequencing and CITE-seq to map the glioblastoma immune landscape in mouse tumors and in patients with newly diagnosed disease or recurrence. This revealed a large and diverse myeloid compartment, with dendritic cell and macrophage populations that were conserved across species and dynamic across disease stages. Tumor-associated macrophages (TAMs) consisted of microglia- or monocyte-derived populations, with both exhibiting additional heterogeneity, including subsets with conserved lipid and hypoxic signatures. Microglia- and monocyte-derived TAMs were self-renewing populations that competed for space and could be depleted via CSF1R blockade. Microglia-derived TAMs were predominant in newly diagnosed tumors, but were outnumbered by monocyte-derived TAMs following recurrence, especially in hypoxic tumor environments. Our results unravel the glioblastoma myeloid landscape and provide a framework for future therapeutic interventions.

36 citations


Journal ArticleDOI
TL;DR: The current understanding of MNP ontogeny, as well as the recently identified human intestinal MNP subsets, are reviewed, and their role in health and IBD is discussed.
Abstract: Inflammatory bowel disease (IBD), including Crohn's disease and ulcerative colitis, is a complex immune-mediated disease of the gastrointestinal tract that increases morbidity and negatively influences the quality of life. Intestinal mononuclear phagocytes (MNPs) have a crucial role in maintaining epithelial barrier integrity while controlling pathogen invasion by activating an appropriate immune response. However, in genetically predisposed individuals, uncontrolled immune activation to intestinal flora is thought to underlie the chronic mucosal inflammation that can ultimately result in IBD. Thus, MNPs are involved in fine-tuning mucosal immune system responsiveness and have a critical role in maintaining homeostasis or, potentially, the emergence of IBD. MNPs include monocytes, macrophages and dendritic cells, which are functionally diverse but highly complementary. Despite their crucial role in maintaining intestinal homeostasis, specific functions of human MNP subsets are poorly understood, especially during diseases such as IBD. Here we review the current understanding of MNP ontogeny, as well as the recently identified human intestinal MNP subsets, and discuss their role in health and IBD.

34 citations


Journal ArticleDOI
TL;DR: The current understanding of the developmental path of DCs from hematopoietic stem cells to fully functional DCs in their local tissue environment is summarized and a template for the identification of DCs across various tissues is provided.
Abstract: Dendritic cells (DCs) orchestrate adaptive immune responses. In healthy individuals, DCs are drivers and fine-tuners of T cell responses directed against invading pathogens or cancer cells. In parallel, DCs control autoreactive T cells, thereby maintaining T cell tolerance. Under various disease conditions, a disruption of this delicate balance can lead to chronic infections, tumor evasion, or autoimmunity. While great efforts have been made to unravel the origin and development of this powerful cell type in mice, only little is known about the ontogeny of human DCs. Here, we summarize the current understanding of the developmental path of DCs from hematopoietic stem cells to fully functional DCs in their local tissue environment and provide a template for the identification of DCs across various tissues.

16 citations


References

Journal ArticleDOI
Abstract: Although genomewide RNA expression analysis has become a routine tool in biomedical research, extracting biological insight from such information remains a major challenge. Here, we describe a powerful analytical method called Gene Set Enrichment Analysis (GSEA) for interpreting gene expression data. The method derives its power by focusing on gene sets, that is, groups of genes that share common biological function, chromosomal location, or regulation. We demonstrate how GSEA yields insights into several cancer-related data sets, including leukemia and lung cancer. Notably, where single-gene analysis finds little similarity between two independent studies of patient survival in lung cancer, GSEA reveals many biological pathways in common. The GSEA method is embodied in a freely available software package, together with an initial database of 1,325 biologically defined gene sets.

26,320 citations


Book
13 Aug 2009
TL;DR: This book describes ggplot2, a new data visualization package for R that uses the insights from Leland Wilkinson's Grammar of Graphics to create a powerful and flexible system for creating data graphics.
Abstract: This book describes ggplot2, a new data visualization package for R that uses the insights from Leland Wilkinson's Grammar of Graphics to create a powerful and flexible system for creating data graphics. With ggplot2, it's easy to: produce handsome, publication-quality plots, with automatic legends created from the plot specification; superpose multiple layers (points, lines, maps, tiles, box plots to name a few) from different data sources, with automatically adjusted common scales; add customisable smoothers that use the powerful modelling capabilities of R, such as loess, linear models, generalised additive models and robust regression; save any ggplot2 plot (or part thereof) for later modification or reuse; create custom themes that capture in-house or journal style requirements, and that can easily be applied to multiple plots; approach your graph from a visual perspective, thinking about how each component of the data is represented on the final plot. This book will be useful to everyone who has struggled with displaying their data in an informative and attractive way. You will need some basic knowledge of R (i.e. you should be able to get your data into R), but ggplot2 is a mini-language specifically tailored for producing graphics, and you'll learn everything you need in the book. After reading this book you'll be able to produce graphics customized precisely for your problems, and you'll find it easy to get graphics out of your head and on to the screen or page.

23,839 citations


Journal ArticleDOI
TL;DR: An analytical strategy for integrating scRNA-seq data sets based on common sources of variation is introduced, enabling the identification of shared populations across data sets and downstream comparative analysis.
Abstract: Computational single-cell RNA-seq (scRNA-seq) methods have been successfully applied to experiments representing a single condition, technology, or species to discover and define cellular phenotypes. However, identifying subpopulations of cells that are present across multiple data sets remains challenging. Here, we introduce an analytical strategy for integrating scRNA-seq data sets based on common sources of variation, enabling the identification of shared populations across data sets and downstream comparative analysis. We apply this approach, implemented in our R toolkit Seurat (http://satijalab.org/seurat/), to align scRNA-seq data sets of peripheral blood mononuclear cells under resting and stimulated conditions, hematopoietic progenitors sequenced using two profiling technologies, and pancreatic cell 'atlases' generated from human and mouse islets. In each case, we learn distinct or transitional cell states jointly across data sets, while boosting statistical power through integrated analysis. Our approach facilitates general comparisons of scRNA-seq data sets, potentially deepening our understanding of how distinct cell states respond to perturbation, disease, and evolution.

4,666 citations


Journal ArticleDOI
TL;DR: Kallisto pseudoaligns reads to a reference, producing a list of transcripts that are compatible with each read while avoiding alignment of individual bases, which removes a major computational bottleneck in RNA-seq analysis.
Abstract: We present kallisto, an RNA-seq quantification program that is two orders of magnitude faster than previous approaches and achieves similar accuracy. Kallisto pseudoaligns reads to a reference, producing a list of transcripts that are compatible with each read while avoiding alignment of individual bases. We use kallisto to analyze 30 million unaligned paired-end RNA-seq reads in <10 min on a standard laptop computer. This removes a major computational bottleneck in RNA-seq analysis.

4,396 citations


Journal ArticleDOI
21 May 2015-Cell
TL;DR: Drop-seq will accelerate biological discovery by enabling routine transcriptional profiling at single-cell resolution by separating them into nanoliter-sized aqueous droplets, associating a different barcode with each cell's RNAs, and sequencing them all together.
Abstract: Cells, the basic units of biological structure and function, vary broadly in type and state. Single-cell genomics can characterize cell identity and function, but limitations of ease and scale have prevented its broad application. Here we describe Drop-seq, a strategy for quickly profiling thousands of individual cells by separating them into nanoliter-sized aqueous droplets, associating a different barcode with each cell's RNAs, and sequencing them all together. Drop-seq analyzes mRNA transcripts from thousands of individual cells simultaneously while remembering transcripts' cell of origin. We analyzed transcriptomes from 44,808 mouse retinal cells and identified 39 transcriptionally distinct cell populations, creating a molecular atlas of gene expression for known retinal cell classes and novel candidate cell subtypes. Drop-seq will accelerate biological discovery by enabling routine transcriptional profiling at single-cell resolution.

4,167 citations

