
Showing papers by "Joshua D. Welch" published in 2020


Journal Article
TL;DR: It is shown that quiescent CXCL12-expressing BMSCs can convert into a skeletal stem cell-like state, and differentiate into cortical bone osteoblasts only in response to injury.
Abstract: Bone marrow stromal cells (BMSCs) are versatile mesenchymal cell populations underpinning the major functions of the skeleton, a majority of which adjoin sinusoidal blood vessels and express C-X-C motif chemokine ligand 12 (CXCL12). However, how these cells are activated during regeneration and facilitate osteogenesis remains largely unknown. Cell-lineage analysis using Cxcl12-creER mice reveals that quiescent Cxcl12-creER+ perisinusoidal BMSCs differentiate into cortical bone osteoblasts solely during regeneration. A combined single-cell RNA-seq analysis demonstrates that these cells convert their identity into a skeletal stem cell-like state in response to injury, associated with upregulation of osteoblast-signature genes and activation of canonical Wnt signaling components along the single-cell trajectory. β-catenin deficiency in these cells indeed causes insufficiency in cortical bone regeneration. Therefore, quiescent Cxcl12-creER+ BMSCs transform into osteoblast precursor cells in a manner mediated by canonical Wnt signaling, highlighting a unique mechanism by which dormant stromal cells are enlisted for skeletal regeneration.

151 citations


Journal Article
TL;DR: In this article, a step-by-step protocol for using linked inference of genomic experimental relationships (LIGER) to jointly define cell types from multiple single-cell datasets is presented.
Abstract: High-throughput single-cell sequencing technologies hold tremendous potential for defining cell types in an unbiased fashion using gene expression and epigenomic state. A key challenge in realizing this potential is integrating single-cell datasets from multiple protocols, biological contexts, and data modalities into a joint definition of cellular identity. We previously developed an approach, called linked inference of genomic experimental relationships (LIGER), that uses integrative nonnegative matrix factorization to address this challenge. Here, we provide a step-by-step protocol for using LIGER to jointly define cell types from multiple single-cell datasets. The main stages of the protocol are data preprocessing and normalization, joint factorization, quantile normalization and joint clustering, and visualization. We describe how to jointly define cell types from single-cell RNA-seq (scRNA-seq) and single-nucleus ATAC-seq (snATAC-seq) data, but similar steps apply across a wide range of other settings and data types, including cross-species analysis, single-nucleus DNA methylation, and spatial transcriptomics. Our protocol contains examples of expected results, describes common pitfalls, and relies only on our freely available, open-source R implementation of LIGER. We also provide R Markdown tutorials showing the outputs from each individual code segment. The analysis process can be performed in 1–4 h, depending on dataset size, and assumes no specialized bioinformatics training. Here, the authors describe step-by-step procedures for integrating single-cell sequencing datasets from different experiments or modalities to identify common and distinct cell types using the R-based software tool LIGER.
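The preprocessing and normalization stage named in this abstract can be illustrated with a minimal NumPy sketch. This is not the LIGER R implementation: the function names `normalize_cells` and `scale_not_center`, the root-mean-square scaling constant, and the synthetic count matrices are all assumptions made for illustration. The point it shows is that scaling genes without centering keeps every value nonnegative, which a nonnegative matrix factorization requires.

```python
import numpy as np

def normalize_cells(counts):
    """Scale each cell (row) so that its counts sum to 1."""
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # an empty cell stays all zeros
    return counts / totals

def scale_not_center(norm):
    """Scale each gene (column) to unit root-mean-square WITHOUT centering,
    so the matrix stays nonnegative (assumed scaling constant)."""
    rms = np.sqrt((norm ** 2).mean(axis=0, keepdims=True))
    rms[rms == 0] = 1.0  # an undetected gene stays all zeros
    return norm / rms

# Hypothetical data: two datasets (cells x genes) sharing the same 20 genes.
rng = np.random.default_rng(0)
rna = rng.poisson(2.0, size=(50, 20))    # stands in for scRNA-seq counts
atac = rng.poisson(0.5, size=(80, 20))   # stands in for snATAC-seq gene-level counts
datasets = [scale_not_center(normalize_cells(x)) for x in (rna, atac)]
```

Each entry of `datasets` could then be passed to a joint factorization step; that step is not shown here.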

77 citations


Posted Content
Zizhen Yao1, Hanqing Liu2, Fangming Xie3, Stephan Fischer4, Ricky S. Adkins5, Andrew I. Aldrige2, Seth A. Ament5, Ann Bartlett2, M. Margarita Behrens2, Koen Van den Berge3,6, Darren Bertagnolli1, Tommaso Biancalani7, A. Sina Booeshaghi8, Héctor Corrada Bravo9, Tamara Casper1, Carlo Colantuoni10, Jonathan Crabtree5, Heather Huot Creasy5, Kirsten Crichton1, Megan Crow4, Nick Dee1, Elizabeth L. Dougherty7, Wayne I. Doyle3, Sandrine Dudoit3, Rongxin Fang, Victor Felix5, Olivia Fong1, Michelle G. Giglio5, Jeff Goldy1, Michael Hawrylycz1, Hector Roux de Bézieux3, Brian R. Herb5, Ronna Hertzano5, Xiaomeng Hou11, Qiwen Hu12, Z. Josh Huang4, Jayaram Kancherla9, Matthew Kroll1, Kanan Lathia1, Yang Eric Li13, Jacinta Lucero2, Chongyuan Luo14, Anup Mahurkar5, Delissa McMillen1, Naeem Nadaf7, Joseph R. Nery2, Thuc Nghi Nguyen1, Sheng-Yong Niu15, Vasilis Ntranos15, Joshua Orvis5, Julia K. Osteen2, Thanh Pham1, Antonio Pinto-Duarte5, Olivier Poirion11, Sebastian Preissl11, Elizabeth Purdom3, Christine Rimorin1, Davide Risso16, Angeline Rivkin2, Kimberly A. Smith1, Kelly Street12, Josef Sulc1, Valentine Svensson8, Michael Tieu1, Amy Torkelson1, Herman Tung1, Eeshit Dhaval Vaishnav7, Charles R. Vanderburg7, Cindy T. J. van Velthoven1, Xinxin Wang11,17, Owen White5, Jesse Gillis4, Peter V. Kharchenko12, John Ngai3, Lior Pachter8, Aviv Regev7,18,19, Bosiljka Tasic1, Joshua D. Welch20, Joseph R. Ecker2, Evan Z. Macosko7, Bing Ren13, Hongkui Zeng1, Eran A. Mukamel3
02 Mar 2020-bioRxiv
TL;DR: This study used a battery of single-cell transcriptome and epigenome measurements generated by the BICCN to comprehensively assess the molecular signatures of cell types in the mouse primary motor cortex (MOp), and developed computational and statistical methods to integrate these multimodal data and quantitatively validate the reproducibility of the cell types.
Abstract: Single cell transcriptomics has transformed the characterization of brain cell identity by providing quantitative molecular signatures for large, unbiased samples of brain cell populations. With the proliferation of taxonomies based on individual datasets, a major challenge is to integrate and validate results toward defining biologically meaningful cell types. We used a battery of single-cell transcriptome and epigenome measurements generated by the BRAIN Initiative Cell Census Network (BICCN) to comprehensively assess the molecular signatures of cell types in the mouse primary motor cortex (MOp). We further developed computational and statistical methods to integrate these multimodal data and quantitatively validate the reproducibility of the cell types. The reference atlas, based on more than 600,000 high quality single-cell or -nucleus samples assayed by six molecular modalities, is a comprehensive molecular account of the diverse neuronal and non-neuronal cell types in MOp. Collectively, our study indicates that the mouse primary motor cortex contains over 55 neuronal cell types that are highly replicable across analysis methods, sequencing technologies, and modalities. We find many concordant multimodal markers for each cell type, as well as thousands of genes and gene regulatory elements with discrepant transcriptomic and epigenomic signatures. These data highlight the complex molecular regulation of brain cell types and will directly enable design of reagents to target specific MOp cell types for functional analysis.

67 citations


Posted Content
17 Jan 2020-bioRxiv
TL;DR: An online learning algorithm for integrating large and continually arriving single-cell datasets, which obviates the need to recompute results each time additional cells are sequenced, dramatically increases convergence speed, and allows processing of datasets too large to fit in memory or on disk.
Abstract: Recent experimental advances have enabled high-throughput single-cell measurement of gene expression, chromatin accessibility and DNA methylation. We previously used integrative non-negative matrix factorization (iNMF) to jointly learn interpretable low-dimensional representations from multiple single-cell datasets using dataset-specific and shared metagene factors. These factors provide a principled, quantitative definition of cellular identity and how it varies across biological contexts. However, datasets exceeding 1 million cells are now widely available, creating computational barriers to scientific discovery. For instance, it is no longer feasible to analyze large datasets using standard pipelines on a personal computer with limited memory capacity. Moreover, there is a need for an algorithm capable of iteratively refining the definition of cellular identity as efforts to create a comprehensive human cell atlas continually sequence new cells. To address these challenges, we developed an online learning algorithm for integrating large and continually arriving single-cell datasets. We extended previous online learning approaches for NMF to minimize the expected cost of a surrogate function for the iNMF objective. We also derived a novel hierarchical alternating least squares algorithm for iNMF and incorporated it into an efficient online algorithm. Our online approach accesses the training data as mini-batches, decoupling memory usage from dataset size and allowing on-the-fly incorporation of new datasets as they are generated. The online implementation of iNMF converges much more quickly using a fraction of the memory required for the batch implementation, without sacrificing solution quality. Our new approach processes 1.3 million single cells from the entire mouse embryo on a laptop in 25 minutes using less than 500 MB of RAM. We also analyze large datasets without downloading them to disk by streaming them over the internet on demand. 
Furthermore, we construct a single-cell multi-omic cell atlas of the mouse motor cortex by iteratively incorporating eight single-cell RNA-seq, single-nucleus RNA-seq, single-nucleus ATAC-seq, and single-nucleus DNA methylation datasets generated by the BRAIN Initiative Cell Census Network. Our approach obviates the need to recompute results each time additional cells are sequenced, dramatically increases convergence speed, and allows processing of datasets too large to fit in memory or on disk. Most importantly, it facilitates continual refinement of cell identity as new single-cell datasets from different biological contexts and data modalities are generated.
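The mini-batch idea described in this abstract can be sketched in a few lines of NumPy. This is a deliberately simplified illustration, not the authors' algorithm: it factorizes a single dataset with plain NMF (no dataset-specific factors and no penalty term), and the function names, batch size, and synthetic data are assumptions. What it does show is the mechanism the abstract relies on: only the small matrices `A` and `B` are carried between batches, so memory use does not grow with the number of cells.

```python
import numpy as np

def solve_h(X, W, n_sweeps=50, eps=1e-12):
    """Approximate nonnegative least squares for H in X ~ W @ H (W fixed),
    using coordinate-wise (HALS-style) updates."""
    WtW, WtX = W.T @ W, W.T @ X
    H = np.zeros((W.shape[1], X.shape[1]))
    for _ in range(n_sweeps):
        for j in range(H.shape[0]):
            step = (WtX[j] - WtW[j] @ H) / max(WtW[j, j], eps)
            H[j] = np.maximum(0.0, H[j] + step)
    return H

def online_nmf(batches, n_genes, k, seed=0, eps=1e-12):
    """Stream mini-batches (genes x cells) and update W after each one."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(size=(n_genes, k))
    A = np.zeros((k, k))         # running sum of H_b @ H_b.T
    B = np.zeros((n_genes, k))   # running sum of X_b @ H_b.T
    for X_b in batches:
        H_b = solve_h(X_b, W)
        A += H_b @ H_b.T
        B += X_b @ H_b.T
        for j in range(k):       # one HALS sweep over the columns of W
            step = (B[:, j] - W @ A[:, j]) / max(A[j, j], eps)
            W[:, j] = np.maximum(0.0, W[:, j] + step)
    return W

def rel_error(X, W):
    """Relative reconstruction error with H re-fitted for the given W."""
    return np.linalg.norm(X - W @ solve_h(X, W)) / np.linalg.norm(X)

# Hypothetical data: 30 genes x 1000 cells with exact rank 3.
rng = np.random.default_rng(1)
X = rng.uniform(size=(30, 3)) @ rng.uniform(size=(3, 1000))
batches = (X[:, i:i + 50] for i in range(0, 1000, 50))  # generator: one batch at a time
W = online_nmf(batches, n_genes=30, k=3)
W_init = np.random.default_rng(0).uniform(size=(30, 3))  # same start as inside online_nmf
```

Comparing `rel_error(X, W)` with `rel_error(X, W_init)` gives a quick check that the streamed updates improved on the random start. Because `batches` is a generator, the same loop would work if each batch were read from disk or fetched over a network.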

8 citations


Book Chapter
10 May 2020
TL;DR: An alternating nonnegative least squares (ANLS) algorithm is developed to solve the iNMF optimization problem for integrating single-cell measurements of gene expression, chromatin accessibility and DNA methylation.
Abstract: Recent experimental advances have enabled high-throughput single-cell measurement of gene expression, chromatin accessibility and DNA methylation. We previously employed integrative non-negative matrix factorization (iNMF) to jointly align multiple single-cell datasets (\(X_i\)) and learn interpretable low-dimensional representations using dataset-specific (\(V_i\)) and shared metagene factors (\(W\)) and cell factor loadings (\(H_i\)). We developed an alternating nonnegative least squares (ANLS) algorithm to solve the iNMF optimization problem [2]:
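The abstract is cut off at this point in the listing and the equation itself is not shown. For orientation only, the iNMF objective is commonly written in the following form, using the symbols defined above. The number of datasets \(N\), the penalty weight \(\lambda\), and the orientation of the product (metagenes times loadings) are assumptions here and may differ from the chapter's own notation.

\[
\min_{W,\,V_i,\,H_i \,\ge\, 0} \;\; \sum_{i=1}^{N} \left\lVert X_i - (W + V_i)\,H_i \right\rVert_F^2 \;+\; \lambda \sum_{i=1}^{N} \left\lVert V_i\,H_i \right\rVert_F^2
\]

The first sum measures how well each dataset is reconstructed from shared plus dataset-specific metagenes; the second penalizes the dataset-specific part so that shared structure is attributed to \(W\) where possible.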

5 citations


Posted Content
Ricky S. Adkins1, Andrew Aldridge2, Shona Allen3, Seth A. Ament1, +266 more; Institutions (42)
21 Oct 2020-bioRxiv
TL;DR: In this article, the authors report the generation of a multimodal cell census and atlas of the mammalian primary motor cortex (MOp or M1) as the initial product of the BRAIN Initiative Cell Census Network (BICCN).
Abstract: We report the generation of a multimodal cell census and atlas of the mammalian primary motor cortex (MOp or M1) as the initial product of the BRAIN Initiative Cell Census Network (BICCN). This was achieved by coordinated large-scale analyses of single-cell transcriptomes, chromatin accessibility, DNA methylomes, spatially resolved single-cell transcriptomes, morphological and electrophysiological properties, and cellular resolution input-output mapping, integrated through cross-modal computational analysis. Together, our results advance the collective knowledge and understanding of brain cell type organization: First, our study reveals a unified molecular genetic landscape of cortical cell types that congruently integrates their transcriptome, open chromatin and DNA methylation maps. Second, cross-species analysis achieves a unified taxonomy of transcriptomic types and their hierarchical organization that are conserved from mouse to marmoset and human. Third, cross-modal analysis provides compelling evidence for the epigenomic, transcriptomic, and gene regulatory basis of neuronal phenotypes such as their physiological and anatomical properties, demonstrating the biological validity and genomic underpinning of neuron types and subtypes. Fourth, in situ single-cell transcriptomics provides a spatially-resolved cell type atlas of the motor cortex. Fifth, integrated transcriptomic, epigenomic and anatomical analyses reveal the correspondence between neural circuits and transcriptomic cell types. We further present an extensive genetic toolset for targeting and fate mapping glutamatergic projection neuron types toward linking their developmental trajectory to their circuit function. Together, our results establish a unified and mechanistic framework of neuronal cell type organization that integrates multi-layered molecular genetic and spatial information with multi-faceted phenotypic properties.

4 citations


Posted Content
14 Mar 2020-bioRxiv
TL;DR: Findings support the concept that perichondrial cells participate in endochondral bone development through a distinct route, by providing a complementary source of skeletal progenitor cells.
Abstract: The perichondrium, a fibrous tissue surrounding the fetal cartilage, is an essential component of developing endochondral bones that provides a source of skeletal progenitor cells. However, perichondrial cells remain poorly characterized due to lack of knowledge on their cellular diversity and subset-specific mouse genetics tools. Single cell RNA-seq analyses reveal a contiguous nature of the fetal chondrocyte-perichondrial cell lineage that shares an overlapping set of marker genes. Subsequent cell-lineage analyses using multiple creER lines active in fetal perichondrial cells – Hes1-creER, Dlx5-creER – and chondrocytes – Fgfr3-creER – illustrate their distinctive contribution to endochondral bone development; postnatally, these cells contribute to the functionally distinct bone marrow stromal compartments. Particularly, Notch effector Hes1-creER marks an early skeletal progenitor cell population in the primordium that robustly populates multiple skeletal compartments. These findings support the concept that perichondrial cells participate in endochondral bone development through a distinct route, by providing a complementary source of skeletal progenitor cells.

4 citations


Posted Content
24 Jul 2020-bioRxiv
TL;DR: In mouse models and human gliomas, mIDH1 in the context of ATRX and TP53 inactivation results in global expansion of the granulocytic myeloid cell compartment; G-CSF remodels the inhibitory PMN-MDSCs, rendering them non-immunosuppressive, with significant therapeutic implications.
Abstract: Mutation in isocitrate dehydrogenase (mIDH) is a gain-of-function mutation resulting in the production of the oncometabolite R-2-hydroxyglutarate, which inhibits DNA and histone demethylases. The resultant hypermethylation phenotype reprograms the glioma cells’ transcriptome and elicits profound effects on glioma immunity. We report that in mouse models and human gliomas, mIDH1 in the context of ATRX and TP53 inactivation results in global expansion of the granulocytic myeloid cell compartment. Single-cell RNA-sequencing coupled with mass cytometry analysis revealed that these granulocytes are mainly non-immunosuppressive neutrophils and pre-neutrophils, with a small fraction of polymorphonuclear myeloid-derived suppressor cells (PMN-MDSCs). The mechanism of mIDH1-mediated pre-neutrophil expansion involves epigenetic reprogramming, which leads to enhanced expression of the granulocyte colony-stimulating factor (G-CSF). Blocking G-CSF restored the inhibitory potential of PMN-MDSCs and enhanced tumor progression. Thus, G-CSF induces remodeling of the inhibitory PMN-MDSCs in mIDH1 glioma, rendering them non-immunosuppressive, with significant therapeutic implications. Significance: mIDH1 is the most common mutation in gliomas associated with improved prognosis. Gliomas harboring mIDH1, together with ATRX and TP53 inactivation, exhibit higher circulating levels of G-CSF, resulting in the recruitment and expansion of non-suppressive neutrophils, pre-neutrophils and a small fraction of PMN-MDSCs to the TME, leading to an immune permissive phenotype.

3 citations


Posted Content
08 Apr 2020-bioRxiv
TL;DR: This protocol describes how to jointly define cell types from single-cell RNA-seq and single-nucleus ATAC-seq data, but similar steps apply across a wide range of other settings and data types, including cross-species analysis, single-nucleus DNA methylation, and spatial transcriptomics.
Abstract: High-throughput single-cell sequencing technologies hold tremendous potential for defining cell types in an unbiased fashion using gene expression and epigenomic state. A key challenge in realizing this potential is integrating single-cell datasets from multiple protocols, biological contexts, and data modalities into a joint definition of cellular identity. We previously developed an approach called Linked Inference of Genomic Experimental Relationships (LIGER) that uses integrative nonnegative matrix factorization to address this challenge. Here, we provide a step-by-step protocol for using LIGER to jointly define cell types from multiple single-cell datasets. The main steps of the protocol include data preprocessing and normalization, joint factorization, quantile normalization and joint clustering, and visualization. We describe how to jointly define cell types from single-cell RNA-seq and single-nucleus ATAC-seq data, but similar steps apply across a wide range of other settings and data types, including cross-species analysis, single-nucleus DNA methylation, and spatial transcriptomics. Our protocol contains examples of expected results, describes common pitfalls, and relies only on our freely available, open-source R implementation of LIGER. We also provide R Markdown tutorials showing the outputs from each individual code segment. The analysis process can be performed in 1–4 h, depending on dataset size, and assumes no specialized bioinformatics training.

3 citations


Posted Content
12 Feb 2020-bioRxiv
TL;DR: Using GAM data from mouse embryonic stem cells, new discoveries are made about the structure of the major mammalian histone gene locus, which is incorporated into the Histone Locus Body (HLB), including structural fluctuations and putative causal molecular mechanisms.
Abstract: Although each cell within an organism contains a nearly identical genome sequence, the three-dimensional (3D) packing of the genome varies among individual cells, influencing cell-type-specific gene expression. Genome Architecture Mapping (GAM) is the first genome-wide experimental method for capturing 3D proximities between any number of genomic loci without ligation. GAM overcomes several limitations of 3C-based methods by sequencing DNA from a large collection of thin sections sliced from individual nuclei. The GAM technique measures locus co-segregation, extracts radial positions, infers chromatin compaction, requires small numbers of cells, does not depend on ligation, and provides rich single-cell information. However, previous analyses of GAM data focused exclusively on population averages, neglecting the variation in 3D topology among individual cells. We present the first single-cell analysis of GAM data, demonstrating that the slices from individual cells reveal intercellular heterogeneity in chromosome conformation. By simultaneously clustering both slices and genomic loci, we identify topological variation among single cells, including differential compaction of cell cycle genes. We also develop a geometric model of the nucleus, allowing prediction of the 3D positions of each slice. Using GAM data from mouse embryonic stem cells, we make new discoveries about the structure of the major mammalian histone gene locus, which is incorporated into the Histone Locus Body (HLB), including structural fluctuations and putative causal molecular mechanisms. Our methods are packaged as SluiceBox, a toolkit for mining GAM data. Our approach represents a new method of investigating variation in 3D genome topology among individual cells across space and time.
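The "locus co-segregation" measured by GAM can be made concrete with a small sketch. It assumes the data are available as a binary matrix with one row per nuclear slice and one column per genomic locus; the function name `cosegregation` and the toy matrix are hypothetical, and this is not SluiceBox code. The raw co-segregation frequency of two loci is simply the fraction of slices in which both were detected. Any further normalization used in GAM analyses is not shown.

```python
import numpy as np

def cosegregation(seg):
    """seg: array of shape (n_slices, n_loci), truthy where a locus was
    detected in a slice. Returns an (n_loci, n_loci) matrix whose entry
    (a, b) is the fraction of slices containing both locus a and locus b."""
    seg = np.asarray(seg, dtype=float)
    return seg.T @ seg / seg.shape[0]

# Toy example: 4 slices, 3 loci.
seg = np.array([
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 1],
], dtype=bool)
coseg = cosegregation(seg)
# Loci 0 and 1 co-occur in 2 of 4 slices, loci 0 and 2 in 1 of 4,
# and loci 1 and 2 never co-occur.
```

The diagonal of `coseg` holds each locus's own detection frequency. Loci that are close together in 3D space are expected to be captured in the same thin slice more often than distant ones, which is why this matrix carries proximity information.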

2 citations