Author

William Stephenson

Bio: William Stephenson is an academic researcher from Harvard University. The author has contributed to research in topics: RNA & Folding (chemistry). The author has an h-index of 10 and has co-authored 21 publications receiving 1,456 citations. Previous affiliations of William Stephenson include State University of New York System and Drexel University.

Papers
Journal Article
TL;DR: In this article, a method called cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq) is proposed, in which oligonucleotide-labeled antibodies are used to integrate cellular protein and transcriptome measurements into an efficient, single-cell readout.
Abstract: High-throughput single-cell RNA sequencing has transformed our understanding of complex cell populations, but it does not provide phenotypic information such as cell-surface protein levels. Here, we describe cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq), a method in which oligonucleotide-labeled antibodies are used to integrate cellular protein and transcriptome measurements into an efficient, single-cell readout. CITE-seq is compatible with existing single-cell sequencing approaches and scales readily with throughput increases.

1,904 citations

Journal Article
TL;DR: A 3D-printed, low-cost droplet microfluidic control instrument is developed and deployed in a clinical environment to perform single-cell transcriptome profiling of disaggregated synovial tissue from five rheumatoid arthritis patients.
Abstract: Droplet-based single-cell RNA-seq has emerged as a powerful technique for massively parallel cellular profiling. While this approach offers the exciting promise to deconvolute cellular heterogeneity in diseased tissues, the lack of cost-effective and user-friendly instrumentation has hindered widespread adoption of droplet microfluidic techniques. To address this, we developed a 3D-printed, low-cost droplet microfluidic control instrument and deployed it in a clinical environment to perform single-cell transcriptome profiling of disaggregated synovial tissue from five rheumatoid arthritis patients. We sequence 20,387 single cells revealing 13 transcriptomically distinct clusters. These encompass an unsupervised draft atlas of the autoimmune infiltrate that contributes to disease biology. Additionally, we identify previously uncharacterized fibroblast subpopulations and discern their spatial location within the synovium. We envision that this instrument will have broad utility in both research and clinical settings, enabling low-cost and routine application of microfluidic techniques.

249 citations

Journal Article
TL;DR: The data reveal a transcriptional compendium of progenitor states in human cord blood, representing four committed lineages downstream from HSC, alongside the transcriptional dynamics underlying fate commitment, and it is demonstrated that Drop‐seq data can be utilized to identify new heterogeneous surface markers of cell state that correlate with functional output.
Abstract: Hematopoietic stem cells (HSCs) give rise to diverse cell types in the blood system, yet our molecular understanding of the early trajectories that generate this enormous diversity in humans remains incomplete. Here, we leverage Drop‐seq, a massively parallel single‐cell RNA sequencing (scRNA‐seq) approach, to individually profile 20,000 progenitor cells from human cord blood, without prior enrichment or depletion for individual lineages based on surface markers. Our data reveal a transcriptional compendium of progenitor states in human cord blood, representing four committed lineages downstream from HSC, alongside the transcriptional dynamics underlying fate commitment. We identify intermediate stages that simultaneously co‐express “primed” programs for multiple downstream lineages, and also observe striking heterogeneity in the early molecular transitions between myeloid subsets. Integrating our data with a recently published scRNA‐seq dataset from human bone marrow, we illustrate the molecular similarity between these two commonly used systems and further explore the chromatin dynamics of “primed” transcriptional programs based on ATAC‐seq. Finally, we demonstrate that Drop‐seq data can be utilized to identify new heterogeneous surface markers of cell state that correlate with functional output.

119 citations

Posted Content
01 Jun 2020-bioRxiv
TL;DR: Direct RNA nanopore sequencing is demonstrated to detect endogenous and exogenous RNA modifications over long sequence distance at the single molecule level and a recently described small adduct-generating 2’-O-acylation reagent, acetylimidazole, is characterized for exogenously labeling flexible nucleotides in RNA.
Abstract: Many methods exist to detect RNA modifications by short-read sequencing, relying on either antibody enrichment of transcripts bearing modified bases or mutational profiling approaches which require conversion to cDNA. Endogenous modifications are present on several major classes of RNA including tRNA, rRNA and mRNA and can modulate diverse biological processes such as genetic recoding, mRNA export and RNA folding. In addition, exogenous modifications can be introduced to RNA molecules to reveal RNA structure and dynamics. Limitations on read length and library size inherent in short-read-based methods dissociate modifications from their native context, preventing single molecule analysis and modification phasing. Here we demonstrate direct RNA nanopore sequencing to detect endogenous and exogenous RNA modifications over long sequence distance at the single molecule level. We demonstrate comprehensive detection of endogenous modifications in E. coli and S. cerevisiae ribosomal RNA (rRNA) using current signal deviations. Notably, 2’-O-methyl (Nm) modifications generated a discernible shift in current signal and event level dwell times. We show that dwell times are mediated by the RNA motor protein which sits atop the nanopore. Further, we characterize a recently described small adduct-generating 2’-O-acylation reagent, acetylimidazole (AcIm), for exogenously labeling flexible nucleotides in RNA. Finally, we demonstrate the utility of AcIm for single molecule RNA structural probing using nanopore sequencing.

44 citations

Journal Article
TL;DR: It is indicated that unpaired flanking nucleotides play essential roles in the formation of otherwise unstable two-base-pair RNA tertiary interactions.
Abstract: In minimal RNA kissing complexes formed between hairpins with cognate GACG tetraloops, the two tertiary GC pairs are likely stabilized by the stacking of 5'-unpaired adenines at each end of the short helix. To test this hypothesis, we mutated the flanking adenines to various nucleosides and examined their effects on the kissing interaction. Electrospray ionization mass spectrometry was used to detect kissing dimers in a multiequilibria mixture, whereas optical tweezers were applied to monitor the (un)folding trajectories of single RNA molecules. The experimental findings were rationalized by molecular dynamics simulations. Together, the results showed that the stacked adenines are indispensable for the tertiary interaction. By shielding the tertiary base pairs from solvent and reducing their fraying, the stacked adenines made terminal pairs act more like interior base pairs. The purine double-ring of adenine was essential for effective stacking, whereas additional functional groups modulated the stabilizing effects through varying hydrophobic and electrostatic forces. Furthermore, formation of the kissing complex was dominated by base pairing, whereas its dissociation was significantly influenced by the flanking bases. Together, these findings indicate that unpaired flanking nucleotides play essential roles in the formation of otherwise unstable two-base-pair RNA tertiary interactions.

33 citations


Cited by
Journal Article
13 Jun 2019-Cell
TL;DR: A strategy to "anchor" diverse datasets together, enabling us to integrate single-cell measurements not only across scRNA-seq technologies, but also across different modalities.

7,892 citations

Journal Article
24 Jun 2021-Cell
TL;DR: Weighted-nearest neighbor analysis is an unsupervised framework that learns the relative utility of each data type in each cell, enabling an integrative analysis of multiple modalities.

3,369 citations

Posted Content
12 Oct 2020-bioRxiv
TL;DR: ‘weighted-nearest neighbor’ analysis is introduced, an unsupervised framework to learn the relative utility of each data type in each cell, enabling an integrative analysis of multiple modalities.
Abstract: The simultaneous measurement of multiple modalities, known as multimodal analysis, represents an exciting frontier for single-cell genomics and necessitates new computational methods that can define cellular states based on multiple data types. Here, we introduce ‘weighted-nearest neighbor’ analysis, an unsupervised framework to learn the relative utility of each data type in each cell, enabling an integrative analysis of multiple modalities. We apply our procedure to a CITE-seq dataset of hundreds of thousands of human white blood cells alongside a panel of 228 antibodies to construct a multimodal reference atlas of the circulating immune system. We demonstrate that integrative analysis substantially improves our ability to resolve cell states and validate the presence of previously unreported lymphoid subpopulations. Moreover, we demonstrate how to leverage this reference to rapidly map new datasets, and to interpret immune responses to vaccination and COVID-19. Our approach represents a broadly applicable strategy to analyze single-cell multimodal datasets, including paired measurements of RNA and chromatin state, and to look beyond the transcriptome towards a unified and multimodal definition of cellular identity. Availability: Installation instructions, documentation, tutorials, and CITE-seq datasets are available at http://www.satijalab.org/seurat

2,924 citations

Posted Content
02 Nov 2018-bioRxiv
TL;DR: This work presents a strategy for comprehensive integration of single cell data, including the assembly of harmonized references, and the transfer of information across datasets, and demonstrates how anchoring can harmonize in-situ gene expression and scRNA-seq datasets.
Abstract: Single cell transcriptomics (scRNA-seq) has transformed our ability to discover and annotate cell types and states, but deep biological understanding requires more than a taxonomic listing of clusters. As new methods arise to measure distinct cellular modalities, including high-dimensional immunophenotypes, chromatin accessibility, and spatial positioning, a key analytical challenge is to integrate these datasets into a harmonized atlas that can be used to better understand cellular identity and function. Here, we develop a computational strategy to "anchor" diverse datasets together, enabling us to integrate and compare single cell measurements not only across scRNA-seq technologies, but different modalities as well. After demonstrating substantial improvement over existing methods for data integration, we anchor scRNA-seq experiments with scATAC-seq datasets to explore chromatin differences in closely related interneuron subsets, and project single cell protein measurements onto a human bone marrow atlas to annotate and characterize lymphocyte populations. Lastly, we demonstrate how anchoring can harmonize in-situ gene expression and scRNA-seq datasets, allowing for the transcriptome-wide imputation of spatial gene expression patterns, and the identification of spatial relationships between mapped cell types in the visual cortex. Our work presents a strategy for comprehensive integration of single cell data, including the assembly of harmonized references, and the transfer of information across datasets. Availability: Installation instructions, documentation, and tutorials are available at: https://www.satijalab.org/seurat

2,037 citations

Journal Article
TL;DR: It is proposed that the Pearson residuals from “regularized negative binomial regression,” where cellular sequencing depth is utilized as a covariate in a generalized linear model, successfully remove the influence of technical characteristics from downstream analyses while preserving biological heterogeneity.
Abstract: Single-cell RNA-seq (scRNA-seq) data exhibits significant cell-to-cell variation due to technical factors, including the number of molecules detected in each cell, which can confound biological heterogeneity with technical effects. To address this, we present a modeling framework for the normalization and variance stabilization of molecular count data from scRNA-seq experiments. We propose that the Pearson residuals from “regularized negative binomial regression,” where cellular sequencing depth is utilized as a covariate in a generalized linear model, successfully remove the influence of technical characteristics from downstream analyses while preserving biological heterogeneity. Importantly, we show that an unconstrained negative binomial model may overfit scRNA-seq data, and overcome this by pooling information across genes with similar abundances to obtain stable parameter estimates. Our procedure omits the need for heuristic steps including pseudocount addition or log-transformation and improves common downstream analytical tasks such as variable gene selection, dimensional reduction, and differential expression. Our approach can be applied to any UMI-based scRNA-seq dataset and is freely available as part of the R package sctransform, with a direct interface to our single-cell toolkit Seurat.

1,898 citations
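
The normalization idea in the abstract above (Pearson residuals from a negative binomial regression with sequencing depth as a covariate) can be illustrated with a minimal sketch. It assumes a per-gene model of the form log(mu) = beta0 + beta1 * log10(depth) and a negative binomial variance of mu + mu^2 / theta. The function name, the parameter names, and the numeric values are hypothetical and chosen only for illustration; the regularization step that pools information across genes is not shown, and this is not the sctransform implementation.

```python
import math


def nb_pearson_residual(count, depth, beta0, beta1, theta):
    """Pearson residual of one UMI count under a negative binomial GLM.

    Assumed model (illustrative, not taken from the package source):
        log(mu)  = beta0 + beta1 * log10(depth)
        variance = mu + mu**2 / theta
    """
    # Expected count for this gene in a cell with the given sequencing depth.
    mu = math.exp(beta0 + beta1 * math.log10(depth))
    # Negative binomial variance: Poisson term plus overdispersion term.
    variance = mu + mu ** 2 / theta
    # Observed minus expected, scaled by the model standard deviation.
    return (count - mu) / math.sqrt(variance)


if __name__ == "__main__":
    # Hypothetical parameters for a single gene.
    beta0, beta1, theta = -2.0, 1.0, 10.0
    # The same observed count in a shallow and in a deep cell.
    for depth in (1_000, 10_000):
        r = nb_pearson_residual(5, depth, beta0, beta1, theta)
        print(f"depth={depth:>6}  residual={r:.3f}")
```

With a positive depth coefficient, the same raw count produces a smaller residual in a more deeply sequenced cell, because the model expects more molecules there. This is how the residuals separate the effect of sequencing depth from the expression signal.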