Single-cell RNA-Seq profiling of human preimplantation embryos and embryonic stem cells

doi:10.1038/NSMB.2660

Liying Yan¹, Mingyu Yang¹, Hongshan Guo¹, Lu Yang¹, Jun Wu¹, Rong Li¹,², Ping Liu¹, Ying Lian¹, Xiaoying Zheng¹, Jie Yan¹, Jin Huang¹, Ming Li¹, Xinglong Wu¹, Lu Wen¹, Kaiqin Lao³, Ruiqiang Li¹, Jie Qiao¹,², Fuchou Tang¹

Peking University¹, Chinese Ministry of Education², Applied Biosystems³

01 Sep 2013-Nature Structural & Molecular Biology (Nat Struct Mol Biol)-Vol. 20, Iss: 9, pp 1131-1139

TL;DR: It is found that EPI cells and primary hESC outgrowth have dramatically different transcriptomes, with 1,498 genes showing differential expression between them, and this work provides a comprehensive framework of the transcriptome landscapes of human early embryos and hESCs.

Abstract: Measuring gene expression in individual cells is crucial for understanding the gene regulatory network controlling human embryonic development. Here we apply single-cell RNA sequencing (RNA-Seq) analysis to 124 individual cells from human preimplantation embryos and human embryonic stem cells (hESCs) at different passages. The number of maternally expressed genes detected in our data set is 22,687, including 8,701 long noncoding RNAs (lncRNAs), which represents a significant increase from 9,735 maternal genes detected previously by cDNA microarray. We discovered 2,733 novel lncRNAs, many of which are expressed in specific developmental stages. To address the long-standing question whether gene expression signatures of human epiblast (EPI) and in vitro hESCs are the same, we found that EPI cells and primary hESC outgrowth have dramatically different transcriptomes, with 1,498 genes showing differential expression between them. This work provides a comprehensive framework of the transcriptome landscapes of human early embryos and hESCs.

Citations

Journal Article

Droplet Barcoding for Single-Cell Transcriptomics Applied to Embryonic Stem Cells

Allon M. Klein¹, Linas Mazutis¹,², Ilke Akartuna¹, Naren Tallapragada¹, Adrian Veres¹, Victor C. Li¹, Leonid Peshkin¹, David A. Weitz¹, Marc W. Kirschner¹

Harvard University¹, Vilnius University²

21 May 2015-Cell

TL;DR: This work has developed a high-throughput droplet-microfluidic approach for barcoding the RNA from thousands of individual cells for subsequent analysis by next-generation sequencing, which shows a surprisingly low noise profile and is readily adaptable to other sequencing-based assays.

2,894 citations

Cites background from "Single-cell RNA-Seq profiling of hu..."

...Previous studies have indicated that ES cells are heterogeneous in gene expression (Guo et al., 2010; Hayashi et al., 2008; MacArthur et al., 2012; Martinez Arias and Brickman, 2011; Ohnishi et al., 2014; Singer et al., 2014; Torres-Padilla and Chambers, 2014; Yan et al., 2013)...

Journal Article

Full-length RNA-seq from single cells using Smart-seq2

Simone Picelli¹, Omid R. Faridani¹, Åsa K. Björklund¹, Gösta Winberg¹, Sven Sagasser¹, Rickard Sandberg¹

Ludwig Institute for Cancer Research¹

01 Jan 2014-Nature Protocols

TL;DR: In this article, the authors presented a detailed protocol for Smart-seq2 that allows the generation of full-length cDNA and sequencing libraries by using standard reagents, and the entire protocol takes ∼2 d from cell picking to having a final library ready for sequencing; sequencing will require an additional 1-3 d depending on the strategy and sequencer.

Abstract: Emerging methods for the accurate quantification of gene expression in individual cells hold promise for revealing the extent, function and origins of cell-to-cell variability. Different high-throughput methods for single-cell RNA-seq have been introduced that vary in coverage, sensitivity and multiplexing ability. We recently introduced Smart-seq for transcriptome analysis from single cells, and we subsequently optimized the method for improved sensitivity, accuracy and full-length coverage across transcripts. Here we present a detailed protocol for Smart-seq2 that allows the generation of full-length cDNA and sequencing libraries by using standard reagents. The entire protocol takes ∼2 d from cell picking to having a final library ready for sequencing; sequencing will require an additional 1-3 d depending on the strategy and sequencer. The current limitations are the lack of strand specificity and the inability to detect nonpolyadenylated (polyA(-)) RNA.

2,845 citations

Full-length RNA-seq from single cells using

Omid R. Faridani, Åsa K. Björklund, Gösta Winberg, Sven Sagasser, Rickard Sandberg

01 Jan 2014

TL;DR: A detailed protocol is presented for Smart-seq2 that allows the generation of full-length cDNA and sequencing libraries by using standard reagents; its current limitations are the lack of strand specificity and the inability to detect nonpolyadenylated (polyA−) RNA.

Abstract: Emerging methods for the accurate quantification of gene expression in individual cells hold promise for revealing the extent, function and origins of cell-to-cell variability. Different high-throughput methods for single-cell RNA-seq have been introduced that vary in coverage, sensitivity and multiplexing ability. We recently introduced Smart-seq for transcriptome analysis from single cells, and we subsequently optimized the method for improved sensitivity, accuracy and full-length coverage across transcripts. Here we present a detailed protocol for Smart-seq2 that allows the generation of full-length cDNA and sequencing libraries by using standard reagents. The entire protocol takes ∼2 d from cell picking to having a final library ready for sequencing; sequencing will require an additional 1–3 d depending on the strategy and sequencer. The current limitations are the lack of strand specificity and the inability to detect nonpolyadenylated (polyA−) RNA.

2,238 citations

Journal Article

Computational analysis of cell-to-cell heterogeneity in single-cell RNA-sequencing data reveals hidden subpopulations of cells

Florian Buettner¹, Kedar Nath Natarajan¹,², F Paolo Casale¹, Valentina Proserpio¹,², Antonio Scialdone¹,², Fabian J. Theis³, Sarah A. Teichmann¹,², John C. Marioni¹,², Oliver Stegle¹

European Bioinformatics Institute¹, Wellcome Trust Sanger Institute², Technische Universität München³

01 Feb 2015-Nature Biotechnology

TL;DR: It is shown that the single-cell latent variable model (scLVM) allows the identification of otherwise undetectable subpopulations of cells that correspond to different stages during the differentiation of naive T cells into T helper 2 cells.

Abstract: Hidden cell sub-populations are detected by accounting for confounding variation in the analysis of single-cell RNA-seq data. Recent technical developments have enabled the transcriptomes of hundreds of cells to be assayed in an unbiased manner, opening up the possibility that new subpopulations of cells can be found. However, the effects of potential confounding factors, such as the cell cycle, on the heterogeneity of gene expression and therefore on the ability to robustly identify subpopulations remain unclear. We present and validate a computational approach that uses latent variable models to account for such hidden factors. We show that our single-cell latent variable model (scLVM) allows the identification of otherwise undetectable subpopulations of cells that correspond to different stages during the differentiation of naive T cells into T helper 2 cells. Our approach can be used not only to identify cellular subpopulations but also to tease apart different sources of gene expression heterogeneity in single-cell transcriptomes.

1,132 citations

Journal Article

SC3: consensus clustering of single-cell RNA-seq data

Vladimir Yu. Kiselev¹, Kristina Kirschner², Michael T. Schaub³,⁴, Tallulah S. Andrews¹, Andrew Yiu¹, Tamir Chandra¹,⁵, Kedar Nath Natarajan¹,⁶, Wolf Reik¹,²,⁵, Mauricio Barahona⁷, Anthony R. Green², Martin Hemberg¹

Wellcome Trust Sanger Institute¹, University of Cambridge², Université catholique de Louvain³, Université de Namur⁴, Babraham Institute⁵, European Bioinformatics Institute⁶, Imperial College London⁷

01 May 2017-Nature Methods

TL;DR: It is demonstrated that SC3 is capable of identifying subclones from the transcriptomes of neoplastic cells collected from patients and achieves high accuracy and robustness by combining multiple clustering solutions through a consensus approach.

Abstract: Single-cell RNA-seq enables the quantitative characterization of cell types based on global transcriptome profiles. We present single-cell consensus clustering (SC3), a user-friendly tool for unsupervised clustering, which achieves high accuracy and robustness by combining multiple clustering solutions through a consensus approach (http://bioconductor.org/packages/SC3). We demonstrate that SC3 is capable of identifying subclones from the transcriptomes of neoplastic cells collected from patients.

1,120 citations

References

Journal Article

Controlling the false discovery rate: a practical and powerful approach to multiple testing

Yoav Benjamini, Yosef Hochberg

01 Jan 1995-Journal of the Royal Statistical Society, Series B (Methodological)

TL;DR: In this paper, a different approach to problems of multiple significance testing is presented, which calls for controlling the expected proportion of falsely rejected hypotheses, the false discovery rate. This error rate is equivalent to the FWER when all hypotheses are true but is smaller otherwise.

Abstract: The common approach to the multiplicity problem calls for controlling the familywise error rate (FWER). This approach, though, has faults, and we point out a few. A different approach to problems of multiple significance testing is presented. It calls for controlling the expected proportion of falsely rejected hypotheses, the false discovery rate. This error rate is equivalent to the FWER when all hypotheses are true but is smaller otherwise. Therefore, in problems where the control of the false discovery rate rather than that of the FWER is desired, there is potential for a gain in power. A simple sequential Bonferroni-type procedure is proved to control the false discovery rate for independent test statistics, and a simulation study shows that the gain in power is substantial. The use of the new procedure and the appropriateness of the criterion are illustrated with examples.
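
The sequential procedure mentioned in this abstract is commonly stated as a step-up rule: sort the m p-values, find the largest rank k with p(k) <= (k/m)·q, and reject the k hypotheses with the smallest p-values. The following is a minimal sketch of that rule under the independence assumption named in the abstract; the function name `benjamini_hochberg` and the parameter `q` are illustrative choices, not taken from the paper.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return one boolean per hypothesis: True if rejected at FDR level q.

    Illustrative sketch of the step-up rule; names are hypothetical.
    """
    m = len(p_values)
    # Indices of the hypotheses, ordered by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k whose p-value is at most (k / m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# The two smallest p-values miss their own thresholds (0.0125 and 0.025),
# but the third meets its threshold (0.0375), so all three are rejected.
print(benjamini_hochberg([0.02, 0.03, 0.035, 0.5]))  # [True, True, True, False]
```

The example shows why the rule is "step-up": a hypothesis can be rejected even if its own p-value exceeds its own threshold, as long as a larger-ranked p-value passes.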

83,420 citations

Journal Article

Fast and accurate short read alignment with Burrows–Wheeler transform

Heng Li¹, Richard Durbin¹

Wellcome Trust Sanger Institute¹

01 Jul 2009-Bioinformatics

TL;DR: Burrows-Wheeler Alignment tool (BWA) is implemented, a new read alignment package that is based on backward search with Burrows–Wheeler Transform (BWT), to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps.

Abstract: Motivation: The enormous amount of short reads generated by the new DNA sequencing technologies call for the development of fast and accurate read alignment programs. A first generation of hash table-based methods has been developed, including MAQ, which is accurate, feature rich and fast enough to align short reads from a single individual. However, MAQ does not support gapped alignment for single-end reads, which makes it unsuitable for alignment of longer reads where indels may occur frequently. The speed of MAQ is also a concern when the alignment is scaled up to the resequencing of hundreds of individuals. Results: We implemented Burrows-Wheeler Alignment tool (BWA), a new read alignment package that is based on backward search with Burrows–Wheeler Transform (BWT), to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps. BWA supports both base space reads, e.g. from Illumina sequencing machines, and color space reads from AB SOLiD machines. Evaluations on both simulated and real data suggest that BWA is ~10–20× faster than MAQ, while achieving similar accuracy. In addition, BWA outputs alignment in the new standard SAM (Sequence Alignment/Map) format. Variant calling and other downstream analyses after the alignment can be achieved with the open source SAMtools software package. Availability: http://maq.sourceforge.net Contact: [email protected]
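
As a rough illustration of the backward search with the Burrows–Wheeler Transform that this abstract refers to, the sketch below builds a BWT index from a naively sorted suffix array and finds exact matches only. It does not reproduce BWA itself, which also handles mismatches and gaps and uses far more compact data structures; the function names `bwt_index` and `backward_search` are hypothetical.

```python
def bwt_index(text):
    """Build a suffix array plus the C and Occ tables used by backward search."""
    text += "$"  # sentinel that sorts before A, C, G, T
    # Naive suffix array: fine for a toy example, far too slow for a genome.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (wraps to '$' for i == 0).
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # C[c]: number of characters in the text that are strictly smaller than c.
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += text.count(c)
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (ch == c)
    return sa, C, occ


def backward_search(pattern, sa, C, occ):
    """Return sorted start positions of exact occurrences of pattern."""
    lo, hi = 0, len(sa)  # half-open interval of suffix-array rows
    for ch in reversed(pattern):  # extend the match one character to the left
        if ch not in C:
            return []
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


sa, C, occ = bwt_index("GATTACATTAC")
print(backward_search("TTAC", sa, C, occ))  # [2, 7]
```

Each pattern character narrows the suffix-array interval in constant time once the tables exist, which is why the search cost depends on the read length rather than on the reference length.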

43,862 citations

Journal Article

Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources.

Da-Wei Huang¹, Brad T. Sherman¹, Richard A. Lempicki¹

Science Applications International Corporation¹

01 Jan 2009-Nature Protocols

TL;DR: By following this protocol, investigators are able to gain an in-depth understanding of the biological themes in lists of genes that are enriched in genome-scale studies.

Abstract: DAVID bioinformatics resources consists of an integrated biological knowledgebase and analytic tools aimed at systematically extracting biological meaning from large gene/protein lists. This protocol explains how to use DAVID, a high-throughput and integrated data-mining environment, to analyze gene lists derived from high-throughput genomic experiments. The procedure first requires uploading a gene list containing any number of common gene identifiers followed by analysis using one or more text and pathway-mining tools such as gene functional classification, functional annotation chart or clustering and functional annotation table. By following this protocol, investigators are able to gain an in-depth understanding of the biological themes in lists of genes that are enriched in genome-scale studies.

31,015 citations

Journal Article

Cluster analysis and display of genome-wide expression patterns

Michael B. Eisen¹, Paul T. Spellman¹, Patrick O. Brown¹, David Botstein¹

Stanford University¹

08 Dec 1998-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: A system of cluster analysis for genome-wide expression data from DNA microarray hybridization is described that uses standard statistical algorithms to arrange genes according to similarity in pattern of gene expression, finding in the budding yeast Saccharomyces cerevisiae that clustering gene expression data groups together efficiently genes of known similar function.

Abstract: A system of cluster analysis for genome-wide expression data from DNA microarray hybridization is described that uses standard statistical algorithms to arrange genes according to similarity in pattern of gene expression. The output is displayed graphically, conveying the clustering and the underlying expression data simultaneously in a form intuitive for biologists. We have found in the budding yeast Saccharomyces cerevisiae that clustering gene expression data groups together efficiently genes of known similar function, and we find a similar tendency in human data. Thus patterns seen in genome-wide expression experiments can be interpreted as indications of the status of cellular processes. Also, coexpression of genes of known function with poorly characterized or novel genes may provide a simple means of gaining leads to the functions of many genes for which information is not available currently.

16,371 citations

Journal Article

Full-length transcriptome assembly from RNA-Seq data without a reference genome.

Manfred Grabherr¹, Brian J. Haas¹, Moran Yassour¹,², Joshua Z. Levin¹, Dawn Thompson¹, Ido Amit¹, Xian Adiconis¹, Lin Fan¹, Raktima Raychowdhury¹, Qiandong Zeng¹, Zehua Chen¹, Evan Mauceli¹, Nir Hacohen¹, Andreas Gnirke¹, Nicholas Rhind³, Federica Di Palma¹, Bruce W. Birren¹, Chad Nusbaum¹, Kerstin Lindblad-Toh¹,⁴, Nir Friedman², Aviv Regev¹

Massachusetts Institute of Technology¹, Hebrew University of Jerusalem², University of Massachusetts Medical School³, Science for Life Laboratory⁴

01 Jul 2011-Nature Biotechnology

TL;DR: The Trinity method for de novo assembly of full-length transcripts is presented and evaluated on samples from fission yeast, mouse and whitefly, whose reference genome is not yet available, providing a unified solution for transcriptome reconstruction in any sample.

Abstract: Massively parallel sequencing of cDNA has enabled deep and efficient probing of transcriptomes. Current approaches for transcript reconstruction from such data often rely on aligning reads to a reference genome, and are thus unsuitable for samples with a partial or missing reference genome. Here we present the Trinity method for de novo assembly of full-length transcripts and evaluate it on samples from fission yeast, mouse and whitefly, whose reference genome is not yet available. By efficiently constructing and analyzing sets of de Bruijn graphs, Trinity fully reconstructs a large fraction of transcripts, including alternatively spliced isoforms and transcripts from recently duplicated genes. Compared with other de novo transcriptome assemblers, Trinity recovers more full-length transcripts across a broad range of expression levels, with a sensitivity similar to methods that rely on genome alignments. Our approach provides a unified solution for transcriptome reconstruction in any sample, especially in the absence of a reference genome.
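
To make the de Bruijn graph idea in this abstract concrete, here is a minimal sketch of how such a graph can be built from reads: every k-mer contributes an edge from its (k-1)-base prefix to its (k-1)-base suffix. This is only the textbook construction; Trinity's actual pipeline is considerably more involved, and the function name `de_bruijn_graph` and the toy reads are assumptions made for illustration.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        # Slide a window of length k along the read.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Edge from the k-mer's prefix node to its suffix node.
            graph[kmer[:-1]].add(kmer[1:])
    return graph


# Two overlapping toy reads collapse into a single path AC -> CG -> GT -> TA.
graph = de_bruijn_graph(["ACGT", "CGTA"], k=3)
for node in sorted(graph):
    print(node, "->", sorted(graph[node]))
```

Overlapping reads share nodes, so a transcript appears as a path through the graph, while alternative isoforms or duplicated genes show up as branches that an assembler then has to resolve.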

15,665 citations