Human housekeeping genes, revisited.

doi:10.1016/J.TIG.2013.05.010

Journal Article•DOI•

Eli Eisenberg¹, Erez Y. Levanon²•Institutions (2)

Tel Aviv University¹, Bar-Ilan University²

01 Oct 2013-Trends in Genetics (Elsevier)-Vol. 29, Iss: 10, pp 569-574

TL;DR: This work describes housekeeping gene detection in the era of massively parallel sequencing and RNA-seq, and provides a list of 3804 human genes that are expressed uniformly across a panel of tissues.

About: This article was published in Trends in Genetics on 2013-10-01. It has received 1063 citations to date. The article focuses on the topic: Housekeeping gene.
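The summary above describes housekeeping genes as genes expressed uniformly across a panel of tissues. As a rough illustration only, the sketch below filters a toy expression table by two criteria: detectable expression in every tissue and a low coefficient of variation across tissues. The function name `uniformly_expressed`, the thresholds `min_level` and `max_cv`, and the toy gene names and values are all assumptions made for this example; they are not the criteria or data used by Eisenberg and Levanon.

```python
from statistics import mean, pstdev


def uniformly_expressed(expr, min_level=1.0, max_cv=0.5):
    """Return the genes that look uniformly expressed in a toy table.

    expr      -- dict mapping gene name -> list of expression values,
                 one value per tissue (hypothetical units)
    min_level -- assumed minimum expression required in every tissue
    max_cv    -- assumed upper bound on the coefficient of variation
    """
    selected = []
    for gene, values in expr.items():
        # Require the gene to be detected in every tissue of the panel.
        if min(values) < min_level:
            continue
        # Coefficient of variation: population std. dev. over the mean.
        cv = pstdev(values) / mean(values)
        if cv <= max_cv:
            selected.append(gene)
    return sorted(selected)


# Hypothetical expression values across four tissues.
toy = {
    "GENE_A": [10.0, 12.0, 11.0, 9.0],   # present everywhere, steady
    "GENE_B": [0.0, 50.0, 1.0, 0.5],     # absent in one tissue
    "GENE_C": [100.0, 5.0, 3.0, 2.0],    # present everywhere, but variable
}

print(uniformly_expressed(toy))
```

Only `GENE_A` passes both filters in this toy table. Any real analysis would need thresholds chosen for the data at hand.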

Citations

Journal Article•DOI•

The landscape of long noncoding RNAs in the human transcriptome

Matthew K. Iyer¹, Yashar S. Niknafs¹, Rohit Malik¹, Udit Singhal¹, Anirban Sahu¹, Yasuyuki Hosono¹, Terrence R. Barrette¹, John R. Prensner¹, Joseph R. Evans¹, Shuang G. Zhao¹, Anton Poliakov¹, Xuhong Cao¹, Saravana M. Dhanasekaran¹, Yi-Mi Wu¹, Dan R. Robinson¹, David G. Beer¹, Felix Y. Feng¹, Hariharan K. Iyer², Arul M. Chinnaiyan¹•Institutions (2)

University of Michigan¹, Colorado State University²

01 Mar 2015-Nature Genetics

TL;DR: The lncRNA landscape characterized here may shed light on normal biology and cancer pathogenesis and may be valuable for future biomarker development.

Abstract: Long noncoding RNAs (lncRNAs) are emerging as important regulators of tissue physiology and disease processes including cancer. To delineate genome-wide lncRNA expression, we curated 7,256 RNA sequencing (RNA-seq) libraries from tumors, normal tissues and cell lines comprising over 43 Tb of sequence from 25 independent studies. We applied ab initio assembly methodology to this data set, yielding a consensus human transcriptome of 91,013 expressed genes. Over 68% (58,648) of genes were classified as lncRNAs, of which 79% were previously unannotated. About 1% (597) of the lncRNAs harbored ultraconserved elements, and 7% (3,900) overlapped disease-associated SNPs. To prioritize lineage-specific, disease-associated lncRNA expression, we employed non-parametric differential expression testing and nominated 7,942 lineage- or cancer-associated lncRNA genes. The lncRNA landscape characterized here may shed light on normal biology and cancer pathogenesis and may be valuable for future biomarker development.

2,209 citations

Journal Article•DOI•

Araport11: a complete reannotation of the Arabidopsis thaliana reference genome

Chia Yi Cheng¹, Vivek Krishnakumar¹, Agnes P. Chan¹, Françoise Thibaud-Nissen², Seth Schobel¹, Christopher D. Town¹•Institutions (2)

J. Craig Venter Institute¹, National Institutes of Health²

01 Feb 2017-Plant Journal

TL;DR: This updated Arabidopsis genome annotation with a substantially increased resolution of gene models will not only further the understanding of the biological processes of this plant model but also of other species.

Abstract: The flowering plant Arabidopsis thaliana is a dicot model organism for research in many aspects of plant biology. A comprehensive annotation of its genome paves the way for understanding the functions and activities of all types of transcripts, including mRNA, the various classes of non-coding RNA, and small RNA. The TAIR10 annotation update had a profound impact on Arabidopsis research but was released more than 5 years ago. Maintaining the accuracy of the annotation continues to be a prerequisite for future progress. Using an integrative annotation pipeline, we assembled tissue-specific RNA-Seq libraries from 113 datasets and constructed 48 359 transcript models of protein-coding genes in eleven tissues. In addition, we annotated various classes of non-coding RNA including microRNA, long intergenic RNA, small nucleolar RNA, natural antisense transcript, small nuclear RNA, and small RNA using published datasets and in-house analytic results. Altogether, we identified 635 novel protein-coding genes, 508 novel transcribed regions, 5178 non-coding RNAs, and 35 846 small RNA loci that were formerly unannotated. Analysis of the splicing events and RNA-Seq based expression profiles revealed the landscapes of gene structures, untranslated regions, and splicing activities to be more intricate than previously appreciated. Furthermore, we present 692 uniformly expressed housekeeping genes, 43% of whose human orthologs are also housekeeping genes. This updated Arabidopsis genome annotation with a substantially increased resolution of gene models will not only further our understanding of the biological processes of this plant model but also of other species.

769 citations

Cites background from "Human housekeeping genes, revisited..."

...It is noteworthy that 297 of these 692 housekeeping genes have human orthologs also characterized as housekeeping genes with uniform expression across 16 human tissues using RNA-Seq data (Eisenberg and Levanon, 2013), indicating a considerable conservation of core processes between plant and mammalian systems....

Journal Article•DOI•

Molecular and pharmacological modulators of the tumor immune contexture revealed by deconvolution of RNA-seq data

Francesca Finotello¹, Clemens Mayer¹, Christina Plattner¹, Gerhard Laschober¹, Dietmar Rieder¹, Hubert Hackl¹, Anne Krogsdam¹, Zuzana Loncova¹, Wilfried Posch¹, Doris Wilflingseder¹, Sieghart Sopper¹, Marieke E. Ijsselsteijn², Thomas P. Brouwer², Douglas B. Johnson³,⁴, Yaomin Xu⁴, Yu Wang⁴, Melinda E. Sanders⁴, Monica V. Estrada⁴, Paula Ericsson-Gonzalez⁴, Pornpimol Charoentong⁵,⁶, Justin M. Balko³,⁴, Noel F C C de Miranda², Zlatko Trajanoski¹•Institutions (6)

Innsbruck Medical University¹, Leiden University², Vanderbilt University³, Vanderbilt University Medical Center⁴, University Hospital Heidelberg⁵, German Cancer Research Center⁶

24 May 2019-Genome Medicine

TL;DR: quanTIseq is a method to quantify the fractions of ten immune cell types from bulk RNA-sequencing data; it was extensively validated in blood and tumor samples using simulated, flow cytometry, and immunohistochemistry data.

Abstract: We introduce quanTIseq, a method to quantify the fractions of ten immune cell types from bulk RNA-sequencing data. quanTIseq was extensively validated in blood and tumor samples using simulated, flow cytometry, and immunohistochemistry data. quanTIseq analysis of 8000 tumor samples revealed that cytotoxic T cell infiltration is more strongly associated with the activation of the CXCR3/CXCL9 axis than with mutational load and that deconvolution-based cell scores have prognostic value in several solid cancers. Finally, we used quanTIseq to show how kinase inhibitors modulate the immune contexture and to reveal immune-cell types that underlie differential patients’ responses to checkpoint blockers. Availability: quanTIseq is available at http://icbi.at/quantiseq.

572 citations

Journal Article•DOI•

Integrative functional genomic analysis of human brain development and neuropsychiatric risks

Mingfeng Li¹, Gabriel Santpere¹, Yuka Imamura Kawasawa¹,², Oleg V. Evgrafov³, Forrest O. Gulden¹, Sirisha Pochareddy¹, Susan M. Sunkin⁴, Zhen Li¹, Yurae Shin¹,⁵, Ying Zhu¹, André M. M. Sousa¹, Donna M. Werling⁶, Robert R. Kitchen¹, Hyo Jung Kang¹,⁷, Mihovil Pletikos¹,⁸, Jinmyung Choi¹, Sydney Muchnik¹, Xuming Xu¹, Daifeng Wang⁹, Belen Lorente-Galdos¹, Shuang Liu¹, Paola Giusti-Rodríguez¹⁰, Hyejung Won¹⁰, Christiaan de Leeuw¹¹, Antonio F. Pardiñas¹², PsychENCODE Developmental Subgroup¹⁰, Ming Hu¹³, Fulai Jin¹⁴, Yun Li¹⁰, Michael John Owen¹², Michael Conlon O'Donovan¹², James T.R. Walters¹², Danielle Posthuma¹¹, Mark Reimers¹⁵, Pat Levitt¹⁶,¹⁷, Daniel R. Weinberger¹⁸, Thomas M. Hyde¹⁸, Joel E. Kleinman¹⁸, Daniel H. Geschwind¹⁹, Michael Hawrylycz⁴, Matthew W. State⁶, Stephen Sanders⁶, Patrick F. Sullivan⁹, Mark Gerstein, Ed S. Lein⁴, James A. Knowles³, Nenad Sestan•Institutions (19)

Yale University¹, Pennsylvania State University², SUNY Downstate Medical Center³, Allen Institute for Brain Science⁴, National Research Foundation of South Africa⁵, University of California, San Francisco⁶, Chung-Ang University⁷, Boston University⁸, Stony Brook University⁹, University of North Carolina at Chapel Hill¹⁰, VU University Amsterdam¹¹, Cardiff University¹², Cleveland Clinic¹³, Case Western Reserve University¹⁴, Michigan State University¹⁵, University of Southern California¹⁶, Children's Hospital Los Angeles¹⁷, Johns Hopkins University¹⁸, University of California, Los Angeles¹⁹

14 Dec 2018-Science

TL;DR: The generation and analysis of a variety of genomic data modalities at the tissue and single-cell levels, including transcriptome, DNA methylation, and histone modifications across multiple brain regions ranging in age from embryonic development through adulthood, reveal insights into neurodevelopment and the genomic basis of neuropsychiatric risks.

Abstract: To broaden our understanding of human neurodevelopment, we profiled transcriptomic and epigenomic landscapes across brain regions and/or cell types for the entire span of prenatal and postnatal development. Integrative analysis revealed temporal, regional, sex, and cell type-specific dynamics. We observed a global transcriptomic cup-shaped pattern, characterized by a late fetal transition associated with sharply decreased regional differences and changes in cellular composition and maturation, followed by a reversal in childhood-adolescence, and accompanied by epigenomic reorganizations. Analysis of gene coexpression modules revealed relationships with epigenomic regulation and neurodevelopmental processes. Genes with genetic associations to brain-based traits and neuropsychiatric disorders (including MEF2C, SATB2, SOX5, TCF4, and TSHZ3) converged in a small number of modules and distinct cell types, revealing insights into neurodevelopment and the genomic basis of neuropsychiatric risks.

532 citations

Journal Article•DOI•

Activity-by-contact model of enhancer-promoter regulation from thousands of CRISPR perturbations.

Charles P. Fulco¹,², Joseph Nasser¹, Thouis R. Jones¹, Glen Munson¹, Drew T. Bergman¹, Vidya Subramanian¹, Sharon R. Grossman¹,³, Rockwell Anyoha¹, Benjamin R. Doughty¹, Tejal A. Patwardhan¹, Tung T. Nguyen¹, Michael Kane¹, Elizabeth M. Perez¹, Neva C. Durand, Caleb A. Lareau¹, Elena K. Stamenova¹, Erez Lieberman Aiden, Eric S. Lander¹,²,³, Jesse M. Engreitz¹,²•Institutions (3)

Broad Institute¹, Harvard University², Massachusetts Institute of Technology³

03 Jul 2019-Nature Genetics

TL;DR: A simple activity-by-contact model substantially outperformed previous methods at predicting the complex connections in the CRISPR dataset and allows systematic mapping of enhancer–gene connections in a given cell type, on the basis of chromatin-state measurements.

Abstract: Enhancer elements in the human genome control how genes are expressed in specific cell types and harbor thousands of genetic variants that influence risk for common diseases¹⁻⁴. Yet, we still do not know how enhancers regulate specific genes, and we lack general rules to predict enhancer-gene connections across cell types⁵,⁶. We developed an experimental approach, CRISPRi-FlowFISH, to perturb enhancers in the genome, and we applied it to test >3,500 potential enhancer-gene connections for 30 genes. We found that a simple activity-by-contact model substantially outperformed previous methods at predicting the complex connections in our CRISPR dataset. This activity-by-contact model allows us to construct genome-wide maps of enhancer-gene connections in a given cell type, on the basis of chromatin state measurements. Together, CRISPRi-FlowFISH and the activity-by-contact model provide a systematic approach to map and predict which enhancers regulate which genes, and will help to interpret the functions of the thousands of disease risk variants in the noncoding genome.
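The abstract does not spell out the scoring rule. One reading of an activity-by-contact model, assumed here rather than taken from this page, is that each candidate element's score for a gene is its activity multiplied by its contact frequency with the gene's promoter, divided by the sum of that same product over all candidate elements considered for the gene. The sketch below illustrates that arithmetic on made-up numbers; the function name `abc_scores`, the element names, and the values are hypothetical.

```python
def abc_scores(elements):
    """Compute activity-by-contact shares for one gene.

    elements -- dict mapping element name -> (activity, contact), where
                `contact` is the element's contact frequency with the
                gene's promoter (both values are hypothetical here)

    Returns a dict mapping element name -> its share of the summed
    activity * contact over all supplied elements.
    """
    products = {name: a * c for name, (a, c) in elements.items()}
    total = sum(products.values())
    if total <= 0:
        raise ValueError("sum of activity * contact must be positive")
    return {name: p / total for name, p in products.items()}


# Three hypothetical candidate enhancers for a single gene.
toy = {
    "enh1": (10.0, 0.5),   # activity * contact = 5.0
    "enh2": (5.0, 0.2),    # activity * contact = 1.0
    "enh3": (20.0, 0.2),   # activity * contact = 4.0
}

for name, score in abc_scores(toy).items():
    print(name, round(score, 3))
```

By construction the scores for one gene sum to 1, so a score can be read as the fraction of the total regulatory input attributed to that element under this assumed model.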

525 citations

References

Journal Article•DOI•

Fast gapped-read alignment with Bowtie 2

Ben Langmead¹, Steven L. Salzberg¹,²,³•Institutions (3)

University of Maryland, College Park¹, Johns Hopkins University School of Medicine², Johns Hopkins University³

01 Apr 2012-Nature Methods

TL;DR: Bowtie 2 combines the strengths of the full-text minute index with the flexibility and speed of hardware-accelerated dynamic programming algorithms to achieve a combination of high speed, sensitivity and accuracy.

Abstract: As the rate of sequencing increases, greater throughput is demanded from read aligners. The full-text minute index is often used to make alignment very fast and memory-efficient, but the approach is ill-suited to finding longer, gapped alignments. Bowtie 2 combines the strengths of the full-text minute index with the flexibility and speed of hardware-accelerated dynamic programming algorithms to achieve a combination of high speed, sensitivity and accuracy.

37,898 citations

Journal Article•DOI•

Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources.

Da-Wei Huang¹, Brad T. Sherman¹, Richard A. Lempicki¹•Institutions (1)

Science Applications International Corporation¹

01 Jan 2009-Nature Protocols

TL;DR: By following this protocol, investigators are able to gain an in-depth understanding of the biological themes in lists of genes that are enriched in genome-scale studies.

Abstract: DAVID bioinformatics resources consists of an integrated biological knowledgebase and analytic tools aimed at systematically extracting biological meaning from large gene/protein lists. This protocol explains how to use DAVID, a high-throughput and integrated data-mining environment, to analyze gene lists derived from high-throughput genomic experiments. The procedure first requires uploading a gene list containing any number of common gene identifiers followed by analysis using one or more text and pathway-mining tools such as gene functional classification, functional annotation chart or clustering and functional annotation table. By following this protocol, investigators are able to gain an in-depth understanding of the biological themes in lists of genes that are enriched in genome-scale studies.

31,015 citations

Journal Article•DOI•

Accurate normalization of real-time quantitative RT-PCR data by geometric averaging of multiple internal control genes

Jo Vandesompele¹, Katleen De Preter¹, Filip Pattyn¹, Bruce Poppe¹, Nadine Van Roy¹, Anne De Paepe¹, Franki Speleman¹•Institutions (1)

Ghent University Hospital¹

18 Jun 2002-Genome Biology

TL;DR: The normalization strategy presented here is a prerequisite for accurate RT-PCR expression profiling, which opens up the possibility of studying the biological relevance of small expression differences.

Abstract: Gene-expression analysis is increasingly important in biological research, with real-time reverse transcription PCR (RT-PCR) becoming the method of choice for high-throughput and accurate expression profiling of selected genes. Given the increased sensitivity, reproducibility and large dynamic range of this methodology, the requirements for a proper internal control gene for normalization have become increasingly stringent. Although housekeeping gene expression has been reported to vary considerably, no systematic survey has properly determined the errors related to the common practice of using only one control gene, nor presented an adequate way of working around this problem. We outline a robust and innovative strategy to identify the most stably expressed control genes in a given set of tissues, and to determine the minimum number of genes required to calculate a reliable normalization factor. We have evaluated ten housekeeping genes from different abundance and functional classes in various human tissues, and demonstrated that the conventional use of a single gene for normalization leads to relatively large errors in a significant proportion of samples tested. The geometric mean of multiple carefully selected housekeeping genes was validated as an accurate normalization factor by analyzing publicly available microarray data. The normalization strategy presented here is a prerequisite for accurate RT-PCR expression profiling, which, among other things, opens up the possibility of studying the biological relevance of small expression differences.
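The abstract above describes using the geometric mean of several housekeeping genes as a normalization factor. A minimal sketch of that arithmetic follows, assuming relative expression levels on a linear scale; the function names `normalization_factor` and `normalize` and the sample values are hypothetical, and the selection of which control genes to use is not shown.

```python
from math import exp, log


def normalization_factor(control_levels):
    """Geometric mean of control-gene expression levels for one sample.

    control_levels -- list of positive, linear-scale expression values,
                      one per control (housekeeping) gene
    """
    if not control_levels or any(v <= 0 for v in control_levels):
        raise ValueError("control levels must be a non-empty list of positives")
    # Geometric mean computed as exp(mean(log(x))) for numerical stability.
    return exp(sum(log(v) for v in control_levels) / len(control_levels))


def normalize(target_level, control_levels):
    """Divide a target gene's level by the sample's normalization factor."""
    return target_level / normalization_factor(control_levels)


# Hypothetical sample: three control genes and one gene of interest.
controls = [2.0, 8.0, 4.0]   # geometric mean is the cube root of 64
target = 12.0

print(normalization_factor(controls))
print(normalize(target, controls))
```

A geometric mean is less sensitive than an arithmetic mean to a single control gene with an unusually high level, which is one reason to average over several controls rather than rely on one.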

18,261 citations

Journal Article•DOI•

Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation

Cole Trapnell¹,², Brian A. Williams³, Geo Pertea¹, Ali Mortazavi³, Gordon Kwan³, Marijke J. van Baren⁴, Steven L. Salzberg¹, Barbara J. Wold³, Lior Pachter²•Institutions (4)

University of Maryland, College Park¹, University of California, Berkeley², California Institute of Technology³, Washington University in St. Louis⁴

01 May 2010-Nature Biotechnology

TL;DR: The results suggest that Cufflinks can illuminate the substantial regulatory flexibility and complexity in even this well-studied model of muscle development and that it can improve transcriptome-based genome annotation.

Abstract: High-throughput mRNA sequencing (RNA-Seq) promises simultaneous transcript discovery and abundance estimation. However, this would require algorithms that are not restricted by prior gene annotations and that account for alternative transcription and splicing. Here we introduce such algorithms in an open-source software program called Cufflinks. To test Cufflinks, we sequenced and analyzed >430 million paired 75-bp RNA-Seq reads from a mouse myoblast cell line over a differentiation time series. We detected 13,692 known transcripts and 3,724 previously unannotated ones, 62% of which are supported by independent expression data or by homologous genes in other species. Over the time series, 330 genes showed complete switches in the dominant transcription start site (TSS) or splice isoform, and we observed more subtle shifts in 1,304 other genes. These results suggest that Cufflinks can illuminate the substantial regulatory flexibility and complexity in even this well-studied model of muscle development and that it can improve transcriptome-based genome annotation.

13,337 citations

Journal Article•DOI•

Mapping and quantifying mammalian transcriptomes by RNA-Seq.

Ali Mortazavi¹, Brian A. Williams¹, Kenneth McCue¹, Lorian Schaeffer¹, Barbara J. Wold¹•Institutions (1)

California Institute of Technology¹

29 Jun 2008-Nature Methods

TL;DR: Although >90% of uniquely mapped reads fell within known exons, the remaining data suggest new and revised gene models, including changed or additional promoters, exons and 3′ untranscribed regions, as well as new candidate microRNA precursors.

Abstract: We have mapped and quantified mouse transcriptomes by deeply sequencing them and recording how frequently each gene is represented in the sequence sample (RNA-Seq). This provides a digital measure of the presence and prevalence of transcripts from known and previously unknown genes. We report reference measurements composed of 41–52 million mapped 25-base-pair reads for poly(A)-selected RNA from adult mouse brain, liver and skeletal muscle tissues. We used RNA standards to quantify transcript prevalence and to test the linear range of transcript detection, which spanned five orders of magnitude. Although >90% of uniquely mapped reads fell within known exons, the remaining data suggest new and revised gene models, including changed or additional promoters, exons and 3′ untranscribed regions, as well as new candidate microRNA precursors. RNA splice events, which are not readily measured by standard gene expression microarray or serial analysis of gene expression methods, were detected directly by mapping splice-crossing sequence reads. We observed 1.45 × 10⁵ distinct splices, and alternative splices were prominent, with 3,500 different genes expressing one or more alternate internal splices.

The mRNA population specifies a cell’s identity and helps to govern its present and future activities. This has made transcriptome analysis a general phenotyping method, with expression microarrays of many kinds in routine use. Here we explore the possibility that transcriptome analysis, transcript discovery and transcript refinement can be done effectively in large and complex mammalian genomes by ultra-high-throughput sequencing. Expression microarrays are currently the most widely used methodology for transcriptome analysis, although some limitations persist. These include hybridization and cross-hybridization artifacts¹⁻³, dye-based detection issues and design constraints that preclude or seriously limit the detection of RNA splice patterns and previously unmapped genes. These issues have made it difficult for standard array designs to provide full sequence comprehensiveness (coverage of all possible genes, including unknown ones, in large genomes) or transcriptome comprehensiveness (reliable detection of all RNAs of all prevalence classes, including the least abundant ones that are physiologically relevant). Other

12,293 citations