Author

Manolis Kellis

Other affiliations: Broad Institute, Epigenomics AG, Harvard University, among others

Bio: Manolis Kellis is an academic researcher at the Massachusetts Institute of Technology. The author has contributed to research on topics including Genome and Gene. The author has an h-index of 128 and has co-authored 405 publications, which have received 112,181 citations. Previous affiliations of Manolis Kellis include the Broad Institute and Epigenomics AG.

Topics: Genome, Gene, Chromatin, Genomics, Genome-wide association study, among others

[Chart: papers published on a yearly basis, 2003–2023; per-year counts not preserved]

Papers

Journal Article

Revisiting the protein-coding gene catalog of Drosophila melanogaster using 12 fly genomes.

Michael F. Lin¹, Joseph W. Carlson, Madeline A. Crosby, Beverley B. Matthews, Charles Yu, Soo Park, Kenneth H. Wan, Andrew J. Schroeder, L. Sian Gramates, Susan E. St. Pierre, Margaret Roark, Kenneth L. Wiley, Rob J. Kulathinal, Peili Zhang, Kyl V. Myrick, Jerry V. Antone, Susan E. Celniker, William M. Gelbart, Manolis Kellis

Broad Institute¹

01 Dec 2007-Genome Research

TL;DR: Direct genome-wide searches for unusual protein-coding structures are performed, discovering 149 possible examples of stop codon readthrough, 125 new candidate ORFs of polycistronic mRNAs, and several candidate translational frameshifts.

Abstract: The availability of sequenced genomes from 12 Drosophila species has enabled the use of comparative genomics for the systematic discovery of functional elements conserved within this genus. We have developed quantitative metrics for the evolutionary signatures specific to protein-coding regions and applied them genome-wide, resulting in 1193 candidate new protein-coding exons in the D. melanogaster genome. We have reviewed these predictions by manual curation and validated a subset by directed cDNA screening and sequencing, revealing both new genes and new alternative splice forms of known genes. We also used these evolutionary signatures to evaluate existing gene annotations, resulting in the validation of 87% of genes lacking descriptive names and identifying 414 poorly conserved genes that are likely to be spurious predictions, noncoding, or species-specific genes. Furthermore, our methods suggest a variety of refinements to hundreds of existing gene models, such as modifications to translation start codons and exon splice boundaries. Finally, we performed directed genome-wide searches for unusual protein-coding structures, discovering 149 possible examples of stop codon readthrough, 125 new candidate ORFs of polycistronic mRNAs, and several candidate translational frameshifts. These results affect >10% of annotated fly genes and demonstrate the power of comparative genomics to enhance our understanding of genome organization, even in a model organism as intensively studied as Drosophila melanogaster.

160 citations

Journal Article

Deep learning for regulatory genomics.

Yongjin Park¹, Manolis Kellis¹

Massachusetts Institute of Technology¹

01 Aug 2015-Nature Biotechnology

TL;DR: Computational modeling of DNA and RNA targets of regulatory proteins is improved by a deep-learning approach and shows good results in terms of uniformity, accuracy, and efficiency.

Abstract: Computational modeling of DNA and RNA targets of regulatory proteins is improved by a deep-learning approach.

160 citations

Journal Article

Regulatory genomic circuitry of human disease loci by integrative epigenomics

Carles Boix¹,², Benjamin T James¹,², Yongjin Park¹,²,³, Wouter Meuleman, Manolis Kellis¹,²

Massachusetts Institute of Technology¹, Broad Institute², University of British Columbia³

03 Feb 2021-Nature

TL;DR: EpiMap is a compendium of 10,000 epigenomic maps across more than 800 biosamples for annotating genome-wide association study circuitry; the maps are used to define chromatin states, high-resolution enhancers, enhancer modules, upstream regulators and downstream target genes.

Abstract: Annotating the molecular basis of human disease remains an unsolved challenge, as 93% of disease loci are non-coding and gene-regulatory annotations are highly incomplete [1–3]. Here we present EpiMap, a compendium comprising 10,000 epigenomic maps across 800 samples, which we used to define chromatin states, high-resolution enhancers, enhancer modules, upstream regulators and downstream target genes. We used this resource to annotate 30,000 genetic loci that were associated with 540 traits [4], predicting trait-relevant tissues, putative causal nucleotide variants in enriched tissue enhancers and candidate tissue-specific target genes for each. We partitioned multifactorial traits into tissue-specific contributing factors with distinct functional enrichments and disease comorbidity patterns, and revealed both single-factor monotropic and multifactor pleiotropic loci. Top-scoring loci frequently had multiple predicted driver variants, converging through multiple enhancers with a common target gene, multiple genes in common tissues, or multiple genes and multiple tissues, indicating extensive pleiotropy. Our results demonstrate the importance of dense, rich, high-resolution epigenomic annotations for the investigation of complex traits. The authors present EpiMap, a compendium that comprises 10,000 epigenomic maps across more than 800 biosamples for the annotation of genome-wide association study circuitry.

160 citations

Journal Article

Spatial expression of transcription factors in Drosophila embryonic organ development.

Ann S. Hammonds¹, Christopher A. Bristow²,³, William W. Fisher¹, Richard Weiszmann¹, Siqi Wu¹,⁴, Volker Hartenstein⁵, Manolis Kellis², Bin Yu⁴, Erwin Frise¹, Susan E. Celniker¹

Lawrence Berkeley National Laboratory¹, Massachusetts Institute of Technology², University of Texas MD Anderson Cancer Center³, University of California, Berkeley⁴, University of California, Los Angeles⁵

20 Dec 2013-Genome Biology

TL;DR: A systematic characterization of spatiotemporal gene expression patterns for all known or predicted Drosophila TFs throughout embryogenesis, the first such comprehensive study for any metazoan animal, and a reference TF dataset for the investigation of gene regulatory networks in embryogenesis is produced.

Abstract: Site-specific transcription factors (TFs) bind DNA regulatory elements to control expression of target genes, forming the core of gene regulatory networks. Despite decades of research, most studies focus on only a small number of TFs and the roles of many remain unknown. We present a systematic characterization of spatiotemporal gene expression patterns for all known or predicted Drosophila TFs throughout embryogenesis, the first such comprehensive study for any metazoan animal. We generated RNA expression patterns for all 708 TFs by in situ hybridization, annotated the patterns using an anatomical controlled vocabulary, and analyzed TF expression in the context of organ system development. Nearly all TFs are expressed during embryogenesis and more than half are specifically expressed in the central nervous system. Compared to other genes, TFs are enriched early in the development of most organ systems, and throughout the development of the nervous system. Of the 535 TFs with spatially restricted expression, 79% are dynamically expressed in multiple organ systems while 21% show single-organ specificity. Of those expressed in multiple organ systems, 77 TFs are restricted to a single organ system either early or late in development. Expression patterns for 354 TFs are characterized for the first time in this study. We produced a reference TF dataset for the investigation of gene regulatory networks in embryogenesis, and gained insight into the expression dynamics of the full complement of TFs controlling the development of each organ system.

152 citations

Journal Article

Comparative validation of the D. melanogaster modENCODE transcriptome annotation

Zhen-Xia Chen¹, David Sturgill¹, Jiaxin Qu², Huaiyang Jiang², Soo Park³, Nathan Boley⁴, Ana Maria Suzuki, Anthony R. Fletcher⁵, David C. Plachetzki⁶, Peter C. FitzGerald¹, Carlo G. Artieri¹, Joel Atallah⁶, Olga Barmina⁶, James B. Brown⁴, Kerstin P. Blankenburg², Emily Clough¹, Abhijit Dasgupta¹, Sai Gubbala², Yi Han², Joy Jayaseelan², Divya Kalra², Yoo-Ah Kim¹, Christie Kovar², Sandra L. Lee², Mingmei Li², James D. Malley⁵, John H. Malone¹, Tittu Mathew², Nicolas R. Mattiuzzo¹, Mala Munidasa², Donna M. Muzny², Fiona Ongeri², Lora Perales², Teresa M. Przytycka¹, Ling Ling Pu², Garrett Robinson⁴, Rebecca Thornton², Nehad Saada², Steven E. Scherer², Harold E. Smith¹, Charles Vinson¹, Crystal B. Warner², Kim C. Worley², Yuan Qing Wu², Xiaoyan Zou², Peter Cherbas⁷, Manolis Kellis⁸, Michael B. Eisen⁴, Fabio Piano⁹, Karin Kiontke⁹, David H. A. Fitch⁹, Paul W. Sternberg¹⁰, Asher D. Cutter¹¹, Michael O. Duff¹², Roger A. Hoskins³, Brenton R. Graveley¹², Richard A. Gibbs², Peter J. Bickel⁴, Artyom Kopp⁶, Piero Carninci, Susan E. Celniker³, Brian Oliver¹, Stephen Richards²

National Institutes of Health¹, Baylor College of Medicine², Lawrence Berkeley National Laboratory³, University of California, Berkeley⁴, Center for Information Technology⁵, University of California, Davis⁶, Indiana University⁷, Massachusetts Institute of Technology⁸, New York University⁹, California Institute of Technology¹⁰, University of Toronto¹¹, University of Connecticut Health Center¹²

01 Jul 2014-Genome Research

TL;DR: The vast majority of elements in the D. melanogaster genome annotation are evolutionarily conserved, indicating that the annotation will be an important springboard for functional genetic testing by the Drosophila community.

Abstract: Accurate gene model annotation of reference genomes is critical for making them useful. The modENCODE project has improved the D. melanogaster genome annotation by using deep and diverse high-throughput data. Since transcriptional activity that has been evolutionarily conserved is likely to have an advantageous function, we have performed large-scale interspecific comparisons to increase confidence in predicted annotations. To support comparative genomics, we filled in divergence gaps in the Drosophila phylogeny by generating draft genomes for eight new species. For comparative transcriptome analysis, we generated mRNA expression profiles on 81 samples from multiple tissues and developmental stages of 15 Drosophila species, and we performed cap analysis of gene expression in D. melanogaster and D. pseudoobscura. We also describe conservation of four distinct core promoter structures composed of combinations of elements at three positions. Overall, each type of genomic feature shows a characteristic divergence rate relative to neutral models, highlighting the value of multispecies alignment in annotating a target genome that should prove useful in the annotation of other high priority genomes, especially human and other mammalian genomes that are rich in noncoding sequences. We report that the vast majority of elements in the annotation are evolutionarily conserved, indicating that the annotation will be an important springboard for functional genetic testing by the Drosophila community.

149 citations

Cited by

Journal Article

Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles

Aravind Subramanian¹, Pablo Tamayo¹, Vamsi K. Mootha², Sayan Mukherjee³, Benjamin L. Ebert², Michael A. Gillette², Amanda G. Paulovich⁴, Scott L. Pomeroy², Todd R. Golub², Eric S. Lander¹, Jill P. Mesirov¹

Massachusetts Institute of Technology¹, Harvard University², Duke University³, Fred Hutchinson Cancer Research Center⁴

25 Oct 2005-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: The Gene Set Enrichment Analysis (GSEA) method focuses on gene sets, that is, groups of genes that share common biological function, chromosomal location, or regulation.

Abstract: Although genomewide RNA expression analysis has become a routine tool in biomedical research, extracting biological insight from such information remains a major challenge. Here, we describe a powerful analytical method called Gene Set Enrichment Analysis (GSEA) for interpreting gene expression data. The method derives its power by focusing on gene sets, that is, groups of genes that share common biological function, chromosomal location, or regulation. We demonstrate how GSEA yields insights into several cancer-related data sets, including leukemia and lung cancer. Notably, where single-gene analysis finds little similarity between two independent studies of patient survival in lung cancer, GSEA reveals many biological pathways in common. The GSEA method is embodied in a freely available software package, together with an initial database of 1,325 biologically defined gene sets.

34,830 citations

Journal Article

STAR: ultrafast universal RNA-seq aligner

Alexander Dobin¹, Carrie A. Davis¹, Felix Schlesinger¹, Jorg Drenkow¹, Chris Zaleski¹, Sonali Jha¹, Philippe Batut¹, Mark Chaisson¹, Thomas R. Gingeras¹

Cold Spring Harbor Laboratory¹

01 Jan 2013-Bioinformatics

TL;DR: The Spliced Transcripts Alignment to a Reference (STAR) software, based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays followed by a seed clustering and stitching procedure, outperforms other aligners by a factor of >50 in mapping speed.

Abstract: Motivation: Accurate alignment of high-throughput RNA-seq data is a challenging and yet unsolved problem because of the non-contiguous transcript structure, relatively short read lengths and constantly increasing throughput of the sequencing technologies. Currently available RNA-seq aligners suffer from high mapping error rates, low mapping speed, read length limitation and mapping biases. Results: To align our large (>80 billion reads) ENCODE Transcriptome RNA-seq dataset, we developed the Spliced Transcripts Alignment to a Reference (STAR) software based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays followed by seed clustering and stitching procedure. STAR outperforms other aligners by a factor of >50 in mapping speed, aligning to the human genome 550 million 2 × 76 bp paired-end reads per hour on a modest 12-core server, while at the same time improving alignment sensitivity and precision. In addition to unbiased de novo detection of canonical junctions, STAR can discover non-canonical splices and chimeric (fusion) transcripts, and is also capable of mapping full-length RNA sequences. Using Roche 454 sequencing of reverse transcription polymerase chain reaction amplicons, we experimentally validated 1960 novel intergenic splice junctions with an 80-90% success rate, corroborating the high precision of the STAR mapping strategy. Availability and implementation: STAR is implemented as a standalone C++ code. STAR is free open source software distributed under GPLv3 license and can be downloaded from http://code.google.com/p/rna-star/.

30,684 citations

Changes in the switching rates of Plasmodium var genes lead to antigenic variation [in English] / Paul H, Robert P, Christodoulou Z, et al. // Proc Natl Acad Sci U S A

宁北芳, 朱淮民

28 Jul 2005

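TL;DR: PfEMP1 interacts with one or more receptors on infected erythrocytes, dendritic cells and the placenta, and plays a key role in adhesion and immune evasion.

Abstract: Antigenic variation allows many pathogenic microorganisms to evade the host immune response. Plasmodium falciparum erythrocyte membrane protein 1 (PfEMP1), which is expressed on the surface of infected erythrocytes, interacts with one or more receptors on infected erythrocytes, endothelial cells, dendritic cells and the placenta, and plays a key role in adhesion and immune evasion. The var gene family in each haploid genome encodes about 60 members, and initiating transcription of different var gene variants provides the molecular basis for antigenic variation.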

18,940 citations

Journal Article

MicroRNAs: Target Recognition and Regulatory Functions

David P. Bartel¹

Massachusetts Institute of Technology¹

23 Jan 2009-Cell

TL;DR: The current understanding of miRNA target recognition in animals is outlined and the widespread impact of miRNAs on both the expression and evolution of protein-coding genes is discussed.

18,036 citations

Journal Article

Full-length transcriptome assembly from RNA-Seq data without a reference genome.

Manfred Grabherr¹, Brian J. Haas¹, Moran Yassour¹,², Joshua Z. Levin¹, Dawn Thompson¹, Ido Amit¹, Xian Adiconis¹, Lin Fan¹, Raktima Raychowdhury¹, Qiandong Zeng¹, Zehua Chen¹, Evan Mauceli¹, Nir Hacohen¹, Andreas Gnirke¹, Nicholas Rhind³, Federica Di Palma¹, Bruce W. Birren¹, Chad Nusbaum¹, Kerstin Lindblad-Toh¹,⁴, Nir Friedman², Aviv Regev¹

Massachusetts Institute of Technology¹, Hebrew University of Jerusalem², University of Massachusetts Medical School³, Science for Life Laboratory⁴

01 Jul 2011-Nature Biotechnology

TL;DR: The Trinity method for de novo assembly of full-length transcripts is presented and evaluated on samples from fission yeast, mouse and whitefly, whose reference genome is not yet available, providing a unified solution for transcriptome reconstruction in any sample.

Abstract: Massively parallel sequencing of cDNA has enabled deep and efficient probing of transcriptomes. Current approaches for transcript reconstruction from such data often rely on aligning reads to a reference genome, and are thus unsuitable for samples with a partial or missing reference genome. Here we present the Trinity method for de novo assembly of full-length transcripts and evaluate it on samples from fission yeast, mouse and whitefly, whose reference genome is not yet available. By efficiently constructing and analyzing sets of de Bruijn graphs, Trinity fully reconstructs a large fraction of transcripts, including alternatively spliced isoforms and transcripts from recently duplicated genes. Compared with other de novo transcriptome assemblers, Trinity recovers more full-length transcripts across a broad range of expression levels, with a sensitivity similar to methods that rely on genome alignments. Our approach provides a unified solution for transcriptome reconstruction in any sample, especially in the absence of a reference genome.

15,665 citations
