Manolis Kellis

Other affiliations: Broad Institute, Epigenomics AG, Harvard University, and others

Bio: Manolis Kellis is an academic researcher at the Massachusetts Institute of Technology. The author's research centers on topics including the genome and genes. The author has an h-index of 128 and has co-authored 405 publications that have received 112,181 citations. Previous affiliations of Manolis Kellis include the Broad Institute and Epigenomics AG.

Topics: Genome, Gene, Chromatin, Genomics, Genome-wide association study, and others

Papers published on a yearly basis (chart; years listed: 2003-2005 and 2007-2023)

Papers

Posted Content

Phylogenetic Identification and Functional Characterization of Orthologs and Paralogs across Human, Mouse, Fly, and Worm

Yi-Chieh Wu¹, Mukul S. Bansal², Rasmussen³, Javier Herrero⁴, Manolis Kellis¹

Massachusetts Institute of Technology¹, University of Connecticut², Cornell University³, University College London⁴

31 May 2014-bioRxiv

TL;DR: This work presents a phylogenomics-based approach for the identification of orthologous and paralogous genes in human, mouse, fly, and worm, which forms the foundation of the comparative analyses of the modENCODE and mouse ENCODE projects.

Abstract: Model organisms can serve the biological and medical community by enabling the study of conserved gene families and pathways in experimentally-tractable systems. Their use, however, hinges on the ability to reliably identify evolutionary orthologs and paralogs with high accuracy, which can be a great challenge at both small and large evolutionary distances. Here, we present a phylogenomics-based approach for the identification of orthologous and paralogous genes in human, mouse, fly, and worm, which forms the foundation of the comparative analyses of the modENCODE and mouse ENCODE projects. We study a median of 16,101 genes across 2 mammalian genomes (human, mouse), 12 Drosophila genomes, 5 Caenorhabditis genomes, and an outgroup yeast genome, and demonstrate that accurate inference of evolutionary relationships and events across these species must account for frequent gene-tree topology errors due to both incomplete lineage sorting and insufficient phylogenetic signal. Furthermore, we show that integration of two separate phylogenomic pipelines yields increased accuracy, suggesting that their sources of error are independent, and finally, we leverage the resulting annotation of homologous genes to study the functional impact of gene duplication and loss in the context of rich gene expression and functional genomic datasets of the modENCODE, mouse ENCODE, and human ENCODE projects.

17 citations

Journal Article

Response to Comment on “Evidence of Abundant Purifying Selection in Humans for Recently Acquired Regulatory Functions”

Lucas D. Ward¹, Manolis Kellis¹,²

Massachusetts Institute of Technology¹, Broad Institute²

10 May 2013-Science

TL;DR: Improved methodology supports the initial conclusion of extensive lineage-specific constraint concentrated in ENCODE elements and clarifies that the estimate is dependent on the constrained and neutral references used, which can further increase the number of nucleotides involved.

Abstract: Green and Ewing propose corrections to our methodology, which we incorporate and extend here. The improved methodology supports our initial conclusion of extensive lineage-specific constraint concentrated in ENCODE elements. We clarify that our estimate is dependent on the constrained and neutral references used, which can further increase the number of nucleotides involved, because a particularly stringent definition was initially used.

17 citations

Book Chapter

Reconciliation revisited: handling multiple optima when reconciling with duplication, transfer, and loss

Mukul S. Bansal¹, Eric J. Alm¹, Manolis Kellis¹

Massachusetts Institute of Technology¹

07 Apr 2013

TL;DR: This work presents an algorithm to efficiently sample the space of optimal reconciliations uniformly at random in O(mn²) time, where m and n denote the number of genes and species, respectively.

Abstract: Phylogenetic tree reconciliation is a powerful approach for inferring evolutionary events like gene duplication, horizontal gene transfer, and gene loss, which are fundamental to our understanding of molecular evolution. While Duplication-Loss (DL) reconciliation leads to a unique maximum-parsimony solution, Duplication-Transfer-Loss (DTL) reconciliation yields a multitude of optimal solutions, making it difficult to infer the true evolutionary history of the gene family. Here, we present an effective, efficient, and scalable method for dealing with this fundamental problem in DTL reconciliation. Our approach works by sampling the space of optimal reconciliations uniformly at random and aggregating the results. We present an algorithm to efficiently sample the space of optimal reconciliations uniformly at random in O(mn²) time, where m and n denote the number of genes and species, respectively. We use these samples to understand how different optimal reconciliations vary in their node mapping and event assignments, and to investigate the impact of varying event costs.

17 citations

Journal Article

GENCODE: reference annotation for the human and mouse genomes in 2023

Adam Frankish, S. Carbonell-Sala, Mark Diekhans, Irwin Jungreis, Jane E. Loveland, Jonathan M. Mudge, Cristina Sisu, James C. Wright, Carme Arnan, If H. A. Barnes, Abhimanyu Banerjee, Ruth Bennett, Andrew Berry, Alexandra Bignell, Carles Boix, Ferriol Calvet, Daniel Cerdán-Vélez, Fiona Cunningham, R. Davidson, Sarah Donaldson, Cagatay Dursun, Reham Fatima, Stefano Giorgetti, Carlos García Girón, José M. González, Matthew P. Hardy, Peter W. Harrison, Thibaut Hourlier, Zoe Hollis, Toby Hunt, Benjamin James, Yunzhe Jiang, Rory Johnson, M. Kay, Julien Lagarde, Fergal J. Martin, Laura Martínez Gómez, Surag Nair, Pengyu Ni, Fernando Pozo, Vivekanandan Ramalingam, Magali Ruffier, Bianca M. Schmitt, Jacob M. Schreiber, Emily Steed, Marie-Marthe Suner, Dulika S. Sumathipala, Irina Sergeyevna Sycheva, Barbara Uszczynska-Ratajczak, Elizabeth Wass, Yucheng T. Yang, Andrew D. Yates, Zahoor Zafrulla, Jyoti S. Choudhary, Mark Gerstein, Roderic Guigó, Tim Hubbard, Manolis Kellis, Anshul Kundaje, Benedict Paten, Michael L. Tress, Paul Flicek

24 Nov 2022-Nucleic Acids Research

TL;DR: The GENCODE consortium produces high-quality gene and transcript annotation for the human and mouse genomes; the annotation is supported by experimental data and serves as a reference for genome biology and clinical genomics.

Abstract: GENCODE produces high-quality gene and transcript annotation for the human and mouse genomes. All GENCODE annotation is supported by experimental data and serves as a reference for genome biology and clinical genomics. The GENCODE consortium generates targeted experimental data, develops bioinformatic tools and carries out analyses that, along with externally produced data and methods, support the identification and annotation of transcript structures and the determination of their function. Here, we present an update on the annotation of human and mouse genes, including developments in the tools, data, analyses and major collaborations which underpin this progress. For example, we report the creation of a set of non-canonical ORFs identified in GENCODE transcripts, the LRGASP collaboration to assess the use of long transcriptomic data to build transcript models, the progress in collaborations with RefSeq and UniProt to increase convergence in the annotation of human and mouse protein-coding genes, the propagation of GENCODE across the human pan-genome and the development of new tools to support annotation of regulatory features by GENCODE. Our annotation is accessible via Ensembl, the UCSC Genome Browser and https://www.gencodegenes.org.

17 citations

Posted Content

Single-cell profiling of the human primary motor cortex in ALS and FTLD

Sergio Sebastian Pineda, Hyeseung Lee¹,², Brent Eugene Fitzwalter¹,², Shahin Mohammadi¹,³, Luc Pregent⁴, Mahammad E Gardashli⁴, Julio Mantero³, Erica Engelberg-Cook⁴, Mariely DeJesus-Hernandez⁴, Marka van Blitterswijk⁴, Cyril Pottier⁵, Rosa Rademakers⁴,⁵, Bjorn Oskarsson⁴, Jaimin S. Shah⁴, Ronald C. Petersen⁴, Neill R. Graff-Radford⁴, Bradley F. Boeve⁴, David S. Knopman⁴, Keith A. Josephs⁴, Michael DeTure⁴, Melissa E. Murray⁴, Dennis W. Dickson⁴, Myriam Heiman¹,²,³, Veronique V. Belzil⁴, Manolis Kellis¹,³

Broad Institute¹, Picower Institute for Learning and Memory², Massachusetts Institute of Technology³, Mayo Clinic⁴, University of Antwerp⁵

07 Jul 2021-bioRxiv

TL;DR: In this paper, the authors report a single-cell atlas of the human primary motor cortex (MCX) and its transcriptional alterations in ALS and FTLD across ~380,000 nuclei from 64 individuals, including 17 control samples and 47 sporadic and C9orf72-associated ALS/FTLD patient samples.

Abstract: Amyotrophic lateral sclerosis (ALS) and frontotemporal lobar degeneration (FTLD) are two devastating and fatal neurodegenerative conditions. While distinct, they share many clinical, genetic, and pathological characteristics [1], and both show selective vulnerability of layer 5b extratelencephalic-projecting cortical populations, including Betz cells in ALS [2,3] and von Economo neurons (VENs) in FTLD [4,5]. Here, we report the first high resolution single-cell atlas of the human primary motor cortex (MCX) and its transcriptional alterations in ALS and FTLD across ~380,000 nuclei from 64 individuals, including 17 control samples and 47 sporadic and C9orf72-associated ALS and FTLD patient samples. We identify 46 transcriptionally distinct cellular subtypes including two Betz-cell subtypes, and we observe a previously unappreciated molecular similarity between Betz cells and VENs of the prefrontal cortex (PFC) and frontal insula. Many of the dysregulated genes and pathways are shared across excitatory neurons, including stress response, ribosome function, oxidative phosphorylation, synaptic vesicle cycle, endoplasmic reticulum protein processing, and autophagy. Betz cells and SCN4B+ long-range projecting L3/L5 cells are the most transcriptionally affected in both ALS and FTLD. Lastly, we find that the VEN/Betz cell-enriched transcription factor, POU3F1, has altered subcellular localization, co-localizes with TDP-43 aggregates, and may represent a cell type-specific vulnerability factor in the Betz cells of ALS and FTLD patient tissues.

17 citations

Cited by

Journal Article

Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles

Aravind Subramanian¹, Pablo Tamayo¹, Vamsi K. Mootha², Sayan Mukherjee³, Benjamin L. Ebert², Michael A. Gillette², Amanda G. Paulovich⁴, Scott L. Pomeroy², Todd R. Golub², Eric S. Lander¹, Jill P. Mesirov¹

Massachusetts Institute of Technology¹, Harvard University², Duke University³, Fred Hutchinson Cancer Research Center⁴

25 Oct 2005-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: The Gene Set Enrichment Analysis (GSEA) method interprets gene expression data by focusing on gene sets, that is, groups of genes that share common biological function, chromosomal location, or regulation.

Abstract: Although genomewide RNA expression analysis has become a routine tool in biomedical research, extracting biological insight from such information remains a major challenge. Here, we describe a powerful analytical method called Gene Set Enrichment Analysis (GSEA) for interpreting gene expression data. The method derives its power by focusing on gene sets, that is, groups of genes that share common biological function, chromosomal location, or regulation. We demonstrate how GSEA yields insights into several cancer-related data sets, including leukemia and lung cancer. Notably, where single-gene analysis finds little similarity between two independent studies of patient survival in lung cancer, GSEA reveals many biological pathways in common. The GSEA method is embodied in a freely available software package, together with an initial database of 1,325 biologically defined gene sets.

34,830 citations

Journal Article

STAR: ultrafast universal RNA-seq aligner

Alexander Dobin¹, Carrie A. Davis¹, Felix Schlesinger¹, Jorg Drenkow¹, Chris Zaleski¹, Sonali Jha¹, Philippe Batut¹, Mark Chaisson¹, Thomas R. Gingeras¹

Cold Spring Harbor Laboratory¹

01 Jan 2013-Bioinformatics

TL;DR: The Spliced Transcripts Alignment to a Reference (STAR) software, based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays followed by a seed clustering and stitching procedure, outperforms other aligners by a factor of >50 in mapping speed.

Abstract: Motivation: Accurate alignment of high-throughput RNA-seq data is a challenging and as yet unsolved problem because of the non-contiguous transcript structure, relatively short read lengths and constantly increasing throughput of the sequencing technologies. Currently available RNA-seq aligners suffer from high mapping error rates, low mapping speed, read length limitation and mapping biases. Results: To align our large (>80 billion reads) ENCODE Transcriptome RNA-seq dataset, we developed the Spliced Transcripts Alignment to a Reference (STAR) software based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays followed by a seed clustering and stitching procedure. STAR outperforms other aligners by a factor of >50 in mapping speed, aligning to the human genome 550 million 2 × 76 bp paired-end reads per hour on a modest 12-core server, while at the same time improving alignment sensitivity and precision. In addition to unbiased de novo detection of canonical junctions, STAR can discover non-canonical splices and chimeric (fusion) transcripts, and is also capable of mapping full-length RNA sequences. Using Roche 454 sequencing of reverse transcription polymerase chain reaction amplicons, we experimentally validated 1960 novel intergenic splice junctions with an 80-90% success rate, corroborating the high precision of the STAR mapping strategy. Availability and implementation: STAR is implemented as standalone C++ code. STAR is free open source software distributed under the GPLv3 license and can be downloaded from http://code.google.com/p/rna-star/.

30,684 citations

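Changes in the switching rate of Plasmodium var genes lead to antigenic variation [English] / Paul H, Robert P, Christodoulou Z, et al. // Proc Natl Acad Sci U S A

宁北芳, 朱淮民

28 Jul 2005

TL;DR: PfEMP1 interacts with one or more receptors on infected erythrocytes, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion.

Abstract: Antigenic variation allows many pathogenic microorganisms to evade the host immune response. Plasmodium falciparum erythrocyte membrane protein 1 (PfEMP1), expressed on the surface of infected erythrocytes, interacts with one or more receptors on infected erythrocytes, endothelial cells, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion. The var gene family encodes about 60 members per haploid genome, and initiating transcription of different var gene variants provides the molecular basis for antigenic variation.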

18,940 citations

Journal Article

MicroRNAs: Target Recognition and Regulatory Functions

David P. Bartel¹

Massachusetts Institute of Technology¹

23 Jan 2009-Cell

TL;DR: The current understanding of miRNA target recognition in animals is outlined and the widespread impact of miRNAs on both the expression and evolution of protein-coding genes is discussed.

18,036 citations

Journal Article

Full-length transcriptome assembly from RNA-Seq data without a reference genome

Manfred Grabherr¹, Brian J. Haas¹, Moran Yassour¹,², Joshua Z. Levin¹, Dawn Thompson¹, Ido Amit¹, Xian Adiconis¹, Lin Fan¹, Raktima Raychowdhury¹, Qiandong Zeng¹, Zehua Chen¹, Evan Mauceli¹, Nir Hacohen¹, Andreas Gnirke¹, Nicholas Rhind³, Federica Di Palma¹, Bruce W. Birren¹, Chad Nusbaum¹, Kerstin Lindblad-Toh¹,⁴, Nir Friedman², Aviv Regev¹

Massachusetts Institute of Technology¹, Hebrew University of Jerusalem², University of Massachusetts Medical School³, Science for Life Laboratory⁴

01 Jul 2011-Nature Biotechnology

TL;DR: This work presents the Trinity method for de novo assembly of full-length transcripts and evaluates it on samples from fission yeast, mouse and whitefly (whose reference genome is not yet available), providing a unified solution for transcriptome reconstruction in any sample.

Abstract: Massively parallel sequencing of cDNA has enabled deep and efficient probing of transcriptomes. Current approaches for transcript reconstruction from such data often rely on aligning reads to a reference genome, and are thus unsuitable for samples with a partial or missing reference genome. Here we present the Trinity method for de novo assembly of full-length transcripts and evaluate it on samples from fission yeast, mouse and whitefly, whose reference genome is not yet available. By efficiently constructing and analyzing sets of de Bruijn graphs, Trinity fully reconstructs a large fraction of transcripts, including alternatively spliced isoforms and transcripts from recently duplicated genes. Compared with other de novo transcriptome assemblers, Trinity recovers more full-length transcripts across a broad range of expression levels, with a sensitivity similar to methods that rely on genome alignments. Our approach provides a unified solution for transcriptome reconstruction in any sample, especially in the absence of a reference genome.

15,665 citations
