Author

Manolis Kellis

Other affiliations: Broad Institute, Epigenomics AG, Harvard University
Bio: Manolis Kellis is an academic researcher from Massachusetts Institute of Technology. The author has contributed to research in topics: Genome & Gene. The author has an h-index of 128 and has co-authored 405 publications receiving 112,181 citations. Previous affiliations of Manolis Kellis include Broad Institute & Epigenomics AG.


Papers
Journal Article
12 Dec 2014-Science
TL;DR: A dual mechanism for the diversity of behaviorally regulated genes across different brain regions in vivo is proposed, finding that singing was associated with differential regulation of about 10% of all genes in the avian genome that came in several waves across time.
Abstract: INTRODUCTION Brain activity drives both behavior and regulated gene expression in neurons. Although past studies have identified activity-induced signaling and gene regulation cascades in cultured neurons, much less is known about how activity-dependent transcriptional networks are affected by the variations in cell-type composition, network interconnections, and firing patterns that comprise behaviorally active brain circuits in vivo. [Figure caption: Dual mechanism model for behaviorally regulated gene expression diversity. (Left) Song brain circuit and zebra finch song motif (transcribed using https://my.scorecloud.com). (Middle) Song nucleus–specific (RA, red; Area X, blue) singing-regulated genes (gene A and gene B) in response to neural firing (yellow). (Right) Region-general EATF and region-specific TF only bind to genomic DNA (lines) with region-specific acetylated histone 3 (H3K27ac peaks) and then transcribe their mRNAs (green arrow).] RATIONALE We tested the hypothesis that behaviorally regulated gene expression is anatomically and temporally diverse and that the key determinants of this diversity are networks of transcription factors, their genomic binding sites, and epigenetic chromatin states. We analyzed genome-wide, singing-regulated gene expression across time in the four major forebrain regions of the song control system in songbirds, a model of speech production in humans. We then performed a transcription factor motif analysis to identify gene regulatory networks enriched in each song nucleus and measured acetylation of histone 3 at lysine 27 (H3K27ac) to identify chromatin regions that were transcriptionally active in the genomes of song nuclei before and after singing. RESULTS We found that singing was associated with differential regulation of about 10% of all genes in the avian genome that came in several waves across time.
Less than 1% of these genes were comparably regulated in all song nuclei tested, and these comprised a core set dominated by immediate-early gene (IEG) transcription factors. By contrast, the vast majority of singing-regulated genes were regulated in only one or a subset of song nuclei, such that each song nucleus had its own dominant subset of genes regulated with defined temporal profiles, controlling a variety of functions. The promoters of many of the singing-regulated genes contained binding motifs for known early-activated transcription factors (EATFs) that become active in response to neural firing, some of which were expressed differentially between song nuclei at baseline. One EATF, calcium-response factor (CaRF), was tested with RNA interference knockdown in cultured neurons and found to regulate the predicted genes in response to neural activity, but was also found to modulate their expression even at baseline. More strikingly, we found with H3K27ac analysis that many song nucleus–specific singing-regulated genes did not show increased chromatin regulatory element activity after singing but rather already had primed region-specific regulatory activity before singing began. CONCLUSIONS We propose a dual mechanism for the diversity of behaviorally regulated genes across different brain regions in vivo (see the figure). First, the neural activity associated with singing activates EATFs, and some TFs differentially expressed in brain regions at baseline, to trigger region-specific expression of their target genes. Second, brain region–specific enhancers near activity-regulated genes are waiting in an epigenetically primed state, ready to modulate transcription of general and song nucleus–specific genes at a moment’s notice when the neurons fire. The combination of these two mechanisms underlies a great diversity of behaviorally regulated gene expression, permitting each nucleus to perform its particular function in this complex behavior.

108 citations

01 Mar 2015
TL;DR: A comparative analysis with 15 immune diseases showed that T1D is more similar genetically to other autoantibody-positive diseases, significantly most similar to juvenile idiopathic arthritis and significantly least similar to ulcerative colitis, and provided support for three additional new T1D risk loci.
Abstract: Genetic studies of type 1 diabetes (T1D) have identified 50 susceptibility regions, finding major pathways contributing to risk, with some loci shared across immune disorders. To make genetic comparisons across autoimmune disorders as informative as possible, a dense genotyping array, the Immunochip, was developed, from which we identified four new T1D-associated regions (P < 5 × 10⁻⁸). A comparative analysis with 15 immune diseases showed that T1D is more similar genetically to other autoantibody-positive diseases, significantly most similar to juvenile idiopathic arthritis and significantly least similar to ulcerative colitis, and provided support for three additional new T1D risk loci. Using a Bayesian approach, we defined credible sets for the T1D-associated SNPs. The associated SNPs localized to enhancer sequences active in thymus, T and B cells, and CD34+ stem cells. Enhancer-promoter interactions can now be analyzed in these cell types to identify which particular genes and regulatory sequences are causal.

107 citations
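
The T1D abstract above mentions defining credible sets for associated SNPs with a Bayesian approach but gives no details. As a rough illustration only, a credible set is commonly taken to be the smallest group of SNPs whose posterior probabilities add up to a chosen coverage. The sketch below assumes per-SNP posterior probabilities are already available; the function name, the SNP labels, and the 0.99 default coverage are hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def credible_set(posteriors: Dict[str, float], coverage: float = 0.99) -> List[str]:
    """Return the smallest set of SNPs whose normalized posteriors reach `coverage`.

    `posteriors` maps a SNP identifier to its (possibly unnormalized) posterior
    probability of being causal within one associated region. Both the input
    format and the default coverage are assumptions made for illustration.
    """
    if not 0.0 < coverage <= 1.0:
        raise ValueError("coverage must be in (0, 1]")
    total = sum(posteriors.values())
    if total <= 0.0:
        raise ValueError("posteriors must sum to a positive value")

    # Rank SNPs from most to least probable, then accumulate until covered.
    ranked = sorted(posteriors.items(), key=lambda item: item[1], reverse=True)
    chosen: List[str] = []
    cumulative = 0.0
    for snp, prob in ranked:
        chosen.append(snp)
        cumulative += prob / total
        if cumulative >= coverage:
            break
    return chosen


if __name__ == "__main__":
    # Hypothetical posteriors for four SNPs in one region.
    example = {"snp_a": 0.60, "snp_b": 0.30, "snp_c": 0.08, "snp_d": 0.02}
    print(credible_set(example, coverage=0.95))
```

A smaller credible set means the association signal is concentrated on fewer candidate variants, which is what makes the subsequent enhancer overlap analysis tractable.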

Journal Article
TL;DR: HiDRA quantifies enhancer function and dissects millions of accessible DNA fragments at high resolution, revealing driver nucleotides and helping interpret non-coding disease variants.
Abstract: Genome-wide epigenomic maps have revealed millions of putative enhancers and promoters, but experimental validation of their function and high-resolution dissection of their driver nucleotides remain limited. Here, we present HiDRA (High-resolution Dissection of Regulatory Activity), a combined experimental and computational method for high-resolution genome-wide testing and dissection of putative regulatory regions. We test ~7 million accessible DNA fragments in a single experiment, by coupling accessible chromatin extraction with self-transcribing episomal reporters (ATAC-STARR-seq). By design, fragments are highly overlapping in densely-sampled accessible regions, enabling us to pinpoint driver regulatory nucleotides by exploiting differences in activity between partially-overlapping fragments using a machine learning model (SHARPR-RE). In GM12878 lymphoblastoid cells, we find ~65,000 regions showing enhancer function, and pinpoint ~13,000 high-resolution driver elements. These are enriched for regulatory motifs, evolutionarily-conserved nucleotides, and disease-associated genetic variants from genome-wide association studies. Overall, HiDRA provides a high-throughput, high-resolution approach for dissecting regulatory regions and driver nucleotides.

105 citations

Journal Article
09 Sep 2020-Neuron
TL;DR: Activation of innate immune signaling in the most vulnerable HD neurons provides a novel framework for understanding the basis of mHTT toxicity and raises new therapeutic opportunities.

105 citations

Journal Article
TL;DR: A metric of TFBS variability is introduced that takes into account changes in motif match associated with mutation and makes it possible to investigate TFBS functional constraints instance-by-instance as well as in sets that share common biological properties.
Abstract: Background Advances in sequencing technology have boosted population genomics and made it possible to map the positions of transcription factor binding sites (TFBSs) with high precision. Here we investigate TFBS variability by combining transcription factor binding maps generated by ENCODE, modENCODE, our previously published data and other sources with genomic variation data for human individuals and Drosophila isogenic lines.

105 citations


Cited by
Journal Article
TL;DR: The Gene Set Enrichment Analysis (GSEA) method focuses on gene sets, that is, groups of genes that share common biological function, chromosomal location, or regulation.
Abstract: Although genomewide RNA expression analysis has become a routine tool in biomedical research, extracting biological insight from such information remains a major challenge. Here, we describe a powerful analytical method called Gene Set Enrichment Analysis (GSEA) for interpreting gene expression data. The method derives its power by focusing on gene sets, that is, groups of genes that share common biological function, chromosomal location, or regulation. We demonstrate how GSEA yields insights into several cancer-related data sets, including leukemia and lung cancer. Notably, where single-gene analysis finds little similarity between two independent studies of patient survival in lung cancer, GSEA reveals many biological pathways in common. The GSEA method is embodied in a freely available software package, together with an initial database of 1,325 biologically defined gene sets.

34,830 citations

Journal Article
TL;DR: The Spliced Transcripts Alignment to a Reference (STAR) software based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays followed by seed clustering and stitching procedure outperforms other aligners by a factor of >50 in mapping speed.
Abstract: Motivation Accurate alignment of high-throughput RNA-seq data is a challenging and yet unsolved problem because of the non-contiguous transcript structure, relatively short read lengths and constantly increasing throughput of the sequencing technologies. Currently available RNA-seq aligners suffer from high mapping error rates, low mapping speed, read length limitation and mapping biases. Results To align our large (>80 billion reads) ENCODE Transcriptome RNA-seq dataset, we developed the Spliced Transcripts Alignment to a Reference (STAR) software based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays followed by seed clustering and stitching procedure. STAR outperforms other aligners by a factor of >50 in mapping speed, aligning to the human genome 550 million 2 × 76 bp paired-end reads per hour on a modest 12-core server, while at the same time improving alignment sensitivity and precision. In addition to unbiased de novo detection of canonical junctions, STAR can discover non-canonical splices and chimeric (fusion) transcripts, and is also capable of mapping full-length RNA sequences. Using Roche 454 sequencing of reverse transcription polymerase chain reaction amplicons, we experimentally validated 1960 novel intergenic splice junctions with an 80-90% success rate, corroborating the high precision of the STAR mapping strategy. Availability and implementation STAR is implemented as a standalone C++ code. STAR is free open source software distributed under GPLv3 license and can be downloaded from http://code.google.com/p/rna-star/.

30,684 citations

28 Jul 2005
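TL;DR: PfEMP1 interacts with one or more receptors on infected erythrocytes, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion.
Abstract: Antigenic variation allows many pathogenic microorganisms to evade the host immune response. Plasmodium falciparum erythrocyte membrane protein 1 (PfEMP1), expressed on the surface of infected erythrocytes, interacts with one or more receptors on infected erythrocytes, endothelial cells, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion. The var gene family in each haploid genome encodes about 60 members, and initiating transcription of different var gene variants provides the molecular basis for antigenic variation.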

18,940 citations

Journal Article
23 Jan 2009-Cell
TL;DR: The current understanding of miRNA target recognition in animals is outlined and the widespread impact of miRNAs on both the expression and evolution of protein-coding genes is discussed.

18,036 citations

Journal Article
TL;DR: The Trinity method for de novo assembly of full-length transcripts is presented and evaluated on samples from fission yeast, mouse and whitefly, whose reference genome is not yet available, providing a unified solution for transcriptome reconstruction in any sample.
Abstract: Massively parallel sequencing of cDNA has enabled deep and efficient probing of transcriptomes. Current approaches for transcript reconstruction from such data often rely on aligning reads to a reference genome, and are thus unsuitable for samples with a partial or missing reference genome. Here we present the Trinity method for de novo assembly of full-length transcripts and evaluate it on samples from fission yeast, mouse and whitefly, whose reference genome is not yet available. By efficiently constructing and analyzing sets of de Bruijn graphs, Trinity fully reconstructs a large fraction of transcripts, including alternatively spliced isoforms and transcripts from recently duplicated genes. Compared with other de novo transcriptome assemblers, Trinity recovers more full-length transcripts across a broad range of expression levels, with a sensitivity similar to methods that rely on genome alignments. Our approach provides a unified solution for transcriptome reconstruction in any sample, especially in the absence of a reference genome.

15,665 citations
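
The Trinity abstract above refers to constructing and analyzing sets of de Bruijn graphs. For readers unfamiliar with the data structure, the following is a minimal sketch of how a de Bruijn graph can be built from short reads. It is not Trinity's implementation; the function name, the toy reads, and the choice of k are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def build_de_bruijn(reads: Iterable[str], k: int) -> Dict[str, List[str]]:
    """Build a de Bruijn graph whose nodes are (k-1)-mers.

    Every k-mer found in a read contributes one edge from its (k-1)-length
    prefix to its (k-1)-length suffix. Duplicate edges are collapsed, so
    k-mer abundance (which real assemblers rely on) is ignored here.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    edges = defaultdict(set)
    for read in reads:
        # Slide a window of width k across the read.
        for start in range(len(read) - k + 1):
            kmer = read[start:start + k]
            edges[kmer[:-1]].add(kmer[1:])
    # Sort successors so the result is deterministic.
    return {node: sorted(successors) for node, successors in edges.items()}


if __name__ == "__main__":
    # Two overlapping toy reads; walking the edges spells out "ATGGCA".
    graph = build_de_bruijn(["ATGGC", "TGGCA"], k=3)
    for node, successors in graph.items():
        print(node, "->", successors)
```

In this toy case each node has a single successor, so the graph is one unbranched path. Branches in such a graph are where alternatively spliced isoforms or recently duplicated genes would show up as competing paths.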