
Showing papers on "Gene" published in 2015


Journal ArticleDOI
29 Jan 2015-Nature
TL;DR: Structure-guided engineering of a CRISPR-Cas9 complex to mediate efficient transcriptional activation at endogenous genomic loci is described and the potential of Cas9-based activators as a powerful genetic perturbation technology is demonstrated.
Abstract: Systematic interrogation of gene function requires the ability to perturb gene expression in a robust and generalizable manner. Here we describe structure-guided engineering of a CRISPR-Cas9 complex to mediate efficient transcriptional activation at endogenous genomic loci. We used these engineered Cas9 activation complexes to investigate single-guide RNA (sgRNA) targeting rules for effective transcriptional activation, to demonstrate multiplexed activation of ten genes simultaneously, and to upregulate long intergenic non-coding RNA (lincRNA) transcripts. We also synthesized a library consisting of 70,290 guides targeting all human RefSeq coding isoforms to screen for genes that, upon activation, confer resistance to a BRAF inhibitor. The top hits included genes previously shown to be able to confer resistance, and novel candidates were validated using individual sgRNA and complementary DNA overexpression. A gene expression signature based on the top screening hits correlated with markers of BRAF inhibitor resistance in cell lines and patient-derived samples. These results collectively demonstrate the potential of Cas9-based activators as a powerful genetic perturbation technology.

2,186 citations


Journal ArticleDOI
TL;DR: A new role for circRNAs in regulating gene expression in the nucleus is revealed, in which EIciRNAs enhance the expression of their parental genes in cis, and a regulatory strategy for transcriptional control via specific RNA-RNA interaction between U1 snRNA and EIciRNAs is highlighted.
Abstract: Noncoding RNAs (ncRNAs) have numerous roles in development and disease, and one of the prominent roles is to regulate gene expression. A vast number of circular RNAs (circRNAs) have been identified, and some have been shown to function as microRNA sponges in animal cells. Here, we report a class of circRNAs associated with RNA polymerase II in human cells. In these circRNAs, exons are circularized with introns 'retained' between exons; we term them exon-intron circRNAs or EIciRNAs. EIciRNAs predominantly localize in the nucleus, interact with U1 snRNP and promote transcription of their parental genes. Our findings reveal a new role for circRNAs in regulating gene expression in the nucleus, in which EIciRNAs enhance the expression of their parental genes in cis, and highlight a regulatory strategy for transcriptional control via specific RNA-RNA interaction between U1 snRNA and EIciRNAs.

2,077 citations



Journal ArticleDOI
19 Feb 2015-Nature
TL;DR: A fine-mapping algorithm is developed to identify candidate causal variants for 21 autoimmune diseases from genotyping data, and it is found that most non-coding risk variants, including those that alter gene expression, affect non-canonical sequence determinants not well-explained by current gene regulatory models.
Abstract: Genome-wide association studies have identified loci underlying human diseases, but the causal nucleotide changes and mechanisms remain largely unknown. Here we developed a fine-mapping algorithm to identify candidate causal variants for 21 autoimmune diseases from genotyping data. We integrated these predictions with transcription and cis-regulatory element annotations, derived by mapping RNA and chromatin in primary immune cells, including resting and stimulated CD4(+) T-cell subsets, regulatory T cells, CD8(+) T cells, B cells, and monocytes. We find that ∼90% of causal variants are non-coding, with ∼60% mapping to immune-cell enhancers, many of which gain histone acetylation and transcribe enhancer-associated RNA upon immune stimulation. Causal variants tend to occur near binding sites for master regulators of immune differentiation and stimulus-dependent gene activation, but only 10-20% directly alter recognizable transcription factor binding motifs. Rather, most non-coding risk variants, including those that alter gene expression, affect non-canonical sequence determinants not well-explained by current gene regulatory models.

1,622 citations


Journal ArticleDOI
15 Jan 2015-Nature
TL;DR: These observations indicate that the underlying DNA sequence largely accounts for local patterns of methylation, which is highly informative when studying gene regulation in normal and diseased cells, and it can potentially function as a biomarker.
Abstract: Cytosine methylation is a DNA modification generally associated with transcriptional silencing. Factors that regulate methylation have been linked to human disease, yet how they contribute to malignancies remains largely unknown. Genomic maps of DNA methylation have revealed unexpected dynamics at gene regulatory regions, including active demethylation by TET proteins at binding sites for transcription factors. These observations indicate that the underlying DNA sequence largely accounts for local patterns of methylation. As a result, this mark is highly informative when studying gene regulation in normal and diseased cells, and it can potentially function as a biomarker. Although these findings challenge the view that methylation is generally instructive for gene silencing, several open questions remain, including how methylation is targeted and recognized and in what context it affects genome readout.

1,564 citations


Journal ArticleDOI
TL;DR: A robust CRISPR/Cas9 vector system, utilizing a plant codon-optimized Cas9 gene, is presented for convenient and high-efficiency multiplex genome editing in monocot and dicot plants, with examples of loss-of-function gene mutations in T0 rice and Arabidopsis plants.

1,451 citations


Journal ArticleDOI
TL;DR: The results demonstrate that PrediXcan can detect known and new genes associated with disease traits and provide insights into the mechanism of these associations.
Abstract: Genome-wide association studies (GWAS) have identified thousands of variants robustly associated with complex traits. However, the biological mechanisms underlying these associations are, in general, not well understood. We propose a gene-based association method called PrediXcan that directly tests the molecular mechanisms through which genetic variation affects phenotype. The approach estimates the component of gene expression determined by an individual's genetic profile and correlates 'imputed' gene expression with the phenotype under investigation to identify genes involved in the etiology of the phenotype. Genetically regulated gene expression is estimated using whole-genome tissue-dependent prediction models trained with reference transcriptome data sets. PrediXcan enjoys the benefits of gene-based approaches such as reduced multiple-testing burden and a principled approach to the design of follow-up experiments. Our results demonstrate that PrediXcan can detect known and new genes associated with disease traits and provide insights into the mechanism of these associations.
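The two steps described in this abstract (estimating the genetically determined component of expression, then correlating the "imputed" expression with the phenotype) can be illustrated with a minimal sketch. This is not the PrediXcan implementation: the SNP weights, genotype dosages, and phenotype values below are invented toy numbers, the helper names `impute_expression` and `pearson` are hypothetical, and a plain Pearson correlation stands in for whatever association test the authors use. In practice the weights would come from prediction models trained on reference transcriptome data sets, as the abstract states.

```python
from statistics import mean


def impute_expression(genotypes, weights):
    """Predict expression of one gene per individual as a weighted sum
    of allele dosages (0, 1 or 2 copies at each SNP)."""
    return [sum(w * g for w, g in zip(weights, person)) for person in genotypes]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical per-SNP weights for a single gene (assumption: in a real
# analysis these come from a trained, tissue-dependent prediction model).
weights = [0.5, -0.2, 0.1]

# Toy genotype dosages: one row per individual, one column per SNP.
genotypes = [
    [0, 1, 2],
    [1, 0, 0],
    [2, 2, 1],
    [0, 0, 1],
    [1, 1, 1],
]

# Toy quantitative phenotype, one value per individual.
phenotype = [1.0, 2.1, 2.9, 1.2, 2.0]

# Step 1: impute genetically regulated expression from genotype alone.
imputed = impute_expression(genotypes, weights)

# Step 2: test the gene by correlating imputed expression with phenotype.
r = pearson(imputed, phenotype)
print(f"imputed expression: {imputed}")
print(f"correlation with phenotype: {r:.3f}")
```

Because the test is run once per gene rather than once per variant, the number of tests scales with the number of genes, which is the reduced multiple-testing burden the abstract mentions.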

1,372 citations


Journal ArticleDOI
27 Nov 2015-Science
TL;DR: Using the bacterial clustered regularly interspaced short palindromic repeats (CRISPR) system, this article constructed a genome-wide single-guide RNA library to screen for genes required for proliferation and survival in a human cancer cell line.
Abstract: Large-scale genetic analysis of lethal phenotypes has elucidated the molecular underpinnings of many biological processes. Using the bacterial clustered regularly interspaced short palindromic repeats (CRISPR) system, we constructed a genome-wide single-guide RNA library to screen for genes required for proliferation and survival in a human cancer cell line. Our screen revealed the set of cell-essential genes, which was validated with an orthogonal gene-trap-based screen and comparison with yeast gene knockouts. This set is enriched for genes that encode components of fundamental pathways, are expressed at high levels, and contain few inactivating polymorphisms in the human population. We also uncovered a large group of uncharacterized genes involved in RNA processing, a number of whose products localize to the nucleolus. Last, screens in additional cell lines showed a high degree of overlap in gene essentiality but also revealed differences specific to each cell line and cancer type that reflect the developmental origin, oncogenic drivers, paralogous gene expression pattern, and chromosomal structure of each line. These results demonstrate the power of CRISPR-based screens and suggest a general strategy for identifying liabilities in cancer cells.

1,371 citations


Journal ArticleDOI
09 Jan 2015-Science
TL;DR: A computational model is developed that scores how strongly genetic variants affect RNA splicing, a critical step in gene expression whose disruption contributes to many diseases, including cancers and neurological disorders, and provides insights into the role of aberrant splicing in disease.
Abstract: To facilitate precision medicine and whole genome annotation, we developed a machine learning technique that scores how strongly genetic variants affect RNA splicing, whose alteration contributes to many diseases. Analysis of over 650,000 intronic and exonic variants reveals widespread patterns of mutation-driven aberrant splicing. Intronic disease mutations alter splicing nine times more often than common variants, and missense exonic disease mutations that least impact protein function are five times more likely to alter splicing than others. Tens of thousands of disease-causing mutations are detected, including those involved in cancers and spinal muscular atrophy. Examination of intronic and exonic variants found using whole genome sequencing of individuals with autism reveals mis-spliced genes with neurodevelopmental phenotypes. Our approach provides evidence for causal variants and should enable new discoveries in precision medicine.

1,113 citations


01 Nov 2015
TL;DR: A genome-wide single-guide RNA library is constructed to screen for genes required for proliferation and survival in a human cancer cell line and reveals a set of cell-essential genes, which was validated with an orthogonal gene-trap–based screen and comparison with yeast gene knockouts.

1,113 citations


Journal ArticleDOI
13 Jul 2015-Nature
TL;DR: It is shown that egfl7 mutants are less sensitive than their wild-type siblings to Egfl7 knockdown, arguing against residual protein function in the mutants or significant off-target effects of the morpholinos when used at a moderate dose, and the activation of a compensatory network to buffer against deleterious mutations was not observed after translational or transcriptional knockdown.
Abstract: Cells sense their environment and adapt to it by fine-tuning their transcriptome. Wired into this network of gene expression control are mechanisms to compensate for gene dosage. The increasing use of reverse genetics in zebrafish, and other model systems, has revealed profound differences between the phenotypes caused by genetic mutations and those caused by gene knockdowns at many loci, an observation previously reported in mouse and Arabidopsis. To identify the reasons underlying the phenotypic differences between mutants and knockdowns, we generated mutations in zebrafish egfl7, an endothelial extracellular matrix gene of therapeutic interest, as well as in vegfaa. Here we show that egfl7 mutants do not show any obvious phenotypes while animals injected with egfl7 morpholino (morphants) exhibit severe vascular defects. We further observe that egfl7 mutants are less sensitive than their wild-type siblings to Egfl7 knockdown, arguing against residual protein function in the mutants or significant off-target effects of the morpholinos when used at a moderate dose. Comparing egfl7 mutant and morphant proteomes and transcriptomes, we identify a set of proteins and genes that are upregulated in mutants but not in morphants. Among them are extracellular matrix genes that can rescue egfl7 morphants, indicating that they could be compensating for the loss of Egfl7 function in the phenotypically wild-type egfl7 mutants. Moreover, egfl7 CRISPR interference, which obstructs transcript elongation and causes severe vascular defects, does not cause the upregulation of these genes. Similarly, vegfaa mutants but not morphants show an upregulation of vegfab. Taken together, these data reveal the activation of a compensatory network to buffer against deleterious mutations, which was not observed after translational or transcriptional knockdown.

Journal ArticleDOI
TL;DR: In this article, the authors use Capture Hi-C (CHi-C) to examine the long-range interactions of almost 22,000 promoters in 2 human blood cell types and identify over 1.6 million shared and cell type-restricted interactions spanning hundreds of kilobases between promoters and distal loci.
Abstract: Transcriptional control in large genomes often requires looping interactions between distal DNA elements, such as enhancers and target promoters. Current chromosome conformation capture techniques do not offer sufficiently high resolution to interrogate these regulatory interactions on a genomic scale. Here we use Capture Hi-C (CHi-C), an adapted genome conformation assay, to examine the long-range interactions of almost 22,000 promoters in 2 human blood cell types. We identify over 1.6 million shared and cell type-restricted interactions spanning hundreds of kilobases between promoters and distal loci. Transcriptionally active genes contact enhancer-like elements, whereas transcriptionally inactive genes interact with previously uncharacterized elements marked by repressive features that may act as long-range silencers. Finally, we show that interacting loci are enriched for disease-associated SNPs, suggesting how distal mutations may disrupt the regulation of relevant genes. This study provides new insights and accessible tools to dissect the regulatory interactions that underlie normal and aberrant gene regulation.

Journal ArticleDOI
TL;DR: A highly effective autonomous Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-associated protein 9 (Cas9)-mediated gene-drive system, adapted from the mutagenic chain reaction (MCR), is developed in the Asian malaria vector Anopheles stephensi.
Abstract: Genetic engineering technologies can be used both to create transgenic mosquitoes carrying antipathogen effector genes targeting human malaria parasites and to generate gene-drive systems capable of introgressing the genes throughout wild vector populations. We developed a highly effective autonomous Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-associated protein 9 (Cas9)-mediated gene-drive system in the Asian malaria vector Anopheles stephensi, adapted from the mutagenic chain reaction (MCR). This specific system results in progeny of males and females derived from transgenic males exhibiting a high frequency of germ-line gene conversion consistent with homology-directed repair (HDR). This system copies an ∼17-kb construct from its site of insertion to its homologous chromosome in a faithful, site-specific manner. Dual anti-Plasmodium falciparum effector genes, a marker gene, and the autonomous gene-drive components are introgressed into ∼99.5% of the progeny following outcrosses of transgenic lines to wild-type mosquitoes. The effector genes remain transcriptionally inducible upon blood feeding. In contrast to the efficient conversion in individuals expressing Cas9 only in the germ line, males and females derived from transgenic females, which are expected to have drive component molecules in the egg, produce progeny with a high frequency of mutations in the targeted genome sequence, resulting in near-Mendelian inheritance ratios of the transgene. Such mutant alleles result presumably from nonhomologous end-joining (NHEJ) events before the segregation of somatic and germ-line lineages early in development. These data support the design of this system to be active strictly within the germ line. Strains based on this technology could sustain control and elimination as part of the malaria eradication agenda.

Journal ArticleDOI
17 Dec 2015-Cell
TL;DR: 3D genome simulation suggests a model of chromatin folding around chromosomal axes, where CTCF is involved in defining the interface between condensed and open compartments for structural regulation, and provides unique insights in the topological mechanism of human variations and diseases.

Journal ArticleDOI
TL;DR: A draft genome using 181-fold paired-end sequences assisted by fivefold BAC-to-BAC sequences and a high-resolution genetic map is produced for G. hirsutum, revealing conserved gene order relative to the sequenced diploid genomes; concerted evolution of different regulatory mechanisms for Cellulose synthase and 1-Aminocyclopropane-1-carboxylic acid oxidase1 and 3 may be important for enhanced fiber production.
Abstract: Gossypium hirsutum has proven difficult to sequence owing to its complex allotetraploid (AtDt) genome. Here we produce a draft genome using 181-fold paired-end sequences assisted by fivefold BAC-to-BAC sequences and a high-resolution genetic map. In our assembly 88.5% of the 2,173-Mb scaffolds, which cover 89.6%∼96.7% of the AtDt genome, are anchored and oriented to 26 pseudochromosomes. Comparison of this G. hirsutum AtDt genome with the already sequenced diploid Gossypium arboreum (AA) and Gossypium raimondii (DD) genomes revealed conserved gene order. Repeated sequences account for 67.2% of the AtDt genome, and transposable elements (TEs) originating from Dt seem more active than from At. Reduction in the AtDt genome size occurred after allopolyploidization. The A or At genome may have undergone positive selection for fiber traits. Concerted evolution of different regulatory mechanisms for Cellulose synthase (CesA) and 1-Aminocyclopropane-1-carboxylic acid oxidase1 and 3 (ACO1,3) may be important for enhanced fiber production in G. hirsutum.

Journal ArticleDOI
15 Jan 2015-Cell
TL;DR: By extending guide RNAs to include effector protein recruitment sites, this work constructs modular scaffold RNAs that encode both target locus and regulatory action and applies this approach to flexibly redirect flux through a complex branched metabolic pathway in yeast.

Journal ArticleDOI
TL;DR: The human body contains several hundred cell types, all of which share the same genome, and much of the regulatory code that drives cell type-specific gene expression is located in distal elements called enhancers; the functions of enhancers and super-enhancers are influenced by, and affect, higher-order genomic organization.
Abstract: The human body contains several hundred cell types, all of which share the same genome. In metazoans, much of the regulatory code that drives cell type-specific gene expression is located in distal elements called enhancers. Although mammalian genomes contain millions of potential enhancers, only a small subset of them is active in a given cell type. Cell type-specific enhancer selection involves the binding of lineage-determining transcription factors that prime enhancers. Signal-dependent transcription factors bind to primed enhancers, which enables these broadly expressed factors to regulate gene expression in a cell type-specific manner. The expression of genes that specify cell type identity and function is associated with densely spaced clusters of active enhancers known as super-enhancers. The functions of enhancers and super-enhancers are influenced by, and affect, higher-order genomic organization.

Journal ArticleDOI
TL;DR: A high-resolution sequencing–based method is presented to detect G4s in the human genome and observed a high G4 density in functional regions, as well as in genes previously not predicted to contain these structures (such as BRCA2).
Abstract: G-quadruplexes (G4s) are nucleic acid secondary structures that form within guanine-rich DNA or RNA sequences. G4 formation can affect chromatin architecture and gene regulation and has been associated with genomic instability, genetic diseases and cancer progression. Here we present a high-resolution sequencing-based method to detect G4s in the human genome. We identified 716,310 distinct G4 structures, 451,646 of which were not predicted by computational methods. These included previously uncharacterized noncanonical long loop and bulged structures. We observed a high G4 density in functional regions, such as 5' untranslated regions and splicing sites, as well as in genes previously not predicted to contain these structures (such as BRCA2). G4 formation was significantly associated with oncogenes, tumor suppressors and somatic copy number alterations related to cancer development. The G4s identified in this study may therefore represent promising targets for cancer intervention.

Journal ArticleDOI
TL;DR: In this paper, SpCas9 and guide RNAs were delivered using adeno-associated viral (AAV) vectors to edit single (Mecp2) and multiple genes (Dnmt1, Dnmt3a and Dnmt3b) in the adult mouse brain in vivo, showing that AAV-mediated SpCas9 genome editing can enable reverse genetic studies of gene function in the brain.
Abstract: Probing gene function in the mammalian brain can be greatly assisted with methods to manipulate the genome of neurons in vivo. The clustered, regularly interspaced, short palindromic repeats (CRISPR)-associated endonuclease (Cas)9 from Streptococcus pyogenes (SpCas9) can be used to edit single or multiple genes in replicating eukaryotic cells, resulting in frame-shifting insertion/deletion (indel) mutations and subsequent protein depletion. Here, we delivered SpCas9 and guide RNAs using adeno-associated viral (AAV) vectors to target single (Mecp2) as well as multiple genes (Dnmt1, Dnmt3a and Dnmt3b) in the adult mouse brain in vivo. We characterized the effects of genome modifications in postmitotic neurons using biochemical, genetic, electrophysiological and behavioral readouts. Our results demonstrate that AAV-mediated SpCas9 genome editing can enable reverse genetic studies of gene function in the brain.

Journal ArticleDOI
27 Nov 2015-Science
TL;DR: A synthetic lethality network focused on the secretory pathway based exclusively on mutations was created and revealed a genetic cross-talk governing Golgi homeostasis, an additional subunit of the human oligosaccharyltransferase complex, and a phosphatidylinositol 4-kinase β adaptor hijacked by viruses.
Abstract: Although the genes essential for life have been identified in less complex model organisms, their elucidation in human cells has been hindered by technical barriers. We used extensive mutagenesis in haploid human cells to identify approximately 2000 genes required for optimal fitness under culture conditions. To study the principles of genetic interactions in human cells, we created a synthetic lethality network focused on the secretory pathway based exclusively on mutations. This revealed a genetic cross-talk governing Golgi homeostasis, an additional subunit of the human oligosaccharyltransferase complex, and a phosphatidylinositol 4-kinase β adaptor hijacked by viruses. The synthetic lethality map parallels observations made in yeast and projects a route forward to reveal genetic networks in diverse aspects of human cell biology.

Journal ArticleDOI
29 Jan 2015-Cell
TL;DR: It is reported that rapid evolution of enhancers is a universal feature of mammalian genomes and most of the recently evolved enhancers arise from ancestral DNA exaptation, rather than lineage-specific expansions of repeat elements.

Journal ArticleDOI
TL;DR: Efficient genome engineering in human CD4+ T cells using Cas9:single-guide RNA ribonucleoproteins (Cas9 RNPs) is reported, establishing Cas9 RNP technology for diverse experimental and therapeutic genome engineering applications in primary human T cells.
Abstract: T-cell genome engineering holds great promise for cell-based therapies for cancer, HIV, primary immune deficiencies, and autoimmune diseases, but genetic manipulation of human T cells has been challenging. Improved tools are needed to efficiently "knock out" genes and "knock in" targeted genome modifications to modulate T-cell function and correct disease-associated mutations. CRISPR/Cas9 technology is facilitating genome engineering in many cell types, but in human T cells its efficiency has been limited and it has not yet proven useful for targeted nucleotide replacements. Here we report efficient genome engineering in human CD4(+) T cells using Cas9:single-guide RNA ribonucleoproteins (Cas9 RNPs). Cas9 RNPs allowed ablation of CXCR4, a coreceptor for HIV entry. Cas9 RNP electroporation caused up to ∼40% of cells to lose high-level cell-surface expression of CXCR4, and edited cells could be enriched by sorting based on low CXCR4 expression. Importantly, Cas9 RNPs paired with homology-directed repair template oligonucleotides generated a high frequency of targeted genome modifications in primary T cells. Targeted nucleotide replacement was achieved in CXCR4 and PD-1 (PDCD1), a regulator of T-cell exhaustion that is a validated target for tumor immunotherapy. Deep sequencing of a target site confirmed that Cas9 RNPs generated knock-in genome modifications with up to ∼20% efficiency, which accounted for up to approximately one-third of total editing events. These results establish Cas9 RNP technology for diverse experimental and therapeutic genome engineering applications in primary human T cells.

Journal ArticleDOI
TL;DR: The current knowledge of the mechanisms controlling R loops and their putative relationship with disease is reviewed, including the DNA and RNA metabolism factors that prevent R-loop formation in cells.
Abstract: R loops are nucleic acid structures composed of an RNA-DNA hybrid and a displaced single-stranded DNA. Recently, evidence has emerged that R loops occur more often in the genome and have greater physiological relevance, including roles in transcription and chromatin structure, than was previously predicted. Importantly, however, R loops are also a major threat to genome stability. For this reason, several DNA and RNA metabolism factors prevent R-loop formation in cells. Dysfunction of these factors causes R-loop accumulation, which leads to replication stress, genome instability, chromatin alterations or gene silencing, phenomena that are frequently associated with cancer and a number of genetic diseases. We review the current knowledge of the mechanisms controlling R loops and their putative relationship with disease.

Journal ArticleDOI
TL;DR: The examples reported in this study demonstrate the utility of Cas9-guide RNA technology as a plant genome editing tool to enhance plant breeding and crop research needed to meet growing agriculture demands of the future.
Abstract: Targeted mutagenesis, editing of endogenous maize (Zea mays) genes, and site-specific insertion of a trait gene using clustered regularly interspaced short palindromic repeats (CRISPR)-associated (Cas)-guide RNA technology are reported in maize. DNA vectors expressing maize codon-optimized Streptococcus pyogenes Cas9 endonuclease and single guide RNAs were cointroduced with or without DNA repair templates into maize immature embryos by biolistic transformation targeting five different genomic regions: upstream of the liguleless1 (LIG1) gene, male fertility genes (Ms26 and Ms45), and acetolactate synthase (ALS) genes (ALS1 and ALS2). Mutations were subsequently identified at all sites targeted, and plants containing biallelic multiplex mutations at LIG1, Ms26, and Ms45 were recovered. Biolistic delivery of guide RNAs (as RNA molecules) directly into immature embryo cells containing preintegrated Cas9 also resulted in targeted mutations. Editing the ALS2 gene using either single-stranded oligonucleotides or double-stranded DNA vectors as repair templates yielded chlorsulfuron-resistant plants. Double-strand breaks generated by RNA-guided Cas9 endonuclease also stimulated insertion of a trait gene at a site near LIG1 by homology-directed repair. Progeny showed expected Mendelian segregation of mutations, edits, and targeted gene insertions. The examples reported in this study demonstrate the utility of Cas9-guide RNA technology as a plant genome editing tool to enhance plant breeding and crop research needed to meet growing agriculture demands of the future.

Journal ArticleDOI
TL;DR: RNA sequencing and single-nucleotide polymorphism array analysis of 675 human cancer cell lines is described and multiple genome and transcriptome features are combined in a pathway-based approach to enhance prediction of response to targeted therapeutics.
Abstract: Tumor-derived cell lines have served as vital models to advance our understanding of oncogene function and therapeutic responses. Although substantial effort has been made to define the genomic constitution of cancer cell line panels, the transcriptome remains understudied. Here we describe RNA sequencing and single-nucleotide polymorphism (SNP) array analysis of 675 human cancer cell lines. We report comprehensive analyses of transcriptome features including gene expression, mutations, gene fusions and expression of non-human sequences. Of the 2,200 gene fusions catalogued, 1,435 consist of genes not previously found in fusions, providing many leads for further investigation. We combine multiple genome and transcriptome features in a pathway-based approach to enhance prediction of response to targeted therapeutics. Our results provide a valuable resource for studies that use cancer cell lines.

Journal ArticleDOI
TL;DR: It is shown that MEG3 and EZH2 share common target genes, including the TGF-β pathway genes, and RNA–DNA triplex formation could be a general characteristic of target gene recognition by the chromatin-interacting lncRNAs.
Abstract: Long noncoding RNAs (lncRNAs) regulate gene expression by association with chromatin, but how they target chromatin remains poorly understood. We have used chromatin RNA immunoprecipitation-coupled high-throughput sequencing to identify 276 lncRNAs enriched in repressive chromatin from breast cancer cells. Using one of the chromatin-interacting lncRNAs, MEG3, we explore the mechanisms by which lncRNAs target chromatin. Here we show that MEG3 and EZH2 share common target genes, including the TGF-β pathway genes. Genome-wide mapping of MEG3 binding sites reveals that MEG3 modulates the activity of TGF-β genes by binding to distal regulatory elements. MEG3 binding sites have GA-rich sequences, which guide MEG3 to the chromatin through RNA–DNA triplex formation. We have found that RNA–DNA triplex structures are widespread and are present over the MEG3 binding sites associated with the TGF-β pathway genes. Our findings suggest that RNA–DNA triplex formation could be a general characteristic of target gene recognition by the chromatin-interacting lncRNAs.

Journal ArticleDOI
TL;DR: Cancers with recurrent somatic HLA mutations were associated with upregulation of signatures of cytolytic activity characteristic of tumor infiltration by effector lymphocytes, supporting immune evasion by altered HLA function as a contributory mechanism in cancer.
Abstract: Detection of somatic mutations in human leukocyte antigen (HLA) genes using whole-exome sequencing (WES) is hampered by the high polymorphism of the HLA loci, which prevents alignment of sequencing reads to the human reference genome. We describe a computational pipeline that enables accurate inference of germline alleles of class I HLA-A, B and C genes and subsequent detection of mutations in these genes using the inferred alleles as a reference. Analysis of WES data from 7,930 pairs of tumor and healthy tissue from the same patient revealed 298 nonsilent HLA mutations in tumors from 266 patients. These 298 mutations are enriched for likely functional mutations, including putative loss-of-function events. Recurrence of mutations suggested that these 'hotspot' sites were positively selected. Cancers with recurrent somatic HLA mutations were associated with upregulation of signatures of cytolytic activity characteristic of tumor infiltration by effector lymphocytes, supporting immune evasion by altered HLA function as a contributory mechanism in cancer.

Journal ArticleDOI
TL;DR: The spectrum of gene fusions in cancer and how the methods to identify them have evolved are described, and the conceptual implications of current, sequencing-based approaches for detection are discussed.
Abstract: Structural chromosome rearrangements may result in the exchange of coding or regulatory DNA sequences between genes. Many such gene fusions are strong driver mutations in neoplasia and have provided fundamental insights into the disease mechanisms that are involved in tumorigenesis. The close association between the type of gene fusion and the tumour phenotype makes gene fusions ideal for diagnostic purposes, enabling the subclassification of otherwise seemingly identical disease entities. In addition, many gene fusions add important information for risk stratification, and increasing numbers of chimeric proteins encoded by the gene fusions serve as specific targets for treatment, resulting in dramatically improved patient outcomes. In this Timeline article, we describe the spectrum of gene fusions in cancer and how the methods to identify them have evolved, and also discuss conceptual implications of current, sequencing-based approaches for detection.

Journal ArticleDOI
04 Jun 2015-Nature
TL;DR: A genome-wide length-dependent increase in gene expression is identified in MeCP2 mutant mouse models and human RTT brains, and it is found that long genes as a population are enriched for neuronal functions and selectively expressed in the brain.
Abstract: Rett syndrome is caused by mutation of the MECP2 gene that codes for a protein that binds methylated DNA; this study reveals that MeCP2 affects the expression of long genes, which often serve neuronal functions. Autism-related Rett syndrome is caused by disruption of the MECP2 gene, which codes for a methyl-DNA binding protein, but how MECP2 may control transcription of other genes has remained unclear. Now Michael Greenberg and colleagues show that disruption of the Mecp2 gene in a mouse model and in human Rett syndrome leads to preferential upregulation of longer genes, and that these often serve neuronal functions. Further data indicate that decreasing the expression of long genes, via hypomethylation of the dinucleotide CA, attenuates Rett-related dysfunctions in cultured neurons lacking MECP2. Disruption of the MECP2 gene leads to Rett syndrome (RTT), a severe neurological disorder with features of autism [1]. MECP2 encodes a methyl-DNA-binding protein [2] that has been proposed to function as a transcriptional repressor, but despite numerous mouse studies examining neuronal gene expression in Mecp2 mutants, no clear model has emerged for how MeCP2 protein regulates transcription [3-9]. Here we identify a genome-wide length-dependent increase in gene expression in MeCP2 mutant mouse models and human RTT brains. We present evidence that MeCP2 represses gene expression by binding to methylated CA sites within long genes, and that in neurons lacking MeCP2, decreasing the expression of long genes attenuates RTT-associated cellular deficits. In addition, we find that long genes as a population are enriched for neuronal functions and selectively expressed in the brain. These findings suggest that mutations in MeCP2 may cause neurological dysfunction by specifically disrupting long gene expression in the brain.

Journal ArticleDOI
TL;DR: Embryonic stem cell culture conditions are important for maintaining long-term self-renewal, and they influence cellular pluripotency state, with 2i being the most similar to blastocyst cells and including a subpopulation resembling the two-cell embryo state.