Showing papers by "Broad Institute" published in 2014
••
TL;DR: Associations at DRD2 and several genes involved in glutamatergic neurotransmission highlight molecules of known and potential therapeutic relevance to schizophrenia, and are consistent with leading pathophysiological hypotheses.
Abstract: Schizophrenia is a highly heritable disorder. Genetic risk is conferred by a large number of alleles, including common alleles of small effect that might be detected by genome-wide association studies. Here we report a multi-stage schizophrenia genome-wide association study of up to 36,989 cases and 113,075 controls. We identify 128 independent associations spanning 108 conservatively defined loci that meet genome-wide significance, 83 of which have not been previously reported. Associations were enriched among genes expressed in brain, providing biological plausibility for the findings. Many findings have the potential to provide entirely new insights into aetiology, but associations at DRD2 and several genes involved in glutamatergic neurotransmission highlight molecules of known and potential therapeutic relevance to schizophrenia, and are consistent with leading pathophysiological hypotheses. Independent of genes expressed in brain, associations were enriched among genes expressed in tissues that have important roles in immunity, providing support for the speculated link between the immune system and schizophrenia.
6,809 citations
••
TL;DR: In situ Hi-C is used to probe the 3D architecture of genomes, constructing haploid and diploid maps of nine cell types, identifying ∼10,000 loops that frequently link promoters and enhancers, correlate with gene activation, and show conservation across cell types and species.
5,945 citations
••
TL;DR: Pilon is a fully automated, all-in-one tool for correcting draft assemblies and calling sequence variants of multiple sizes, including very large insertions and deletions, which is being used to improve the assemblies of thousands of new genomes and to identify variants from thousands of clinically relevant bacterial strains.
Abstract: Advances in modern sequencing technologies allow us to generate sufficient data to analyze hundreds of bacterial genomes from a single machine in a single day. This potential for sequencing massive numbers of genomes calls for fully automated methods to produce high-quality assemblies and variant calls. We introduce Pilon, a fully automated, all-in-one tool for correcting draft assemblies and calling sequence variants of multiple sizes, including very large insertions and deletions. Pilon works with many types of sequence data, but is particularly strong when supplied with paired end data from two Illumina libraries with small (e.g., 180 bp) and large (e.g., 3-5 Kb) inserts. Pilon significantly improves draft genome assemblies by correcting bases, fixing mis-assemblies and filling gaps. For both haploid and diploid genomes, Pilon produces more contiguous genomes with fewer errors, enabling identification of more biologically relevant genes. Furthermore, Pilon identifies small variants with high accuracy as compared to state-of-the-art tools and is unique in its ability to accurately identify large sequence variants including duplications and resolve large insertions. Pilon is being used to improve the assemblies of thousands of new genomes and to identify variants from thousands of clinically relevant bacterial strains. Pilon is freely available as open source software.
5,659 citations
••
TL;DR: A comprehensive molecular evaluation of 295 primary gastric adenocarcinomas as part of The Cancer Genome Atlas (TCGA) project is described and a molecular classification dividing gastric cancer into four subtypes is proposed.
Abstract: Gastric cancer was the world’s third leading cause of cancer mortality in 2012, responsible for 723,000 deaths [1]. The vast majority of gastric cancers are adenocarcinomas, which can be further subdivided into intestinal and diffuse types according to the Lauren classification [2]. An alternative system, proposed by the World Health Organization, divides gastric cancer into papillary, tubular, mucinous (colloid) and poorly cohesive carcinomas [3]. These classification systems have little clinical utility, making the development of robust classifiers that can guide patient therapy an urgent priority.
The majority of gastric cancers are associated with infectious agents, including the bacterium Helicobacter pylori [4] and Epstein–Barr virus (EBV). The distribution of histological subtypes of gastric cancer and the frequencies of H. pylori- and EBV-associated gastric cancer vary across the globe [5]. A small minority of gastric cancer cases are associated with germline mutation in E-cadherin (CDH1) [6] or mismatch repair genes [7] (Lynch syndrome), whereas sporadic mismatch repair-deficient gastric cancers have epigenetic silencing of MLH1 in the context of a CpG island methylator phenotype (CIMP) [8]. Molecular profiling of gastric cancer has been performed using gene expression or DNA sequencing [9–12], but has not led to a clear biologic classification scheme. The goals of this study by The Cancer Genome Atlas (TCGA) were to develop a robust molecular classification of gastric cancer and to identify dysregulated pathways and candidate drivers of distinct classes of gastric cancer.
4,583 citations
••
TL;DR: In this paper, the authors describe the development and applications of Cas9 for a variety of research or translational applications while highlighting challenges as well as future directions.
4,361 citations
••
TL;DR: This work shows that lentiviral delivery of a genome-scale CRISPR-Cas9 knockout (GeCKO) library targeting 18,080 genes with 64,751 unique guide sequences enables both negative and positive selection screening in human cells, and observes a high level of consistency between independent guide RNAs targeting the same gene and a high rate of hit confirmation.
Abstract: The simplicity of programming the CRISPR (clustered regularly interspaced short palindromic repeats)–associated nuclease Cas9 to modify specific genomic loci suggests a new way to interrogate gene function on a genome-wide scale. We show that lentiviral delivery of a genome-scale CRISPR-Cas9 knockout (GeCKO) library targeting 18,080 genes with 64,751 unique guide sequences enables both negative and positive selection screening in human cells. First, we used the GeCKO library to identify genes essential for cell viability in cancer and pluripotent stem cells. Next, in a melanoma model, we screened for genes whose loss is involved in resistance to vemurafenib, a therapeutic RAF inhibitor. Our highest-ranking candidates include previously validated genes NF1 and MED12, as well as novel hits NF2, CUL3, TADA2B, and TADA1. We observe a high level of consistency between independent guide RNAs targeting the same gene and a high rate of hit confirmation, demonstrating the promise of genome-scale screening with Cas9.
4,147 citations
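As a rough sanity check on the library dimensions quoted in the abstract above, the following sketch computes the average number of guide sequences per targeted gene. Only the two counts come from the abstract; the variable names are illustrative, and the actual per-gene guide counts in the library may vary.

```python
# Counts quoted in the GeCKO abstract above.
n_genes = 18_080
n_guides = 64_751

# Mean guides per gene (an average only; individual genes may have more or fewer).
mean_guides_per_gene = n_guides / n_genes
print(f"{mean_guides_per_gene:.2f} guides per gene on average")
```

The average works out to between three and four guides per gene, which fits the abstract's emphasis on comparing independent guide RNAs that target the same gene.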
••
TL;DR: Monocle is described, an unsupervised algorithm that increases the temporal resolution of transcriptome dynamics using single-cell RNA-Seq data collected at multiple time points that revealed switch-like changes in expression of key regulatory factors, sequential waves of gene regulation, and expression of regulators that were not known to act in differentiation.
Abstract: Defining the transcriptional dynamics of a temporal process such as cell differentiation is challenging owing to the high variability in gene expression between individual cells. Time-series gene expression analyses of bulk cells have difficulty distinguishing early and late phases of a transcriptional cascade or identifying rare subpopulations of cells, and single-cell proteomic methods rely on a priori knowledge of key distinguishing markers. Here we describe Monocle, an unsupervised algorithm that increases the temporal resolution of transcriptome dynamics using single-cell RNA-Seq data collected at multiple time points. Applied to the differentiation of primary human myoblasts, Monocle revealed switch-like changes in expression of key regulatory factors, sequential waves of gene regulation, and expression of regulators that were not known to act in differentiation. We validated some of these predicted regulators in a loss-of-function screen. Monocle can in principle be used to recover single-cell gene expression kinetics from a wide array of cellular processes, including differentiation, proliferation and oncogenic transformation.
4,119 citations
••
Eric A. Collisson1, Joshua D. Campbell2, Angela N. Brooks3, Angela N. Brooks2 +315 more•Institutions (41)
TL;DR: In this paper, the authors report molecular profiling of 230 resected lung adenocarcinomas using messenger RNA, microRNA and DNA sequencing integrated with copy number, methylation and proteomic analyses.
Abstract: Adenocarcinoma of the lung is the leading cause of cancer death worldwide. Here we report molecular profiling of 230 resected lung adenocarcinomas using messenger RNA, microRNA and DNA sequencing integrated with copy number, methylation and proteomic analyses. High rates of somatic mutation were seen (mean 8.9 mutations per megabase). Eighteen genes were statistically significantly mutated, including RIT1 activating mutations and newly described loss-of-function MGA mutations which are mutually exclusive with focal MYC amplification. EGFR mutations were more frequent in female patients, whereas mutations in RBM10 were more common in males. Aberrations in NF1, MET, ERBB2 and RIT1 occurred in 13% of cases and were enriched in samples otherwise lacking an activated oncogene, suggesting a driver role for these events in certain tumours. DNA and mRNA sequence from the same tumour highlighted splicing alterations driven by somatic genomic changes, including exon 14 skipping in MET mRNA in 4% of cases. MAPK and PI(3)K pathway activity, when measured at the protein level, was explained by known mutations in only a fraction of cases, suggesting additional, unexplained mechanisms of pathway activation. These data establish a foundation for classification and further investigations of lung adenocarcinoma molecular pathogenesis.
4,104 citations
••
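TL;DR: Building on an earlier Genome-scale CRISPR Knock-Out (GeCKO) screen for vemurafenib resistance in a melanoma model, this paper describes higher-titer lentiviral vectors (lentiCRISPRv2 and a two-vector lentiCas9-Blast/lentiGuide-Puro system) and new human and mouse GeCKO v2 sgRNA libraries.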
Abstract: Genome-wide, targeted loss-of-function pooled screens using the CRISPR (clustered regularly interspaced short palindromic repeats)–associated nuclease Cas9 in human and mouse cells provide an alternative screening system to RNA interference (RNAi) and have been used to reveal new mechanisms in diverse biological models [1-4]. Previously, we used a Genome-scale CRISPR Knock-Out (GeCKO) library to identify loss-of-function mutations conferring vemurafenib resistance in a melanoma model [1]. However, initial lentiviral delivery systems for CRISPR screening had low viral titer or required a cell line already expressing Cas9, limiting the range of biological systems amenable to screening.
Here, we sought to improve both the lentiviral packaging and choice of guide sequences in our original GeCKO library [1], where a pooled library of synthesized oligonucleotides was cloned into a lentiviral backbone containing both the Streptococcus pyogenes Cas9 nuclease and the single guide RNA (sgRNA) scaffold. To create a new vector capable of producing higher-titer virus (lentiCRISPRv2), we made several modifications, including removal of one of the nuclear localization signals (NLS), human codon-optimization of the remaining NLS and P2A bicistronic linker sequences, and repositioning of the U6-driven sgRNA cassette (Fig. 1a). These changes resulted in a ~10-fold increase in functional viral titer over lentiCRISPRv1 [1] (Fig. 1b).
Figure 1. New lentiviral CRISPR designs produce viruses with higher functional titer.
To further increase viral titer, we also cloned a two-vector system, in which Cas9 (lentiCas9-Blast) and sgRNA (lentiGuide-Puro) are delivered using separate viral vectors with distinct antibiotic selection markers (Fig. 1a). LentiGuide-Puro has a ~100-fold increase in functional viral titer over the original lentiCRISPRv1 (Fig. 1b). Both single and dual-vector systems mediate efficient knock-out of a genomically-integrated copy of EGFP in human cells (Supplementary Fig. 1). Whereas the dual vector system enables generation of Cas9-expressing cell lines which can be subsequently used for screens using lentiGuide-Puro, the single vector lentiCRISPRv2 may be better suited for in vivo or primary cell screening applications.
In addition to the vector improvements, we designed and synthesized new human and mouse GeCKOv2 sgRNA libraries (Supplementary Methods) with several improvements (Table 1): First, for both human and mouse libraries, to target all genes with a uniform number of sgRNAs, we selected 6 sgRNAs per gene distributed over 3-4 constitutively expressed exons. Second, to further minimize off-target genome modification, we improved the calculation of off-target scores based on specificity analysis [5]. Third, to inactivate microRNAs (miRNAs) which play a key role in transcriptional regulation, we added sgRNAs to direct mutations to the pre-miRNA hairpin structure [6]. Finally, we targeted ~1000 additional genes not included in the original GeCKO library.
Table 1. Comparison of new GeCKO v2 human and mouse sgRNA libraries with existing CRISPR libraries.
Both libraries, mouse and human, are divided into 2 sub-libraries, each containing 3 sgRNAs targeting each gene in the genome, as well as 1000 non-targeting control sgRNAs. Screens can be performed by combining both sub-libraries, yielding 6 sgRNAs per gene, for higher coverage. Alternatively, individual sub-libraries can be used in situations where cell numbers are limiting (e.g., primary cells, in vivo screens). The human and mouse libraries have been cloned into lentiCRISPRv2 and into lentiGuide-Puro and deep sequenced to ensure uniform representation (Supplementary Fig. 2, 3). These new lentiCRISPR vectors and human and mouse libraries further improve the GeCKO reagents for diverse screening applications. Reagents are available to the academic community through Addgene, and associated protocols, support forums, and computational tools are available via the Zhang lab website (www.genome-engineering.org).
3,833 citations
••
TL;DR: Single-cell RNA sequencing of 430 cells from five primary glioblastomas revealed extensive intratumoral variation in transcriptional programs related to oncogenic signaling, proliferation, complement/immune response, and hypoxia, with potential implications for prognosis and therapy.
Abstract: Human cancers are complex ecosystems composed of cells with distinct phenotypes, genotypes, and epigenetic states, but current models do not adequately reflect tumor composition in patients. We used single-cell RNA sequencing (RNA-seq) to profile 430 cells from five primary glioblastomas, which we found to be inherently variable in their expression of diverse transcriptional programs related to oncogenic signaling, proliferation, complement/immune response, and hypoxia. We also observed a continuum of stemness-related expression states that enabled us to identify putative regulators of stemness in vivo. Finally, we show that established glioblastoma subtype classifiers are variably expressed across individual cells within a tumor and demonstrate the potential prognostic implications of such intratumoral heterogeneity. Thus, we reveal previously unappreciated heterogeneity in diverse regulatory programs central to glioblastoma biology, prognosis, and therapy.
3,475 citations
••
TL;DR: Targeted metabolomic profiling and chemoproteomics revealed that GPX4 is an essential regulator of ferroptotic cancer cell death, and sensitivity profiling in 177 cancer cell lines revealed that diffuse large B cell lymphomas and renal cell carcinomas are particularly susceptible to GPX4-regulated ferroptosis.
01 Jun 2014
TL;DR: The development and applications of Cas9 are described for a variety of research or translational applications while highlighting challenges as well as future directions.
Abstract: Recent advances in genome engineering technologies based on the CRISPR-associated RNA-guided endonuclease Cas9 are enabling the systematic interrogation of mammalian genome function. Analogous to the search function in modern word processors, Cas9 can be guided to specific locations within complex genomes by a short RNA search string. Using this system, DNA sequences within the endogenous genome and their functional outputs are now easily edited or modulated in virtually any organism of choice. Cas9-mediated genetic perturbation is simple and scalable, empowering researchers to elucidate the functional organization of the genome at the systems level and establish causal linkages between genetic variations and biological phenotypes. In this Review, we describe the development and applications of Cas9 for a variety of research or translational applications while highlighting challenges as well as future directions. Derived from a remarkable microbial defense system, Cas9 is driving innovative applications from basic biology to biotechnology and medicine.
••
Broad Institute1, Harvard University2, Hannover Medical School3, Minerva Foundation Institute for Medical Research4, University of Helsinki5, University of Southern California6, National Institutes of Health7, National Institute for Health and Welfare8, University of Mississippi Medical Center9, University of Mississippi10, Massachusetts Institute of Technology11, University of Michigan12, University of Oxford13, Danube University Krems14, King Abdulaziz University15, Albert Einstein College of Medicine16
TL;DR: Age-related clonal hematopoiesis is a common condition that is associated with increases in the risk of hematologic cancer and in all-cause mortality, with the latter possibly due to an increased risk of cardiovascular disease.
Abstract: Background The incidence of hematologic cancers increases with age. These cancers are associated with recurrent somatic mutations in specific genes. We hypothesized that such mutations would be detectable in the blood of some persons who are not known to have hematologic disorders. Methods We analyzed whole-exome sequencing data from DNA in the peripheral-blood cells of 17,182 persons who were unselected for hematologic phenotypes. We looked for somatic mutations by identifying previously characterized single-nucleotide variants and small insertions or deletions in 160 genes that are recurrently mutated in hematologic cancers. The presence of mutations was analyzed for an association with hematologic phenotypes, survival, and cardiovascular events. Results Detectable somatic mutations were rare in persons younger than 40 years of age but rose appreciably in frequency with age. Among persons 70 to 79 years of age, 80 to 89 years of age, and 90 to 108 years of age, these clonal mutations were observed in 9.5% (219 of 2300 persons), 11.7% (37 of 317), and 18.4% (19 of 103), respectively. The majority of the variants occurred in three genes: DNMT3A, TET2, and ASXL1. The presence of a somatic mutation was associated with an increase in the risk of hematologic cancer (hazard ratio, 11.1; 95% confidence interval [CI], 3.9 to 32.6), an increase in all-cause mortality (hazard ratio, 1.4; 95% CI, 1.1 to 1.8), and increases in the risks of incident coronary heart disease (hazard ratio, 2.0; 95% CI, 1.2 to 3.4) and ischemic stroke (hazard ratio, 2.6; 95% CI, 1.4 to 4.8). Conclusions Age-related clonal hematopoiesis is a common condition that is associated with increases in the risk of hematologic cancer and in all-cause mortality, with the latter possibly due to an increased risk of cardiovascular disease. (Funded by the National Institutes of Health and others.)
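The age-band prevalences quoted in the Results can be reproduced directly from the carrier and cohort counts given in the abstract. The sketch below does only that arithmetic; the function and variable names are illustrative and do not come from the study.

```python
# (age band in years, persons with a detectable clonal mutation, persons in band),
# all three values as quoted in the abstract above.
age_bands = [
    ("70-79", 219, 2300),
    ("80-89", 37, 317),
    ("90-108", 19, 103),
]

def prevalence_percent(carriers, persons):
    """Return carrier prevalence as a percentage rounded to one decimal place."""
    return round(100 * carriers / persons, 1)

prevalences = {band: prevalence_percent(c, n) for band, c, n in age_bands}
for band, pct in prevalences.items():
    print(f"{band} years: {pct}%")
```

The computed values match the 9.5%, 11.7%, and 18.4% reported in the abstract, and make the rise in frequency with age explicit.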
••
TL;DR: It is found that large-scale genomic analysis can identify nearly all known cancer genes in these cancer types and 33 genes that were not previously known to be significantly mutated in cancer, including genes related to proliferation, apoptosis, genome stability, chromatin regulation, immune evasion, RNA processing and protein homeostasis.
Abstract: Although a few cancer genes are mutated in a high proportion of tumours of a given type (>20%), most are mutated at intermediate frequencies (2–20%). To explore the feasibility of creating a comprehensive catalogue of cancer genes, we analysed somatic point mutations in exome sequences from 4,742 human cancers and their matched normal-tissue samples across 21 cancer types. We found that large-scale genomic analysis can identify nearly all known cancer genes in these tumour types. Our analysis also identified 33 genes that were not previously known to be significantly mutated in cancer, including genes related to proliferation, apoptosis, genome stability, chromatin regulation, immune evasion, RNA processing and protein homeostasis. Down-sampling analysis indicates that larger sample sizes will reveal many more genes mutated at clinically important frequencies. We estimate that near-saturation may be achieved with 600–5,000 samples per tumour type, depending on background mutation frequency. The results may help to guide the next stage of cancer genomics. Comprehensive knowledge of the genes underlying human cancers is a critical foundation for cancer diagnostics, therapeutics, clinical-trial design and selection of rational combination therapies. It is now possible to use genomic analysis to identify cancer genes in an unbiased fashion, based on the presence of somatic mutations at a rate significantly higher than the expected background level. Systematic studies have revealed many new cancer genes, as well as new classes of cancer genes [1,2]. They have also made clear that, although some cancer genes are mutated at high frequencies, most cancer genes in most patients occur at intermediate frequencies (2–20%) or lower. Accordingly, a complete catalogue of mutations in this frequency class will be essential for recognizing dysregulated pathways and optimal targets for therapeutic intervention.
However, recent work suggests major gaps in our knowledge of cancer genes of intermediate frequency. For example, a study of 183 lung adenocarcinomas [3] found that 15% of patients lacked even a single mutation affecting any of the 10 known hallmarks of cancer, and 38% had 3 or fewer such mutations. In this paper, we analysed somatic point mutations (substitutions and small insertions and deletions) in nearly 5,000 human cancers and their matched normal-tissue samples (‘tumour–normal pairs’) across 21 tumour types. The questions that we examine here are: first, whether large-scale genomic analysis across tumour types can reliably identify all known cancer genes; second, whether it will reveal many new candidate cancer genes; and third, how far we are from having a complete catalogue of cancer genes (at least those of intermediate frequency). We used rigorous statistical methods to enumerate candidate cancer genes and then carefully inspected each gene to identify those with strong biological connections to cancer and mutational patterns consistent with the expected function. The analysis reveals nearly all known cancer genes and 33 novel candidates, including genes related to proliferation, apoptosis, genome stability, chromatin regulation, immune evasion, RNA processing and protein homeostasis. Importantly, the data show that the
••
TL;DR: In this paper, a pooled, loss-of-function genetic screening approach suitable for both positive and negative selection that uses a genome-scale lentiviral single-guide RNA (sgRNA) library was described.
Abstract: The bacterial clustered regularly interspaced short palindromic repeats (CRISPR)–Cas9 system for genome editing has greatly expanded the toolbox for mammalian genetics, enabling the rapid generation of isogenic cell lines and mice with modified alleles. Here, we describe a pooled, loss-of-function genetic screening approach suitable for both positive and negative selection that uses a genome-scale lentiviral single-guide RNA (sgRNA) library. sgRNA expression cassettes were stably integrated into the genome, which enabled a complex mutant pool to be tracked by massively parallel sequencing. We used a library containing 73,000 sgRNAs to generate knockout collections and performed screens in two human cell lines. A screen for resistance to the nucleotide analog 6-thioguanine identified all expected members of the DNA mismatch repair pathway, whereas another for the DNA topoisomerase II (TOP2A) poison etoposide identified TOP2A, as expected, and also cyclin-dependent kinase 6, CDK6. A negative selection screen for essential genes identified numerous gene sets corresponding to fundamental processes. Last, we show that sgRNA efficiency is associated with specific sequence motifs, enabling the prediction of more effective sgRNAs. Collectively, these results establish Cas9/sgRNA screens as a powerful tool for systematic genetic analysis in mammalian cells.
••
Broad Institute1, Emory University2, Cincinnati Children's Hospital Medical Center3, University of Colorado Boulder4, Harvard University5, University of Minnesota6, University of Toronto7, Women & Children's Hospital of Buffalo8, Boston Children's Hospital9, Mayo Clinic10, University of California, San Francisco11, Long Island Jewish Medical Center12, Children's Hospital of Philadelphia13, Children's Hospital of Eastern Ontario14, Nationwide Children's Hospital15, Howard Hughes Medical Institute16
TL;DR: Comparing the microbial signatures between the ileum, the rectum, and fecal samples indicates that at this early stage of disease, assessing the rectal mucosal-associated microbiome offers unique potential for convenient and early diagnosis of CD.
••
Icahn School of Medicine at Mount Sinai1, Carnegie Mellon University2, Harvard University3, University of Toronto4, Wellcome Trust Sanger Institute5, University of Pittsburgh6, Nagoya University7, University of Freiburg8, King's College London9, Vanderbilt University10, King Abdulaziz University11, University of Santiago de Compostela12, University of Utah13, Duke University14, Memorial University of Newfoundland15, Trinity College, Dublin16, University of Pennsylvania17, University of Illinois at Chicago18, Boston Children's Hospital19, Columbia University20, German Cancer Research Center21, University College London22, Kaiser Permanente23, Broad Institute24, Cardiff University25, Complutense University of Madrid26, Newcastle University27, Baylor College of Medicine28, University of California, San Francisco29, RWTH Aachen University30, National Health Service31, McMaster University32, Saarland University33, Karolinska Institutet34, National Institutes of Health35, University of Helsinki36, Emory University37
TL;DR: Using exome sequencing, it is shown that analysis of rare coding variation in 3,871 autism cases and 9,937 ancestry-matched or parental controls implicates 22 autosomal genes at a false discovery rate of < 0.05, plus a set of 107 genes strongly enriched for those likely to affect risk (FDR < 0.30).
Abstract: The genetic architecture of autism spectrum disorder involves the interplay of common and rare variants and their impact on hundreds of genes. Using exome sequencing, here we show that analysis of rare coding variation in 3,871 autism cases and 9,937 ancestry-matched or parental controls implicates 22 autosomal genes at a false discovery rate (FDR) < 0.05, plus a set of 107 autosomal genes strongly enriched for those likely to affect risk (FDR < 0.30). These 107 genes, which show unusual evolutionary constraint against mutations, incur de novo loss-of-function mutations in over 5% of autistic subjects. Many of the genes implicated encode proteins for synaptic formation, transcriptional regulation and chromatin-remodelling pathways. These include voltage-gated ion channels regulating the propagation of action potentials, pacemaking and excitability-transcription coupling, as well as histone-modifying enzymes and chromatin remodellers, most prominently those that mediate post-translational lysine methylation/demethylation modifications of histones.
••
Broad Institute1, Harvard University2, Monash University3, Kyoto University4, Genentech5, Vanderbilt University6, New York University7, NewYork–Presbyterian Hospital8, Second Military Medical University9, University of Queensland10, University of Toronto11, University of Groningen12, University of Tartu13, Beijing Jiaotong University14, Icahn School of Medicine at Mount Sinai15, Radboud University Nijmegen16, Medisch Spectrum Twente17, Leiden University18, University of Paris19, French Institute of Health and Medical Research20, University of Alabama at Birmingham21, University of Cambridge22, University of Amsterdam23, GlaxoSmithKline24, Hanyang University25, Spanish National Research Council26, Complutense University of Madrid27, Umeå University28, Boston University29, Council on Education for Public Health30, McGill University31, University of Manchester32, National Health Service33, University of Pittsburgh34, University of California, San Francisco35, Karolinska Institutet36, North Shore-LIJ Health System37, University of Chicago38, University of Tokyo39
TL;DR: A genome-wide association study meta-analysis in a total of >100,000 subjects of European and Asian ancestries provides empirical evidence that the genetics of RA can provide important information for drug discovery, and sheds light on fundamental genes, pathways and cell types that contribute to RA pathogenesis.
Abstract: A major challenge in human genetics is to devise a systematic strategy to integrate disease-associated variants with diverse genomic and biological data sets to provide insight into disease pathogenesis and guide drug discovery for complex traits such as rheumatoid arthritis (RA) [1]. Here we performed a genome-wide association study meta-analysis in a total of >100,000 subjects of European and Asian ancestries (29,880 RA cases and 73,758 controls), by evaluating ~10 million single-nucleotide polymorphisms. We discovered 42 novel RA risk loci at a genome-wide level of significance, bringing the total to 101 (refs 2, 3, 4). We devised an in silico pipeline using established bioinformatics methods based on functional annotation [5], cis-acting expression quantitative trait loci [6] and pathway analyses [7, 8, 9], as well as novel methods based on genetic overlap with human primary immunodeficiency, haematological cancer somatic mutations and knockout mouse phenotypes, to identify 98 biological candidate genes at these 101 risk loci. We demonstrate that these genes are the targets of approved therapies for RA, and further suggest that drugs approved for other indications may be repurposed for the treatment of RA. Together, this comprehensive genetic study sheds light on fundamental genes, pathways and cell types that contribute to RA pathogenesis, and provides empirical evidence that the genetics of RA can provide important information for drug discovery.
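The headline figures in this abstract can be cross-checked with simple arithmetic. The sketch below uses only counts quoted in the abstract; the variable names are illustrative, and the number of previously known loci is derived here by subtraction rather than stated in the source.

```python
# Counts quoted in the rheumatoid arthritis abstract above.
ra_cases = 29_880
controls = 73_758
novel_loci = 42
total_loci = 101

# Total sample size, which should exceed the ">100,000 subjects" stated in the abstract.
total_subjects = ra_cases + controls

# Derived value (not stated in the abstract): loci known before this study.
previously_known_loci = total_loci - novel_loci

print(f"Total subjects: {total_subjects:,}")
print(f"Previously known loci (derived): {previously_known_loci}")
```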
••
TL;DR: This article identified 697 variants at genome-wide significance that together explained one-fifth of the heritability for adult height, and all common variants together captured 60% of heritability.
Abstract: Using genome-wide data from 253,288 individuals, we identified 697 variants at genome-wide significance that together explained one-fifth of the heritability for adult height. By testing different numbers of variants in independent studies, we show that the most strongly associated ∼2,000, ∼3,700 and ∼9,500 SNPs explained ∼21%, ∼24% and ∼29% of phenotypic variance. Furthermore, all common variants together captured 60% of heritability. The 697 variants clustered in 423 loci were enriched for genes, pathways and tissue types known to be involved in growth and together implicated genes and pathways not highlighted in earlier efforts, such as signaling by fibroblast growth factors, WNT/β-catenin and chondroitin sulfate-related genes. We identified several genes and pathways not previously connected with human skeletal growth, including mTOR, osteoglycin and binding of hyaluronic acid. Our results indicate a genetic architecture for human height that is characterized by a very large but finite number (thousands) of causal variants.
••
TL;DR: New perspective is gained on the roles played by adipocytes in a variety of homeostatic processes, on the mechanisms adipocytes use to communicate with other tissues, and on how these relationships are altered during metabolic disease and might be manipulated to restore metabolic health.
••
Max Planck Society; University of California, Berkeley; Broad Institute; Harvard University; University of Washington; National Institutes of Health; University of California, Santa Cruz; Ludwig Maximilian University of Munich; Emory University; Fondation Jean Dausset Centre d'Etude du Polymorphisme Humain; Allen Institute for Brain Science; Russian Academy of Sciences; Howard Hughes Medical Institute
TL;DR: It is shown that interbreeding, albeit of low magnitude, occurred among many hominin groups in the Late Pleistocene and a definitive list of substitutions that became fixed in modern humans after their separation from the ancestors of Neanderthals and Denisovans is established.
Abstract: We present a high-quality genome sequence of a Neanderthal woman from Siberia. We show that her parents were related at the level of half-siblings and that mating among close relatives was common among her recent ancestors. We also sequenced the genome of a Neanderthal from the Caucasus to low coverage. An analysis of the relationships and population history of available archaic genomes and 25 present-day human genomes shows that several gene flow events occurred among Neanderthals, Denisovans and early modern humans, possibly including gene flow into Denisovans from an unknown archaic group. Thus, interbreeding, albeit of low magnitude, occurred among many hominin groups in the Late Pleistocene. In addition, the high-quality Neanderthal genome allows us to establish a definitive list of substitutions that became fixed in modern humans after their separation from the ancestors of Neanderthals and Denisovans.
••
Valentina Escott-Price, Céline Bellenguez, Li-San Wang, Seung Hoan Choi, +191 more authors (67 institutions)
TL;DR: The additional genes identified in this study have an array of functions previously implicated in Alzheimer's disease, including aspects of energy metabolism, protein degradation and the immune system, and add further weight to these pathways as potential therapeutic targets in Alzheimer's disease.
Abstract: Background: Alzheimer's disease is a common debilitating dementia with known heritability, for which 20 late onset susceptibility loci have been identified, but more remain to be discovered. This s ...
••
TL;DR: In vivo as well as ex vivo genome editing using adeno-associated virus, lentivirus, or particle-mediated delivery of guide RNA in neurons, immune cells, and endothelial cells is demonstrated, suggesting that Cas9 mice empower a wide range of biological and disease modeling applications.
••
TL;DR: Model-based Analysis of Genome-wide CRISPR/Cas9 Knockout (MAGeCK) demonstrates better performance compared with existing methods, identifies both positively and negatively selected genes simultaneously, and reports robust results across different experimental conditions.
Abstract: We propose the Model-based Analysis of Genome-wide CRISPR/Cas9 Knockout (MAGeCK) method for prioritizing single-guide RNAs, genes and pathways in genome-scale CRISPR/Cas9 knockout screens. MAGeCK demonstrates better performance compared with existing methods, identifies both positively and negatively selected genes simultaneously, and reports robust results across different experimental conditions. Using public datasets, MAGeCK identified novel essential genes and pathways, including EGFR in vemurafenib-treated A375 cells harboring a BRAF mutation. MAGeCK also detected cell type-specific essential genes, including BCR and ABL1, in KBM7 cells bearing a BCR-ABL fusion, and IGF1R in HL-60 cells, which depends on the insulin signaling pathway for proliferation.
••
TL;DR: Sequence features that improve sgRNA activity are identified, including a further optimization of the protospacer-adjacent motif (PAM) of Streptococcus pyogenes Cas9, and an online tool for the design of highly active sgRNAs for any gene of interest is provided.
Abstract: Components of the prokaryotic clustered, regularly interspaced, short palindromic repeats (CRISPR) loci have recently been repurposed for use in mammalian cells. The CRISPR-associated (Cas)9 can be programmed with a single guide RNA (sgRNA) to generate site-specific DNA breaks, but there are few known rules governing on-target efficacy of this system. We created a pool of sgRNAs, tiling across all possible target sites of a panel of six endogenous mouse and three endogenous human genes and quantitatively assessed their ability to produce null alleles of their target gene by antibody staining and flow cytometry. We discovered sequence features that improved activity, including a further optimization of the protospacer-adjacent motif (PAM) of Streptococcus pyogenes Cas9. The results from 1,841 sgRNAs were used to construct a predictive model of sgRNA activity to improve sgRNA design for gene editing and genetic screens. We provide an online tool for the design of highly active sgRNAs for any gene of interest.
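As a rough illustration of what "tiling across all possible target sites" can mean in practice, the sketch below enumerates candidate S. pyogenes Cas9 sites on both strands of a DNA sequence. It assumes the canonical NGG PAM and a 20-nt protospacer; the function names (`reverse_complement`, `find_spcas9_sites`) are hypothetical, and the code does not reproduce the paper's predictive activity model or its online tool.

```python
# Minimal sketch: list candidate SpCas9 target sites (protospacer + NGG PAM).
# Assumptions (not from the abstract): canonical NGG PAM, 20-nt protospacer,
# input restricted to the letters A, C, G, T.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def _scan_strand(seq: str, protospacer_len: int):
    """Yield (start, protospacer, pam) for each NGG PAM preceded by a full protospacer."""
    for pam_start in range(protospacer_len, len(seq) - 2):
        if seq[pam_start + 1:pam_start + 3] == "GG":
            start = pam_start - protospacer_len
            yield start, seq[start:pam_start], seq[pam_start:pam_start + 3]


def find_spcas9_sites(seq: str, protospacer_len: int = 20):
    """Return a list of (strand, footprint_start, protospacer, pam) tuples.

    footprint_start is the 0-based forward-strand coordinate of the leftmost
    base covered by protospacer + PAM. Protospacer and PAM are reported 5'->3'
    on the strand that carries them.
    """
    seq = seq.upper()
    n = len(seq)
    sites = []
    for start, proto, pam in _scan_strand(seq, protospacer_len):
        sites.append(("+", start, proto, pam))
    for start, proto, pam in _scan_strand(reverse_complement(seq), protospacer_len):
        # Convert the reverse-complement coordinate back to the forward strand.
        footprint_start = n - (start + protospacer_len + 3)
        sites.append(("-", footprint_start, proto, pam))
    return sites


if __name__ == "__main__":
    for site in find_spcas9_sites("A" * 20 + "TGG"):
        print(site)
```

A real design workflow would then score each candidate, which is the step the paper's model addresses; this sketch stops at enumeration.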
••
TL;DR: The identification of glycolysis as a fundamental process in trained immunity further highlights a key regulatory role for metabolism in innate host defense and defines a potential therapeutic target in both infectious and inflammatory diseases.
Abstract: Epigenetic reprogramming of myeloid cells, also known as trained immunity, confers nonspecific protection from secondary infections. Using histone modification profiles of human monocytes trained with the Candida albicans cell wall constituent β-glucan, together with a genome-wide transcriptome, we identified the induced expression of genes involved in glucose metabolism. Trained monocytes display high glucose consumption, high lactate production, and a high ratio of nicotinamide adenine dinucleotide (NAD(+)) to its reduced form (NADH), reflecting a shift in metabolism with an increase in glycolysis dependent on the activation of mammalian target of rapamycin (mTOR) through a dectin-1-Akt-HIF-1α (hypoxia-inducible factor-1α) pathway. Inhibition of Akt, mTOR, or HIF-1α blocked monocyte induction of trained immunity, whereas the adenosine monophosphate-activated protein kinase activator metformin inhibited the innate immune response to fungal infection. Mice with a myeloid cell-specific defect in HIF-1α were unable to mount trained immunity against bacterial sepsis. Our results indicate that induction of aerobic glycolysis through an Akt-mTOR-HIF-1α pathway represents the metabolic basis of trained immunity.
••
University of California, San Diego; Pennsylvania State University; Stanford University; University of Washington; University of Michigan; New College of Florida; Florida State University; Cold Spring Harbor Laboratory; California Institute of Technology; University of Vienna; Emory University; Fred Hutchinson Cancer Research Center; Massachusetts Institute of Technology; Broad Institute; University of California, Irvine; University of California, Santa Cruz; University of California, San Francisco; Yale University; University of Florida; Johns Hopkins University; University College London; University of Oxford; Cornell University; Memorial Sloan Kettering Cancer Center; Harvard University; University of Iowa; Yeshiva University; University of Pennsylvania; Washington University in St. Louis; National Institutes of Health; University of North Carolina at Chapel Hill
TL;DR: The Mouse ENCODE Consortium has mapped transcription, DNase I hypersensitivity, transcription factor binding, chromatin modifications and replication domains throughout the mouse genome in diverse cell and tissue types.
Abstract: The laboratory mouse shares the majority of its protein-coding genes with humans, making it the premier model organism in biomedical research, yet the two mammals differ in significant ways. To gain greater insights into both shared and species-specific transcriptional and cellular regulatory programs in the mouse, the Mouse ENCODE Consortium has mapped transcription, DNase I hypersensitivity, transcription factor binding, chromatin modifications and replication domains throughout the mouse genome in diverse cell and tissue types. By comparing with the human genome, we not only confirm substantial conservation in the newly annotated potential functional sequences, but also find a large degree of divergence of sequences involved in transcriptional regulation, chromatin state and higher order chromatin organization. Our results illuminate the wide range of evolutionary forces acting on genes and their regulatory regions, and provide a general resource for research into mammalian biology and mechanisms of human diseases.
••
TL;DR: Analysis of the exome sequences of 2,536 schizophrenia cases and 2,543 controls demonstrates a polygenic burden primarily arising from rare (less than 1 in 10,000), disruptive mutations distributed across many genes.
Abstract: Schizophrenia is a common disease with a complex aetiology, probably involving multiple and heterogeneous genetic factors. Here, by analysing the exome sequences of 2,536 schizophrenia cases and 2,543 controls, we demonstrate a polygenic burden primarily arising from rare (less than 1 in 10,000), disruptive mutations distributed across many genes. Particularly enriched gene sets include the voltage-gated calcium ion channel and the signalling complex formed by the activity-regulated cytoskeleton-associated scaffold protein (ARC) of the postsynaptic density, sets previously implicated by genome-wide association and copy-number variation studies. Similar to reports in autism, targets of the fragile X mental retardation protein (FMRP, product of FMR1) are enriched for case mutations. No individual gene-based test achieves significance after correction for multiple testing and we do not detect any alleles of moderately low frequency (approximately 0.5 to 1 per cent) and moderately large effect. Taken together, these data suggest that population-based exome sequencing can discover risk alleles and complements established gene-mapping paradigms in neuropsychiatric disease.
••
University of North Carolina at Chapel Hill; Buck Institute for Research on Aging; University of California, San Francisco; Broad Institute; Pompeu Fabra University; University of California, Santa Cruz; Brown University; Washington University in St. Louis; University of Texas MD Anderson Cancer Center; University of Southern California; Sage Bionetworks; BC Cancer Agency; Catalan Institution for Research and Advanced Studies; National Institutes of Health
TL;DR: An integrative analysis using five genome-wide platforms and one proteomic platform on 3,527 specimens from 12 cancer types yields a unified classification into 11 major subtypes, in which several distinct cancer types converge into common subtypes.
••
TL;DR: Using quantitative proteomics, lenalidomide is found to cause selective ubiquitination and degradation of two lymphoid transcription factors, IKZF1 and IKZF3, by the CRBN-CRL4 ubiquitin ligase; both are essential transcription factors in multiple myeloma.
Abstract: Lenalidomide is a drug with clinical efficacy in multiple myeloma and other B cell neoplasms, but its mechanism of action is unknown. Using quantitative proteomics, we found that lenalidomide causes selective ubiquitination and degradation of two lymphoid transcription factors, IKZF1 and IKZF3, by the CRBN-CRL4 ubiquitin ligase. IKZF1 and IKZF3 are essential transcription factors in multiple myeloma. A single amino acid substitution of IKZF3 conferred resistance to lenalidomide-induced degradation and rescued lenalidomide-induced inhibition of cell growth. Similarly, we found that lenalidomide-induced interleukin-2 production in T cells is due to depletion of IKZF1 and IKZF3. These findings reveal a previously unknown mechanism of action for a therapeutic agent: alteration of the activity of an E3 ubiquitin ligase, leading to selective degradation of specific targets.