Showing papers by "Wellcome Trust Centre for Human Genetics" published in 2017
••
TL;DR: The remarkable range of discoveries that GWASs have facilitated in population and complex-trait genetics, the biology of diseases, and translation toward new therapeutics is reviewed.
Abstract: Application of the experimental design of genome-wide association studies (GWASs) is now 10 years old (young), and here we review the remarkable range of discoveries it has facilitated in population and complex-trait genetics, the biology of diseases, and translation toward new therapeutics. We predict the likely discoveries in the next 10 years, when GWASs will be based on millions of samples with array data imputed to a large fully sequenced reference panel and on hundreds of thousands of samples with whole-genome sequencing data.
2,669 citations
••
TL;DR: The R/Bioconductor package scater is developed to facilitate rigorous pre‐processing, quality control, normalization and visualization of scRNA‐seq data and provides a convenient, flexible workflow to process raw sequencing reads into a high‐quality expression dataset ready for downstream analysis.
Abstract: Single-cell RNA sequencing (scRNA-seq) is increasingly used to study gene expression at the level of individual cells. However, preparing raw sequence data for further analysis is not a straightforward process. Biases, artifacts and other sources of unwanted variation are present in the data, requiring substantial time and effort to be spent on pre-processing, quality control (QC) and normalization. We have developed the R/Bioconductor package scater to facilitate rigorous pre-processing, quality control, normalization and visualization of scRNA-seq data. The package provides a convenient, flexible workflow to process raw sequencing reads into a high-quality expression dataset ready for downstream analysis. scater provides a rich suite of plotting tools for single-cell data and a flexible data structure that is compatible with existing tools and can be used as infrastructure for future software development. The open-source code, along with installation instructions, vignettes and case studies, is available through Bioconductor at http://bioconductor.org/packages/scater. Contact: davis@ebi.ac.uk. Supplementary data are available at Bioinformatics online.
1,093 citations
••
Kyriaki Michailidou1, Kyriaki Michailidou2, Sara Lindström3, Sara Lindström4 +393 more•Institutions (127)
TL;DR: A genome-wide association study of breast cancer in 122,977 cases and 105,974 controls of European ancestry and 14,068 cases and 13,104 controls of East Asian ancestry finds that heritability of breast cancer due to all single-nucleotide polymorphisms in regulatory features was 2–5-fold enriched relative to the genome-wide average.
Abstract: Breast cancer risk is influenced by rare coding variants in susceptibility genes, such as BRCA1, and many common, mostly non-coding variants. However, much of the genetic contribution to breast cancer risk remains unknown. Here we report the results of a genome-wide association study of breast cancer in 122,977 cases and 105,974 controls of European ancestry and 14,068 cases and 13,104 controls of East Asian ancestry. We identified 65 new loci that are associated with overall breast cancer risk at P < 5 × 10⁻⁸. The majority of credible risk single-nucleotide polymorphisms in these loci fall in distal regulatory elements, and by integrating in silico data to predict target genes in breast cells at each locus, we demonstrate a strong overlap between candidate target genes and somatic driver genes in breast tumours. We also find that heritability of breast cancer due to all single-nucleotide polymorphisms in regulatory features was 2–5-fold enriched relative to the genome-wide average, with strong enrichment for particular transcription factor binding sites. These results provide further insight into genetic susceptibility to breast cancer and will improve the use of genetic risk scores for individualized screening and prevention.
1,014 citations
••
Wellcome Trust Sanger Institute1, Newcastle University2, Brigham and Women's Hospital3, Broad Institute4, University of Exeter5, Wellcome Trust Centre for Human Genetics6, Canterbury Christ Church University7, University of Edinburgh8, Torbay Hospital9, Royal Hospital for Sick Children10, Ninewells Hospital11, Guy's and St Thomas' NHS Foundation Trust12, University of Oxford13, John Radcliffe Hospital14, Norfolk and Norwich University Hospital15, King's College London16, University of the Witwatersrand17, University of Manchester18, Manchester Academic Health Science Centre19, University of Nottingham20
TL;DR: This work identified 25 new susceptibility loci, 3 of which contain integrin genes that encode proteins in pathways that have been identified as important therapeutic targets in inflammatory bowel disease, and found that the associated variants are correlated with expression changes in response to immune stimulus at two of these genes.
Abstract: Genetic association studies have identified 215 risk loci for inflammatory bowel disease, thereby uncovering fundamental aspects of its molecular biology. We performed a genome-wide association study of 25,305 individuals and conducted a meta-analysis with published summary statistics, yielding a total sample size of 59,957 subjects. We identified 25 new susceptibility loci, 3 of which contain integrin genes that encode proteins in pathways that have been identified as important therapeutic targets in inflammatory bowel disease. The associated variants are correlated with expression changes in response to immune stimulus at two of these genes (ITGA4 and ITGB8) and at previously implicated loci (ITGAL and ICAM1). In all four cases, the expression-increasing allele also increases disease risk. We also identified likely causal missense variants in a gene implicated in primary immune deficiency, PLCG2, and a negative regulator of inflammation, SLAMF8. Our results demonstrate that new associations at common variants continue to identify genes relevant to therapeutic target identification and prioritization.
813 citations
••
Christian R. Marshall, Daniel P. Howrigan1, Daniel P. Howrigan2, Daniele Merico +326 more•Institutions (98)
TL;DR: In this article, a centralized analysis pipeline was applied to a SCZ cohort of 21,094 cases and 20,227 controls, and a global enrichment of copy number variant (CNV) burden was observed in cases (odds ratio (OR) = 1.11, P = 5.7 × 10⁻¹⁵), which persisted after excluding loci implicated in previous studies.
Abstract: Copy number variants (CNVs) have been strongly implicated in the genetic etiology of schizophrenia (SCZ). However, genome-wide investigation of the contribution of CNV to risk has been hampered by limited sample sizes. We sought to address this obstacle by applying a centralized analysis pipeline to a SCZ cohort of 21,094 cases and 20,227 controls. A global enrichment of CNV burden was observed in cases (odds ratio (OR) = 1.11, P = 5.7 × 10⁻¹⁵), which persisted after excluding loci implicated in previous studies (OR = 1.07, P = 1.7 × 10⁻⁶). CNV burden was enriched for genes associated with synaptic function (OR = 1.68, P = 2.8 × 10⁻¹¹) and neurobehavioral phenotypes in mouse (OR = 1.18, P = 7.3 × 10⁻⁵). Genome-wide significant evidence was obtained for eight loci, including 1q21.1, 2p16.3 (NRXN1), 3q29, 7q11.2, 15q13.3, distal 16p11.2, proximal 16p11.2 and 22q11.2. Suggestive support was found for eight additional candidate susceptibility and protective loci, which consisted predominantly of CNVs mediated by nonallelic homologous recombination.
774 citations
••
Wellcome Trust Centre for Human Genetics1, Imperial College London2, University of Oulu3, Agency for Science, Technology and Research4, National Institutes of Health5, King's College London6, Ealing Hospital7, National University of Singapore8, University of Turin9, University Medical Center Groningen10, University of Tartu11, University of Bristol12, University College London13, University of Eastern Finland14, Fondazione IRCCS Ca' Granda Ospedale Maggiore Policlinico15, University of Kiel16, Leiden University Medical Center17, Dresden University of Technology18, University of Düsseldorf19, University of Surrey20, Erasmus University Rotterdam21, Max Healthcare22, Technische Universität München23, University of Naples Federico II24, Wellcome Trust Sanger Institute25, Science for Life Laboratory26, University of Ulm27, Ludwig Maximilian University of Munich28, University of Kelaniya29, Institute of Cancer Research30, Queen Mary University of London31, King Abdulaziz University32, Massachusetts Institute of Technology33, Health Protection Agency34, Churchill Hospital35, University of Oxford36, Imperial College Healthcare37
TL;DR: In this article, the authors used epigenome-wide association to show that body mass index (BMI), a key measure of adiposity, is associated with widespread changes in DNA methylation.
Abstract: Approximately 1.5 billion people worldwide are overweight or affected by obesity, and are at risk of developing type 2 diabetes, cardiovascular disease and related metabolic and inflammatory disturbances [1,2]. Although the mechanisms linking adiposity to associated clinical conditions are poorly understood, recent studies suggest that adiposity may influence DNA methylation [3,4,5,6], a key regulator of gene expression and molecular phenotype [7]. Here we use epigenome-wide association to show that body mass index (BMI; a key measure of adiposity) is associated with widespread changes in DNA methylation (187 genetic loci with P < 1 × 10⁻⁷, range P = 9.2 × 10⁻⁸ to 6.0 × 10⁻⁴⁶; n = 10,261 samples). Genetic association analyses demonstrate that the alterations in DNA methylation are predominantly the consequence of adiposity, rather than the cause. We find that methylation loci are enriched for functional genomic features in multiple tissues (P < 0.05), and show that sentinel methylation markers identify gene expression signatures at 38 loci (P < 9.0 × 10⁻⁶, range P = 5.5 × 10⁻⁶ to 6.1 × 10⁻³⁵, n = 1,785 samples). The methylation loci identify genes involved in lipid and lipoprotein metabolism, substrate transport and inflammatory pathways. Finally, we show that the disturbances in DNA methylation predict future development of type 2 diabetes (relative risk per 1 standard deviation increase in methylation risk score: 2.3 (2.07–2.56); P = 1.1 × 10⁻⁵⁴). Our results provide new insights into the biologic pathways influenced by adiposity, and may enable development of new strategies for prediction and prevention of type 2 diabetes and other adverse clinical consequences of obesity.
667 citations
••
TL;DR: This article conducted a meta-analysis of genome-wide association data from 26,676 T2D case and 132,532 control subjects of European ancestry after imputation using the 1000 Genomes multiethnic reference panel.
Abstract: To characterize type 2 diabetes (T2D)-associated variation across the allele frequency spectrum, we conducted a meta-analysis of genome-wide association data from 26,676 T2D case and 132,532 control subjects of European ancestry after imputation using the 1000 Genomes multiethnic reference panel. Promising association signals were followed up in additional data sets (of 14,545 or 7,397 T2D case and 38,994 or 71,604 control subjects). We identified 13 novel T2D-associated loci (P < 5 × 10⁻⁸), including variants near the GLP2R, GIP, and HLA-DQA1 genes. Our analysis brought the total number of independent T2D associations to 128 distinct signals at 113 loci. Despite substantially increased sample size and more complete coverage of low-frequency variation, all novel associations were driven by common single nucleotide variants. Credible sets of potentially causal variants were generally larger than those based on imputation with earlier reference panels, consistent with resolution of causal signals to common risk haplotypes. Stratification of T2D-associated loci based on T2D-related quantitative trait associations revealed tissue-specific enrichment of regulatory annotations in pancreatic islet enhancers for loci influencing insulin secretion and in adipocytes, monocytes, and hepatocytes for insulin action-associated loci. These findings highlight the predominant role played by common variants of modest effect and the diversity of biological mechanisms influencing T2D pathophysiology.
601 citations
••
TL;DR: Improved analytical approaches that evaluate which genes and variant classes are interpretable are outlined and it is proposed that these will increase the clinical utility of testing across a range of Mendelian diseases.
532 citations
••
University of Leicester1, National Institute for Health Research2, Wellcome Trust Centre for Human Genetics3, University of Oxford4, University of Cambridge5, Queen Mary University of London6, Technische Universität München7, Stanford University8, Icahn School of Medicine at Mount Sinai9, Imperial College London10, London North West Healthcare NHS Trust11, Imperial College Healthcare12, University of Dundee13, University of Leeds14, Massachusetts Institute of Technology15, Tartu University Hospital16, University of Ioannina17, Umeå University18, Harvard University19, Lund University20, Peking Union Medical College21, University College London22, University of Tampere23, Vanderbilt University24, Synlab Group25, Heidelberg University26, Medical University of Graz27, University of Ottawa28, University of Tartu29, Lebanese American University30, King Abdulaziz University31, Central Manchester University Hospitals NHS Foundation Trust32, University of Manchester33, National Institutes of Health34, St Bartholomew's Hospital35, Manchester Academic Health Science Centre36, Wellcome Trust Sanger Institute37, University of Lübeck38, Harokopio University39, Karolinska University Hospital40
TL;DR: This approach identified 13 new loci at genome-wide significance, 12 of which were on the previous list of loci meeting the 5% FDR threshold, thus providing strong support that the remaining loci identified by FDR represent genuine signals.
Abstract: Genome-wide association studies (GWAS) in coronary artery disease (CAD) had identified 66 loci at 'genome-wide significance' (P < 5 × 10⁻⁸) at the time of this analysis, but a much larger number of putative loci at a false discovery rate (FDR) of 5% (refs. 1,2,3,4). Here we leverage an interim release of UK Biobank (UKBB) data to evaluate the validity of the FDR approach. We tested a CAD phenotype inclusive of angina (SOFT; ncases = 10,801) as well as a stricter definition without angina (HARD; ncases = 6,482) and selected cases with the former phenotype to conduct a meta-analysis using the two most recent CAD GWAS. This approach identified 13 new loci at genome-wide significance, 12 of which were on our previous list of loci meeting the 5% FDR threshold, thus providing strong support that the remaining loci identified by FDR represent genuine signals. The 304 independent variants associated at 5% FDR in this study explain 21.2% of CAD heritability and identify 243 loci that implicate pathways in blood vessel morphogenesis as well as lipid metabolism, nitric oxide signaling and inflammation.
529 citations
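The contrast drawn in the abstract above, between a fixed genome-wide significance threshold and a 5% false discovery rate, can be illustrated with a minimal sketch. The abstract does not say which FDR procedure was used; the Benjamini-Hochberg step-up rule below is an assumption chosen for illustration, and the p-values are made up rather than taken from the study.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Flag each p-value as a discovery under the Benjamini-Hochberg rule.

    Illustrative only: the source abstract does not name its FDR method.
    """
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value is at most (k / m) * fdr.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_rank = rank
    # Everything ranked at or below max_rank is declared a discovery.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


GENOME_WIDE = 5e-8  # threshold quoted in the abstract

# Hypothetical p-values, not data from the paper.
example_p = [1e-9, 3e-6, 0.004, 0.03, 0.5]
fdr_hits = benjamini_hochberg(example_p)
gws_hits = [p < GENOME_WIDE for p in example_p]
```

With these toy values the fixed threshold keeps one signal while the FDR rule keeps four, which mirrors the "much larger number of putative loci" that the abstract describes at 5% FDR.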
••
TL;DR: It is shown that, relative to healthy controls, inflamed intestinal tissues from patients with IBD express high amounts of the cytokine oncostatin M (OSM) and its receptor (OSMR), which correlate closely with histopathological disease severity.
Abstract: Inflammatory bowel diseases (IBD), including Crohn's disease (CD) and ulcerative colitis (UC), are complex chronic inflammatory conditions of the gastrointestinal tract that are driven by perturbed cytokine pathways. Anti-tumor necrosis factor-α (TNF) antibodies are mainstay therapies for IBD. However, up to 40% of patients are nonresponsive to anti-TNF agents, which makes the identification of alternative therapeutic targets a priority. Here we show that, relative to healthy controls, inflamed intestinal tissues from patients with IBD express high amounts of the cytokine oncostatin M (OSM) and its receptor (OSMR), which correlate closely with histopathological disease severity. The OSMR is expressed in nonhematopoietic, nonepithelial intestinal stromal cells, which respond to OSM by producing various proinflammatory molecules, including interleukin (IL)-6, the leukocyte adhesion factor ICAM1, and chemokines that attract neutrophils, monocytes, and T cells. In an animal model of anti-TNF-resistant intestinal inflammation, genetic deletion or pharmacological blockade of OSM significantly attenuates colitis. Furthermore, according to an analysis of more than 200 patients with IBD, including two cohorts from phase 3 clinical trials of infliximab and golimumab, high pretreatment expression of OSM is strongly associated with failure of anti-TNF therapy. OSM is thus a potential biomarker and therapeutic target for IBD, and has particular relevance for anti-TNF-resistant patients.
486 citations
••
TL;DR: It is found that beta-thalassemia trait carriers displayed lower TC and were protected from coronary artery disease (CAD), and only some mechanisms of lowering LDL-C appeared to increase risk for type 2 diabetes (T2D); and TG-lowering alleles involved in hepatic production of TG-rich lipoproteins tracked with higher liver fat, higher risk for T2D, and lower risk for CAD.
Abstract: We screened variants on an exome-focused genotyping array in >300,000 participants (replication in >280,000 participants) and identified 444 independent variants in 250 loci significantly associated with total cholesterol (TC), high-density-lipoprotein cholesterol (HDL-C), low-density-lipoprotein cholesterol (LDL-C), and/or triglycerides (TG). At two loci (JAK2 and A1CF), experimental analysis in mice showed lipid changes consistent with the human data. We also found that: (i) beta-thalassemia trait carriers displayed lower TC and were protected from coronary artery disease (CAD); (ii) excluding the CETP locus, there was not a predictable relationship between plasma HDL-C and risk for age-related macular degeneration; (iii) only some mechanisms of lowering LDL-C appeared to increase risk for type 2 diabetes (T2D); and (iv) TG-lowering alleles involved in hepatic production of TG-rich lipoproteins (TM6SF2 and PNPLA3) tracked with higher liver fat, higher risk for T2D, and lower risk for CAD, whereas TG-lowering alleles involved in peripheral lipolysis (LPL and ANGPTL4) had no effect on liver fat but decreased risks for both T2D and CAD.
••
Queen Mary University of London1, Imperial College London2, University of Ioannina3, National Institute for Health Research4, University of Cambridge5, National Institutes of Health6, University of Liverpool7, Washington University in St. Louis8, University College London9, University of Bristol10, Agency for Science, Technology and Research11, Harvard University12, University of Groningen13, University of Edinburgh14, Bayer HealthCare Pharmaceuticals15, King's College London16, Tulane University17, National University of Singapore18, University of Tartu19, Wellcome Trust Centre for Human Genetics20, University Medical Center Groningen21, University of Glasgow22, Royal College of Surgeons in Ireland23, University of Dundee24, University College Dublin25, Broad Institute26, University of Pennsylvania27, Brigham and Women's Hospital28, Johns Hopkins University School of Medicine29, University of Oxford30, Imperial College Healthcare31, Ealing Hospital32, Manchester Academic Health Science Centre33, University of Manchester34, Glenfield Hospital35, University of Leicester36, Geneva College37
TL;DR: In this paper, the authors report genetic association of blood pressure (systolic, diastolic, pulse pressure) among UK Biobank participants of European ancestry with independent replication in other cohorts, and robust validation of 107 independent loci.
Abstract: Elevated blood pressure is the leading heritable risk factor for cardiovascular disease worldwide. We report genetic association of blood pressure (systolic, diastolic, pulse pressure) among UK Biobank participants of European ancestry with independent replication in other cohorts, and robust validation of 107 independent loci. We also identify new independent variants at 11 previously reported blood pressure loci. In combination with results from a range of in silico functional analyses and wet bench experiments, our findings highlight new biological pathways for blood pressure regulation enriched for genes expressed in vascular tissues and identify potential therapeutic targets for hypertension. Results from genetic risk score models raise the possibility of a precision medicine approach through early lifestyle intervention to offset the impact of blood pressure-raising genetic variants on future cardiovascular disease risk.
••
Harvard University1, Broad Institute2, University of Liège3, University of Oxford4, Wellcome Trust Sanger Institute5, Montreal Heart Institute6, University of Southern Denmark7, Katholieke Universiteit Leuven8, John Radcliffe Hospital9, Wellcome Trust Centre for Human Genetics10, Ikerbasque11, Karolinska Institutet12, Illumina13, University of Kiel14, Örebro University15, Cedars-Sinai Medical Center16, Lancaster University17, University of Western Australia18, Western General Hospital19, Norwegian University of Life Sciences20, Wellcome Trust21, University of Groningen22, University Medical Center Groningen23, University of Pittsburgh24, King's College London25, University of the Witwatersrand26, Université de Montréal27, Yale University28
TL;DR: The results of this study suggest that high-resolution fine-mapping in large samples can convert many discoveries from genome-wide association studies into statistically convincing causal variants, providing a powerful substrate for experimental elucidation of disease mechanisms.
Abstract: Inflammatory bowel diseases are chronic gastrointestinal inflammatory disorders that affect millions of people worldwide. Genome-wide association studies have identified 200 inflammatory bowel disease-associated loci, but few have been conclusively resolved to specific functional variants. Here we report fine-mapping of 94 inflammatory bowel disease loci using high-density genotyping in 67,852 individuals. We pinpoint 18 associations to a single causal variant with greater than 95% certainty, and an additional 27 associations to a single variant with greater than 50% certainty. These 45 variants are significantly enriched for protein-coding changes (n = 13), direct disruption of transcription-factor binding sites (n = 3), and tissue-specific epigenetic marks (n = 10), with the last category showing enrichment in specific immune cells among associations stronger in Crohn's disease and in gut mucosa among associations stronger in ulcerative colitis. The results of this study suggest that high-resolution fine-mapping in large samples can convert many discoveries from genome-wide association studies into statistically convincing causal variants, providing a powerful substrate for experimental elucidation of disease mechanisms.
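The phrase "a single causal variant with greater than 95% certainty" refers to posterior probabilities from fine-mapping. The sketch below shows one common simplification, assuming exactly one causal variant per locus and equal prior probability for every variant. It is not the method of the paper, which the abstract does not describe in detail, and the log Bayes factors are hypothetical.

```python
import math


def posterior_probabilities(log_bayes_factors):
    """Posterior probability per variant, assuming one causal variant
    per locus and equal priors (a simplifying assumption)."""
    peak = max(log_bayes_factors)
    # Subtract the maximum before exponentiating to avoid overflow.
    weights = [math.exp(x - peak) for x in log_bayes_factors]
    total = sum(weights)
    return [w / total for w in weights]


def credible_set(posteriors, coverage=0.95):
    """Smallest set of variant indices whose posteriors reach the coverage."""
    order = sorted(range(len(posteriors)),
                   key=lambda i: posteriors[i], reverse=True)
    chosen, cumulative = [], 0.0
    for i in order:
        chosen.append(i)
        cumulative += posteriors[i]
        if cumulative >= coverage:
            break
    return chosen


# Hypothetical log Bayes factors for four variants at one locus.
log_bf = [12.0, 6.0, 5.5, 1.0]
post = posterior_probabilities(log_bf)
cs = credible_set(post)
```

In this toy locus the first variant alone carries more than 95% of the posterior mass, so the credible set has a single member, which is the situation the abstract reports for 18 associations.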
••
TL;DR: The findings support the notion that limited storage capacity of peripheral adipose tissue is an important etiological component in insulin-resistant cardiometabolic disease and highlight genes and mechanisms underpinning this link.
Abstract: Insulin resistance is a key mediator of obesity-related cardiometabolic disease, yet the mechanisms underlying this link remain obscure. Using an integrative genomic approach, we identify 53 genomic regions associated with insulin resistance phenotypes (higher fasting insulin levels adjusted for BMI, lower HDL cholesterol levels and higher triglyceride levels) and provide evidence that their link with higher cardiometabolic risk is underpinned by an association with lower adipose mass in peripheral compartments. Using these 53 loci, we show a polygenic contribution to familial partial lipodystrophy type 1, a severe form of insulin resistance, and highlight shared molecular mechanisms in common/mild and rare/severe insulin resistance. Population-level genetic analyses combined with experiments in cellular models implicate CCDC92, DNAH10 and L3MBTL3 as previously unrecognized molecules influencing adipocyte differentiation. Our findings support the notion that limited storage capacity of peripheral adipose tissue is an important etiological component in insulin-resistant cardiometabolic disease and highlight genes and mechanisms underpinning this link.
••
TL;DR: It is likely that longer telomeres increase risk for several cancers but reduce risk for some non-neoplastic diseases, including cardiovascular diseases.
Abstract: IMPORTANCE: The causal direction and magnitude of the association between telomere length and incidence of cancer and non-neoplastic diseases is uncertain owing to the susceptibility of observational studies to confounding and reverse causation. OBJECTIVE: To conduct a Mendelian randomization study, using germline genetic variants as instrumental variables, to appraise the causal relevance of telomere length for risk of cancer and non-neoplastic diseases. DATA SOURCES: Genomewide association studies (GWAS) published up to January 15, 2015. STUDY SELECTION: GWAS of noncommunicable diseases that assayed germline genetic variation and did not select cohort or control participants on the basis of preexisting diseases. Of 163 GWAS of noncommunicable diseases identified, summary data from 103 were available. DATA EXTRACTION AND SYNTHESIS: Summary association statistics for single nucleotide polymorphisms (SNPs) that are strongly associated with telomere length in the general population. MAIN OUTCOMES AND MEASURES: Odds ratios (ORs) and 95% confidence intervals (CIs) for disease per standard deviation (SD) higher telomere length due to germline genetic variation. RESULTS: Summary data were available for 35 cancers and 48 non-neoplastic diseases, corresponding to 420 081 cases (median cases, 2526 per disease) and 1 093 105 controls (median, 6789 per disease). Increased telomere length due to germline genetic variation was generally associated with increased risk for site-specific cancers. The strongest associations (ORs [95% CIs] per 1-SD change in genetically increased telomere length) were observed for glioma, 5.27 (3.15-8.81); serous low-malignant-potential ovarian cancer, 4.35 (2.39-7.94); lung adenocarcinoma, 3.19 (2.40-4.22); neuroblastoma, 2.98 (1.92-4.62); bladder cancer, 2.19 (1.32-3.66); melanoma, 1.87 (1.55-2.26); testicular cancer, 1.76 (1.02-3.04); kidney cancer, 1.55 (1.08-2.23); and endometrial cancer, 1.31 (1.07-1.61). 
Associations were stronger for rarer cancers and at tissue sites with lower rates of stem cell division. There was generally little evidence of association between genetically increased telomere length and risk of psychiatric, autoimmune, inflammatory, diabetic, and other non-neoplastic diseases, except for coronary heart disease (OR, 0.78 [95% CI, 0.67-0.90]), abdominal aortic aneurysm (OR, 0.63 [95% CI, 0.49-0.81]), celiac disease (OR, 0.42 [95% CI, 0.28-0.61]) and interstitial lung disease (OR, 0.09 [95% CI, 0.05-0.15]). CONCLUSIONS AND RELEVANCE: It is likely that longer telomeres increase risk for several cancers but reduce risk for some non-neoplastic diseases, including cardiovascular diseases.
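The odds ratios above are reported with 95% confidence intervals, which are symmetric on the log scale. As a hedged worked example, the sketch below recovers the standard error of the log odds ratio from the reported glioma interval (3.15 to 8.81) and rebuilds the interval around the reported point estimate of 5.27. The helper names are illustrative, and the normal approximation on the log scale is an assumption about how the interval was constructed.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_or_standard_error(lower, upper):
    """Standard error of log(OR) implied by a 95% confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2 * Z_95)


def confidence_interval(odds_ratio, standard_error):
    """95% interval for an odds ratio, built on the log scale."""
    centre = math.log(odds_ratio)
    half_width = Z_95 * standard_error
    return math.exp(centre - half_width), math.exp(centre + half_width)


# Glioma figures quoted in the abstract: OR 5.27 (3.15-8.81).
se_glioma = log_or_standard_error(3.15, 8.81)
low, high = confidence_interval(5.27, se_glioma)
```

The rebuilt interval agrees with the published bounds to within rounding, which is a quick consistency check that can be applied to any of the estimates listed.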
••
TL;DR: Both PacBio and ONT sequencing are suitable for full-length single-molecule transcriptome analysis, and, as this first use of ONT reads in a Hybrid-Seq analysis has shown, both can benefit from a combined Illumina strategy.
Abstract: Background: Given the demonstrated utility of Third Generation Sequencing [Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT)] long reads in many studies, a comprehensive analysis and comparison of their data quality and applications is in high demand. Methods: Based on the transcriptome sequencing data from human embryonic stem cells, we analyzed multiple data features of PacBio and ONT, including error pattern, length, mappability and technical improvements over previous platforms. We also evaluated their application to transcriptome analyses, such as isoform identification and quantification and characterization of transcriptome complexity, by comparing the performance of size-selected PacBio, non-size-selected ONT and their corresponding Hybrid-Seq strategies (PacBio+Illumina and ONT+Illumina). Results: PacBio shows overall better data quality, while ONT provides a higher yield. As with data quality, PacBio performs marginally better than ONT in most aspects for both long reads only and Hybrid-Seq strategies in transcriptome analysis. In addition, Hybrid-Seq shows superior performance over long reads only in most transcriptome analyses. Conclusions: Both PacBio and ONT sequencing are suitable for full-length single-molecule transcriptome analysis. As this first use of ONT reads in a Hybrid-Seq analysis has shown, both PacBio and ONT can benefit from a combined Illumina strategy. The tools and analytical methods developed here provide a resource for future applications and evaluations of these rapidly-changing technologies.
••
TL;DR: Analysis of 334,652 SNVs that were consistent between informatics pipelines yet inconsistent with haplotype transmission revealed that the majority of these variants are de novo and cell-line mutations or reside within previously unidentified duplications and deletions; the reference materials from the study are a resource for objective assessment of the accuracy of variant calls throughout genomes.
Abstract: Improvement of variant calling in next-generation sequence data requires a comprehensive, genome-wide catalog of high-confidence variants called in a set of genomes for use as a benchmark. We generated deep, whole-genome sequence data of 17 individuals in a three-generation pedigree and called variants in each genome using a range of currently available algorithms. We used haplotype transmission information to create a phased "Platinum" variant catalog of 4.7 million single-nucleotide variants (SNVs) plus 0.7 million small (1-50 bp) insertions and deletions (indels) that are consistent with the pattern of inheritance in the parents and 11 children of this pedigree. Platinum genotypes are highly concordant with the current catalog of the National Institute of Standards and Technology for both SNVs (>99.99%) and indels (99.92%) and add a validated truth catalog that has 26% more SNVs and 45% more indels. Analysis of 334,652 SNVs that were consistent between informatics pipelines yet inconsistent with haplotype transmission ("nonplatinum") revealed that the majority of these variants are de novo and cell-line mutations or reside within previously unidentified duplications and deletions. The reference materials from this study are a resource for objective assessment of the accuracy of variant calls throughout genomes.
••
TL;DR: This multiancestry study recommends investigation of the possible benefits of screening for the G6PD genotype along with using HbA1c to diagnose T2D in populations of African ancestry or groups where G6PD deficiency is common, and investigates the effect of genetic risk scores composed of erythrocytic or glycemic variants on incident diabetes prediction and on prevalent diabetes screening performance.
Abstract: Background: Glycated hemoglobin (HbA1c) is used to diagnose type 2 diabetes (T2D) and assess glycemic control in patients with diabetes. Previous genome-wide association studies (GWAS) have identif ...
••
University College London1, Imperial College London2, Clinical Trial Service Unit3, University of Oxford4, St Bartholomew's Hospital5, Wellcome Trust Centre for Human Genetics6, University of Glasgow7, Universidade Federal de Pelotas8, UCL Institute of Child Health9, University of South Australia10, European Bioinformatics Institute11, Charité12, University of Lübeck13, Max Planck Society14, Innsbruck Medical University15, Bradford Royal Infirmary16, University of Bristol17, St George's, University of London18, University of Edinburgh19, University of Lausanne20, University of Nicosia21, Cyprus University of Technology22, Utrecht University23, University of Turin24, Cancer Epidemiology Unit25, University of Cambridge26, Russian Academy27, Jagiellonian University28, Lithuanian University of Health Sciences29, University of Copenhagen30, Marshfield Clinic31, Children's Hospital of Philadelphia32, Group Health Research Institute33, Mayo Clinic34, Vanderbilt University35, George Washington University36, University of Newcastle37, Population Health Research Institute38, University Medical Center Groningen39, Leiden University Medical Center40, Uppsala University41, Science for Life Laboratory42, Stanford University43, Erasmus University Medical Center44, Greifswald University Hospital45, University of Regensburg46, University of London47, Robertson Centre for Biostatistics48, University of Lille49, French Institute of Health and Medical Research50, University of Nantes51, University of Essex52, Brigham and Women's Hospital53, Fred Hutchinson Cancer Research Center54, University of Colorado Denver55, Pennsylvania State University56, Geisinger Health System57, University of Pennsylvania58
TL;DR: PCSK9 variants associated with lower LDL cholesterol were also associated with higher circulating fasting glucose concentration, bodyweight, and waist-to-hip ratio, and with an increased risk of type 2 diabetes.
••
TL;DR: The exo-E415G SNP and plasmepsin 2-3 amplification are markers of piperaquine resistance and dihydroartemisinin-piperaquine failures in Cambodia, and can help monitor the spread of these phenotypes into other countries of the Greater Mekong subregion, and elucidate the mechanism of piperaquine resistance.
Abstract: Summary Background As the prevalence of artemisinin-resistant Plasmodium falciparum malaria increases in the Greater Mekong subregion, emerging resistance to partner drugs in artemisinin combination therapies seriously threatens global efforts to treat and eliminate this disease. Molecular markers that predict failure of artemisinin combination therapy are urgently needed to monitor the spread of partner drug resistance, and to recommend alternative treatments in southeast Asia and beyond. Methods We did a genome-wide association study of 297 P falciparum isolates from Cambodia to investigate the relationship of 11 630 exonic single-nucleotide polymorphisms (SNPs) and 43 copy number variations (CNVs) with in-vitro piperaquine 50% inhibitory concentrations (IC50s), and tested whether these genetic variants are markers of treatment failure with dihydroartemisinin–piperaquine. We then did a survival analysis of 133 patients to determine whether candidate molecular markers predicted parasite recrudescence following dihydroartemisinin–piperaquine treatment. Findings Piperaquine IC50s increased significantly from 2011 to 2013 in three Cambodian provinces (2011 vs 2013 median IC50s: 20·0 nmol/L [IQR 13·7–29·0] vs 39·2 nmol/L [32·8–48·1] for Ratanakiri, 19·3 nmol/L [15·1–26·2] vs 66·2 nmol/L [49·9–83·0] for Preah Vihear, and 19·6 nmol/L [11·9–33·9] vs 81·1 nmol/L [61·3–113·1] for Pursat; all p≤10⁻³; Kruskal-Wallis test). Genome-wide analysis of SNPs identified a chromosome 13 region that associates with raised piperaquine IC50s. A non-synonymous SNP (encoding a Glu415Gly substitution) in this region, within a gene encoding an exonuclease, associates with parasite recrudescence following dihydroartemisinin–piperaquine treatment. Genome-wide analysis of CNVs revealed that a single copy of the mdr1 gene on chromosome 5 and a novel amplification of the plasmepsin 2 and plasmepsin 3 genes on chromosome 14 also associate with raised piperaquine IC50s.
After adjusting for covariates, both exo-E415G and plasmepsin 2–3 markers significantly associate (p=3·0 × 10⁻⁸ and p=1·7 × 10⁻⁷, respectively) with decreased treatment efficacy (survival rates 0·38 [95% CI 0·25–0·51] and 0·41 [0·28–0·53], respectively). Interpretation The exo-E415G SNP and plasmepsin 2–3 amplification are markers of piperaquine resistance and dihydroartemisinin–piperaquine failures in Cambodia, and can help monitor the spread of these phenotypes into other countries of the Greater Mekong subregion, and elucidate the mechanism of piperaquine resistance. Since plasmepsins are involved in the parasite's haemoglobin-to-haemozoin conversion pathway, targeted by related antimalarials, plasmepsin 2–3 amplification probably mediates piperaquine resistance. Funding Intramural Research Program of the US National Institute of Allergy and Infectious Diseases, National Institutes of Health, Wellcome Trust, Bill & Melinda Gates Foundation, Medical Research Council, and UK Department for International Development.
••
TL;DR: A low-cost method of DNA extraction directly from patient samples for M. tuberculosis WGS is demonstrated, providing a potential solution to the problem of variable amounts of M. tuberculosis DNA in direct samples.
Abstract: Routine full characterization of Mycobacterium tuberculosis is culture based, taking many weeks. Whole-genome sequencing (WGS) can generate antibiotic susceptibility profiles to inform treatment, augmented with strain information for global surveillance; such data could be transformative if provided at or near the point of care. We demonstrate a low-cost method of DNA extraction directly from patient samples for M. tuberculosis WGS. We initially evaluated the method by using the Illumina MiSeq sequencer (40 smear-positive respiratory samples obtained after routine clinical testing and 27 matched liquid cultures). M. tuberculosis was identified in all 39 samples from which DNA was successfully extracted. Sufficient data for antibiotic susceptibility prediction were obtained from 24 (62%) samples; all results were concordant with reference laboratory phenotypes. Phylogenetic placement was concordant between direct and cultured samples. With Illumina MiSeq/MiniSeq, the workflow from patient sample to results can be completed in 44/16 h at a reagent cost of £96/£198 per sample. We then employed a nonspecific PCR-based library preparation method for sequencing on an Oxford Nanopore Technologies MinION sequencer. We applied this to cultured Mycobacterium bovis strain BCG DNA and to combined culture-negative sputum DNA and BCG DNA. For flow cell version R9.4, the estimated turnaround time from patient to identification of BCG, detection of pyrazinamide resistance, and phylogenetic placement was 7.5 h, with full susceptibility results 5 h later. Antibiotic susceptibility predictions were fully concordant. A critical advantage of MinION is the ability to continue sequencing until sufficient coverage is obtained, providing a potential solution to the problem of variable amounts of M. tuberculosis DNA in direct samples.
••
National Institute for Health Research1, University of Oxford2, Public Health England3, Imperial College London4, Wellcome Trust Centre for Human Genetics5, Leeds General Infirmary6, Leeds Teaching Hospitals NHS Trust7, Tufts University8, Cubist Pharmaceuticals9, Royal Free London NHS Foundation Trust10
TL;DR: Limiting fluoroquinolone prescribing appears to explain the decline in incidence of C difficile infections, above other measures, in Oxfordshire and Leeds, England.
Abstract: Summary Background The control of Clostridium difficile infections is an international clinical challenge. The incidence of C difficile in England declined by roughly 80% after 2006, following the implementation of national control policies; we tested two hypotheses to investigate their role in this decline. First, if C difficile infection declines in England were driven by reductions in use of particular antibiotics, then incidence of C difficile infections caused by resistant isolates should decline faster than that caused by susceptible isolates across multiple genotypes. Second, if C difficile infection declines were driven by improvements in hospital infection control, then transmitted (secondary) cases should decline regardless of susceptibility. Methods Regional (Oxfordshire and Leeds, UK) and national data for the incidence of C difficile infections and antimicrobial prescribing data (1998–2014) were combined with whole genome sequences from 4045 national and international C difficile isolates. Genotype (multilocus sequence type) and fluoroquinolone susceptibility were determined from whole genome sequences. The incidence of C difficile infections caused by fluoroquinolone-resistant and fluoroquinolone-susceptible isolates was estimated with negative-binomial regression, overall and per genotype. Selection and transmission were investigated with phylogenetic analyses. Findings National fluoroquinolone and cephalosporin prescribing correlated highly with incidence of C difficile infections (cross-correlations >0·88), by contrast with total antibiotic prescribing (cross-correlations […]). […] C difficile decline was driven by elimination of fluoroquinolone-resistant isolates (approximately 67% of Oxfordshire infections in September, 2006, falling to approximately 3% in February, 2013; annual incidence rate ratio 0·52, 95% CI 0·48–0·56 vs fluoroquinolone-susceptible isolates: 1·02, 0·97–1·08).
C difficile infections caused by fluoroquinolone-resistant isolates declined in four distinct genotypes (p […] 0·2). Interpretation Restricting fluoroquinolone prescribing appears to explain the decline in incidence of C difficile infections, above other measures, in Oxfordshire and Leeds, England. Antimicrobial stewardship should be a central component of C difficile infection control programmes. Funding UK Clinical Research Collaboration (Medical Research Council, Wellcome Trust, National Institute for Health Research); NIHR Oxford Biomedical Research Centre; NIHR Health Protection Research Unit on Healthcare Associated Infection and Antimicrobial Resistance (Oxford University in partnership with Public Health England [PHE]), and on Modelling Methodology (Imperial College, London in partnership with PHE); and the Health Innovation Challenge Fund.
••
Mariaelisa Graff1, Robert A. Scott2, Anne E. Justice1, Kristin L. Young1 +346 more•Institutions (101)
TL;DR: In additional genome-wide meta-analyses adjusting for PA and interaction with PA, 11 novel adiposity loci are identified, suggesting that accounting for PA or other environmental factors that contribute to variation in adiposity may facilitate gene discovery.
Abstract: Physical activity (PA) may modify the genetic effects that give rise to increased risk of obesity. To identify adiposity loci whose effects are modified by PA, we performed genome-wide interaction meta-analyses of BMI and BMI-adjusted waist circumference and waist-hip ratio from up to 200,452 adults of European (n = 180,423) or other ancestry (n = 20,029). We standardized PA by categorizing it into a dichotomous variable where, on average, 23% of participants were categorized as inactive and 77% as physically active. While we replicate the interaction with PA for the strongest known obesity-risk locus in the FTO gene, of which the effect is attenuated by ~30% in physically active individuals compared to inactive individuals, we do not identify additional loci that are sensitive to PA. In additional genome-wide meta-analyses adjusting for PA and interaction with PA, we identify 11 novel adiposity loci, suggesting that accounting for PA or other environmental factors that contribute to variation in adiposity may facilitate gene discovery.
••
Roger L. Milne1, Roger L. Milne2, Karoline Kuchenbaecker3, Karoline Kuchenbaecker4 +509 more•Institutions (169)
TL;DR: A GWAS of estrogen receptor (ER)-negative breast cancer cases combined with BRCA1 mutation carriers identifies ten variants at nine new loci and observes consistent associations with ER-negative disease for 105 susceptibility variants identified by other studies; together, 125 variants explain approximately 16% of the familial risk of this breast cancer subtype.
Abstract: Most common breast cancer susceptibility variants have been identified through genome-wide association studies (GWAS) of predominantly estrogen receptor (ER)-positive disease. We conducted a GWAS using 21,468 ER-negative cases and 100,594 controls combined with 18,908 BRCA1 mutation carriers (9,414 with breast cancer), all of European origin. We identified independent associations at P < 5 × 10⁻⁸ with ten variants at nine new loci. At P < 0.05, we replicated associations with 10 of 11 variants previously reported in ER-negative disease or BRCA1 mutation carrier GWAS and observed consistent associations with ER-negative disease for 105 susceptibility variants identified by other studies. These 125 variants explain approximately 16% of the familial risk of this breast cancer subtype. There was high genetic correlation (0.72) between risk of ER-negative breast cancer and breast cancer risk for BRCA1 mutation carriers. These findings may lead to improved risk prediction and inform further fine-mapping and functional work to better understand the biological basis of ER-negative breast cancer.
••
TL;DR: This study provides a method for the molecular classification of patients with sepsis into four different endotypes upon ICU admission and establishes candidate biomarkers for the endotypes to allow identification of patient endotypes in clinical practice.
••
TL;DR: It is suggested that until further evidence on the efficacy or otherwise of surveillance is published, patients with sessile serrated lesions that appear associated with a higher risk of future neoplasia or colorectal cancer should be offered a one-off colonoscopic surveillance examination at 3 years.
Abstract: Serrated polyps have been recognised in the last decade as important premalignant lesions accounting for between 15% and 30% of colorectal cancers. There is therefore a clinical need for guidance on how to manage these lesions; however, the evidence base is limited. A working group was commissioned by the British Society of Gastroenterology (BSG) Endoscopy section to review the available evidence and develop a position statement to provide clinical guidance until the evidence becomes available to support a formal guideline. The scope of the position statement was wide-ranging and included: evidence that serrated lesions have premalignant potential; detection and resection of serrated lesions; surveillance strategies after detection of serrated lesions; special situations, namely serrated polyposis syndrome (including surgery) and serrated lesions in colitis; education, audit and benchmarks and research questions. Statements on these issues were proposed where the evidence was deemed sufficient, and re-evaluated and modified via a Delphi process until >80% agreement was reached. The Grading of Recommendations, Assessment, Development and Evaluations (GRADE) tool was used to assess the strength of evidence and strength of recommendation for finalised statements. Key recommendation: we suggest that until further evidence on the efficacy or otherwise of surveillance is published, patients with sessile serrated lesions (SSLs) that appear associated with a higher risk of future neoplasia or colorectal cancer (SSLs ≥10 mm or serrated lesions harbouring dysplasia including traditional serrated adenomas) should be offered a one-off colonoscopic surveillance examination at 3 years (weak recommendation, low quality evidence, 90% agreement).
••
University of Leicester1, University of Lübeck2, Queen Mary University of London3, Washington University in St. Louis4, Technische Universität München5, Uppsala University6, University of Tartu7, Icahn School of Medicine at Mount Sinai8, Vanderbilt University Medical Center9, University of Wisconsin–Milwaukee10, Fred Hutchinson Cancer Research Center11, University of Michigan12, Université de Montréal13, University of Oxford14, Harvard University15, University of Veterinary Medicine Vienna16, Wellcome Trust Centre for Human Genetics17, University of Dundee18, Humanitas University19, University of Kiel20, University of Bonn21, Norwegian University of Science and Technology22, Umeå University23, University of Verona24, Broad Institute25, Lund University26, University of Edinburgh27, National Institutes of Health28, University of Ottawa29, Montreal Heart Institute30, King Abdulaziz University31, Merck & Co.32, Utrecht University33, University College London34, Ohio State University35, Ludwig Maximilian University of Munich36, University of Cambridge37, Robertson Centre for Biostatistics38, Leiden University Medical Center39, Lille University of Science and Technology40, Copenhagen University Hospital41, University of Toulouse42, University of Pennsylvania43, British Heart Foundation44, University of Strasbourg45, University of Leeds46, Duke University47, Columbia University48, University of Washington49, Glenfield Hospital50
TL;DR: Six new loci associated with coronary artery disease (CAD) at genome-wide significance are identified, and several CAD loci show substantial pleiotropy, which may help us understand the mechanisms by which these loci affect CAD risk.
••
TL;DR: It is suggested that the genetic contribution to prognosis in Crohn's disease is largely independent of the contribution to disease susceptibility and point to a biology of prognosis that could provide new therapeutic opportunities.
Abstract: For most immune-mediated diseases, the main determinant of patient well-being is not the diagnosis itself but instead the course that the disease takes over time (prognosis). Prognosis may vary substantially between patients for reasons that are poorly understood. Familial studies support a genetic contribution to prognosis, but little evidence has been found for a proposed association between prognosis and the burden of susceptibility variants. To better characterize how genetic variation influences disease prognosis, we performed a within-cases genome-wide association study in two cohorts of patients with Crohn's disease. We identified four genome-wide significant loci, none of which showed any association with disease susceptibility. Conversely, the aggregated effect of all 170 disease susceptibility loci was not associated with disease prognosis. Together, these data suggest that the genetic contribution to prognosis in Crohn's disease is largely independent of the contribution to disease susceptibility and point to a biology of prognosis that could provide new therapeutic opportunities.
••
QIMR Berghofer Medical Research Institute1, St. Jude Children's Research Hospital2, deCODE genetics3, Wellcome Trust Centre for Human Genetics4, University of Liverpool5, Katholieke Universiteit Leuven6, Harvard University7, Brigham and Women's Hospital8, University of Queensland9, Vanderbilt University Medical Center10, Copenhagen University Hospital11, University of California, San Diego12, John Radcliffe Hospital13, Niigata University14, University of Tokyo15, RMIT University16, Lundbeck17, Aarhus University18, Merck KGaA19, Queensland University of Technology20
TL;DR: A meta-analysis of genome-wide association case-control data sets for endometriosis highlights novel variants in or near specific genes with important roles in sex steroid hormone signalling and function, and offers unique opportunities for more targeted functional research efforts.
Abstract: Endometriosis is a heritable hormone-dependent gynecological disorder, associated with severe pelvic pain and reduced fertility; however, its molecular mechanisms remain largely unknown. Here we perform a meta-analysis of 11 genome-wide association case-control data sets, totalling 17,045 endometriosis cases and 191,596 controls. In addition to replicating previously reported loci, we identify five novel loci significantly associated with endometriosis risk (P < 5 × 10⁻⁸), implicating genes involved in sex steroid hormone pathways (FN1, CCDC170, ESR1, SYNE1 and FSHB). Conditional analysis identified five secondary association signals, including two at the ESR1 locus, resulting in 19 independent single nucleotide polymorphisms (SNPs) robustly associated with endometriosis, which together explain up to 5.19% of variance in endometriosis. These results highlight novel variants in or near specific genes with important roles in sex steroid hormone signalling and function, and offer unique opportunities for more targeted functional research efforts.
••
TL;DR: MSI and IHC analysis are highly concordant in endometrial cancer, and this holds true for cases with subclonal loss of MMR protein expression.