Showing papers by "Wellcome Trust Centre for Human Genetics" published in 2015
••
University of Oxford1, University of Basel2, Swiss Tropical and Public Health Institute3, University of California, San Francisco4, Tulane University5, Imperial College London6, World Health Organization7, University of Bath8, Institute for Health Metrics and Evaluation9, National Institutes of Health10, Wellcome Trust Centre for Human Genetics11
TL;DR: It is found that Plasmodium falciparum infection prevalence in endemic Africa halved and the incidence of clinical disease fell by 40% between 2000 and 2015, and interventions have averted 663 (542–753 credible interval) million clinical cases since 2000.
Abstract: Since the year 2000, a concerted campaign against malaria has led to unprecedented levels of intervention coverage across sub-Saharan Africa. Understanding the effect of this control effort is vital to inform future control planning. However, the effect of malaria interventions across the varied epidemiological settings of Africa remains poorly understood owing to the absence of reliable surveillance data and the simplistic approaches underlying current disease estimates. Here we link a large database of malaria field surveys with detailed reconstructions of changing intervention coverage to directly evaluate trends from 2000 to 2015, and quantify the attributable effect of malaria disease control efforts. We found that Plasmodium falciparum infection prevalence in endemic Africa halved and the incidence of clinical disease fell by 40% between 2000 and 2015. We estimate that interventions have averted 663 (542-753 credible interval) million clinical cases since 2000. Insecticide-treated nets, the most widespread intervention, were by far the largest contributor (68% of cases averted). Although still below target levels, current malaria interventions have substantially reduced malaria disease incidence across the continent. Increasing access to these interventions, and maintaining their effectiveness in the face of insecticide and drug resistance, should form a cornerstone of post-2015 control strategies.
2,135 citations
••
Wellcome Trust Sanger Institute1, University Medical Center Groningen2, Harvard University3, The Chinese University of Hong Kong4, Wellcome Trust Centre for Human Genetics5, Yonsei University6, Icahn School of Medicine at Mount Sinai7, University of Delhi8, University of Liverpool9, Central Manchester University Hospitals NHS Foundation Trust10, St Mary's Hospital11, Asan Medical Center12
TL;DR: The first trans-ancestry association study of IBD is reported, using genome-wide or Immunochip genotype data from an extended cohort of 86,640 European individuals and Immunochip data from 9,846 individuals of East Asian, Indian or Iranian descent; it implicates 38 loci in IBD risk for the first time.
Abstract: Ulcerative colitis and Crohn's disease are the two main forms of inflammatory bowel disease (IBD). Here we report the first trans-ancestry association study of IBD, with genome-wide or Immunochip genotype data from an extended cohort of 86,640 European individuals and Immunochip data from 9,846 individuals of East Asian, Indian or Iranian descent. We implicate 38 loci in IBD risk for the first time. For the majority of the IBD risk loci, the direction and magnitude of effect are consistent in European and non-European cohorts. Nevertheless, we observe genetic heterogeneity between divergent populations at several established risk loci driven by differences in allele frequency (NOD2) or effect size (TNFSF15 and ATG16L1) or a combination of these factors (IL23R and IRGM). Our results provide biological insights into the pathogenesis of IBD and demonstrate the usefulness of trans-ancestry association studies for mapping loci associated with complex diseases and understanding genetic architecture across diverse populations.
1,826 citations
••
TL;DR: In this paper, the authors compile the largest contemporary database for both species and pair it with relevant environmental variables predicting their global distribution, showing Aedes distributions to be the widest ever recorded; now extensive in all continents, including North America and Europe.
Abstract: Dengue and chikungunya are increasing global public health concerns due to their rapid geographical spread and increasing disease burden. Knowledge of the contemporary distribution of their shared vectors, Aedes aegypti and Aedes albopictus, remains incomplete and is complicated by an ongoing range expansion fuelled by increased global trade and travel. Mapping the global distribution of these vectors and the geographical determinants of their ranges is essential for public health planning. Here we compile the largest contemporary database for both species and pair it with relevant environmental variables predicting their global distribution. We show Aedes distributions to be the widest ever recorded; now extensive in all continents, including North America and Europe. These maps will help define the spatial limits of current autochthonous transmission of dengue and chikungunya viruses. It is only with this kind of rigorous entomological baseline that we can hope to project future health impacts of these viruses.
1,416 citations
••
TL;DR: This work uses maximum likelihood inference to simultaneously detect recombination in bacterial genomes and account for it in phylogenetic reconstruction and finds evidence for recombination hotspots associated with mobile elements in Clostridium difficile ST6 and a previously undescribed 310kb chromosomal replacement in Staphylococcus aureus ST582.
Abstract: Recombination is an important evolutionary force in bacteria, but it remains challenging to reconstruct the imports that occurred in the ancestry of a genomic sample. Here we present ClonalFrameML, which uses maximum likelihood inference to simultaneously detect recombination in bacterial genomes and account for it in phylogenetic reconstruction. ClonalFrameML can analyse hundreds of genomes in a matter of hours, and we demonstrate its usefulness on simulated and real datasets. We find evidence for recombination hotspots associated with mobile elements in Clostridium difficile ST6 and a previously undescribed 310kb chromosomal replacement in Staphylococcus aureus ST582. ClonalFrameML is freely available at http://clonalframeml.googlecode.com/.
684 citations
••
TL;DR: This study comprised 7,219 cases and 15,991 controls of European ancestry, constituting a new genome-wide association study, a meta-analysis with a published GWAS and a replication study, which mapped 43 susceptibility loci, including ten new associations.
Abstract: Systemic lupus erythematosus (SLE) is a genetically complex autoimmune disease characterized by loss of immune tolerance to nuclear and cell surface antigens. Previous genome-wide association studies (GWAS) had modest sample sizes, reducing their scope and reliability. Our study comprised 7,219 cases and 15,991 controls of European ancestry, constituting a new GWAS, a meta-analysis with a published GWAS and a replication study. We have mapped 43 susceptibility loci, including ten new associations. Assisted by dense genome coverage, imputation provided evidence for missense variants underpinning associations in eight genes. Other likely causal genes were established by examining associated alleles for cis-acting eQTL effects in a range of ex vivo immune cells. We found an over-representation (n = 16) of transcription factors among SLE susceptibility genes. This finding supports the view that aberrantly regulated gene expression networks in multiple cell types in both the innate and adaptive immune response contribute to the risk of developing SLE.
637 citations
••
TL;DR: A dimensionality-reduction method is developed, (Z)ero (I)nflated (F)actor (A)nalysis (ZIFA), which explicitly models the dropout characteristics, and it is shown that it improves modeling accuracy on simulated and biological data sets.
Abstract: Single-cell RNA-seq data allows insight into normal cellular function and various disease states through molecular characterization of gene expression on the single cell level. Dimensionality reduction of such high-dimensional data sets is essential for visualization and analysis, but single-cell RNA-seq data are challenging for classical dimensionality-reduction methods because of the prevalence of dropout events, which lead to zero-inflated data. Here, we develop a dimensionality-reduction method, (Z)ero (I)nflated (F)actor (A)nalysis (ZIFA), which explicitly models the dropout characteristics, and show that it improves modeling accuracy on simulated and biological data sets.
583 citations
••
TL;DR: A broad catalogue of genetic mutations enables data from whole-genome sequencing to be used clinically to predict drug resistance, drug susceptibility, or to identify drug phenotypes that cannot yet be genetically predicted.
Abstract:
Background: Diagnosing drug resistance remains an obstacle to the elimination of tuberculosis. Phenotypic drug-susceptibility testing is slow and expensive, and commercial genotypic assays screen only common resistance-determining mutations. We used whole-genome sequencing to characterise common and rare mutations predicting drug resistance, or consistency with susceptibility, for all first-line and second-line drugs for tuberculosis.
Methods: Between Sept 1, 2010, and Dec 1, 2013, we sequenced a training set of 2099 Mycobacterium tuberculosis genomes. For 23 candidate genes identified from the drug-resistance scientific literature, we algorithmically characterised genetic mutations as not conferring resistance (benign), resistance determinants, or uncharacterised. We then assessed the ability of these characterisations to predict phenotypic drug-susceptibility testing for an independent validation set of 1552 genomes. We sought mutations under similar selection pressure to those characterised as resistance determinants outside candidate genes to account for residual phenotypic resistance.
Findings: We characterised 120 training-set mutations as resistance determining, and 772 as benign. With these mutations, we could predict 89·2% of the validation-set phenotypes with a mean 92·3% sensitivity (95% CI 90·7–93·7) and 98·4% specificity (98·1–98·7). 10·8% of validation-set phenotypes could not be predicted because uncharacterised mutations were present. With an in-silico comparison, characterised resistance determinants had higher sensitivity than the mutations from three line-probe assays (85·1% vs 81·6%). No additional resistance determinants were identified among mutations under selection pressure in non-candidate genes.
Interpretation: A broad catalogue of genetic mutations enables data from whole-genome sequencing to be used clinically to predict drug resistance, drug susceptibility, or to identify drug phenotypes that cannot yet be genetically predicted. This approach could be integrated into routine diagnostic workflows, phasing out phenotypic drug-susceptibility testing while reporting drug resistance early.
Funding: Wellcome Trust, National Institute of Health Research, Medical Research Council, and the European Union.
511 citations
••
Wellcome Trust Sanger Institute1, Wellcome Trust Centre for Human Genetics2, University of Oxford3, National Institutes of Health4, University of Maryland, Baltimore5, Mahidol University6, Medical Research Council7, World Health Organization8, United States Department of the Army9, University of Ilorin10
TL;DR: Analysis of the fine structure of the parasite population showed that the fd, arps10, mdr2 and crt polymorphisms are markers of a genetic background on which kelch13 mutations are particularly likely to arise and that they correlate with the contemporary geographical boundaries and population frequencies of artemisinin resistance.
Abstract: We report a large multicenter genome-wide association study of Plasmodium falciparum resistance to artemisinin, the frontline antimalarial drug. Across 15 locations in Southeast Asia, we identified at least 20 mutations in kelch13 (PF3D7_1343700) affecting the encoded propeller and BTB/POZ domains, which were associated with a slow parasite clearance rate after treatment with artemisinin derivatives. Nonsynonymous polymorphisms in fd (ferredoxin), arps10 (apicoplast ribosomal protein S10), mdr2 (multidrug resistance protein 2) and crt (chloroquine resistance transporter) also showed strong associations with artemisinin resistance. Analysis of the fine structure of the parasite population showed that the fd, arps10, mdr2 and crt polymorphisms are markers of a genetic background on which kelch13 mutations are particularly likely to arise and that they correlate with the contemporary geographical boundaries and population frequencies of artemisinin resistance. These findings indicate that the risk of new resistance-causing mutations emerging is determined by specific predisposing genetic factors in the underlying parasite population.
507 citations
••
Wellcome Trust Sanger Institute1, University of Cambridge2, National Institutes of Health3, University of the Witwatersrand4, Uganda Virus Research Institute5, Wellcome Trust Centre for Human Genetics6, Medical Research Council7, Addis Ababa University8, University College London9, National Health Laboratory Service10, UPRRP College of Natural Sciences11, University of KwaZulu-Natal12
TL;DR: It is shown that modern imputation panels (sets of reference genotypes from which unobserved or missing genotypes in study sets can be inferred) can identify association signals at highly differentiated loci across populations in sub-Saharan Africa.
Abstract: Given the importance of Africa to studies of human origins and disease susceptibility, detailed characterization of African genetic diversity is needed. The African Genome Variation Project provides a resource with which to design, implement and interpret genomic studies in sub-Saharan Africa and worldwide. The African Genome Variation Project represents dense genotypes from 1,481 individuals and whole-genome sequences from 320 individuals across sub-Saharan Africa. Using this resource, we find novel evidence of complex, regionally distinct hunter-gatherer and Eurasian admixture across sub-Saharan Africa. We identify new loci under selection, including loci related to malaria susceptibility and hypertension. We show that modern imputation panels (sets of reference genotypes from which unobserved or missing genotypes in study sets can be inferred) can identify association signals at highly differentiated loci across populations in sub-Saharan Africa. Using whole-genome sequencing, we demonstrate further improvements in imputation accuracy, strengthening the case for large-scale sequencing efforts of diverse African haplotypes. Finally, we present an efficient genotype array design capturing common genetic variation in Africa.
482 citations
••
TL;DR: This paper performed a meta-analysis of >15 million genetic variants in 21,399 cases and 95,464 controls from populations of European, African, Japanese and Latino ancestry, followed by replication in 32,059 cases and 228,628 controls from 18 studies.
Abstract: Genetic association studies have identified 21 loci associated with atopic dermatitis risk predominantly in populations of European ancestry. To identify further susceptibility loci for this common, complex skin disease, we performed a meta-analysis of >15 million genetic variants in 21,399 cases and 95,464 controls from populations of European, African, Japanese and Latino ancestry, followed by replication in 32,059 cases and 228,628 controls from 18 studies. We identified ten new risk loci, bringing the total number of known atopic dermatitis risk loci to 31 (with new secondary signals at four of these loci). Notably, the new loci include candidate genes with roles in the regulation of innate host defenses and T cell function, underscoring the important contribution of (auto)immune mechanisms to atopic dermatitis pathogenesis.
471 citations
••
TL;DR: De Bruijn graph representation of bacterial diversity can be used to identify species and resistance profiles of clinical isolates and is implemented in a software package that takes raw sequence data as input, and generates a clinician-friendly report within 3 minutes on a laptop.
Abstract: The rise of antibiotic-resistant bacteria has led to an urgent need for rapid detection of drug resistance in clinical samples, and improvements in global surveillance. Here we show how de Bruijn graph representation of bacterial diversity can be used to identify species and resistance profiles of clinical isolates. We implement this method for Staphylococcus aureus and Mycobacterium tuberculosis in a software package ('Mykrobe predictor') that takes raw sequence data as input, and generates a clinician-friendly report within 3 minutes on a laptop. For S. aureus, the error rates of our method are comparable to gold-standard phenotypic methods, with sensitivity/specificity of 99.1%/99.6% across 12 antibiotics (using an independent validation set, n=470). For M. tuberculosis, our method predicts resistance with sensitivity/specificity of 82.6%/98.5% (independent validation set, n=1,609); sensitivity is lower here, probably because of limited understanding of the underlying genetic mechanisms. We give evidence that minor alleles improve detection of extremely drug-resistant strains, and demonstrate feasibility of the use of emerging single-molecule nanopore sequencing techniques for these purposes.
••
TL;DR: The results provide a path to a subunit vaccine against dengue virus and have implications for the design and monitoring of future vaccine trials in which the induction of antibody to the EDE should be prioritized.
Abstract: Dengue is a rapidly emerging, mosquito-borne viral infection, with an estimated 400 million infections occurring annually. To gain insight into dengue immunity, we characterized 145 human monoclonal antibodies (mAbs) and identified a previously unknown epitope, the envelope dimer epitope (EDE), that bridges two envelope protein subunits that make up the 90 repeating dimers on the mature virion. The mAbs to EDE were broadly reactive across the dengue serocomplex and fully neutralized virus produced in either insect cells or primary human cells, with 50% neutralization in the low picomolar range. Our results provide a path to a subunit vaccine against dengue virus and have implications for the design and monitoring of future vaccine trials in which the induction of antibody to the EDE should be prioritized.
••
TL;DR: There is significant pre-Roman but post-Mesolithic movement into southeastern England from continental Europe, and it is shown that in non-Saxon parts of the United Kingdom, there exist genetically differentiated subgroups rather than a general ‘Celtic’ population.
Abstract: Fine-scale genetic variation between human populations is interesting as a signature of historical demographic events and because of its potential for confounding disease studies. We use haplotype-based statistical methods to analyse genome-wide single nucleotide polymorphism (SNP) data from a carefully chosen geographically diverse sample of 2,039 individuals from the United Kingdom. This reveals a rich and detailed pattern of genetic differentiation with remarkable concordance between genetic clusters and geography. The regional genetic differentiation and differing patterns of shared ancestry with 6,209 individuals from across Europe carry clear signals of historical demographic events. We estimate the genetic contribution to southeastern England from Anglo-Saxon migrations to be under half, and identify the regions not carrying genetic material from these migrations. We suggest significant pre-Roman but post-Mesolithic movement into southeastern England from continental Europe, and show that in non-Saxon parts of the United Kingdom, there exist genetically differentiated subgroups rather than a general 'Celtic' population.
••
Agency for Science, Technology and Research1, University of Oulu2, Imperial College London3, Wellcome Trust Centre for Human Genetics4, University of Milan5, University of Bristol6, National Institutes of Health7, Ealing Hospital8, Pasteur Institute of Lille9, University of Lille10, Hammersmith Hospital11, Baker IDI Heart and Diabetes Institute12, National University of Singapore13, French Institute of Health and Medical Research14, Technische Universität München15, University of Kiel16, University of Oxford17, University of Cambridge18, University of Surrey19, Hannover Medical School20, Max Healthcare21, University of Kelaniya22, University of Mauritius23, University of Helsinki24, Imperial College Healthcare25, University of Pennsylvania26, University of Eastern Finland27, University of Düsseldorf28, Dresden University of Technology29, National Institute for Health Research30
TL;DR: This is a nested case-control study of DNA methylation in Indian Asians and Europeans with incident type 2 diabetes, who were identified from the 8-year follow-up of 25 372 participants in the London Life Sciences Prospective Population study.
••
TL;DR: This paper performed fine mapping of 39 established type 2 diabetes (T2D) loci in 27,206 cases and 57,574 controls of European ancestry, and identified 49 distinct association signals at these loci including five mapping in or near KCNQ1.
Abstract: We performed fine mapping of 39 established type 2 diabetes (T2D) loci in 27,206 cases and 57,574 controls of European ancestry. We identified 49 distinct association signals at these loci, including five mapping in or near KCNQ1. 'Credible sets' of the variants most likely to drive each distinct signal mapped predominantly to noncoding sequence, implying that association with T2D is mediated through gene regulation. Credible set variants were enriched for overlap with FOXA2 chromatin immunoprecipitation binding sites in human islet and liver cells, including at MTNR1B, where fine mapping implicated rs10830963 as driving T2D association. We confirmed that the T2D risk allele for this SNP increases FOXA2-bound enhancer activity in islet- and liver-derived cells. We observed allele-specific differences in NEUROD1 binding in islet-derived cells, consistent with evidence that the T2D risk allele increases islet MTNR1B expression. Our study demonstrates how integration of genetic and genomic information can define molecular mechanisms through which variants underlying association signals exert their effects on disease.
••
University of Leicester1, University of Nottingham2, University of British Columbia3, Laval University4, Tongji University5, Icahn School of Medicine at Mount Sinai6, University Medical Center Groningen7, University of Oxford8, UCL Institute of Neurology9, King's College London10, University of Tartu11, Boston Children's Hospital12, Wellcome Trust Centre for Human Genetics13, University of Geneva14, Queen Mary University of London15, King Abdulaziz University16, Imperial College London17, Imperial College Healthcare18, University of Glasgow19, Wellcome Trust Sanger Institute20, University of Liverpool21, St George's, University of London22, National Institute for Health Research23
TL;DR: By sampling from the extremes of the lung function distribution in UK Biobank, novel genetic causes of lung function and smoking behaviour are identified and substantial shared genetic architecture underlying airflow obstruction is shown across individuals, irrespective of smoking behaviour and other airway disease.
••
TL;DR: It is shown that large increases in imputation accuracy can be achieved by re-phasing WGS reference panels after initial genotype calling, and a method for combining WGS panels to improve variant coverage and downstream imputation accuracy is presented.
Abstract: Imputing genotypes from reference panels created by whole-genome sequencing (WGS) provides a cost-effective strategy for augmenting the single-nucleotide polymorphism (SNP) content of genome-wide arrays. The UK10K Cohorts project has generated a data set of 3,781 whole genomes sequenced at low depth (average 7x), aiming to exhaustively characterize genetic variation down to 0.1% minor allele frequency in the British population. Here we demonstrate the value of this resource for improving imputation accuracy at rare and low-frequency variants in both a UK and an Italian population. We show that large increases in imputation accuracy can be achieved by re-phasing WGS reference panels after initial genotype calling. We also present a method for combining WGS panels to improve variant coverage and downstream imputation accuracy, which we illustrate by integrating 7,562 WGS haplotypes from the UK10K project with 2,184 haplotypes from the 1000 Genomes Project. Finally, we introduce a novel approximation that maintains speed without sacrificing imputation accuracy for rare variants.
••
National Institute for Health Research1, Wellcome Trust Centre for Human Genetics2, University of Oxford3, Illumina4, University of Coimbra5, Laboratory of Molecular Biology6, University of Ulm7, Cliniques Universitaires Saint-Luc8, University of Zurich9, University Hospital Southampton NHS Foundation Trust10, Northwestern University11, Queen's University Belfast12, Imperial College London13, Copenhagen University Hospital14, Belfast City Hospital15, King's College London16, Shriners Hospitals for Children17
TL;DR: It is found that jointly calling variants across samples, filtering against both local and external databases, deploying multiple annotation tools and using familial transmission above biological plausibility contributed to accuracy.
Abstract: To assess factors influencing the success of whole-genome sequencing for mainstream clinical diagnosis, we sequenced 217 individuals from 156 independent cases or families across a broad spectrum of disorders in whom previous screening had identified no pathogenic variants. We quantified the number of candidate variants identified using different strategies for variant calling, filtering, annotation and prioritization. We found that jointly calling variants across samples, filtering against both local and external databases, deploying multiple annotation tools and using familial transmission above biological plausibility contributed to accuracy. Overall, we identified disease-causing variants in 21% of cases, with the proportion increasing to 34% (23/68) for mendelian disorders and 57% (8/14) in family trios. We also discovered 32 potentially clinically actionable variants in 18 genes unrelated to the referral disorder, although only 4 were ultimately considered reportable. Our results demonstrate the value of genome sequencing for routine clinical diagnosis but also highlight many outstanding challenges.
••
TL;DR: Genetic evidence is provided in Drosophila that Notum requires glypicans to suppress Wnt signalling, but does not cleave their glycophosphatidylinositol anchor, and Notum constitutes the first known extracellular protein deacylase.
Abstract: Signalling by Wnt proteins is finely balanced to ensure normal development and tissue homeostasis while avoiding diseases such as cancer. This is achieved in part by Notum, a highly conserved secreted feedback antagonist. Notum has been thought to act as a phospholipase, shedding glypicans and associated Wnt proteins from the cell surface. However, this view fails to explain specificity, as glypicans bind many extracellular ligands. Here we provide genetic evidence in Drosophila that Notum requires glypicans to suppress Wnt signalling, but does not cleave their glycophosphatidylinositol anchor. Structural analyses reveal glycosaminoglycan binding sites on Notum, which probably help Notum to co-localize with Wnt proteins. They also identify, at the active site of human and Drosophila Notum, a large hydrophobic pocket that accommodates palmitoleate. Kinetic and mass spectrometric analyses of human proteins show that Notum is a carboxylesterase that removes an essential palmitoleate moiety from Wnt proteins and thus constitutes the first known extracellular protein deacylase.
••
TL;DR: In this article, the authors investigated whether molecular analysis can be used to refine risk assessment, direct adjuvant therapy, and identify actionable alterations in high-risk endometrial cancer.
••
TL;DR: It is shown that NEAT1 long non-coding RNA (lncRNA) is a direct transcriptional target of HIF in many breast cancer cell lines and in solid tumors and that this contributes to the pro-tumorigenic hypoxia-phenotype in breast cancer.
Abstract: Activation of cellular transcriptional responses, mediated by hypoxia-inducible factor (HIF), is common in many types of cancer, and generally confers a poor prognosis. Known to induce many hundreds of protein-coding genes, HIF has also recently been shown to be a key regulator of the non-coding transcriptional response. Here, we show that NEAT1 long non-coding RNA (lncRNA) is a direct transcriptional target of HIF in many breast cancer cell lines and in solid tumors. Unlike previously described lncRNAs, NEAT1 is regulated principally by HIF-2 rather than by HIF-1. NEAT1 is a nuclear lncRNA that is an essential structural component of paraspeckles and the hypoxic induction of NEAT1 induces paraspeckle formation in a manner that is dependent upon both NEAT1 and on HIF-2. Paraspeckles are multifunction nuclear structures that sequester transcriptionally active proteins as well as RNA transcripts that have been subjected to adenosine-to-inosine (A-to-I) editing. We show that the nuclear retention of one such transcript, F11R (also known as junctional adhesion molecule 1, JAM1), in hypoxia is dependent upon the hypoxic increase in NEAT1, thereby conferring a novel mechanism of HIF-dependent gene regulation. Induction of NEAT1 in hypoxia also leads to accelerated cellular proliferation, improved clonogenic survival and reduced apoptosis, all of which are hallmarks of increased tumorigenesis. Furthermore, in patients with breast cancer, high tumor NEAT1 expression correlates with poor survival. Taken together, these results indicate a new role for HIF transcriptional pathways in the regulation of nuclear structure and that this contributes to the pro-tumorigenic hypoxia-phenotype in breast cancer.
••
Wellcome Trust Centre for Human Genetics1, University of Nottingham2, University of British Columbia3, Virginia Commonwealth University4, University of California, Santa Cruz5, Norwich Research Park6, Malaghan Institute of Medical Research7, European Bioinformatics Institute8, Brown University9, University of East Anglia10, Ontario Institute for Cancer Research11, Cold Spring Harbor Laboratory12
TL;DR: This study provides the global community with the Phase 1 data from MARC, where the reproducibility of the performance of the MinION was evaluated at multiple sites, and a preliminary analysis of the characteristics of typical runs including the consistency, rate, volume and quality of data produced.
Abstract: The advent of a miniaturized DNA sequencing device with a high-throughput contextual sequencing capability embodies the next generation of large scale sequencing tools. The MinION™ Access Programme (MAP) was initiated by Oxford Nanopore Technologies™ in April 2014, giving public access to their USB-attached miniature sequencing device. The MinION Analysis and Reference Consortium (MARC) was formed by a subset of MAP participants, with the aim of evaluating and providing standard protocols and reference data to the community. Envisaged as a multi-phased project, this study provides the global community with the Phase 1 data from MARC, where the reproducibility of the performance of the MinION was evaluated at multiple sites. Five laboratories on two continents generated data using a control strain of Escherichia coli K-12, preparing and sequencing samples according to a revised ONT protocol. Here, we provide the details of the protocol used, along with a preliminary analysis of the characteristics of typical runs including the consistency, rate, volume and quality of data produced. Further analysis of the Phase 1 data presented here, and additional experiments in Phase 2 of E. coli from MARC are already underway to identify ways to improve and enhance MinION performance.
••
University of Minnesota1, University of Texas at Austin2, National Institutes of Health3, North Carolina State University4, Wellcome Trust Centre for Human Genetics5, Wellcome Trust Sanger Institute6, Baylor College of Medicine7, University of Alabama at Birmingham8, Queen Mary University of London9, Boston University10, King's College London11, University of North Carolina at Chapel Hill12
TL;DR: The association of DNA methylation beta value with concurrent body mass index (BMI), waist circumference (WC), and BMI change was tested in the Atherosclerosis Risk in Communities study, adjusting for batch effects and potential confounders.
Abstract: Obesity is an important component of the pathophysiology of chronic diseases. Identifying epigenetic modifications associated with elevated adiposity, including DNA methylation variation, may point to genomic pathways that are dysregulated in numerous conditions. The Illumina 450K Bead Chip array was used to assay DNA methylation in leukocyte DNA obtained from 2097 African American adults in the Atherosclerosis Risk in Communities (ARIC) study. Mixed-effects regression models were used to test the association of methylation beta value with concurrent body mass index (BMI) and waist circumference (WC), and BMI change, adjusting for batch effects and potential confounders. Replication using whole-blood DNA from 2377 White adults in the Framingham Heart Study and CD4+ T cell DNA from 991 Whites in the Genetics of Lipid Lowering Drugs and Diet Network Study was followed by testing using adipose tissue DNA from 648 women in the Multiple Tissue Human Expression Resource cohort. Seventy-six BMI-related probes, 164 WC-related probes and 8 BMI change-related probes passed the threshold for significance in ARIC (P < 1 × 10⁻⁷; Bonferroni), including probes in the recently reported HIF3A, CPT1A and ABCG1 regions. Replication using blood DNA was achieved for 37 BMI probes and 1 additional WC probe. Sixteen of these also replicated in adipose tissue, including 15 novel methylation findings near genes involved in lipid metabolism, immune response/cytokine signaling and other diverse pathways, including LGALS3BP, KDM2B, PBX1 and BBS2, among others. Adiposity traits are associated with DNA methylation at numerous CpG sites that replicate across studies despite variation in tissue type, ethnicity and analytic approaches.
••
National Institutes of Health1, University of Helsinki2, Wellcome Trust Centre for Human Genetics3, University of Tartu4, University of Ferrara5, University Medical Center Groningen6, Amgen7, Uppsala University8, Karolinska Institutet9, VU University Amsterdam10, Erasmus University Rotterdam11, Lund University12, Leiden University Medical Center13, National Institute for Health Research14, University of Lübeck15, Medical Research Council16, Technische Universität München17, University of Tampere18, Steno Diabetes Center19, Ludwig Maximilian University of Munich20, Massachusetts Institute of Technology21, Harvard University22, European Bioinformatics Institute23, University of Leicester24, Turku University Hospital25, Uppsala University Hospital26, Erasmus University Medical Center27, University College London28, University of Turku29, University of Oxford30, University of Iceland31, Minerva Foundation Institute for Medical Research32, University of Liverpool33, Imperial College London34, Wellcome Trust Sanger Institute35
TL;DR: Using a genome-wide screen of 9.6 million genetic variants achieved through 1000 Genomes Project imputation in 62,166 samples, association to lipid traits in 93 loci is identified, including 79 previously identified loci with new lead SNPs and 10 new loci, 15 loci with a low-frequency lead SNP and 10 loci with a missense lead SNP.
Abstract: Using a genome-wide screen of 9.6 million genetic variants achieved through 1000 Genomes Project imputation in 62,166 samples, we identify association to lipid traits in 93 loci, including 79 previously identified loci with new lead SNPs and 10 new loci, 15 loci with a low-frequency lead SNP and 10 loci with a missense lead SNP, and 2 loci with an accumulation of rare variants. In six loci, SNPs with established function in lipid genetics (CELSR2, GCKR, LIPC and APOE) or candidate missense mutations with predicted damaging function (CD300LG and TM6SF2) explained the locus associations. The low-frequency variants increased the proportion of variance explained, particularly for low-density lipoprotein cholesterol and total cholesterol. Altogether, our results highlight the impact of low-frequency variants in complex traits and show that imputation offers a cost-effective alternative to resequencing.
••
TL;DR: A comprehensive analysis pipeline for epigenome-wide association studies (EWAS) using the Illumina Infinium HumanMethylation450 BeadChip is developed, enabling accurate identification of methylation quantitative trait loci for hypothesis driven follow-up experiments.
Abstract: DNA methylation plays a fundamental role in the regulation of the genome, but the optimal strategy for analysis of genome-wide DNA methylation data remains to be determined. We developed a comprehensive analysis pipeline for epigenome-wide association studies (EWAS) using the Illumina Infinium HumanMethylation450 BeadChip, based on 2,687 individuals, with 36 samples measured in duplicate. We propose new approaches to quality control, data normalisation and batch correction through control-probe adjustment and establish a null hypothesis for EWAS using permutation testing. Our analysis pipeline outperforms existing approaches, enabling accurate identification of methylation quantitative trait loci for hypothesis driven follow-up experiments.
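As a rough illustration of the permutation testing mentioned in the abstract above, the following minimal sketch estimates a p-value for a single probe by shuffling phenotype labels. It is not the authors' pipeline: the function names (`mean_diff`, `permutation_p_value`), the two-group mean-difference statistic, and the toy methylation values are all assumptions made for this example.

```python
import random


def mean_diff(values, labels):
    """Difference in mean value between the label-1 and label-0 groups."""
    group1 = [v for v, lab in zip(values, labels) if lab == 1]
    group0 = [v for v, lab in zip(values, labels) if lab == 0]
    return sum(group1) / len(group1) - sum(group0) / len(group0)


def permutation_p_value(values, labels, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the group mean difference.

    Labels are shuffled n_perm times to build an empirical null
    distribution; the +1 terms keep the estimate away from zero.
    """
    rng = random.Random(seed)
    observed = abs(mean_diff(values, labels))
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(mean_diff(values, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical beta values for one probe: five controls, then five cases.
betas = [0.10, 0.12, 0.11, 0.13, 0.09, 0.90, 0.88, 0.91, 0.87, 0.92]
status = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
p_value = permutation_p_value(betas, status)
print(f"permutation p-value: {p_value:.4f}")
```

In a genome-wide setting the same shuffling would typically be applied across all probes at once, so that the null distribution reflects the correlation structure of the array.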
••
Université de Montréal1, University of Cambridge2, University of Kiel3, University of Oxford4, Wellcome Trust Centre for Human Genetics5, Harvard University6, University of Liège7, Casa Sollievo della Sofferenza8, University of California, San Francisco9, University of Melbourne10, University of Pittsburgh11, Wellcome Trust Sanger Institute12, Cedars-Sinai Medical Center13, University of Chicago14, Oslo University Hospital15, University of Oslo16
TL;DR: High-density SNP typing of the MHC in >32,000 individuals with IBD implicates multiple HLA alleles, with a primary role for HLA-DRB1*01:03 in both Crohn's disease and ulcerative colitis, suggesting an important role of the adaptive immune response in the colonic environment in the pathogenesis of IBD.
Abstract: Genome-wide association studies of the related chronic inflammatory bowel diseases (IBD) known as Crohn's disease and ulcerative colitis have shown strong evidence of association to the major histocompatibility complex (MHC). This region encodes a large number of immunological candidates, including the antigen-presenting classical human leukocyte antigen (HLA) molecules. Studies in IBD have indicated that multiple independent associations exist at HLA and non-HLA genes, but they have lacked the statistical power to define the architecture of association and causal alleles. To address this, we performed high-density SNP typing of the MHC in >32,000 individuals with IBD, implicating multiple HLA alleles, with a primary role for HLA-DRB1*01:03 in both Crohn's disease and ulcerative colitis. Noteworthy differences were observed between these diseases, including a predominant role for class II HLA variants and heterozygous advantage observed in ulcerative colitis, suggesting an important role of the adaptive immune response in the colonic environment in the pathogenesis of IBD.
••
Public Health England1, University of Manchester2, University of Oxford3, Institute for Health Metrics and Evaluation4, University of London5, Green Templeton College6, John Radcliffe Hospital7, Population Health Research Institute8, St George's, University of London9, University of Birmingham10, King's College London11, Queen Mary University of London12, Anglia Ruskin University13, University of Cambridge14, University of Liverpool15, University of Leicester16, Great Ormond Street Hospital17, University of Southampton18, Guy's and St Thomas' NHS Foundation Trust19, Imperial College London20, University of Sheffield21, University of Bristol22, Ulster University23, Wellcome Trust Centre for Human Genetics24, University College London25, Aintree University Hospitals NHS Foundation Trust26, Swansea University27, University of York28, University of Cape Town29, Newcastle University30, West Hertfordshire Hospitals NHS Trust31, The George Institute for Global Health32, Mid Sweden University33, British Heart Foundation34, Northumbria University35, University of Edinburgh36, Imperial College Healthcare37, NHS England38, University of Nottingham39, Royal Cornwall Hospital40, London School of Economics and Political Science41
TL;DR: In the Global Burden of Disease Study 2013 (GBD 2013), knowledge about health and its determinants has been integrated into a comparable framework to inform health policy.
••
TL;DR: To study the evolution and determinants of recombination in species lacking the gene that encodes PRDM9, fine-scale genetic maps were inferred from population resequencing data for two bird species; both species have recombination hotspots, which are enriched near functional genomic elements.
Abstract: The DNA-binding protein PRDM9 has a critical role in specifying meiotic recombination hotspots in mice and apes, but it appears to be absent from other vertebrate species, including birds. To study the evolution and determinants of recombination in species lacking the gene that encodes PRDM9, we inferred fine-scale genetic maps from population resequencing data for two bird species: the zebra finch, Taeniopygia guttata, and the long-tailed finch, Poephila acuticauda. We found that both species have recombination hotspots, which are enriched near functional genomic elements. Unlike in mice and apes, most hotspots are shared between the two species, and their conservation seems to extend over tens of millions of years. These observations suggest that in the absence of PRDM9, recombination targets functional features that both enable access to the genome and constrain its evolution.
••
Netherlands Cancer Institute1, University of Bern2, University of East Anglia3, St Thomas' Hospital4, First Faculty of Medicine, Charles University in Prague5, Bosch6, University of Tübingen7, Wellcome Trust Centre for Human Genetics8, Technische Universität München9, University of Amsterdam10, Utrecht University11
TL;DR: This systematic review and meta-analysis found that DPYD variants c.1679T>G and c.1236G>A/HapB3 are clinically relevant predictors of fluoropyrimidine-associated toxicity, and upfront screening for these variants, in addition to the established variants DPYD*2A and c.2846A>T, is recommended to improve the safety of patients with cancer treated with fluoropyrimidines.
Abstract: Background: The best-known cause of intolerance to fluoropyrimidines is dihydropyrimidine dehydrogenase (DPD) deficiency, which can result from deleterious polymorphisms in the gene encoding DPD (DPYD), including DPYD*2A and c.2846A>T. Three other variants (DPYD c.1679T>G, c.1236G>A/HapB3, and c.1601G>A) have been associated with DPD deficiency, but no definitive evidence for the clinical validity of these variants is available. The primary objective of this systematic review and meta-analysis was to assess the clinical validity of c.1679T>G, c.1236G>A/HapB3, and c.1601G>A as predictors of severe fluoropyrimidine-associated toxicity. Methods: We did a systematic review of the literature published before Dec 17, 2014, to identify cohort studies investigating associations between DPYD c.1679T>G, c.1236G>A/HapB3, and c.1601G>A and severe (grade ≥3) fluoropyrimidine-associated toxicity in patients treated with fluoropyrimidines (fluorouracil, capecitabine, or tegafur-uracil as single agents, in combination with other anticancer drugs, or with radiotherapy). Individual patient data were retrieved and analysed in a multivariable analysis to obtain an adjusted relative risk (RR). Effect estimates were pooled by use of a random-effects meta-analysis. The threshold for significance was set at a p value of less than 0·0167 (Bonferroni correction). Findings: 7365 patients from eight studies were included in the meta-analysis. DPYD c.1679T>G was significantly associated with fluoropyrimidine-associated toxicity (adjusted RR 4·40, 95% CI 2·08–9·30, p<[…]), as was c.1236G>A/HapB3 (1·59, 1·29–1·97, p<[…]). The association between c.1601G>A and fluoropyrimidine-associated toxicity was not significant (adjusted RR 1·52, 95% CI 0·86–2·70, p=0·15). Analysis of individual types of toxicity showed consistent associations of c.1679T>G and c.1236G>A/HapB3 with gastrointestinal toxicity (adjusted RR 5·72, 95% CI 1·40–23·33, p=0·015; and 2·04, 1·49–2·78, p<[…]) […] DPYD*2A and c.2846A>T were also significantly associated with severe fluoropyrimidine-associated toxicity (adjusted RR 2·85, 95% CI 1·75–4·62, p<[…]) […] Interpretation: DPYD variants c.1679T>G and c.1236G>A/HapB3 are clinically relevant predictors of fluoropyrimidine-associated toxicity. Upfront screening for these variants, in addition to the established variants DPYD*2A and c.2846A>T, is recommended to improve the safety of patients with cancer treated with fluoropyrimidines. Funding: None.
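The significance threshold of 0·0167 quoted in the abstract above is consistent with a Bonferroni correction across the three candidate variants. The short sketch below reproduces that arithmetic; the family-wise alpha of 0.05 is an assumption (the abstract does not state it), and the helper names are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def is_significant(p_value: float, threshold: float) -> bool:
    """A result counts as significant only if p is strictly below the threshold."""
    return p_value < threshold


# Assumed family-wise alpha of 0.05, split across the three variants tested
# (c.1679T>G, c.1236G>A/HapB3, c.1601G>A).
threshold = bonferroni_threshold(0.05, 3)
print(round(threshold, 4))

# Two p-values reported in the abstract, checked against the threshold.
print(is_significant(0.015, threshold))  # gastrointestinal toxicity, c.1679T>G
print(is_significant(0.15, threshold))   # overall toxicity, c.1601G>A
```

Under this assumption, the reported p=0·015 falls just below the corrected threshold, while p=0·15 does not.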
••
TL;DR: The transcriptome effects of protein-truncating variants, a class of variants expected to have profound effects on gene function, are characterized using data from the Genotype-Tissue Expression (GTEx) and Geuvadis projects to illustrate the value of transcriptome data in the functional interpretation of genetic variants.
Abstract: Accurate prediction of the functional effect of genetic variation is critical for clinical genome interpretation. We systematically characterized the transcriptome effects of protein-truncating variants, a class of variants expected to have profound effects on gene function, using data from the Genotype-Tissue Expression (GTEx) and Geuvadis projects. We quantitated tissue-specific and positional effects on nonsense-mediated transcript decay and present an improved predictive model for this decay. We directly measured the effect of variants both proximal and distal to splice junctions. Furthermore, we found that robustness to heterozygous gene inactivation is not due to dosage compensation. Our results illustrate the value of transcriptome data in the functional interpretation of genetic variants.