Showing papers by "David Altshuler" published in 2016
••
Broad Institute1, Harvard University2, Boston Children's Hospital3, University of Washington4, University of Arizona5, Cardiff University6, Google7, Icahn School of Medicine at Mount Sinai8, Samsung Medical Center9, Vertex Pharmaceuticals10, University of Michigan11, University of Cambridge12, State University of New York Upstate Medical University13, Karolinska Institutet14, University of Eastern Finland15, University of Oxford16, Wellcome Trust Centre for Human Genetics17, Cedars-Sinai Medical Center18, University of Ottawa19, University of Pennsylvania20, University of North Carolina at Chapel Hill21, University of Helsinki22, University of California, San Diego23, University of Mississippi Medical Center24
TL;DR: The aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC) provides direct evidence for the presence of widespread mutational recurrence.
Abstract: Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human 'knockout' variants in protein-coding genes.
8,758 citations
••
Wellcome Trust Sanger Institute1, University of Michigan2, University of Oxford3, University of Geneva4, University of Exeter5, Greifswald University Hospital6, National Research Council7, University of Bristol8, University of Colorado Boulder9, University of Washington10, Fred Hutchinson Cancer Research Center11, SUNY Downstate Medical Center12, Erasmus University Rotterdam13, University of Trieste14, VU University Amsterdam15, King's College London16, South London and Maudsley NHS Foundation Trust17, University of Edinburgh18, Harvard University19, National Institutes of Health20, Harokopio University21, Innsbruck Medical University22, Broad Institute23, University of Helsinki24, Lund University25, Norwegian University of Science and Technology26, University of Cambridge27, University of Minnesota28, Technische Universität München29, University of North Carolina at Chapel Hill30, University of Toronto31, McGill University32, Leiden University33, University of Pennsylvania34, University of Groningen35, Utrecht University36, Churchill Hospital37
TL;DR: A reference panel of 64,976 human haplotypes at 39,235,157 SNPs constructed using whole-genome sequence data from 20 studies of predominantly European ancestry leads to accurate genotype imputation at minor allele frequencies as low as 0.1% and a large increase in the number of SNPs tested in association studies.
Abstract: We describe a reference panel of 64,976 human haplotypes at 39,235,157 SNPs constructed using whole-genome sequence data from 20 studies of predominantly European ancestry. Using this resource leads to accurate genotype imputation at minor allele frequencies as low as 0.1% and a large increase in the number of SNPs tested in association studies, and it can help to discover and refine causal loci. We describe remote server resources that allow researchers to carry out imputation and phasing consistently and efficiently.
2,149 citations
01 Jan 2016
TL;DR: In this article, a reference panel of 64,976 human haplotypes at 39,235,157 SNPs constructed using whole-genome sequence data from 20 studies of predominantly European ancestry is presented.
1,261 citations
••
Christian Fuchsberger1,2, Jason Flannick3,4 +346 more • Institutions (77)
TL;DR: In this paper, the authors performed whole-genome sequencing in 2,657 European individuals with and without diabetes, and exome sequencing for 12,940 individuals from five ancestry groups.
Abstract: The genetic architecture of common traits, including the number, frequency, and effect sizes of inherited variants that contribute to individual risk, has been long debated. Genome-wide association studies have identified scores of common variants associated with type 2 diabetes, but in aggregate, these explain only a fraction of the heritability of this disease. Here, to test the hypothesis that lower-frequency variants explain much of the remainder, the GoT2D and T2D-GENES consortia performed whole-genome sequencing in 2,657 European individuals with and without diabetes, and exome sequencing in 12,940 individuals from five ancestry groups. To increase statistical power, we expanded the sample size via genotyping and imputation in a further 111,548 subjects. Variants associated with type 2 diabetes after sequencing were overwhelmingly common and most fell within regions previously identified by genome-wide association studies. Comprehensive enumeration of sequence variation is necessary to identify functional alleles that provide important clues to disease pathophysiology, but large-scale sequencing does not support the idea that lower-frequency variants have a major role in predisposition to type 2 diabetes.
866 citations
01 Jan 2016
TL;DR: Variants associated with type 2 diabetes after sequencing were overwhelmingly common and most fell within regions previously identified by genome-wide association studies; large-scale sequencing does not support the idea that lower-frequency variants have a major role in predisposition to type 2 diabetes.
698 citations
••
TL;DR: A meta-analysis of genome-wide association studies for estimated glomerular filtration rate suggests that genetic determinants of eGFR are mediated largely through direct effects within the kidney and highlight important cell types and biological pathways.
Abstract: Reduced glomerular filtration rate defines chronic kidney disease and is associated with cardiovascular and all-cause mortality. We conducted a meta-analysis of genome-wide association studies for estimated glomerular filtration rate (eGFR), combining data across 133,413 individuals with replication in up to 42,166 individuals. We identify 24 new and confirm 29 previously identified loci. Of these 53 loci, 19 associate with eGFR among individuals with diabetes. Using bioinformatics, we show that identified genes at eGFR loci are enriched for expression in kidney tissues and in pathways relevant for kidney development and transmembrane transporter activity, kidney structure, and regulation of glucose metabolism. Chromatin state mapping and DNase I hypersensitivity analyses across adult tissues demonstrate preferential mapping of associated variants to regulatory regions in kidney but not extra-renal tissues. These findings suggest that genetic determinants of eGFR are mediated largely through direct effects within the kidney and highlight important cell types and biological pathways.
409 citations
••
TL;DR: Saturation mutagenesis and prospective experimental characterization can support immediate diagnostic interpretation of newly discovered missense variants in disease-related genes.
Abstract: Amit Majithia and colleagues employ a pooled assay in human macrophages to assess the functional effects of all possible missense variants in PPARG. Their study shows the value of saturation mutagenesis and prospective experimental characterization to support diagnostic interpretation of newly discovered missense variants in disease-related genes. Clinical exome sequencing routinely identifies missense variants in disease-related genes, but functional characterization is rarely undertaken, leading to diagnostic uncertainty (1,2). For example, mutations in PPARG cause Mendelian lipodystrophy (3,4) and increase risk of type 2 diabetes (T2D) (5). Although approximately 1 in 500 people harbor missense variants in PPARG, most are of unknown consequence. To prospectively characterize PPARγ variants, we used highly parallel oligonucleotide synthesis to construct a library encoding all 9,595 possible single-amino acid substitutions. We developed a pooled functional assay in human macrophages, experimentally evaluated all protein variants, and used the experimental data to train a variant classifier by supervised machine learning. When applied to 55 new missense variants identified in population-based and clinical sequencing, the classifier annotated 6 variants as pathogenic; these were subsequently validated by single-variant assays. Saturation mutagenesis and prospective experimental characterization can support immediate diagnostic interpretation of newly discovered missense variants in disease-related genes.
196 citations
••
TL;DR: This data-sharing effort has led to improved variant interpretation and development of treatments for rare diseases and some cancer types, but such benefits will only be available to the general population if researchers and clinicians can access and make comparisons across data from millions of individuals.
Abstract: Silos of genome data collection are being transformed into seamlessly connected, independent systems. Early data-sharing efforts have led to improved variant interpretation and development of treatments for rare diseases and some cancer types (1–3). However, such benefits will only be available to the general population if researchers and clinicians can access and make comparisons across data from millions of individuals.
173 citations
••
TL;DR: From the analysis of NHLBI Exome Sequencing Project (ESP) data, not only have a number of important disease and complex trait association findings emerged, but the collective experience offers some valuable lessons for WGS initiatives.
Abstract: Massively parallel whole-genome sequencing (WGS) data have ushered in a new era in human genetics. These data are now being used to understand the role of rare variants in complex traits and to advance the goals of precision medicine. The technological and computing advances that have enabled us to generate WGS data on thousands of individuals have also outpaced our ability to perform analyses in scientifically and statistically rigorous and thoughtful ways. The past several years have witnessed the application of whole-exome sequencing (WES) to complex traits and diseases. From our analysis of NHLBI Exome Sequencing Project (ESP) data, not only have a number of important disease and complex trait association findings emerged, but our collective experience offers some valuable lessons for WGS initiatives. These include caveats associated with generating automated pipelines for quality control and analysis of rare variants; the importance of studying minority populations; sample size requirements and efficient study designs for identifying rare-variant associations; and the significance of incidental findings in population-based genetic research. With the ESP as an example, we offer guidance and a framework on how to conduct a large-scale association study in the era of WGS.
79 citations
••
University of Oxford1, Wellcome Trust Centre for Human Genetics2, University of Michigan3, Wellcome Trust Sanger Institute4, University of Tokyo5, University of Cambridge6, Ealing Hospital7, Boston University8, Harvard University9, Institute of Genomics and Integrative Biology10, Broad Institute11, Centre national de la recherche scientifique12, University of Texas Health Science Center at Houston13, University of North Carolina at Chapel Hill14, University of Chicago15, Texas Biomedical Research Institute16, Systems Research Institute17, University of California, San Francisco18, University of Haifa19, Albert Einstein College of Medicine20, The Chinese University of Hong Kong21, University of Mississippi Medical Center22, Jawaharlal Nehru University23, Hallym University24, Seoul National University25, Imperial College Healthcare26, National Institutes of Health27, University of Pennsylvania28, National University of Singapore29, Vanderbilt University30, Beta31, Imperial College London32, Life Sciences Institute33, University of Liverpool34
TL;DR: Transancestral fine-mapping was undertaken in 22 086 cases and 42 539 controls of East Asian, European, South Asian, African American and Mexican American descent to provide insight into the mechanisms through which type 2 diabetes association signals are mediated, and to suggest future routes to understanding the biology of specific disease susceptibility loci.
Abstract: To gain insight into potential regulatory mechanisms through which the effects of variants at four established type 2 diabetes (T2D) susceptibility loci (CDKAL1, CDKN2A-B, IGF2BP2 and KCNQ1) are mediated, we undertook transancestral fine-mapping in 22 086 cases and 42 539 controls of East Asian, European, South Asian, African American and Mexican American descent. Through high-density imputation and conditional analyses, we identified seven distinct association signals at these four loci, each with allelic effects on T2D susceptibility that were homogenous across ancestry groups. By leveraging differences in the structure of linkage disequilibrium between diverse populations, and increased sample size, we localised the variants most likely to drive each distinct association signal. We demonstrated that integration of these genetic fine-mapping data with genomic annotation can highlight potential causal regulatory elements in T2D-relevant tissues. These analyses provide insight into the mechanisms through which T2D association signals are mediated, and suggest future routes to understanding the biology of specific disease susceptibility loci.
23 citations
••
Harvard University1, National Institutes of Health2, Brigham and Women's Hospital3, Indiana University4, Broad Institute5, Icahn School of Medicine at Mount Sinai6, University of Cambridge7, University of Pennsylvania8, University of Michigan9, Churchill Hospital10, Wellcome Trust Centre for Human Genetics11, University of Oxford12
TL;DR: A null mutation in ANGPTL8 shows no significant association with fasting glucose levels or risk for type 2 diabetes, so disruption of ANGPTL8 function in humans does not seem to have a large effect on measures of glucose tolerance.
Abstract: Experiments in mice initially suggested a role for the protein angiopoietin-like 8 (ANGPTL8) in glucose homeostasis. However, subsequent experiments in model systems have challenged this proposed role. We sought to better understand the importance of ANGPTL8 in human glucose homeostasis by examining the association of a null mutation in ANGPTL8 with fasting glucose levels and risk for type 2 diabetes. A naturally occurring null mutation in human ANGPTL8 (rs145464906; c.361C>T; p.Q121X) is carried by ~1 in 1000 individuals of European ancestry and is associated with higher levels of plasma high-density lipoprotein cholesterol, suggesting that this mutation has functional significance. We examined the association of p.Q121X with fasting glucose levels and risk for type 2 diabetes in up to 95,558 individuals (14,824 type 2 diabetics and 80,734 controls). We found no significant association of p.Q121X with either fasting glucose or type 2 diabetes (p-value = 0.90 and 0.65, respectively). Given our sample sizes, we had >98% power to detect at least a 0.23 mmol/L effect on plasma glucose and >95% power to detect a 70% increase in risk for type 2 diabetes. Disruption of ANGPTL8 function in humans does not seem to have a large effect on measures of glucose tolerance.
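As a quick consistency check on the figures quoted in this abstract, the sketch below confirms that the case and control counts sum to the reported total and gives a rough feel for how many carriers a 1-in-1000 allele would yield at that sample size. The function name `expected_carriers` is hypothetical, and applying the European-ancestry carrier rate uniformly to the whole sample is an illustrative assumption, not something the abstract states.

```python
# Figures quoted in the abstract above.
CASES = 14_824
CONTROLS = 80_734
REPORTED_TOTAL = 95_558

# "~1 in 1000 individuals of European ancestry" carry p.Q121X.
# Assumption for illustration only: the same rate applies to everyone in the sample.
CARRIER_RATE = 1 / 1000


def expected_carriers(n_individuals: int, carrier_rate: float) -> float:
    """Expected number of carriers under a single uniform carrier rate."""
    return n_individuals * carrier_rate


total = CASES + CONTROLS
print(f"cases + controls = {total} (reported: {REPORTED_TOTAL})")
print(f"illustrative expected carriers: {expected_carriers(total, CARRIER_RATE):.1f}")
```

Under that assumption the sample would contain on the order of a hundred carriers, which helps explain why very large cohorts are needed to study a variant this rare.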
••
TL;DR: A limited number of null/damaging alleles with a large effect on cardiovascular traits were detectable in ≈3000 black individuals.
Abstract: Background— The correlation of null alleles with human phenotypes can provide insight into gene function in humans. In individuals of African ancestry, we set out to identify null and damaging missense variants, and test these variants for association with a range of cardiovascular phenotypes.
Methods and Results— We performed whole-exome sequencing in 3223 black individuals from the Jackson Heart Study and found a total of 729 666 variant sites with minor allele frequency <5%, including 17 263 null variants and 49 929 missense variants predicted to be damaging by in silico algorithms. We tested null and damaging missense variants within each gene for association with 36 cardiovascular traits. We found 3 associations that met our prespecified level of significance (α = 1.1×10⁻⁷). Null and damaging missense variants in PCSK9 were associated with 36 mg/dL lower low-density lipoprotein cholesterol (P = 3×10⁻²¹). Three individuals in their 50s with complete PCSK9 deficiency (each compound heterozygote for PCSK9 p.Y142X and p.C679X) were identified, with one having a coronary artery calcification score in the 83rd percentile despite a low-density lipoprotein cholesterol of 32 mg/dL. A damaging missense variant in HBQ1 (p.G52A) was associated with a 2 pg/cell lower mean corpuscular hemoglobin (P = 9×10⁻¹³) and rare damaging missense variants in VPS13A with higher red blood cell distribution width (P = 9.9×10⁻⁸).
Conclusions— A limited number of null/damaging alleles with a large effect on cardiovascular traits were detectable in ≈3000 black individuals.
••
Baylor College of Medicine1, Fred Hutchinson Cancer Research Center2, University of Wisconsin–Milwaukee3, University of Antioquia4, University of North Carolina at Chapel Hill5, National Institutes of Health6, Boston University7, University of California, Los Angeles8, University of Minnesota9, University of Washington10, Harvard University11, University of Mississippi12, Ohio State University13, New York University14, University of Michigan15, Broad Institute16
TL;DR: This study indicates that the combined effect of rare variants contributes to the inter-individual variation in fat distribution through the regulation of insulin response.
Abstract: Waist-to-hip ratio (WHR), a relative comparison of waist and hip circumferences, is an easily accessible measurement of body fat distribution, in particular central abdominal fat. A high WHR indicates more intra-abdominal fat deposition and is an established risk factor for cardiovascular disease and type 2 diabetes. Recent genome-wide association studies have identified numerous common genetic loci influencing WHR, but the contributions of rare variants have not been previously reported. We investigated rare variant associations with WHR in 1510 European-American and 1186 African-American women from the National Heart, Lung, and Blood Institute-Exome Sequencing Project. Association analysis was performed on the gene level using several rare variant association methods. The strongest association was observed for rare variants in IKBKB (P = 4.0 × 10⁻⁸) in European-Americans, where rare variants in this gene are predicted to decrease WHRs. The activation of the IKBKB gene is involved in inflammatory processes and insulin resistance, which may affect normal food intake and body weight and shape. Meanwhile, the aggregation of rare variants in COBLL1, previously found to harbor common variants associated with WHR and fasting insulin, was nominally associated (P = 2.23 × 10⁻⁴) with higher WHR in European-Americans. However, these significant results are not shared between African-Americans and European-Americans, which may be due to differences in the allelic architecture of the two populations and the small sample sizes. Our study indicates that the combined effect of rare variants contributes to the inter-individual variation in fat distribution through the regulation of insulin response.
••
TL;DR: PALB2 acts in the double-strand DNA break repair pathway recruiting RAD51 and BRCA2 to DNA breaks via its WD40 domain, and founder and recurrent mutations in PALB2 have been identified in several populations.
Abstract: Introduction: PALB2 (partner and localizer of BRCA2) has been implicated in hereditary breast cancer susceptibility, with estimates of breast cancer risk up to 91% (95% CI, 44% to 100%) to age 70 years for particular mutations. Germline mutations in PALB2 have also been identified in individuals with pancreatic cancer and ovarian carcinoma, both with and without familial breast cancer, suggesting a role in susceptibility to breast and ovarian cancer. PALB2 acts in the double-strand DNA break repair pathway recruiting RAD51 and BRCA2 to DNA breaks via its WD40 domain. Biallelic germline mutations cause Fanconi anemia, complementation group N (FANCN). Pathogenic germline variants of PALB2 causing loss of normal function may be substitutions or insertions/deletions. Tumors in germline PALB2 mutation carriers show loss of the wild-type allele consistent with a tumor suppressor function. Structural variants deleting or duplicating multiple exons of PALB2 have been reported in association with familial breast cancer and in FANCN, and founder and recurrent mutations in PALB2 have been identified in several populations.
••
01 Jan 2016
TL;DR: The ability to define more complete individual inventories of genetic risk and environmental exposure is only a start towards understanding the complex molecular pathophysiology that underlies disease, understanding that will support the development of integrative readouts that track causal pathogenetic mechanisms and the invention of new therapies that restore homeostasis through these pathways.
Abstract: Individual predisposition to type 2 diabetes is influenced by the combination of genetic variants, environmental exposures, behaviour and chance. Human genetics offers a method to identify specific genetic variants that influence disease risk and thereby the pathways and mechanisms through which they operate. These pathways provide a powerful lens through which to develop biological insights into metabolism and disease and have the potential to inform diagnosis and treatment. Indeed, this potential is already being realised in precision medical management of monogenic and syndromic forms of diabetes. While substantial progress has been made identifying genetic variants for the common, multifactorial forms of type 2 diabetes, major challenges remain before we gain insight and translational benefit. The difficulty derives from the genetic architecture of type 2 diabetes and other common diseases, which involves a large number of variants of modest effect, many non-coding and presumably regulatory in nature. The ability to define more complete individual inventories of genetic risk and environmental exposure is only a start towards understanding the complex molecular pathophysiology that underlies disease, understanding that will support the development of integrative readouts that track causal pathogenetic mechanisms and the invention of new therapies that restore homeostasis through these pathways.