Showing papers by "David Altshuler" published in 2013
••
TL;DR: This unit describes how to use BWA and the Genome Analysis Toolkit to map genome sequencing data to a reference and produce high‐quality variant calls that can be used in downstream analyses.
Abstract: This unit describes how to use BWA and the Genome Analysis Toolkit (GATK) to map genome sequencing data to a reference and produce high-quality variant calls that can be used in downstream analyses. The complete workflow includes the core NGS data processing steps that are necessary to make the raw data suitable for analysis by the GATK, as well as the key methods involved in variant discovery using the GATK.
5,150 citations
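As a rough illustration of the workflow this unit describes (map reads with BWA, prepare the alignments, then call variants with the GATK), the sketch below only assembles command lines and does not run them. The GATK4-style `gatk` invocations, the file-naming scheme, and the helper name `build_commands` are assumptions for illustration, not the unit's own protocol; steps such as base quality recalibration are omitted.

```python
def build_commands(ref, fq1, fq2, sample, threads=4):
    """Assemble (but do not run) a minimal map-and-call pipeline.

    All output file names are hypothetical; the tool syntax assumes
    bwa 0.7.17+, samtools 1.x and GATK4, which may differ from the
    versions used in the original unit.
    """
    sam = f"{sample}.sam"
    sorted_bam = f"{sample}.sorted.bam"
    dedup_bam = f"{sample}.dedup.bam"
    metrics = f"{sample}.dup_metrics.txt"
    vcf = f"{sample}.vcf.gz"
    # bwa expects a literal backslash-t between read-group fields.
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA"

    return [
        # 1. Map paired-end reads to the reference.
        ["bwa", "mem", "-t", str(threads), "-R", read_group,
         "-o", sam, ref, fq1, fq2],
        # 2. Coordinate-sort the alignments.
        ["samtools", "sort", "-o", sorted_bam, sam],
        # 3. Flag duplicate reads.
        ["gatk", "MarkDuplicates", "-I", sorted_bam,
         "-O", dedup_bam, "-M", metrics],
        # 4. Index the processed BAM.
        ["samtools", "index", dedup_bam],
        # 5. Call variants.
        ["gatk", "HaplotypeCaller", "-R", ref,
         "-I", dedup_bam, "-O", vcf],
    ]


if __name__ == "__main__":
    for cmd in build_commands("ref.fa", "s1_1.fq", "s1_2.fq", "s1"):
        print(" ".join(cmd))
```

Returning argument lists rather than shell strings keeps each step inspectable and lets a caller hand them to `subprocess.run` one at a time.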
••
Panos Deloukas1, Stavroula Kanoni1, Christina Willenborg2, Martin Farrall3 +201 more•Institutions (64)
TL;DR: An association analysis in CAD cases and controls identifies 15 loci reaching genome-wide significance, taking the number of susceptibility loci for CAD to 46, and a further 104 independent variants strongly associated with CAD at a 5% false discovery rate (FDR).
Abstract: Coronary artery disease (CAD) is the commonest cause of death. Here, we report an association analysis in 63,746 CAD cases and 130,681 controls identifying 15 loci reaching genome-wide significance, taking the number of susceptibility loci for CAD to 46, and a further 104 independent variants (r² < 0.2) strongly associated with CAD at a 5% false discovery rate (FDR). Together, these variants explain approximately 10.6% of CAD heritability. Of the 46 genome-wide significant lead SNPs, 12 show a significant association with a lipid trait, and 5 show a significant association with blood pressure, but none is significantly associated with diabetes. Network analysis with 233 candidate genes (loci at 10% FDR) generated 5 interaction networks comprising 85% of these putative genes involved in CAD. The four most significant pathways mapping to these networks are linked to lipid metabolism and inflammation, underscoring the causal role of these activities in the genetic etiology of CAD. Our study provides insights into the genetic basis of CAD and identifies key biological pathways.
1,518 citations
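The abstract above selects variants "at a 5% false discovery rate". The abstract does not say which FDR procedure was used; as an assumption, the sketch below shows the widely used Benjamini-Hochberg step-up procedure on a small list of made-up p-values. The function name and the example values are illustrative only.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return the sorted indices of hypotheses rejected at the given FDR.

    Step-up rule: find the largest rank k such that the k-th smallest
    p-value is <= (k / m) * fdr, then reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank  # keep the largest rank that passes
    return sorted(order[:cutoff_rank])


if __name__ == "__main__":
    # Hypothetical p-values, not data from the study.
    example = [0.001, 0.008, 0.039, 0.041, 0.042,
               0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(example, fdr=0.05))
```

Because the rule is step-up, a p-value that misses its own threshold can still be rejected when a larger p-value further down the ranking passes.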
••
TL;DR: The results better delimit the historical details of human protein-coding variation, show the profound effect of recent human history on the burden of deleterious SNVs segregating in contemporary populations, and provide important practical information that can be used to prioritize variants in disease-gene discovery.
Abstract: Establishing the age of each mutation segregating in contemporary human populations is important to fully understand our evolutionary history and will help to facilitate the development of new approaches for disease-gene discovery. Large-scale surveys of human genetic variation have reported signatures of recent explosive population growth, notable for an excess of rare genetic variants, suggesting that many mutations arose recently. To more quantitatively assess the distribution of mutation ages, we resequenced 15,336 genes in 6,515 individuals of European American and African American ancestry and inferred the age of 1,146,401 autosomal single nucleotide variants (SNVs). We estimate that approximately 73% of all protein-coding SNVs and approximately 86% of SNVs predicted to be deleterious arose in the past 5,000-10,000 years. The average age of deleterious SNVs varied significantly across molecular pathways, and disease genes contained a significantly higher proportion of recently arisen deleterious SNVs than other genes. Furthermore, European Americans had an excess of deleterious variants in essential and Mendelian disease genes compared to African Americans, consistent with weaker purifying selection due to the Out-of-Africa dispersal. Our results better delimit the historical details of human protein-coding variation, show the profound effect of recent human history on the burden of deleterious SNVs segregating in contemporary populations, and provide important practical information that can be used to prioritize variants in disease-gene discovery.
934 citations
••
TL;DR: It is suggested that triglyceride-rich lipoproteins causally influence risk for CAD, and the strength of a polymorphism's effect on triglyceride levels is correlated with the magnitude of its effect on CAD risk.
Abstract: Triglycerides are transported in plasma by specific triglyceride-rich lipoproteins; in epidemiological studies, increased triglyceride levels correlate with higher risk for coronary artery disease (CAD). However, it is unclear whether this association reflects causal processes. We used 185 common variants recently mapped for plasma lipids (P < 5 × 10⁻⁸ for each) to examine the role of triglycerides in risk for CAD. First, we highlight loci associated with both low-density lipoprotein cholesterol (LDL-C) and triglyceride levels, and we show that the direction and magnitude of the associations with both traits are factors in determining CAD risk. Second, we consider loci with only a strong association with triglycerides and show that these loci are also associated with CAD. Finally, in a model accounting for effects on LDL-C and/or high-density lipoprotein cholesterol (HDL-C) levels, the strength of a polymorphism's effect on triglyceride levels is correlated with the magnitude of its effect on CAD risk. These results suggest that triglyceride-rich lipoproteins causally influence risk for CAD.
817 citations
••
TL;DR: The concept of dose–response curves derived from experiments of nature is described, with an emphasis on human genetics as a valuable tool to prioritize molecular targets in drug development.
Abstract: More than 90% of the compounds that enter clinical trials fail to demonstrate sufficient safety and efficacy to gain regulatory approval. Most of this failure is due to the limited predictive value of preclinical models of disease, and our continued ignorance regarding the consequences of perturbing specific targets over long periods of time in humans. 'Experiments of nature' - naturally occurring mutations in humans that affect the activity of a particular protein target or targets - can be used to estimate the probable efficacy and toxicity of a drug targeting such proteins, as well as to establish causal rather than reactive relationships between targets and outcomes. Here, we describe the concept of dose-response curves derived from experiments of nature, with an emphasis on human genetics as a valuable tool to prioritize molecular targets in drug development. We discuss empirical examples of drug-gene pairs that support the role of human genetics in testing therapeutic hypotheses at the stage of target validation, provide objective criteria to prioritize genetic findings for future drug discovery efforts and highlight the limitations of a target validation approach that is anchored in human genetics.
544 citations
••
Broad Institute1, Harvard University2, Yale University3, Baylor College of Medicine4, Icahn School of Medicine at Mount Sinai5, Carnegie Mellon University6, University of Pennsylvania7, University of Texas Health Science Center at Houston8, University of Illinois at Chicago9, Vanderbilt University10, University of Pittsburgh11
TL;DR: Results provide compelling evidence that rare autosomal and X chromosome complete gene knockouts are important inherited risk factors for ASD.
248 citations
••
Memorial Sloan Kettering Cancer Center1, Radboud University Nijmegen2, University of Washington3, Columbia University4, Primary Children's Hospital5, University of Chicago6, Baylor College of Medicine7, University of New South Wales8, QIMR Berghofer Medical Research Institute9, University of British Columbia10, South Australia Pathology11, Cornell University12, University of Pennsylvania13, Broad Institute14
TL;DR: A new heterozygous germline variant, c.547G>A (p.Gly183Ser), affecting the octapeptide domain of PAX5 is reported to segregate with disease in two unrelated kindreds with autosomal dominant B-ALL; the data implicate PAX5 in a new syndrome of susceptibility to pre-B cell neoplasia.
Abstract: Somatic alterations of the lymphoid transcription factor gene PAX5 (also known as BSAP) are a hallmark of B cell precursor acute lymphoblastic leukemia (B-ALL), but inherited mutations of PAX5 have not previously been described. Here we report a new heterozygous germline variant, c.547G>A (p.Gly183Ser), affecting the octapeptide domain of PAX5 that was found to segregate with disease in two unrelated kindreds with autosomal dominant B-ALL. Leukemic cells from all affected individuals in both families exhibited 9p deletion, with loss of heterozygosity and retention of the mutant PAX5 allele at 9p13. Two additional sporadic ALL cases with 9p loss harbored somatic PAX5 substitutions affecting Gly183. Functional and gene expression analysis of the PAX5 mutation demonstrated that it had significantly reduced transcriptional activity. These data extend the role of PAX5 alterations in the pathogenesis of pre-B cell ALL and implicate PAX5 in a new syndrome of susceptibility to pre-B cell neoplasia.
248 citations
••
TL;DR: An integrated simulation framework is developed, calibrated to empirical data, to enable the systematic evaluation of contradictory hypotheses about features of genetic architecture, including those where rare variants explain either little or most of T2D heritability.
Abstract: The genetic architecture of human diseases governs the success of genetic mapping and the future of personalized medicine. Although numerous studies have queried the genetic basis of common disease, contradictory hypotheses have been advocated about features of genetic architecture (for example, the contribution of rare versus common variants). We developed an integrated simulation framework, calibrated to empirical data, to enable the systematic evaluation of such hypotheses. For type 2 diabetes (T2D), two simple parameters--(i) the target size for causal mutation and (ii) the coupling between selection and phenotypic effect--define a broad space of architectures. Whereas extreme models are excluded by the combination of epidemiology, linkage and genome-wide association studies, many models remain consistent, including those where rare variants explain either little or most (>80%) of T2D heritability. Ongoing sequencing and genotyping studies will further constrain the space of possible architectures, but very large samples (for example, >250,000 unselected individuals) will be required to localize most of the heritability underlying T2D and other traits characterized by these models.
154 citations
••
University of Copenhagen1, University of California, Berkeley2, Wellcome Trust Centre for Human Genetics3, Lund University4, Centre national de la recherche scientifique5, University of Eastern Finland6, University of Oxford7, Aarhus University8, VU University Medical Center9, University of Dundee10, University of Helsinki11, University of Exeter12, Umeå University13, Chinese PLA General Hospital14, Queen Mary University of London15, South China University of Technology16, Steno Diabetes Center17, IRSA18, Claude Bernard University Lyon 119, French Institute of Health and Medical Research20, Leiden University Medical Center21, Newcastle University22, Broad Institute23, Harvard University24, Imperial College London25, National Institute for Health Research26, University of Southern Denmark27, Glostrup Hospital28, Aalborg University29
TL;DR: Exome sequencing is applied as a basis for finding genetic determinants of metabolic traits, showing the existence of low-frequency and common coding polymorphisms that affect common metabolic traits.
Abstract: Aims/hypothesis: Human complex metabolic traits are in part regulated by genetic determinants. Here we applied exome sequencing to identify novel associations of coding polymorphisms at minor allele frequencies (MAFs) >1% with common metabolic phenotypes. Methods: The study comprised three stages. We performed medium-depth (8×) whole exome sequencing in 1,000 cases with type 2 diabetes, BMI >27.5 kg/m² and hypertension and in 1,000 controls (stage 1). We selected 16,192 polymorphisms nominally associated (p<0.05) with case–control status, from four selected annotation categories or from
130 citations
••
Massachusetts Institute of Technology1, Harvard University2, University of Bergen3, University of Mississippi Medical Center4, Jackson State University5, Tougaloo College6, Lund University7, Haukeland University Hospital8, Pfizer9, University of Helsinki10, Helsinki University Central Hospital11, Boston Children's Hospital12, Brigham and Women's Hospital13, Howard Hughes Medical Institute14
TL;DR: Accurate estimates of variant effect sizes from population-based sequencing are needed to avoid falsely predicting a substantial fraction of individuals as being at risk for MODY or other Mendelian diseases.
Abstract: Genome sequencing can identify individuals in the general population who harbor rare coding variants in genes for Mendelian disorders and who may consequently have increased disease risk. Previous studies of rare variants in phenotypically extreme individuals display ascertainment bias and may demonstrate inflated effect-size estimates. We sequenced seven genes for maturity-onset diabetes of the young (MODY) in well-phenotyped population samples (n = 4,003). We filtered rare variants according to two prediction criteria for disease-causing mutations: reported previously in MODY or satisfying stringent de novo thresholds (rare, conserved and protein damaging). Approximately 1.5% and 0.5% of randomly selected individuals from the Framingham and Jackson Heart Studies, respectively, carry variants from these two classes. However, the vast majority of carriers remain euglycemic through middle age. Accurate estimates of variant effect sizes from population-based sequencing are needed to avoid falsely predicting a substantial fraction of individuals as being at risk for MODY or other Mendelian diseases.
123 citations
••
American Cancer Society1, University of Cambridge2, Memorial Sloan Kettering Cancer Center3, New York University4, QIMR Berghofer Medical Research Institute5, Laval University6, University of Lyon7, Mayo Clinic8, Emory University9, McGill University10, Netherlands Cancer Institute11, Paris Descartes University12, Curie Institute13, Peter MacCallum Cancer Centre14, University of Pennsylvania15, Roswell Park Cancer Institute16, Medical University of Vienna17, Odense University Hospital18, Copenhagen University Hospital19, Beckman Research Institute20, National Institutes of Health21, Dana Corporation22, City of Hope National Medical Center23, Mount Sinai Hospital, Toronto24, University of Toronto25, University of Utah26, University of Padua27, Hospital Clínico San Carlos28, Helsinki University Central Hospital29, University of Pretoria30, University of Iceland31, Rappaport Faculty of Medicine32, Erasmus University Medical Center33, University Medical Center Groningen34, Radboud University Nijmegen Medical Centre35, Leiden University Medical Center36, Utrecht University37, Guy's and St Thomas' NHS Foundation Trust38, Western General Hospital39, University of Bordeaux40, University of Florence41, University of Kiel42, Technische Universität München43, Heidelberg University44, Sheba Medical Center45, Aarhus University Hospital46, University of Melbourne47, University of California, Los Angeles48, University of London49, German Cancer Research Center50, French Institute of Health and Medical Research51, University of Paris-Sud52, Hannover Medical School53, Karolinska Institutet54, University of Eastern Finland55, Katholieke Universiteit Leuven56, Cancer Council Victoria57, University of Southern California58, Oulu University Hospital59, Institute of Cancer Research60, The Breast Cancer Research Foundation61, University of Sheffield62, Pomeranian Medical University63, Harvard University64, Broad Institute65
TL;DR: A comprehensive update of novel and previously reported breast cancer susceptibility loci contributes to the establishment of a panel of SNPs that modify breast cancer risk in BRCA2 mutation carriers and may have clinical utility for women with BRCA2 mutations weighing options for medical prevention of breast cancer.
Abstract: Common genetic variants contribute to the observed variation in breast cancer risk for BRCA2 mutation carriers; those known to date have all been found through population-based genome-wide association studies (GWAS). To comprehensively identify breast cancer risk modifying loci for BRCA2 mutation carriers, we conducted a deep replication of an ongoing GWAS discovery study. Using the ranked P-values of the breast cancer associations with the imputed genotype of 1.4 M SNPs, 19,029 SNPs were selected and designed for inclusion on a custom Illumina array that included a total of 211,155 SNPs as part of a multi-consortial project. DNA samples from 3,881 breast cancer affected and 4,330 unaffected BRCA2 mutation carriers from 47 studies belonging to the Consortium of Investigators of Modifiers of BRCA1/2 were genotyped and available for analysis. We replicated previously reported breast cancer susceptibility alleles in these BRCA2 mutation carriers and for several regions (including FGFR2, MAP3K1, CDKN2A/B, and PTHLH) identified SNPs that have stronger evidence of association than those previously published. We also identified a novel susceptibility allele at 6p24 that was inversely associated with risk in BRCA2 mutation carriers (rs9348512; per allele HR = 0.85, 95% CI 0.80-0.90, P = 3.9 × 10⁻⁸). This SNP was not associated with breast cancer risk either in the general population or in BRCA1 mutation carriers. The locus lies within a region containing TFAP2A, which encodes a transcriptional activation protein that interacts with several tumor suppressor genes. This report identifies the first breast cancer risk locus specific to a BRCA2 mutation background. This comprehensive update of novel and previously reported breast cancer susceptibility loci contributes to the establishment of a panel of SNPs that modify breast cancer risk in BRCA2 mutation carriers. This panel may have clinical utility for women with BRCA2 mutations weighing options for medical prevention of breast cancer.
••
TL;DR: This correction adds Mark J. Rieder, who oversaw data generation and quality control and is a member of the Seattle Grand Opportunity group, to the author list of the original Letter.
Abstract: Nature 493, 216–220 (2013); doi:10.1038/nature11690 In this Letter, Mark J. Rieder (Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA) was inadvertently omitted from the author list. He oversaw data generation and quality control and is a member of the Seattle Grand Opportunity group.
••
Washington University in St. Louis1, Broad Institute2, Harvard University3, Fred Hutchinson Cancer Research Center4, University of Wisconsin–Milwaukee5, Wellcome Trust Centre for Human Genetics6, University of Oxford7, Karolinska Institutet8, University of Leicester9, Lund University10, University of Ottawa11, University of Verona12, Ohio State University13, University of North Carolina at Chapel Hill14, University of Parma15, University of Pennsylvania16, Technische Universität München17, University of Lübeck18, University of Amsterdam19
TL;DR: This study diagnosed clinically unapparent cholesterol ester storage disease in the affected individuals from this kindred and addressed an outstanding question about risk of cardiovascular disease in LIPA E8SJM heterozygous carriers.
Abstract: Objective: Autosomal recessive hypercholesterolemia is a rare inherited disorder, characterized by extremely high total and low-density lipoprotein cholesterol levels, that has been previously linked to mutations in LDLRAP1. We identified a family with autosomal recessive hypercholesterolemia not explained by mutations in LDLRAP1 or other genes known to cause monogenic hypercholesterolemia. The aim of this study was to identify the molecular pathogenesis of autosomal recessive hypercholesterolemia in this family. Approach and Results: We used exome sequencing to assess all protein-coding regions of the genome in 3 family members and identified a homozygous exon 8 splice junction mutation (c.894G>A, also known as E8SJM) in LIPA that segregated with the diagnosis of hypercholesterolemia. Because homozygosity for mutations in LIPA is known to cause cholesterol ester storage disease, we performed directed follow-up phenotyping by noninvasively measuring hepatic cholesterol content. We observed abnormal hepatic accumulation of cholesterol in the homozygote individuals, supporting the diagnosis of cholesterol ester storage disease. Given previous suggestions of cardiovascular disease risk in heterozygous LIPA mutation carriers, we genotyped E8SJM in >27,000 individuals and found no association with plasma lipid levels or risk of myocardial infarction, confirming a true recessive mode of inheritance. Conclusions: By integrating observations from Mendelian and population genetics along with directed clinical phenotyping, we diagnosed clinically unapparent cholesterol ester storage disease in the affected individuals from this kindred and addressed an outstanding question about risk of cardiovascular disease in LIPA E8SJM heterozygous carriers.
••
Thorgeir E. Thorgeirsson1, Daniel F. Gudbjartsson1, Patrick Sulem1, Søren Besenbacher2 +313 more•Institutions (90)
TL;DR: SNPs affecting body mass index (BMI) were tested for their effect on smoking behavior; the results strongly point to a common biological basis for the regulation of appetite for tobacco and food, and thus for vulnerability to nicotine addiction and obesity.
Abstract: Smoking influences body weight such that smokers weigh less than non-smokers and smoking cessation often leads to weight increase. The relationship between body weight and smoking is partly explained by the effect of nicotine on appetite and metabolism. However, the brain reward system is involved in the control of the intake of both food and tobacco. We evaluated the effect of single-nucleotide polymorphisms (SNPs) affecting body mass index (BMI) on smoking behavior, and tested the 32 SNPs identified in a meta-analysis for association with two smoking phenotypes, smoking initiation (SI) and the number of cigarettes smoked per day (CPD) in an Icelandic sample (N = 34,216 smokers). Combined according to their effect on BMI, the SNPs correlate with both SI (r = 0.019, P = 0.00054) and CPD (r = 0.032, P = 8.0 × 10⁻⁷). These findings replicate in a second large data set (N = 127,274, thereof 76,242 smokers) for both SI (P = 1.2 × 10⁻⁵) and CPD (P = 9.3 × 10⁻⁵). Notably, the variant most strongly associated with BMI (rs1558902-A in FTO) did not associate with smoking behavior. The association with smoking behavior is not due to the effect of the SNPs on BMI. Our results strongly point to a common biological basis of the regulation of our appetite for tobacco and food, and thus the vulnerability to nicotine addiction and obesity.
••
TL;DR: BCAA/AAAs changed acutely during glipizide and metformin administration, and the magnitude and direction of change differed by the insulin resistance status of the individual and the intervention, indicating that BCAA/AAAs may be useful biomarkers for monitoring the early response to therapeutic interventions for T2D.
Abstract: Objective: Elevated circulating levels of branched chain and aromatic amino acids (BCAA/AAAs) are associated with insulin resistance and incident type 2 diabetes (T2D). BCAA/AAAs decrease acutely during an oral glucose tolerance test (OGTT), a diagnostic test for T2D. It is unknown whether changes in BCAA/AAAs also signal an early response to commonly used medical therapies for T2D. Materials and Methods: A liquid chromatography–mass spectrometry approach was used to measure BCAA/AAAs in 30 insulin sensitive (IS) and 30 insulin resistant (IR) subjects before and after: 1) one dose of a sulfonylurea medication, glipizide, 5 mg orally; 2) two days of twice daily metformin 500 mg orally; and 3) a 75-g OGTT. Percent change in BCAA/AAAs was determined after each intervention. Results: Following glipizide, which increased insulin and decreased glucose in both subject groups, BCAA/AAAs decreased in the IS subjects only (all P …). Conclusions: BCAA/AAAs changed acutely during glipizide and metformin administration, and the magnitude and direction of change differed by the insulin resistance status of the individual and the intervention. These results indicate that BCAA/AAAs may be useful biomarkers for monitoring the early response to therapeutic interventions for T2D.