Showing papers by "Adam E. Locke" published in 2014


Journal ArticleDOI
Andrew R. Wood, Tõnu Esko, Jian Yang, Sailaja Vedantam (+441 more authors; 132 institutions)
TL;DR: This article identified 697 variants at genome-wide significance that together explained one-fifth of the heritability for adult height, and all common variants together captured 60% of heritability.
Abstract: Using genome-wide data from 253,288 individuals, we identified 697 variants at genome-wide significance that together explained one-fifth of the heritability for adult height. By testing different numbers of variants in independent studies, we show that the most strongly associated ∼2,000, ∼3,700 and ∼9,500 SNPs explained ∼21%, ∼24% and ∼29% of phenotypic variance. Furthermore, all common variants together captured 60% of heritability. The 697 variants clustered in 423 loci were enriched for genes, pathways and tissue types known to be involved in growth and together implicated genes and pathways not highlighted in earlier efforts, such as signaling by fibroblast growth factors, WNT/β-catenin and chondroitin sulfate-related genes. We identified several genes and pathways not previously connected with human skeletal growth, including mTOR, osteoglycin and binding of hyaluronic acid. Our results indicate a genetic architecture for human height that is characterized by a very large but finite number (thousands) of causal variants.

1,872 citations


Journal ArticleDOI
TL;DR: This article presents a general protocol for conducting GWAMAs and carrying out QC to minimize errors and to guarantee maximum use of the data. For consortia of comparable size to the GIANT Consortium, the protocol takes a minimum of about 10 months to complete.
Abstract: Rigorous organization and quality control (QC) are necessary to facilitate successful genome-wide association meta-analyses (GWAMAs) of statistics aggregated across multiple genome-wide association studies. This protocol provides guidelines for (i) organizational aspects of GWAMAs, and for (ii) QC at the study file level, the meta-level across studies and the meta-analysis output level. Real-world examples highlight issues experienced and solutions developed by the GIANT Consortium that has conducted meta-analyses including data from 125 studies comprising more than 330,000 individuals. We provide a general protocol for conducting GWAMAs and carrying out QC to minimize errors and to guarantee maximum use of the data. We also include details for the use of a powerful and flexible software package called EasyQC. Precise timings will be greatly influenced by consortium size. For consortia of comparable size to the GIANT Consortium, this protocol takes a minimum of about 10 months to complete.

370 citations


Journal ArticleDOI
Leslie A. Lange, Youna Hu, He Zhang, Chenyi Xue, Ellen M. Schmidt, Zheng-Zheng Tang, Chris Bizon, Ethan M. Lange, Joshua D. Smith, Emily H. Turner, Goo Jun, Hyun Min Kang, Gina M. Peloso, Paul L. Auer, Kuo Ping Li, Jason Flannick, Ji Zhang, Christian Fuchsberger, Kyle J. Gaulton, Cecilia M. Lindgren, Adam E. Locke, Alisa K. Manning, Xueling Sim, Manuel A. Rivas, Oddgeir L. Holmen, Omri Gottesman, Yingchang Lu, Douglas M. Ruderfer, Eli A. Stahl, Qing Duan, Yun Li, Peter Durda, Shuo Jiao, Aaron Isaacs, Albert Hofman, Joshua C. Bis, Adolfo Correa, Michael Griswold, Johanna Jakobsdottir, Albert V. Smith, Pamela J. Schreiner, Mary F. Feitosa, Qunyuan Zhang, Jennifer E. Huffman, Jacy R. Crosby, Christina L. Wassel, Ron Do, Nora Franceschini, Lisa W. Martin, Jennifer G. Robinson, Themistocles L. Assimes, David R. Crosslin, Elisabeth A. Rosenthal, Michael Y. Tsai, Mark J. Rieder, Deborah N. Farlow, Aaron R. Folsom, Thomas Lumley, Ervin R. Fox, Christopher S. Carlson, Ulrike Peters, Rebecca D. Jackson, Cornelia M. van Duijn, André G. Uitterlinden, Daniel Levy, Jerome I. Rotter, Herman A. Taylor, Vilmundur Gudnason, David S. Siscovick, Myriam Fornage, Ingrid B. Borecki, Caroline Hayward, Igor Rudan, Y. Eugene Chen, Erwin P. Bottinger, Ruth J. F. Loos, Pål Sætrom, Kristian Hveem, Michael Boehnke, Leif Groop, Mark I. McCarthy, Thomas Meitinger, Christie M. Ballantyne, Stacey Gabriel, Christopher J. O'Donnell, Wendy S. Post, Kari E. North, Alexander P. Reiner, Eric Boerwinkle, Bruce M. Psaty, David Altshuler, Sekar Kathiresan, Danyu Lin, Gail P. Jarvik, L. Adrienne Cupples, Charles Kooperberg, James G. Wilson, Deborah A. Nickerson, Gonçalo R. Abecasis, Stephen S. Rich, Russell P. Tracy, Cristen J. Willer
TL;DR: This large whole-exome-sequencing study for LDL-C identified a gene not known to be implicated in LDL- C and provides unique insight into the design and analysis of similar experiments.
Abstract: Elevated low-density lipoprotein cholesterol (LDL-C) is a treatable, heritable risk factor for cardiovascular disease. Genome-wide association studies (GWASs) have identified 157 variants associated with lipid levels but are not well suited to assess the impact of rare and low-frequency variants. To determine whether rare or low-frequency coding variants are associated with LDL-C, we exome sequenced 2,005 individuals, including 554 individuals selected for extreme LDL-C (>98th or <2nd percentile). Follow-up analyses included sequencing of 1,302 additional individuals and genotype-based analysis of 52,221 individuals. We observed significant evidence of association between LDL-C and the burden of rare or low-frequency variants in PNPLA5, encoding a phospholipase-domain-containing protein, and both known and previously unidentified variants in PCSK9, LDLR and APOB, three known lipid-related genes. The effect sizes for the burden of rare variants for each associated gene were substantially higher than those observed for individual SNPs identified from GWASs. We replicated the PNPLA5 signal in an independent large-scale sequencing study of 2,084 individuals. In conclusion, this large whole-exome-sequencing study for LDL-C identified a gene not known to be implicated in LDL-C and provides unique insight into the design and analysis of similar experiments.

201 citations


04 Dec 2014
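TL;DR: The results indicate a genetic architecture for human height that is characterized by a very large but finite number (thousands) of causal variants, and identify several genes and pathways not previously connected with human skeletal growth, including mTOR, osteoglycin and binding of hyaluronic acid.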
Abstract: Using genome-wide data from 253,288 individuals, we identified 697 variants at genome-wide significance that together explained one-fifth of the heritability for adult height. By testing different numbers of variants in independent studies, we show that the most strongly associated ∼2,000, ∼3,700 and ∼9,500 SNPs explained ∼21%, ∼24% and ∼29% of phenotypic variance. Furthermore, all common variants together captured 60% of heritability. The 697 variants clustered in 423 loci were enriched for genes, pathways and tissue types known to be involved in growth and together implicated genes and pathways not highlighted in earlier efforts, such as signaling by fibroblast growth factors, WNT/β-catenin and chondroitin sulfate–related genes. We identified several genes and pathways not previously connected with human skeletal growth, including mTOR, osteoglycin and binding of hyaluronic acid. Our results indicate a genetic architecture for human height that is characterized by a very large but finite number (thousands) of causal variants.

97 citations


Journal ArticleDOI
Sophie R. Wang, Vineeta Agarwala, Jason Flannick, Charleston W. K. Chiang, David Altshuler, Alisa Manning, Christopher Hartl, Pierre Fontanillas, Todd Green, Eric Banks, Mark A. DePristo, Ryan Poplin, Khalid Shakir, Timothy Fennell, Jacquelyn Murphy, Noël P. Burtt, Stacey Gabriel, Christian Fuchsberger, Hyun Min Kang, Xueling Sim, Clement Ma, Adam E. Locke, Thomas W. Blackwell, Anne U. Jackson, Tanya M. Teslovich, Heather M. Stringham, Peter S. Chines, Phoenix Kwan, Jeroen R. Huyghe, Adrian Tan, Goo Jun, Michael L. Stitzel, Richard N. Bergman, Lori L. Bonnycastle, Jaakko Tuomilehto, Francis S. Collins, Laura J. Scott, Karen L. Mohlke, Gonçalo R. Abecasis, Michael Boehnke, Tim M. Strom, Christian Gieger, Martina Müller-Nurasyid, Harald Grallert, Jennifer Kriebel, Janina S. Ried, Martin Hrabé de Angelis, Cornelia Huth, Christa Meisinger, Annette Peters, Wolfgang Rathmann, Konstantin Strauch, Thomas Meitinger, Jasmina Kravic, Claes Ladenvall, Tiinamaija Tuomi, Bo Isomaa, Leif Groop, Kyle J. Gaulton, Loukas Moutsianas, Manny Rivas, Richard D. Pearson, Anubha Mahajan, Inga Prokopenko, Ashok Kumar, John R. B. Perry, Jeff Chen, Bryan Howie, Martijn van de Bunt, Kerrin S. Small, Cecilia M. Lindgren, Gerton Lunter, Neil Robertson, Will Rayner, Andrew D. Morris, David Buck, Andrew T. Hattersley, Tim D. Spector, Gil McVean, Timothy M. Frayling, Peter Donnelly, Mark I. McCarthy, Joel N. Hirschhorn
TL;DR: It is demonstrated that power of rare-variant association tests is higher in the Finnish population, especially when variants' phenotypic effects are tightly coupled with fitness effects and therefore reflect a greater contribution of rarer variants.
Abstract: Finnish samples have been extensively utilized in studying single-gene disorders, where the founder effect has clearly aided in discovery, and more recently in genome-wide association studies of complex traits, where the founder effect has had less obvious impacts. As the field starts to explore rare variants' contribution to polygenic traits, it is of great importance to characterize and confirm the Finnish founder effect in sequencing data and to assess its implications for rare-variant association studies. Here, we employ forward simulation, guided by empirical deep resequencing data, to model the genetic architecture of quantitative polygenic traits in both the general European and the Finnish populations simultaneously. We demonstrate that power of rare-variant association tests is higher in the Finnish population, especially when variants' phenotypic effects are tightly coupled with fitness effects and therefore reflect a greater contribution of rarer variants. SKAT-O, variable-threshold tests, and single-variant tests are more powerful than other rare-variant methods in the Finnish population across a range of genetic models. We also compare the relative power and efficiency of exome array genotyping to those of high-coverage exome sequencing. At a fixed cost, less expensive genotyping strategies have far greater power than sequencing; in a fixed number of samples, however, genotyping arrays miss a substantial portion of genetic signals detected in sequencing, even in the Finnish founder population. As genetic studies probe sequence variation at greater depth in more diverse populations, our simulation approach provides a framework for evaluating various study designs for gene discovery.

30 citations


Journal ArticleDOI
TL;DR: It is argued that larger samples, alternative study designs, and additional bioinformatics approaches will be necessary to discover associations between these endophenotypes and genomic variation.
Abstract: Five of the companion articles in this special issue describe genome-wide association studies (GWAS) from a fixed genotyping array with a prespecified set of 527,829 variants. Such genotyping arrays are designed primarily to capture common variants, those with a minor allele frequency, or MAF, greater than .05. The other original research article in this issue (Vrieze et al., 2014) describes an association study between the 17 putative endophenotypes and rare nonsynonymous exonic variants specifically, which are variants in coding regions that affect protein structure. In the current study, we extended these analyses by employing whole genome sequencing in an attempt to discover nearly all single nucleotide polymorphisms (SNPs) present in any given individual, including those on the GWAS and exome arrays as well as tens of millions of additional variants. Because all variants are directly measured and genotyped, this results in increased power for common variants and the ability to test rare variants throughout the entire genome on a far larger scale than the other articles in this special issue. Whole genome sequencing interrogates the entire genome to discover and accurately genotype variants from across the allelic spectrum, from private mutations possessed by a single person (or family), to common variants genotyped on typical microarrays. The past few years have seen significant advances in population genetics and characterization of rare genomic variation, which were only possible with genome sequencing technology. The 1000 Genomes Project, for example, combined exome and whole genome sequencing to discover 38 million SNPs in 1,092 individuals from 14 ancestral populations (1000 Genomes Project Consortium, 2012). The Exome Sequencing Project (Fu et al., 2013) and analogous exome sequencing projects (Nelson et al., 2012) have extensively interrogated exonic regions of the genome and characterized a wide diversity of rare coding variants. 
In the present study, we found 27.1 million autosomal SNPs, 21.3 million of which have minor allele frequency less than 5%. Almost none of these 21 million variants were tested in the other articles of this special issue. 1000 Genomes, Exome Sequencing Project, ENCODE, and many related projects represent breathtaking technological and analytical achievements, delivering insight into molecular biology, genomics, evolutionary history, migratory patterns, and disease biology, to name a few (Lander, 2011). Genome sequencing has been less widely used in the study of human behavior, with notable exceptions including advances in the genetics of autism (Neale et al., 2012; O’Roak et al., 2011) and schizophrenia (Fromer et al., 2014; Purcell et al., 2014). These studies employed exome sequencing, which interrogates only the exons for each of ~20,000 protein coding genes throughout the genome. The exome is an important, but small, section of the genome, comprising less than 2% of all sequence in the genome. The remainder of the genome, colorfully referred to in the past as “junk DNA,” is anything but that. Work by the ENCODE consortium (ENCODE Project Consortium, 2012) and others has verified that noncoding DNA harbors genetic variation critical to genomic function. While coding variants can affect protein structure, which is undoubtedly important, noncoding DNA can affect which, when, and how frequently genes are expressed, termed “gene regulation” more broadly. Indeed, early work suggests that a majority of disease-associated variants are in noncoding regions, with regulatory regions likely enriched for genome-wide significant variants (Maurano et al., 2012; Pickrell, 2014). Exhaustively interrogating genetic variation in coding and noncoding regions requires whole genome sequencing.
In the accompanying papers in this special issue, we described a variety of genetic association studies in a sample of 4,905 individuals using different genotyping technologies to identify variants associated with 17 psychophysiological phenotypes (for an overview, see Iacono, Malone, Vaidyanathan, & Vrieze, 2014). These endophenotypes include P300 amplitude, antisaccade direction errors, startle eye blink magnitude and modulation by affective stimuli, skin conductance level and responses in a habituation paradigm, and measures of resting EEG. Although some of these endophenotypes are robust candidates, and despite the hope that endophenotypes would provide increased power to detect associated genes, the investigations described in the companion articles of this special issue yielded few significant findings. In analyses of common variants, only antisaccade error was significantly associated with an individual SNP (Vaidyanathan, Malone, Donnelly et al., 2014). Tests of rare exonic variants also produced one significant association, between a nonsynonymous SNP in PARD3 and electroencephalogram (EEG) theta power (Vrieze et al., 2014). Gene-based tests of common variants, which aggregate the effect of all SNPs within a given gene into a single score, yielded several significant associations. P3 amplitude was associated with MYEF2 (Malone, Vaidyanathan et al., 2014), EEG delta power was associated with three genes (DEFA4, DEFA6, and GABRA1; Malone, Burwell et al., 2014), antisaccade performance was associated with two genes on Chromosome 2—B3GNT7 and NCL—and the aversive difference startle modulation score was associated with the PARP14 gene on Chromosome 3. Gene-based tests of rare exonic variants yielded one significant association with the pleasant difference startle modulation score and PNPLA7 (Vrieze et al., 2014), which was not readily interpretable. 
The present article appears last in this special issue because it is our most comprehensive and most powerful attempt to discover novel genetic loci associated with these endophenotypes. In this article, we describe three primary analyses. First, we test for association between 27 million autosomal SNPs and each of the 17 endophenotypes in 1,706 individuals with whole genome sequences. Second, we conduct gene-based tests of nonsynonymous variants in these same 1,706 individuals. Third, we use the combination of genotype arrays and sequences to impute all 27 million variants into the full Minnesota Center for Twin and Family Research (MCTFR) sample with psychophysiological endophenotypes (N = 4,905) and conduct the same single variant and gene-based burden tests in this larger sample.
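
Illustrative sketch (not part of the article): the abstract above separates common from rare variants at a minor allele frequency (MAF) of .05 and describes gene-based tests that aggregate the SNPs in a gene into a single score. The Python below shows one simple way those two ideas could be expressed. The function names, the 0/1/2 genotype coding, and the count-based burden score are assumptions chosen for illustration; the article's actual pipeline and test statistics are not specified in the abstract and may well differ.

```python
from typing import Dict, List, Sequence

# Cutoff quoted in the abstract; everything else here is a hypothetical sketch.
COMMON_MAF_THRESHOLD = 0.05


def alt_allele_frequency(genotypes: Sequence[int]) -> float:
    """Frequency of the alternate allele, given per-person counts (0, 1 or 2)."""
    if not genotypes:
        raise ValueError("at least one genotype is required")
    return sum(genotypes) / (2 * len(genotypes))


def minor_allele_frequency(genotypes: Sequence[int]) -> float:
    """MAF is the frequency of whichever allele is less common."""
    freq = alt_allele_frequency(genotypes)
    return min(freq, 1.0 - freq)


def is_rare(genotypes: Sequence[int],
            threshold: float = COMMON_MAF_THRESHOLD) -> bool:
    """True when the variant falls below the common-variant cutoff."""
    return minor_allele_frequency(genotypes) < threshold


def burden_scores(gene_variants: Dict[str, Sequence[int]],
                  threshold: float = COMMON_MAF_THRESHOLD) -> List[int]:
    """Per-person count of minor alleles summed over the rare variants in a gene.

    This is a plain collapsing score, assumed here for illustration only.
    """
    n_people = len(next(iter(gene_variants.values())))
    scores = [0] * n_people
    for genotypes in gene_variants.values():
        if len(genotypes) != n_people:
            raise ValueError("all variants must cover the same individuals")
        if not is_rare(genotypes, threshold):
            continue  # common variants are left out of the burden score
        alt_is_minor = alt_allele_frequency(genotypes) <= 0.5
        for i, g in enumerate(genotypes):
            # Count minor alleles, flipping the coding if alt is the major allele.
            scores[i] += g if alt_is_minor else 2 - g
    return scores


# Toy data for 20 hypothetical individuals and three made-up variant IDs.
toy_gene = {
    "variant_a": [0] * 19 + [1],   # 1 of 40 alleles, rare
    "variant_b": [1] + [0] * 19,   # 1 of 40 alleles, rare
    "variant_c": [1] * 20,         # 20 of 40 alleles, common
}
toy_scores = burden_scores(toy_gene)
```

In an association analysis, per-person scores of this kind would then be tested against each phenotype, for example by regression, which is a step this sketch leaves out.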

28 citations