A large-scale transcriptome-wide association study (TWAS) of ten blood cell phenotypes reveals complexities of TWAS fine-mapping

23 Feb 2021 - bioRxiv (Cold Spring Harbor Laboratory)

Abstract

Hematological measures are important intermediate clinical phenotypes for many acute and chronic diseases. Hematological measures are highly heritable, and although genome-wide association studies (GWAS) have identified thousands of loci containing trait-associated variants, the causal genes underlying these associations are often uncertain. To better understand the underlying genetic regulatory mechanisms, we performed a transcriptome-wide association study (TWAS) using PrediXcan to systematically investigate the association between genetically-predicted gene expression and hematological measures in 54,542 individuals of European ancestry from the Genetic Epidemiology Research on Adult Health and Aging (GERA) cohort. We found 239 significant gene-trait associations with hematological measures. Among this set of 239 associations, we replicated 71 at p < 0.05 with the same direction of effect for the blood cell trait in a meta-analysis of TWAS results consisting of up to 35,900 European ancestry individuals from the Women's Health Initiative (WHI), the Atherosclerosis Risk in Communities Study (ARIC), and BioMe Biobank. We further attempted to refine this list of candidate genes by performing conditional analyses, adjusting for individual variants previously associated with these hematological measures, and performed further fine-mapping of TWAS loci. To assist with the interpretation of TWAS findings, we designed an R Shiny application to interactively visualize TWAS results, one genomic locus at a time, by integrating our TWAS results with additional genetic data sources (GWAS, TWAS from other gene expression reference panels, conditional analyses, known GWAS variants, etc.). Our results and R Shiny application highlight frequently overlooked challenges with TWAS and illustrate the complexity of TWAS fine-mapping efforts.

Author Summary

Transcriptome-wide association studies (TWAS) have shown great promise in furthering our understanding of the genetic regulatory mechanisms underlying complex trait variation. However, interpreting TWAS results can be incredibly complex, especially in large-scale analyses where hundreds of signals appear throughout the genome, with multiple genes often identified in a single chromosomal region. Our research demonstrates this complexity through real data examples from our analysis of hematological traits, and we provide a useful web application to visualize TWAS results in a broadly approachable format. Together, our results and web application illustrate the importance of interpreting TWAS studies in context and highlight the need to carefully examine results in a region-wide context to draw reasonable conclusions and formulate mechanistic hypotheses.

Summary

Introduction

  • Hematological measures (red cell, white cell, and platelet traits) have a critical role in oxygen transport, immunity, infection, thrombosis, and hemostasis and are associated with many acute and chronic diseases, including autoimmunity, asthma, cardiovascular disease, and COVID-19 [1-5].
  • Individual SNP-based GWAS make it difficult to identify regulatory variants with small effect sizes that in aggregate affect the same gene, even in very large samples, and they identify regions of associated variants whose biological function is often unclear [6].
  • Hematological phenotypes are particularly good candidates for TWAS analysis due to the availability of large RNA-sequencing datasets in a relevant tissue type, high heritability across traits, and the large number of known genetic associations, most with poorly understood mechanisms and target genes.

Results

  • The authors applied the PrediXcan method to identify expression-trait associations using individual level genotype and phenotype data from the GERA non-Hispanic white ethnic group.
  • Marginal TWAS results are displayed in (A); genes and variants colored black are those previously reported by GWAS, and blue variants are those not previously reported as GWAS sentinel variants.
  • (B) is a mirrored-Manhattan locus-zoom plot displaying genes connected to their predictive model variants with TWAS results (top panel) and GWAS results .

Discussion

  • The authors performed a large-scale TWAS using PrediXcan on 54,542 GERA individuals of European ancestry and present compelling evidence that results from marginal TWAS analyses alone cannot illuminate causal genes at loci for complex traits.
  • Additionally, results for these two genes differ slightly by reference panel.
  • While the authors show that TWAS may help in some cases to pinpoint likely causal genes, they emphasize the need for investigators to interpret TWAS results carefully and in context, rather than alone.

Materials and Methods

  • Genotyping was performed using the Illumina GSA array (~640,000 variants) and genotype data were imputed using the “1000G Phase 3 v5” reference panel.
  • In total, 8,455 European ancestry BioMe participants with hematological phenotypes were included in the analysis.
  • To replicate the conditionally significant gene-trait associations, the authors tested each association via a meta-analysis of the ARIC, WHI, and BioMe cohorts.
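
The replication step above can be illustrated with a minimal sketch. It assumes an inverse-variance-weighted fixed-effect meta-analysis across the three cohorts; the weighting scheme, the function names, and the toy effect sizes are illustrative assumptions, not the authors' pipeline. The replication rule (p < 0.05 with the same direction of effect as the discovery result) follows the abstract.

```python
import math

def fixed_effect_meta(betas, ses):
    """Inverse-variance-weighted fixed-effect meta-analysis (assumed scheme)."""
    weights = [1.0 / se ** 2 for se in ses]
    beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    z = beta / se
    # Two-sided p-value under a standard normal null.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, p

def replicates(discovery_beta, meta_beta, meta_p, alpha=0.05):
    """Replication rule: nominal significance and a concordant effect direction."""
    same_direction = discovery_beta * meta_beta > 0
    return same_direction and meta_p < alpha

# Hypothetical per-cohort TWAS effects for one gene-trait pair (ARIC, WHI, BioMe).
cohort_betas = [0.10, 0.08, 0.12]
cohort_ses = [0.04, 0.05, 0.06]
meta_beta, meta_se, meta_p = fixed_effect_meta(cohort_betas, cohort_ses)
print(replicates(0.15, meta_beta, meta_p))
```

A sample-size-weighted scheme would be an equally plausible choice here; only the direction check and the p-value threshold are taken from the text.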

FOCUS:

  • The authors used the Fine-mapping Of CaUsal gene Sets (FOCUS) [15] software to fine-map TWAS statistics at genomic risk regions.
  • … has not yet been assigned to a locus and continue in this fashion until all statistically significant TWAS genes have been assigned to a locus.
  • The authors excluded those without a valid date of blood cell count measurement, with age < 18 years, or with discordant genotypic and phenotypic sex, as well as .
  • Specifically, the authors computed the eigenvalue decomposition on the GERA sample outside of the cpgen script (for each phenotype), and then subsequently loaded the appropriate eigenvectors and eigenvalues into the program, modifying the script so that it could take these eigenvectors and eigenvalues as input.
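
The iterative locus assignment mentioned in the list above ("continue in this fashion until all statistically significant TWAS genes have been assigned to a locus") can be sketched as a greedy procedure. The source text is truncated, so the details here are assumptions: the most significant unassigned gene is taken as the sentinel, and a locus is defined as a fixed window of ±1 Mb around it. The window size, gene names, and record layout are hypothetical.

```python
def assign_loci(genes, window=1_000_000):
    """Greedily assign significant TWAS genes to loci.

    genes: list of dicts with keys 'name', 'chrom', 'pos', 'p'.
    window: half-width of a locus in base pairs (assumed value).
    Returns a dict mapping gene name to a 1-based locus index.
    """
    assignment = {}
    locus_id = 0
    # Visit genes from most to least significant.
    for sentinel in sorted(genes, key=lambda g: g["p"]):
        if sentinel["name"] in assignment:
            continue  # already absorbed into an earlier locus
        locus_id += 1
        for gene in genes:
            if (gene["name"] not in assignment
                    and gene["chrom"] == sentinel["chrom"]
                    and abs(gene["pos"] - sentinel["pos"]) <= window):
                assignment[gene["name"]] = locus_id
    return assignment

# Hypothetical significant genes.
genes = [
    {"name": "GENE_A", "chrom": 1, "pos": 1_000_000, "p": 1e-20},
    {"name": "GENE_B", "chrom": 1, "pos": 1_500_000, "p": 1e-8},
    {"name": "GENE_C", "chrom": 1, "pos": 5_000_000, "p": 1e-10},
    {"name": "GENE_D", "chrom": 2, "pos": 1_200_000, "p": 1e-9},
]
loci = assign_loci(genes)
print(loci)
```

The loop terminates once every gene has a locus index, matching the stopping condition stated in the text.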

Full title: A large-scale transcriptome-wide association study (TWAS) of ten blood cell phenotypes reveals complexities of TWAS fine-mapping
Short title: TWAS fine-mapping of blood cell phenotypes
Authors
Amanda L Tapia MS (1), Bryce T Rowland BS (1), Jonathan D Rosen MS (1), Michael Preuss PhD (2), Kris Young PhD (3), Misa Graff PhD (3), Hélène Choquet PhD (4), David J Couper PhD (1), Steve Buyske PhD (5), Stephanie A Bien PhD (6), Eric Jorgenson PhD (4), Charles Kooperberg PhD (6), Ruth J.F. Loos PhD (2), Alanna C Morrison PhD (7), Kari E North PhD (3), Bing Yu PhD (7), Alexander P Reiner MD (8), Yun Li PhD (9,1,10)*, Laura M Raffield PhD (9)*
*Contributed equally to this work
(1) Department of Biostatistics, University of North Carolina, Chapel Hill, NC, USA
(2) The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
(3) Department of Epidemiology, University of North Carolina, Chapel Hill, NC, USA
(4) Division of Research, Kaiser Permanente Northern California, Oakland, CA, USA
(5) Department of Statistics, Rutgers University, Piscataway, NJ, USA
(6) Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
(7) Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
(8) Department of Epidemiology, University of Washington, Seattle, WA, USA
(9) Department of Genetics, University of North Carolina, Chapel Hill, NC, USA
(10) Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Correspondence: Laura M. Raffield, PhD
Assistant Professor, Department of Genetics
University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599
laura_raffield@unc.edu
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.23.432444; this version posted February 23, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

Introduction
Hematological measures (red cell, white cell, and platelet traits) have a critical role in
oxygen transport, immunity, infection, thrombosis, and hemostasis and are associated with
many acute and chronic diseases, including autoimmunity, asthma, cardiovascular disease, and
COVID-19 [1-5]. Genome-wide association studies (GWAS) have identified thousands of loci
containing such trait-associated variants, and previous Mendelian randomization and
phenome-wide association study analyses have highlighted the likely causal role of blood cell
trait-associated genetic variants in a variety of disorders, including autoimmune conditions and
coronary heart disease [1-3]. Unfortunately, these individual SNP-based GWAS make it difficult
to identify regulatory variants with small effect sizes which in aggregate impact the same gene,
even in very large sample sizes, and they identify regions of associated variants whose
biological function is often not clear [6]. Thus, utilizing a gene-based method to aggregate the
effect of multiple regulatory variants may increase the study power to identify novel
trait-associated loci and elucidate mechanisms of biological function.
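
To make the idea of aggregating regulatory variants concrete, the following minimal sketch shows a two-step gene-based test in the spirit of a PrediXcan-style analysis: allele dosages are combined into a genetically predicted expression value using per-variant weights, and the phenotype is then regressed on that predicted value. The weights, genotypes, and phenotype values are toy numbers, and the function names are hypothetical; a real analysis would use weights trained in an expression reference panel and would adjust for covariates.

```python
import math

def predict_expression(genotypes, weights):
    """Predicted expression per individual: weighted sum of allele dosages."""
    return [sum(w * g for w, g in zip(weights, row)) for row in genotypes]

def simple_regression(x, y):
    """Ordinary least squares of y on x; returns (slope, standard error, z)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, se, slope / se

# Toy data: six individuals, three variants in one gene's prediction model.
genotypes = [
    [0, 1, 2],
    [1, 0, 1],
    [2, 2, 0],
    [0, 0, 1],
    [1, 2, 2],
    [2, 1, 0],
]
weights = [0.5, -0.2, 0.1]                   # hypothetical model weights
phenotype = [0.1, 0.7, 0.4, 0.2, 0.3, 1.0]   # hypothetical trait values

expression = predict_expression(genotypes, weights)
slope, se, z = simple_regression(expression, phenotype)
print(slope, se, z)
```

Because several small-effect variants are collapsed into one predictor per gene, a single test per gene replaces many single-variant tests, which is the source of the potential power gain described above.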
A transcriptome-wide association study (TWAS) is one gene-based method which
systematically investigates the association between genetically predicted gene expression and
phenotypes of interest [6-9]. Here, we report results from a large TWAS of hematological
measures using the PrediXcan method [6] to analyze data from 54,542 individuals of European
ancestry from the Genetic Epidemiology Research on Adult Health and Aging (GERA) cohort
(our discovery data set) [10] [11]. Hematological phenotypes are particularly good candidates
for TWAS analysis due to the availability of large RNA-sequencing datasets in a relevant tissue
type, high heritability across traits, and the large number of known genetic associations, most
with poorly understood mechanisms and target genes. We performed this analysis using whole
blood RNA-sequencing data from 922 individuals in the Depression Genes and Networks (DGN) [12]
study as our primary reference panel. After association analysis of imputed gene transcript
levels with hematological indices in GERA, we performed conditional analyses, adjusting for
variants previously identified to affect hematological measures, to evaluate if TWAS-identified
genes represented novel statistical signals or were primarily driven by variants known from
GWAS [3]. These direct conditional analyses represent a major advantage of the use of
individual level data for our TWAS analyses, since these conditional tests could not be
performed as easily or accurately using summary statistic-based methods. We replicated our
significant set of gene-trait associations in a meta-analyzed sample of TWAS results containing
18,100 individuals from the Women’s Health Initiative (WHI), 9,345 individuals from the
Atherosclerosis Risk in Communities Study (ARIC), and 8,455 individuals from Mount Sinai
BioMe Biobank (BioMe), all of European ancestry (Supplementary Table 1). We also compared
the TWAS results from the primary DGN reference panel to three additional reference panels
(whole blood and Epstein-Barr virus (EBV) transformed lymphocytes from the Genotype-Tissue
Expression (GTEx) Project [13], and monocytes from the Multi-Ethnic Study of Atherosclerosis
(MESA) [14]); these are considered secondary reference panels due to their smaller sample
sizes. These additional analyses helped us to determine if relevant tissues with smaller sample
sizes support our primary TWAS findings with DGN.
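
The conditional analyses described in the paragraph above can be illustrated with a small sketch: the phenotype is regressed on predicted expression with and without adjustment for the dosage of a known GWAS variant. The toy data are constructed so that the trait is driven entirely by the known variant, so the marginal TWAS effect is sizeable while the conditional effect vanishes. All names and values are hypothetical, and the sketch reports point estimates only, not the test statistics a real analysis would use.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols_coefficients(columns, y):
    """Least-squares coefficients of y on an intercept plus predictor columns."""
    design = [[1.0] * len(y)] + columns
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in design] for ci in design]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in design]
    return solve(xtx, xty)

def expression_effect(expression, known_variants, phenotype):
    """Effect of predicted expression, adjusted for known variant dosages."""
    return ols_coefficients([expression] + known_variants, phenotype)[1]

# Hypothetical data: expression is correlated with a known GWAS variant,
# and the trait depends only on that variant.
variant = [0, 1, 2, 0, 1, 2, 0, 1]
expression = [0.1, 0.9, 2.2, -0.1, 1.1, 1.8, 0.0, 1.0]
phenotype = [0.5 * v for v in variant]

marginal = expression_effect(expression, [], phenotype)
conditional = expression_effect(expression, [variant], phenotype)
print(marginal, conditional)
```

A gene whose association disappears after adjustment, as in this toy case, would be interpreted as driven by the known variant rather than as a novel signal.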
We employed several strategies to improve our understanding and interpretation of
complex genomic regions containing multiple TWAS-identified genes. First, we used FOCUS
(fine-mapping of causal gene sets [15]) in an attempt to identify a set of causal genes within
genomic loci containing multiple significant TWAS gene-trait associations. FOCUS is software
that fine-maps TWAS statistics at genomic risk regions while accounting for linkage disequilibrium
(LD) among variants and predicted expression correlation among genes at those risk regions.
Second, we present a novel web-based tool for integrating and visualizing TWAS and GWAS
results, as well as results from multiple expression reference datasets. Additionally, we discuss
frequently overlooked challenges of TWAS interpretation, such as failure to consider the
number of proximal genes which cannot be accurately imputed with a given reference panel,
but which may still be influenced by variants identified in GWAS. Our results illustrate

References

Journal ArticleDOI
23 Jan 2015-Science
Abstract: Resolving the molecular details of proteome variation in the different tissues and organs of the human body will greatly increase our knowledge of human biology and disease. Here, we present a map of the human tissue proteome based on an integrated omics approach that involves quantitative transcriptomics at the tissue and organ level, combined with tissue microarray-based immunohistochemistry, to achieve spatial localization of proteins down to the single-cell level. Our tissue-based analysis detected more than 90% of the putative protein-coding genes. We used this approach to explore the human secretome, the membrane proteome, the druggable proteome, the cancer proteome, and the metabolic functions in 32 different tissues and organs. All the data are integrated in an interactive Web-based database that allows exploration of individual proteins, as well as navigation of global expression patterns, in all major tissues and organs in the human body.

6,953 citations


Journal ArticleDOI
John T. Lonsdale, Jeffrey Thomas, Mike Salvatore, Rebecca Phillips, Edmund Lo, Saboor Shad, Richard Hasz, Gary Walters, Fernando U. Garcia1, Nancy Young2, Barbara A. Foster3, Mike Moser3, Ellen Karasik3, Bryan Gillard3, Kimberley Ramsey3, Susan L. Sullivan, Jason Bridge, Harold Magazine, John Syron, Johnelle Fleming, Laura A. Siminoff4, Heather M. Traino4, Maghboeba Mosavel4, Laura Barker4, Scott D. Jewell5, Daniel C. Rohrer5, Dan Maxim5, Dana Filkins5, Philip Harbach5, Eddie Cortadillo5, Bree Berghuis5, Lisa Turner5, Eric Hudson5, Kristin Feenstra5, Leslie H. Sobin6, James A. Robb6, Phillip Branton, Greg E. Korzeniewski6, Charles Shive6, David Tabor6, Liqun Qi6, Kevin Groch6, Sreenath Nampally6, Steve Buia6, Angela Zimmerman6, Anna M. Smith6, Robin Burges6, Karna Robinson6, Kim Valentino6, Deborah Bradbury6, Mark Cosentino6, Norma Diaz-Mayoral6, Mary Kennedy6, Theresa Engel6, Penelope Williams6, Kenyon Erickson, Kristin G. Ardlie7, Wendy Winckler7, Gad Getz7, Gad Getz8, David S. DeLuca7, MacArthur Daniel MacArthur8, MacArthur Daniel MacArthur7, Manolis Kellis7, Alexander Thomson7, Taylor Young7, Ellen Gelfand7, Molly Donovan7, Yan Meng7, George B. Grant7, Deborah C. Mash9, Yvonne Marcus9, Margaret J. Basile9, Jun Liu8, Jun Zhu10, Zhidong Tu10, Nancy J. Cox11, Dan L. Nicolae11, Eric R. Gamazon11, Hae Kyung Im11, Anuar Konkashbaev11, Jonathan K. Pritchard12, Jonathan K. Pritchard11, Matthew Stevens11, Timothée Flutre11, Xiaoquan Wen11, Emmanouil T. Dermitzakis13, Tuuli Lappalainen13, Roderic Guigó, Jean Monlong, Michael Sammeth, Daphne Koller14, Alexis Battle14, Sara Mostafavi14, Mark I. McCarthy15, Manual Rivas15, Julian Maller15, Ivan Rusyn16, Andrew B. Nobel16, Fred A. Wright16, Andrey A. Shabalin16, Mike Feolo17, Nataliya Sharopova17, Anne Sturcke17, Justin Paschal17, James M. Anderson17, Elizabeth L. Wilder17, Leslie Derr17, Eric D. Green17, Jeffery P. Struewing17, Gary F. Temple17, Simona Volpi17, Joy T. Boyer17, Elizabeth J. Thomson17, Mark S. 
Guyer17, Cathy Ng17, Assya Abdallah17, Deborah Colantuoni17, Thomas R. Insel17, Susan E. Koester17, Roger Little17, Patrick Bender17, Thomas Lehner17, Yin Yao17, Carolyn C. Compton17, Jimmie B. Vaught17, Sherilyn Sawyer17, Nicole C. Lockhart17, Joanne P. Demchok17, Helen F. Moore17 
TL;DR: The Genotype-Tissue Expression (GTEx) project is described, which will establish a resource database and associated tissue bank for the scientific community to study the relationship between genetic variation and gene expression in human tissues.
Abstract: Genome-wide association studies have identified thousands of loci for common diseases, but, for the majority of these, the mechanisms underlying disease susceptibility remain unknown. Most associated variants are not correlated with protein-coding changes, suggesting that polymorphisms in regulatory regions probably contribute to many disease phenotypes. Here we describe the Genotype-Tissue Expression (GTEx) project, which will establish a resource database and associated tissue bank for the scientific community to study the relationship between genetic variation and gene expression in human tissues.

4,930 citations


Journal ArticleDOI
TL;DR: METAL provides a computationally efficient tool for meta-analysis of genome-wide association scans, which is a commonly used approach for improving power complex traits gene mapping studies.
Abstract: Summary: METAL provides a computationally efficient tool for meta-analysis of genome-wide association scans, which is a commonly used approach for improving power complex traits gene mapping studies. METAL provides a rich scripting interface and implements efficient memory management to allow analyses of very large data sets and to support a variety of input file formats. Availability and implementation: METAL, including source code, documentation, examples, and executables, is available at http://www.sph.umich.edu/csg/abecasis/metal/ Contact: ude.hcimu@olacnog

3,195 citations


Journal ArticleDOI
Garnet L. Anderson1, S. Cummings1, L. S. Freedman1, C. Furberg1, Maureen M. Henderson1, Susan R. Johnson1, L. Kuller1, JoAnn E. Manson1, A. Oberman1, Ross L. Prentice1, Jacques E. Rossouw1, L. Finnegan1, R. Hiatt1, L. Pottern1, J. McGowan1, C. Clifford1, B. Caan1, V. Kipnis1, B. Ettinger1, S. Sidney1, G. Bailey1, Andrea Z. LaCroix1, Anne McTiernan1, Deborah J. Bowen1, C. Chen1, Barbara B. Cochrane1, Julie R. Hunt1, Alan R. Kristal1, Brian J. Lund1, Ruth E. Patterson1, Jeffrey L. Probstfield1, Lesley F. Tinker1, Nicole Urban1, Ching Yun Wang1, Emily White1, J. M. Kotchen1, S. Shumaker1, P. Rautaharju1, F. Rautaharju1, E. Stein1, P. Laskarzewski1, P. Steiner1, K. Sagar1, M. Nevitt1, M. Dockrell1, T. Fuerst1, John H. Himes1, M. Stevens1, F. Cammarata1, S. Lindenfelser1, Bruce M. Psaty1, D. Siscovick1, W. Longstreth1, S. Heckbert1, S. Wassertheil-Smoller1, W. Frishman1, Judy Wylie-Rosett1, D. Barad1, R. Freeman1, S. Miller1, Jennifer Hays1, R. Young1, C. Crowley1, M. A. DePoe1, G. Burke1, E. Paskett1, L. Wagenknecht1, R. Crouse1, L. Parsons1, T. Kotchen1, E. Braunwald1, J. Buring1, C. Hennekens1, J. M. Gaziano1, Annlouise R. Assaf1, R. C. Carleton1, M. Miller1, C. Wheeler1, A. Hume1, M. Pedersen1, O. Strickland1, M. Huber1, V. Porter1, Shirley A.A. Beresford1, V. Taylor1, N. Woods1, J. Hsia1, V. Barnabei1, M. Bovun1, Rowan T. Chlebowski1, R. Detrano1, A. Nelson1, J. Heiner1, S. Pushkin1, B. Valanis1, V. Stevens1, E. Whitlock1, N. Karanja1, A. Clark1 
TL;DR: The rationale for the interventions being studied in each of the CT components and for the inclusion of the OS component is described, including a brief description of the scientific and logistic complexity of the WHI.
Abstract: The Women's Health Initiative (WHI) is a large and complex clinical investigation of strategies for the prevention and control of some of the most common causes of morbidity and mortality among postmenopausal women, including cancer, cardiovascular disease, and osteoporotic fractures. The WHI was initiated in 1992, with a planned completion date of 2007. Postmenopausal women ranging in age from 50 to 79 are enrolled at one of 40 WHI clinical centers nationwide into either a clinical trial (CT) that will include about 64,500 women or an observational study (OS) that will include about 100,000 women. The CT is designed to allow randomized controlled evaluation of three distinct interventions: a low-fat eating pattern, hypothesized to prevent breast cancer and colorectal cancer and, secondarily, coronary heart disease; hormone replacement therapy, hypothesized to reduce the risk of coronary heart disease and other cardiovascular diseases and, secondarily, to reduce the risk of hip and other fractures, with increased breast cancer risk as a possible adverse outcome; and calcium and vitamin D supplementation, hypothesized to prevent hip fractures and, secondarily, other fractures and colorectal cancer. Overall benefit-versus-risk assessment is a central focus in each of the three CT components. Women are screened for participation in one or both of the components--dietary modification (DM) or hormone replacement therapy (HRT)--of the CT, which will randomize 48,000 and 27,500 women, respectively. Women who prove to be ineligible for, or who are unwilling to enroll in, these CT components are invited to enroll in the OS. At their 1-year anniversary of randomization, CT women are invited to be further randomized into the calcium and vitamin D (CaD) trial component, which is projected to include 45,000 women. The average follow-up for women in either CT or OS is approximately 9 years. 
Concerted efforts are made to enroll women of racial and ethnic minority groups, with a target of 20% of overall enrollment in both the CT and OS. This article gives a brief description of the rationale for the interventions being studied in each of the CT components and for the inclusion of the OS component. Some detail is provided on specific study design choices, including eligibility criteria, recruitment strategy, and sample size, with attention to the partial factorial design of the CT. Some aspects of the CT monitoring approach are also outlined. The scientific and logistic complexity of the WHI implies particular leadership and management challenges. The WHI organization and committee structure employed to respond to these challenges is also briefly described.

2,110 citations


Journal ArticleDOI
Shane A. McCarthy1, Sayantan Das2, Warren W. Kretzschmar3, Olivier Delaneau4, Andrew R. Wood5, Alexander Teumer6, Hyun Min Kang2, Christian Fuchsberger2, Petr Danecek1, Kevin Sharp3, Yang Luo1, C Sidore7, Alan Kwong2, Nicholas J. Timpson8, Seppo Koskinen, Scott I. Vrieze9, Laura J. Scott2, He Zhang2, Anubha Mahajan3, Jan H. Veldink, Ulrike Peters10, Ulrike Peters11, Carlos N. Pato12, Cornelia M. van Duijn13, Christopher E. Gillies2, Ilaria Gandin14, Massimo Mezzavilla, Arthur Gilly1, Massimiliano Cocca14, Michela Traglia, Andrea Angius7, Jeffrey C. Barrett1, D.I. Boomsma15, Kari Branham2, Gerome Breen16, Gerome Breen17, Chad M. Brummett2, Fabio Busonero7, Harry Campbell18, Andrew T. Chan19, Sai Chen2, Emily Y. Chew20, Francis S. Collins20, Laura J Corbin8, George Davey Smith8, George Dedoussis21, Marcus Dörr6, Aliki-Eleni Farmaki21, Luigi Ferrucci20, Lukas Forer22, Ross M. Fraser2, Stacey Gabriel23, Shawn Levy, Leif Groop24, Leif Groop25, Tabitha A. Harrison10, Andrew T. Hattersley5, Oddgeir L. Holmen26, Kristian Hveem26, Matthias Kretzler2, James Lee27, Matt McGue28, Thomas Meitinger29, David Melzer5, Josine L. Min8, Karen L. Mohlke30, John B. Vincent31, Matthias Nauck6, Deborah A. Nickerson11, Aarno Palotie23, Aarno Palotie19, Michele T. Pato12, Nicola Pirastu14, Melvin G. McInnis2, J. Brent Richards17, J. Brent Richards32, Cinzia Sala, Veikko Salomaa, David Schlessinger20, Sebastian Schoenherr22, P. Eline Slagboom33, Kerrin S. Small17, Tim D. Spector17, Dwight Stambolian34, Marcus A. Tuke5, Jaakko Tuomilehto, Leonard H. van den Berg, Wouter van Rheenen, Uwe Völker6, Cisca Wijmenga35, Daniela Toniolo, Eleftheria Zeggini1, Paolo Gasparini14, Matthew G. Sampson2, James F. Wilson18, Timothy M. Frayling5, Paul I.W. de Bakker36, Morris A. Swertz35, Steven A. McCarroll19, Charles Kooperberg10, Annelot M. Dekker, David Altshuler, Cristen J. Willer2, William G. 
Iacono28, Samuli Ripatti25, Nicole Soranzo27, Nicole Soranzo1, Klaudia Walter1, Anand Swaroop20, Francesco Cucca7, Carl A. Anderson1, Richard M. Myers, Michael Boehnke2, Mark I. McCarthy3, Mark I. McCarthy37, Richard Durbin1, Gonçalo R. Abecasis2, Jonathan Marchini3 
TL;DR: A reference panel of 64,976 human haplotypes at 39,235,157 SNPs constructed using whole-genome sequence data from 20 studies of predominantly European ancestry leads to accurate genotype imputation at minor allele frequencies as low as 0.1% and a large increase in the number of SNPs tested in association studies.
Abstract: We describe a reference panel of 64,976 human haplotypes at 39,235,157 SNPs constructed using whole-genome sequence data from 20 studies of predominantly European ancestry. Using this resource leads to accurate genotype imputation at minor allele frequencies as low as 0.1% and a large increase in the number of SNPs tested in association studies, and it can help to discover and refine causal loci. We describe remote server resources that allow researchers to carry out imputation and phasing consistently and efficiently.

1,574 citations


Frequently Asked Questions (2)
Joint/multiple tissue TWAS approaches such as UTMOST [ 9 ] and MR-JTI [ 8 ] could be employed in the future to assess the relevance of other tissues at blood-cell related loci.