Author

Robert M. Plenge

Other affiliations: Broad Institute, Celgene, Merck & Co.
Bio: Robert M. Plenge is an academic researcher from Brigham and Women's Hospital. The author has contributed to research in topics: Genome-wide association study & Population. The author has an h-index of 69 and has co-authored 145 publications receiving 31,724 citations. Previous affiliations of Robert M. Plenge include Broad Institute & Celgene.


Papers
Journal Article
TL;DR: This work describes a method that enables explicit detection and correction of population stratification on a genome-wide scale and uses principal components analysis to explicitly model ancestry differences between cases and controls.
Abstract: Population stratification (allele frequency differences between cases and controls due to systematic ancestry differences) can cause spurious associations in disease studies [1-8]. We describe a method that enables explicit detection and correction of population stratification on a genome-wide scale. Our method uses principal components analysis to explicitly model ancestry differences between cases and controls. The resulting correction is specific to a candidate marker's variation in frequency across ancestral populations, minimizing spurious associations while maximizing power to detect true associations. Our simple, efficient approach can easily be applied to disease studies with hundreds of thousands of markers.
Because the effects of stratification vary in proportion to the number of samples [9], stratification will be an increasing problem in the large-scale association studies of the future, which will analyze thousands of samples in an effort to detect common genetic variants of weak effect. The two prevailing methods for dealing with stratification are genomic control and structured association [9-14]. Although genomic control and structured association have proven useful in a variety of contexts, they have limitations. Genomic control corrects for stratification by adjusting association statistics at each marker by a uniform overall inflation factor. However, some markers differ in their allele frequencies across ancestral populations more than others. Thus, the uniform adjustment applied by genomic control may be insufficient at markers having unusually strong differentiation across ancestral populations and may be superfluous at markers devoid of such differentiation, leading to a loss in power.
Structured association uses a program such as STRUCTURE [15] to assign the samples to discrete subpopulation clusters and then aggregates evidence of association within each cluster. If fractional membership in more than one cluster is allowed, the method cannot currently be applied to genome-wide association studies because of its intensive computational cost on large data sets. Furthermore, assignments of individuals to clusters are highly sensitive to the number of clusters, which is not well defined [14,16].
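The marker-specific correction described in this abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the toy data, the use of a single ancestry axis, the omission of per-marker normalization, and the statistic (N - K - 1) times the squared correlation are all assumptions made for illustration. The idea shown is that one axis of variation is inferred from a marker panel, and both the candidate genotype and the phenotype are adjusted along that axis before testing.

```python
import math


def center(values):
    """Subtract the mean from a list of numbers."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def top_ancestry_axis(markers, iterations=200):
    """Leading eigenvector of the sample-by-sample covariance (power iteration).

    `markers` holds one row of 0/1/2 genotypes per marker (hypothetical panel).
    """
    rows = [center(row) for row in markers]
    n = len(rows[0])
    cov = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    axis = [math.sin(i + 1.0) for i in range(n)]  # arbitrary deterministic start
    for _ in range(iterations):
        axis = [sum(cov[i][j] * axis[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(a * a for a in axis))
        if norm == 0.0:
            raise ValueError("marker panel has no variation")
        axis = [a / norm for a in axis]
    return axis


def remove_axis(values, axis):
    """Center `values`, then subtract their projection onto the unit-length axis."""
    centered = center(values)
    load = sum(c * a for c, a in zip(centered, axis))
    return [c - load * a for c, a in zip(centered, axis)]


def association_stat(genotype, phenotype, axis=None):
    """(N - K - 1) * r^2, where K is the number of axes removed (0 or 1 here)."""
    g, y = center(genotype), center(phenotype)
    k = 0
    if axis is not None:
        g, y, k = remove_axis(g, axis), remove_axis(y, axis), 1
    sgg = sum(a * a for a in g)
    syy = sum(b * b for b in y)
    sgy = sum(a * b for a, b in zip(g, y))
    if sgg < 1e-12 or syy < 1e-12:
        return 0.0  # nothing left to test after adjustment
    return (len(g) - k - 1) * sgy * sgy / (sgg * syy)


# Toy data (assumed): samples 0-3 are one population, samples 4-7 another.
panel = [[0, 0, 0, 0, 2, 2, 2, 2]] * 3 + [[0, 1, 0, 1, 0, 1, 0, 1]]
candidate = [0, 0, 1, 1, 1, 1, 2, 2]   # allele frequency differs by population
phenotype = [0, 0, 0, 1, 1, 1, 1, 0]   # more cases in the second population

axis = top_ancestry_axis(panel)
naive = association_stat(candidate, phenotype)
adjusted = association_stat(candidate, phenotype, axis)
print(naive, adjusted)
```

In this toy data the candidate marker is unrelated to the phenotype within each population, so the unadjusted statistic reflects ancestry alone and the adjusted statistic drops to essentially zero. A marker with no frequency difference between populations would be left almost unchanged, which is the contrast with a uniform inflation factor that the abstract draws.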

9,387 citations

Journal Article
Yukinori Okada1, Yukinori Okada2, Di Wu2, Di Wu1, Di Wu3, Gosia Trynka2, Gosia Trynka1, Towfique Raj2, Towfique Raj1, Chikashi Terao4, Katsunori Ikari, Yuta Kochi, Koichiro Ohmura4, Akari Suzuki, Shinji Yoshida, Robert R. Graham5, A. Manoharan5, Ward Ortmann5, Tushar Bhangale5, Joshua C. Denny6, Robert J. Carroll6, Anne E. Eyler6, Jeff Greenberg7, Joel M. Kremer, Dimitrios A. Pappas8, Lei Jiang9, Jian Yin9, Lingying Ye9, Ding Feng Su9, Jian Yang10, Gang Xie11, E.C. Keystone11, Harm-Jan Westra12, Tõnu Esko2, Tõnu Esko1, Tõnu Esko13, Andres Metspalu13, Xuezhong Zhou14, Namrata Gupta1, Daniel B. Mirel1, Eli A. Stahl15, Dorothee Diogo2, Dorothee Diogo1, Jing Cui2, Jing Cui1, Katherine P. Liao1, Katherine P. Liao2, Michael H. Guo2, Michael H. Guo1, Keiko Myouzen, Takahisa Kawaguchi4, Marieke J H Coenen16, Piet L. C. M. van Riel16, Mart A F J van de Laar17, Henk-Jan Guchelaar18, Tom W J Huizinga18, Philippe Dieudé19, Xavier Mariette20, S. Louis Bridges21, Alexandra Zhernakova12, Alexandra Zhernakova18, René E. M. Toes18, Paul P. Tak22, Paul P. Tak23, Paul P. Tak24, Corinne Miceli-Richard20, So Young Bang25, Hye Soon Lee25, Javier Martin26, Miguel A. Gonzalez-Gay, Luis Rodriguez-Rodriguez27, Solbritt Rantapää-Dahlqvist28, Lisbeth Ärlestig28, Hyon K. Choi29, Hyon K. Choi2, Yoichiro Kamatani30, Pilar Galan19, Mark Lathrop31, Steve Eyre32, Steve Eyre33, John Bowes32, John Bowes33, Anne Barton32, Niek de Vries22, Larry W. Moreland34, Lindsey A. Criswell35, Elizabeth W. Karlson2, Atsuo Taniguchi, Ryo Yamada4, Michiaki Kubo, Jun Liu2, Sang Cheol Bae25, Jane Worthington33, Jane Worthington32, Leonid Padyukov36, Lars Klareskog36, Peter K. Gregersen37, Soumya Raychaudhuri1, Soumya Raychaudhuri2, Barbara E. Stranger38, Philip L. De Jager1, Philip L. De Jager2, Lude Franke12, Peter M. Visscher10, Matthew A. Brown10, Hisashi Yamanaka, Tsuneyo Mimori4, Atsushi Takahashi, Huji Xu9, Timothy W. Behrens5, Katherine A. Siminovitch11, Shigeki Momohara, Fumihiko Matsuda4, Kazuhiko Yamamoto39, Robert M. Plenge1, Robert M. Plenge2
20 Feb 2014-Nature
TL;DR: A genome-wide association study meta-analysis in a total of >100,000 subjects of European and Asian ancestries provides empirical evidence that the genetics of RA can provide important information for drug discovery, and sheds light on fundamental genes, pathways and cell types that contribute to RA pathogenesis.
Abstract: A major challenge in human genetics is to devise a systematic strategy to integrate disease-associated variants with diverse genomic and biological data sets to provide insight into disease pathogenesis and guide drug discovery for complex traits such as rheumatoid arthritis (RA) [1]. Here we performed a genome-wide association study meta-analysis in a total of >100,000 subjects of European and Asian ancestries (29,880 RA cases and 73,758 controls), by evaluating ~10 million single-nucleotide polymorphisms. We discovered 42 novel RA risk loci at a genome-wide level of significance, bringing the total to 101 (refs 2, 3, 4). We devised an in silico pipeline using established bioinformatics methods based on functional annotation [5], cis-acting expression quantitative trait loci [6] and pathway analyses [7, 8, 9] (as well as novel methods based on genetic overlap with human primary immunodeficiency, haematological cancer somatic mutations and knockout mouse phenotypes) to identify 98 biological candidate genes at these 101 risk loci. We demonstrate that these genes are the targets of approved therapies for RA, and further suggest that drugs approved for other indications may be repurposed for the treatment of RA. Together, this comprehensive genetic study sheds light on fundamental genes, pathways and cell types that contribute to RA pathogenesis, and provides empirical evidence that the genetics of RA can provide important information for drug discovery.

1,910 citations

Journal Article
TL;DR: Seven new rheumatoid arthritis risk alleles were identified at genome-wide significance (P < 5 × 10^-8) in an analysis of all 41,282 samples, and an additional 11 SNPs replicated at P < 0.05, suggesting that most represent genuine rheumatoid arthritis risk alleles.
Abstract: To identify new genetic risk factors for rheumatoid arthritis, we conducted a genome-wide association study meta-analysis of 5,539 autoantibody-positive individuals with rheumatoid arthritis (cases) and 20,169 controls of European descent, followed by replication in an independent set of 6,768 rheumatoid arthritis cases and 8,806 controls. Of 34 SNPs selected for replication, 7 new rheumatoid arthritis risk alleles were identified at genome-wide significance (P < 5 × 10^-8) in an analysis of all 41,282 samples. The associated SNPs are near genes of known immune function, including IL6ST, SPRED2, RBPJ, CCR6, IRF5 and PXK. We also refined associations at two established rheumatoid arthritis risk loci (IL2RA and CCL21) and confirmed the association at AFF3. These new associations bring the total number of confirmed rheumatoid arthritis risk loci to 31 among individuals of European ancestry. An additional 11 SNPs replicated at P < 0.05, many of which are validated autoimmune risk alleles, suggesting that most represent genuine rheumatoid arthritis risk alleles.

1,277 citations

Journal Article
10 Dec 2010-Science
TL;DR: Differences in binding to viral peptide antigens by HLA may be the major factors underlying genetic differences between HIV controllers and progressors, and genome-wide association results implicate the nature of the HLA–viral peptide interaction as the major factor modulating durable control of HIV infection.
Abstract: Infectious and inflammatory diseases have repeatedly shown strong genetic associations within the major histocompatibility complex (MHC); however, the basis for these associations remains elusive. To define host genetic effects on the outcome of a chronic viral infection, we performed genome-wide association analysis in a multiethnic cohort of HIV-1 controllers and progressors, and we analyzed the effects of individual amino acids within the classical human leukocyte antigen (HLA) proteins. We identified >300 genome-wide significant single-nucleotide polymorphisms (SNPs) within the MHC and none elsewhere. Specific amino acids in the HLA-B peptide binding groove, as well as an independent HLA-C effect, explain the SNP associations and reconcile both protective and risk HLA alleles. These results implicate the nature of the HLA-viral peptide interaction as the major factor modulating durable control of HIV infection.

1,038 citations

Journal Article
TL;DR: A haplotype of STAT4 is associated with increased risk for both rheumatoid arthritis and systemic lupus erythematosus, suggesting a shared pathway for these illnesses.
Abstract: A SNP haplotype in the third intron of STAT4 was associated with susceptibility to both rheumatoid arthritis and systemic lupus erythematosus. The minor alleles of the haplotype-defining SNPs were present in 27% of chromosomes of patients with established rheumatoid arthritis, as compared with 22% of those of controls (for the SNP rs7574865, P = 2.81 × 10^-7; odds ratio for having the risk allele in chromosomes of patients vs. those of controls, 1.32). The association was replicated in Swedish patients with recent-onset rheumatoid arthritis (P = 0.02) and matched controls. The haplotype marked by rs7574865 was strongly associated with lupus, being present on 31% of chromosomes of case patients and 22% of those of controls (P = 1.87 × 10^-9; odds ratio for having the risk allele in chromosomes of patients vs. those of controls, 1.55). Homozygosity of the risk allele, as compared with absence of the allele, was associated with a more than doubled risk for lupus and a 60% increased risk for rheumatoid arthritis.
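As a rough check on how an allelic odds ratio relates to the chromosome frequencies quoted in this abstract, the arithmetic can be sketched as follows. The function name is hypothetical, and the inputs are the rounded percentages from the text, so the crude results (about 1.31 for rheumatoid arthritis and about 1.59 for lupus) are not expected to reproduce the reported 1.32 and 1.55, which come from the study's own analysis.

```python
def allelic_odds_ratio(freq_cases, freq_controls):
    """Odds of carrying the allele on a case chromosome vs. a control chromosome."""
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls


# Rounded chromosome frequencies quoted in the abstract.
ra_crude = allelic_odds_ratio(0.27, 0.22)     # rheumatoid arthritis vs. controls
lupus_crude = allelic_odds_ratio(0.31, 0.22)  # lupus vs. controls
print(round(ra_crude, 2), round(lupus_crude, 2))
```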

1,008 citations


Cited by
Journal Article
TL;DR: This work introduces PLINK, an open-source C/C++ WGAS tool set, and describes its five main domains of function: data management, summary statistics, population stratification, association analysis, and identity-by-descent estimation, with a focus on the estimation and use of identity-by-state and identity-by-descent information in the context of population-based whole-genome studies.
Abstract: Whole-genome association studies (WGAS) bring new computational, as well as analytic, challenges to researchers. Many existing genetic-analysis tools are not designed to handle such large data sets in a convenient manner and do not necessarily exploit the new opportunities that whole-genome data bring. To address these issues, we developed PLINK, an open-source C/C++ WGAS tool set. With PLINK, large data sets comprising hundreds of thousands of markers genotyped for thousands of individuals can be rapidly manipulated and analyzed in their entirety. As well as providing tools to make the basic analytic steps computationally efficient, PLINK also supports some novel approaches to whole-genome data that take advantage of whole-genome coverage. We introduce PLINK and describe the five main domains of function: data management, summary statistics, population stratification, association analysis, and identity-by-descent estimation. In particular, we focus on the estimation and use of identity-by-state and identity-by-descent information in the context of population-based whole-genome studies. This information can be used to detect and correct for population stratification and to identify extended chromosomal segments that are shared identical by descent between very distantly related individuals. Analysis of the patterns of segmental sharing has the potential to map disease loci that contain multiple rare variants in a population-based linkage analysis.
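Identity-by-state sharing, which this abstract highlights, can be illustrated with a minimal sketch. This is not PLINK's code or output format: the function name, the 0/1/2 allele-count coding, and the use of `None` for missing calls are assumptions made for illustration. At each marker, two individuals share 2, 1 or 0 alleles by state, and the proportion is averaged over the markers where both are genotyped.

```python
def ibs_proportion(geno_a, geno_b):
    """Mean proportion of alleles shared identical by state between two people.

    Genotypes are allele counts (0, 1 or 2) per marker; None marks a missing call.
    """
    shared = 0
    compared = 0
    for a, b in zip(geno_a, geno_b):
        if a is None or b is None:
            continue  # skip markers missing in either individual
        shared += 2 - abs(a - b)  # 2, 1 or 0 alleles shared at this marker
        compared += 1
    if compared == 0:
        raise ValueError("no markers genotyped in both individuals")
    return shared / (2.0 * compared)


identical = ibs_proportion([0, 1, 2, 2], [0, 1, 2, 2])
opposite = ibs_proportion([0, 0], [2, 2])
mixed = ibs_proportion([0, 1, 2, None], [1, 1, 0, 2])
print(identical, opposite, mixed)
```

Pairwise values like these are the raw material for the stratification checks the abstract mentions. Identity by descent requires a further modelling step that this sketch does not attempt.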

26,280 citations

Journal Article
TL;DR: For the next few weeks the course is going to be exploring a field that’s actually older than classical population genetics, although the approach it’ll be taking to it involves the use of population genetic machinery.
Abstract: So far in this course we have dealt entirely with the evolution of characters that are controlled by simple Mendelian inheritance at a single locus. There are notes on the course website about gametic disequilibrium and how allele frequencies change at two loci simultaneously, but we didn’t discuss them. In every example we’ve considered we’ve imagined that we could understand something about evolution by examining the evolution of a single gene. That’s the domain of classical population genetics. For the next few weeks we’re going to be exploring a field that’s actually older than classical population genetics, although the approach we’ll be taking to it involves the use of population genetic machinery. If you know a little about the history of evolutionary biology, you may know that after the rediscovery of Mendel’s work in 1900 there was a heated debate between the “biometricians” (e.g., Galton and Pearson) and the “Mendelians” (e.g., de Vries, Correns, Bateson, and Morgan). Biometricians asserted that the really important variation in evolution didn’t follow Mendelian rules. Height, weight, skin color, and similar traits seemed to

9,847 citations

Journal Article
Paul Burton1, David Clayton2, Lon R. Cardon, Nicholas John Craddock3 (+192 more authors; 4 institutions)
07 Jun 2007-Nature
TL;DR: This study has demonstrated that careful use of a shared control group represents a safe and effective approach to GWA analyses of multiple disease phenotypes; generated a genome-wide genotype database for future studies of common diseases in the British population; and shown that, provided individuals with non-European ancestry are excluded, the extent of population stratification in the British population is generally modest.
Abstract: There is increasing evidence that genome-wide association (GWA) studies represent a powerful approach to the identification of genes involved in common human diseases. We describe a joint GWA study (using the Affymetrix GeneChip 500K Mapping Array Set) undertaken in the British population, which has examined ~2,000 individuals for each of 7 major diseases and a shared set of ~3,000 controls. Case-control comparisons identified 24 independent association signals at P < 5 × 10^-7: 1 in bipolar disorder, 1 in coronary artery disease, 9 in Crohn's disease, 3 in rheumatoid arthritis, 7 in type 1 diabetes and 3 in type 2 diabetes. On the basis of prior findings and replication studies thus far completed, almost all of these signals reflect genuine susceptibility effects. We observed association at many previously identified loci, and found compelling evidence that some loci confer risk for more than one of the diseases studied. Across all diseases, we identified a large number of further signals (including 58 loci with single-point P values between 10^-5 and 5 × 10^-7) likely to yield additional susceptibility loci. The importance of appropriately large samples was confirmed by the modest effect sizes observed at most loci identified. This study thus represents a thorough validation of the GWA approach. It has also demonstrated that careful use of a shared control group represents a safe and effective approach to GWA analyses of multiple disease phenotypes; has generated a genome-wide genotype database for future studies of common diseases in the British population; and shown that, provided individuals with non-European ancestry are excluded, the extent of population stratification in the British population is generally modest. Our findings offer new avenues for exploring the pathophysiology of these important disorders. We anticipate that our data, results and software, which will be widely available to other investigators, will provide a powerful resource for human genetics research.

9,244 citations

Journal Article
TL;DR: The GCTA software is a versatile tool to estimate and partition complex trait variation with large GWAS data sets and focuses on the function of estimating the variance explained by all the SNPs on the X chromosome and testing the hypotheses of dosage compensation.
Abstract: For most human complex diseases and traits, SNPs identified by genome-wide association studies (GWAS) explain only a small fraction of the heritability. Here we report a user-friendly software tool called genome-wide complex trait analysis (GCTA), which was developed based on a method we recently developed to address the “missing heritability” problem. GCTA estimates the variance explained by all the SNPs on a chromosome or on the whole genome for a complex trait rather than testing the association of any particular SNP to the trait. We introduce GCTA's five main functions: data management, estimation of the genetic relationships from SNPs, mixed linear model analysis of variance explained by the SNPs, estimation of the linkage disequilibrium structure, and GWAS simulation. We focus on the function of estimating the variance explained by all the SNPs on the X chromosome and testing the hypotheses of dosage compensation. The GCTA software is a versatile tool to estimate and partition complex trait variation with large GWAS data sets.
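The second function listed in this abstract, estimating genetic relationships from SNPs, can be sketched in a few lines. This is not GCTA's code: the function name and toy genotypes are hypothetical, allele frequencies are estimated from the sample itself, and the same standardized cross-product is used for diagonal and off-diagonal entries, which is a simplification. Each entry averages, over SNPs, the product of two individuals' allele counts after centering by 2p and scaling by 2p(1 - p).

```python
def genetic_relationship_matrix(genotypes):
    """Average standardized genotype cross-product for every pair of individuals.

    `genotypes` holds one row of 0/1/2 allele counts per SNP (SNPs x individuals).
    """
    n_ind = len(genotypes[0])
    grm = [[0.0] * n_ind for _ in range(n_ind)]
    used = 0
    for snp in genotypes:
        p = sum(snp) / (2.0 * n_ind)   # sample allele frequency
        var = 2.0 * p * (1.0 - p)      # expected genotype variance
        if var == 0.0:
            continue                   # monomorphic SNP carries no information
        z = [x - 2.0 * p for x in snp]
        for j in range(n_ind):
            for k in range(n_ind):
                grm[j][k] += z[j] * z[k] / var
        used += 1
    if used == 0:
        raise ValueError("no polymorphic SNPs")
    return [[value / used for value in row] for row in grm]


# Toy data (assumed): two SNPs typed in three individuals.
toy = [[0, 1, 2],
       [2, 1, 0]]
A = genetic_relationship_matrix(toy)
print(A)
```

A matrix of this kind is what a mixed linear model would take as the covariance structure among individuals when estimating the variance explained by all SNPs. That estimation step is outside this sketch.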

5,867 citations