Showing papers by "Michael Boehnke" published in 2009
••
National Institutes of Health1, University of Chicago2, Duke University3, Harvard University4, University of Oxford5, GlaxoSmithKline6, Johns Hopkins University7, Yale University8, deCODE genetics9, Howard Hughes Medical Institute10, Princeton University11, Washington University in St. Louis12, University of California, Berkeley13, Stanford University14, University of Michigan15, Cornell University16, University of Washington17, University of Queensland18, Vanderbilt University19, North Carolina State University20, QIMR Berghofer Medical Research Institute21
TL;DR: This paper examined potential sources of missing heritability and proposed research strategies, including and extending beyond current genome-wide association approaches, to illuminate the genetics of complex diseases and enhance its potential to enable effective disease prevention or treatment.
Abstract: Genome-wide association studies have identified hundreds of genetic variants associated with complex human diseases and traits, and have provided valuable insights into their genetic architecture. Most variants identified so far confer relatively small increments in risk, and explain only a small proportion of familial clustering, leading many to question how the remaining, 'missing' heritability can be explained. Here we examine potential sources of missing heritability and propose research strategies, including and extending beyond current genome-wide association approaches, to illuminate the genetics of complex diseases and enhance its potential to enable effective disease prevention or treatment.
7,797 citations
••
Cristen J. Willer, Elizabeth K. Speliotes1, Elizabeth K. Speliotes2, Ruth J. F. Loos +163 more•Institutions (36)
TL;DR: Several of the likely causal genes are highly expressed or known to act in the central nervous system (CNS), emphasizing, as in rare monogenic forms of obesity, the role of the CNS in predisposition to obesity.
Abstract: Common variants at only two loci, FTO and MC4R, have been reproducibly associated with body mass index (BMI) in humans. To identify additional loci, we conducted meta-analysis of 15 genome-wide association studies for BMI (n > 32,000) and followed up top signals in 14 additional cohorts (n > 59,000). We strongly confirm FTO and MC4R and identify six additional loci (P < 5 x 10(-8)): TMEM18, KCTD15, GNPDA2, SH2B1, MTCH2 and NEGR1 (where a 45-kb deletion polymorphism is a candidate causal variant). Several of the likely causal genes are highly expressed or known to act in the central nervous system (CNS), emphasizing, as in rare monogenic forms of obesity, the role of the CNS in predisposition to obesity.
1,710 citations
••
Massachusetts Institute of Technology1, Boston University2, Harvard University3, University of Michigan4, Merck & Co.5, University of Oxford6, National Institutes of Health7, French Institute of Health and Medical Research8, University of Eastern Finland9, University of Southern California10, National Institute for Health and Welfare11, Imperial College London12, University of Helsinki13, Lund University14, Wellcome Trust Sanger Institute15, Tufts University16, University of North Carolina at Chapel Hill17
TL;DR: The results suggest that the cumulative effect of multiple common variants contributes to polygenic dyslipidemia.
Abstract: Blood low-density lipoprotein (LDL) cholesterol, high-density lipoprotein (HDL) cholesterol and triglyceride levels are risk factors for cardiovascular disease. To dissect the polygenic basis of these traits, we conducted genome-wide association screens in 19,840 individuals and replication in up to 20,623 individuals. We identified 30 distinct loci associated with lipoprotein concentrations (each with P < 5 x 10(-8)), including 11 loci that reached genome-wide significance for the first time. The 11 newly defined loci include common variants associated with LDL cholesterol near ABCG8, MAFB, HNF1A and TIMD4; with HDL cholesterol near ANGPTL4, FADS1-FADS2-FADS3, HNF4A, LCAT, PLTP and TTC39B; and with triglycerides near AMAC1L2, FADS1-FADS2-FADS3 and PLTP. The proportion of individuals exceeding clinical cut points for high LDL cholesterol, low HDL cholesterol and high triglycerides varied according to an allelic dosage score (P < 10(-15) for each trend). These results suggest that the cumulative effect of multiple common variants contributes to polygenic dyslipidemia.
1,358 citations
••
Christopher Newton-Cheh1, Christopher Newton-Cheh2, Toby Johnson3, Toby Johnson4 +359 more•Institutions (64)
TL;DR: The study identified associations between systolic or diastolic blood pressure and common variants in eight regions near the CYP17A1 (P = 7 × 10(-24)), CYP1A2 (P = 1 × 10(-23)), FGF5 (P = 1 × 10(-21)), SH2B3 (P = 3 × 10(-18)), MTHFR (P = 2 × 10(-13)), c10orf107 (P = 1 × 10(-9)), ZNF652 (P = 5 × 10(-9)) and PLCD3 (P = 1 × 10(-8)) genes.
Abstract: Elevated blood pressure is a common, heritable cause of cardiovascular disease worldwide. To date, identification of common genetic variants influencing blood pressure has proven challenging. We tested 2.5 million genotyped and imputed SNPs for association with systolic and diastolic blood pressure in 34,433 subjects of European ancestry from the Global BPgen consortium and followed up findings with direct genotyping (N ≤ 71,225 European ancestry, N ≤ 12,889 Indian Asian ancestry) and in silico comparison (CHARGE consortium, N = 29,136). We identified association between systolic or diastolic blood pressure and common variants in eight regions near the CYP17A1 (P = 7 × 10(-24)), CYP1A2 (P = 1 × 10(-23)), FGF5 (P = 1 × 10(-21)), SH2B3 (P = 3 × 10(-18)), MTHFR (P = 2 × 10(-13)), c10orf107 (P = 1 × 10(-9)), ZNF652 (P = 5 × 10(-9)) and PLCD3 (P = 1 × 10(-8)) genes. All variants associated with continuous blood pressure were associated with dichotomous hypertension. These associations between common variants and blood pressure and hypertension offer mechanistic insights into the regulation of blood pressure and may point to novel targets for interventions to prevent cardiovascular disease.
1,205 citations
••
Wellcome Trust Centre for Human Genetics1, Medical Research Council2, Harvard University3, Broad Institute4, Wellcome Trust Sanger Institute5, King's College London6, deCODE genetics7, Boston University8, University of Michigan9, Erasmus University Rotterdam10, National Institutes of Health11, VU University Amsterdam12, University of Oulu13, Lund University14, University of Virginia15, University Hospital of Lausanne16, University of Lausanne17, University of Southern California18, Imperial College London19, Ninewells Hospital20, University of California, Los Angeles21, University of Düsseldorf22, Novartis23, Swiss Institute of Bioinformatics24, European Bioinformatics Institute25, University of Eastern Finland26, GlaxoSmithKline27, University of North Carolina at Chapel Hill28, Oulu University Hospital29, University Medical Center Groningen30, University of Helsinki31, Ludwig Maximilian University of Munich32, University of Cambridge33, VU University Medical Center34, Leiden University Medical Center35, Brigham and Women's Hospital36, Massachusetts Institute of Technology37, University of Iceland38, University of Oxford39
TL;DR: Variants in the gene encoding melatonin receptor 1B (MTNR1B) were consistently associated with fasting glucose across all ten genome-wide association scans, and previous associations of fasting glucose with variants at the G6PC2 and GCK loci are confirmed.
Abstract: To identify previously unknown genetic loci associated with fasting glucose concentrations, we examined the leading association signals in ten genome-wide association scans involving a total of 36,610 individuals of European descent. Variants in the gene encoding melatonin receptor 1B (MTNR1B) were consistently associated with fasting glucose across all ten studies. The strongest signal was observed at rs10830963, where each G allele (frequency 0.30 in HapMap CEU) was associated with an increase of 0.07 (95% CI = 0.06-0.08) mmol/l in fasting glucose levels (P = 3.2 x 10(-50)) and reduced beta-cell function as measured by homeostasis model assessment (HOMA-B, P = 1.1 x 10(-15)). The same allele was associated with an increased risk of type 2 diabetes (odds ratio = 1.09 (1.05-1.12), per G allele P = 3.3 x 10(-7)) in a meta-analysis of 13 case-control studies totaling 18,236 cases and 64,453 controls. Our analyses also confirm previous associations of fasting glucose with variants at the G6PC2 (rs560887, P = 1.1 x 10(-57)) and GCK (rs4607517, P = 1.0 x 10(-25)) loci.
716 citations
••
TL;DR: The risk genotype of an MTNR1B variant predicts future type 2 diabetes (T2D) in two large prospective studies, and is associated with impaired early insulin response to both oral and intravenous glucose and with faster deterioration of insulin secretion over time.
Abstract: Genome-wide association studies have shown that variation in MTNR1B (melatonin receptor 1B) is associated with insulin and glucose concentrations. Here we show that the risk genotype of this SNP predicts future type 2 diabetes (T2D) in two large prospective studies. Specifically, the risk genotype was associated with impairment of early insulin response to both oral and intravenous glucose and with faster deterioration of insulin secretion over time. We also show that the MTNR1B mRNA is expressed in human islets, and immunocytochemistry confirms that it is primarily localized in beta cells in islets. Nondiabetic individuals carrying the risk allele and individuals with T2D showed increased expression of the receptor in islets. Insulin release from clonal beta cells in response to glucose was inhibited in the presence of melatonin. These data suggest that the circulating hormone melatonin, which is predominantly released from the pineal gland in the brain, is involved in the pathogenesis of T2D. Given the increased expression of MTNR1B in individuals at risk of T2D, the pathogenic effects are likely exerted via a direct inhibitory effect on beta cells. In view of these results, blocking the melatonin ligand-receptor system could be a therapeutic avenue in T2D.
652 citations
••
Cecilia M. Lindgren1, Iris M. Heid2, Joshua C. Randall1, Claudia Lamina3 +152 more•Institutions (36)
TL;DR: By focusing on anthropometric measures of central obesity and fat distribution, a meta-analysis of 16 genome-wide association studies informative for adult waist circumference and waist–hip ratio identified three loci implicated in the regulation of human adiposity.
Abstract: To identify genetic loci influencing central obesity and fat distribution, we performed a meta-analysis of 16 genome-wide association studies (GWAS, N = 38,580) informative for adult waist circumference (WC) and waist-hip ratio (WHR). We selected 26 SNPs for follow-up, for which the evidence of association with measures of central adiposity (WC and/or WHR) was strong and disproportionate to that for overall adiposity or height. Follow-up studies in a maximum of 70,689 individuals identified two loci strongly associated with measures of central adiposity; these map near TFAP2B (WC, P = 1.9x10(-11)) and MSRA (WC, P = 8.9x10(-9)). A third locus, near LYPLAL1, was associated with WHR in women only (P = 2.6x10(-8)). The variants near TFAP2B appear to influence central adiposity through an effect on overall obesity/fat-mass, whereas LYPLAL1 displays a strong female-only association with fat distribution. By focusing on anthropometric measures of central obesity and fat distribution, we have identified three loci implicated in the regulation of human adiposity.
648 citations
01 Jun 2009
TL;DR: Vandervell Foundation and Wellcome Trust (068545/Z/02, GR072960, GR076113, GR069224, 086596/Z/08/Z)
Abstract: Vandervell Foundation and Wellcome Trust (068545/Z/02, GR072960, GR076113, GR069224, 086596/Z/08/Z)
476 citations
••
University of Bonn1, Cardiff University2, Eli Lilly and Company3, Harvard University4, State University of New York Upstate Medical University5, NorthShore University HealthSystem6, University of California, San Diego7, National Institutes of Health8, Stanford University9, University of North Carolina at Chapel Hill10, Trinity College, Dublin11, Radboud University Nijmegen12, University of Pennsylvania13, University of St Andrews14, University of Western Australia15, University of California, Los Angeles16, Washington University in St. Louis17, University of Pittsburgh18, Johns Hopkins University19, Icahn School of Medicine at Mount Sinai20, University of Illinois at Urbana–Champaign21, University of Helsinki22, Université de Montréal23, University of Washington24, University of Toronto25, Vanderbilt University26, McMaster University27, University of Oslo28, University of Edinburgh29, University of Michigan30, University College London31, GlaxoSmithKline32, Indiana University33, Virginia Commonwealth University34, VU University Amsterdam35, University of Iowa36, University of California, San Francisco37, Howard University38, Institute of Physics39, QIMR Berghofer Medical Research Institute40, Columbia University41, Pfizer42, Rush University Medical Center43, Mayo Clinic44, Georgetown University45, Karolinska Institutet46, National Institute for Health and Welfare47, University of Queensland48, University of Aberdeen49, North Carolina State University50
TL;DR: GWAS methods have detected a remarkable number of robust genetic associations for dozens of common diseases and traits, leading to new pathophysiological hypotheses, although only small proportions of genetic variance have been explained thus far and therapeutic applications will require substantial further effort.
Abstract: Objective: The authors conducted a review of the history and empirical basis of genomewide association studies (GWAS), the rationale for GWAS of psychiatric disorders, results to date, limitations, and plans for GWAS meta-analyses. Method: A literature review was carried out, power and other issues discussed, and planned studies assessed. Results: Most of the genomic DNA sequence differences between any two people are common (frequency >5%) single nucleotide polymorphisms (SNPs). Because of localized patterns of correlation (linkage disequilibrium), 500,000 to 1,000,000 of these SNPs can test the hypothesis that one or more common variants explain part of the genetic risk for a disease. GWAS technologies can also detect some of the copy number variants (deletions and duplications) in the genome. Systematic study of rare variants will require large-scale resequencing analyses. GWAS methods have detected a remarkable number of robust genetic associations for dozens of common diseases and traits, leading to new pathophysiological hypotheses, although only small proportions of genetic variance have been explained thus far and therapeutic applications will require substantial further effort. Study design issues, power, and limitations are discussed. For psychiatric disorders, there are initial significant findings for common SNPs and for rare copy number variants, and many other studies are in progress. Conclusions: GWAS of large samples have detected associations of common SNPs and of rare copy number variants with psychiatric disorders. More findings are likely, since larger GWAS samples detect larger numbers of common susceptibility variants, with smaller effects. The Psychiatric GWAS Consortium is conducting GWAS meta-analyses for schizophrenia, bipolar disorder, major depressive disorder, autism, and attention deficit hyperactivity disorder. Based on results for other diseases, larger samples will be required. 
The contribution of GWAS will depend on the true genetic architecture of each disorder.
434 citations
••
University of Michigan1, GlaxoSmithKline2, University of Toronto3, Molecular and Behavioral Neuroscience Institute4, Stanford University5, University of California, Irvine6, Cornell University7, University of California, Davis8, University of Dundee9, King's College London10, Imperial College London11
TL;DR: Analysis of additional samples will be required to confirm that variant(s) in these regions influence BP risk, and these chromosomal regions harbor genes implicated in cell cycle, neurogenesis, neuroplasticity, and neurosignaling.
Abstract: Bipolar disorder (BP) is a disabling and often life-threatening disorder that affects ≈1% of the population worldwide. To identify genetic variants that increase the risk of BP, we genotyped on the Illumina HumanHap550 Beadchip 2,076 bipolar cases and 1,676 controls of European ancestry from the National Institute of Mental Health Human Genetics Initiative Repository, and the Prechter Repository and samples collected in London, Toronto, and Dundee. We imputed SNP genotypes and tested for SNP-BP association in each sample and then performed meta-analysis across samples. The strongest association P value for this 2-study meta-analysis was 2.4 × 10−6. We next imputed SNP genotypes and tested for SNP-BP association based on the publicly available Affymetrix 500K genotype data from the Wellcome Trust Case Control Consortium for 1,868 BP cases and a reference set of 12,831 individuals. A 3-study meta-analysis of 3,683 nonoverlapping cases and 14,507 extended controls on >2.3 M genotyped and imputed SNPs resulted in 3 chromosomal regions with association P ≈ 10−7: 1p31.1 (no known genes), 3p21 (>25 known genes), and 5q15 (MCTP1). The most strongly associated nonsynonymous SNP rs1042779 (OR = 1.19, P = 1.8 × 10−7) is in the ITIH1 gene on chromosome 3, with other strongly associated nonsynonymous SNPs in GNL3, NEK4, and ITIH3. Thus, these chromosomal regions harbor genes implicated in cell cycle, neurogenesis, neuroplasticity, and neurosignaling. In addition, we replicated the reported ANK3 association results for SNP rs10994336 in the nonoverlapping GSK sample (OR = 1.37, P = 0.042). Although these results are promising, analysis of additional samples will be required to confirm that variant(s) in these regions influence BP risk.
308 citations
••
TL;DR: It is shown that overestimation of the genetic effect by the uncorrected estimator decreases as the power of the association study increases and that the ascertainment‐corrected method reduces absolute bias and mean square error unless power to detect association is high.
Abstract: Genetic association studies are a powerful tool to detect genetic variants that predispose to human disease. Once an associated variant is identified, investigators are also interested in estimating the effect of the identified variant on disease risk. Estimates of the genetic effect based on new association findings tend to be upwardly biased due to a phenomenon known as the "winner's curse." Overestimation of genetic effect size in initial studies may cause follow-up studies to be underpowered and so to fail. In this paper, we quantify the impact of the winner's curse on the allele frequency difference and odds ratio estimators for one- and two-stage case-control association studies. We then propose an ascertainment-corrected maximum likelihood method to reduce the bias of these estimators. We show that overestimation of the genetic effect by the uncorrected estimator decreases as the power of the association study increases and that the ascertainment-corrected method reduces absolute bias and mean square error unless power to detect association is high. Genet. Epidemiol. 33:453–462, 2009. © 2009 Wiley-Liss, Inc.
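As a rough illustration of the winner's curse described in this abstract, the sketch below draws many noisy estimates of a fixed true effect and averages only those that pass a significance threshold. The true effect, standard error, threshold, and function name are arbitrary assumptions chosen for illustration; this is a simple demonstration of the bias, not the ascertainment-corrected maximum likelihood method proposed in the paper.

```python
import random
import statistics

def simulate_winners_curse(true_effect=0.1, se=0.05, z_threshold=1.96,
                           n_studies=20000, seed=1):
    """Return (mean of all estimates, mean of significant estimates only)."""
    rng = random.Random(seed)
    # Each simulated study reports the true effect plus normal sampling noise.
    estimates = [rng.gauss(true_effect, se) for _ in range(n_studies)]
    # Keep only studies whose z statistic exceeds the threshold (one-sided).
    significant = [b for b in estimates if b / se > z_threshold]
    return statistics.mean(estimates), statistics.mean(significant)

all_mean, sig_mean = simulate_winners_curse()
print(f"mean of all estimates:         {all_mean:.3f}")
print(f"mean of significant estimates: {sig_mean:.3f}")

# With a much smaller standard error (higher power) nearly every study is
# significant, so conditioning on significance adds little bias.
_, high_power_mean = simulate_winners_curse(se=0.01)
print(f"significant mean, high power:  {high_power_mean:.3f}")
```

Under these assumed settings the average of the significant estimates exceeds the true effect, and the gap shrinks when power is high, in line with the abstract's statement that overestimation decreases as power increases.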
••
TL;DR: Eight type 2 diabetes–related loci were significantly or nominally associated with impaired early-phase insulin release and effects of SLC30A8, HHEX, CDKAL1, and TCF7L2 on insulin release could be partially explained by impaired proinsulin conversion.
Abstract: OBJECTIVE: We investigated the effects of 18 confirmed type 2 diabetes risk single nucleotide polymorphisms (SNPs) on insulin sensitivity, insulin secretion, and conversion of proinsulin to insulin. RESEARCH DESIGN AND METHODS: A total of 5,327 nondiabetic men (age 58 ± 7 years, BMI 27.0 ± 3.8 kg/m²) from a large population-based cohort were included. Oral glucose tolerance tests and genotyping of SNPs in or near PPARG, KCNJ11, TCF7L2, SLC30A8, HHEX, LOC387761, CDKN2B, IGF2BP2, CDKAL1, HNF1B, WFS1, JAZF1, CDC123, TSPAN8, THADA, ADAMTS9, NOTCH2, KCNQ1, and MTNR1B were performed. HNF1B rs757210 was excluded because of failure to achieve Hardy-Weinberg equilibrium. RESULTS: Six SNPs (TCF7L2, SLC30A8, HHEX, CDKN2B, CDKAL1, and MTNR1B) were significantly (P < [...] × 10−4) and two SNPs (KCNJ11 and IGF2BP2) were nominally (P < [...]) associated with early-phase insulin release (InsAUC0–30/GluAUC0–30), adjusted for age, BMI, and insulin sensitivity (Matsuda ISI). Combined effects of these eight SNPs reached a 32% reduction in InsAUC0–30/GluAUC0–30 in carriers of ≥11 vs. ≤3 weighted risk alleles. Four SNPs (SLC30A8, HHEX, CDKAL1, and TCF7L2) were significantly or nominally associated with indexes of proinsulin conversion. Three SNPs (KCNJ11, HHEX, and TSPAN8) were nominally associated with Matsuda ISI (adjusted for age and BMI). The effect of HHEX on Matsuda ISI became significant after additional adjustment for InsAUC0–30/GluAUC0–30. Nine SNPs did not show any associations with examined traits. CONCLUSIONS: Eight type 2 diabetes–related loci were significantly or nominally associated with impaired early-phase insulin release. Effects of SLC30A8, HHEX, CDKAL1, and TCF7L2 on insulin release could be partially explained by impaired proinsulin conversion. HHEX might influence both insulin release and insulin sensitivity.
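The abstract mentions excluding a SNP for failing Hardy-Weinberg equilibrium. The sketch below shows one common way such a check can be done, a 1-degree-of-freedom chi-square goodness-of-fit test on genotype counts. The function name and the genotype counts are hypothetical, and the abstract does not say which test or threshold the authors actually used.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) of Hardy-Weinberg proportions.

    Takes counts of the three genotypes and returns (chi2, p_value).
    Assumes both alleles are present, so no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1 - p                         # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts: the first set matches HWE proportions exactly,
# the second has too few heterozygotes.
chi2_ok, p_ok = hwe_chi_square(360, 480, 160)
chi2_bad, p_bad = hwe_chi_square(400, 400, 200)
print(f"in equilibrium:  chi2={chi2_ok:.3f}, p={p_ok:.3f}")
print(f"out of equilibrium: chi2={chi2_bad:.3f}, p={p_bad:.2e}")
```

A SNP with a very small p-value from a test like this would typically be flagged for possible genotyping error and dropped from analysis.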
••
TL;DR: No association between expression of TCF7L2 in eight types of human tissue samples and T2D-associated genetic variants remained significant, and a tissue-specific pattern of alternative splicing was identified.
Abstract: Common variants in the transcription factor 7-like 2 (TCF7L2) gene have been identified as the strongest genetic risk factors for type 2 diabetes (T2D). However, the mechanisms by which these non-coding variants increase risk for T2D are not well-established. We used 13 expression assays to survey mRNA expression of multiple TCF7L2 splicing forms in up to 380 samples from eight types of human tissue (pancreas, pancreatic islets, colon, liver, monocytes, skeletal muscle, subcutaneous adipose tissue and lymphoblastoid cell lines) and observed a tissue-specific pattern of alternative splicing. We tested whether the expression of TCF7L2 splicing forms was associated with single nucleotide polymorphisms (SNPs), rs7903146 and rs12255372, located within introns 3 and 4 of the gene and most strongly associated with T2D. Expression of two splicing forms was lower in pancreatic islets with increasing counts of T2D-associated alleles of the SNPs: a ubiquitous splicing form (P = 0.018 for rs7903146 and P = 0.020 for rs12255372) and a splicing form found in pancreatic islets, pancreas and colon but not in other tissues tested here (P = 0.009 for rs12255372 and P = 0.053 for rs7903146). Expression of this form in glucose-stimulated pancreatic islets correlated with expression of proinsulin (r(2) = 0.84-0.90, P < 0.00063). In summary, we identified a tissue-specific pattern of alternative splicing of TCF7L2. After adjustment for multiple tests, no association between expression of TCF7L2 in eight types of human tissue samples and T2D-associated genetic variants remained significant. Alternative splicing of TCF7L2 in pancreatic islets warrants future studies. GenBank Accession Numbers: FJ010164-FJ010174.
••
TL;DR: The data suggest that C-allele carriers of the IL6 -174G>C polymorphism have lower fasting glucose levels on average, which substantiates previous findings of decreased T2DM risk of these subjects.
Abstract: Background. Several studies have investigated associations between the -174G>C single nucleotide polymorphism (rs1800795) of the IL6 gene and phenotypes related to type 2 diabetes mellitus (T2DM) but presented inconsistent results. Aims. This joint analysis aimed to clarify whether IL6 -174G>C was associated with glucose and circulating interleukin-6 concentrations as well as body mass index (BMI). Methods. Individual-level data from all studies of the IL6-T2DM consortium on Caucasian subjects with available BMI were collected. As study-specific estimates did not show heterogeneity (P > 0.1), they were combined by using the inverse-variance fixed-effect model. Results. The main analysis included 9440, 7398, 24,117, or 5659 non-diabetic and manifest T2DM subjects for fasting glucose, 2-hour glucose, BMI, or circulating interleukin-6 levels, respectively. IL6 -174 C-allele carriers had significantly lower fasting glucose (-0.091 mmol/L, P=0.014). There was no evidence for association between IL6 -174G>C and BMI or interleukin-6 levels, except in some subgroups. Conclusions. Our data suggest that C-allele carriers of the IL6 -174G>C polymorphism have lower fasting glucose levels on average, which substantiates previous findings of decreased T2DM risk of these subjects.
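The inverse-variance fixed-effect model named in the Methods weights each study's estimate by the reciprocal of its squared standard error. A minimal sketch follows; the per-study effects and standard errors are made-up values for illustration and are not taken from the IL6-T2DM consortium data.

```python
import math

def fixed_effect_meta(estimates, std_errors):
    """Inverse-variance fixed-effect pooling.

    Returns (pooled estimate, standard error of the pooled estimate).
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se

# Hypothetical per-study effects (e.g. mmol/L) and their standard errors.
betas = [-0.10, -0.05, -0.20]
ses = [0.05, 0.04, 0.10]
pooled, pooled_se = fixed_effect_meta(betas, ses)
print(f"pooled effect = {pooled:.4f}, SE = {pooled_se:.4f}")
```

More precise studies (smaller standard errors) dominate the pooled estimate, which is why this model is usually reserved for settings where, as in this abstract, heterogeneity between studies is not evident.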
••
University of Oxford1, Nuffield Orthopaedic Centre2, Harvard University3, Broad Institute4, Wellcome Trust Sanger Institute5, Wellcome Trust6, Helmholtz Zentrum München7, Steno Diabetes Center8, Lund University9, University of Southern Denmark10, Norwegian University of Science and Technology11, University of Eastern Finland12, Churchill Hospital13, University of Dundee14, Boston University15
TL;DR: A Bayesian meta-analysis approach to data from 19 studies on 17 replicated associations with type 2 diabetes yielded point estimates for the genetic effects that were very similar to those previously reported based on fixed- or random-effects models, but uncertainty about several of the effects was substantially larger.
Abstract: For most associations of common single nucleotide polymorphisms (SNPs) with common diseases, the genetic model of inheritance is unknown. The authors extended and applied a Bayesian meta-analysis approach to data from 19 studies on 17 replicated associations with type 2 diabetes. For 13 SNPs, the data fitted very well to an additive model of inheritance for the diabetes risk allele; for 4 SNPs, the data were consistent with either an additive model or a dominant model; and for 2 SNPs, the data were consistent with an additive or recessive model. Results were robust to the use of different priors and after exclusion of data for which index SNPs had been examined indirectly through proxy markers. The Bayesian meta-analysis model yielded point estimates for the genetic effects that were very similar to those previously reported based on fixed- or random-effects models, but uncertainty about several of the effects was substantially larger. The authors also examined the extent of between-study heterogeneity in the genetic model and found generally small between-study deviation values for the genetic model parameter. Heterosis could not be excluded for 4 SNPs. Information on the genetic model of robustly replicated association signals derived from genome-wide association studies may be useful for predictive modeling and for designing biologic and functional experiments.
••
TL;DR: Computer simulations show that genetic similarity score matching (GSM) correctly controls false-positive rates, improves power to detect true disease-predisposing variants, and gives better power than genomic control.
Abstract: Genome-wide association studies are helping to dissect the etiology of complex diseases. Although case-control association tests are generally more powerful than family-based association tests, population stratification can lead to spurious disease-marker association or mask a true association. Several methods have been proposed to match cases and controls prior to genotyping, using family information or epidemiological data, or using genotype data for a modest number of genetic markers. Here, we describe a genetic similarity score matching (GSM) method for efficient matched analysis of cases and controls in a genome-wide or large-scale candidate gene association study. GSM comprises three steps: (1) calculating similarity scores for pairs of individuals using the genotype data; (2) matching sets of cases and controls based on the similarity scores so that matched cases and controls have similar genetic background; and (3) using conditional logistic regression to perform association tests. Through computer simulation we show that GSM correctly controls false-positive rates and improves power to detect true disease predisposing variants. We compare GSM to genomic control using computer simulations, and find improved power using GSM. We suggest that initial matching of cases and controls prior to genotyping combined with careful re-matching after genotyping is a method of choice for genome-wide association studies. Genet. Epidemiol. 33:508–517, 2009. © 2009 Wiley-Liss, Inc.
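The first two GSM steps listed in the abstract (pairwise similarity scores, then matching) can be sketched as follows. This is only an illustration under stated assumptions: the identity-by-state score, the greedy one-to-one matching, the function names, and the toy genotypes are all stand-ins, since the abstract does not specify the score or matching algorithm the authors used. Step 3 (conditional logistic regression) is omitted.

```python
def ibs_similarity(g1, g2):
    """Mean identity-by-state sharing (0 to 1) between two genotype
    vectors coded as 0, 1, or 2 copies of an allele per marker."""
    shared = sum(2 - abs(a - b) for a, b in zip(g1, g2))
    return shared / (2 * len(g1))

def greedy_match(cases, controls):
    """Pair each case with its most similar unused control (greedy, 1:1).

    Assumes there are at least as many controls as cases.
    """
    unused = dict(controls)
    pairs = []
    for case_id, case_g in cases.items():
        best = max(unused, key=lambda cid: ibs_similarity(case_g, unused[cid]))
        pairs.append((case_id, best))
        del unused[best]
    return pairs

# Toy genotypes at four markers (hypothetical data).
cases = {"case1": [0, 1, 2, 2], "case2": [2, 2, 0, 0]}
controls = {"ctrlA": [2, 2, 0, 1], "ctrlB": [0, 1, 2, 1], "ctrlC": [1, 1, 1, 1]}
pairs = greedy_match(cases, controls)
print(pairs)
```

Greedy matching is easy to follow but depends on the order in which cases are processed; an optimal matching procedure could give different sets, which is one reason this should be read as a sketch rather than the published method.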
••
TL;DR: It is demonstrated that ontology fingerprints can be used effectively to prioritize genes from GWA studies for experimental validation and identified genes relevant to lipid metabolism from the literature even in cases where such knowledge was not reflected in current annotation of these genes.
Abstract: Motivation: Genome-wide association (GWA) studies may identify multiple variants that are associated with a disease or trait. To narrow down candidates for further validation, quantitatively assessing how identified genes relate to a phenotype of interest is important.
Results: We describe an approach to characterize genes or biological concepts (phenotypes, pathways, diseases, etc.) by ontology fingerprint—the set of Gene Ontology (GO) terms that are overrepresented among the PubMed abstracts discussing the gene or biological concept together with the enrichment p-value of these terms generated from a hypergeometric enrichment test. We then quantify the relevance of genes to the trait from a GWA study by calculating similarity scores between their ontology fingerprints using enrichment p-values. We validate this approach by correctly identifying corresponding genes for biological pathways with a 90% average area under the ROC curve (AUC). We applied this approach to rank genes identified through a GWA study that are associated with the lipid concentrations in plasma as well as to prioritize genes within linkage disequilibrium (LD) block. We found that the genes with highest scores were: ABCA1, lipoprotein lipase (LPL) and cholesterol ester transfer protein, plasma for high-density lipoprotein; low-density lipoprotein receptor, APOE and APOB for low-density lipoprotein; and LPL, APOA1 and APOB for triglyceride. In addition, we identified genes relevant to lipid metabolism from the literature even in cases where such knowledge was not reflected in current annotation of these genes. These results demonstrate that ontology fingerprints can be used effectively to prioritize genes from GWA studies for experimental validation.
Supplementary information: Supplementary data are available at Bioinformatics online.
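The hypergeometric enrichment test mentioned in the Results asks how surprising it is to see a GO term in at least a given number of a gene's abstracts, given how often the term appears across all abstracts. A minimal sketch of that tail probability is below; the function name, parameter names, and the small counts are illustrative assumptions and do not reproduce the authors' pipeline.

```python
from math import comb

def hypergeom_enrichment_p(total, annotated, sample, hits):
    """Upper-tail hypergeometric probability P(X >= hits).

    total:     number of abstracts in the corpus
    annotated: abstracts in the corpus that mention the term
    sample:    abstracts that discuss the gene of interest
    hits:      abstracts in the sample that mention the term
    """
    upper = min(annotated, sample)
    tail = sum(comb(annotated, k) * comb(total - annotated, sample - k)
               for k in range(hits, upper + 1))
    return tail / comb(total, sample)

# Toy example: 10 abstracts overall, 4 mention the term, the gene appears
# in 3 abstracts, and 2 of those mention the term.
p_value = hypergeom_enrichment_p(total=10, annotated=4, sample=3, hits=2)
print(f"enrichment p-value = {p_value:.4f}")
```

A fingerprint in the sense of the abstract would then be the set of terms with small p-values for a gene, each paired with its p-value.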
••
TL;DR: This paper develops and applies a general variance-component framework for pedigree analysis of continuous and categorical outcomes and demonstrates that one can perform variance-component pedigree analysis on outcomes that follow any exponential-family distribution.
Abstract: Variance-component methods are popular and flexible analytic tools for elucidating the genetic mechanisms of complex quantitative traits from pedigree data. However, variance-component methods typically assume that the trait of interest follows a multivariate normal distribution within a pedigree. Studies have shown that violation of this normality assumption can lead to biased parameter estimates and inflations in type-I error. This limits the application of variance-component methods to more general trait outcomes, whether continuous or categorical in nature. In this paper, we develop and apply a general variance-component framework for pedigree analysis of continuous and categorical outcomes. We develop appropriate models using generalized-linear mixed model theory and fit such models using approximate maximum-likelihood procedures. Using our proposed method, we demonstrate that one can perform variance-component pedigree analysis on outcomes that follow any exponential-family distribution. Additionally, we also show how one can modify the method to perform pedigree analysis of ordinal outcomes. We also discuss extensions of our variance-component framework to accommodate pedigrees ascertained based on trait outcome. We demonstrate the feasibility of our method using both simulated data and data from a genetic study of ovarian insufficiency.