Showing papers in "American Journal of Human Genetics in 2014"
••
TL;DR: An overview of statistical issues in rare-variant association studies with a focus on study designs and statistical tests is provided and various gene- or region-based association tests are compared in terms of their assumptions and performance.
Abstract: Despite the extensive discovery of trait- and disease-associated common variants, much of the genetic contribution to complex traits remains unexplained. Rare variants can explain additional disease risk or trait variability. An increasing number of studies are underway to identify trait- and disease-associated rare variants. In this review, we provide an overview of statistical issues in rare-variant association studies with a focus on study designs and statistical tests. We present the design and analysis pipeline of rare-variant studies and review cost-effective sequencing designs and genotyping platforms. We compare various gene- or region-based association tests, including burden tests, variance-component tests, and combined omnibus tests, in terms of their assumptions and performance. Also discussed are the related topics of meta-analysis, population-stratification adjustment, genotype imputation, follow-up studies, and heritability due to rare variants. We provide guidelines for analysis and discuss some of the challenges inherent in these studies and future research directions.
869 citations
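As a rough illustration of the burden tests compared in the review above, the sketch below collapses each individual's rare-allele counts in a region into a carrier indicator and compares carrier frequency between cases and controls with a one-sided Fisher exact test. The function names and toy genotypes are hypothetical, and this collapsing scheme is only one simple member of the burden-test family, not the specific method of any one study.

```python
from math import comb

def carrier_status(genotypes):
    """Collapse one individual's rare-allele counts across a region into 0/1."""
    return int(sum(genotypes) > 0)

def fisher_one_sided(a, b, c, d):
    """P(X >= a) under the hypergeometric null for the table [[a, b], [c, d]].

    Rows are cases/controls; columns are carriers/non-carriers.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    return tail / comb(n, col1)

def burden_test(case_genotypes, control_genotypes):
    """Return the 2x2 carrier table and a one-sided p value for excess carriers in cases."""
    case_carriers = sum(carrier_status(g) for g in case_genotypes)
    control_carriers = sum(carrier_status(g) for g in control_genotypes)
    a, b = case_carriers, len(case_genotypes) - case_carriers
    c, d = control_carriers, len(control_genotypes) - control_carriers
    return (a, b, c, d), fisher_one_sided(a, b, c, d)

# Toy data (hypothetical): each inner list holds rare-allele counts at three variants.
toy_cases = [[0, 1, 0], [1, 0, 0], [0, 0, 2], [0, 0, 0]]
toy_controls = [[0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0]]
print(burden_test(toy_cases, toy_controls))
```

Collapsing to a single indicator assumes that the rare variants in the region act in the same direction, which is the assumption under which burden tests perform best; variance-component tests relax it.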
••
Icahn School of Medicine at Mount Sinai1, Pierre-and-Marie-Curie University2, French Institute of Health and Medical Research3, Centre national de la recherche scientifique4, University of Toronto5, Trinity College, Dublin6, University of Pittsburgh7, Utrecht University8, McMaster University9, Our Lady's Children's Hospital10, University College Dublin11, University of Oxford12, University of Lisbon13, Instituto Nacional de Saúde Dr. Ricardo Jorge14, University of California, Los Angeles15, University of Miami16, Goethe University Frankfurt17, University of Pennsylvania18, Vanderbilt University19, Temple University20, University of Bologna21, Cancer Care Ontario22, University of Southern California23, University of Alberta24, University of Birmingham25, Université de Montréal26, Rush University Medical Center27, University of Coimbra28, Kaiser Permanente29, Cornell University30, Newcastle University31, University of Minnesota32, University of Illinois at Chicago33, University of Gothenburg34, Memorial University of Newfoundland35, Duke University36, University of Paris37, Centre for Mental Health38, King's College London39, University of Washington40, Nationwide Children's Hospital41, Indiana University42, Tufts University43, German Cancer Research Center44, University of Utah45, Stanford University46
TL;DR: The authors analyzed 2,446 ASD-affected families and confirmed an excess of genic deletions and duplications in affected versus control groups (1.41-fold, p = 1.0 × 10^-5) and an increase in affected subjects carrying exonic pathogenic CNVs overlapping known loci associated with dominant or X-linked ASD and intellectual disability.
Abstract: Rare copy-number variation (CNV) is an important source of risk for autism spectrum disorders (ASDs). We analyzed 2,446 ASD-affected families and confirmed an excess of genic deletions and duplications in affected versus control groups (1.41-fold, p = 1.0 × 10^-5) and an increase in affected subjects carrying exonic pathogenic CNVs overlapping known loci associated with dominant or X-linked ASD and intellectual disability (odds ratio = 12.62, p = 2.7 × 10^-15, ∼3% of ASD subjects). Pathogenic CNVs, often showing variable expressivity, included rare de novo and inherited events at 36 loci, implicating ASD-associated genes (CHD2, HDAC4, and GDI1) previously linked to other neurodevelopmental disorders, as well as other genes such as SETD5, MIR137, and HDAC9. Consistent with hypothesized gender-specific modulators, females with ASD were more likely to have highly penetrant CNVs (p = 0.017) and were also overrepresented among subjects with fragile X syndrome protein targets (p = 0.02). Genes affected by de novo CNVs and/or loss-of-function single-nucleotide variants converged on networks related to neuronal signaling and development, synapse function, and chromatin regulation.
833 citations
••
TL;DR: In this paper, the authors applied variance-component methods to imputed genotype data for 11 common diseases to partition the heritability explained by genotyped SNPs (h_g^2) across functional categories.
Abstract: Regulatory and coding variants are known to be enriched with associations identified by genome-wide association studies (GWASs) of complex disease, but their contributions to trait heritability are currently unknown. We applied variance-component methods to imputed genotype data for 11 common diseases to partition the heritability explained by genotyped SNPs (h_g^2) across functional categories (while accounting for shared variance due to linkage disequilibrium). Extensive simulations showed that in contrast to current estimates from GWAS summary statistics, the variance-component approach partitions heritability accurately under a wide range of complex-disease architectures. Across the 11 diseases, DNaseI hypersensitivity sites (DHSs) from 217 cell types spanned 16% of imputed SNPs (and 24% of genotyped SNPs) but explained an average of 79% (SE = 8%) of h_g^2 from imputed SNPs (5.1× enrichment; p = 3.7 × 10^-17) and 38% (SE = 4%) of h_g^2 from genotyped SNPs (1.6× enrichment, p = 1.0 × 10^-4). Further enrichment was observed at enhancer DHSs and cell-type-specific DHSs. In contrast, coding variants, which span 1% of the genome, explained <10% of h_g^2 despite having the highest enrichment. We replicated these findings but found no significant contribution from rare coding variants in independent schizophrenia cohorts genotyped on GWAS and exome chips. Our results highlight the value of analyzing components of heritability to unravel the functional architecture of common disease.
562 citations
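The enrichment figures in the abstract above can be read as the share of heritability explained by an annotation divided by the share of SNPs that the annotation spans. The sketch below applies that ratio to the rounded percentages quoted in the abstract. The function name is illustrative, and treating enrichment as this simple ratio is an assumption about how the quoted numbers relate to one another.

```python
def enrichment(heritability_share, snp_share):
    """Fold enrichment: fraction of heritability explained per fraction of SNPs spanned."""
    if snp_share <= 0:
        raise ValueError("snp_share must be positive")
    return heritability_share / snp_share

# Rounded values quoted in the abstract for DHSs.
imputed = enrichment(0.79, 0.16)    # 79% of heritability from 16% of imputed SNPs
genotyped = enrichment(0.38, 0.24)  # 38% of heritability from 24% of genotyped SNPs

print(f"imputed: {imputed:.2f}x, genotyped: {genotyped:.2f}x")
```

The genotyped-SNP ratio rounds to the reported 1.6×. The imputed-SNP ratio comes out near 4.9 rather than the reported 5.1×, presumably because the published value was not computed from these rounded averages.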
••
TL;DR: A statistical model is described that uses association statistics computed across the genome to identify classes of genomic elements that are enriched with or depleted of loci influencing a trait, and naturally incorporates multiple types of annotations.
Abstract: Annotations of gene structures and regulatory elements can inform genome-wide association studies (GWASs). However, choosing the relevant annotations for interpreting an association study of a given trait remains challenging. I describe a statistical model that uses association statistics computed across the genome to identify classes of genomic elements that are enriched with or depleted of loci influencing a trait. The model naturally incorporates multiple types of annotations. I applied the model to GWASs of 18 human traits, including red blood cell traits, platelet traits, glucose levels, lipid levels, height, body mass index, and Crohn disease. For each trait, I used the model to evaluate the relevance of 450 different genomic annotations, including protein-coding genes, enhancers, and DNase-I hypersensitive sites in over 100 tissues and cell lines. The fraction of phenotype-associated SNPs influencing protein sequence ranged from around 2% (for platelet volume) up to around 20% (for low-density lipoprotein cholesterol), repressed chromatin was significantly depleted for SNPs associated with several traits, and cell-type-specific DNase-I hypersensitive sites were enriched with SNPs associated with several traits (for example, the spleen in platelet volume). Finally, reweighting each GWAS by using information from functional genomics increased the number of loci with high-confidence associations by around 5%.
546 citations
••
TL;DR: It is strongly suggested that females have an increased etiological burden unlinked to rare deleterious variants on the X chromosome.
Abstract: Increased male prevalence has been repeatedly reported in several neurodevelopmental disorders (NDs), leading to the concept of a “female protective model.” We investigated the molecular basis of this sex-based difference in liability and demonstrated an excess of deleterious autosomal copy-number variants (CNVs) in females compared to males (odds ratio [OR] = 1.46, p = 8 × 10^-10) in a cohort of 15,585 probands ascertained for NDs. In an independent autism spectrum disorder (ASD) cohort of 762 families, we found a 3-fold increase in deleterious autosomal CNVs (p = 7 × 10^-4) and an excess of private deleterious single-nucleotide variants (SNVs) in female compared to male probands (OR = 1.34, p = 0.03). We also showed that the deleteriousness of autosomal SNVs was significantly higher in female probands (p = 0.0006). A similar bias was observed in parents of probands ascertained for NDs. Deleterious CNVs (>400 kb) were maternally inherited more often (up to 64%, p = 10^-15) than small CNVs < 400 kb (OR = 1.45, p = 0.0003). In the ASD cohort, increased maternal transmission was also observed for deleterious CNVs and SNVs. Although ASD females showed higher mutational burden and lower cognition, the excess mutational burden remained, even after adjustment for those cognitive differences. These results strongly suggest that females have an increased etiological burden unlinked to rare deleterious variants on the X chromosome. Carefully phenotyped and genotyped cohorts will be required for identifying the symptoms, which show gender-specific liability to mutational burden.
465 citations
••
TL;DR: Analysis of exome-sequencing data from 356 trios with the "classical" epileptic encephalopathies, infantile spasms and Lennox Gastaut syndrome, provides statistical evidence that mutations in DNM1 cause epileptic encephalopathy, finds suggestive evidence for a role of three additional genes, and supports a prominent role for de novo mutations in epileptic encephalopathies.
Abstract: Emerging evidence indicates that epileptic encephalopathies are genetically highly heterogeneous, underscoring the need for large cohorts of well-characterized individuals to further define the genetic landscape. Through a collaboration between two consortia (EuroEPINOMICS and Epi4K/EPGP), we analyzed exome-sequencing data of 356 trios with the “classical” epileptic encephalopathies, infantile spasms and Lennox Gastaut syndrome, including 264 trios previously analyzed by the Epi4K/EPGP consortium. In this expanded cohort, we find 429 de novo mutations, including de novo mutations in DNM1 in five individuals and de novo mutations in GABBR2, FASN, and RYR3 in two individuals each. Unlike previous studies, this cohort is sufficiently large to show a significant excess of de novo mutations in epileptic encephalopathy probands compared to the general population using a likelihood analysis (p = 8.2 × 10^-4), supporting a prominent role for de novo mutations in epileptic encephalopathies. We bring statistical evidence that mutations in DNM1 cause epileptic encephalopathy, find suggestive evidence for a role of three additional genes, and show that at least 12% of analyzed individuals have an identifiable causal de novo mutation. Strikingly, 75% of mutations in these probands are predicted to disrupt a protein involved in regulating synaptic transmission, and there is a significant enrichment of de novo mutations in genes in this pathway in the entire cohort as well. These findings emphasize an important role for synaptic dysregulation in epileptic encephalopathies, above and beyond that caused by ion channel dysfunction.
365 citations
••
University of Washington1, University of North Carolina at Chapel Hill2, Veterans Health Administration3, Columbia University4, University of Houston5, Harvard University6, Mayo Clinic7, Cincinnati Children's Hospital Medical Center8, Icahn School of Medicine at Mount Sinai9, Geisinger Medical Center10, University of Minnesota11
TL;DR: Research investigators should be prepared to return research results and incidental findings discovered in the course of their research and meeting an actionability threshold, but they have no ethical obligation to actively search for such results.
Abstract: As more research studies incorporate next-generation sequencing (including whole-genome or whole-exome sequencing), investigators and institutional review boards face difficult questions regarding which genomic results to return to research participants and how. An American College of Medical Genetics and Genomics 2013 policy paper suggesting that pathogenic mutations in 56 specified genes should be returned in the clinical setting has raised the question of whether comparable recommendations should be considered in research settings. The Clinical Sequencing Exploratory Research (CSER) Consortium and the Electronic Medical Records and Genomics (eMERGE) Network are multisite research programs that aim to develop practical strategies for addressing questions concerning the return of results in genomic research. CSER and eMERGE committees have identified areas of consensus regarding the return of genomic results to research participants. In most circumstances, if results meet an actionability threshold for return and the research participant has consented to return, genomic results, along with referral for appropriate clinical follow-up, should be offered to participants. However, participants have a right to decline the receipt of genomic results, even when doing so might be viewed as a threat to the participants' health. Research investigators should be prepared to return research results and incidental findings discovered in the course of their research and meeting an actionability threshold, but they have no ethical obligation to actively search for such results. These positions are consistent with the recognition that clinical research is distinct from medical care in both its aims and its guiding moral principles.
339 citations
••
Broad Institute1, University of Wisconsin–Milwaukee2, University of Washington3, University of Texas Health Science Center at Houston4, Washington University in St. Louis5, University of Pennsylvania6, Erasmus University Rotterdam7, University of Edinburgh8, University of Virginia9, Wake Forest University10, University of Iceland11, Boston University12, Fred Hutchinson Cancer Research Center13, Ohio State University14, University of Iowa15, University of Pittsburgh16, University of Minnesota17, National Institutes of Health18, University of California, Los Angeles19, Yale University20, Icahn School of Medicine at Mount Sinai21, Lund University22, University of Milan23, University of Oxford24, University of Verona25, Duke University26, University of Ottawa27, Merck & Co.28, University of Mississippi29, University of North Carolina at Chapel Hill30, University of Split31, Tufts University32, Harvard University33, Baylor College of Medicine34
TL;DR: The "Exome Array" was used to genotype >200,000 low-frequency and rare coding sequence variants across the genome in 56,538 individuals. Four low-frequency variants had large effects on HDL-C and/or triglycerides, but none of the four was associated with risk for CHD, suggesting that examples of low-frequency coding variants with robust effects on both lipids and CHD will be limited.
Abstract: Low-frequency coding DNA sequence variants in the proprotein convertase subtilisin/kexin type 9 gene (PCSK9) lower plasma low-density lipoprotein cholesterol (LDL-C), protect against risk of coronary heart disease (CHD), and have prompted the development of a new class of therapeutics. It is uncertain whether the PCSK9 example represents a paradigm or an isolated exception. We used the "Exome Array" to genotype >200,000 low-frequency and rare coding sequence variants across the genome in 56,538 individuals (42,208 European ancestry [EA] and 14,330 African ancestry [AA]) and tested these variants for association with LDL-C, high-density lipoprotein cholesterol (HDL-C), and triglycerides. Although we did not identify new genes associated with LDL-C, we did identify four low-frequency (frequencies between 0.1% and 2%) variants (ANGPTL8 rs145464906 [c.361C>T; p.Gln121*], PAFAH1B2 rs186808413 [c.482C>T; p.Ser161Leu], COL18A1 rs114139997 [c.331G>A; p.Gly111Arg], and PCSK7 rs142953140 [c.1511G>A; p.Arg504His]) with large effects on HDL-C and/or triglycerides. None of these four variants was associated with risk for CHD, suggesting that examples of low-frequency coding variants with robust effects on both lipids and CHD will be limited.
309 citations
••
TL;DR: Homozygous variants in DNAH1 were found in 5 of 18 unrelated subjects with infertility caused by multiple morphological abnormalities of the flagella. Infertility was the only symptom of primary ciliary dyskinesia observed in affected subjects, suggesting that DNAH1 function in cilium is not as critical as in sperm flagellum.
Abstract: Ten to fifteen percent of couples are confronted with infertility, and a male factor is involved in approximately half the cases. A genetic etiology is likely in most cases, yet only a few genes have been formally correlated with male infertility. Homozygosity mapping was carried out on a cohort of 20 North African individuals, including 18 index cases, presenting with primary infertility resulting from impaired sperm motility caused by a mosaic of multiple morphological abnormalities of the flagella (MMAF) including absent, short, coiled, bent, and irregular flagella. Five unrelated subjects out of 18 (28%) carried a homozygous variant in DNAH1, which encodes an inner dynein heavy chain and is expressed in testis. RT-PCR, immunostaining, and electron microscopy were carried out on samples from one of the subjects with a mutation located on a donor splice site. Neither the transcript nor the protein was observed in this individual, confirming the pathogenicity of this variant. A general axonemal disorganization including mislocalization of the microtubule doublets and loss of the inner dynein arms was observed. Although DNAH1 is also expressed in other ciliated cells, infertility was the only symptom of primary ciliary dyskinesia observed in affected subjects, suggesting that DNAH1 function in cilium is not as critical as in sperm flagellum.
305 citations
••
TL;DR: The FORGE (Finding of Rare Disease Genes) Canada Consortium brought together clinicians and scientists from 21 genetics centers and three science and technology innovation centers and discussed the way forward for rare-disease genetic discovery both in Canada and internationally.
Abstract: Inherited monogenic disease has an enormous impact on the well-being of children and their families. Over half of the children living with one of these conditions are without a molecular diagnosis because of the rarity of the disease, the marked clinical heterogeneity, and the reality that there are thousands of rare diseases for which causative mutations have yet to be identified. It is in this context that in 2010 a Canadian consortium was formed to rapidly identify mutations causing a wide spectrum of pediatric-onset rare diseases by using whole-exome sequencing. The FORGE (Finding of Rare Disease Genes) Canada Consortium brought together clinicians and scientists from 21 genetics centers and three science and technology innovation centers from across Canada. From nation-wide requests for proposals, 264 disorders were selected for study from the 371 submitted; disease-causing variants (including in 67 genes not previously associated with human disease; 41 of these have been genetically or functionally validated, and 26 are currently under study) were identified for 146 disorders over a 2-year period. Here, we present our experience with four strategies employed for gene discovery and discuss FORGE's impact in a number of realms, from clinical diagnostics to the broadening of the phenotypic spectrum of many diseases to the biological insight gained into both disease states and normal human development. Lastly, on the basis of this experience, we discuss the way forward for rare-disease genetic discovery both in Canada and internationally.
239 citations
••
University of Cincinnati1, Harvard University2, Broad Institute3, Wake Forest University4, Cincinnati Children's Hospital Medical Center5, Jagiellonian University Medical College6, Medical University of Graz7, Lund University8, Pompeu Fabra University9, Autonomous University of Barcelona10, University of Miami11, University of Washington12, University of Michigan13, University of Florida14, University of Virginia15, Mayo Clinic16, University of Arizona17, Ludwig Maximilian University of Munich18, University of Oxford19
TL;DR: A genome-wide association study of intracerebral hemorrhage (ICH) that meta-analyzed data from six studies enrolling individuals of European ancestry demonstrated biological heterogeneity across ICH subtypes and highlighted the importance of ascertaining ICH cases accordingly.
Abstract: Intracerebral hemorrhage (ICH) is the stroke subtype with the worst prognosis and has no established acute treatment. ICH is classified as lobar or nonlobar based on the location of ruptured blood vessels within the brain. These different locations also signal different underlying vascular pathologies. Heritability estimates indicate a substantial genetic contribution to risk of ICH in both locations. We report a genome-wide association study of this condition that meta-analyzed data from six studies that enrolled individuals of European ancestry. Case subjects were ascertained by neurologists blinded to genotype data and classified as lobar or nonlobar based on brain computed tomography. ICH-free control subjects were sampled from ambulatory clinics or random digit dialing. Replication of signals identified in the discovery cohort with p < 1 × 10^-6 was pursued in an independent multiethnic sample utilizing both direct and genome-wide genotyping. The discovery phase included a case cohort of 1,545 individuals (664 lobar and 881 nonlobar cases) and a control cohort of 1,481 individuals and identified two susceptibility loci: for lobar ICH, chromosomal region 12q21.1 (rs11179580, odds ratio [OR] = 1.56, p = 7.0 × 10^-8); and for nonlobar ICH, chromosomal region 1q22 (rs2984613, OR = 1.44, p = 1.6 × 10^-8). The replication included a case cohort of 1,681 individuals (484 lobar and 1,194 nonlobar cases) and a control cohort of 2,261 individuals and corroborated the association for 1q22 (p = 6.5 × 10^-4; meta-analysis p = 2.2 × 10^-10) but not for 12q21.1 (p = 0.55; meta-analysis p = 2.6 × 10^-5). These results demonstrate biological heterogeneity across ICH subtypes and highlight the importance of ascertaining ICH cases accordingly.
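To make the idea of pooling per-study association signals concrete, the sketch below shows a fixed-effect inverse-variance meta-analysis of log odds ratios. The abstract does not state which meta-analysis model was used, so the choice of method here is an assumption, and the input values are hypothetical toy numbers rather than results from the study.

```python
from math import exp, sqrt

def inverse_variance_meta(log_odds_ratios, std_errors):
    """Pool per-study log odds ratios, weighting each by 1 / SE^2."""
    if len(log_odds_ratios) != len(std_errors) or not log_odds_ratios:
        raise ValueError("need one standard error per estimate")
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_odds_ratios)) / total_weight
    pooled_se = sqrt(1.0 / total_weight)
    return pooled, pooled_se

# Hypothetical per-study log odds ratios and standard errors.
pooled, pooled_se = inverse_variance_meta([0.4, 0.2], [0.1, 0.1])
print(f"pooled OR = {exp(pooled):.2f}, z = {pooled / pooled_se:.2f}")
```

More precise studies receive larger weights, so a signal that fails to replicate in a large follow-up sample can pull the pooled estimate toward the null.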
••
TL;DR: It is shown that somatic mosaicism for transmitted mutations among parents of children with simplex genetic disease is more common than currently appreciated, and integrated probabilistic modeling of gametogenesis predicts that mutations in parental blood increase recurrence risk substantially more than parental mutations confined to the germline.
Abstract: New human mutations are thought to originate in germ cells, thus making a recurrence of the same mutation in a sibling exceedingly rare. However, increasing sensitivity of genomic technologies has anecdotally revealed mosaicism for mutations in somatic tissues of apparently healthy parents. Such somatically mosaic parents might also have germline mosaicism that can potentially cause unexpected intergenerational recurrences. Here, we show that somatic mosaicism for transmitted mutations among parents of children with simplex genetic disease is more common than currently appreciated. Using the sensitivity of individual-specific breakpoint PCR, we prospectively screened 100 families with children affected by genomic disorders due to rare deletion copy-number variants (CNVs) determined to be de novo by clinical analysis of parental DNA. Surprisingly, we identified four cases of low-level somatic mosaicism for the transmitted CNV in DNA isolated from parental blood. Integrated probabilistic modeling of gametogenesis developed in response to our observations predicts that mutations in parental blood increase recurrence risk substantially more than parental mutations confined to the germline. Moreover, despite the fact that maternally transmitted mutations are the minority of alleles, our model suggests that sexual dimorphisms in gametogenesis result in a greater proportion of somatically mosaic transmitting mothers who are thus at increased risk of recurrence. Therefore, somatic mosaicism together with sexual differences in gametogenesis might explain a considerable fraction of unexpected recurrences of X-linked recessive disease. Overall, our results underscore an important role for somatic mosaicism and mitotic replicative mutational mechanisms in transmission genetics.
••
TL;DR: It is predicted that some glycosylation disorders might occur with greater frequency than current estimates of their prevalence, and the prevalence of some disorders differs substantially between European and African Americans.
Abstract: Over 100 human genetic disorders result from mutations in glycosylation-related genes. In 2013, a new glycosylation disorder was reported every 17 days. This trend will probably continue given that at least 2% of the human genome encodes glycan-biosynthesis and -recognition proteins. Established biosynthetic pathways provide many candidate genes, but finding unanticipated mutated genes will offer new insights into glycosylation. Simple glycobiomarkers can be used in narrowing the candidates identified by exome and genome sequencing, and those can be validated by glycosylation analysis of serum or cells from affected individuals. Model organisms will expand the understanding of these mutations' impact on glycosylation and pathology. Here, we highlight some recently discovered glycosylation disorders and the barriers, breakthroughs, and surprises they presented. We predict that some glycosylation disorders might occur with greater frequency than current estimates of their prevalence. Moreover, the prevalence of some disorders differs substantially between European and African Americans.
••
University of Pennsylvania1, University College London2, University of North Carolina at Chapel Hill3, University of Warwick4, McMaster University5, Jackson State University6, Utrecht University7, Baylor College of Medicine8, University of Vermont9, Merck & Co.10, University of Texas Health Science Center at Houston11, Harvard University12, University of Washington13, University of Minnesota14, University of Mississippi15, University of London16, University of Virginia17, Icahn School of Medicine at Mount Sinai18, Fred Hutchinson Cancer Research Center19
TL;DR: Mendelian randomization analyses using a genetic score comprising 14 BMI-associated SNPs from a recent discovery analysis identified causal effects of BMI on several cardiometabolic traits; however, whether BMI causally impacts CHD risk requires further evidence.
Abstract: Elevated body mass index (BMI) associates with cardiometabolic traits on observational analysis, yet the underlying causal relationships remain unclear. We conducted Mendelian randomization analyses by using a genetic score (GS) comprising 14 BMI-associated SNPs from a recent discovery analysis to investigate the causal role of BMI in cardiometabolic traits and events. We used eight population-based cohorts, including 34,538 European-descent individuals (4,407 type 2 diabetes (T2D), 6,073 coronary heart disease (CHD), and 3,813 stroke cases). A 1 kg/m^2 genetically elevated BMI increased fasting glucose (0.18 mmol/l; 95% confidence interval (CI) = 0.12-0.24), fasting insulin (8.5%; 95% CI = 5.9-11.1), interleukin-6 (7.0%; 95% CI = 4.0-10.1), and systolic blood pressure (0.70 mmHg; 95% CI = 0.24-1.16) and reduced high-density lipoprotein cholesterol (-0.02 mmol/l; 95% CI = -0.03 to -0.01) and low-density lipoprotein cholesterol (LDL-C; -0.04 mmol/l; 95% CI = -0.07 to -0.01). Observational and causal estimates were directionally concordant, except for LDL-C. A 1 kg/m^2 genetically elevated BMI increased the odds of T2D (odds ratio [OR] = 1.27; 95% CI = 1.18-1.36) but did not alter risk of CHD (OR = 1.01; 95% CI = 0.94-1.08) or stroke (OR = 1.03; 95% CI = 0.95-1.12). A meta-analysis incorporating published studies reporting 27,465 CHD events in 219,423 individuals yielded a pooled OR of 1.04 (95% CI = 0.97-1.12) per 1 kg/m^2 increase in BMI. In conclusion, we identified causal effects of BMI on several cardiometabolic traits; however, whether BMI causally impacts CHD risk requires further evidence.
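The logic of a Mendelian randomization estimate like those above can be sketched with the Wald ratio: the effect of the genetic instrument on the outcome divided by its effect on the exposure. This is a simplified stand-in, since the abstract does not specify the estimator used with the genetic score, and all input values below are hypothetical. The standard error uses a first-order approximation that ignores uncertainty in the instrument-exposure effect.

```python
def wald_ratio(beta_gene_exposure, beta_gene_outcome, se_gene_outcome):
    """Causal effect of the exposure on the outcome per unit of exposure.

    beta_gene_exposure: change in exposure (e.g. BMI in kg/m^2) per unit of genetic score
    beta_gene_outcome:  change in outcome per unit of genetic score
    se_gene_outcome:    standard error of beta_gene_outcome
    """
    if beta_gene_exposure == 0:
        raise ValueError("the instrument must be associated with the exposure")
    estimate = beta_gene_outcome / beta_gene_exposure
    std_error = se_gene_outcome / abs(beta_gene_exposure)
    ci_95 = (estimate - 1.96 * std_error, estimate + 1.96 * std_error)
    return estimate, std_error, ci_95

# Hypothetical inputs: the score raises BMI by 0.5 kg/m^2 and the outcome by 0.1 units.
estimate, std_error, ci_95 = wald_ratio(0.5, 0.1, 0.02)
print(estimate, std_error, ci_95)
```

Because the genetic score is fixed at conception, the ratio is less exposed to confounding and reverse causation than the observational association, provided the score affects the outcome only through BMI.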
••
University of North Carolina at Chapel Hill1, University of Michigan2, Renaissance Computing Institute3, University of Washington4, Broad Institute5, University of Wisconsin–Milwaukee6, Harvard University7, University of Oxford8, Norwegian University of Science and Technology9, Icahn School of Medicine at Mount Sinai10, University of Vermont11, Fred Hutchinson Cancer Research Center12, Erasmus University Rotterdam13, University of Mississippi14, University of Iceland15, University of Minnesota16, Washington University in St. Louis17, University of Edinburgh18, University of Texas Health Science Center at Houston19, University of Pittsburgh20, George Washington University21, University of Iowa22, Stanford University23, University of Auckland24, Ohio State University25, Boston University26, University of California, Los Angeles27, Jackson State University28, University of Copenhagen29, Technische Universität München30, Baylor College of Medicine31, Johns Hopkins University32, Group Health Cooperative33, University of Virginia34
TL;DR: This large whole-exome-sequencing study for LDL-C identified a gene not known to be implicated in LDL-C and provides unique insight into the design and analysis of similar experiments.
Abstract: Elevated low-density lipoprotein cholesterol (LDL-C) is a treatable, heritable risk factor for cardiovascular disease. Genome-wide association studies (GWASs) have identified 157 variants associated with lipid levels but are not well suited to assess the impact of rare and low-frequency variants. To determine whether rare or low-frequency coding variants are associated with LDL-C, we exome sequenced 2,005 individuals, including 554 individuals selected for extreme LDL-C (>98th or <2nd percentile). Follow-up analyses included sequencing of 1,302 additional individuals and genotype-based analysis of 52,221 individuals. We observed significant evidence of association between LDL-C and the burden of rare or low-frequency variants in PNPLA5, encoding a phospholipase-domain-containing protein, and both known and previously unidentified variants in PCSK9, LDLR and APOB, three known lipid-related genes. The effect sizes for the burden of rare variants for each associated gene were substantially higher than those observed for individual SNPs identified from GWASs. We replicated the PNPLA5 signal in an independent large-scale sequencing study of 2,084 individuals. In conclusion, this large whole-exome-sequencing study for LDL-C identified a gene not known to be implicated in LDL-C and provides unique insight into the design and analysis of similar experiments.
••
University of Melbourne1, Hoffmann-La Roche2, Monash University3, Max Planck Society4, University of Wisconsin-Madison5, University of Louisville6, Walter and Eliza Hall Institute of Medical Research7, Royal Hobart Hospital8, Royal Melbourne Hospital9, Boston Children's Hospital10, Oregon Health & Science University11, Griffith University12
TL;DR: Loss-of-function mutations in RAB39B are shown to cause intellectual disability and pathologically confirmed early-onset Parkinson disease, with a spectrum of neuropathological features that implicate RAB39B in the pathogenesis of Parkinson disease and potentially other neurodegenerative disorders.
Abstract: Advances in understanding the etiology of Parkinson disease have been driven by the identification of causative mutations in families. Genetic analysis of an Australian family with three males displaying clinical features of early-onset parkinsonism and intellectual disability identified a ∼45 kb deletion resulting in the complete loss of RAB39B. We subsequently identified a missense mutation (c.503C>A [p.Thr168Lys]) in RAB39B in an unrelated Wisconsin kindred affected by a similar clinical phenotype. In silico and in vitro studies demonstrated that the mutation destabilized the protein, consistent with loss of function. In vitro small-hairpin-RNA-mediated knockdown of Rab39b resulted in a reduction in the density of α-synuclein immunoreactive puncta in dendritic processes of cultured neurons. In addition, in multiple cell models, we demonstrated that knockdown of Rab39b was associated with reduced steady-state levels of α-synuclein. Post mortem studies demonstrated that loss of RAB39B resulted in pathologically confirmed Parkinson disease. There was extensive dopaminergic neuron loss in the substantia nigra and widespread classic Lewy body pathology. Additional pathological features included cortical Lewy bodies, brain iron accumulation, tau immunoreactivity, and axonal spheroids. Overall, we have shown that loss-of-function mutations in RAB39B cause intellectual disability and pathologically confirmed early-onset Parkinson disease. The loss of RAB39B results in dysregulation of α-synuclein homeostasis and a spectrum of neuropathological features that implicate RAB39B in the pathogenesis of Parkinson disease and potentially other neurodegenerative disorders.
••
TL;DR: A large-scale fine-mapping study of PsV risk in the MHC region in 9,247 PsV-affected individuals and 13,589 controls of European descent, performed by imputing class I and II human leukocyte antigen (HLA) genes from SNP genotype data, shows the value of high-resolution HLA and MICA imputation for fine mapping causal variants in the MHC.
Abstract: Psoriasis vulgaris (PsV) risk is strongly associated with variation within the major histocompatibility complex (MHC) region, but its genetic architecture has yet to be fully elucidated. Here, we conducted a large-scale fine-mapping study of PsV risk in the MHC region in 9,247 PsV-affected individuals and 13,589 controls of European descent by imputing class I and II human leukocyte antigen (HLA) genes from SNP genotype data. In addition, we imputed sequence variants for MICA, an MHC HLA-like gene that has been associated with PsV, to evaluate association at that locus as well. We observed that HLA-C∗06:02 demonstrated the lowest p value for overall PsV risk (p = 1.7 × 10^−364). Stepwise analysis revealed multiple HLA-C∗06:02-independent risk variants in both class I and class II HLA genes for PsV susceptibility (HLA-C∗12:03, HLA-B amino acid positions 67 and 9, HLA-A amino acid position 95, and HLA-DQα1 amino acid position 53; p
••
TL;DR: Phevor improves diagnostic accuracy not only for individuals presenting with established disease phenotypes but also for those with previously undescribed and atypical disease presentations, and it is not limited to known diseases or known disease-causing alleles.
Abstract: Phevor integrates phenotype, gene function, and disease information with personal genomic data for improved power to identify disease-causing alleles. Phevor works by combining knowledge resident in multiple biomedical ontologies with the outputs of variant-prioritization tools. It does so by using an algorithm that propagates information across and between ontologies. This process enables Phevor to accurately reprioritize potentially damaging alleles identified by variant-prioritization tools in light of gene function, disease, and phenotype knowledge. Phevor is especially useful for single-exome and family-trio-based diagnostic analyses, the most commonly occurring clinical scenarios and ones for which existing personal genome diagnostic tools are most inaccurate and underpowered. Here, we present a series of benchmark analyses illustrating Phevor’s performance characteristics. Also presented are three recent Utah Genome Project case studies in which Phevor was used to identify disease-causing alleles. Collectively, these results show that Phevor improves diagnostic accuracy not only for individuals presenting with established disease phenotypes but also for those with previously undescribed and atypical disease presentations. Importantly, Phevor is not limited to known diseases or known disease-causing alleles. As we demonstrate, Phevor can also use latent information in ontologies to discover genes and disease-causing alleles not previously associated with disease.
••
TL;DR: Data demonstrated that mutations in two genes, IRF6 and GRHL3, can lead to nearly identical phenotypes of orofacial cleft and supported the hypotheses that both genes are essential for the presence of a functional oral periderm and that failure of this process contributes to VWS.
Abstract: Mutations in interferon regulatory factor 6 (IRF6) account for ∼70% of cases of Van der Woude syndrome (VWS), the most common syndromic form of cleft lip and palate. In 8 of 45 VWS-affected families lacking a mutation in IRF6, we found coding mutations in grainyhead-like 3 (GRHL3). According to a zebrafish-based assay, the disease-associated GRHL3 mutations abrogated periderm development and were consistent with a dominant-negative effect, in contrast to haploinsufficiency seen in most VWS cases caused by IRF6 mutations. In mouse, all embryos lacking Grhl3 exhibited abnormal oral periderm and 17% developed a cleft palate. Analysis of the oral phenotype of double heterozygote (Irf6+/−;Grhl3+/−) murine embryos failed to detect epistasis between the two genes, suggesting that they function in separate but convergent pathways during palatogenesis. Taken together, our data demonstrated that mutations in two genes, IRF6 and GRHL3, can lead to nearly identical phenotypes of orofacial cleft. They supported the hypotheses that both genes are essential for the presence of a functional oral periderm and that failure of this process contributes to VWS.
••
TL;DR: Strong signatures of recent positive selection are detected in eastern African populations and the Fulani from central Africa, and haplotype analysis supports an eastern African origin of the C-14010 LP-associated mutation in southern Africa.
Abstract: In humans, the ability to digest lactose, the sugar in milk, declines after weaning because of decreasing levels of the enzyme lactase-phlorizin hydrolase, encoded by LCT. However, some individuals maintain high enzyme amounts and are able to digest lactose into adulthood (i.e., they have the lactase-persistence [LP] trait). It is thought that selection has played a major role in maintaining this genetically determined phenotypic trait in different human populations that practice pastoralism. To identify variants associated with the LP trait and to study its evolutionary history in Africa, we sequenced MCM6 introns 9 and 13 and ∼2 kb of the LCT promoter region in 819 individuals from 63 African populations and in 154 non-Africans from nine populations. We also genotyped four microsatellites in an ∼198 kb region in a subset of 252 individuals to reconstruct the origin and spread of LP-associated variants in Africa. Additionally, we examined the association between LP and genetic variability at candidate regulatory regions in 513 individuals from eastern Africa. Our analyses confirmed the association between the LP trait and three common variants in intron 13 (C-14010, G-13907, and G-13915). Furthermore, we identified two additional LP-associated SNPs in intron 13 and the promoter region (G-12962 and T-956, respectively). Using neutrality tests based on the allele frequency spectrum and long-range linkage disequilibrium, we detected strong signatures of recent positive selection in eastern African populations and the Fulani from central Africa. In addition, haplotype analysis supported an eastern African origin of the C-14010 LP-associated mutation in southern Africa.
••
University of Texas Health Science Center at Houston1, Spanish National Research Council2, University of Texas MD Anderson Cancer Center3, University College London4, North Shore-LIJ Health System5, Autonomous University of Madrid6, Autonomous University of Barcelona7, University of Groningen8, Karolinska Institutet9, Oklahoma Medical Research Foundation10, University of Granada11, University of Queensland12, University of Milan13, Charité14, Hannover Medical School15, University of Cologne16, University of Erlangen-Nuremberg17, VU University Amsterdam18, Leiden University19, Lund University20, University of Verona21, University of Glasgow22, Newcastle University23, University of Manchester24, Johns Hopkins University25, Northwestern University26, McGill University27, University of Western Ontario28, University of California, Los Angeles29, University of Michigan30, University of Minnesota31, Medical University of South Carolina32, Georgetown University33, Boston University34, University of Alabama at Birmingham35, University of Utah36, Carolinas Healthcare System37, Moncton Hospital38, Alberta Health Services39, McMaster University40, University of Saskatchewan41, University of Manitoba42, Utrecht University43, Radboud University Nijmegen44
TL;DR: This study has increased the number of known genetic associations with SSc, provided further insight into the pleiotropic effects of shared autoimmune risk factors, and highlighted the power of dense mapping for detecting previously overlooked susceptibility loci.
Abstract: In this study, 1,833 systemic sclerosis (SSc) cases and 3,466 controls were genotyped with the Immunochip array. Classical alleles, amino acid residues, and SNPs across the human leukocyte antigen (HLA) region were imputed and tested. These analyses resulted in a model composed of six polymorphic amino acid positions and seven SNPs that explained the observed significant associations in the region. In addition, a replication step comprising 4,017 SSc cases and 5,935 controls was carried out for several selected non-HLA variants, reaching a total of 5,850 cases and 9,401 controls of European ancestry. Following this strategy, we identified and validated three SSc risk loci, including DNASE1L3 at 3p14, the SCHIP1-IL12A locus at 3q25, and ATG5 at 6q21, as well as a suggested association of the TREH-DDX6 locus at 11q23. The associations of several previously reported SSc risk loci were validated and further refined, and the observed peak of association in PXK was related to DNASE1L3. Our study has increased the number of known genetic associations with SSc, provided further insight into the pleiotropic effects of shared autoimmune risk factors, and highlighted the power of dense mapping for detecting previously overlooked susceptibility loci.
••
TL;DR: Autosomal-recessive variants in MCM9 cause a genomic-instability syndrome associated with hypergonadotropic hypogonadism and short stature; preferential sensitivity of germline meiosis to MCM9 functional deficiency and compromised DNA repair in the somatic component most likely account for the ovarian failure and short stature.
Abstract: Premature ovarian failure (POF) is genetically heterogeneous and manifests as hypergonadotropic hypogonadism either as part of a syndrome or in isolation. We studied two unrelated consanguineous families with daughters exhibiting primary amenorrhea, short stature, and a 46,XX karyotype. A combination of SNP arrays, comparative genomic hybridization arrays, and whole-exome sequencing analyses identified homozygous pathogenic variants in MCM9, a gene implicated in homologous recombination and repair of double-stranded DNA breaks. In one family, the MCM9 c.1732+2T>C variant alters a splice donor site, resulting in abnormal alternative splicing and truncated forms of MCM9 that are unable to be recruited to sites of DNA damage. In the second family, MCM9 c.394C>T (p.Arg132∗) results in a predicted loss of functional MCM9. Repair of chromosome breaks was impaired in lymphocytes from affected, but not unaffected, females in both families, consistent with MCM9 function in homologous recombination. Autosomal-recessive variants in MCM9 cause a genomic-instability syndrome associated with hypergonadotropic hypogonadism and short stature. Preferential sensitivity of germline meiosis to MCM9 functional deficiency and compromised DNA repair in the somatic component most likely account for the ovarian failure and short stature.
••
Broad Institute1, Partners HealthCare2, Brigham and Women's Hospital3, Central Manchester University Hospitals NHS Foundation Trust4, University of Manchester5, Karolinska University Hospital6, Leiden University7, University Medical Center Groningen8, Umeå University9, Spanish National Research Council10, Merck & Co.11, North Shore-LIJ Health System12, Utrecht University13
TL;DR: A statistical approach is used to identify and adjust for clinical heterogeneity within ACPA(-) RA; the observed independent associations for serine and leucine at position 11 in HLA-DRβ1 and for aspartate at position 9 in HLA-B, both within the peptide-binding grooves, contribute to mounting evidence that ACPA(+) and ACPA(-) RA are genetically distinct and potentially have separate autoantigens contributing to pathogenesis.
Abstract: Despite progress in defining human leukocyte antigen (HLA) alleles for anti-citrullinated-protein-autoantibody-positive (ACPA(+)) rheumatoid arthritis (RA), identifying HLA alleles for ACPA-negativ ...
••
TL;DR: This study suggests that the IFIH1 mutations are responsible for the AGS phenotype due to an excessive production of type I interferon.
Abstract: Aicardi-Goutieres syndrome (AGS) is a rare, genetically determined early-onset progressive encephalopathy. To date, mutations in six genes have been identified as etiologic for AGS. Our Japanese nationwide AGS survey identified six AGS-affected individuals without a molecular diagnosis; we performed whole-exome sequencing on three of these individuals. After removal of the common polymorphisms found in SNP databases, we were able to identify IFIH1 heterozygous missense mutations in all three. In vitro functional analysis revealed that IFIH1 mutations increased type I interferon production and that the transcription of interferon-stimulated genes was elevated. IFIH1 encodes MDA5, and mutant MDA5 lacked ligand-specific responsiveness, similarly to the dominant Ifih1 mutation responsible for the SLE mouse model that results in type I interferon overproduction. This study suggests that the IFIH1 mutations are responsible for the AGS phenotype due to an excessive production of type I interferon.
••
TL;DR: Alterations in RNA and protein expression levels of CoA synthase, as well as in CoA amount, are shown in fibroblasts derived from the two clinical cases and in yeast, making this the second inborn error of coenzyme A biosynthesis to be implicated in NBIA.
Abstract: Neurodegeneration with brain iron accumulation (NBIA) comprises a clinically and genetically heterogeneous group of disorders with progressive extrapyramidal signs and neurological deterioration, characterized by iron accumulation in the basal ganglia. Exome sequencing revealed the presence of recessive missense mutations in COASY, encoding coenzyme A (CoA) synthase in one NBIA-affected subject. A second unrelated individual carrying mutations in COASY was identified by Sanger sequence analysis. CoA synthase is a bifunctional enzyme catalyzing the final steps of CoA biosynthesis by coupling phosphopantetheine with ATP to form dephospho-CoA and its subsequent phosphorylation to generate CoA. We demonstrate alterations in RNA and protein expression levels of CoA synthase, as well as CoA amount, in fibroblasts derived from the two clinical cases and in yeast. This is the second inborn error of coenzyme A biosynthesis to be implicated in NBIA.
••
University of Washington1, Boston Children's Hospital2, Pontifical Catholic University of Chile3, University of North Carolina at Chapel Hill4, University of Utah5, University of New Mexico6, University of Manchester7, University of California, San Francisco8, Katholieke Universiteit Leuven9, Christchurch Hospital10, University of Florence11, Cedars-Sinai Medical Center12, University of British Columbia13, University of Texas Health Science Center at Houston14, University College London15, University of Oxford16, Ghent University17, University of New South Wales18, University of Otago19, Cincinnati Children's Hospital Medical Center20, University of Hawaii at Manoa21, Near East University22, Uludağ University23, Indiana University24, Southern General Hospital25, Geisinger Medical Center26
TL;DR: Findings indicate that GS, DA5, and MWS, although traditionally considered separate disorders, are etiologically related and perhaps represent variable expressivity of the same condition.
Abstract: Gordon syndrome (GS), or distal arthrogryposis type 3, is a rare, autosomal-dominant disorder characterized by cleft palate and congenital contractures of the hands and feet. Exome sequencing of five GS-affected families identified mutations in piezo-type mechanosensitive ion channel component 2 (PIEZO2) in each family. Sanger sequencing revealed PIEZO2 mutations in five of seven additional families studied (for a total of 10/12 [83%] individuals), and nine families had an identical c.8057G>A (p.Arg2686His) mutation. The phenotype of GS overlaps with distal arthrogryposis type 5 (DA5) and Marden-Walker syndrome (MWS). Using molecular inversion probes for targeted sequencing to screen PIEZO2, we found mutations in 24/29 (82%) DA5-affected families and one of two MWS-affected families. The presence of cleft palate was significantly associated with c.8057G>A (Fisher's exact test, adjusted p value < 0.0001). Collectively, although GS, DA5, and MWS have traditionally been considered separate disorders, our findings indicate that they are etiologically related and perhaps represent variable expressivity of the same condition.
••
Utrecht University1, Queen Mary University of London2, University of Michigan3, McMaster University4, Case Western Reserve University5, University of North Carolina at Chapel Hill6, University of California, San Diego7, University of Pennsylvania8, University of Glasgow9, Stanford University10, Cleveland Clinic11, University of Maryland, Baltimore12, University of Oxford13, University of Wisconsin-Madison14, University of Bristol15, University of Florida16, Erasmus University Rotterdam17, Heidelberg University18, University of Groningen19, Lund University20, Glenfield Hospital21, University of Leicester22, University of Minnesota23, Columbia University24, University College London25, Tulane University26, Harvard University27, University of Cambridge28, University of Washington29, University of Dundee30, Merck & Co.31, Cedars-Sinai Medical Center32, Dalhousie University33, University College Dublin34, VU University Amsterdam35, Fred Hutchinson Cancer Research Center36, Boston University37, Imperial College London38, University of Mississippi39, Johns Hopkins University40, University of Amsterdam41, University of Texas Health Science Center at Houston42, University of London43, University of Pittsburgh44, Hannover Medical School45, University of Ulm46, Medical University of Graz47, Icahn School of Medicine at Mount Sinai48, Royal College of Surgeons in Ireland49, Brigham and Women's Hospital50
TL;DR: The findings extend the understanding of genes involved in BP regulation, which may provide new targets for therapeutic intervention or drug response stratification and provide support for a putative role in hypertension of several genes.
Abstract: Blood pressure (BP) is a heritable risk factor for cardiovascular disease. To investigate genetic associations with systolic BP (SBP), diastolic BP (DBP), mean arterial pressure (MAP), and pulse pressure (PP), we genotyped ~50,000 SNPs in up to 87,736 individuals of European ancestry and combined these in a meta-analysis. We replicated findings in an independent set of 68,368 individuals of European ancestry. Our analyses identified 11 previously undescribed associations in independent loci containing 31 genes including PDE1A, HLA-DQB1, CDK6, PRKAG2, VCL, H19, NUCB2, RELA, HOXC@ complex, FBN1, and NFAT5 at the Bonferroni-corrected array-wide significance threshold (p < 6 × 10^−7) and confirmed 27 previously reported associations. Bioinformatic analysis of the 11 loci provided support for a putative role in hypertension of several genes, such as CDK6 and NUCB2. Analysis of potential pharmacological targets in databases of small molecules showed that ten of the genes are predicted to be a target for small molecules. In summary, we identified previously unknown loci associated with BP. Our findings extend our understanding of genes involved in BP regulation, which may provide new targets for therapeutic intervention or drug response stratification.
••
TL;DR: Bayesian phylogenetic analysis allows us to reconstruct a history of early Austronesians arriving in Taiwan in the north, spreading rapidly to the south, and leaving Taiwan ~4,000 years ago to spread throughout Island Southeast Asia, Madagascar, and Oceania.
Abstract: A Taiwan origin for the expansion of the Austronesian languages and their speakers is well supported by linguistic and archaeological evidence. However, human genetic evidence is more controversial. Until now, there had been no ancient skeletal evidence of a potential Austronesian-speaking ancestor prior to the Taiwan Neolithic ∼6,000 years ago, and genetic studies have largely ignored the role of genetic diversity within Taiwan as well as the origins of Formosans. We address these issues via analysis of a complete mitochondrial DNA genome sequence of an ∼8,000-year-old skeleton from Liang Island (located between China and Taiwan) and 550 mtDNA genome sequences from 8 aboriginal (highland) Formosan and 4 other Taiwanese groups. We show that the Liangdao Man mtDNA sequence is closest to Formosans, provides a link to southern China, and has the most ancestral haplogroup E sequence found among extant Austronesian speakers. Bayesian phylogenetic analysis allows us to reconstruct a history of early Austronesians arriving in Taiwan in the north ∼6,000 years ago, spreading rapidly to the south, and leaving Taiwan ∼4,000 years ago to spread throughout Island Southeast Asia, Madagascar, and Oceania.
••
TL;DR: The theoretical basis of principal component analysis (PCA) is reviewed, the behavior of PCA when testing for association between a SNP and correlated traits is described, and the power of various PCA-based strategies is compared by simulation when analyzing up to 100 correlated traits.
Abstract: Many human traits are highly correlated. This correlation can be leveraged to improve the power of genetic association tests to identify markers associated with one or more of the traits. Principal component analysis (PCA) is a useful tool that has been widely used for the multivariate analysis of correlated variables. PCA is usually applied as a dimension reduction method: the few top principal components (PCs) explaining most of total trait variance are tested for association with a predictor of interest, and the remaining components are not analyzed. In this study we review the theoretical basis of PCA and describe the behavior of PCA when testing for association between a SNP and correlated traits. We then use simulation to compare the power of various PCA-based strategies when analyzing up to 100 correlated traits. We show that contrary to widespread practice, testing only the top PCs often has low power, whereas combining signal across all PCs can have greater power. This power gain is primarily due to increased power to detect genetic variants with opposite effects on positively correlated traits and variants that are exclusively associated with a single trait. Relative to other methods, the combined-PC approach has close to optimal power in all scenarios considered while offering more flexibility and more robustness to potential confounders. Finally, we apply the proposed PCA strategy to the genome-wide association study of five correlated coagulation traits where we identify two candidate SNPs that were not found by the standard approach.
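The contrast between testing only the top PC and combining signal across all PCs can be illustrated with a minimal two-trait sketch. It is not the authors' implementation. It assumes two positively correlated, standardized traits, for which the PCs of the correlation matrix are exactly the scaled sum (top PC) and the scaled difference; it uses n·r² as an approximate 1-df chi-square statistic per PC; and it assumes the per-PC statistics are approximately independent under the null, so that their sum is referred to a 2-df chi-square. All function names, effect sizes, and the simulated data are hypothetical.

```python
import math
import random


def standardize(values):
    """Center and scale a list to mean 0 and (population) SD 1."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def score_stat(component, genotype):
    """n * r^2, approximately chi-square(1) when there is no association."""
    n = len(component)
    a = standardize(component)
    b = standardize(genotype)
    r = sum(x * y for x, y in zip(a, b)) / n
    return n * r * r


def two_trait_pc_tests(trait1, trait2, genotype):
    """Test the sum PC, the difference PC, and their combination."""
    z1, z2 = standardize(trait1), standardize(trait2)
    # For two standardized traits the PCs are (z1 + z2)/sqrt(2) and (z1 - z2)/sqrt(2).
    sum_pc = [(a + b) / math.sqrt(2) for a, b in zip(z1, z2)]
    diff_pc = [(a - b) / math.sqrt(2) for a, b in zip(z1, z2)]
    s_sum = score_stat(sum_pc, genotype)
    s_diff = score_stat(diff_pc, genotype)
    combined = s_sum + s_diff
    return {
        "sum_pc_stat": s_sum,
        "diff_pc_stat": s_diff,
        "combined_stat": combined,
        # Chi-square(1) tail: erfc(sqrt(x / 2)); chi-square(2) tail: exp(-x / 2).
        "sum_pc_p": math.erfc(math.sqrt(s_sum / 2)),
        "combined_p": math.exp(-combined / 2),
    }


def simulate_opposite_effects(n=2000, seed=1):
    """Hypothetical data: a SNP with opposite effects on two correlated traits."""
    rng = random.Random(seed)
    trait1, trait2, genotype = [], [], []
    for _ in range(n):
        g = sum(rng.random() < 0.3 for _ in range(2))  # allele count 0, 1, or 2
        shared = rng.gauss(0, 1)  # shared factor makes the traits correlated
        trait1.append(shared + 0.5 * rng.gauss(0, 1) + 0.2 * g)
        trait2.append(shared + 0.5 * rng.gauss(0, 1) - 0.2 * g)
        genotype.append(g)
    return trait1, trait2, genotype


if __name__ == "__main__":
    t1, t2, g = simulate_opposite_effects()
    for name, value in two_trait_pc_tests(t1, t2, g).items():
        print(f"{name}: {value:.4g}")
```

In this simulated setting the opposite effects largely cancel in the sum PC, so a test restricted to the top PC sees little signal, while the difference PC, and therefore the combined statistic, captures it. With more than two traits the PCs would have to come from an eigendecomposition of the trait correlation matrix rather than from this closed form.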
••
TL;DR: It is suggested that mutations in NOTCH1 are the most common cause of AOS and add to a growing list of human diseases that have a vascular and/or bony component and are caused by alterations in the Notch signaling pathway.
Abstract: Notch signaling determines and reinforces cell fate in bilaterally symmetric multicellular eukaryotes. Despite the involvement of Notch in many key developmental systems, human mutations in Notch signaling components have mainly been described in disorders with vascular and bone effects. Here, we report five heterozygous NOTCH1 variants in unrelated individuals with Adams-Oliver syndrome (AOS), a rare disease with major features of aplasia cutis of the scalp and terminal transverse limb defects. Using whole-genome sequencing in a cohort of 11 families lacking mutations in the four genes with known roles in AOS pathology (ARHGAP31, RBPJ, DOCK6, and EOGT), we found a heterozygous de novo 85 kb deletion spanning the NOTCH1 5′ region and three coding variants (c.1285T>C [p.Cys429Arg], c.4487G>A [p.Cys1496Tyr], and c.5965G>A [p.Asp1989Asn]), two of which are de novo, in four unrelated probands. In a fifth family, we identified a heterozygous canonical splice-site variant (c.743−1G>T) in an affected father and daughter. These variants were not present in 5,077 in-house control genomes or in public databases. In keeping with the prominent developmental role described for Notch1 in mouse vasculature, we observed cardiac and multiple vascular defects in four of the five families. We propose that the limb and scalp defects might also be due to a vasculopathy in NOTCH1-related AOS. Our results suggest that mutations in NOTCH1 are the most common cause of AOS and add to a growing list of human diseases that have a vascular and/or bony component and are caused by alterations in the Notch signaling pathway.