Showing papers in "American Journal of Human Genetics in 2015"


Journal ArticleDOI
TL;DR: LDpred is introduced, a method that infers the posterior mean effect size of each marker by using a prior on effect sizes and LD information from an external reference panel, and outperforms the approach of pruning followed by thresholding, particularly at large sample sizes.
Abstract: Polygenic risk scores have shown great promise in predicting complex disease risk and will become more accurate as training sample sizes increase. The standard approach for calculating risk scores involves linkage disequilibrium (LD)-based marker pruning and applying a p value threshold to association statistics, but this discards information and can reduce predictive accuracy. We introduce LDpred, a method that infers the posterior mean effect size of each marker by using a prior on effect sizes and LD information from an external reference panel. Theory and simulations show that LDpred outperforms the approach of pruning followed by thresholding, particularly at large sample sizes. Accordingly, predicted R² increased from 20.1% to 25.3% in a large schizophrenia dataset and from 9.8% to 12.0% in a large multiple sclerosis dataset. A similar relative improvement in accuracy was observed for three additional large disease datasets and for non-European schizophrenia samples. The advantage of LDpred over existing methods will grow as sample sizes increase.
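
The contrast between hard thresholding and posterior-mean shrinkage can be illustrated in the simplest special case. The sketch below assumes unlinked markers, standardized genotypes, and a Gaussian (infinitesimal) prior, under which the posterior mean is the marginal estimate scaled by h²/(h² + M/N). It ignores LD entirely, so it is not LDpred itself; the function names and all numbers are hypothetical.

```python
def shrinkage_factor(h2, n_samples, n_markers):
    """Shrinkage applied to marginal effects under an infinitesimal prior.

    Assumption: no LD between markers and standardized genotypes, so the
    posterior mean is beta_hat * h2 / (h2 + M / N).
    """
    return h2 / (h2 + n_markers / n_samples)


def posterior_mean_effects(beta_hat, h2, n_samples, n_markers):
    """Scale every marginal effect estimate by the common shrinkage factor."""
    factor = shrinkage_factor(h2, n_samples, n_markers)
    return [factor * b for b in beta_hat]


def threshold_effects(beta_hat, p_values, p_cutoff):
    """Hard thresholding: keep an effect only if its p value passes the cutoff.

    The LD-pruning step of pruning-plus-thresholding is omitted here.
    """
    return [b if p <= p_cutoff else 0.0 for b, p in zip(beta_hat, p_values)]


if __name__ == "__main__":
    beta_hat = [0.30, -0.02, 0.01]      # hypothetical marginal estimates
    p_values = [1e-9, 0.20, 0.60]       # hypothetical p values
    print(threshold_effects(beta_hat, p_values, p_cutoff=1e-5))
    # Shrinkage weakens as the training sample grows relative to marker count.
    for n in (10_000, 100_000, 1_000_000):
        print(n, shrinkage_factor(h2=0.5, n_samples=n, n_markers=100_000))
```

Thresholding discards the two weaker markers outright, whereas shrinkage retains every marker with a weight that approaches 1 as N grows. This matches the abstract's point that the advantage grows with sample size.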

1,088 citations


Journal ArticleDOI
TL;DR: This statement represents current opinion by the ASHG on the ethical, legal, and social issues concerning genetic testing in children and a broad range of test technologies and their applications in clinical medicine and research.
Abstract: In 1995, the American Society of Human Genetics (ASHG) and American College of Medical Genetics and Genomics (ACMG) jointly published a statement on genetic testing in children and adolescents. In the past 20 years, much has changed in the field of genetics, including the development of powerful new technologies, new data from genetic research on children and adolescents, and substantial clinical experience. This statement represents current opinion by the ASHG on the ethical, legal, and social issues concerning genetic testing in children. These recommendations are relevant to families, clinicians, and investigators. After a brief review of the 1995 statement and major changes in genetic technologies in recent years, this statement offers points to consider on a broad range of test technologies and their applications in clinical medicine and research. Recommendations are also made for record and communication issues in this domain and for professional education.

693 citations


Journal ArticleDOI
TL;DR: This collaborative effort has identified 956 genes, including 375 not previously associated with human health, that underlie a Mendelian phenotype; the results provide insight into study design and analytical strategies, identify novel mechanisms of disease, and reveal the extensive clinical variability of Mendelian phenotypes.
Abstract: Discovering the genetic basis of a Mendelian phenotype establishes a causal link between genotype and phenotype, making possible carrier and population screening and direct diagnosis. Such discoveries also contribute to our knowledge of gene function, gene regulation, development, and biological mechanisms that can be used for developing new therapeutics. As of February 2015, 2,937 genes underlying 4,163 Mendelian phenotypes have been discovered, but the genes underlying ∼50% (i.e., 3,152) of all known Mendelian phenotypes are still unknown, and many more Mendelian conditions have yet to be recognized. This is a formidable gap in biomedical knowledge. Accordingly, in December 2011, the NIH established the Centers for Mendelian Genomics (CMGs) to provide the collaborative framework and infrastructure necessary for undertaking large-scale whole-exome sequencing and discovery of the genetic variants responsible for Mendelian phenotypes. In partnership with 529 investigators from 261 institutions in 36 countries, the CMGs assessed 18,863 samples from 8,838 families representing 579 known and 470 novel Mendelian phenotypes as of January 2015. This collaborative effort has identified 956 genes, including 375 not previously associated with human health, that underlie a Mendelian phenotype. These results provide insight into study design and analytical strategies, identify novel mechanisms of disease, and reveal the extensive clinical variability of Mendelian phenotypes. Discovering the gene underlying every Mendelian phenotype will require tackling challenges such as worldwide ascertainment and phenotypic characterization of families affected by Mendelian conditions, improvement in sequencing and analytical techniques, and pervasive sharing of phenotypic and genomic data among researchers, clinicians, and families.

579 citations


Journal ArticleDOI
TL;DR: The authors studied the genetic ancestry of 5,269 self-described African Americans, 8,663 Latinos, and 148,789 European Americans who are 23andMe customers and showed that the legacy of these historical interactions is visible in the genetic lineage of present-day Americans.
Abstract: Over the past 500 years, North America has been the site of ongoing mixing of Native Americans, European settlers, and Africans (brought largely by the trans-Atlantic slave trade), shaping the early history of what became the United States. We studied the genetic ancestry of 5,269 self-described African Americans, 8,663 Latinos, and 148,789 European Americans who are 23andMe customers and show that the legacy of these historical interactions is visible in the genetic ancestry of present-day Americans. We document pervasive mixed ancestry and asymmetrical male and female ancestry contributions in all groups studied. We show that regional ancestry differences reflect historical events, such as early Spanish colonization, waves of immigration from many regions of Europe, and forced relocation of Native Americans within the US. This study sheds light on the fine-scale differences in ancestry within and across the United States and informs our understanding of the relationship between racial and ethnic identities and genetic ancestry.

484 citations


Journal ArticleDOI
TL;DR: This study strongly suggests that analyzing multiple phenotypes can improve statistical power and that such analysis can be executed with the summary statistics from GWASs, and it provides a way to study a cross-phenotype (CP) association by using summary statistics from GWASs of multiple phenotypes.
Abstract: Genome-wide association studies (GWASs) have identified many genetic variants underlying complex traits. Many detected genetic loci harbor variants that associate with multiple, even distinct, traits. Most current analysis approaches focus on single traits, even though the final results from multiple traits are evaluated together. Such approaches miss the opportunity to systemically integrate the phenome-wide data available for genetic association analysis. In this study, we propose a general approach that can integrate association evidence from summary statistics of multiple traits, either correlated, independent, continuous, or binary traits, which might come from the same or different studies. We allow for trait heterogeneity effects. Population structure and cryptic relatedness can also be controlled. Our simulations suggest that the proposed method has improved statistical power over single-trait analysis in most of the cases we studied. We applied our method to the Continental Origins and Genetic Epidemiology Network (COGENT) African ancestry samples for three blood pressure traits and identified four loci (CHIC2, HOXA-EVX1, IGFBP1/IGFBP3, and CDH17; p −8 ) associated with hypertension-related traits that were missed by a single-trait analysis in the original report. Six additional loci with suggestive association evidence (p −7 ) were also observed, including CACNA1D and WNT3. Our study strongly suggests that analyzing multiple phenotypes can improve statistical power and that such analysis can be executed with the summary statistics from GWASs. Our method also provides a way to study a cross phenotype (CP) association by using summary statistics from GWASs of multiple phenotypes.

286 citations


Journal ArticleDOI
TL;DR: High activity of hedgehog signaling and the PI3K pathway in approximately 60% of 104 ESCC tumors indicates that therapies targeting these pathways might be particularly promising strategies for ESCC.
Abstract: Esophageal squamous cell carcinoma (ESCC) is one of the most common cancers worldwide and the fourth most lethal cancer in China. However, although genomic studies have identified some mutations associated with ESCC, we know little of the mutational processes responsible. To identify genome-wide mutational signatures, we performed either whole-genome sequencing (WGS) or whole-exome sequencing (WES) on 104 ESCC individuals and combined our data with those of 88 previously reported samples. An APOBEC-mediated mutational signature in 47% of 192 tumors suggests that APOBEC-catalyzed deamination provides a source of DNA damage in ESCC. Moreover, PIK3CA hotspot mutations (c.1624G>A [p.Glu542Lys] and c.1633G>A [p.Glu545Lys]) were enriched in APOBEC-signature tumors, and no smoking-associated signature was observed in ESCC. In the samples analyzed by WGS, we identified focal ( CBX4 and CBX8 . In our combined cohort, we identified frequent inactivating mutations in AJUBA, ZNF750, and PTCH1 and the chromatin-remodeling genes CREBBP and BAP1, in addition to known mutations. Functional analyses suggest roles for several genes (CBX4, CBX8, AJUBA, and ZNF750) in ESCC. Notably, high activity of hedgehog signaling and the PI3K pathway in approximately 60% of 104 ESCC tumors indicates that therapies targeting these pathways might be particularly promising strategies for ESCC. Collectively, our data provide comprehensive insights into the mutational signatures of ESCC and identify markers for early diagnosis and potential therapeutic targets.

269 citations


Journal ArticleDOI
TL;DR: The microbiome regression-based kernel association test (MiRKAT), which directly regresses the outcome on the microbiome profiles via the semi-parametric kernel machine regression framework, is proposed and applied to real microbiome datasets to show that microbial communities are associated with smoking and with fecal protease levels after confounders are controlled for.
Abstract: High-throughput sequencing technology has enabled population-based studies of the role of the human microbiome in disease etiology and exposure response. Distance-based analysis is a popular strategy for evaluating the overall association between microbiome diversity and outcome, wherein the phylogenetic distance between individuals' microbiome profiles is computed and tested for association via permutation. Despite their practical popularity, distance-based approaches suffer from important challenges, especially in selecting the best distance and extending the methods to alternative outcomes, such as survival outcomes. We propose the microbiome regression-based kernel association test (MiRKAT), which directly regresses the outcome on the microbiome profiles via the semi-parametric kernel machine regression framework. MiRKAT allows for easy covariate adjustment and extension to alternative outcomes while non-parametrically modeling the microbiome through a kernel that incorporates phylogenetic distance. It uses a variance-component score statistic to test for the association with analytical p value calculation. The model also allows simultaneous examination of multiple distances, alleviating the problem of choosing the best distance. Our simulations demonstrated that MiRKAT provides correctly controlled type I error and adequate power in detecting overall association. "Optimal" MiRKAT, which considers multiple candidate distances, is robust in that it suffers from little power loss in comparison to when the best distance is used and can achieve tremendous power gain in comparison to when a poor distance is chosen. Finally, we applied MiRKAT to real microbiome datasets to show that microbial communities are associated with smoking and with fecal protease levels after confounders are controlled for.
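
The step from a pairwise distance matrix to a kernel-based score statistic can be sketched as follows. This is a minimal illustration that assumes the kernel is obtained by double-centering the squared distances, K = -½ J D² J with J = I - 11ᵀ/n, and that the null model contains only an intercept. Covariate adjustment, variance scaling, any correction for negative eigenvalues, and the analytical p value calculation described in the abstract are all omitted, and the function names are hypothetical.

```python
def distance_to_kernel(dist):
    """Double-center squared distances: K = -0.5 * J * D^2 * J.

    `dist` is a symmetric n x n list of lists of pairwise distances.
    """
    n = len(dist)
    d2 = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in d2]
    grand_mean = sum(row_mean) / n
    # The matrix is symmetric, so column means equal row means.
    return [
        [-0.5 * (d2[i][j] - row_mean[i] - row_mean[j] + grand_mean)
         for j in range(n)]
        for i in range(n)
    ]


def score_statistic(outcome, kernel):
    """Unscaled variance-component score statistic Q = r' K r.

    Residuals r come from an intercept-only null model (an assumption made
    for brevity; a real analysis would regress out covariates first).
    """
    n = len(outcome)
    mean_y = sum(outcome) / n
    resid = [y - mean_y for y in outcome]
    return sum(resid[i] * kernel[i][j] * resid[j]
               for i in range(n) for j in range(n))


if __name__ == "__main__":
    # Three hypothetical samples placed on a line at 0, 1, and 3.
    positions = [0.0, 1.0, 3.0]
    dist = [[abs(a - b) for b in positions] for a in positions]
    kernel = distance_to_kernel(dist)
    print(score_statistic([1.0, 2.0, 3.0], kernel))
```

For Euclidean distances the resulting kernel equals the centered Gram matrix of the underlying coordinates, which is why a large Q indicates that samples with similar profiles have similar outcomes.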

239 citations


Journal ArticleDOI
TL;DR: This work presents a non-parametric method for accurately estimating recent effective population size by using inferred long segments of identity by descent (IBD), and finds that inferred segments of IBD contain information about effective population size from around 4 generations to around 50 generations ago for SNP array data and to over 200 generations ago for sequence data.
Abstract: Existing methods for estimating historical effective population size from genetic data have been unable to accurately estimate effective population size during the most recent past. We present a non-parametric method for accurately estimating recent effective population size by using inferred long segments of identity by descent (IBD). We found that inferred segments of IBD contain information about effective population size from around 4 generations to around 50 generations ago for SNP array data and to over 200 generations ago for sequence data. In human populations that we examined, the estimates of effective size were approximately one-third of the census size. We estimate the effective population size of European-ancestry individuals in the UK four generations ago to be eight million and the effective population size of Finland four generations ago to be 0.7 million. Our method is implemented in the open-source IBDNe software package.

238 citations


Journal ArticleDOI
TL;DR: This report contends that genetic effects might be biased as a result of adjustment for body mass index, and illustrates this point by providing examples from published genome-wide association studies, including large meta-analysis of waist-to-hip ratio and waist circumference adjusted for body mass index.
Abstract: In recent years, a number of large-scale genome-wide association studies have been published for human traits adjusted for other correlated traits with a genetic basis. In most studies, the motivation for such an adjustment is to discover genetic variants associated with the primary outcome independently of the correlated trait. In this report, we contend that this objective is fulfilled when the tested variants have no effect on the covariate or when the correlation between the covariate and the outcome is fully explained by a direct effect of the covariate on the outcome. For all other scenarios, an unintended bias is introduced with respect to the primary outcome as a result of the adjustment, and this bias might lead to false positives. Here, we illustrate this point by providing examples from published genome-wide association studies, including large meta-analysis of waist-to-hip ratio and waist circumference adjusted for body mass index (BMI), where genetic effects might be biased as a result of adjustment for body mass index. Using both theory and simulations, we explore this phenomenon in detail and discuss the ramifications for future genome-wide association studies of correlated traits and diseases.
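
The two conditions under which adjustment is harmless, and the bias that arises otherwise, can be checked with population-level regression algebra. The sketch below is an assumption-laden toy model rather than the authors' derivation: a variant G with unit variance, a covariate C, an outcome Y, unit-variance noise terms, and hypothetical effect sizes `a` and `b`. It solves the two-predictor least-squares equations exactly from covariances, so no simulation noise is involved.

```python
def adjusted_coefficients(var_g, var_c, cov_gc, cov_gy, cov_cy):
    """Coefficients of G and C when Y is regressed on both (2x2 normal equations)."""
    det = var_g * var_c - cov_gc ** 2
    beta_g = (var_c * cov_gy - cov_gc * cov_cy) / det
    beta_c = (var_g * cov_cy - cov_gc * cov_gy) / det
    return beta_g, beta_c


def shared_factor_model(a):
    """C = a*G + U + e1 and Y = U + e2: G has no effect on Y at all.

    C and Y are correlated only through the shared factor U, so adjusting
    for C induces a spurious G effect whenever a != 0.
    """
    return adjusted_coefficients(
        var_g=1.0, var_c=a * a + 2.0, cov_gc=a, cov_gy=0.0, cov_cy=1.0)


def direct_effect_model(a, b):
    """C = a*G + e1 and Y = b*C + e2: the C-Y correlation is a direct effect.

    Adjusting for C then correctly reports no independent effect of G.
    """
    var_c = a * a + 1.0
    return adjusted_coefficients(
        var_g=1.0, var_c=var_c, cov_gc=a, cov_gy=b * a, cov_cy=b * var_c)


if __name__ == "__main__":
    print(shared_factor_model(a=0.3))     # biased: G coefficient is -a/2
    print(shared_factor_model(a=0.0))     # no effect on covariate: unbiased
    print(direct_effect_model(a=0.3, b=0.5))  # direct effect: unbiased
```

In the shared-factor case the adjusted coefficient of G works out to -a/2 even though G has no effect on Y, which is the kind of false positive the report warns about.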

237 citations


Journal ArticleDOI
TL;DR: A meta-analysis of genome-wide association studies to identify additional VTE susceptibility genes uncovered unexpected actors of VTE etiology and paves the way for novel mechanistic concepts of VTE pathophysiology.
Abstract: Venous thromboembolism (VTE), the third leading cause of cardiovascular mortality, is a complex thrombotic disorder with environmental and genetic determinants. Although several genetic variants have been found associated with VTE, they explain a minor proportion of VTE risk in cases. We undertook a meta-analysis of genome-wide association studies (GWASs) to identify additional VTE susceptibility genes. Twelve GWASs totaling 7,507 VTE case subjects and 52,632 control subjects formed our discovery stage where 6,751,884 SNPs were tested for association with VTE. Nine loci reached the genome-wide significance level of 5 × 10⁻⁸ including six already known to associate with VTE (ABO, F2, F5, F11, FGG, and PROCR) and three unsuspected loci. SNPs mapping to these latter were selected for replication in three independent case-control studies totaling 3,009 VTE-affected individuals and 2,586 control subjects. This strategy led to the identification and replication of two VTE-associated loci, TSPAN15 and SLC44A2, with lead risk alleles associated with odds ratio for disease of 1.31 (p = 1.67 × 10⁻¹⁶) and 1.21 (p = 2.75 × 10⁻¹⁵), respectively. The lead SNP at the TSPAN15 locus is the intronic rs78707713 and the lead SLC44A2 SNP is the non-synonymous rs2288904 previously shown to associate with transfusion-related acute lung injury. We further showed that these two variants did not associate with known hemostatic plasma markers. TSPAN15 and SLC44A2 do not belong to conventional pathways for thrombosis and have not been associated to other cardiovascular diseases nor related quantitative biomarkers. Our findings uncovered unexpected actors of VTE etiology and pave the way for novel mechanistic concepts of VTE pathophysiology.

236 citations


Journal ArticleDOI
TL;DR: A genome-wide association study is carried out to discern differences in genetic risk factors for PsA and cutaneous-only psoriasis (PsC), and multiple independent susceptibility variants are found in the IL12B, NOS2, and IFIH1 regions.
Abstract: Psoriasis vulgaris (PsV) is a common inflammatory and hyperproliferative skin disease. Up to 30% of people with PsV eventually develop psoriatic arthritis (PsA), an inflammatory musculoskeletal condition. To discern differences in genetic risk factors for PsA and cutaneous-only psoriasis (PsC), we carried out a genome-wide association study (GWAS) of 1,430 PsA case subjects and 1,417 unaffected control subjects. Meta-analysis of this study with three other GWASs and two targeted genotyping studies, encompassing a total of 9,293 PsV case subjects, 3,061 PsA case subjects, 3,110 PsC case subjects, and 13,670 unaffected control subjects of European descent, detected 10 regions associated with PsA and 11 with PsC at genome-wide (GW) significance. Several of these association signals (IFNLR1, IFIH1, NFKBIA for PsA; TNFRSF9, LCE3C/B, TRAF3IP2, IL23A, NFKBIA for PsC) have not previously achieved GW significance. After replication, we also identified a PsV-associated SNP near CDKAL1 (rs4712528, odds ratio [OR] = 1.16, p = 8.4 × 10⁻¹¹). Among identified psoriasis risk variants, three were more strongly associated with PsC than PsA (rs12189871 near HLA-C, p = 5.0 × 10⁻¹⁹; rs4908742 near TNFRSF9, p = 0.00020; rs10888503 near LCE3A, p = 0.0014), and two were more strongly associated with PsA than PsC (rs12044149 near IL23R, p = 0.00018; rs9321623 near TNFAIP3, p = 0.00022). The PsA-specific variants were independent of previously identified psoriasis variants near IL23R and TNFAIP3. We also found multiple independent susceptibility variants in the IL12B, NOS2, and IFIH1 regions. These results provide insights into the pathogenetic similarities and differences between PsC and PsA.

Journal ArticleDOI
TL;DR: Variants in SLC39A8 impair the function of manganese-dependent enzymes, most notably β-1,4-galactosyltransferase, a Golgi enzyme essential for biosynthesis of the carbohydrate part of glycoproteins; the resulting impaired galactosylation leads to a severe disorder with deformed skull, severe seizures, short limbs, profound psychomotor retardation, and hearing loss.
Abstract: SLC39A8 is a membrane transporter responsible for manganese uptake into the cell. Via whole-exome sequencing, we studied a child that presented with cranial asymmetry, severe infantile spasms with hypsarrhythmia, and dysproportionate dwarfism. Analysis of transferrin glycosylation revealed severe dysglycosylation corresponding to a type II congenital disorder of glycosylation (CDG) and the blood manganese levels were below the detection limit. The variants c.112G>C (p.Gly38Arg) and c.1019T>A (p.Ile340Asn) were identified in SLC39A8. A second individual with the variants c.97G>A (p.Val33Met) and c.1004G>C (p.Ser335Thr) on the paternal allele and c.610G>T (p.Gly204Cys) on the maternal allele was identified among a group of unresolved case subjects with CDG. These data demonstrate that variants in SLC39A8 impair the function of manganese-dependent enzymes, most notably β-1,4-galactosyltransferase, a Golgi enzyme essential for biosynthesis of the carbohydrate part of glycoproteins. Impaired galactosylation leads to a severe disorder with deformed skull, severe seizures, short limbs, profound psychomotor retardation, and hearing loss. Oral galactose supplementation is a treatment option and results in complete normalization of glycosylation. SLC39A8 deficiency links a trace element deficiency with inherited glycosylation disorders.

Journal ArticleDOI
TL;DR: Significant genotype-phenotype correlations in lesion localization and histology are observed between individuals with mutations in PIK3CA versus TEK, pointing to gene-specific effects.
Abstract: Somatic mutations in TEK, the gene encoding endothelial cell tyrosine kinase receptor TIE2, cause more than half of sporadically occurring unifocal venous malformations (VMs). Here, we report that somatic mutations in PIK3CA, the gene encoding the catalytic p110α subunit of PI3K, cause 54% (27 out of 50) of VMs with no detected TEK mutation. The hotspot mutations c.1624G>A, c.1633G>A, and c.3140A>G (p.Glu542Lys, p.Glu545Lys, and p.His1047Arg), frequent in PIK3CA-associated cancers, overgrowth syndromes, and lymphatic malformation (LM), account for >92% of individuals who carry mutations. Like VM-causative mutations in TEK, the PIK3CA mutations cause chronic activation of AKT, dysregulation of certain important angiogenic factors, and abnormal endothelial cell morphology when expressed in human umbilical vein endothelial cells (HUVECs). The p110α-specific inhibitor BYL719 restores all abnormal phenotypes tested, in PIK3CA- as well as TEK-mutant HUVECs, demonstrating that they operate via the same pathogenic pathways. Nevertheless, significant genotype-phenotype correlations in lesion localization and histology are observed between individuals with mutations in PIK3CA versus TEK, pointing to gene-specific effects.

Journal ArticleDOI
TL;DR: Results show that an important fraction of de novo mutations presumed to be germline in fact occurred either post-zygotically in the offspring or were inherited as a consequence of low-level mosaicism in one of the parents.
Abstract: De novo mutations are recognized both as an important source of genetic variation and as a prominent cause of sporadic disease in humans. Mutations identified as de novo are generally assumed to have occurred during gametogenesis and, consequently, to be present as germline events in an individual. Because Sanger sequencing does not provide the sensitivity to reliably distinguish somatic from germline mutations, the proportion of de novo mutations that occur somatically rather than in the germline remains largely unknown. To determine the contribution of post-zygotic events to de novo mutations, we analyzed a set of 107 de novo mutations in 50 parent-offspring trios. Using four different sequencing techniques, we found that 7 (6.5%) of these presumed germline de novo mutations were in fact present as mosaic mutations in the blood of the offspring and were therefore likely to have occurred post-zygotically. Furthermore, genome-wide analysis of "de novo" variants in the proband led to the identification of 4/4,081 variants that were also detectable in the blood of one of the parents, implying parental mosaicism as the origin of these variants. Thus, our results show that an important fraction of de novo mutations presumed to be germline in fact occurred either post-zygotically in the offspring or were inherited as a consequence of low-level mosaicism in one of the parents.

Journal ArticleDOI
Lot Snijders Blok, Erik C. Madsen, Jane Juusola, Christian Gilissen, Diana Baralle, Margot R.F. Reijnders, Hanka Venselaar, Céline Helsmoortel, Megan T. Cho, Alexander Hoischen, Lisenka E.L.M. Vissers, Tom S. Koemans, Willemijn M. Wissink-Lindhout, Evan E. Eichler, Corrado Romano, Hilde Van Esch, Connie T.R.M. Stumpel, Maaike Vreeburg, E. Smeets, Karin Oberndorff, Bregje W.M. van Bon, Marie Shaw, Jozef Gecz, Eric Haan, M Bienek, C Jensen, Bart Loeys, Anke Van Dijck, A. Micheil Innes, Hilary Racher, Sascha Vermeer, Nataliya Di Donato, Andreas Rump, Katrina Tatton-Brown, Michael Parker, Alex Henderson, Sally Ann Lynch, Alan Fryer, Alison Ross, Pradeep Vasudevan, Usha Kini, Ruth Newbury-Ecob, Kate Chandler, Alison Male, Sybe Dijkstra, Jolanda H. Schieving, Jacques C. Giltay, Koen L.I. van Gassen, Janneke H M Schuurs-Hoeijmakers, Perciliz L. Tan, Igor Pediaditakis, Stefan A. Haas, Kyle Retterer, Patrick Reed, Kristin G. Monaghan, Eden Haverfield, Marvin R. Natowicz, Angela Myers, Michael C. Kruer, Quinn Stein, Kevin A. Strauss, Karlla W. Brigatti, Katherine G. Keating, Barbara K. Burton, Katherine H. Kim, Joel Charrow, Jennifer Norman, Audrey Foster-Barber, Antonie D. Kline, Amy S. Kimball, Elaine H. Zackai, Margaret H. Harr, Joyce Fox, Julie McLaughlin, Kristin Lindstrom, Katrina Haude, Kees E. P. van Roozendaal, Han G. Brunner, Wendy K. Chung, R. Frank Kooy, Rolph Pfundt, Vera M. Kalscheuer, Sarju G. Mehta, Nicholas Katsanis, Tjitske Kleefstra
TL;DR: A consistent loss-of-function effect of all tested de novo mutations on the Wnt pathway is demonstrated, and a differential effect by gender is shown, possibly reflecting a dose-dependent effect of DDX3X expression in the context of functional mosaic females versus one-copy males, which reflects the complex biological nature of DDX3X mutations.
Abstract: Intellectual disability (ID) affects approximately 1%–3% of humans with a gender bias toward males. Previous studies have identified mutations in more than 100 genes on the X chromosome in males with ID, but there is less evidence for de novo mutations on the X chromosome causing ID in females. In this study we present 35 unique deleterious de novo mutations in DDX3X identified by whole exome sequencing in 38 females with ID and various other features including hypotonia, movement disorders, behavior problems, corpus callosum hypoplasia, and epilepsy. Based on our findings, mutations in DDX3X are one of the more common causes of ID, accounting for 1%–3% of unexplained ID in females. Although no de novo DDX3X mutations were identified in males, we present three families with segregating missense mutations in DDX3X, suggestive of an X-linked recessive inheritance pattern. In these families, all males with the DDX3X variant had ID, whereas carrier females were unaffected. To explore the pathogenic mechanisms accounting for the differences in disease transmission and phenotype between affected females and affected males with DDX3X missense variants, we used canonical Wnt defects in zebrafish as a surrogate measure of DDX3X function in vivo. We demonstrate a consistent loss-of-function effect of all tested de novo mutations on the Wnt pathway, and we further show a differential effect by gender. The differential activity possibly reflects a dose-dependent effect of DDX3X expression in the context of functional mosaic females versus one-copy males, which reflects the complex biological nature of DDX3X mutations.

Journal ArticleDOI
TL;DR: It is suggested that dominance variation at common SNPs explains only a small fraction of phenotypic variation for human complex traits and contributes little to the missing narrow-sense heritability problem.
Abstract: For human complex traits, non-additive genetic variation has been invoked to explain "missing heritability," but its discovery is often neglected in genome-wide association studies. Here we propose a method of using SNP data to partition and estimate the proportion of phenotypic variance attributed to additive and dominance genetic variation at all SNPs (h²_SNP and δ²_SNP) in unrelated individuals based on an orthogonal model where the estimate of h²_SNP is independent of that of δ²_SNP. With this method, we analyzed 79 quantitative traits in 6,715 unrelated European Americans. The estimate of δ²_SNP averaged across all the 79 quantitative traits was 0.03, approximately a fifth of that for additive variation (average h²_SNP = 0.15). There were a few traits that showed substantial estimates of δ²_SNP, none of which were replicated in a larger sample of 11,965 individuals. We further performed genome-wide association analyses of the 79 quantitative traits and detected SNPs with genome-wide significant dominance effects only at the ABO locus for factor VIII and von Willebrand factor. All these results suggest that dominance variation at common SNPs explains only a small fraction of phenotypic variation for human complex traits and contributes little to the missing narrow-sense heritability problem.
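
What an orthogonal additive/dominance model means can be shown with one common genotype coding. The sketch below assumes Hardy-Weinberg proportions, an additive code of 0, 1, 2 copies of the counted allele, and a dominance code of 0, 2p, 4p - 2 for the same genotypes. The abstract does not spell out its coding, so this choice and the function names are assumptions for illustration. Under these assumptions the two codes have zero covariance at any allele frequency p, which is what lets the two variance components be estimated independently.

```python
def hwe_frequencies(p):
    """Genotype frequencies for 0, 1, 2 copies of an allele with frequency p."""
    q = 1.0 - p
    return [q * q, 2.0 * p * q, p * p]


def genotype_codings(p):
    """Additive and dominance codes per genotype (assumed coding, see lead-in)."""
    additive = [0.0, 1.0, 2.0]
    dominance = [0.0, 2.0 * p, 4.0 * p - 2.0]
    return additive, dominance


def weighted_covariance(x, y, weights):
    """Population covariance of x and y under the given genotype frequencies."""
    mean_x = sum(w * xi for w, xi in zip(weights, x))
    mean_y = sum(w * yi for w, yi in zip(weights, y))
    return sum(w * (xi - mean_x) * (yi - mean_y)
               for w, xi, yi in zip(weights, x, y))


def additive_dominance_covariance(p):
    """Covariance between the two codes at allele frequency p (expected: 0)."""
    additive, dominance = genotype_codings(p)
    return weighted_covariance(additive, dominance, hwe_frequencies(p))


if __name__ == "__main__":
    for p in (0.1, 0.3, 0.5):
        print(p, additive_dominance_covariance(p))
```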

Journal ArticleDOI
TL;DR: A concept-recognition procedure is developed that analyzes the frequencies of HPO disease annotations as identified in over five million PubMed abstracts by employing an iterative procedure to optimize precision and recall of the identified terms.
Abstract: The Human Phenotype Ontology (HPO) is widely used in the rare disease community for differential diagnostics, phenotype-driven analysis of next-generation sequence-variation data, and translational research, but a comparable resource has not been available for common disease. Here, we have developed a concept-recognition procedure that analyzes the frequencies of HPO disease annotations as identified in over five million PubMed abstracts by employing an iterative procedure to optimize precision and recall of the identified terms. We derived disease models for 3,145 common human diseases comprising a total of 132,006 HPO annotations. The HPO now comprises over 250,000 phenotypic annotations for over 10,000 rare and common diseases and can be used for examining the phenotypic overlap among common diseases that share risk alleles, as well as between Mendelian diseases and common diseases linked by genomic location. The annotations, as well as the HPO itself, are freely available.

Journal ArticleDOI
TL;DR: A fast analytic method is proposed that uses polygenic scores, based on the formula for the non-centrality parameter of the association test of the score; it provides nearly unbiased estimates and confidence intervals with good coverage, although estimation of the variance is less reliable when jointly estimated with the covariance.
Abstract: Several methods have been proposed to estimate the variance in disease liability explained by large sets of genetic markers. However, current methods do not scale up well to large sample sizes. Linear mixed models require solving high-dimensional matrix equations, and methods that use polygenic scores are very computationally intensive. Here we propose a fast analytic method that uses polygenic scores, based on the formula for the non-centrality parameter of the association test of the score. We estimate model parameters from the results of multiple polygenic score tests based on markers with p values in different intervals. We estimate parameters by maximum likelihood and use profile likelihood to compute confidence intervals. We compare various options for constructing polygenic scores, based on nested or disjoint intervals of p values, weighted or unweighted effect sizes, and different numbers of intervals, in estimating the variance explained by a set of markers, the proportion of markers with effects, and the genetic covariance between a pair of traits. Our method provides nearly unbiased estimates and confidence intervals with good coverage, although estimation of the variance is less reliable when jointly estimated with the covariance. We find that disjoint p value intervals perform better than nested intervals, but the weighting did not affect our results. A particular advantage of our method is that it can be applied to summary statistics from single markers, and so can be quickly applied to large consortium datasets. Our method, named AVENGEME (Additive Variance Explained and Number of Genetic Effects Method of Estimation), is implemented in R software.

Journal ArticleDOI
TL;DR: In a Dutch-Australian CVID-affected family, a heterozygous NFKB1 splice-donor-site mutation causing in-frame skipping of exon 8 was identified; together with findings in two further families, this indicates that the CVID phenotype in these families is caused by NF-κB1 p50 haploinsufficiency.
Abstract: Common variable immunodeficiency (CVID), characterized by recurrent infections, is the most prevalent symptomatic antibody deficiency. In ∼90% of CVID-affected individuals, no genetic cause of the disease has been identified. In a Dutch-Australian CVID-affected family, we identified a NFKB1 heterozygous splice-donor-site mutation (c.730+4A>G), causing in-frame skipping of exon 8. NFKB1 encodes the transcription-factor precursor p105, which is processed to p50 (canonical NF-κB pathway). The altered protein bearing an internal deletion (p.Asp191_Lys244delinsGlu; p105ΔEx8) is degraded, but is not processed to p50ΔEx8. Altered NF-κB1 proteins were also undetectable in a German CVID-affected family with a heterozygous in-frame exon 9 skipping mutation (c.835+2T>G) and in a CVID-affected family from New Zealand with a heterozygous frameshift mutation (c.465dupA) in exon 7. Given that residual p105 and p50—translated from the non-mutated alleles—were normal, and altered p50 proteins were absent, we conclude that the CVID phenotype in these families is caused by NF-κB1 p50 haploinsufficiency.

Journal ArticleDOI
TL;DR: Methods that integrate the strength of association between genotype and phenotype, the variability in the genetic backgrounds across populations, and the genomic map of tissue-specific functional elements to increase trans-ethnic fine-mapping accuracy are introduced.
Abstract: Localization of causal variants underlying known risk loci is one of the main research challenges following genome-wide association studies. Risk loci are typically dissected through fine-mapping experiments in trans-ethnic cohorts for leveraging the variability in the local genetic structure across populations. More recent works have shown that genomic functional annotations (i.e., localization of tissue-specific regulatory marks) can be integrated for increasing fine-mapping performance within single-population studies. Here, we introduce methods that integrate the strength of association between genotype and phenotype, the variability in the genetic backgrounds across populations, and the genomic map of tissue-specific functional elements to increase trans-ethnic fine-mapping accuracy. Through extensive simulations and empirical data, we have demonstrated that our approach increases fine-mapping resolution over existing methods. We analyzed empirical data from a large-scale trans-ethnic rheumatoid arthritis (RA) study and showed that the functional genetic architecture of RA is consistent across European and Asian ancestries. In these data, we used our proposed methods to reduce the average size of the 90% credible set from 29 variants per locus for standard non-integrative approaches to 22 variants.
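The 90% credible set mentioned above is commonly built by ranking variants by their posterior probability of being causal and keeping the smallest top-ranked group whose probabilities sum to at least 0.9. The sketch below shows only that final step, under the simplifying assumption of a single causal variant per locus with posteriors that sum to 1. It does not reproduce the paper's integration of trans-ethnic and functional data, and the function name and variant identifiers are hypothetical.

```python
def credible_set(posteriors, level=0.9):
    """Return the smallest top-ranked set of variants whose summed
    posterior probability reaches `level`.

    posteriors: dict mapping variant ID -> posterior probability of being causal
    """
    ranked = sorted(posteriors.items(), key=lambda item: item[1], reverse=True)
    chosen = []
    total = 0.0
    for variant, probability in ranked:
        chosen.append(variant)
        total += probability
        if total >= level:
            break
    return chosen


# Hypothetical locus with four variants.
example = {"rs1": 0.5, "rs2": 0.3, "rs3": 0.15, "rs4": 0.05}
print(credible_set(example))
```

A method that concentrates posterior probability on fewer variants yields smaller credible sets, which is the sense in which the abstract reports a reduction from 29 to 22 variants per locus.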

Journal ArticleDOI
TL;DR: A multivariate linear mixed model with multi-trait genomic best linear unbiased prediction is shown to increase the accuracy of genetic risk prediction for schizophrenia, bipolar disorder, and major depressive disorder in the discovery as well as in independent validation datasets.
Abstract: Genetic risk prediction has several potential applications in medical research and clinical practice and could be used, for example, to stratify a heterogeneous population of patients by their predicted genetic risk. However, for polygenic traits, such as psychiatric disorders, the accuracy of risk prediction is low. Here we use a multivariate linear mixed model and apply multi-trait genomic best linear unbiased prediction for genetic risk prediction. This method exploits correlations between disorders and simultaneously evaluates individual risk for each disorder. We show that the multivariate approach significantly increases the prediction accuracy for schizophrenia, bipolar disorder, and major depressive disorder in the discovery as well as in independent validation datasets. By grouping SNPs based on genome annotation and fitting multiple random effects, we show that the prediction accuracy could be further improved. The gain in prediction accuracy of the multivariate approach is equivalent to an increase in sample size of 34% for schizophrenia, 68% for bipolar disorder, and 76% for major depressive disorders using single trait models. Because our approach can be readily applied to any number of GWAS datasets of correlated traits, it is a flexible and powerful tool to maximize prediction accuracy. With current sample size, risk predictors are not useful in a clinical setting but already are a valuable research tool, for example in experimental designs comparing cases with high and low polygenic risk.

Journal ArticleDOI
TL;DR: Although few of the genes involved could have been predicted based on expression patterns alone, it is argued that they fit into a limited number of functional modules active at different stages of cranial suture development, which provides a useful approach both when defining the potential role of new candidate genes in craniosynostosis and for devising pharmacological approaches to therapy.
Abstract: Craniosynostosis, the premature fusion of one or more cranial sutures of the skull, provides a paradigm for investigating the interplay of genetic and environmental factors leading to malformation. Over the past 20 years molecular genetic techniques have provided a new approach to dissect the underlying causes; success has mostly come from investigation of clinical samples, and recent advances in high-throughput DNA sequencing have dramatically enhanced the study of the human as the preferred "model organism." In parallel, however, we need a pathogenetic classification to describe the pathways and processes that lead to cranial suture fusion. Given the prenatal onset of most craniosynostosis, investigation of mechanisms requires more conventional model organisms; principally the mouse, because of similarities in cranial suture development. We present a framework for classifying genetic causes of craniosynostosis based on current understanding of cranial suture biology and molecular and developmental pathogenesis. Of note, few pathologies result from complete loss of gene function. Instead, biochemical mechanisms involving haploinsufficiency, dominant gain-of-function and recessive hypomorphic mutations, and an unusual X-linked cellular interference process have all been implicated. Although few of the genes involved could have been predicted based on expression patterns alone (because the genes play much wider roles in embryonic development or cellular homeostasis), we argue that they fit into a limited number of functional modules active at different stages of cranial suture development. This provides a useful approach both when defining the potential role of new candidate genes in craniosynostosis and, potentially, for devising pharmacological approaches to therapy.

Journal ArticleDOI
TL;DR: The results show that beacons can disclose membership and implied phenotypic information about participants and do not protect privacy a priori; the authors discuss risk mitigation through policies and standards such as not allowing anonymous pings of genetic beacons and requiring minimum beacon sizes.
Abstract: The human genetics community needs robust protocols that enable secure sharing of genomic data from participants in genetic research. Beacons are web servers that answer allele-presence queries—such as “Do you have a genome that has a specific nucleotide (e.g., A) at a specific genomic position (e.g., position 11,272 on chromosome 1)?”—with either “yes” or “no.” Here, we show that individuals in a beacon are susceptible to re-identification even if the only data shared include presence or absence information about alleles in a beacon. Specifically, we propose a likelihood-ratio test of whether a given individual is present in a given genetic beacon. Our test is not dependent on allele frequencies and is the most powerful test for a specified false-positive rate. Through simulations, we showed that in a beacon with 1,000 individuals, re-identification is possible with just 5,000 queries. Relatives can also be identified in the beacon. Re-identification is possible even in the presence of sequencing errors and variant-calling differences. In a beacon constructed with 65 European individuals from the 1000 Genomes Project, we demonstrated that it is possible to detect membership in the beacon with just 250 SNPs. With just 1,000 SNP queries, we were able to detect the presence of an individual genome from the Personal Genome Project in an existing beacon. Our results show that beacons can disclose membership and implied phenotypic information about participants and do not protect privacy a priori. We discuss risk mitigation through policies and standards such as not allowing anonymous pings of genetic beacons and requiring minimum beacon sizes.
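To make the allele-presence query concrete, the following toy sketch models a beacon as a yes/no lookup over the pooled alleles of its members. It is not the likelihood-ratio test proposed in the paper. It only illustrates the intuition behind membership leakage: barring sequencing errors, every allele carried by a member returns "yes", whereas a non-member's alleles may return "no". The class `Beacon`, the helper `yes_fraction`, and all genomic coordinates other than the abstract's example position are hypothetical.

```python
class Beacon:
    """Toy beacon answering allele-presence queries with True or False."""

    def __init__(self, genomes):
        # genomes: iterable of sets of (chromosome, position, allele) tuples
        self._alleles = set()
        for genome in genomes:
            self._alleles.update(genome)

    def query(self, chromosome, position, allele):
        # "Do you have a genome with this allele at this position?"
        return (chromosome, position, allele) in self._alleles


def yes_fraction(beacon, genome):
    """Fraction of a genome's alleles for which the beacon answers yes."""
    answers = [beacon.query(*allele) for allele in genome]
    return sum(answers) / len(answers)


member = {("1", 11272, "A"), ("2", 500, "G")}
outsider = {("1", 11272, "A"), ("3", 42, "T")}
beacon = Beacon([member, {("5", 7, "C")}])
print(yes_fraction(beacon, member), yes_fraction(beacon, outsider))
```

In this toy setting a member always scores 1.0, so a lower fraction rules membership out; the paper's test formalizes how many queries are needed to decide membership at a specified false-positive rate.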

Journal ArticleDOI
TL;DR: A gain-of-function IFIH1 mutation is identified as causing Singleton-Merten syndrome and leading to early arterial calcification and dental inflammation and resorption.
Abstract: Singleton-Merten syndrome (SMS) is an infrequently described autosomal-dominant disorder characterized by early and extreme aortic and valvular calcification, dental anomalies (early-onset periodontitis and root resorption), osteopenia, and acro-osteolysis. To determine the molecular etiology of this disease, we performed whole-exome sequencing and targeted Sanger sequencing. We identified a common missense mutation, c.2465G>A (p.Arg822Gln), in interferon induced with helicase C domain 1 (IFIH1, encoding melanoma differentiation-associated protein 5 [MDA5]) in four SMS subjects from two families and a simplex case. IFIH1 has been linked to a number of autoimmune disorders, including Aicardi-Goutieres syndrome. Immunohistochemistry demonstrated the localization of MDA5 in all affected target tissues. In vitro functional analysis revealed that the IFIH1 c.2465G>A mutation enhanced MDA5 function in interferon beta induction. Interferon signature genes were upregulated in SMS individuals’ blood and dental cells. Our data identify a gain-of-function IFIH1 mutation as causing SMS and leading to early arterial calcification and dental inflammation and resorption.

Journal ArticleDOI
TL;DR: It is demonstrated that DDX58 mutations cause atypical SMS manifesting with variable expression of glaucoma, aortic calcification, and skeletal abnormalities without dental anomalies.
Abstract: Singleton-Merten syndrome (SMS) is an autosomal-dominant multi-system disorder characterized by dental dysplasia, aortic calcification, skeletal abnormalities, glaucoma, psoriasis, and other conditions. Despite an apparent autosomal-dominant pattern of inheritance, the genetic background of SMS and information about its phenotypic heterogeneity remain unknown. Recently, we found a family affected by glaucoma, aortic calcification, and skeletal abnormalities. Unlike subjects with classic SMS, affected individuals showed normal dentition, suggesting atypical SMS. To identify genetic causes of the disease, we performed exome sequencing in this family and identified a variant (c.1118A>C [p.Glu373Ala]) of DDX58, whose protein product is also known as RIG-I. Further analysis of DDX58 in 100 individuals with congenital glaucoma identified another variant (c.803G>T [p.Cys268Phe]) in a family who harbored neither dental anomalies nor aortic calcification but who suffered from glaucoma and skeletal abnormalities. Cys268 and Glu373 residues of DDX58 belong to ATP-binding motifs I and II, respectively, and these residues are predicted to be located closer to the ADP and RNA molecules than other nonpathogenic missense variants by protein structure analysis. Functional assays revealed that DDX58 alterations confer constitutive activation and thus lead to increased interferon (IFN) activity and IFN-stimulated gene expression. In addition, when we transduced primary human trabecular meshwork cells with c.803G>T (p.Cys268Phe) and c.1118A>C (p.Glu373Ala) mutants, cytopathic effects and a significant decrease in cell number were observed. Taken together, our results demonstrate that DDX58 mutations cause atypical SMS manifesting with variable expression of glaucoma, aortic calcification, and skeletal abnormalities without dental anomalies.

Journal ArticleDOI
TL;DR: The results show that atopic dermatitis and psoriasis have distinct genetic mechanisms with opposing effects in shared pathways influencing epidermal differentiation and immune response.
Abstract: Atopic dermatitis and psoriasis are the two most common immune-mediated inflammatory disorders affecting the skin. Genome-wide studies demonstrate a high degree of genetic overlap, but these diseases have mutually exclusive clinical phenotypes and opposing immune mechanisms. Despite their prevalence, atopic dermatitis and psoriasis very rarely co-occur within one individual. By utilizing genome-wide association study and ImmunoChip data from >19,000 individuals and methodologies developed from meta-analysis, we have identified opposing risk alleles at shared loci as well as independent disease-specific loci within the epidermal differentiation complex (chromosome 1q21.3), the Th2 locus control region (chromosome 5q31.1), and the major histocompatibility complex (chromosome 6p21-22). We further identified previously unreported pleiotropic alleles with opposing effects on atopic dermatitis and psoriasis risk in PRKRA and ANXA6/TNIP1. In contrast, there was no evidence for shared loci with effects operating in the same direction on both diseases. Our results show that atopic dermatitis and psoriasis have distinct genetic mechanisms with opposing effects in shared pathways influencing epidermal differentiation and immune response. The statistical analysis methods developed in the conduct of this study have produced additional insight from previously published data sets. The approach is likely to be applicable to the investigation of the genetic basis of other complex traits with overlapping and distinct clinical features.

Journal ArticleDOI
TL;DR: Both the haplotype and MSMC analyses suggest a predominant northern route out of Africa via Egypt, pointing to Egypt as the more likely gateway in the exodus to the rest of the world.
Abstract: The predominantly African origin of all modern human populations is well established, but the route taken out of Africa is still unclear. Two alternative routes, via Egypt and Sinai or across the Bab el Mandeb strait into Arabia, have traditionally been proposed as feasible gateways in light of geographic, paleoclimatic, archaeological, and genetic evidence. Distinguishing among these alternatives has been difficult. We generated 225 whole-genome sequences (225 at 8× depth, of which 8 were increased to 30×; Illumina HiSeq 2000) from six modern Northeast African populations (100 Egyptians and five Ethiopian populations each represented by 25 individuals). West Eurasian components were masked out, and the remaining African haplotypes were compared with a panel of sub-Saharan African and non-African genomes. We showed that masked Northeast African haplotypes overall were more similar to non-African haplotypes and more frequently present outside Africa than were any sets of haplotypes derived from a West African population. Furthermore, the masked Egyptian haplotypes showed these properties more markedly than the masked Ethiopian haplotypes, pointing to Egypt as the more likely gateway in the exodus to the rest of the world. Using five Ethiopian and three Egyptian high-coverage masked genomes and the multiple sequentially Markovian coalescent (MSMC) approach, we estimated the genetic split times of Egyptians and Ethiopians from non-African populations at 55,000 and 65,000 years ago, respectively, whereas that of West Africans was estimated to be 75,000 years ago. Both the haplotype and MSMC analyses thus suggest a predominant northern route out of Africa via Egypt.

Journal ArticleDOI
TL;DR: An autosomal-recessive disorder in six individuals from the Hutterite community and in an unrelated Egyptian sibpair is described, providing insight into the roles of Mn and Zn homeostasis in human health and development.
Abstract: Manganese (Mn) and zinc (Zn) are essential divalent cations used by cells as protein cofactors; various human studies and animal models have demonstrated the importance of Mn and Zn for development. Here we describe an autosomal-recessive disorder in six individuals from the Hutterite community and in an unrelated Egyptian sibpair; the disorder is characterized by intellectual disability, developmental delay, hypotonia, strabismus, cerebellar atrophy, and variable short stature. Exome sequencing in one affected Hutterite individual and the Egyptian family identified the same homozygous variant, c.112G>C (p.Gly38Arg), affecting a conserved residue of SLC39A8. The affected Hutterite and Egyptian individuals did not share an extended common haplotype, suggesting that the mutation arose independently. SLC39A8 is a member of the solute carrier gene family known to import Mn, Zn, and other divalent cations across the plasma membrane. Evaluation of these two metal ions in the affected individuals revealed variably low levels of Mn and Zn in blood and elevated levels in urine, indicating renal wasting. Our findings identify a human Mn and Zn transporter deficiency syndrome linked to SLC39A8, providing insight into the roles of Mn and Zn homeostasis in human health and development.

Journal ArticleDOI
TL;DR: This targeted sequencing study provides strong functional evidence implicating several specific variants as primary contributory risk alleles for nonsyndromic clefting in humans.
Abstract: Although genome-wide association studies (GWASs) for nonsyndromic orofacial clefts have identified multiple strongly associated regions, the causal variants are unknown. To address this, we selected 13 regions from GWASs and other studies, performed targeted sequencing in 1,409 Asian and European trios, and carried out a series of statistical and functional analyses. Within a cluster of strongly associated common variants near NOG, we found that one, rs227727, disrupts enhancer activity. We furthermore identified significant clusters of non-coding rare variants near NTN1 and NOG and found several rare coding variants likely to affect protein function, including four nonsense variants in ARHGAP29. We confirmed 48 de novo mutations and, based on best biological evidence available, chose two of these for functional assays. One mutation in PAX7 disrupted the DNA binding of the encoded transcription factor in an in vitro assay. The second, a non-coding mutation, disrupted the activity of a neural crest enhancer downstream of FGFR2 both in vitro and in vivo. This targeted sequencing study provides strong functional evidence implicating several specific variants as primary contributory risk alleles for nonsyndromic clefting in humans.

Journal ArticleDOI
TL;DR: Targeted resequencing of 644 individuals with epileptic encephalopathies led to the identification of six SLC6A1 mutations in seven individuals, all of whom have epilepsy with myoclonic-atonic seizures (MAE), accounting for ∼4% of unsolved MAE cases.
Abstract: GAT-1, encoded by SLC6A1, is one of the major gamma-aminobutyric acid (GABA) transporters in the brain and is responsible for re-uptake of GABA from the synapse. In this study, targeted resequencing of 644 individuals with epileptic encephalopathies led to the identification of six SLC6A1 mutations in seven individuals, all of whom have epilepsy with myoclonic-atonic seizures (MAE). We describe two truncations and four missense alterations, all of which most likely lead to loss of function of GAT-1 and thus reduced GABA re-uptake from the synapse. These individuals share many of the electrophysiological properties of Gat1-deficient mice, including spontaneous spike-wave discharges. Overall, pathogenic mutations occurred in 6/160 individuals with MAE, accounting for ∼4% of unsolved MAE cases.