Journal ArticleDOI

Identification of genetic variants associated with Huntington's disease progression: a genome-wide association study

01 Sep 2017-Lancet Neurology (Elsevier)-Vol. 16, Iss: 9, pp 701-711
TL;DR: The study generates a novel measure of Huntington's disease progression and identifies a genome-wide significant signal on chromosome 5 spanning three genes (MSH3, DHFR, and MTRNR2L2). Because knockout of Msh3 reduces somatic expansion in mouse models, somatic expansion is suggested as an area for future therapeutic investigation.
Abstract: Summary Background Huntington's disease is caused by a CAG repeat expansion in the huntingtin gene, HTT. Age at onset has been used as a quantitative phenotype in genetic analysis looking for Huntington's disease modifiers, but is hard to define and not always available. Therefore, we aimed to generate a novel measure of disease progression and to identify genetic markers associated with this progression measure. Methods We generated a progression score on the basis of principal component analysis of prospectively acquired longitudinal changes in motor, cognitive, and imaging measures in the 218 individuals in the TRACK-HD cohort of Huntington's disease gene mutation carriers (data collected 2008–11). We generated a parallel progression score using data from 1773 previously genotyped participants from the European Huntington's Disease Network REGISTRY study of Huntington's disease mutation carriers (data collected 2003–13). We did genome-wide association analyses in terms of progression for 216 TRACK-HD participants and 1773 REGISTRY participants, and then undertook a meta-analysis of these results. Findings Longitudinal motor, cognitive, and imaging scores were correlated with each other in TRACK-HD participants, justifying use of a single, cross-domain measure of disease progression in both studies. The TRACK-HD and REGISTRY progression measures were correlated with each other (r=0·674), and with age at onset (TRACK-HD, r=0·315; REGISTRY, r=0·234). The meta-analysis of progression in TRACK-HD and REGISTRY gave a genome-wide significant signal (p=1·12 × 10⁻¹⁰) on chromosome 5 spanning three genes: MSH3, DHFR, and MTRNR2L2. The genes in this locus were associated with progression in TRACK-HD (MSH3 p=2·94 × 10⁻⁸; DHFR p=8·37 × 10⁻⁷; MTRNR2L2 p=2·15 × 10⁻⁹) and to a lesser extent in REGISTRY (MSH3 p=9·36 × 10⁻⁴; DHFR p=8·45 × 10⁻⁴; MTRNR2L2 p=1·20 × 10⁻³). The lead single nucleotide polymorphism (SNP) in TRACK-HD (rs557874766) was genome-wide significant in the meta-analysis (p=1·58 × 10⁻⁸), and encodes an amino acid change (Pro67Ala) in MSH3. In TRACK-HD, each copy of the minor allele at this SNP was associated with a 0·4 units per year (95% CI 0·16–0·66) reduction in the rate of change of the Unified Huntington's Disease Rating Scale (UHDRS) Total Motor Score, and a reduction of 0·12 units per year (95% CI 0·06–0·18) in the rate of change of UHDRS Total Functional Capacity score. These associations remained significant after adjusting for age of onset. Interpretation The multidomain progression measure in TRACK-HD was associated with a functional variant that was genome-wide significant in our meta-analysis. The association in only 216 participants implies that the progression measure is a sensitive reflection of disease burden, that the effect size at this locus is large, or both. Knockout of Msh3 reduces somatic expansion in Huntington's disease mouse models, suggesting this mechanism as an area for future therapeutic investigation. Funding The European Commission FP7 NeurOmics project; CHDI Foundation; the Medical Research Council UK; the Brain Research Trust; and the Guarantors of Brain.
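
The abstract combines per-study association results in a meta-analysis. As a rough illustration of how two studies of very different size can be pooled, the sketch below uses sample-size-weighted Z-scores (Stouffer's method). This is an assumption made for illustration: the manuscript abstract names METAL as the tool, but the weighting scheme actually used is not stated here. The function names and the example p-values are hypothetical; only the sample sizes (216 and 1773) come from the text.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def signed_z(p_value, direction):
    """Convert a two-sided p-value and an effect direction (+1 or -1) to a Z-score."""
    # inv_cdf(p / 2) is the lower-tail quantile, so negate it to get |Z|.
    return direction * -_STD_NORMAL.inv_cdf(p_value / 2)


def combine_studies(studies):
    """Pool studies given as (two_sided_p, direction, sample_size) tuples.

    Returns (z_meta, p_meta), weighting each study by the square root of its
    sample size.
    """
    weights = [sqrt(n) for _, _, n in studies]
    z_scores = [signed_z(p, d) for p, d, _ in studies]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    z_meta = numerator / sqrt(sum(w * w for w in weights))
    p_meta = 2 * _STD_NORMAL.cdf(-abs(z_meta))  # two-sided p-value
    return z_meta, p_meta


# Hypothetical p-values for one SNP; sample sizes as reported for the two cohorts.
z_meta, p_meta = combine_studies([(1e-3, +1, 216), (1e-2, +1, 1773)])
print(f"Z = {z_meta:.2f}, p = {p_meta:.2e}")
```

Under this scheme the larger cohort dominates the pooled statistic, so a strong signal in a small cohort only reaches genome-wide significance if the larger cohort supports the same direction of effect.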

Summary (2 min read)

INTRODUCTION

  • Huntington’s disease (HD) is an autosomal dominant fatal neurodegenerative condition caused by a CAG repeat expansion in HTT (1).
  • The transition from premanifest to manifest HD is gradual (4, 5), making clinical definition challenging; furthermore, psychiatric and cognitive changes may not be concurrent with motor onset (6).
  • The need for clinical trials close to disease onset has motivated a raft of observational studies (5, 9, 10).
  • TRACK-HD represents the most deeply phenotyped cohort of premanifest and symptomatic disease with annual visits involving clinical, cognitive and motor testing alongside detailed brain imaging (5, 6).
  • The authors developed a similar measure in subjects from the REGISTRY study to replicate their findings (9).

Study design and participants

  • All experiments were performed in accordance with the Declaration of Helsinki and approved by the University College London (UCL)/UCL Hospitals Joint Research Ethics Committee; ethical approval for the REGISTRY analysis is outlined in (8).
  • TRACK-HD provides annually collected, high-quality, prospective longitudinal multivariate data over three years (2008–2011), with 243 subjects at baseline (6).
  • Demographic details of these individuals are shown in Supplementary Information.
  • REGISTRY (9) was a multisite prospective observational study which collected phenotypic data between 2003 and 2013 on over 13,000 subjects, mostly manifest HD gene carriers.
  • The core data include age, CAG repeat length, UHDRS Total Motor Score (TMS) and Total Functional Capacity (TFC); some patients have further assessments such as a cognitive battery (9).

Procedures

  • For both studies, atypical severity scores were derived with a combination of principal component analysis (PCA) and regression of the predictable effects of the primary gene HTT CAG repeat length.
  • Details differed, however, owing to differences in the nature of the two data sets.
  • The model regressed the observed values on the clinical probability of onset (CPO) statistic, derived from CAG repeat length and age, and on its interaction with follow-up length.
  • Principal component analysis (PCA) of the random slopes was then used to study the dimensionality of these age and CAG-length corrected longitudinal changes.
  • Further methodological detail, including control for potential demographic confounders, is given in Supplementary Methods and a flow chart is given in Figure 1.
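
The procedure above can be sketched roughly as follows. This is a minimal illustration on simulated data, not the authors' SAS pipeline: it replaces the mixed-model random slopes with ordinary least-squares residuals, and every name in it (`residualize`, `first_pc_scores`, the simulated `cpo` predictor and `latent` factor) is hypothetical.

```python
import numpy as np


def residualize(y, x):
    """Return residuals of y after least-squares regression on x (with intercept)."""
    design = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ coef


def first_pc_scores(matrix):
    """Standardize columns; return first-principal-component scores and variance share."""
    z = (matrix - matrix.mean(axis=0)) / matrix.std(axis=0, ddof=1)
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    scores = u[:, 0] * s[0]                 # projection onto the first component
    explained = s[0] ** 2 / np.sum(s ** 2)  # share of total variance
    return scores, explained


rng = np.random.default_rng(0)
n = 200
cpo = rng.normal(size=n)     # stand-in for the CAG/age-based predictor (assumption)
latent = rng.normal(size=n)  # unobserved "atypical" progression shared across domains

# Three simulated rates of change (think motor, cognitive, imaging), each driven
# by the predictable CAG/age effect plus the shared latent factor plus noise.
slopes = np.column_stack([
    1.0 * cpo + 0.8 * latent + rng.normal(scale=0.5, size=n) for _ in range(3)
])

# Step 1: remove the predictable effect; step 2: summarize what remains with PC1.
adjusted = np.column_stack([residualize(slopes[:, j], cpo) for j in range(3)])
progression, share = first_pc_scores(adjusted)
print(f"PC1 explains {share:.0%} of the adjusted variance")
```

The sign of a principal component is arbitrary, so in practice the score would need to be oriented so that higher values mean faster progression.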

Statistical and genetic analysis

  • Data analyses were performed using SAS/STAT 14·0 and 14·1, primarily via the MIXED, FACTOR and GLM procedures (11).
  • Genotypes for the REGISTRY subjects were obtained from the GeM-HD Consortium (8), where details of their genotyping, quality control, curation and imputation are provided.
  • Association analyses were performed with the mixed linear model (MLM) functions included in GCTA v1·26 (12).
  • Because of the relatively small sample sizes, analyses were restricted to SNPs with minor allele frequency >1%.
  • Gene-wide p-values were calculated using MAGMA v1·05, a powerful alternative to SNP-based analyses which aggregates the association signal inside genes while taking linkage disequilibrium (LD) between SNPs into account (15), using a window of 35kb upstream and 10kb downstream of genes (16).
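
Two of the filtering steps above, the minor allele frequency cut-off and the gene window, can be illustrated with a short sketch. It is not MAGMA's or GCTA's implementation: the `Gene` class, the helper names, and all coordinates and genotypes are hypothetical, and treating the window as strand-aware is an assumption of this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    start: int   # leftmost base of the gene on the reference
    end: int     # rightmost base of the gene on the reference
    strand: str  # "+" or "-"


def annotation_window(gene, upstream=35_000, downstream=10_000):
    """Return (low, high) coordinates of the gene plus its flanking window."""
    if gene.strand == "+":
        return max(1, gene.start - upstream), gene.end + downstream
    # On the minus strand, "upstream" lies at higher coordinates.
    return max(1, gene.start - downstream), gene.end + upstream


def minor_allele_frequency(genotypes):
    """genotypes: alternate-allele counts (0, 1 or 2) per person; None means missing."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1 - alt_freq)


def snps_for_gene(gene, snps, min_maf=0.01):
    """Keep SNP ids inside the gene window whose MAF exceeds min_maf.

    snps: iterable of (snp_id, position, genotypes).
    """
    low, high = annotation_window(gene)
    return [
        snp_id
        for snp_id, position, genotypes in snps
        if low <= position <= high and minor_allele_frequency(genotypes) > min_maf
    ]


gene = Gene("GENE_A", 100_000, 150_000, "+")     # hypothetical gene
snps = [
    ("snp1", 70_000, [0, 1, 1, 2, 0]),           # in the upstream window, common
    ("snp2", 155_000, [0] * 99 + [1]),           # in the window, but MAF = 0.5%
    ("snp3", 200_000, [0, 1, 2, 1, 0]),          # outside the window
]
print(snps_for_gene(gene, snps))
```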

RESULTS

  • The authors performed individual PCA of each domain and found that first PC scores were highly correlated between the domains (P < 0·0001 in all cases; Supplementary Information).
  • The first PC of this combined analysis accounted for 23·4% of the joint variance, and was at least moderately correlated (r>0·4) with most of the variables that contributed heavily to each domain-specific first PC (Supplementary Tables 3 and 4).
  • Notably, the genic associations at the MSH3 locus in the TRACK-HD sample also remain significant after correcting for AAO (http://hdresearch.ucl.ac.uk/data-resources/), as does the association with rs557874766 (p=6·30 × 10⁻⁶).
  • Msh3 is required both for somatic expansion of HTT CAG repeats and for enhancing an early disease phenotype in mouse striatum (32); Msh3 expression level is associated with repeat instability in mouse brain, whereas DHFR is not (30); and expansion of CAG and CTG repeats is prevented by msh3Δ in Saccharomyces cerevisiae (33).
  • This indicates either that the progression measure developed in TRACK-HD is an excellent reflection of disease pathophysiological progression, or that this is a locus with a very large effect size, or, most likely, both.

Author contributions and declarations

  • DJHM collected data, undertook analysis, and wrote the first draft of the manuscript.
  • DL undertook the statistical analysis of phenotype and co-wrote the manuscript.
  • LJ helped secure funding, supervised data analyses, and co-wrote the manuscript.
  • DL reports grant funding from CHDI via University College London (UCL), and personal fees from Roche Pharmaceutical, Voyager Pharmaceutical, and Teva Pharmaceuticals.
  • DJHM, KL, AD, AFP, SM, LJ, RR, PH, and SJT declare no competing interests.

Figure & Table legends

  • After establishing that brain imaging, quantitative motor and cognitive variables are correlated and follow a similar trajectory, the authors scored the TRACK-HD subjects using principal component 1 as a unified progression measure, and used this measure to look for genome-wide associations with HD progression.
  • Figure 2: Assessing progression in Huntington’s disease. (A) Graphical illustration of the trajectory of HD symptoms and signs over time, annotated to show what time period the different measures of onset and progression discussed in this paper cover.
  • (B) Manhattan plot of REGISTRY GWA analysis showing suggestive trails on chromosome 15 in the same area as the GeM GWAS significant locus (8), and chromosome 5 in the same area as the TRACK progression GWAS.
  • (iii) Repair of the strand break leads to expansion of the CAG repeat.
  • The p-values in columns 2–4 refer to the association between the pathway indicated and rate of progression described in this paper (TRACK: TRACK-HD study; REGISTRY: REGISTRY study; META: meta-analysis).


Identification of genetic variants associated with Huntington’s disease progression: a genome-wide association study
Davina J Hensman Moss*¹, MBBS, Antonio F. Pardiñas*², PhD, Prof Douglas Langbehn³, PhD, Kitty Lo⁴, PhD, Prof Blair R. Leavitt⁵, MD,CM, Prof Raymund Roos⁶, MD, Prof Alexandra Durr⁷, MD, Prof Simon Mead⁸, PhD, the REGISTRY investigators and the TRACK-HD investigators, Prof Peter Holmans², PhD, Prof Lesley Jones§², PhD, Prof Sarah J Tabrizi§¹, PhD.
* These authors contributed equally to this work
§ These authors contributed equally to this work
1) UCL Huntington’s Disease Centre, UCL Institute of Neurology, Dept. of Neurodegenerative Disease, London, UK
2) MRC Centre for Neuropsychiatric Genetics and Genomics, Cardiff University, Cardiff, UK
3) University of Iowa Carver College of Medicine, Dept. of Psychiatry and Biostatistics, Iowa, USA
4) UCL Genetics Institute, Div. of Biosciences, London, UK
5) Centre for Molecular Medicine and Therapeutics, Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
6) Department of Neurology, Leiden University Medical Centre, Leiden, Netherlands
7) ICM and APHP Department of Genetics, Inserm U 1127, CNRS UMR 7225, Sorbonne Universités, UPMC Univ Paris 06 UMR S 1127, Pitié-Salpêtrière University Hospital, Paris, France
8) MRC Prion Unit, UCL Institute of Neurology, London, UK
Corresponding authors:
Sarah J Tabrizi at s.tabrizi@ucl.ac.uk
Lesley Jones at JonesL1@cardiff.ac.uk

ABSTRACT
Background Huntington’s disease (HD) is a fatal inherited neurodegenerative disease, caused by a
CAG repeat expansion in HTT. Age at onset (AAO) has been used as a quantitative phenotype in
genetic analysis looking for HD modifiers, but is hard to define and not always available. Therefore
here we aimed to generate a novel measure of disease progression, and identify genetic markers
associated with this progression measure.
Methods We generated a progression score based on principal component analysis of prospectively
acquired longitudinal changes in motor, behavioural, cognitive and imaging measures in the
TRACK-HD cohort of HD gene mutation carriers (data collected 2008–2011). We generated a
parallel progression score using 1773 previously genotyped subjects from the REGISTRY study of
HD mutation carriers (data collected 2003–2013). 216 subjects from TRACK-HD were genotyped.
Association analyses were performed using GCTA, gene-wide analysis using MAGMA and
meta-analysis using METAL.
Findings Longitudinal motor, cognitive and imaging scores were correlated with each other in
TRACK-HD subjects, justifying a single, cross-domain measure as a unified progression measure in
both studies. The TRACK-HD and REGISTRY progression measures were correlated with each
other (r=0·674), and with AAO (r=0·315 and r=0·234, respectively). A meta-analysis of progression in
TRACK-HD and REGISTRY gave a genome-wide significant signal (p=1·12 × 10⁻¹⁰) on chromosome
5 spanning 3 genes, MSH3, DHFR and MTRNR2L2. The lead SNP in TRACK-HD (rs557874766) is
genome-wide significant in the meta-analysis (p=1·58 × 10⁻⁸), and encodes an amino acid change
(Pro67Ala) in MSH3. In TRACK-HD, each copy of the minor allele at this SNP is associated with a
0·4 (95% CI=0·16,0·66) units per year reduction in the rate of change of the Unified Huntington’s
Disease Rating Scale (UHDRS) Total Motor Score, and 0·12 (95% CI=0·06,0·18) units per year in
the rate of change of UHDRS Total Functional Capacity. The associations remained significant after
adjusting for AAO.
Interpretation The multi-domain progression measure in TRACK-HD is associated with a
functional variant that is genome-wide significant in a meta-analysis. The strong association in only
216 subjects implies that the progression measure is a sensitive reflection of disease burden, that the
effect size at this locus is large, or both. As knock out of Msh3 reduces somatic expansion in HD
mouse models, this highlights somatic expansion as a potential pathogenic modulator, informing
therapeutic development in this untreatable disease.
Funding sources The European Commission FP7 NeurOmics project; CHDI Foundation; the
Medical Research Council UK, the Brain Research Trust, the Guarantors of Brain.
Research in context
Evidence before this study
Huntington’s disease (HD) is universally caused by a tract of 36 or more CAG in exon 1 of HTT.
Genetic modifiers of age at motor onset have recently been identified in HD that highlight pathways,
which if modulated in people, might delay disease onset. Onset of disease is preceded by a long
prodromal phase accompanied by substantial brain cell death and age at motor onset is difficult to
assess accurately and is not available in disease free at risk subjects. We searched all of PubMed up
to Oct 31st 2016 for articles published in English containing “Huntington* disease” AND “genetic
modifier” AND “onset” which identified 13 studies, then “Huntington* disease” AND “genetic
modifier” AND “progression” which identified one review article. Amongst the 13 studies of
genetic modification of HD onset most were small candidate gene studies; these were superseded by
the one large genome wide genetic modifiers of HD study which identified three genome-wide
significant loci, and implicated DNA handling in HD disease modification.
Added value of this study
We examined the prospective data from TRACK-HD and developed a measure of disease
progression that reflected correlated progression in the brain imaging, motor and cognitive symptom
domains: there is substantial correlation among these variables. We used the disease progression
measure as a quantitative variable in a genome-wide association study and in only 216 people from
TRACK-HD detected a locus on chromosome 5 containing three significant genes, MTRNR2L2,
MSH3 and DHFR. The index variant encodes an amino acid change in MSH3. We replicated this
finding by generating a parallel progression measure in the less intensively phenotyped REGISTRY
study and detected a similar signal on chromosome 5, likely attributable to the same variants. A
meta-analysis of the two studies strengthened the associations. There was some correlation between
the progression measures and AAO of disease but this was not responsible for the association with
disease progression. We also detected a signal on chromosome 15 in the REGISTRY study at the
same locus as that previously associated with AAO.
Implications of all the available evidence
The progression measures used in this study can be generated in asymptomatic and symptomatic
subjects using a subset of the clinically relevant parameters gathered in TRACK-HD. We use these
measures to identify genetic modifiers of disease progression in HD. We saw a signal in only 216
subjects, which replicates in a larger sample, becoming genome-wide significant, thus reducing the
chance of it being a false positive. This argues for the power of better phenotypic measures in
genetic studies and implies that this locus has a large effect size on disease progression. The index
associated genetic variant in TRACK-HD encodes a Pro67Ala change in MSH3, which implicates
MSH3 as the associated gene on chromosome 5. Notably, altering levels of Msh3 in HD mice
reduces somatic instability and crossing Msh3 null mice with HD mouse models prevents somatic
instability of the HTT CAG repeat and reduces pathological phenotypes. Polymorphism in MSH3 has
been linked to somatic instability in myotonic dystrophy type 1 patients. MSH3 is a non-essential
neuronally expressed member of the DNA mismatch repair pathway and these data reinforce its
candidacy as a therapeutic target in HD and potentially in other neurodegenerative expanded repeat
disorders.

Citations
Journal ArticleDOI
TL;DR: Antisense oligonucleotide therapy is one such approach with clinical trials currently under way that may bring us one step closer to treating and potentially preventing this devastating condition.
Abstract: Huntington's disease (HD) is a fully penetrant neurodegenerative disease caused by a dominantly inherited CAG trinucleotide repeat expansion in the huntingtin gene on chromosome 4. In Western populations HD has a prevalence of 10.6-13.7 individuals per 100 000. It is characterized by cognitive, motor and psychiatric disturbance. At the cellular level mutant huntingtin results in neuronal dysfunction and death through a number of mechanisms, including disruption of proteostasis, transcription and mitochondrial function and direct toxicity of the mutant protein. Early macroscopic changes are seen in the striatum with involvement of the cortex as the disease progresses. There are currently no disease modifying treatments; therefore supportive and symptomatic management is the mainstay of treatment. In recent years there have been significant advances in understanding both the cellular pathology and the macroscopic structural brain changes that occur as the disease progresses. In the last decade there has been a large growth in potential therapeutic targets and clinical trials. Perhaps the most promising of these are the emerging therapies aimed at lowering levels of mutant huntingtin. Antisense oligonucleotide therapy is one such approach with clinical trials currently under way. This may bring us one step closer to treating and potentially preventing this devastating condition.

547 citations

Journal ArticleDOI
TL;DR: Among carriers of monogenic risk variants, disease risk is shown to be a gradient influenced by polygenic background, and it is proposed that accounting for polygenic background is likely to increase the accuracy of risk estimation for individuals who inherit a monogenic risk variant.
Abstract: Genetic variation can predispose to disease both through (i) monogenic risk variants that disrupt a physiologic pathway with large effect on disease and (ii) polygenic risk that involves many variants of small effect in different pathways. Few studies have explored the interplay between monogenic and polygenic risk. Here, we study 80,928 individuals to examine whether polygenic background can modify penetrance of disease in tier 1 genomic conditions - familial hypercholesterolemia, hereditary breast and ovarian cancer, and Lynch syndrome. Among carriers of a monogenic risk variant, we estimate substantial gradients in disease risk based on polygenic background - the probability of disease by age 75 years ranged from 17% to 78% for coronary artery disease, 13% to 76% for breast cancer, and 11% to 80% for colon cancer. We propose that accounting for polygenic background is likely to increase accuracy of risk estimation for individuals who inherit a monogenic risk variant.

244 citations

Journal ArticleDOI
14 Jun 2018-Cell
TL;DR: It is concluded that a focus on patient stratification is needed to achieve the goals of precision medicine, and the recent "omnigenic" or "core genes" model may underestimate the biological complexity of common disease.

219 citations


Cites background from "Identification of genetic variants ..."

  • ...Disease-risk genes are expected to harbor both common and rare risk variants, and empirical data for height (Kemper et al., 2012; Marouli et al., 2017), type 2 diabetes (Fuchsberger et al., 2016), inflammatory bowel disease (IBD) (Luo et al., 2017), and high-density lipoprotein (HDL) cholesterol (Rosenson et al., 2018) show commonvariant associations variants in genes responsible for related monogenic disorders....

    [...]

  • ...For IBD, WES (4,280 cases) identified a single rare variant (in a previously known locus) and an excess of very rare, damaging missense variants in known Crohn’s disease risk genes (including those identified through GWAS) (Luo et al., 2017)....

    [...]

  • ..., 2016), inflammatory bowel disease (IBD) (Luo et al., 2017), and high-density lipoprotein (HDL) cholesterol (Rosenson et al....

    [...]

  • ...Yet important advances for identification of potential therapeutic targets have been made through GWASs of age of onset (GeM-HD, 2015) and rate of disease progression (Hensman Moss et al., 2017)....

    [...]

Journal ArticleDOI
TL;DR: New insights into the molecular pathogenesis of Huntington disease are discussed, along with future therapeutic strategies, including the modulation of DNA repair and targeting the DNA mutation itself.
Abstract: Huntington disease (HD) is a neurodegenerative disease caused by CAG repeat expansion in the huntingtin gene (HTT) and involves a complex web of pathogenic mechanisms. Mutant HTT (mHTT) disrupts transcription, interferes with immune and mitochondrial function, and is aberrantly modified post-translationally. Evidence suggests that the mHTT RNA is toxic, and at the DNA level, somatic CAG repeat expansion in vulnerable cells influences the disease course. Genome-wide association studies have identified DNA repair pathways as modifiers of somatic instability and disease course in HD and other repeat expansion diseases. In animal models of HD, nucleocytoplasmic transport is disrupted and its restoration is neuroprotective. Novel cerebrospinal fluid (CSF) and plasma biomarkers are among the earliest detectable changes in individuals with premanifest HD and have the sensitivity to detect therapeutic benefit. Therapeutically, the first human trial of an HTT-lowering antisense oligonucleotide successfully, and safely, reduced the CSF concentration of mHTT in individuals with HD. A larger trial, powered to detect clinical efficacy, is underway, along with trials of other HTT-lowering approaches. In this Review, we discuss new insights into the molecular pathogenesis of HD and future therapeutic strategies, including the modulation of DNA repair and targeting the DNA mutation itself.

193 citations

References
Journal ArticleDOI
TL;DR: The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing.
Abstract: Genomic sequencing has made it clear that a large fraction of the genes specifying the core biological functions are shared by all eukaryotes. Knowledge of the biological role of such shared proteins in one organism can often be transferred to other organisms. The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing. To this end, three independent ontologies accessible on the World-Wide Web (http://www.geneontology.org) are being constructed: biological process, molecular function and cellular component.

35,225 citations


"Identification of genetic variants ..." refers background in this paper

  • ...For both studies, atypical severity scores were derived with a combination of principal component analysis (PCA) and regression of the predictable effects of the primary gene HTT CAG repeat length....

    [...]

  • ...The GO and KEGG terms in the first column refer to pathways of biologically related genes in the Gene Ontology Consortium(1) and Kyoto Encyclopedia of Genes and Genomes (2) databases respectively....

    [...]

  • ...The colours of the circles are based on r² with the lead SNP in TRACK-HD as shown in the bottom of the plot; intensity of colour reflects multiple overlying SNPs. Dashed lines: 5 × 10⁻⁸. Figure 4: Significant genes are functionally linked and may cause somatic expansion of the HTT CAG repeat tract....

    [...]

  • ...Background Huntington’s disease (HD) is a fatal inherited neurodegenerative disease, caused by a CAG repeat expansion in HTT. Age at onset (AAO) has been used as a quantitative phenotype in genetic analysis looking for HD modifiers, but is hard to define and not always available....

    [...]

  • ...Huntington’s disease (HD) is an autosomal dominant fatal neurodegenerative condition caused by a CAG repeat expansion in HTT (1)....

    [...]

Journal ArticleDOI
TL;DR: The Kyoto Encyclopedia of Genes and Genomes (KEGG) as discussed by the authors is a knowledge base for systematic analysis of gene functions in terms of the networks of genes and molecules.
Abstract: Kyoto Encyclopedia of Genes and Genomes (KEGG) is a knowledge base for systematic analysis of gene functions in terms of the networks of genes and molecules. The major component of KEGG is the PATHWAY database that consists of graphical diagrams of biochemical pathways including most of the known metabolic pathways and some of the known regulatory pathways. The pathway information is also represented by the ortholog group tables summarizing orthologous and paralogous gene groups among different organisms. KEGG maintains the GENES database for the gene catalogs of all organisms with complete genomes and selected organisms with partial genomes, which are continuously re-annotated, as well as the LIGAND database for chemical compounds and enzymes. Each gene catalog is associated with the graphical genome map for chromosomal locations that is represented by Java applet. In addition to the data collection efforts, KEGG develops and provides various computational tools, such as for reconstructing biochemical pathways from the complete genome sequence and for predicting gene regulatory networks from the gene expression profiles. The KEGG databases are daily updated and made freely available (http://www.genome.ad.jp/kegg/).

24,024 citations

Book
01 Jan 1987
TL;DR: A book on multiple imputation for handling nonresponse in surveys, covering the statistical background, the underlying Bayesian theory, randomization-based evaluations, and procedures for ignorable and nonignorable nonresponse.
Abstract: Tables and Figures. Glossary. 1. Introduction. 1.1 Overview. 1.2 Examples of Surveys with Nonresponse. 1.3 Properly Handling Nonresponse. 1.4 Single Imputation. 1.5 Multiple Imputation. 1.6 Numerical Example Using Multiple Imputation. 1.7 Guidance for the Reader. 2. Statistical Background. 2.1 Introduction. 2.2 Variables in the Finite Population. 2.3 Probability Distributions and Related Calculations. 2.4 Probability Specifications for Indicator Variables. 2.5 Probability Specifications for (X,Y). 2.6 Bayesian Inference for a Population Quality. 2.7 Interval Estimation. 2.8 Bayesian Procedures for Constructing Interval Estimates, Including Significance Levels and Point Estimates. 2.9 Evaluating the Performance of Procedures. 2.10 Similarity of Bayesian and Randomization--Based Inferences in Many Practical Cases. 3. Underlying Bayesian Theory. 3.1 Introduction and Summary of Repeated--Imputation Inferences. 3.2 Key Results for Analysis When the Multiple Imputations are Repeated Draws from the Posterior Distribution of the Missing Values. 3.3 Inference for Scalar Estimands from a Modest Number of Repeated Completed--Data Means and Variances. 3.4 Significance Levels for Multicomponent Estimands from a Modest Number of Repeated Completed--Data Means and Variance--Covariance Matrices. 3.5 Significance Levels from Repeated Completed--Data Significance Levels. 3.6 Relating the Completed--Data and Completed--Data Posterior Distributions When the Sampling Mechanism is Ignorable. 4. Randomization--Based Evaluations. 4.1 Introduction. 4.2 General Conditions for the Randomization--Validity of Infinite--m Repeated--Imputation Inferences. 4.3Examples of Proper and Improper Imputation Methods in a Simple Case with Ignorable Nonresponse. 4.4 Further Discussion of Proper Imputation Methods. 4.5 The Asymptotic Distibution of (Qm,Um,Bm) for Proper Imputation Methods. 4.6 Evaluations of Finite--m Inferences with Scalar Estimands. 
4.7 Evaluation of Significance Levels from the Moment--Based Statistics Dm and Dm with Multicomponent Estimands. 4.8 Evaluation of Significance Levels Based on Repeated Significance Levels. 5. Procedures with Ignorable Nonresponse. 5.1 Introduction. 5.2 Creating Imputed Values under an Explicit Model. 5.3 Some Explicit Imputation Models with Univariate YI and Covariates. 5.4 Monotone Patterns of Missingness in Multivariate YI. 5.5 Missing Social Security Benefits in the Current Population Survey. 5.6 Beyond Monotone Missingness. 6. Procedures with Nonignorable Nonresponse. 6.1 Introduction. 6.2 Nonignorable Nonresponse with Univariate YI and No XI. 6.3 Formal Tasks with Nonignorable Nonresponse. 6.4 Illustrating Mixture Modeling Using Educational Testing Service Data. 6.5 Illustrating Selection Modeling Using CPS Data. 6.6 Extensions to Surveys with Follow--Ups. 6.7 Follow--Up Response in a Survey of Drinking Behavior Among Men of Retirement Age. References. Author Index. Subject Index. Appendix I. Report Written for the Social Security Administration in 1977. Appendix II. Report Written for the Census Bureau in 1983.

14,574 citations

Journal ArticleDOI
Monkol Lek, Konrad J. Karczewski, Eric Vallabh Minikel, Kaitlin E. Samocha, Eric Banks, Timothy Fennell, Anne H. O’Donnell-Luria, James S. Ware, Andrew J. Hill, Beryl B. Cummings, Taru Tukiainen, Daniel P. Birnbaum, Jack A. Kosmicki, Laramie E. Duncan, Karol Estrada, Fengmei Zhao, James Zou, Emma Pierce-Hoffman, Joanne Berghout, David Neil Cooper, Nicole A. Deflaux, Mark A. DePristo, Ron Do, Jason Flannick, Menachem Fromer, Laura D. Gauthier, Jackie Goldstein, Namrata Gupta, Daniel P. Howrigan, Adam Kiezun, Mitja I. Kurki, Ami Levy Moonshine, Pradeep Natarajan, Lorena Orozco, Gina M. Peloso, Ryan Poplin, Manuel A. Rivas, Valentin Ruano-Rubio, Samuel A. Rose, Douglas M. Ruderfer, Khalid Shakir, Peter D. Stenson, Christine Stevens, Brett Thomas, Grace Tiao, María Teresa Tusié-Luna, Ben Weisburd, Hong-Hee Won, Dongmei Yu, David Altshuler, Diego Ardissino, Michael Boehnke, John Danesh, Stacey Donnelly, Roberto Elosua, Jose C. Florez, Stacey Gabriel, Gad Getz, Stephen J. Glatt, Christina M. Hultman, Sekar Kathiresan, Markku Laakso, Steven A. McCarroll, Mark I. McCarthy, Dermot P.B. McGovern, Ruth McPherson, Benjamin M. Neale, Aarno Palotie, Shaun Purcell, Danish Saleheen, Jeremiah M. Scharf, Pamela Sklar, Patrick F. Sullivan, Jaakko Tuomilehto, Ming T. Tsuang, Hugh Watkins, James G. Wilson, Mark J. Daly, Daniel G. MacArthur
18 Aug 2016-Nature
TL;DR: The aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC) provides direct evidence for the presence of widespread mutational recurrence.
Abstract: Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human 'knockout' variants in protein-coding genes.

8,758 citations


"Identification of genetic variants ..." refers background in this paper

  • ...Loss of or variation in mismatch repair complexes can cause malignancy and thus they are not regarded as ideal drug targets, but MSH3 is not essential as it can tolerate loss of function variation (36) and could provide a therapeutic target in HD....


Journal Article
TL;DR: Hierarchical and self-consistent orthology annotations are introduced for all interacting proteins, grouping the proteins into families at various levels of phylogenetic resolution in the STRING database.
Abstract: The many functional partnerships and interactions that occur between proteins are at the core of cellular processing and their systematic characterization helps to provide context in molecular systems biology. However, known and predicted interactions are scattered over multiple resources, and the available data exhibit notable differences in terms of quality and completeness. The STRING database (http://string-db.org) aims to provide a critical assessment and integration of protein-protein interactions, including direct (physical) as well as indirect (functional) associations. The new version 10.0 of STRING covers more than 2000 organisms, which has necessitated novel, scalable algorithms for transferring interaction information between organisms. For this purpose, we have introduced hierarchical and self-consistent orthology annotations for all interacting proteins, grouping the proteins into families at various levels of phylogenetic resolution. Further improvements in version 10.0 include a completely redesigned prediction pipeline for inferring protein-protein associations from co-expression data, an API interface for the R computing environment and improved statistical analysis for enrichment tests in user-provided networks.

8,224 citations


"Identification of genetic variants ..." refers background in this paper

  • ...org/cgi/ accessed October 2016 and January 2017 (37)....


Frequently Asked Questions (11)
Q1. What contributions have the authors mentioned in the paper "Identification of genetic variants associated with Huntington’s disease progression: a genome-wide association study"?

In this paper, the authors used the TRACK-HD data to generate a novel unified disease progression measure for use in a genetic association analysis and developed a similar measure in subjects from the REGISTRY study. 

Altering levels of Msh3 in HD mice reduces somatic instability, and crossing Msh3 null mice with HD mouse models prevents somatic instability of the HTT CAG repeat and reduces pathological phenotypes.

The most significant SNP in the meta-analysis is rs1232027, which is genome-wide significant (p = 1.12 × 10⁻¹⁰); the p-value of rs557874766 is 1.58 × 10⁻⁸.
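As an illustrative aside, the p-values quoted above can be compared against the conventional genome-wide significance threshold of 5 × 10⁻⁸. The sketch below also shows one common way to combine results from two cohorts, an inverse-variance-weighted fixed-effect meta-analysis. This is an assumption made for illustration: the passage does not state which meta-analysis method or threshold the study used, and the per-cohort effect sizes and standard errors in the example are hypothetical, not values from the paper.

```python
import math

# Conventional GWAS threshold; assumed here, not quoted from the paper.
GENOME_WIDE_ALPHA = 5e-8


def is_genome_wide_significant(p, alpha=GENOME_WIDE_ALPHA):
    """Return True if a p-value falls below the genome-wide threshold."""
    return p < alpha


def fixed_effect_meta(betas, ses):
    """Inverse-variance-weighted fixed-effect meta-analysis.

    betas, ses: per-cohort effect estimates and their standard errors.
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = math.sqrt(1.0 / total)
    z = pooled_beta / pooled_se
    # Two-sided tail probability of a standard normal.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


# p-values reported in the text above.
reported = {"rs1232027": 1.12e-10, "rs557874766": 1.58e-8}
for snp, p_value in reported.items():
    print(snp, is_genome_wide_significant(p_value))

# Hypothetical per-cohort effects (NOT taken from the study).
beta, se, z, p = fixed_effect_meta([0.5, 0.3], [0.1, 0.1])
print(f"pooled beta={beta:.3f}, se={se:.4f}, z={z:.3f}, p={p:.3e}")
```

Weighting each cohort by the inverse of its variance means a larger, more precise cohort contributes more to the pooled estimate than a smaller one.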

The progression measures used in this study can be generated in asymptomatic and symptomatic subjects using a subset of the clinically relevant parameters gathered in TRACK-HD. 

Despite the lack of co-localisation between the TRACK GWAS and MSH3 expression signal, several of the most significant GWAS SNPs were associated with decreased MSH3 expression and slower progression (Supplementary Information). 

The cross-domain first principal component was used as a unified Huntington’s disease progression measure in the TRACK-HD cohort (Figures 1 and 2B).
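The idea of collapsing correlated motor, cognitive, and imaging changes into a single score can be sketched as follows. This is a minimal illustration on synthetic data, assuming z-scored variables and a plain power iteration to find the first principal component. The three simulated "domains", the noise level, and all function names are hypothetical; the study's actual variables and preprocessing are not reproduced here.

```python
import math
import random


def zscore(col):
    """Standardize a list of numbers to mean 0 and sample SD 1."""
    n = len(col)
    mean = sum(col) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
    return [(v - mean) / sd for v in col]


def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    za, zb = zscore(a), zscore(b)
    return sum(x * y for x, y in zip(za, zb)) / (len(a) - 1)


def first_principal_component(columns, iters=200):
    """Return (scores, loadings, explained_ratio) for the first PC.

    columns: list of variables, each a list with one value per subject.
    Uses power iteration on the correlation matrix; this assumes the
    start vector is not orthogonal to the leading eigenvector.
    """
    z_cols = [zscore(c) for c in columns]
    k = len(z_cols)
    corr = [[pearson(ci, cj) for cj in z_cols] for ci in z_cols]
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(iters):
        w = [sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * corr[i][j] * v[j]
                     for i in range(k) for j in range(k))
    n = len(z_cols[0])
    scores = [sum(v[j] * z_cols[j][i] for j in range(k)) for i in range(n)]
    # The trace of a k x k correlation matrix is k.
    return scores, v, eigenvalue / k


# Synthetic cohort: one latent "progression" factor drives three domains.
random.seed(0)
n_subjects = 218  # matches the TRACK-HD cohort size; the data are simulated
latent = [random.gauss(0, 1) for _ in range(n_subjects)]
domains = [[x + 0.5 * random.gauss(0, 1) for x in latent] for _ in range(3)]

scores, loadings, explained = first_principal_component(domains)
print("loadings:", [round(x, 3) for x in loadings])
print("variance explained by PC1:", round(explained, 3))
print("correlation of PC1 score with latent factor:",
      round(pearson(scores, latent), 3))
```

When the domains are strongly correlated, as the paper reports for TRACK-HD, the first component captures most of the shared variance, which is the rationale for using it as a single progression score.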

Genetic modifiers of age at motor onset have recently been identified in HD, highlighting pathways that, if modulated in people, might delay disease onset.

A complete list of genes in the Pearl et al. (20) pathways is given in http://hdresearch.ucl.ac.uk/data-resources/.

DISCUSSION: The evidence from their study suggests that MSH3 is likely to be a modifier of disease progression in Huntington’s disease.

The signal on chromosome 5 could be due to the coding change in MSH3, or to expression changes in MSH3, DHFR, or both, and both effects may operate in disease.

MSH3 is a neuronally expressed member of a family of DNA mismatch repair proteins (29); it forms a heteromeric complex with MSH2 to form MutSβ, which recognises insertion-deletion loops of up to 13 nucleotides (30) (Figure 4D). 

Changes in CAG repeat size occur in terminally differentiated neurons in several HD mouse models and in human patient striatum, the brain area most affected in HD, and notably, somatic expansion of the CAG repeat in HD patient brain predicts onset (31).