Journal Article

Rare variants in CFI, C3 and C9 are associated with high risk of advanced age-related macular degeneration.

TL;DR: The results implicate loss of C3 protein regulation and excessive alternative complement activation in AMD pathogenesis, thus informing both the direction of effect and mechanistic underpinnings of this disorder.
Abstract: To define the role of rare variants in advanced age-related macular degeneration (AMD) risk, we sequenced the exons of 681 genes within all reported AMD loci and related pathways in 2,493 cases and controls. We first tested each gene for increased or decreased burden of rare variants in cases compared to controls. We found that 7.8% of AMD cases compared to 2.3% of controls are carriers of rare missense CFI variants (odds ratio (OR) = 3.6; P = 2 × 10⁻⁸). There was a predominance of dysfunctional variants in cases compared to controls. We then tested individual variants for association with disease. We observed significant association with rare missense alleles in genes other than CFI. Genotyping in 5,115 independent samples confirmed associations with AMD of an allele in C3 encoding p.Lys155Gln (replication P = 3.5 × 10⁻⁵, OR = 2.8; joint P = 5.2 × 10⁻⁹, OR = 3.8) and an allele in C9 encoding p.Pro167Ser (replication P = 2.4 × 10⁻⁵, OR = 2.2; joint P = 6.5 × 10⁻⁷, OR = 2.2). Finally, we show that the allele of C3 encoding Gln155 results in resistance to proteolytic inactivation by CFH and CFI. These results implicate loss of C3 protein regulation and excessive alternative complement activation in AMD pathogenesis, thus informing both the direction of effect and mechanistic underpinnings of this disorder.
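As a rough check on the CFI burden result quoted in the abstract, the carrier frequencies (7.8% of cases, 2.3% of controls) can be converted into a crude odds ratio. The sketch below is illustrative only: the function name `odds_ratio` is hypothetical, the inputs are the rounded percentages from the abstract rather than the underlying counts, and the published estimate may come from a model with covariates, so exact agreement is not guaranteed.

```python
def odds_ratio(carrier_freq_cases: float, carrier_freq_controls: float) -> float:
    """Crude odds ratio from carrier frequencies (proportions strictly between 0 and 1)."""
    # Odds = p / (1 - p) within each group.
    odds_cases = carrier_freq_cases / (1.0 - carrier_freq_cases)
    odds_controls = carrier_freq_controls / (1.0 - carrier_freq_controls)
    return odds_cases / odds_controls


# Rounded carrier frequencies taken from the abstract.
crude_or = odds_ratio(0.078, 0.023)
print(round(crude_or, 1))  # prints 3.6
```

With these rounded inputs the crude value is about 3.59, which is consistent with the reported OR of 3.6.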


Citations
Journal Article
TL;DR: The results support the hypothesis that rare coding variants can pinpoint causal genes within known genetic loci and illustrate that applying the approach systematically to detect new loci requires extremely large sample sizes.
Abstract: Advanced age-related macular degeneration (AMD) is the leading cause of blindness in the elderly, with limited therapeutic options. Here we report on a study of >12 million variants, including 163,714 directly genotyped, mostly rare, protein-altering variants. Analyzing 16,144 patients and 17,832 controls, we identify 52 independently associated common and rare variants (P < 5 × 10⁻⁸) distributed across 34 loci. Although wet and dry AMD subtypes exhibit predominantly shared genetics, we identify the first genetic association signal specific to wet AMD, near MMP9 (difference P value = 4.1 × 10⁻¹⁰). Very rare coding variants (frequency <0.1%) in CFH, CFI and TIMP3 suggest causal roles for these genes, as does a splice variant in SLC16A8. Our results support the hypothesis that rare coding variants can pinpoint causal genes within known genetic loci and illustrate that applying the approach systematically to detect new loci requires extremely large sample sizes.

1,088 citations

Journal Article
TL;DR: Recent advances in the understanding of the role of complement in physiology and pathology are discussed, showing that complement contributes to a large variety of conditions, far exceeding the classical examples of diseases associated with complement deficiencies.
Abstract: The complement system was long considered a simple lytic cascade, aimed at killing bacteria infecting the host organism. This view has changed, and it is now well accepted that complement is a complex innate immune surveillance system, playing a key role in host homeostasis, inflammation, and the defense against pathogens. This review discusses recent advances in the understanding of the role of complement in physiology and pathology. It starts with a description of the contribution of complement to the normal physiology (homeostasis) of a healthy organism, including the silent clearance of apoptotic cells and maintenance of cell survival. In pathology, complement can be a friend or a foe. It acts as a friend in the defense against pathogens, by inducing opsonization and direct killing by the C5b–9 membrane attack complex and by triggering inflammatory responses with the anaphylatoxins C3a and C5a. Opsonization also plays a major role in the mounting of an adaptive immune response, involving antigen-presenting cells and T and B lymphocytes. Nevertheless, it can also be an enemy, when pathogens hijack complement regulators to protect themselves from the immune system. Inadequate complement activation becomes a cause of disease, as in atypical hemolytic uremic syndrome, C3 glomerulopathies, and systemic lupus erythematosus. Age-related macular degeneration and cancer will be described as examples showing that complement contributes to a large variety of conditions, far exceeding the classical examples of diseases associated with complement deficiencies. Finally, we discuss complement as a therapeutic target.

727 citations


Cites background from "Rare variants in CFI, C3 and C9 are..."

  • ...Increased risk for AMD is conferred also by polymorphisms in several other genes of the alternative complement pathway, including FI, C3, C2/FB, and C9 (251)....


Journal Article
TL;DR: Elevations in levels of local and systemic biomarkers indicate that chronic inflammation is involved in the pathogenesis of both disease forms of age-related macular degeneration.
Abstract: Inflammation is a cellular response to factors that challenge the homeostasis of cells and tissues. Cell-associated and soluble pattern-recognition receptors, e.g. Toll-like receptors, inflammasome receptors, and complement components, initiate complex cellular cascades by recognizing or sensing different pathogen- and damage-associated molecular patterns, respectively. Cytokines and chemokines represent alarm messages for leukocytes, and once activated, these cells travel long distances to targeted inflamed tissues. Although it is a crucial survival mechanism, prolonged inflammation is detrimental and participates in numerous chronic age-related diseases. This article will review the onset of inflammation and link its functions to the pathogenesis of age-related macular degeneration (AMD), which is the leading cause of severe vision loss in aged individuals in developed countries. In this progressive disease, degeneration of the retinal pigment epithelium (RPE) results in the death of photoreceptors, leading to a loss of central vision. The RPE is prone to oxidative stress, a factor that together with deteriorating functionality, e.g. decreased intracellular recycling and degradation due to attenuated heterophagy/autophagy, induces inflammation. In the early phases, accumulation of intracellular lipofuscin in the RPE and extracellular drusen between RPE cells and Bruch’s membrane can be clinically detected. Subsequently, in dry (atrophic) AMD there is geographic atrophy with discrete areas of RPE loss, whereas in the wet (exudative) form there is neovascularization penetrating from the choroid to retinal layers. Elevations in levels of local and systemic biomarkers indicate that chronic inflammation is involved in the pathogenesis of both disease forms.

457 citations

Journal Article
TL;DR: A critical review of the ongoing genetic studies and of common and rare risk variants at a total of 20 susceptibility loci, which together explain 40-60% of the disease heritability but provide limited power for diagnostic testing of disease risk.
Abstract: Genetic and genomic studies have enhanced our understanding of complex neurodegenerative diseases that exert a devastating impact on individuals and society. One such disease, age-related macular degeneration (AMD), is a major cause of progressive and debilitating visual impairment. Since the pioneering discovery in 2005 of complement factor H (CFH) as a major AMD susceptibility gene, extensive investigations have confirmed 19 additional genetic risk loci, and more are anticipated. In addition to common variants identified by now-conventional genome-wide association studies, targeted genomic sequencing and exome-chip analyses are uncovering rare variant alleles of high impact. Here, we provide a critical review of the ongoing genetic studies and of common and rare risk variants at a total of 20 susceptibility loci, which together explain 40-60% of the disease heritability but provide limited power for diagnostic testing of disease risk. Identification of these susceptibility loci has begun to untangle the complex biological pathways underlying AMD pathophysiology, pointing to new testable paradigms for treatment.

390 citations


Cites background from "Rare variants in CFI, C3 and C9 are..."

  • ...1% and 1% in controls and are associated with odds ratios of 2–4 (40, 75, 90, 113, 121)....


  • ...Locus names indicate genes that overlap or are close to the observed association signal but do not necessarily represent the underlying disease-causing gene (33, 40, 75, 90, 113, 121)....


Journal Article
TL;DR: Reduction in drusen burden, slowing the enlargement rate of GA lesion area, and slowing or eliminating the progression of intermediate to advanced AMD seem to be clinically suitable primary efficacy endpoints.

302 citations

References
Journal Article
TL;DR: Burrows-Wheeler Alignment tool (BWA) is implemented, a new read alignment package that is based on backward search with Burrows–Wheeler Transform (BWT), to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps.
Abstract: Motivation: The enormous amount of short reads generated by the new DNA sequencing technologies calls for the development of fast and accurate read alignment programs. A first generation of hash table-based methods has been developed, including MAQ, which is accurate, feature rich and fast enough to align short reads from a single individual. However, MAQ does not support gapped alignment for single-end reads, which makes it unsuitable for alignment of longer reads where indels may occur frequently. The speed of MAQ is also a concern when the alignment is scaled up to the resequencing of hundreds of individuals. Results: We implemented Burrows-Wheeler Alignment tool (BWA), a new read alignment package that is based on backward search with Burrows–Wheeler Transform (BWT), to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps. BWA supports both base space reads, e.g. from Illumina sequencing machines, and color space reads from AB SOLiD machines. Evaluations on both simulated and real data suggest that BWA is ~10–20× faster than MAQ, while achieving similar accuracy. In addition, BWA outputs alignment in the new standard SAM (Sequence Alignment/Map) format. Variant calling and other downstream analyses after the alignment can be achieved with the open source SAMtools software package. Availability: http://maq.sourceforge.net

43,862 citations

Journal Article
TL;DR: This work introduces PLINK, an open-source C/C++ WGAS tool set, and describes its five main domains of function: data management, summary statistics, population stratification, association analysis, and identity-by-descent estimation, with a focus on the estimation and use of identity-by-state and identity-by-descent information in the context of population-based whole-genome studies.
Abstract: Whole-genome association studies (WGAS) bring new computational, as well as analytic, challenges to researchers. Many existing genetic-analysis tools are not designed to handle such large data sets in a convenient manner and do not necessarily exploit the new opportunities that whole-genome data bring. To address these issues, we developed PLINK, an open-source C/C++ WGAS tool set. With PLINK, large data sets comprising hundreds of thousands of markers genotyped for thousands of individuals can be rapidly manipulated and analyzed in their entirety. As well as providing tools to make the basic analytic steps computationally efficient, PLINK also supports some novel approaches to whole-genome data that take advantage of whole-genome coverage. We introduce PLINK and describe the five main domains of function: data management, summary statistics, population stratification, association analysis, and identity-by-descent estimation. In particular, we focus on the estimation and use of identity-by-state and identity-by-descent information in the context of population-based whole-genome studies. This information can be used to detect and correct for population stratification and to identify extended chromosomal segments that are shared identical by descent between very distantly related individuals. Analysis of the patterns of segmental sharing has the potential to map disease loci that contain multiple rare variants in a population-based linkage analysis.

26,280 citations

Journal Article
TL;DR: The GATK programming framework enables developers and analysts to quickly and easily write efficient and robust NGS tools, many of which have already been incorporated into large-scale sequencing projects like the 1000 Genomes Project and The Cancer Genome Atlas.
Abstract: Next-generation DNA sequencing (NGS) projects, such as the 1000 Genomes Project, are already revolutionizing our understanding of genetic variation among individuals. However, the massive data sets generated by NGS (the 1000 Genomes pilot alone includes nearly five terabases) make writing feature-rich, efficient, and robust analysis tools difficult for even computationally sophisticated individuals. Indeed, many professionals are limited in the scope and the ease with which they can answer scientific questions by the complexity of accessing and manipulating the data produced by these machines. Here, we discuss our Genome Analysis Toolkit (GATK), a structured programming framework designed to ease the development of efficient and robust analysis tools for next-generation DNA sequencers using the functional programming philosophy of MapReduce. The GATK provides a small but rich set of data access patterns that encompass the majority of analysis tool needs. Separating specific analysis calculations from common data management infrastructure enables us to optimize the GATK framework for correctness, stability, and CPU and memory efficiency and to enable distributed and shared memory parallelization. We highlight the capabilities of the GATK by describing the implementation and application of robust, scale-tolerant tools like coverage calculators and single nucleotide polymorphism (SNP) calling. We conclude that the GATK programming framework enables developers and analysts to quickly and easily write efficient and robust NGS tools, many of which have already been incorporated into large-scale sequencing projects like the 1000 Genomes Project and The Cancer Genome Atlas.

20,557 citations

Journal Article
TL;DR: A new method and the corresponding software tool, PolyPhen-2, are presented; the tool differs from the earlier PolyPhen in its set of predictive features, alignment pipeline, and method of classification, and its performance, as measured by receiver operating characteristic curves, was consistently superior.
Abstract: To the Editor: Applications of rapidly advancing sequencing technologies exacerbate the need to interpret individual sequence variants. Sequencing of phenotyped clinical subjects will soon become a method of choice in studies of the genetic causes of Mendelian and complex diseases. New exon capture techniques will direct sequencing efforts towards the most informative and easily interpretable protein-coding fraction of the genome. Thus, the demand for computational predictions of the impact of protein sequence variants will continue to grow. Here we present a new method and the corresponding software tool, PolyPhen-2 (http://genetics.bwh.harvard.edu/pph2/), which is different from the early tool PolyPhen [1] in the set of predictive features, alignment pipeline, and the method of classification (Fig. 1a). PolyPhen-2 uses eight sequence-based and three structure-based predictive features (Supplementary Table 1) which were selected automatically by an iterative greedy algorithm (Supplementary Methods). The majority of these features involve comparison of a property of the wild-type (ancestral, normal) allele and the corresponding property of the mutant (derived, disease-causing) allele, which together define an amino acid replacement. The most informative features characterize how well the two human alleles fit into the pattern of amino acid replacements within the multiple sequence alignment of homologous proteins, how distant the protein harboring the first deviation from the human wild-type allele is from the human protein, and whether the mutant allele originated at a hypermutable site [2]. The alignment pipeline selects the set of homologous sequences for the analysis using a clustering algorithm and then constructs and refines their multiple alignment (Supplementary Fig. 1). The functional significance of an allele replacement is predicted from its individual features (Supplementary Figs. 2–4) by a Naive Bayes classifier (Supplementary Methods).
Figure 1: PolyPhen-2 pipeline and prediction accuracy. (a) Overview of the algorithm. (b) Receiver operating characteristic (ROC) curves for predictions made by PolyPhen-2 using five-fold cross-validation on HumDiv (red) and HumVar [3] (light green). UniRef100 (solid ...
We used two pairs of datasets to train and test PolyPhen-2. We compiled the first pair, HumDiv, from all 3,155 damaging alleles with known effects on the molecular function causing human Mendelian diseases, present in the UniProt database, together with 6,321 differences between human proteins and their closely related mammalian homologs, assumed to be non-damaging (Supplementary Methods). The second pair, HumVar [3], consists of all the 13,032 human disease-causing mutations from UniProt, together with 8,946 human nsSNPs without annotated involvement in disease, which were treated as non-damaging. We found that PolyPhen-2 performance, as presented by its receiver operating characteristic curves, was consistently superior compared to PolyPhen (Fig. 1b) and it also compared favorably with the three other popular prediction tools [4–6] (Fig. 1c). For a false positive rate of 20%, PolyPhen-2 achieves the rate of true positive predictions of 92% and 73% on HumDiv and HumVar, respectively (Supplementary Table 2). One reason for a lower accuracy of predictions on HumVar is that nsSNPs assumed to be non-damaging in HumVar contain a sizable fraction of mildly deleterious alleles. In contrast, most of the amino acid replacements assumed non-damaging in HumDiv must be close to selective neutrality. Because alleles that are even mildly but unconditionally deleterious cannot be fixed in the evolving lineage, no method based on comparative sequence analysis is ideal for discriminating between drastically and mildly deleterious mutations, which are assigned to the opposite categories in HumVar. Another reason is that HumDiv uses an extra criterion to avoid possible erroneous annotations of damaging mutations.
For a mutation, PolyPhen-2 calculates Naive Bayes posterior probability that this mutation is damaging and reports estimates of false positive (the chance that the mutation is classified as damaging when it is in fact non-damaging) and true positive (the chance that the mutation is classified as damaging when it is indeed damaging) rates. A mutation is also appraised qualitatively, as benign, possibly damaging, or probably damaging (Supplementary Methods). The user can choose between HumDiv- and HumVar-trained PolyPhen-2. Diagnostics of Mendelian diseases requires distinguishing mutations with drastic effects from all the remaining human variation, including abundant mildly deleterious alleles. Thus, HumVar-trained PolyPhen-2 should be used for this task. In contrast, HumDiv-trained PolyPhen-2 should be used for evaluating rare alleles at loci potentially involved in complex phenotypes, dense mapping of regions identified by genome-wide association studies, and analysis of natural selection from sequence data, where even mildly deleterious alleles must be treated as damaging.
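The Naive Bayes posterior mentioned above can be illustrated with a generic sketch. This is not PolyPhen-2's implementation: the function name `naive_bayes_posterior`, the prior, and the per-feature likelihood values are all hypothetical, chosen only to show how independent feature likelihoods combine into a posterior probability of "damaging".

```python
import math


def naive_bayes_posterior(prior_damaging, likelihoods):
    """Posterior P(damaging | features) under the naive independence assumption.

    likelihoods: list of (P(feature | damaging), P(feature | benign)) pairs,
    one pair per observed feature. All values here are hypothetical.
    """
    # Accumulate in log space to avoid underflow with many features.
    log_damaging = math.log(prior_damaging)
    log_benign = math.log(1.0 - prior_damaging)
    for p_given_damaging, p_given_benign in likelihoods:
        log_damaging += math.log(p_given_damaging)
        log_benign += math.log(p_given_benign)
    # Normalize the two class scores into a probability.
    shift = max(log_damaging, log_benign)
    score_damaging = math.exp(log_damaging - shift)
    score_benign = math.exp(log_benign - shift)
    return score_damaging / (score_damaging + score_benign)


# Two made-up features, each more likely under the "damaging" class.
posterior = naive_bayes_posterior(0.5, [(0.8, 0.2), (0.6, 0.3)])
print(round(posterior, 3))  # prints 0.889
```

A qualitative label such as benign, possibly damaging, or probably damaging would then follow from thresholds on this posterior; the thresholds themselves are not given in the text above and are not assumed here.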

11,571 citations

Journal Article
TL;DR: A unified analytic framework to discover and genotype variation among multiple samples simultaneously that achieves sensitive and specific results across five sequencing technologies and three distinct, canonical experimental designs is presented.
Abstract: Recent advances in sequencing technology make it possible to comprehensively catalogue genetic variation in population samples, creating a foundation for understanding human disease, ancestry and evolution. The amounts of raw data produced are prodigious and many computational steps are required to translate this output into high-quality variant calls. We present a unified analytic framework to discover and genotype variation among multiple samples simultaneously that achieves sensitive and specific results across five sequencing technologies and three distinct, canonical experimental designs. Our process includes (1) initial read mapping; (2) local realignment around indels; (3) base quality score recalibration; (4) SNP discovery and genotyping to find all potential variants; and (5) machine learning to separate true segregating variation from machine artifacts common to next-generation sequencing technologies. We discuss the application of these tools, instantiated in the Genome Analysis Toolkit (GATK), to deep whole-genome, whole-exome capture, and multi-sample low-pass (~4×) 1000 Genomes Project datasets.

10,056 citations
