EyeG2P: an automated variant filtering approach improves efficiency of diagnostic genomic testing for inherited ophthalmic disorders.

TL;DR: EyeG2P is a publicly available resource that assists diagnostic filtering of genomic datasets for ophthalmic conditions using the Ensembl Variant Effect Predictor; it enabled a significant increase in precision in comparison to routine testing strategies.
Abstract: Purpose: The widespread adoption of genomic testing for individuals with ophthalmic disorders has increased demand on diagnostic genomic services for these conditions. Moreover, the clinical utility of a molecular diagnosis for individuals with inherited ophthalmic disorders is increasingly placing pressure on the speed and accuracy of genomic testing. Methods: We created EyeG2P, a publicly available resource to assist diagnostic filtering of genomic datasets for ophthalmic conditions, utilising the Ensembl Variant Effect Predictor. We assessed the sensitivity of EyeG2P for 1234 individuals with a broad range of conditions, who had previously received a confirmed molecular diagnosis through routine genomic diagnostic approaches. For a prospective cohort of 83 individuals, we also assessed the precision of EyeG2P in comparison to routine genomic diagnostic approaches. Results: We observed that EyeG2P had a 99.5% sensitivity for genomic variants previously identified as a molecular diagnosis for 1234 individuals. EyeG2P enabled a significant increase in precision in comparison to routine testing strategies (p<0.001), with an increased precision in variant analysis of 35% per individual, on average. Conclusion: Automated filtering of genomic variants through EyeG2P can increase the efficiency of diagnostic testing for individuals with a broad range of inherited ophthalmic disorders.

Summary

Introduction

  • Inherited ophthalmic disorders are a major cause of blindness in children and working-age adults.1,2
  • Obtaining a genetic diagnosis in affected individuals can inform management, and clinical genomic testing is increasingly being used as a frontline diagnostic tool for these disorders.3-8
  • Notably, the more widespread availability of gene-directed interventions, including gene therapy and preimplantation genetic testing, has increased both the value and risk of genomic testing.9-14
  • This places substantial demands on the delivery of testing in a timely and accurate manner.
  • Here the authors describe and evaluate the diagnostic utility of EyeG2P, a publicly available resource for the analysis of genomic variants identified in genes known to cause inherited ophthalmic conditions.

Methods

  • The G2P web portal (https://www.ebi.ac.uk/gene2phenotype/)15 was used to develop and curate the ophthalmic disorders panel.
  • New entries were initiated by selection of a relevant gene symbol from the list of preloaded genes (with their associated Ensembl identifiers).
  • These connections were made after inspecting MEDLINE (through the PubMed interface); search terms included the gene name (HGNC) and the disease name (as a minimum).
  • A disease mechanism was defined as both an allelic requirement (mode of inheritance, for example biallelic or monoallelic) and a mutation consequence (mode of pathogenicity, for example loss-of-function).15
  • Each locus-genotype-mechanism-disease-evidence link was further characterized by assigning to it a set of phenotype terms (i.e. clinical signs and symptoms) from the Human Phenotype Ontology (HPO).16

Sequencing and variant identification

  • All genomic sequencing datasets were generated in a tertiary healthcare setting (North West Genomic Laboratory Hub, Manchester, UK; ISO 15189:2012; UKAS Medical reference 9865).
  • All data collected is part of routine clinical care.
  • Analyses to improve genomic diagnostic services for individuals with inherited ophthalmic conditions, as reported in this study, have been approved by the North West Research Ethics Committee (11/NW/0421 and 15/YH/0365).
  • Raw sequencing reads were aligned to the GRCh37 reference genome using BWA-mem,18 with single nucleotide variants (SNVs) and indels identified using GATK.19
  • Larger and more complex indels were identified using Pindel, and copy number variants (CNVs) were identified using DeCON.20
  • Variants were filtered using quality and read depth thresholds as well as in-house allele frequencies.
  • The zygosity of CNVs was estimated based on their relative read depths.

Routine diagnostics

  • Routine genomic analysis was performed utilizing the Congenica platform.
  • This process involves filtering variants based on gene/location depending on the gene panel applied, population frequency and predicted molecular consequence.
  • A complete list of presets for variant filtering is available in Supplementary Tables 1 and 2.
  • After pre-filtering, variants were analysed by clinically accredited scientists and classified in accordance with the 2015 American College of Medical Genetics and Genomics (ACMG) best practice guidelines.

EyeG2P

  • Merged VCF files containing SNVs, indels and CNVs were annotated using the G2P plugin for Ensembl Variant Effect Predictor.15,23
  • This plugin requires an input file which lists genes of interest and their allelic requirements; the authors utilized the EyeG2P dataset and an allele frequency cutoff of 0.001 for variants in monoallelic genes and 0.05 for variants in biallelic genes.
  • Prospectively collected data from 83 individuals were also used for comparison.
  • The authors assessed the sensitivity and the precision of EyeG2P in comparison to results from routine diagnostic analysis.
  • All statistical analyses were performed in R and graphics created in R and BioRender.

Results

  • Curation of the literature identified 667 genes for inclusion in EyeG2P. Between April 2017 and June 2020, the authors interrogated the biomedical literature for genes associated with highly penetrant genetic ophthalmic disorders.
  • Within the 559 confirmed gene-disease pairs, the associated inheritance patterns were autosomal dominant in 155, autosomal recessive in 341, X-linked in 31, and other patterns (including both autosomal dominant and recessive) in 32 instances.
  • The authors assessed the capability of EyeG2P to identify molecular diagnoses in 1234 individuals who had previously undergone diagnostic genetic testing at the North West Genomic Laboratory Hub (Manchester, UK).
  • The 1267 variants prioritized by EyeG2P were identified in 166 distinct genes and had diverse predicted molecular consequences.
  • For 31/33 individuals (94%), the confirmed molecular diagnosis was highlighted by EyeG2P; an average reduction of 7.4 variants for analysis was possible in each individual.
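
As a worked illustration of the proportions reported in the results above, the sketch below recomputes the 31/33 figure and shows one way a per-individual precision could be expressed. The function names are hypothetical, and the precision definition used here (diagnostic variants divided by variants flagged for analysis) is an assumption for illustration, not a definition taken from the paper. The counts passed to `precision` are invented.

```python
def sensitivity(detected: int, total_known: int) -> float:
    """Fraction of previously confirmed diagnoses that the filter also highlights."""
    if total_known <= 0:
        raise ValueError("total_known must be positive")
    return detected / total_known


def precision(diagnostic_variants: int, flagged_variants: int) -> float:
    """Assumed definition: diagnostic variants / variants flagged for analysis."""
    if flagged_variants <= 0:
        raise ValueError("flagged_variants must be positive")
    return diagnostic_variants / flagged_variants


# Figures quoted in the text: diagnosis highlighted in 31 of 33 individuals.
highlighted = sensitivity(31, 33)
print(f"{highlighted:.1%}")  # 93.9%, which the text reports as 94%

# Hypothetical counts, for illustration only.
print(precision(2, 8))  # 0.25
```

Under this assumed definition, removing variants from the review list while retaining the diagnostic ones raises precision, which is consistent with the reduction in variants for analysis described above.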

Discussion

  • Characterising the genomic basis of inherited ophthalmic conditions has been shown to inform the management of individuals with these conditions.
  • The expansion from single gene based testing methodologies to the routine use of large gene panels, exome and genome sequencing approaches requires that robust and accurate informatics filtering strategies are applied to the generated datasets.
  • The authors have released these data as a freely available resource that can be dynamically filtered and revised to best aid users' requirements.
  • The authors' ability to detect disease-causing genomic variants from high-throughput sequencing datasets has expanded in recent years to include complex structural variants,26,27 exonic deletions and duplications,28,29 deeply intronic variants causing aberrant splicing,30-33 variants in regulatory regions34-36 and complex alleles composed of combinations of genomic variants common in the general population.
  • Moreover, EyeG2P can identify diverse genomic variants across the spectrum of genetically and clinically heterogeneous ophthalmic genetic conditions.

EyeG2P: an automated variant filtering approach improves efficiency of diagnostic genomic testing for inherited ophthalmic disorders.

Eva Lenassi (1,2,3), Ana Carvalho (4,5), Anja Thormann (6), Tracy Fletcher (2), Claire Hardcastle (2), Sarah E Hunt (6), Panagiotis I Sergouniotis (1,2,3), Michel Michaelides (6,7), Andrew R Webster (6,7), Cunningham F (6), Simon Ramsden (2), David R FitzPatrick (4), Graeme CM Black (1,2), Jamie M Ellingford (1,2)

1. Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicine
and Health, University of Manchester, Manchester, United Kingdom.
2. Manchester Centre for Genomic Medicine, St Mary's Hospital, Manchester University NHS Foundation
Trust, Manchester, United Kingdom.
3. Manchester Royal Eye Hospital, Manchester University NHS Foundation Trust, Manchester, United
Kingdom.
4. MRC Human Genetics Unit, MRC Institute of Genetics and Cancer, University of Edinburgh, Edinburgh,
UK.
5. Medical Genetic Unit, Pediatric Hospital, Coimbra Hospital and Universitary Centre (CHUC), Coimbra,
Portugal
6. European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome
Campus, Hinxton, CB20 1SD, Cambridge, UK.
7. UCL Institute of Ophthalmology, University College London, London, UK.
8. Department of Ophthalmology, Moorfields Eye Hospital NHS Foundation Trust, London, UK.
Correspondence
graeme.black@manchester.ac.uk
jamie.ellingford@manchester.ac.uk
medRxiv preprint doi: https://doi.org/10.1101/2021.07.23.21261017; this version posted July 25, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
NOTE: This preprint reports new research that has not been certified by peer review and should not be used to guide clinical practice.


Introduction
Inherited ophthalmic disorders are a major cause of blindness in children and working-age adults.1,2 Obtaining a genetic diagnosis in affected individuals can inform management, and clinical genomic testing is increasingly being used as a frontline diagnostic tool for these disorders.3-8 Notably, the more widespread availability of gene-directed interventions, including gene therapy and preimplantation genetic testing, has increased both the value and risk of genomic testing.9-14 This places substantial demands on the delivery of testing in a timely and accurate manner.
Here we describe and evaluate the diagnostic utility of EyeG2P, a publicly available resource for the analysis of genomic variants identified in genes known to cause inherited ophthalmic conditions. We curated disease-causing genes through robust and transparent standards and assessed the sensitivity and precision of EyeG2P in a cohort of individuals receiving diagnostic testing for a range of ophthalmic conditions. EyeG2P uses logical filtering of identified genomic variants in line with their predicted molecular consequence, population frequency and prior knowledge of disease mechanisms and inheritance patterns. Overall, we found that utilizing EyeG2P as a first-tier analysis strategy reduces the number of variants requiring analysis by clinically accredited scientists and increases the precision and efficiency of diagnostic testing for ophthalmic disorders.

Methods
Curation of known disease genes
The G2P web portal (https://www.ebi.ac.uk/gene2phenotype/)15 was used to develop and curate the ophthalmic disorders panel. New entries were initiated by selection of a relevant gene symbol from the list of preloaded genes (with their associated Ensembl identifiers). For each entry, a gene or locus was linked, via a disease mechanism, to a disease. These connections were made after inspecting MEDLINE (through the PubMed interface); search terms included the gene name (HGNC) and the disease name (as a minimum). A disease mechanism was defined as both an allelic requirement (mode of inheritance, for example biallelic or monoallelic) and a mutation consequence (mode of pathogenicity, for example loss-of-function). A confidence attribute (confirmed, probable or possible) was also assigned to indicate how likely it is that the gene is implicated in the cause of disease; the rules used to assign confidence, allelic requirement and mutation consequence to entries are defined and available.15 Each locus-genotype-mechanism-disease-evidence link was further characterized by assigning to it a set of phenotype terms (i.e. clinical signs and symptoms) from the Human Phenotype Ontology (HPO).16 The details of the edits and the identifiers of the relevant publications (that provide evidence for that specific gene-disease thread) were stored and are available through the G2P web portal.
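
The curated record described above can be pictured as a small data structure. The following is a minimal sketch only: the class name, field names and validation are hypothetical and do not reflect the G2P database schema, and the gene symbol, disease name, HPO identifier and PubMed identifier in the example are placeholders.

```python
from dataclasses import dataclass, field
from typing import List

# Confidence values named in the text; everything else here is illustrative.
CONFIDENCE_LEVELS = {"confirmed", "probable", "possible"}


@dataclass
class G2PEntry:
    """Hypothetical container for one locus-genotype-mechanism-disease-evidence link."""
    gene_symbol: str           # HGNC gene symbol
    disease_name: str
    allelic_requirement: str   # mode of inheritance, e.g. "biallelic" or "monoallelic"
    mutation_consequence: str  # mode of pathogenicity, e.g. "loss-of-function"
    confidence: str            # one of CONFIDENCE_LEVELS
    hpo_terms: List[str] = field(default_factory=list)   # phenotype terms
    pubmed_ids: List[str] = field(default_factory=list)  # supporting publications

    def __post_init__(self) -> None:
        if self.confidence not in CONFIDENCE_LEVELS:
            raise ValueError(f"unknown confidence attribute: {self.confidence}")


# Placeholder values, not a real curated entry.
entry = G2PEntry(
    gene_symbol="GENE_A",
    disease_name="example retinal dystrophy",
    allelic_requirement="biallelic",
    mutation_consequence="loss-of-function",
    confidence="confirmed",
    hpo_terms=["HP:0000000"],
    pubmed_ids=["00000000"],
)
print(entry.gene_symbol, entry.allelic_requirement, entry.confidence)
```

Keeping the allelic requirement and the mutation consequence as separate fields mirrors the two-part definition of a disease mechanism given in the text.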
Sequencing and variant identification
All genomic sequencing datasets were generated in a tertiary healthcare setting (North West Genomic Laboratory Hub, Manchester, UK; ISO 15189:2012; UKAS Medical reference 9865). Individuals provided written consent for genomic analysis and all investigations were conducted in accordance with the tenets of the Declaration of Helsinki. All data collected is part of routine clinical care. Analyses to improve genomic diagnostic services for individuals with inherited ophthalmic conditions, as reported in this study, have been approved by the North West Research Ethics Committee (11/NW/0421 and 15/YH/0365). No individual patient data is reported in this manuscript.
Routine diagnostic gene panel testing was performed as previously described.3,5,17 Briefly, DNA samples were processed using Agilent SureSelect (Agilent Technologies, Santa Clara, CA) target enrichment kits designed to capture selected intronic regions and all protein-coding exons +/-50 base pairs of flanking intronic sequences of selected panels of known disease genes. The decision on which panel to use was made by the referring clinician (either a consultant ophthalmologist or a consultant clinical geneticist with an interest in ophthalmic genetics).

Sequencing was performed using Illumina HiSeq and NextSeq platforms. Raw sequencing reads were aligned to the GRCh37 reference genome using BWA-mem,18 with single nucleotide variants (SNVs) and indels identified using GATK.19 Larger and more complex indels were identified using Pindel, and copy number variants (CNVs) were identified using DeCON.20 Variants were filtered using quality and read depth thresholds as well as in-house allele frequencies. The zygosity of CNVs was estimated based on their relative read depths. Regions that are highly polymorphic and/or difficult to survey through short-read high-throughput techniques are masked from initial analysis, specifically RP1L1 exon 4, USH1C exon 18 and RPGRorf15.
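
To make the idea of estimating CNV zygosity from relative read depth concrete, here is a minimal sketch. It is not the method used by DeCON or by the authors: the function name is hypothetical, the region is assumed to be autosomal and diploid in the reference state, and the cut-offs are simply midpoints between the idealised depth ratios for 0, 1, 2, 3 and 4 copies (0.0, 0.5, 1.0, 1.5 and 2.0).

```python
def estimate_cnv_state(depth_ratio: float) -> str:
    """Classify a region from its depth ratio (sample depth / expected diploid depth).

    Thresholds are illustrative midpoints between idealised copy-number ratios,
    not values taken from the paper or from any CNV caller.
    """
    if depth_ratio < 0:
        raise ValueError("depth_ratio cannot be negative")
    if depth_ratio < 0.25:
        return "homozygous deletion"
    if depth_ratio < 0.75:
        return "heterozygous deletion"
    if depth_ratio < 1.25:
        return "no copy number change"
    if depth_ratio < 1.75:
        return "heterozygous duplication"
    return "homozygous duplication or higher-order gain"


for ratio in (0.05, 0.52, 1.0, 1.48, 2.1):
    print(ratio, "->", estimate_cnv_state(ratio))
```

Real read-depth data are noisy, so a production caller would normally model depth statistically rather than apply fixed cut-offs of this kind.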
Variant analysis
Routine diagnostics
Routine genomic analysis was performed utilizing the Congenica platform. This process involves filtering variants based on gene/location depending on the gene panel applied, population frequency and predicted molecular consequence. A complete list of presets for variant filtering is available in Supplementary Tables 1 and 2. After pre-filtering, variants were analysed by clinically accredited scientists and classified in accordance with the 2015 American College of Medical Genetics and Genomics (ACMG) best practice guidelines.21,22
EyeG2P
Merged VCF files containing SNVs, indels and CNVs were annotated using the G2P plugin for Ensembl Variant Effect Predictor.15,23 This plugin requires an input file which lists genes of interest and their allelic requirements; we utilized the EyeG2P dataset and an allele frequency cutoff of 0.001 for variants in monoallelic genes and 0.05 for variants in biallelic genes. An additional list was used as input including all ClinVar pathogenic or likely pathogenic variants, all variants predicted to have a significant impact on splicing by SpliceAI,24 and a selection of hypomorphic alleles that are known to be pathogenic but exceed the variant frequency thresholds specified.
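
The frequency thresholds described above can be sketched as a simple filter. This is an illustration under stated assumptions, not the G2P plugin's implementation: the names `Variant`, `passes_frequency_filter` and `rescue_ids` are hypothetical, the comparison is assumed to be inclusive, the additional list is modelled as a set of variant identifiers exempt from the frequency cut-offs, and the sketch does not model zygosity or the number of hits per gene. Only the two cut-off values (0.001 and 0.05) come from the text.

```python
from typing import Dict, NamedTuple, Set

# Cut-offs quoted in the text, keyed by allelic requirement.
AF_CUTOFF: Dict[str, float] = {"monoallelic": 0.001, "biallelic": 0.05}


class Variant(NamedTuple):
    variant_id: str
    gene: str
    allele_frequency: float


def passes_frequency_filter(
    variant: Variant,
    gene_requirements: Dict[str, str],
    rescue_ids: Set[str],
) -> bool:
    """Return True if the variant is retained for analysis."""
    # Assumption: listed variants bypass the frequency thresholds.
    if variant.variant_id in rescue_ids:
        return True
    requirement = gene_requirements.get(variant.gene)
    if requirement not in AF_CUTOFF:
        # Gene not in the panel, or a requirement this sketch does not model.
        return False
    return variant.allele_frequency <= AF_CUTOFF[requirement]


# Placeholder panel and variants, for illustration only.
panel = {"GENE_A": "monoallelic", "GENE_B": "biallelic"}
variants = [
    Variant("v1", "GENE_A", 0.0005),
    Variant("v2", "GENE_A", 0.01),
    Variant("v3", "GENE_B", 0.01),
    Variant("v4", "GENE_B", 0.2),
]
kept = [v.variant_id for v in variants if passes_frequency_filter(v, panel, {"v4"})]
print(kept)  # ['v1', 'v3', 'v4']
```

The looser cut-off for biallelic genes reflects that recessive disease alleles can be carried at higher population frequencies than dominant ones.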
Comparisons between routine diagnostic analysis and EyeG2P
Results from EyeG2P were retrospectively compared to clinically reported variants identified from routine diagnostic analysis in 1234 individuals with genomic ophthalmic conditions. All these study participants had a confirmed (or a provisional) molecular diagnosis and carried pathogenic (ACMG class 5), likely pathogenic (ACMG class 4) or variants of uncertain significance (ACMG class 3); these changes

Frequently Asked Questions (14)
Q1. What are the contributions in "EyeG2P: an automated variant filtering approach improves efficiency of diagnostic genomic testing for inherited ophthalmic disorders"?

The authors created EyeG2P, a publicly available resource to assist diagnostic filtering of genomic datasets for ophthalmic conditions, utilising the Ensembl Variant Effect Predictor.

Disease-causing variants were identified in 24 distinct genes; 10 cases had an autosomal dominant, 3 an X-linked and 18 an autosomal recessive disorder. 

In 10/52 cases without a confirmed diagnosis, variants of uncertain significance were identified in a disease-causing state through EyeG2P analysis, and no additional pathogenic variants were detected after routine analysis. 

The genetic basis of ophthalmic conditions such as congenital cataract, inherited retinal disorders and optic neuropathies is diverse and includes genes encoded on autosomal, sex and mitochondrial chromosomes. 

In conclusion, the authors demonstrate that EyeG2P can be effectively integrated with clinical diagnostic testing for inherited ophthalmic conditions to increase the efficiency of variant analysis. 

Following curation of over 1000 biomedical publications the authors identified 667 relevant genes and determined the associated modes of inheritance, mechanisms of disease causation and phenotypic features. 

The authors acknowledge funding from the Wellcome Trust Transforming Genomic Medicine Initiative (WT200990/Z/16/Z), the European Molecular Biology Laboratory and the Manchester NIHR Biomedical Research Centre (IS-BRC-1215-20007). 

The authors propose the application of EyeG2P as a first-tier analysis strategy for the diagnosis of inherited ophthalmic conditions from high-throughput genomic datasets.

The authors show that EyeG2P increases the precision and efficiency of genomic testing for inherited ophthalmic conditions over routine approaches for variant analysis, at little cost to overall diagnostic rates.