How many well-distributed loci were excluded as candidate deletion intervals?

The authors required that > 80 % of the control individuals be heterozygous for at least two well-distributed loci within these intervals.

What was the important factor in determining potential hemizygosity?

Highly informative SNPs with a random genomic distribution in the controls (and other public databases) and which were nonpolymorphic in the individual with the suspected deletion were weighted more heavily in inferring potential hemizygosity.

What is the reason why the NGS protocol is more sensitive?

BRCA coding variants were found in individuals who were previously screened for lesions in these genes, suggesting this NGS protocol is a more sensitive approach for detecting coding changes.

Why is the exon strength weaker than the natural one?

Although the cryptic exon is strengthened (final Ri,total = 6.9 bits, ΔRi = 14.7 bits), ASSEDA predicts the level of expression of this exon to be negligible, as it is weaker than the natural exon (Ri,total = 8.4 bits) due to the increased length of the predicted exon (+291 nt) [38].

What are the TFs that have evidence for binding to the promoters of the genes?

The authors identified 141 TFs with evidence for binding to the promoters of the genes the authors sequenced, including c-Myc, C/EBPβ, and Sp1, shown to transcriptionally regulate BRCA1, TP53, and ATM, respectively [98–100].

Which variants were considered likely to alter stable 2° structures in mRNA?

Variants flagged by SNPfold with the highest probability of altering stable 2° structures in mRNA (where p-value < 0.1) were prioritized.

How many in silico programs evaluated the effects of the remaining variants?

The predicted effects on protein conservation and function of the remaining variants were evaluated by in silico tools: PolyPhen-2 [118], Mutation Assessor (release 2) [119, 120], and PROVEAN (v1.1.3) [121, 122].

What was the common consequence of false positive variant calls?

As previously reported [147], the authors noted that false positive variant calls within intronic and intergenic regions were the most common consequence of dephasing in low complexity, pyrimidine-enriched intervals.

What is the average number of variants per patient at each step?

The average number of variants per patient at each step is indicated in a table below each plot, along with the percent reduction in variants from one step to another. Three prioritized variants have multiple predicted roles: ATM c.1538A>G in missense and SRFBS, CHEK2 c.190G>A in missense and UTR binding, and CHEK2 c.433C >

Which variants were less likely to have a deleterious impact on protein activity?

Variants predicted by all four programs to be benign were less likely to have a deleterious impact on protein activity; however this did not exclude them from mRNA splicing analysis (described above in IT-Based Variant Analysis).

(Open Access) A unified analytic framework for prioritization of non-coding variants of uncertain significance in heritable breast and ovarian cancer (2015) | Eliseos J. Mucaki

Q: What have the authors contributed in "A unified analytic framework for prioritization of non-coding variants of uncertain significance in heritable breast and ovarian cancer" ?

The authors present a strategy for analyzing different functional classes of non-coding variants based on information theory (IT) and prioritizing patients with large intragenic deletions. The authors have presented a strategy for complete gene sequence analysis followed by a unified framework for interpreting non-coding variants that may affect gene expression. With the unified IT-framework, 132 variants were identified and 87 functionally significant VUS were further prioritized. This approach distills large numbers of variants detected by NGS to a limited set of variants prioritized as potential deleterious changes.

Q: How many antisense strand oligos were synthesized?

11,828 antisense strand oligos were synthesized (3497 ATM, 1591 BRCA1, 2395 BRCA2, 1860 CDH1, 883 CHEK2, 826 PALB2, and 776 TP53).

Q: What is the role of the IT-based analysis in identifying splicing variants?

IT-based analysis of splicing variants has proven to be robust and accurate (as determined by functional assays for mRNA expression or binding assays) at analyzing splice site (SS) variants, including splicing regulatory factor binding sites (SRFBSs), and in distinguishing them from polymorphisms in both rare and common diseases [36–39].

TECHNICAL ADVANCE Open Access

A unified analytic framework for prioritization of non-coding variants of uncertain significance in heritable breast and ovarian cancer

Eliseos J. Mucaki, Natasha G. Caminsky, Ami M. Perri, Ruipeng Lu, Alain Laederach, Matthew Halvorsen, Joan H. M. Knoll and Peter K. Rogan*

Abstract

Background: Sequencing of both healthy and disease singletons yields many novel and low frequency variants of uncertain significance (VUS). Complete gene and genome sequencing by next generation sequencing (NGS) significantly increases the number of VUS detected. While prior studies have emphasized protein coding variants, non-coding sequence variants have also been proven to significantly contribute to high penetrance disorders, such as hereditary breast and ovarian cancer (HBOC). We present a strategy for analyzing different functional classes of non-coding variants based on information theory (IT) and prioritizing patients with large intragenic deletions.

Methods: We captured and enriched for coding and non-coding variants in genes known to harbor mutations that increase HBOC risk. Custom oligonucleotide baits spanning the complete coding, non-coding, and intergenic regions 10 kb up- and downstream of ATM, BRCA1, BRCA2, CDH1, CHEK2, PALB2, and TP53 were synthesized for solution hybridization enrichment. Unique and divergent repetitive sequences were sequenced in 102 high-risk, anonymized patients without identified mutations in BRCA1/2. Aside from protein coding and copy number changes, IT-based sequence analysis was used to identify and prioritize pathogenic non-coding variants that occurred within sequence elements predicted to be recognized by proteins or protein complexes involved in mRNA splicing, transcription, and untranslated region (UTR) binding and structure. This approach was supplemented by in silico and laboratory analysis of UTR structure.

Results: 15,311 unique variants were identified, of which 245 occurred in coding regions. With the unified IT-framework, 132 variants were identified and 87 functionally significant VUS were further prioritized. An intragenic 32.1 kb interval in BRCA2 that was likely hemizygous was detected in one patient. We also identified 4 stop-gain variants and 3 reading-frame altering exonic insertions/deletions (indels).

Conclusions: We have presented a strategy for complete gene sequence analysis followed by a unified framework for interpreting non-coding variants that may affect gene expression. This approach distills large numbers of variants detected by NGS to a limited set of variants prioritized as potential deleterious changes.

Keywords: Information theory, Hereditary breast and ovarian cancer, Transcription factor binding, RNA-binding protein, Prioritization, Variants of uncertain significance, Splicing, Non-coding, Next-generation sequencing

* Correspondence: progan@uwo.ca

EJM and NGC should be considered to be joint first authors.

Department of Biochemistry, Schulich School of Medicine and Dentistry, Western University, London, ON N6A 2C1, Canada

Department of Computer Science, Faculty of Science, Western University, London N6A 2C1, Canada

Full list of author information is available at the end of the article

International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Mucaki et al. BMC Medical Genomics (2016) 9:19

DOI 10.1186/s12920-016-0178-5

Background

Advances in NGS have enabled panels of genes, whole exomes, and even whole genomes to be sequenced for multiple individuals in parallel. These platforms have become so cost-effective and accurate that they are beginning to be adopted in clinical settings, as evidenced by recent FDA approvals [1, 2]. However, the overwhelming number of gene variants revealed in each individual has challenged interpretation of clinically significant genetic variation [3–5].

After common variants, which are rarely pathogenic, are eliminated, the number of VUS in the residual set remains substantial. Assessment of pathogenicity is not trivial, considering that nearly half of the unique variants are novel, and cannot be resolved using published literature and variant databases [6]. Furthermore, loss-of-function variants (those resulting in protein truncation, which are most likely to be deleterious) represent a very small proportion of identified variants. The remaining variants are missense and synonymous variants in the exon, single nucleotide changes, or in-frame insertions or deletions in intervening and intergenic regions. Functional analysis of large numbers of these variants often cannot be performed, due to lack of relevant tissues, and the cost, time, and labor required for each variant. Another problem is that in silico protein coding prediction tools exhibit inconsistent accuracy and are thus problematic for clinical risk evaluation [7–9]. Consequently, many HBOC patients undergoing genetic susceptibility testing will receive either an inconclusive (no BRCA variant identified) or an uncertain (BRCA VUS) result. The former has been reported in up to 80 % of cases and depends on the number of genes tested [10]. The occurrence of uncertain BRCA mutations varies greatly (as high as 46 % in African American populations and as low as 2.1 %) among tested individuals depending on the laboratory and the patient’s ethnicity [11–13]. The inconsistency in diagnostic yield is significant, considering that HBOC accounts for 5–10 % of all breast/ovarian cancer [14, 15].

One strategy to improve variant interpretation in patients is to reduce the full set of variants to a manageable list of potentially pathogenic variants. Evidence for pathogenicity of VUS in genetic disease is often limited to amino acid coding changes [16, 17], and mutations affecting splicing, transcriptional activation, and mRNA stability tend to be underreported [18–24]. Splicing errors are estimated to represent 15 % of disease-causing mutations [25], but may be much higher [26, 27]. The impact of a single nucleotide change in a recognition sequence can range from insignificant to complete abolition of a protein binding site. Aberrant splicing events causing frameshifts often disrupt protein function; in-frame changes are dependent on gene context. The complexity of interpretation of non-coding sequence variants benefits from computational approaches [28] and direct functional analyses [29–33] that may each support evidence of pathogenicity.

Ex vivo transfection assays developed to determine the pathogenicity of VUS predicted to lead to splicing aberrations (using in silico tools) have been successful in identifying pathogenic sequence variants [34, 35]. IT-based analysis of splicing variants has proven to be robust and accurate (as determined by functional assays for mRNA expression or binding assays) at analyzing splice site (SS) variants, including splicing regulatory factor binding sites (SRFBSs), and in distinguishing them from polymorphisms in both rare and common diseases [36–39]. However, IT can be applied to any sequence recognized and bound by another factor [40], such as with transcription factor binding sites (TFBSs) and RNA-binding protein binding sites (RBBSs). IT is used as a measure of sequence conservation and is more accurate than consensus sequences [41]. The individual information ($R_i$) of a base is related to thermodynamic entropy, and therefore free energy of binding, and is measured on a logarithmic scale (in bits). By comparing the change in information ($\Delta R_i$) for a nucleotide variation of a bound sequence, the resulting change in binding affinity is $\geq 2^{\Delta R_i}$, such that a 1 bit change in information will result in at least a 2-fold change in binding affinity [42].
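The relationship between information change and binding affinity can be illustrated with a small sketch. It assumes a simplified weight of the form 2 + log2 f(b, l) for each base b at position l, with no small-sample correction, and the three-position frequency table and all function names are hypothetical; this is not the authors' software.

```python
import math

def weight_matrix(freqs):
    """Convert per-position base frequencies into weights in bits.

    Assumption: simplified form w(b, l) = 2 + log2 f(b, l) for a
    four-letter alphabet, ignoring small-sample corrections.
    """
    return [{b: 2.0 + math.log2(f) for b, f in pos.items()} for pos in freqs]

def individual_information(seq, weights):
    """Sum the weights of the observed bases (individual information, in bits)."""
    return sum(weights[i][base] for i, base in enumerate(seq))

def min_fold_change(delta_ri):
    """Minimum fold change in binding affinity implied by |delta Ri| bits."""
    return 2.0 ** abs(delta_ri)

# Hypothetical 3-position binding-site model (each position sums to 1).
toy_freqs = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
]
W = weight_matrix(toy_freqs)
ri_ref = individual_information("AGT", W)   # reference site
ri_var = individual_information("ACT", W)   # site carrying a G>C variant
delta = ri_var - ri_ref

print(f"Ri(ref) = {ri_ref:.2f} bits, Ri(var) = {ri_var:.2f} bits")
print(f"delta Ri = {delta:.2f} bits, at least {min_fold_change(delta):.1f}-fold change")
```

In this toy model the variant replaces a frequent base with a rare one, so the site loses information and the bound on the affinity change grows exponentially with the number of bits lost.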

IT measures nucleotide sequence conservation and does not provide information on effects of variants on mRNA secondary (2°) structure, nor can it accurately predict effects of amino acid sequence changes. Associations of structural changes in untranslated regions (UTR) of mRNA with disease justify including predicted effects of these changes on 2° structure in the comprehensive analysis of sequence variants [43]. Other in silico methods have attempted to address these deficiencies. For example, Halvorsen et al. (2010) introduced an algorithm called SNPfold, which computes the potential effect of a single nucleotide variant (SNV) on mRNA 2° structure [20]. Predictions made by SNPfold can be tested by the SHAPE assay (Selective 2’-Hydroxyl Acylation analyzed by Primer Extension) [44], which provides evidence for sequence variants that lead to structural changes in mRNA by detection of covalent adducts in mRNA.

The implications of improved VUS interpretation are particularly relevant for HBOC due to its incidence and the adoption of panel testing for these individuals [45, 46]. It has been suggested that uninformative results in patients with a high risk profile imply that deleterious variants lie in untested regions of BRCA1/2, untested genes, or are unrecognized [47, 48]. This is also supported by studies where families with linkage to BRCA1/2 had no detectable pathogenic mutation (however it is noteworthy that detection rates of BRCA mutations in families with documented linkage to these loci appear to vary by ascertainment, inclusion criteria, and technology used to identify the mutations) [49, 50]. The concept of non-BRCA gene association has been demonstrated by the identification of low-to-moderate risk HBOC genes, and variants within coding and non-coding regions affecting splicing and regulatory factor binding [51, 52]. Consequently, VUS, which include rare missense changes, other coding and non-coding changes in all of these genes, greatly outnumber the catalog of known deleterious mutations [53].

Here, we develop and evaluate IT-based models to predict potential non-coding sequence mutations in SSs, TFBSs, and RBBSs in 7 genes sequenced in their entirety. These models were used to analyze 102 anonymous HBOC patients who did not exhibit known BRCA1/2 coding mutations at the time of initial testing, despite meeting the criteria for BRCA genetic testing. The genes (ATM, BRCA1, BRCA2, CDH1, CHEK2, PALB2, and TP53) have been reported to harbor mutations that increase HBOC risk [54–76]. We apply these IT-based methods to analyze variants in the complete sequences of coding, non-coding, and up- and downstream regions of the 7 genes. In this study, we established and applied a unified IT-based framework, first filtering out common variants, then “flagging” potentially deleterious ones. Then, using context-specific criteria and information from the published literature, we prioritized likely candidates.

Methods

Design of tiled capture array for HBOC gene panel

Nucleic acid hybridization capture reagents designed from genomic sequences generally exclude repetitive sequence content to avoid cross-hybridization [77]. Complete gene sequences harbor numerous repetitive sequences, and an excess of denatured C₀t-1 DNA is usually added to hybridization to prevent inclusion of these sequences [78]. RepeatMasker software completely masks all repetitive and low-complexity sequences [79]. We increased sequence coverage in complete genes with capture probes by enriching for both single-copy and divergent repeat (>30 % divergence) regions, such that, under the correct hybridization and wash conditions, all probes hybridize only to their correct genomic locations [77]. This step was incorporated into a modified version of Gnirke and colleagues’ (2009) in-solution hybridization enrichment protocol, in which the majority of library preparation, pull-down, and wash steps were automated using a BioMek® FXP Automation Workstation (Beckman Coulter, Mississauga, Canada) [80].

Genes ATM (RefSeq: NM_000051.3, NP_000042.3), BRCA1 (RefSeq: NM_007294.3, NP_009225.1), BRCA2 (RefSeq: NM_000059.3, NP_000050.2), CDH1 (RefSeq: NM_004360.3, NP_004351.1), CHEK2 (RefSeq: NM_145862.2, NP_665861.1), PALB2 (RefSeq: NM_024675.3, NP_078951.2), and TP53 (RefSeq: NM_000546.5, NP_000537.3) were selected for capture probe design by targeting single copy or highly divergent repeat regions (spanning 10 kb up- and downstream of each gene relative to the most upstream first exon and most downstream final exon in RefSeq) using an ab initio approach [77]. If a region was excluded by ab initio but lacked a conserved repeat element (i.e. divergence > 30 %) [79], the region was added back into the probe-design sequence file. Probe sequences were selected using PICKY 2.2 software [81]. These probes were used in solution hybridization to capture our target sequences, followed by NGS on an Illumina Genome Analyzer IIx (Additional file 1: Methods).

Genomic sequences from both strands were captured using overlapping oligonucleotide sequence designs covering 342,075 nt among the 7 genes (Fig. 1). In total, 11,841 oligonucleotides were synthesized from the transcribed strand consisting of the complete, single copy coding, and flanking regions of ATM (3513), BRCA1 (1587), BRCA2 (2386), CDH1 (1867), CHEK2 (889), PALB2 (811), and TP53 (788). Additionally, 11,828 antisense strand oligos were synthesized (3497 ATM, 1591 BRCA1, 2395 BRCA2, 1860 CDH1, 883 CHEK2, 826 PALB2, and 776 TP53). Any intronic or intergenic regions without probe coverage are most likely due to the presence of conserved repetitive elements or other paralogous sequences.
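As a quick consistency check on the oligonucleotide counts reported above, the per-gene values can be summed and compared with the stated totals. The dictionary names are illustrative only.

```python
# Per-gene oligonucleotide counts as reported in the text.
sense = {"ATM": 3513, "BRCA1": 1587, "BRCA2": 2386, "CDH1": 1867,
         "CHEK2": 889, "PALB2": 811, "TP53": 788}
antisense = {"ATM": 3497, "BRCA1": 1591, "BRCA2": 2395, "CDH1": 1860,
             "CHEK2": 883, "PALB2": 826, "TP53": 776}

sense_total = sum(sense.values())          # transcribed-strand oligos
antisense_total = sum(antisense.values())  # antisense-strand oligos

print(sense_total, antisense_total, sense_total + antisense_total)
```

Both sums agree with the totals given in the paragraph (11,841 and 11,828).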

For regions lacking probe coverage (of ≥ 10 nt, N = 141; 8 in ATM, 26 in BRCA1, 10 in BRCA2, 29 in CDH1, 36 in CHEK2, 15 in PALB2, and 17 in TP53), probes were selected based on predicted $T_m$s similar to other probes, limited alignment to other sequences in the transcriptome (<10 times), and avoidance of stable, base-paired 2° structures (with unaFOLD) [82, 83]. The average coverage of these sequenced regions was 14.1–24.9 % lower than other probe sets, indicating that capture was less efficient, though still successful.

HBOC samples for oligo capture and high-throughput sequencing

Genomic DNA from 102 patients previously tested for inherited breast/ovarian cancer without evidence of a predisposing genetic mutation was obtained from the Molecular Genetics Laboratory (MGL) at the London Health Sciences Centre in London, Ontario, Canada. Patients qualified for genetic susceptibility testing as determined by the Ontario Ministry of Health and Long-Term Care BRCA1 and BRCA2 genetic testing criteria [84] (see Additional file 2). The University of Western Ontario research ethics board (REB) approved this anonymized study of these individuals to evaluate the analytical methods presented here. BRCA1 and BRCA2 were previously analyzed by Protein Truncation Test (PTT) and Multiplex Ligation-dependent Probe Amplification (MLPA). The exons of several patients (N = 14) had also been Sanger sequenced. No pathogenic sequence change was found in any of these individuals. In addition, one patient with a known pathogenic BRCA variant was re-sequenced by NGS as a positive control.

Sequence alignment and variant calling

Fig. 1 Capture Probe Coverage over Sequenced Genes. The genomic structure of the 7 genes chosen are displayed with the UCSC Genome Browser. Top row for each gene is a custom track with the “dense” visualization modality selected with black regions indicating the intervals covered by the oligonucleotide capture reagent. Regions without probe coverage contain conserved repetitive sequences or correspond to paralogous sequences that are unsuitable for probe design

Variant analysis involved the steps of detection, filtering, IT-based and coding sequence analysis, and prioritization (Fig. 2). Sequencing data were demultiplexed and aligned to the specific chromosomes of our sequenced genes (hg19) using both CASAVA (Consensus Assessment of Sequencing and Variation; v1.8.2) [85] and CRAC (Complex Reads Analysis and Classification; v1.3.0) [86] software. Alignments were prepared for variant calling using Picard [87] and variant calling was performed on both versions of the aligned sequences using the UnifiedGenotyper tool in the Genome Analysis Toolkit (GATK) [88]. We used the recommended minimum phred base quality score of 30, and results were exported in variant call format (VCF; v4.1). A software program was developed to exclude variants called outside of targeted capture regions and those with quality scores < 50. Variants flagged by bioinformatic analysis (described below) were also assessed by manually inspecting the reads in the region using the Integrative Genomics Viewer (IGV; version 2.3) [89, 90] to note and eliminate obvious false positives (i.e. variants called due to polyhomonucleotide run dephasing, or PCR duplicates that were not eliminated by Picard). Finally, common variants (≥1 % allele frequency based on dbSNP 142 or > 5 individuals in our study cohort) were not prioritized.
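The filtering rules in this section (quality score, target regions, and the two common-variant thresholds) can be sketched as follows. This is a minimal illustration and not the authors' program: the `Variant` record, the function names, and the coordinates are all hypothetical, and in practice the fields would be parsed from the VCF and a dbSNP annotation.

```python
from dataclasses import dataclass

@dataclass
class Variant:
    chrom: str
    pos: int             # 1-based coordinate
    qual: float          # variant quality score (VCF QUAL column)
    allele_freq: float   # population allele frequency (0.0 if unreported)
    cohort_count: int    # individuals in the study cohort carrying the variant

def in_targets(v, targets):
    """True if the variant falls inside any (chrom, start, end) interval, inclusive."""
    return any(c == v.chrom and s <= v.pos <= e for c, s, e in targets)

def is_candidate(v, targets, min_qual=50.0, common_af=0.01, max_cohort=5):
    """Apply the three filters described in the text, in order."""
    if v.qual < min_qual:              # quality score < 50
        return False
    if not in_targets(v, targets):     # called outside the captured regions
        return False
    if v.allele_freq >= common_af or v.cohort_count > max_cohort:
        return False                   # common variants are not prioritized
    return True

# Hypothetical target interval and calls, for illustration only.
targets = [("chr17", 1000, 2000)]
calls = [
    Variant("chr17", 1500, 99.0, 0.0, 1),    # passes all filters
    Variant("chr17", 1500, 42.0, 0.0, 1),    # low quality
    Variant("chr17", 2500, 99.0, 0.0, 1),    # off target
    Variant("chr17", 1600, 99.0, 0.05, 1),   # common in the population
    Variant("chr17", 1700, 99.0, 0.0, 6),    # common in the cohort
]
kept = [v for v in calls if is_candidate(v, targets)]
print(len(kept))
```

Manual review in IGV, which the text describes as the step that removes dephasing and duplicate artifacts, is deliberately left out of the sketch because it is not an automated rule.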

IT-based variant analysis

All variants were analyzed using the Shannon Human Splicing Mutation Pipeline, a genome-scale variant analysis program that predicts the effects of variants on mRNA splicing [91, 92]. Variants were flagged based on criteria reported in Shirley et al. (2013): weakened natural site ≥ 1.0 bits, or strengthened cryptic site (within 300 nt of the nearest exon) where cryptic site strength is equivalent or greater than the nearest natural site of the same phase [91]. The effects of flagged variants were further analyzed in detail using the Automated Splice Site and Exon Definition Analysis (ASSEDA) server [38].
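The two flagging criteria quoted above can be expressed as a small predicate. The sketch below is only an illustration of those rules as stated in the text: the function name and arguments are hypothetical, it is not the Shannon Pipeline's interface, and it assumes the caller has already located the nearest natural site of the same phase.

```python
def flag_splice_variant(site_type, ri_before, ri_after,
                        dist_to_exon=None, nearest_natural_ri=None):
    """Return True if a variant meets the flagging criteria described in the text.

    site_type          -- "natural" or "cryptic"
    ri_before/ri_after -- site strength in bits without and with the variant
    dist_to_exon       -- distance (nt) from a cryptic site to the nearest exon
    nearest_natural_ri -- strength (bits) of the nearest natural site of the
                          same phase (assumed to be supplied by the caller)
    """
    delta = ri_after - ri_before
    if site_type == "natural":
        # Natural site weakened by at least 1.0 bit.
        return delta <= -1.0
    if site_type == "cryptic":
        # Strengthened cryptic site within 300 nt of an exon that is at
        # least as strong as the nearest natural site.
        return (delta > 0
                and dist_to_exon is not None and dist_to_exon <= 300
                and nearest_natural_ri is not None
                and ri_after >= nearest_natural_ri)
    raise ValueError(f"unknown site type: {site_type!r}")

# Illustrative values only.
print(flag_splice_variant("natural", 8.4, 7.0))
print(flag_splice_variant("cryptic", 2.0, 9.0,
                          dist_to_exon=120, nearest_natural_ri=8.4))
```

A strengthened cryptic site that remains weaker than the natural site, or that lies more than 300 nt from the nearest exon, is not flagged under these rules.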

Exonic variants and those found within 500 nt of an exon were assessed for their effects, if any, on SRFBSs [38]. Sequence logos for splicing regulatory factors (SRFs) (SRSF1, SRSF2, SRSF5, SRSF6, hnRNPH, hnRNPA1, ELAVL1, TIA1, and PTB) and their $R_{sequence}$ values (the mean information content [93]) are provided in Caminsky et al. (2015) [36]. Because these motifs occur frequently in

Fig. 2 Framework for the Identification of Potentially Pathogenic Variants. Integrated laboratory processing and bioinformatic analysis procedures for comprehensive complete gene variant determination and analysis. Intermediate datasets resulting from filtering are represented in yellow and final datasets in green. Non-bioinformatic steps, such as sample preparation are represented in blue and prediction programs in purple. Sequencing analysis yields base calls for all samples. CASAVA [85] and CRAC [86] were used to align these sequencing results to hg19. GATK [88] was used to call variants from this data against GRCh37 release of the reference human genome. Variants with a quality score < 50 and/or call confidence score < 30 were eliminated along with variants falling outside of our target regions. SNPnexus [112–114] was used to identify the genomic location of the variants. Nonsense and indels were noted and prediction tools were used to assess the potential pathogenicity of missense variants. The Shannon Pipeline [91] evaluated the effect of a variant on natural and cryptic SSs, as well as SRFBSs. ASSEDA [38] was used to predict the potential isoforms as a result of these variants. PWMs for 83 TFs were built using an information weight matrix generator based on Bipad [106]. Mutation Analyzer evaluated the effect of variants found 10 kb upstream up to the first intron on protein binding. Bit thresholds ($R_i$ values) for filtering variants on software program outputs are indicated. Variants falling within the UTR sequences were assessed using SNPfold [20], and the most probable variants that alter mRNA structure (p < 0.1) were then processed using mFold to predict the effect on stability [83]. All UTR variants were scanned with a modified version of the Shannon Pipeline, which uses PWMs computed from nucleotide frequencies for 28 RBPs in RBPDB [109] and 76 RBPs in CISBP-RNA [110]. All variants meeting these filtering criteria were verified with IGV [89, 90]. *Sanger sequencing was only performed for protein truncating, splicing, and selected missense variants


References

Estimates of worldwide burden of cancer in 2008: GLOBOCAN 2008.

The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data

Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology.

An integrated encyclopedia of DNA elements in the human genome

Mfold web server for nucleic acid folding and hybridization prediction


Frequently Asked Questions (17)

Q1. What have the authors contributed in "A unified analytic framework for prioritization of non-coding variants of uncertain significance in heritable breast and ovarian cancer" ?

Q2. How many antisense strand oligos were synthesized?

Q3. How many well-distributed loci were excluded as candidate deletion intervals?

Q4. What was the important factor in determining potential hemizygosity?

Q5. What are the genes that have been reported to harbor mutations that increase HBOC risk?

Q6. What is the impact of a single nucleotide change in a recognition sequence?

Q7. What is the reason why the NGS protocol is more sensitive?

Q8. Why is the exon strength weaker than the natural one?

Q9. What are the TFs that have evidence for binding to the promoters of the genes?

Q10. What is the role of the IT-based analysis in identifying splicing variants?

Q11. What is the strategy to improve variant interpretation in patients?

Q12. Which variants were considered likely to alter stable 2° structures in mRNA?

Q13. How many in silico programs evaluated the effects of the remaining variants?

Q14. What are the benefits of interpreting non-coding sequence variants?

Q15. What was the common consequence of false positive variant calls?

Q16. What is the average number of variants per patient at each step?

Q17. Which variants were less likely to have a deleterious impact on protein activity?