Figure Legends
Figure 1. Results from in vitro minigene assays demonstrating multiple consequences of variants proximal to the canonical splice site. Left, gel electrophoresis snapshots of cDNA products amplified from primers designed for control exons within the minigene (exon 1 & exon 2). All prominent bands were cut out and Sanger sequenced. Right, solid red blocks illustrate alignment of sequenced cDNA transcripts to features within the minigene vector: control exons (grey boxes) and inserted exons (purple boxes). (a) SCN2A c.2919+3A>G, showing complete exon exclusion and exon truncation in minigene vectors containing the c.2919+3A>G variant (top two alignments) and normal splicing in minigene vectors containing the WT sequence (bottom alignment). One event produced a transcript with a truncated exon, NM_001040142.1:r.2563_2710del, and the other produced a complete exon skip, NM_001040142.1:r.2563_2919del. While we interpreted both events as ‘likely pathogenic’, it is noteworthy that they were considered differently under ACMG criteria: the exon truncation event resulted in a frameshift and introduction of a premature stop codon (PVS1), whereas the complete exon skipping event resulted in the in-frame removal of 119 amino acids from the encoded protein (PM4). (b) MERTK c.2486+6T>A, showing a shift in the position of the included exon in minigene vectors containing the c.2486+6T>A variant (top alignment) and normal splicing in minigene vectors containing the WT sequence (bottom alignment). This novel variant is present in two individuals with severe rod-cone dystrophy, and resulted in the simultaneous usage of a cryptic exonic splice acceptor site and a cryptic intronic splice donor site, creating a novel exon (chr2:112,779,939-112,780,082, GRCh37) and a premature stop codon in the penultimate exon, p.(Trp784Valfs*10).
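The frame consequences quoted for the two SCN2A events in panel (a) follow from the length of each RNA-level deletion. The sketch below is an illustrative check only: the helper name `rna_deletion_effect` is hypothetical, it handles only simple `r.START_ENDdel` descriptions, and it does not by itself assign any ACMG criterion.

```python
import re


def rna_deletion_effect(hgvs_r):
    """Return (deleted_nt, in_frame, codons_removed) for an 'r.START_ENDdel' description.

    Hypothetical helper for illustration; it only parses simple range deletions.
    """
    match = re.search(r"r\.(\d+)_(\d+)del$", hgvs_r)
    if match is None:
        raise ValueError(f"unsupported description: {hgvs_r}")
    start, end = int(match.group(1)), int(match.group(2))
    deleted_nt = end - start + 1          # HGVS ranges are inclusive
    in_frame = deleted_nt % 3 == 0        # a multiple of 3 preserves the reading frame
    codons_removed = (deleted_nt // 3) if in_frame else None
    return deleted_nt, in_frame, codons_removed


for description in (
    "NM_001040142.1:r.2563_2710del",  # exon truncation event
    "NM_001040142.1:r.2563_2919del",  # complete exon skip
):
    print(description, rna_deletion_effect(description))
```

Under these assumptions, the truncation removes 148 nucleotides (not a multiple of three, hence a frameshift), while the complete skip removes 357 nucleotides, that is 119 codons in frame.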
Figure 2. Comparison of in silico strategies to prioritize 250 variants of uncertain significance with functional investigations performed. (a) Receiver operating characteristic area under the curve (AUC) comparisons for nine in silico prioritization strategies, demonstrating that SpliceAI (AUC=0.95, 95% CI=0.93-0.98) and a consensus approach (AUC=0.94, 95% CI=0.91-0.96) outperform other strategies for prioritization. (b) AUC comparisons between SpliceAI, a consensus approach and a novel metric, demonstrating that a weighted approach slightly increases accuracy of prioritization over single approaches alone (AUC=0.96, 95% CI=0.94-0.98). (c-d) Accuracy comparisons of each in silico prioritization approach across 2000 bootstraps utilizing region-specific pre-defined thresholds: (c) violin plot demonstrating the calculated accuracy of each in silico prioritization approach; (d) frequency with which each strategy is the best or joint-best performing.
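The bootstrap comparison summarized in panels (c-d) can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used for the figure: the function name `bootstrap_accuracy`, the tool labels, and the toy calls are hypothetical, and each tool's calls are assumed to have already been made binary by applying its threshold.

```python
import random
from collections import Counter


def bootstrap_accuracy(truth, calls_by_tool, n_boot=2000, seed=0):
    """Resample variants with replacement and score each tool's accuracy.

    truth: list of bool (functional assay result per variant).
    calls_by_tool: dict of tool name -> list of bool (thresholded prediction).
    Returns (accuracies per tool, count of bootstraps where each tool is best or joint best).
    """
    rng = random.Random(seed)
    n = len(truth)
    accuracies = {tool: [] for tool in calls_by_tool}
    best_counts = Counter()
    for _ in range(n_boot):
        sample = [rng.randrange(n) for _ in range(n)]  # indices drawn with replacement
        round_acc = {}
        for tool, calls in calls_by_tool.items():
            correct = sum(calls[i] == truth[i] for i in sample)
            round_acc[tool] = correct / n
            accuracies[tool].append(round_acc[tool])
        top = max(round_acc.values())
        for tool, value in round_acc.items():
            if value == top:  # ties count for every tied tool ("joint best")
                best_counts[tool] += 1
    return accuracies, best_counts


# Hypothetical toy data: tool_a agrees with the assay everywhere, tool_b does not.
truth = [True, False, True, True, False, False, True, False]
calls = {
    "tool_a": list(truth),
    "tool_b": [True, True, True, False, False, False, True, True],
}
accuracies, best_counts = bootstrap_accuracy(truth, calls)
print(best_counts)
```

The per-tool accuracy lists correspond to what a violin plot would display, and `best_counts` corresponds to the best or joint-best frequency.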
Figure 3. Summary of the overlap and correlations observed between the scores from in silico splicing prediction algorithms for 18,013 unique rare variants identified in a large cohort of 2783 individuals with rare disease undergoing genetic testing, specifically for syndromic and non-syndromic inherited retinal disorders. (a) Bar chart showing overall count of unique variants prioritized using pre-defined thresholds for each in silico prediction algorithm. (b) Overlap between the unique variants prioritized by the five most correlated in silico prediction tools. (c) Grouped bar chart demonstrating the overlap of variants prioritized by each tool segregated by the region of the genome that the variant impacts, as defined by Jagadeesh et al. (d) Correlation between SpliceAI score and the number of additional tools also prioritizing the variant for the 528 unique rare variants prioritized by SpliceAI.
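The counts underlying panels (a) and (d) can be illustrated with a small sketch. Everything specific here is an assumption for illustration: the scores, the threshold values, and the tool labels other than SpliceAI are hypothetical, and each tool is assumed to prioritize a variant when its score is at or above the threshold.

```python
def prioritized_sets(scores_by_tool, thresholds):
    """Map each tool to the set of variants scoring at or above that tool's threshold."""
    return {
        tool: {variant for variant, score in scores.items() if score >= thresholds[tool]}
        for tool, scores in scores_by_tool.items()
    }


def additional_tool_counts(prioritized, reference_tool):
    """For each variant prioritized by the reference tool, count the other tools that agree."""
    others = [variants for tool, variants in prioritized.items() if tool != reference_tool]
    return {
        variant: sum(variant in variants for variants in others)
        for variant in prioritized[reference_tool]
    }


# Hypothetical scores and thresholds, not values from the study.
scores_by_tool = {
    "SpliceAI": {"var1": 0.9, "var2": 0.3, "var3": 0.1},
    "tool_b": {"var1": 0.8, "var2": 0.7, "var3": 0.2},
    "tool_c": {"var1": 0.6, "var2": 0.1, "var3": 0.9},
}
thresholds = {"SpliceAI": 0.2, "tool_b": 0.5, "tool_c": 0.5}

prioritized = prioritized_sets(scores_by_tool, thresholds)
print({tool: len(variants) for tool, variants in prioritized.items()})  # per-tool counts, as in (a)
print(additional_tool_counts(prioritized, "SpliceAI"))                  # agreement counts, as in (d)
```

Pairing each variant's agreement count with its SpliceAI score gives the two quantities compared in panel (d).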