Saturation genome editing experimental data are used here to map putative SREs in BRCA1
Currently, there are no studies that systematically map the active SREs in BRCA1 exons, covering their entire lengths and all possible nucleotide substitutions. However, we took advantage of available mRNA expression data from a recently published large-scale functional analysis of BRCA1 (Findlay et al., 2018) to identify putative SREs across multiple exons of this gene. The study of Findlay et al. applied saturation genome editing to measure the cell survival consequences of all possible single nucleotide variants in the 13 exons that encode the BRCA1 RING and BRCT protein domains, which are critical for its role as a tumor suppressor (Findlay et al., 2018). Specifically, near-haploid HAP1 cells were genomically edited using CRISPR-Cas9 to introduce BRCA1 single nucleotide variants, and variant abundances were quantified by targeted DNA sequencing as a readout for a cell survival assay; this information was used to assign a “function score.” Variants that did not affect DNA abundance were classified as “functional”; otherwise, variants were classified as “non-functional” or “intermediate” depending on the extent of DNA depletion. In total, function scores were calculated for 3,893 BRCA1 variants, and these scores were observed to accurately predict variant pathogenicity as reported to the ClinVar database. mRNA expression scores were also determined for 96% of the functionally characterized variants, and variants that were depleted in mRNA relative to DNA were interpreted to affect mRNA expression and/or processing.
From this dataset, we selected 33 BRCA1 synonymous or missense variants in putative SREs (Table 2) based on the following criteria (Figure 2): (a) depleted in mRNA (Findlay mean RNA score < -2); (b) non-functional or intermediate function based on DNA depletion; (c) outside of the donor and acceptor splice site motifs; (d) not predicted to create de novo donor or acceptor sites by the MES-based Variant Effect Predictor plugin using the thresholds and decision flowchart described in Shamsani et al. (2018); and (e) predicted to alter or create SREs by at least one SRE algorithm in HSF. These bioinformatic tools were chosen because they are freely available and easy to use. The MES-based Variant Effect Predictor plugin also allows high-throughput submission, and HSF accepts multiple variant queries for analysis using 14 different SRE algorithms (Table 1) in a single platform. Although nonsense variants can also alter SREs to lead to exon skipping (Supplementary Tables 2 and 3), these were excluded because they are expected to deplete mRNA via nonsense-mediated decay.
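The five selection criteria, together with the restriction to synonymous and missense variants, can be read as a single conjunctive filter. The following is a minimal sketch only, not the pipeline used in this study: the `Variant` record and all of its field names are hypothetical, and the MES-based and HSF predictions are assumed to have been computed beforehand and stored as simple flags and counts.

```python
from dataclasses import dataclass

# Criterion (a): Findlay mean RNA score must be below this cutoff.
RNA_SCORE_CUTOFF = -2.0


@dataclass
class Variant:
    """Hypothetical per-variant record; field names are illustrative only."""
    label: str
    consequence: str            # e.g. "synonymous", "missense", "nonsense"
    function_class: str         # "functional", "intermediate", "non-functional"
    mean_rna_score: float       # Findlay mean RNA score
    in_splice_site_motif: bool  # lies within a donor/acceptor motif
    creates_de_novo_site: bool  # MES-based prediction, precomputed
    sre_algorithms_hit: int     # number of HSF SRE algorithms predicting a change


def passes_sre_filter(v: Variant) -> bool:
    """Return True if the variant meets criteria (a)-(e) for a putative SRE."""
    return (
        v.consequence in {"synonymous", "missense"}  # nonsense variants excluded
        and v.mean_rna_score < RNA_SCORE_CUTOFF      # (a) depleted in mRNA
        and v.function_class in {"non-functional", "intermediate"}  # (b)
        and not v.in_splice_site_motif               # (c)
        and not v.creates_de_novo_site               # (d)
        and v.sre_algorithms_hit >= 1                # (e)
    )


if __name__ == "__main__":
    # Made-up records for illustration; these are not variants from Table 2.
    candidates = [
        Variant("hypothetical-A", "missense", "non-functional", -3.1, False, False, 2),
        Variant("hypothetical-B", "nonsense", "non-functional", -3.5, False, False, 4),
        Variant("hypothetical-C", "synonymous", "intermediate", -1.2, False, False, 1),
    ]
    for v in candidates:
        print(v.label, passes_sre_filter(v))
```

Writing the criteria as one boolean expression keeps each condition traceable to its letter in the text; a variant is dropped as soon as any single criterion fails.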
Exons 2, 3, and 19-22 did not harbor variants that passed the above criteria. The 33 variants prioritized as likely to impact SREs are shown in Table 2. We then mapped the location of putative SREs in exons 5, 6, 16-18, 23 and 24 of BRCA1 by identifying SRE sequences that overlap with these 33 variants (Figure 3, Supplementary Figure 1). Notably, the putative SREs mapped to exons with at least one weak splice site (MES score < 6.2), or with moderate strength for both splice sites (MES score between 6.2 and 8.5) (Table 3). With a single exception, in exon 23, putative SREs did not map to exons with strong splice donors (MES score ≥ 8.5) (Table 3). Since variants demonstrating minor mRNA depletion (Findlay mean RNA score between -0.5 and -2) were excluded to limit false predictions, this map is expected to capture only putative SREs with strong activity.
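The MES thresholds quoted above can be expressed as a small classifier. This is an illustrative sketch under stated assumptions: the function names are hypothetical, a score of exactly 6.2 is assumed to count as moderate, and the second function encodes only the pattern observed in Table 3 (to which exon 23 is an exception), not a predictive rule.

```python
WEAK_CUTOFF = 6.2    # MES score below this: weak splice site
STRONG_CUTOFF = 8.5  # MES score at or above this: strong splice site


def site_strength(mes_score: float) -> str:
    """Bin a MaxEntScan (MES) score using the thresholds given in the text."""
    if mes_score < WEAK_CUTOFF:
        return "weak"
    if mes_score < STRONG_CUTOFF:
        return "moderate"
    return "strong"


def matches_observed_pattern(acceptor_mes: float, donor_mes: float) -> bool:
    """True if an exon has at least one weak site, or two moderate sites.

    This mirrors the description of exons in which putative SREs were
    mapped; it is a summary of an observation, not a validated predictor.
    """
    acceptor = site_strength(acceptor_mes)
    donor = site_strength(donor_mes)
    return "weak" in (acceptor, donor) or (acceptor == donor == "moderate")


if __name__ == "__main__":
    # Made-up scores for illustration; these are not values from Table 3.
    for acc, don in [(5.0, 9.0), (7.0, 7.5), (9.1, 7.0)]:
        print(acc, don, matches_observed_pattern(acc, don))
```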
The validity of the mapping approach is supported by published mRNA splicing assay results that relate to variants prioritized as SRE-disrupting (Table 2). Variants c.5080G>A, c.5434C>G and c.5453A>G have been shown to lead to exon skipping in previous studies. Variants c.5080G>A and c.5123C>G are located at the same nucleotide position as three other variants for which exon skipping has been previously reported. Of these latter three variants, one was excluded from our mapping analysis since it encodes a premature termination codon, one had a minor mRNA depletion score of -0.55, which does not meet the filter of < -2, and the last was a 2 bp deletion and thus not assayed by Findlay et al. (2018). Eleven of the 33 putative SRE-disrupting variants are reported in ClinVar, where nine are catalogued based on predicted codon usage and are not annotated as spliceogenic variants; the (likely) pathogenic classification of c.5434C>G and c.5453A>G considered published splicing assay results as evidence (Table 2). Eight of these variants are currently not interpreted as (likely) pathogenic: c.5007C>T [p.(Ala1669=), likely benign], c.5044G>A [p.(Glu1682Lys), benign], c.5044G>C [p.(Glu1682Gln), VUS], c.5045A>T [p.(Glu1682Val), VUS], c.5078C>T [p.(Ala1693Val), VUS], c.5080G>A [p.(Glu1694Lys), VUS], c.5444G>C [p.(Trp1815Ser), VUS], and c.5528C>A [p.(Ala1843Glu), VUS].
The assay of Findlay et al. (2018) shows the effect of variants on mRNA levels and does not directly inform variant effect on mRNA splicing. Follow-up splicing assays would still be needed to confirm SRE-related mRNA aberrations for those variants in Table 2 without previously reported splicing assay results. The confirmatory splicing assays would provide further evidence to establish the BRCA1 SRE map for the exons examined, as well as potentially aiding the re-interpretation of variant pathogenicity.