Future use of mapped SREs in BRCA1 to improve SRE
prediction
Solving the problem of over-prediction is an important step towards
making SRE-dedicated bioinformatic tools useful in variant interpretation
and clinical diagnostics. As shown in our detailed BRCA1 SRE map
(Supplementary Figure 1), there are negative control variants within the
mapped SREs that are predicted by HSF to alter these motifs. Some are
even located at the same nucleotide position as positive control
variants. For example, c.5007C>T, which is categorized as
non-functional and shows mRNA depletion (Figure 2), is designated a true
positive because HSF also predicts that it creates an ESS
(Supplementary Figure 1). In contrast, c.5007C>A and c.5007C>G have no
functional impact and are designated false positives because HSF
predicted them to create an ESS and break an ESE, respectively
(Supplementary Figure 1). ΔtESRseq,
ΔHZEI, and HOT-SKIP – which combine the scores of ESEs
and ESSs disrupted or created by a variant – correctly predicted
c.5007C>A and c.5007C>G to have no impact on
an SRE. Similar results are observed for other co-located HSF-predicted
false positive variants at c.5127, c.5130, c.5430, and c.5472
(Supplementary Figure 1), where at least two of the three tools
(ΔtESRseq, ΔHZEI, and HOT-SKIP) made negative calls, in
agreement with the mRNA depletion scores. Although the quantitative
combined ESS-ESE scoring approach of ΔtESRseq, ΔHZEI,
and HOT-SKIP appears to substantially lower the number of HSF-predicted
false positives, these three tools still predict some negative control
variants within the mapped SREs to impact SREs. Other factors therefore
need to be considered to improve prediction of variant effects within
mapped SREs.
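
The combined ESS-ESE scoring idea discussed above can be illustrated with a
minimal sketch. It assumes a per-hexamer score table (positive for ESE-like,
negative for ESS-like hexamers) and sums the scores of all hexamers that
overlap the variant position in the reference and in the variant sequence;
the difference is the delta score. The function names, the toy score table,
and the two-of-three consensus helper are hypothetical illustrations and are
not the implementation of ΔtESRseq, ΔHZEI, or HOT-SKIP.

```python
from typing import Dict, List


def overlapping_hexamers(seq: str, pos: int, k: int = 6) -> List[str]:
    """Return every k-mer of seq that covers the 0-based index pos."""
    first_start = max(0, pos - k + 1)
    last_start = min(pos, len(seq) - k)
    return [seq[s:s + k] for s in range(first_start, last_start + 1)]


def delta_score(ref_seq: str, pos: int, alt_base: str,
                scores: Dict[str, float]) -> float:
    """Summed hexamer score of the variant sequence minus the reference.

    Hexamers missing from the score table are treated as neutral (0.0),
    which is an assumption of this sketch.
    """
    alt_seq = ref_seq[:pos] + alt_base + ref_seq[pos + 1:]
    ref_total = sum(scores.get(h, 0.0) for h in overlapping_hexamers(ref_seq, pos))
    alt_total = sum(scores.get(h, 0.0) for h in overlapping_hexamers(alt_seq, pos))
    return alt_total - ref_total


def consensus_no_impact(calls: Dict[str, bool], min_agree: int = 2) -> bool:
    """True if at least min_agree tools call the variant as having no impact.

    calls maps a tool name to True when that tool predicts an SRE impact.
    """
    return sum(1 for impact in calls.values() if not impact) >= min_agree


# Toy data only: these scores are invented for illustration.
toy_scores = {"AAAAAA": 0.5, "AAAAAC": -1.0}
toy_delta = delta_score("AAAAAAAAAAA", 5, "C", toy_scores)  # -4.0
```

A negative delta in this sketch means that the variant lowers the net
enhancer-like signal around its position. The threshold at which a delta is
called an impact is tool-specific and is not modelled here.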
The false positive variants can be studied further to better understand
the structural features that prevent SREs from being used. For false
positive variants outside of the mapped SREs, the location of predicted
SREs with respect to local mRNA secondary structure could also play a
role; for example, an SRE located in the stem of a stem-loop structure
may be less accessible to its corresponding RNA-binding protein
(Buratti et al., 2004). In the same way, the positive control dataset of
33 variants could be assessed for structural features that enable these
variants to alter mRNA expression. More information on structural
patterns that influence exonic SRE activity, which can be obtained from
bioinformatic analysis, may be useful in improving SRE prediction not
only in BRCA1 but also in other genes.
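
One simple structural feature of the kind suggested above is the fraction of
a predicted SRE that lies in base-paired (stem) positions. The sketch below
assumes that a secondary structure in dot-bracket notation has already been
obtained from a separate folding tool; the function name and the toy
structure are hypothetical, and no particular cut-off for accessibility is
implied.

```python
def paired_fraction(structure: str, start: int, end: int) -> float:
    """Fraction of positions in [start, end) that are base-paired.

    structure is a dot-bracket string: '(' and ')' mark paired bases,
    '.' marks unpaired bases. Coordinates are 0-based, end-exclusive.
    """
    if not 0 <= start < end <= len(structure):
        raise ValueError("window must lie within the structure")
    window = structure[start:end]
    paired = sum(1 for ch in window if ch in "()")
    return paired / len(window)


# Toy structure only: a 4-bp stem closing a 4-nt loop, then 2 free bases.
toy_structure = "((((....)))).."
stem_fraction = paired_fraction(toy_structure, 0, 6)  # window "((((.."
loop_fraction = paired_fraction(toy_structure, 4, 8)  # window "...."
```

Comparing this fraction between false positive and positive control variants
would be one way to test whether stem location is associated with reduced SRE
usage.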