Benchmarking and comparative analysis
To test the performance of VIP-HL, we constructed two benchmark
datasets. The first dataset consisted of 50 of the 51 variants in
hearing loss genes that had been curated by the ClinGen HL-EP (Oza et
al., 2018). We excluded NM_206933.3:c.(?_12295)_(14133_?)del in the
USH2A gene because it is an exon-level deletion (deletion of exons
63-64), which is currently not compatible with VIP-HL.
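To assess the importance of disease-specific annotations, we compared
the rules activated by ClinGen HL-EP with those activated by VIP-HL and
by InterVar, in both directions: rules activated by ClinGen HL-EP but
not by a tool, and rules activated by a tool but not by ClinGen HL-EP.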
When counting rules that were activated by either VIP-HL or InterVar
but not by ClinGen HL-EP, we did not count variants meeting the BA1
criterion, because ClinGen HL-EP did not activate other criteria once a
variant met BA1. For example, ClinGen HL-EP assigned BA1 to
NM_005422.2:c.1111A>G in the TECTA gene and did not activate any
further criteria. However, both VIP-HL and InterVar activated BS2
because 10518 homozygotes are reported in the gnomAD database for this
variant (Karczewski et al., 2020). The InterVar code was downloaded
from GitHub and run with default settings.
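
The two-way rule comparison described above can be illustrated with a minimal Python sketch. It is not the code used in this study: the function name compare_rules, the variant identifiers, and the rule sets in the example are hypothetical and chosen only to show the set logic, including the exclusion of BA1 variants when counting rules activated by a tool but not by the expert panel.

```python
from typing import Dict, Set, Tuple

# Maps a variant identifier to the set of ACMG/AMP criteria activated for it.
RuleSets = Dict[str, Set[str]]


def compare_rules(reference: RuleSets, tool: RuleSets) -> Tuple[RuleSets, RuleSets]:
    """Compare expert-curated rules (reference) with rules from one tool.

    Returns two dictionaries keyed by variant:
      missed: rules activated by the reference but not by the tool
      extra:  rules activated by the tool but not by the reference
    Variants for which the reference activated BA1 are skipped when
    collecting 'extra', because the reference stops at BA1.
    """
    missed: RuleSets = {}
    extra: RuleSets = {}
    for variant, ref_rules in reference.items():
        tool_rules = tool.get(variant, set())

        missed_rules = ref_rules - tool_rules
        if missed_rules:
            missed[variant] = missed_rules

        # Assumption mirrored from the text: BA1 variants are not counted
        # in the "tool but not reference" direction.
        if "BA1" in ref_rules:
            continue

        extra_rules = tool_rules - ref_rules
        if extra_rules:
            extra[variant] = extra_rules
    return missed, extra


# Toy data (hypothetical variants and rule sets, for illustration only).
reference_rules = {
    "var_a": {"BA1"},          # reference stops at BA1
    "var_b": {"PM2", "PP3"},
}
tool_rules = {
    "var_a": {"BA1", "BS2"},   # BS2 is ignored because var_a met BA1
    "var_b": {"PM2", "PVS1"},
}

missed, extra = compare_rules(reference_rules, tool_rules)
print("missed:", missed)
print("extra:", extra)
```

In this toy example, var_a follows the same pattern as the TECTA variant: the additional BS2 from the tool is not counted as a discrepancy, whereas the differences for var_b are reported in both directions.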
The second dataset included 4948 variants in 142 deafness-related genes
with a ClinVar review status of two or more stars (i.e., multiple
submitters with assertion criteria, expert panel or practice guideline)
(Landrum et al., 2018). These variants were selected because variants
with this review status are less often misclassified (Shah et al., 2018;
Xiang, Yang, et al., 2020). The 142 deafness-related genes were curated
by ClinGen HL-EP (DiStefano et al., 2019). The genes and their
gene-disease associations are listed in Supplementary Table 1.
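
As an illustration of the selection criteria for the second dataset, the following Python sketch filters variant records by gene list and ClinVar star level. It is a sketch under stated assumptions rather than the pipeline used here: the record fields (gene, review_status), the function names, and the toy records are hypothetical, and the review-status strings follow ClinVar's published star definitions as we understand them; any unrecognized status is treated as zero stars.

```python
from typing import Dict, Iterable, List, Set

# Assumed mapping from ClinVar review status text to star level.
# Statuses not listed here are treated as 0 stars.
REVIEW_STATUS_STARS: Dict[str, int] = {
    "practice guideline": 4,
    "reviewed by expert panel": 3,
    "criteria provided, multiple submitters, no conflicts": 2,
    "criteria provided, single submitter": 1,
    "no assertion criteria provided": 0,
}


def star_level(review_status: str) -> int:
    """Return the star level for a review status string (0 if unknown)."""
    return REVIEW_STATUS_STARS.get(review_status.strip().lower(), 0)


def select_variants(
    records: Iterable[Dict[str, str]],
    genes: Set[str],
    min_stars: int = 2,
) -> List[Dict[str, str]]:
    """Keep records in the given genes with at least min_stars stars."""
    return [
        rec
        for rec in records
        if rec["gene"] in genes and star_level(rec["review_status"]) >= min_stars
    ]


# Toy records (hypothetical, for illustration only).
records = [
    {"id": "v1", "gene": "USH2A", "review_status": "reviewed by expert panel"},
    {"id": "v2", "gene": "TECTA", "review_status": "criteria provided, single submitter"},
    {"id": "v3", "gene": "BRCA1", "review_status": "practice guideline"},
    {"id": "v4", "gene": "TECTA",
     "review_status": "criteria provided, multiple submitters, no conflicts"},
]
deafness_genes = {"USH2A", "TECTA"}  # stand-in for the curated 142-gene list

selected = select_variants(records, deafness_genes)
print([rec["id"] for rec in selected])
```

Here v2 is dropped for having only one star and v3 for lying outside the gene list, leaving v1 and v4.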