AllelePred: A Simple Allele Frequencies Ensemble Predictor for Different
Single Nucleotide Variants
Abstract
Genomic medicine stands to be revolutionized through the understanding
of single nucleotide variants (SNVs) and their expression in single-gene
disorders (Mendelian diseases). Computational tools can play a vital
role in the exploration of such variations and their pathogenicity.
Consequently, we developed the ensemble prediction tool AllelePred to
identify deleterious SNVs and disease-causing genes. In comparison to
other tools, our classifier achieves higher accuracy, precision, F1
score, and coverage for different types of coding variants. Furthermore,
this research analyzes and structures 168,945 broad-spectrum genetic
variants from the genomes of the Saudi population to assess the accuracy
of the model. AllelePred was able to structure the unlabeled Saudi
genetic variants in the dataset so that they mimic the data
characteristics of the known labeled data. On this basis, we compiled
a list of highly probable deleterious variants that we recommend for
further experimental validation prior to medical diagnostic usage.