Suppl. Tables S8-13
Genotype findings in each of the six probands were annotated and analyzed
for common measures of pathogenicity and inheritance (Roessler et al.,
2018a). Coding variants were annotated using dbNSFP v.3.3a
(http://annovar.openbioinformatics.org/en/latest/) and classified by
predictor consensus: variants called deleterious by >50% of [SIFT,
PolyPhen2HDIV, PolyPhenHVAR, LRT, MutationTaster, MutationAssessor,
FATHMM, PROVEAN, FATHMM-MKL, MetaSVM, MetaLR; see columns CC to CP and
CV to CZ] are shown as driver mutations (red) or damaging variants
(pink), and those with <50% as likely benign (green). Pathogenic
mutations (red) cluster at discrete HPE loci.
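
The consensus rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used to build the tables: the function name, the boolean encoding of each predictor call, the handling of missing predictions, and the labels returned are all hypothetical. The legend does not say how an exact 50% split or the red versus pink distinction is decided, so the sketch reports a tie separately and does not attempt to separate red from pink.

```python
from typing import Mapping, Optional

# Predictor names as listed in the legend.
PREDICTORS = (
    "SIFT", "PolyPhen2HDIV", "PolyPhenHVAR", "LRT", "MutationTaster",
    "MutationAssessor", "FATHMM", "PROVEAN", "FATHMM-MKL", "MetaSVM", "MetaLR",
)


def consensus_call(calls: Mapping[str, Optional[bool]]) -> str:
    """Classify one coding variant from per-predictor calls.

    `calls` maps a predictor name to True (deleterious), False (tolerated)
    or None (no prediction). Assumption: predictors without a call are
    left out of the denominator.
    """
    available = [calls[p] for p in PREDICTORS if calls.get(p) is not None]
    if not available:
        return "no_prediction"
    fraction = sum(available) / len(available)
    if fraction > 0.5:
        return "damaging"        # red or pink in the tables
    if fraction < 0.5:
        return "likely_benign"   # green in the tables
    return "tie"                 # exactly 50%: not specified by the legend
```

For example, a variant called deleterious by every listed predictor returns "damaging", while one called deleterious by only one of three available predictors returns "likely_benign".
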
Variants observed at a Minor Allele Frequency (MAF) >1% as
determined by ExAC [http://exac.broadinstitute.org] were removed
from the dataset and assumed to be benign. Non-coding SNPs were
prioritized based on a measure of phylogenetic conservation (our scale
of 0 to 7, with 7 being the highest rank considered) among seven
vertebrate species [danRer7, fr3, TetNig1, galGal3, mm10, morDom5,
xenTro3] in the current version of the ECR browser
[ecrbrowser.dcode.org]. Non-coding variants with conservation
<4/7 (light blue) are by far the majority among SNPs as well
as novel observations. Ultra-conserved variants (>4/7, dark
blue) represent a minority of findings, but interestingly are common in
the vicinity of known HPE genes. Indels [colored in tan] were scored
[0 to 7] using the score of the coordinate immediately 5’ of the indel
variation. Most non-coding variation is poorly conserved [light blue,
score 0 to 3]. Ultra-conserved variants [dark blue, score
>4] are far less common, but can be seen within or near a
known HPE gene. Our score [0 to 7] was then compared with other
position-specific metrics [gerp++, phyloP100, phyloP46, CADDRawscore,
CADD13, CADDindel, dann, FATHMM_coding, FATHMM_noncoding, and Eigen;
see columns BM to BW]. The only filters currently applied exclude
[Func.ref: ncRNAexonic and ncRNAintronic; and Exonicfunc.refGene:
synonymous coding region changes; see columns AH and AK]. Known
repeat regions determined by the ECR browser analysis were removed from
the dataset. Low-quality data were colored grey but retained in the dataset.
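
As a rough illustration of the frequency filter and the color coding of non-coding variants described in this legend, the following Python sketch may help when reading the tables. The function names and the treatment of variants absent from ExAC are assumptions made for the example. The legend assigns light blue to scores below 4 and dark blue to scores above 4 but does not state how a score of exactly 4 is colored, so the sketch returns a separate label for that case rather than guessing.

```python
from typing import Optional

MAF_CUTOFF = 0.01  # the ">1%" ExAC threshold given in the legend


def passes_maf(exac_maf: Optional[float]) -> bool:
    """Return True if a variant is kept by the MAF filter.

    Assumption: a variant not present in ExAC (None) is kept.
    """
    return exac_maf is None or exac_maf <= MAF_CUTOFF


def conservation_color(score: int, is_indel: bool = False) -> str:
    """Map a 0 to 7 conservation score to the color used in the tables."""
    if not 0 <= score <= 7:
        raise ValueError("conservation score must be between 0 and 7")
    if is_indel:
        return "tan"          # indels are colored tan whatever their score
    if score < 4:
        return "light blue"   # poorly conserved, score 0 to 3
    if score > 4:
        return "dark blue"    # ultra-conserved, score above 4
    return "unspecified"      # score of exactly 4: not assigned by the legend
```

Under these assumptions, a SNP with an ExAC MAF of 0.02 is removed by `passes_maf`, and a retained non-coding SNP with a score of 6 maps to dark blue.
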