Suppl. Tables S8-13
Genotype findings in each of the six probands were annotated and analyzed for common measures of pathogenicity and inheritance (Roessler et al., 2018a). Coding variants were annotated using the dbNSFP v.3.3a consensus (http://annovar.openbioinformatics.org/en/latest/): variants called damaging by >50% of the predictors [SIFT, PolyPhen2HDIV, PolyPhenHVAR, LRT, MutationTaster, MutationAssessor, FATHMM, PROVEAN, FATHMM-MKL, MetaSVM, MetaLR; see columns CC to CP and CV to CZ] are shown as driver mutations (red) or damaging variants (pink), and those called damaging by <50% as likely benign (green). Pathogenic mutations (red) cluster at discrete HPE loci. Variants observed at a Minor Allele Frequency (MAF) >1% in ExAC [http://exac.broadinstitute.org] were assumed to be benign and removed from the dataset.
Non-coding SNPs were prioritized by a measure of phylogenetic conservation (our scale of 0 to 7, with 7 being the highest rank considered) among seven vertebrate species [danRer7, fr3, TetNig1, galGal3, mm10, monDom5, xenTro3] in the current version of the ECR browser [ecrbrowser.dcode.org]. Indels [colored in tan] were scored [0 to 7] using the score of the coordinate immediately 5’ of the indel variation. Most non-coding variants, among both known SNPs and novel observations, are poorly conserved [light blue, score 0 to 3]. Ultra-conserved variants [dark blue, score >4] are a minority of findings but, interestingly, are common within or near known HPE genes. Our score [0 to 7] was then compared with other position-specific metrics [gerp++, phyloP100, phyloP46, CADDRawscore, CADD13, CADDindel, dann, FATHMM_coding, FATHMM_noncoding, and Eigen; see columns BM to BW].
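The consensus call, the MAF cut-off and the 0 to 7 conservation score described above can be illustrated with a minimal Python sketch. It is not the pipeline used to build the tables. The function names, the handling of predictors with no call (skipped in the denominator), the "undetermined" label for an exact 50% split, and the reading of the conservation score as a count of conserved species are all assumptions made for illustration. The sketch does not model how red and pink variants are told apart.

```python
from typing import Optional, Sequence

MAF_CUTOFF = 0.01  # variants above 1% ExAC MAF are dropped, per the legend


def coding_class(calls: Sequence[Optional[bool]], maf: Optional[float] = None) -> str:
    """Classify a coding variant from per-predictor damaging calls.

    calls: one entry per predictor; True = damaging, False = tolerated,
           None = no prediction (assumption: skipped in the denominator).
    """
    if maf is not None and maf > MAF_CUTOFF:
        return "removed"
    scored = [c for c in calls if c is not None]
    if not scored:
        return "unscored"
    fraction = sum(scored) / len(scored)
    if fraction > 0.5:
        return "damaging"        # red or pink; the split is not modelled here
    if fraction < 0.5:
        return "likely benign"   # green
    return "undetermined"        # exactly 50% is not covered by the legend


def conservation_score(conserved: Sequence[bool]) -> int:
    """Count the comparison species in which a position is conserved (0 to 7).

    Assumption: the score is a simple count over the seven species.
    """
    if len(conserved) != 7:
        raise ValueError("expected one flag per species (7)")
    return sum(conserved)
```

For example, a variant called damaging by 7 of 11 predictors with no ExAC frequency would be classed as "damaging", while the same variant with an ExAC MAF of 5% would be removed.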
The only filters currently applied are [Func.ref: no ncRNAexonic, no ncRNAintronic; and Exonicfunc.refGene: synonymous coding region changes excluded; see columns AH and AK]. Known repeat regions identified by the ECR browser analysis were removed from the dataset. Poor-quality data are colored grey but were retained in the dataset.
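As a rough illustration of the two annotation filters above, the following Python sketch drops rows whose functional class is a non-coding RNA exon or intron, or whose exonic function is synonymous. The column keys `Func.refGene` and `ExonicFunc.refGene` and the labels `ncRNA_exonic`, `ncRNA_intronic` and `synonymous SNV` follow the usual ANNOVAR output conventions and are assumed here; the spelling in the actual tables may differ. The helper name `passes_filters` is hypothetical.

```python
# Labels assumed to follow ANNOVAR's refGene annotation output.
EXCLUDED_FUNC = {"ncRNA_exonic", "ncRNA_intronic"}
EXCLUDED_EXONIC = {"synonymous SNV"}


def passes_filters(row: dict) -> bool:
    """Return True if a variant row survives both annotation filters."""
    # A variant can carry several ';'-separated labels (e.g. "exonic;splicing").
    # Assumption: any excluded label is enough to drop the row.
    func_labels = set(row.get("Func.refGene", "").split(";"))
    if func_labels & EXCLUDED_FUNC:
        return False
    return row.get("ExonicFunc.refGene", "") not in EXCLUDED_EXONIC


variants = [
    {"Func.refGene": "exonic", "ExonicFunc.refGene": "nonsynonymous SNV"},
    {"Func.refGene": "ncRNA_intronic", "ExonicFunc.refGene": "."},
    {"Func.refGene": "exonic", "ExonicFunc.refGene": "synonymous SNV"},
]
kept = [v for v in variants if passes_filters(v)]
```

Repeat-region removal and the grey flag for poor data are not covered by this sketch, since the legend does not say how they are encoded.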