2.3 | Variant filtering steps and data analysis
Variants were filtered and prioritized using a four-step strategy to
generate a short candidate variant list for experimental validation
(Figure 1). First, we removed variants with less than 10× coverage.
Next, variants were limited to those with low population frequency:
variants with a minor allele frequency (MAF) ≥1% in the Genome
Aggregation Database (gnomAD) (http://gnomad.broadinstitute.org/) or the
Korean Reference Genome Database (KRGDB)
(http://coda.nih.go.kr/coda/KRGDB/index.jsp) were
removed. The third step was to prioritize missense, nonsense,
frameshift, and in-frame insertion/deletion variants, as well as
changes affecting consensus splice-site sequences. Finally, we performed
a gene-specific analysis with an in-silico gene panel of 903 genes,
selected using the Human Phenotype Ontology (HPO) term for microcephaly
(HP:0000252) or Online Mendelian Inheritance in Man (OMIM) microcephaly
phenotype genes (Supplementary
Table 1). To delineate candidate genetic variants, an additional allele
analysis was performed under the following conditions: 1) triplicate
data with no pathogenic variants (PVs) or likely pathogenic variants
(LPVs); 2) de novo, compound heterozygous, homozygous, or hemizygous
variants; 3) ≤2 alleles in gnomAD, or ≤8 alleles if recessive; 4) a
CADD score of ~15 or higher and deleterious predictions from all of
SIFT (http://sift.jcvi.org), PolyPhen2
(http://genetics.bwh.harvard.edu/pph2), and MutationTaster
(http://mutationtaster.org/) for missense variants; and 5) affected
genes with data from animal models and/or functional studies suggesting
neurodevelopmental roles.
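
The filtering logic described above can be sketched in Python as follows.
This is a minimal illustration under stated assumptions, not the pipeline
used in this study. The Variant record, its field names, the consequence
labels, and both function names are hypothetical. The "~15" CADD cutoff
is taken as exactly 15 and applied to every variant, which is one reading
of condition 4. Conditions 1, 2, and 5 depend on family and literature
data and are not modeled.

```python
from dataclasses import dataclass, field
from typing import Optional

# Consequence classes kept in step 3. The labels are illustrative and do
# not follow any particular annotation tool's vocabulary.
PRIORITIZED = {"missense", "nonsense", "frameshift", "inframe_indel", "splice_site"}

# In-silico predictors that must all call a missense variant deleterious.
PREDICTORS = ("SIFT", "PolyPhen2", "MutationTaster")


@dataclass
class Variant:
    """Hypothetical per-variant record; field names are assumptions."""
    gene: str
    coverage: int                 # read depth at the variant site
    gnomad_maf: Optional[float]   # None if the variant is absent from gnomAD
    krgdb_maf: Optional[float]    # None if the variant is absent from KRGDB
    consequence: str
    gnomad_alleles: int = 0       # allele count observed in gnomAD
    recessive: bool = False       # True for a recessive inheritance model
    cadd: float = 0.0
    deleterious_calls: dict = field(default_factory=dict)  # predictor -> bool


def passes_four_steps(v, panel, min_cov=10, max_maf=0.01):
    """Apply the four filtering steps in order; True if the variant survives."""
    # Step 1: remove variants with less than 10x coverage.
    if v.coverage < min_cov:
        return False
    # Step 2: remove variants with MAF >= 1% in either database.
    for maf in (v.gnomad_maf, v.krgdb_maf):
        if maf is not None and maf >= max_maf:
            return False
    # Step 3: keep only the prioritized consequence classes.
    if v.consequence not in PRIORITIZED:
        return False
    # Step 4: keep only variants in genes on the in-silico panel.
    return v.gene in panel


def passes_allele_conditions(v, cadd_min=15.0):
    """Check conditions 3 and 4 of the additional allele analysis."""
    # Condition 3: <=2 gnomAD alleles, or <=8 under a recessive model.
    max_alleles = 8 if v.recessive else 2
    if v.gnomad_alleles > max_alleles:
        return False
    # Condition 4 (assumed reading): CADD cutoff applies to all variants.
    if v.cadd < cadd_min:
        return False
    # Missense variants additionally need deleterious calls from all predictors.
    if v.consequence == "missense":
        return all(v.deleterious_calls.get(p, False) for p in PREDICTORS)
    return True


# Usage with placeholder values (not data from this study).
panel = {"GENE_A", "GENE_B"}
candidate = Variant(
    gene="GENE_A", coverage=42, gnomad_maf=0.0001, krgdb_maf=None,
    consequence="missense", gnomad_alleles=1, cadd=24.3,
    deleterious_calls={p: True for p in PREDICTORS},
)
keep = passes_four_steps(candidate, panel) and passes_allele_conditions(candidate)
```

Keeping the four-step filter separate from the allele-level conditions
mirrors the two stages in the text, so each threshold can be adjusted
without touching the other stage.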