Yang Yang and 7 more authors

Contemporary population genomic studies typically map raw reads to a reference genome and analyze single nucleotide polymorphism (SNP) data obtained from variant calling. Although the genotype caller GATK is widely used for variant calling, it was designed primarily for human data, which limits its use in non-human species. ATLAS has recently emerged as a promising alternative caller, with lower false positive and false negative rates that significantly affect phylogenomic inferences. However, the extent to which the choice between ATLAS and GATK influences downstream population genomic analyses remains largely unexplored. To address this gap, we conducted a population genomic study of five Pterocarya species using both callers (GATK and ATLAS) and two reference genomes (P. stenoptera and P. macroptera). Across the four resulting datasets, we evaluated mapping depth, coverage rate, linkage disequilibrium (LD), nucleotide diversity (π), population structure, and demographic history. Using P. stenoptera as the reference genome resulted in less variation in depth and coverage rate across species than using P. macroptera. With both reference genomes, ATLAS consistently identified more SNPs and yielded higher nucleotide diversity and lower LD than GATK. Population structure results were more sensitive to the choice of reference genome than to the choice of caller, whereas both the reference genome and the caller significantly influenced inference of demographic history. Our study emphasizes the critical impact of genotype caller and reference genome selection on downstream analyses. Based on current evidence, we recommend selecting a closely related reference genome and using ATLAS for SNP calling to improve the accuracy and reliability of population genomic studies.
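
To make the link between SNP calls and nucleotide diversity concrete, the following is a minimal sketch of how π can be estimated per site from biallelic allele counts, using the standard unbiased pairwise-difference estimator. It is not the pipeline used in this study: the function name `nucleotide_diversity`, its parameters, and the toy allele counts are all hypothetical, and the sketch assumes that every callable site without a listed SNP is invariant.

```python
from typing import Sequence, Tuple


def nucleotide_diversity(
    allele_counts: Sequence[Tuple[int, int]], n_sites: int
) -> float:
    """Estimate per-site nucleotide diversity (pi) for one population.

    allele_counts: one (ref_count, alt_count) pair per biallelic SNP,
        counted over the sampled chromosomes (hypothetical input format).
    n_sites: number of callable sites in the region, variant and invariant.
    """
    if n_sites <= 0:
        raise ValueError("n_sites must be positive")

    total = 0.0
    for ref, alt in allele_counts:
        n = ref + alt
        if n < 2:
            # A site needs at least two chromosomes for a pairwise comparison.
            continue
        # Expected fraction of chromosome pairs that differ at this site.
        total += 2.0 * ref * alt / (n * (n - 1))

    # Invariant sites contribute zero, so divide by all callable sites.
    return total / n_sites


# Toy illustration only: made-up counts for 4 sampled chromosomes.
fewer_snps = [(3, 1)]
more_snps = [(3, 1), (2, 2)]
pi_fewer = nucleotide_diversity(fewer_snps, n_sites=10)
pi_more = nucleotide_diversity(more_snps, n_sites=10)
```

With the number of callable sites held fixed, a call set that contains additional SNPs can only raise this estimate, which is one simple way a caller that reports more variants can also yield a higher π.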