Tests for population structure and ancestry
We used K-means clustering and Discriminant Analysis of Principal
Components (DAPC; Jombart, Devillard, & Balloux, 2010) to test for
underlying population structure in P. subaeruginosa across Australia
and the northern hemisphere. In R (R Core Team, 2014), we imported an
LD-corrected VCF file with vcfR (Knaus & Grünwald, 2017), identified
genetic clusters by K-means clustering and DAPC with adegenet
(Jombart, 2008), and plotted the results with ggplot2.
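
The clustering step can be illustrated with a minimal sketch. It is
not the adegenet implementation used in this study: it is a NumPy
stand-in, written under the assumption that genotypes are coded as
alternate-allele counts (0, 1, 2) with no missing data. It covers only
K-means on PCA scores and omits the discriminant analysis step of
DAPC. The function names (pca_scores, kmeans, bic_by_k) and the
BIC-style score n*log(WSS/n) + K*log(n) are illustrative assumptions.

```python
import numpy as np

def pca_scores(genotypes, n_pcs):
    """Project individuals onto the first n_pcs principal components."""
    x = np.asarray(genotypes, dtype=float)
    x = x - x.mean(axis=0)                      # centre each SNP
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    return u[:, :n_pcs] * s[:n_pcs]

def kmeans(points, k, n_starts=10, n_iter=100, seed=0):
    """Lloyd's K-means with random restarts; returns (labels, WSS)."""
    rng = np.random.default_rng(seed)
    best_labels, best_wss = None, np.inf
    for start in range(n_starts):
        # Initialise centres on k distinct individuals.
        centers = points[rng.choice(len(points), size=k, replace=False)]
        for step in range(n_iter):
            d = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            labels = d.argmin(axis=1)
            new_centers = np.array([
                points[labels == c].mean(axis=0) if np.any(labels == c)
                else centers[c]                 # keep an empty cluster's centre
                for c in range(k)
            ])
            if np.allclose(new_centers, centers):
                break
            centers = new_centers
        # Within-cluster sum of squares for the final centres.
        d = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        wss = d.min(axis=1).sum()
        if wss < best_wss:
            best_labels, best_wss = labels, wss
    return best_labels, best_wss

def bic_by_k(points, max_k):
    """BIC-style score for K = 1..max_k (assumed form, see lead-in)."""
    n = len(points)
    scores = {}
    for k in range(1, max_k + 1):
        _, wss = kmeans(points, k)
        scores[k] = n * np.log(max(wss, 1e-12) / n) + k * np.log(n)
    return scores
```

Lower scores indicate better-supported values of K. Because the score
can keep decreasing as K grows, it is usually read as a curve (for
example, by looking for an elbow) rather than by taking the minimum
alone.
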
We estimated pairwise relatedness with the --relatedness option in
vcftools v0.1.17 (Danecek et al., 2011). This option calculates the
unadjusted Ajk statistic of Yang et al. (2010), which measures the
pairwise similarity of genetic markers between individuals that is
due to shared genetic ancestry. We plotted the relatedness values as
a pairwise heat map using ggplot2 in R (R Core Team, 2014).
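
For reference, the Ajk statistic of Yang et al. (2010) can be sketched
in a few lines. This is an illustration rather than the vcftools code:
it assumes a complete genotype matrix of alternate-allele counts
(0, 1, 2), allele frequencies estimated from the same sample, and
monomorphic sites dropped. The function name ajk_matrix is
hypothetical.

```python
import numpy as np

def ajk_matrix(genotypes):
    """Unadjusted Ajk relatedness from an (individuals x SNPs) matrix."""
    x = np.asarray(genotypes, dtype=float)
    p = x.mean(axis=0) / 2.0                 # sample allele frequency per SNP
    keep = (p > 0) & (p < 1)                 # drop monomorphic sites (zero variance)
    x, p = x[:, keep], p[keep]
    n_snps = x.shape[1]
    het = 2.0 * p * (1.0 - p)                # expected genotype variance per SNP

    # Off-diagonal: mean over SNPs of (x_ij - 2p_i)(x_ik - 2p_i) / (2p_i(1 - p_i))
    z = x - 2.0 * p
    a = (z / het) @ z.T / n_snps

    # Diagonal: 1 + mean over SNPs of (x_ij^2 - (1 + 2p_i)x_ij + 2p_i^2) / (2p_i(1 - p_i))
    diag = 1.0 + ((x ** 2 - (1.0 + 2.0 * p) * x + 2.0 * p ** 2) / het).sum(axis=1) / n_snps
    np.fill_diagonal(a, diag)
    return a
```

The returned matrix is symmetric, with one row and one column per
individual, so it can be reshaped to long format and passed directly
to a heat-map plotting routine.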