Population diversity
The two isolates that make up the New Zealand population were omitted from further population genetic analyses. Because linkage disequilibrium (LD) is expected to decay rapidly in populations with a large effective population size and a high recombination rate, we calculated LD within non-overlapping 5 kb windows for the European and the Japanese populations separately, using the --geno-r2 command in VCFtools (Danecek et al. 2011). The mean r² values for each distance between loci were plotted in R to visualize LD decay. Given that LD decayed rapidly in the Japanese population, a genomic window of 10 kb was chosen as a compromise between LD decay and SNP density for the analyses of genome-wide diversity within and between Japan and Europe. Nucleotide diversity π (Nei & Li 1979) was calculated for both populations using VCFtools. Genetic differentiation between populations was estimated by calculating FST (Hudson et al. 1992), and divergence between populations by calculating the number of nucleotide substitutions per site (DXY), using the Python script from Simon Martin (https://github.com/simonhmartin). VCFtools was used to estimate Tajima's D (Tajima 1989) across the 10 kb windows in order to detect departures from the standard neutral model. Tajima's D was estimated genome-wide and for specific genomic windows of interest. When estimated for a single locus, positive values of Tajima's D indicate balancing selection, while negative values indicate directional selection. In contrast, the genome-wide distribution of Tajima's D can give insight into demographic events: negative values indicate population expansion, whereas positive values indicate population contraction. Manhattan plots were created using the R package qqman (Turner 2014) to visualize FST, DXY, π and Tajima's D along the whole genome and for scaffolds of interest.
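To make the sign of Tajima's D concrete, the following is a minimal Python sketch of the statistic for a single window. It is an illustration only and not the VCFtools implementation used for the analyses. It assumes haploid genotypes coded as 0/1 at biallelic sites with no missing data; the function name `tajimas_d` and the input layout are hypothetical.

```python
from math import sqrt
from typing import Optional, Sequence


def tajimas_d(haplotypes: Sequence[Sequence[int]]) -> Optional[float]:
    """Tajima's D (Tajima 1989) for one window.

    haplotypes: one row per sampled sequence, one column per site,
    coded 0 (reference allele) or 1 (alternate allele).
    Assumes biallelic sites and no missing data.
    Returns None when D is undefined.
    """
    n = len(haplotypes)
    if n < 4:
        return None  # the variance term is zero for fewer than four sequences
    n_sites = len(haplotypes[0])

    # Alternate-allele count at each site.
    counts = [sum(h[j] for h in haplotypes) for j in range(n_sites)]
    segregating = [k for k in counts if 0 < k < n]
    s = len(segregating)
    if s == 0:
        return None  # no polymorphism in this window

    # Mean number of pairwise differences, summed over sites.
    n_pairs = n * (n - 1) / 2
    pi = sum(k * (n - k) for k in segregating) / n_pairs

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * s + e2 * s * (s - 1)
    if variance <= 0:
        return None
    # Difference between the pairwise estimate and Watterson's estimate.
    return (pi - s / a1) / sqrt(variance)


# Toy windows: four sequences, three segregating sites each.
rare_variants = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 0]]    # all singletons
common_variants = [[1, 1, 1], [1, 1, 1], [0, 0, 0], [0, 0, 0]]  # all at 50%

print(tajimas_d(rare_variants))    # negative: excess of rare alleles
print(tajimas_d(common_variants))  # positive: excess of intermediate-frequency alleles
```

In the toy data, a window made up only of singletons yields a negative D, whereas a window in which every variant is at intermediate frequency yields a positive D, matching the interpretation given above.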