Population differentiation

We used methods based on the site frequency spectrum (SFS) (Korneliussen et al., 2013; Nielsen et al., 2012) to obtain the genome-wide fixation index (FST) values in ANGSD (Korneliussen et al., 2014). We calculated FST for the populations (six) containing at least 10 individuals with each geographical region represented by two locations (Figure 2). First, we generated unfolded SAF files (angsd -bam bamList -doSaf 1 -anc ANC -GL 1), then we estimated the folded SFS for each pair of populations (realSFS safidx1 safidx2 -fold 1). Each joint folded SFS was then used to estimate FST (-whichFst 1 -fold 1). To detect genomic windows of high differentiation in each region, we estimated the population branch statistic (Yi et al., 2010) for non-overlapping windows of 50 kb in ANGSD using this same approach with three populations with 10 individuals each (Table S1). The individuals were chosen from those that had 100% assignment to one of the three ancestral populations defined by NGSadmix in preliminary analysis using all genomic positions that passed the filters described above. Genomic positions within windows with PBS values below the 90th percentile (putatively neutral) were used in all analyses presented. 
Maritime geographical pairwise distances (https://sea-distances.org/) were calculated using the seaport nearest to the sampling location. Average distances were considered for merged populations. A Mantel test implemented in ade4 (Dray & Dufour, 2007) was used to test the statistical significance of the correlation between the geographic and the genetic distance matrices.
We assessed the gene content of the top outlier PBS windows for each region by running tblastn (Camacho et al., 2008) (BLAST version 2.6.0+) of the zebra fish proteome (ENSEMBL version GRCz11_pep) against the reference genome using the option “-evalue 0.000001”. Phenotypes associated with each gene were extracted from ENSEMBL using Biomart (Smedley et al., 2009).

Recombination rate

Variants were called using GATK version 4.0.7.0 (DePristo et al., 2011) for one representative individual per region (Table S1). Briefly, variants were first called for each individual with HaplotypeCaller in BP-RESOLUTION mode, then combining those GVCF files for each sample into a single one using CombineGVCFs per scaffold of interest, and finally joint genotyping with GenotypeGVCF. The default filter of GATK (--phred-scaled-global-read-mismapping-rate 45; --base-quality-score-threshold 18; --min-base-quality-score 10) was used. Recombination rates for 100 kb non-overlapping windows along the genome were estimated using the iSMC approach from (Barroso et al., 2019). We fitted an iSMC model with 40-time intervals and five categories of recombination rates to the samples from each population and optimized parameters in composite likelihood fashion (Barroso & Dutheil, 2021). We then obtained recombination landscapes of single-nucleotide resolution by performing posterior decoding in each diploid using the estimated parameters, and computed a consensus map for each sample by averaging over (for each site) the posterior estimates of rho = 4*Ne*r from all diploids. The final map of 100 kb resolution was obtained by further averaging the single-nucleotide estimates over 100 kb in non-overlapping windows.