2.4 Statistical Data Analysis
All 33 SSR loci were tested for deviation from Hardy-Weinberg equilibrium and for the presence of null alleles using the heterozygote deficiency method in Genepop v4.7 (Brookfield, 1996). Linkage disequilibrium between pairs of loci was assessed by testing for genotypic linkage disequilibrium. Genetic diversity within each population was assessed by calculating parameters including allelic richness (A), mean observed heterozygosity (HO), percentage of polymorphic loci (P), and unbiased expected heterozygosity (HE). All genetic diversity estimates were calculated using GenAlEx 6.5.01 (Peakall and Smouse, 2006). Further, fixation index (FST) values were calculated using FSTAT version 2.9.3 and Arlequin 3.5 (Excoffier and Lischer, 2010) to investigate differentiation among species/populations. An analysis of molecular variance (AMOVA) was performed in Arlequin 3.5 to determine the proportion of genetic variance explained by differences within and among species/populations. Furthermore, the model‐based program STRUCTURE 2.3.4 (Pritchard et al., 2000) was used with default parameters to infer the number of distinct genetic clusters and to assign individuals to a specific genetic cluster. The program was executed with 10 independent runs for each value of K from 1 to 10, each with 1,000,000 Markov chain Monte Carlo replications following a burn-in period of 100,000 iterations. The admixture ancestry model and the correlated allele frequency model were used for all runs. The STRUCTURE HARVESTER online application was used to determine the most likely number of genetic clusters (K) (Evanno et al., 2005; Verkuil et al., 2012). Clustering patterns and population structure were inferred across all K values using the web tool CLUMPAK (Jakobsson and Rosenberg, 2007; Kopelman et al., 2015).
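To illustrate how the number of clusters can be chosen from replicate STRUCTURE runs, the following is a minimal Python sketch of the ΔK statistic of Evanno et al. (2005) in one common formulation: the absolute second-order difference of the mean ln P(D) across successive K values, divided by the standard deviation of ln P(D) at K. It is not the STRUCTURE HARVESTER code. The function name, the use of the sample standard deviation, and all ln P(D) values are assumptions made for illustration only.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Return {K: deltaK} from replicate ln P(D) values.

    lnp_by_k maps each K to a list of ln P(D) values, one per replicate run.
    deltaK is undefined for the smallest and largest K tested, and for any K
    whose replicates have zero standard deviation, so those K are skipped.
    """
    mean_l = {k: mean(v) for k, v in lnp_by_k.items()}
    sd_l = {k: stdev(v) for k, v in lnp_by_k.items()}  # sample SD (assumption)

    delta_k = {}
    for k in sorted(lnp_by_k):
        if (k - 1) not in mean_l or (k + 1) not in mean_l:
            continue  # need both neighbours for the second-order difference
        if sd_l[k] == 0:
            continue  # avoid division by zero
        second_diff = abs(mean_l[k + 1] - 2 * mean_l[k] + mean_l[k - 1])
        delta_k[k] = second_diff / sd_l[k]
    return delta_k


# Hypothetical ln P(D) values (three replicates per K), not data from this study.
example_lnp = {
    1: [-5000.0, -5002.0, -4998.0],
    2: [-4500.0, -4503.0, -4497.0],
    3: [-4000.0, -4001.0, -3999.0],
    4: [-3990.0, -3995.0, -3985.0],
    5: [-3985.0, -3975.0, -3995.0],
}

delta_k = evanno_delta_k(example_lnp)
best_k = max(delta_k, key=delta_k.get)
print(delta_k, best_k)
```

In this made-up example the likelihood stops increasing sharply after K = 3, so ΔK peaks there. Because ΔK cannot be evaluated at the lowest K tested, it cannot by itself support K = 1.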
Both inter- and intra‐specific genetic structures of the different populations were assessed by multivariate principal component analysis (PCA) through the dudi.pca() function of the “adegenet” R package (Jombart and Ahmed, 2011). UPGMA clustering of all populations was performed based on Nei’s (1972) unbiased genetic distance using PowerMarker (Liu and Muse, 2005), and the resulting tree was visualized with TREEVIEW ver. 1.52. A Venn diagram was constructed from the number of private alleles identified in the cultivated (inbred, landraces, and feral rice), wild, and weedy types. MIGRATE v. 4.4.4 (Beerli, 2008) was used to estimate effective population size Ne (θ/4μ) and asymmetric gene flow M (m/μ) between pairs of the different Oryza groups found in Sri Lanka. The analyses were conducted using Bayesian inference under the structured coalescent model. First, two shorter runs (10 short chains of 10,000 sampled and 500 recorded, and three final chains of 100,000 sampled and 5,000 recorded) were performed. Then, a long final run (10 short chains of 10,000 sampled and 500 recorded, and three final chains of 500,000 sampled and 25,000 recorded) was performed, and the results from this final run were reported.
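The mutation-scaled parameters estimated by MIGRATE can be converted to demographic quantities using the relationships stated above, Ne = θ/(4μ) and m = Mμ, from which the number of immigrants per generation follows as Ne·m = θM/4. The Python sketch below shows this arithmetic only. The function names, the mutation rate μ, and the θ and M values are hypothetical and are not estimates from this study.

```python
def effective_population_size(theta, mu):
    """Ne from theta = 4 * Ne * mu (diploid nuclear loci)."""
    return theta / (4.0 * mu)


def migration_rate(m_scaled, mu):
    """Immigration rate m per generation from M = m / mu."""
    return m_scaled * mu


def migrants_per_generation(theta_recipient, m_scaled):
    """Ne * m for the recipient population; mu cancels: theta * M / 4."""
    return theta_recipient * m_scaled / 4.0


# Hypothetical inputs for illustration only.
mu = 1e-4      # assumed mutation rate per locus per generation
theta = 0.8    # assumed theta of the recipient population
m_scaled = 10  # assumed mutation-scaled immigration rate M

ne = effective_population_size(theta, mu)
m = migration_rate(m_scaled, mu)
nm = migrants_per_generation(theta, m_scaled)
print(ne, m, nm)
```

The number of immigrants per generation does not depend on μ, whereas Ne and m scale directly with the assumed mutation rate, so those two values are only as reliable as that assumption.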