2.5. Data analysis
The 20 qualitative traits were classified, and a value was assigned to each class in accordance with the survey results. The distribution frequency of each class was then calculated, and the Shannon diversity index (I) was computed from these frequencies as follows:
\(I=-\sum_{i=1}^{n} p_i \ln p_i\),
where \(p_i\) represents the relative frequency of the ith phenotypic class of a trait and n is the number of classes (Kouam et al., 2018). The maximum, minimum, average, standard deviation (SD), and coefficient of variation (CV) of the six quantitative traits were calculated using SPSS 25.0 software. Then, in accordance with the overall average (\(\bar{x}\)) and SD (\(\sigma\)), the quantitative trait data were divided into 10 levels, from the first level [\(X_i < \bar{x} - 2\sigma\)] to the 10th level [\(X_i > \bar{x} + 2\sigma\)], with each intermediate level spanning \(0.5\sigma\). Principal component (PC) analysis was carried out on the 26 phenotypic indices using SPSS 25.0 software. In accordance with the phenotypic trait survey data, a binary (1, 0) matrix was constructed, in which registration at the ith level of a trait was scored as 1 and all other levels as 0.
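The grading and diversity calculations described above can be illustrated with a minimal Python sketch. It is not the SPSS procedure used in this study; the function names (`grade_level`, `one_hot`, `shannon_index`) are hypothetical, the sample SD (n - 1 denominator) is assumed, and a value falling exactly on \(\bar{x} + 2\sigma\) is assigned to level 9 because level 10 is defined by a strict inequality.

```python
import math
from collections import Counter
from statistics import mean, stdev


def grade_level(x, x_bar, sd):
    """Map a quantitative value to one of 10 levels (assumes sd > 0).

    Level 1:  x < x_bar - 2*sd
    Level 10: x > x_bar + 2*sd
    Levels 2-9: consecutive bands of width 0.5*sd in between.
    """
    lower = x_bar - 2 * sd
    if x < lower:
        return 1
    if x > x_bar + 2 * sd:
        return 10
    band = int((x - lower) // (0.5 * sd))
    return 2 + min(band, 7)  # the upper boundary itself stays in level 9


def one_hot(level, n_levels=10):
    """Binary (1, 0) row: 1 at the registered level, 0 elsewhere."""
    return [1 if k == level else 0 for k in range(1, n_levels + 1)]


def shannon_index(classes):
    """Shannon diversity index I = -sum(p_i * ln(p_i)) over observed classes."""
    counts = Counter(classes)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def trait_diversity(values):
    """Grade every value of one quantitative trait, then compute I."""
    x_bar, sd = mean(values), stdev(values)
    levels = [grade_level(v, x_bar, sd) for v in values]
    return levels, shannon_index(levels)
```

For a qualitative trait, `shannon_index` can be applied directly to the list of assigned class values; for a quantitative trait, `trait_diversity` first converts the measurements to levels.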
The bands of SRAP and SSR markers were scored for each primer as present (1) or absent (0) at each locus, and the resulting binary matrix was statistically analyzed. The allele number (Na), effective number of alleles (Ne), allele frequency, Nei’s (1973) gene diversity index (H), and I of each primer were calculated using POPGENE software version 1.32 (Yeh et al., 1999). The genetic similarity coefficient was evaluated and the principal coordinate analysis (PCoA) was conducted using NTSYS-pc software version 2.10e (Rohlf, 2000). Cluster analysis of the phenotypic traits and molecular markers was performed by the unweighted pair-group method with arithmetic means (UPGMA) using MEGA software version 4.1 (Tamura et al., 2007).
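The per-locus statistics named above follow standard definitions, which the sketch below illustrates for a single locus: Ne as \(1/\sum p_i^2\), Nei's H as \(1-\sum p_i^2\), and I as \(-\sum p_i \ln p_i\). This is a simplified illustration and not a reproduction of POPGENE or NTSYS-pc. In particular, how allele frequencies are estimated from dominant band data is not stated here, so the functions take frequencies as given, and the Jaccard coefficient is an assumption because the text does not name the similarity coefficient used. All function names are hypothetical.

```python
import math


def band_frequency(column):
    """Fraction of samples showing a band (1) at one scored locus."""
    return sum(column) / len(column)


def locus_stats(freqs):
    """Return (Ne, H, I) for one locus from allele frequencies summing to 1.

    Ne = 1 / sum(p_i^2)        effective number of alleles
    H  = 1 - sum(p_i^2)        Nei's gene diversity
    I  = -sum(p_i * ln(p_i))   Shannon information index
    """
    homozygosity = sum(p * p for p in freqs)
    ne = 1.0 / homozygosity
    h = 1.0 - homozygosity
    i = -sum(p * math.log(p) for p in freqs if p > 0)
    return ne, h, i


def jaccard_similarity(a, b):
    """Jaccard similarity of two 0/1 band profiles (assumed coefficient)."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    # Undefined when neither profile has a band; 0.0 is this sketch's convention.
    return shared / either if either else 0.0
```

A pairwise matrix built with `jaccard_similarity` is the kind of input on which PCoA and UPGMA clustering operate.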
The combined SRAP and SSR data were analyzed with the Bayesian model-based approach implemented in STRUCTURE software version 2.3.1 to infer population structure (Pritchard et al., 2000). K (number of clusters) was tested in the range of 2–10, and the software was run three times to determine this value. STRUCTURE HARVESTER (Earl and vonHoldt, 2012), which identifies the best K on the basis of the probability of the data given K and ΔK (Evanno et al., 2005), was used to estimate the most likely number of clusters (K).
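The ΔK criterion can be sketched as follows, as an illustration of the idea rather than the STRUCTURE HARVESTER implementation. For each K, the absolute second-order change in the mean log probability, \(|L(K+1) - 2L(K) + L(K-1)|\), is divided by the SD of \(L(K)\) across replicate runs. The function name `delta_k` and the input layout (a dictionary mapping K to the list of log probabilities from replicate runs) are assumptions.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Compute Evanno's delta K for every K that has both neighbours.

    lnp_by_k: dict mapping K -> list of ln P(D|K) values, one per replicate run.
    Returns a dict mapping K -> delta K. The smallest and largest K are
    skipped because the second-order difference is undefined there.
    """
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            continue  # delta K is undefined when replicate runs are identical
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_diff / sd
    return result


def best_k(lnp_by_k):
    """K with the largest delta K."""
    scores = delta_k(lnp_by_k)
    return max(scores, key=scores.get)
```

With K tested from 2 to 10, ΔK is available only for K = 3 to 9, which is a property of the method itself.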