b1. Frequency-dependent selection: is there a rare allele advantage?
If MHC alleles are under negative frequency-dependent selection, then on average rare alleles must confer an advantage (in the form of lower parasite infection) compared to common alleles. In contrast, high-frequency MHC alleles would be less effective at detecting parasites, because parasites would be evolving strategies to evade detection by those alleles. We therefore expect a positive relationship between the frequency of an allele and its overall effect on parasite infection, where a negative effect denotes protection and a positive effect suggests susceptibility or survivor’s bias.
Because MHC II genes are scattered across multiple loci in fish genomes (Kaufman, 2018), it is difficult to use allele frequency to estimate the abundance of an MHC allele in a population. We used allele prevalence instead, calculated as the percentage of individuals in a population carrying a focal allele. Note that allele prevalence has different properties from allele frequency; for example, the prevalence values of all alleles in a population do not sum to 1. Similarly, we calculated parasite prevalence as the percentage of individuals in a population infected by a focal parasite. Alleles and parasites that are too rare (<0.05 prevalence) or too common (>0.95 prevalence) do not provide sufficient variance to estimate effects.
For every moderately prevalent MHC-parasite combination (both prevalence values between 0.05 and 0.95) at every sampling site, we used a generalized linear model with a negative binomial error distribution to test whether the presence/absence of the focal MHC allele influenced the infection intensity of the focal parasite in that stickleback population. We corrected p values for multiple comparisons with the Benjamini-Hochberg (BH) method. The Z value of each regression model indicated the direction and strength of the effect of the focal MHC allele on infection by the focal parasite. We excluded models that were disproportionately influenced by extreme values, i.e., models whose absolute Z value changed by more than 0.5 when the largest data point was excluded. After iterating this procedure over all qualifying MHC-parasite combinations, we used a linear mixed-effects model to examine whether the estimated effect sizes (Z) of MHC alleles were influenced by local allele prevalence. In this model, we treated both sampling site and focal parasite as random effects.
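The prevalence calculation, the 0.05 to 0.95 filter, and the BH correction described above can be illustrated with a minimal Python sketch. It is not the authors' code: the software used in the study is not stated here, and the function names (`prevalence`, `moderately_prevalent`, `bh_adjust`) and the toy data are hypothetical. The sketch assumes each individual is represented as the set of alleles (or parasites) it carries, and it omits the negative binomial and mixed-effects model fits.

```python
from typing import Dict, List, Set


def prevalence(carriers: List[Set[str]]) -> Dict[str, float]:
    """Fraction of individuals carrying each item (an allele or a parasite).

    Each element of `carriers` is the set of items found in one individual.
    """
    n = len(carriers)
    counts: Dict[str, int] = {}
    for items in carriers:
        for item in items:
            counts[item] = counts.get(item, 0) + 1
    return {item: c / n for item, c in counts.items()}


def moderately_prevalent(prev: Dict[str, float],
                         lo: float = 0.05,
                         hi: float = 0.95) -> Dict[str, float]:
    """Keep only items whose prevalence lies between lo and hi (inclusive)."""
    return {k: v for k, v in prev.items() if lo <= v <= hi}


def bh_adjust(pvalues: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy population (hypothetical): four fish and the alleles each one carries.
fish_alleles = [{"A", "B"}, {"A"}, {"A", "C"}, {"A"}]
allele_prev = prevalence(fish_alleles)
usable = moderately_prevalent(allele_prev)
```

In this toy population the prevalence values sum to 1.5 rather than 1, which shows how prevalence differs from allele frequency, and allele `A` (carried by every fish) is dropped by the filter because it offers no presence/absence contrast.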