Markov Random Field networks
The CRF models (with additional conditioning variables) clearly outperformed the MRF (with only virus occurrences included): the AUC was 0.69 for the MRF, whereas for the CRFs it varied between 0.87 and 0.89. Cross-validation showed no pronounced differences among the CRF variants, but all of them performed better than the MRF: the 50% quantile (median) of the mean proportion of correct predictions (true positives and true negatives combined) was 0.76 for the MRF, whereas the corresponding value for the CRF variants was around 0.91. The MRF predicted more false positives, whereas the CRFs predicted more false negatives. Mean values of the different performance measures are reported in Supplementary Results Table S1.
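As a reading aid for the performance measures mentioned above, the following minimal sketch shows how the proportion of correct predictions, sensitivity and specificity can be derived from observed and predicted binary occurrences. The function name and the toy data are hypothetical illustrations, not the code or data used in this study.

```python
def classification_summary(observed, predicted):
    """Summarise binary predictions (1 = virus present, 0 = absent).

    Hypothetical helper for illustration only; it is not the
    cross-validation routine used in the study.
    """
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)  # true positives
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)  # true negatives
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)  # false positives
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)  # false negatives

    return {
        # share of all observations predicted correctly (TP and TN together)
        "accuracy": (tp + tn) / len(pairs),
        # share of true presences that were predicted as present
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # share of true absences that were predicted as absent
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        "false_positives": fp,
        "false_negatives": fn,
    }


# Toy data (made up): eight host plants, one virus.
observed = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]
summary = classification_summary(observed, predicted)
print(summary)
```

A model that tends towards false positives loses specificity, whereas one that tends towards false negatives loses sensitivity, so the two error types reported for the MRF and the CRFs affect different measures.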
To understand how the network changed when conditioning variables were added, we compared the virus-virus associations inferred by the MRF and by the different CRF variants. The MRF revealed mostly positive associations between the viruses (Figure 4A). After including spatial, habitat- and host-related variables (Table 1), some of the associations between the viruses diminished or disappeared, and all of the conditional associations were positive (Figure 4B). The MRF model captured 50 significant virus co-occurrences (Figure 4A), whereas the CRFfull model captured 16 (Figure 4B). The CRFs incorporating subsets of conditioning variables identified intermediate numbers of associations: 30 for CRFhost, 38 for CRFspat, 28 for CRFhabitat, and 18 for CRFenv.
Although several associations could be explained interchangeably by habitat- or host-related variables, many were explained solely by one or the other (Figure 4C-D). For example, Bromoviridae showed a high number of associations with other viruses (Figure 4A), and these were not explained by host-related or spatial variables (Figure 4C and E). However, several of its strong associations with other viruses were explained by habitat-related effects (Figure 4D). The 11 association links captured by the CRFfull model were captured by all the other conditional and unconditional model variants as well (Figure 4B). We refer to these associations as ‘permanent’. In this network, Bromoviridae and especially Secoviridae appeared as hubs, with five association links to other viruses. These permanent associations represent direct interactions between viruses that could not be explained by indirect effects of the rest of the virus community or by any combination of the additional conditioning variables.
Next, to understand how host- and habitat-related variables and spatial configuration of the hosts influence virus community structure, we investigated the direct effects of the additional conditioning variables. All the significant direct effects of the environmental and spatial variables were for either Caulimoviridae or Geminiviridae (Table 2): e.g., increasing agricultural land use in the surrounding landscape increased the occurrence probability of Caulimoviridae, and host population size predicted higher occurrence probability for Geminiviridae.
None of the indirect effects of the additional conditioning variables reversed the direction of a direct virus-virus association from positive to negative or vice versa. However, all conditioning variables except host plant size and agricultural land use affected at least one viral association (Table 2). In terms of the number of virus-virus associations influenced, the most influential indirect effects were the spatial structure of the host populations (MEMs) and host population connectivity. All the effects of the first, coarse-scale spatial variable (MEM1) were positive, whereas the effects of the spatial variables at increasingly finer scales (MEM2-4) were all negative. Increasing connectivity of the host population had both negative and positive effects on the virus-virus associations: for example, higher connectivity lowered the occurrence probability of Avsunoviridae in the presence of Bromoviridae (and vice versa, symmetrically). Altogether, there were eight significant herbivory-related effects, all of them positive (Table 2).
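To make the interpretation of such indirect effects concrete, the sketch below assumes a common CRF parameterization in which the log-odds of a focal virus depend on an intercept, a direct covariate effect, a virus-virus association, and a covariate-by-virus interaction that shifts that association. All coefficient values and names are hypothetical and chosen only for illustration; they are not estimates from Table 2.

```python
import math


def occurrence_probability(alpha, beta, phi, omega, x, y_partner):
    """Occurrence probability of a focal virus under an assumed CRF form.

    logit P(focal = 1) = alpha + beta * x + (phi + omega * x) * y_partner

    alpha     : intercept of the focal virus
    beta      : direct effect of the covariate x
    phi       : baseline association with the partner virus
    omega     : indirect effect, i.e. how x modifies the association
    x         : scaled covariate value (e.g. host population connectivity)
    y_partner : 1 if the partner virus is present, else 0
    """
    log_odds = alpha + beta * x + (phi + omega * x) * y_partner
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical coefficients: positive association (phi > 0) that is
# weakened as the covariate increases (omega < 0); no direct effect.
alpha, beta, phi, omega = -1.0, 0.0, 1.2, -0.8

# Partner virus present, low versus high covariate value.
p_low = occurrence_probability(alpha, beta, phi, omega, x=-1.0, y_partner=1)
p_high = occurrence_probability(alpha, beta, phi, omega, x=1.0, y_partner=1)

# Partner virus absent: the interaction term drops out entirely.
p_absent_low = occurrence_probability(alpha, beta, phi, omega, x=-1.0, y_partner=0)
p_absent_high = occurrence_probability(alpha, beta, phi, omega, x=1.0, y_partner=0)

print(round(p_low, 3), round(p_high, 3), round(p_absent_low, 3))
```

With these made-up values, a higher covariate value lowers the occurrence probability of the focal virus only when the partner virus is present, while the association itself stays positive (phi + omega * x > 0 at both covariate values). This mirrors the pattern described above, in which indirect effects modified associations without reversing their sign.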