Generalized linear models
To assess the relative importance of biotic variables (flower and host specialization and distribution of the interacting species) and bee body size (which relates to foraging range) for explaining wild bee occurrence in the Netherlands, we developed a series of Generalized Linear Models (GLMs) (fig. 1). For the flower visiting bees, we used the plant observation data, and we calculated the amount of grid cells occupied by each visited plant species. Secondly, we computed a measure of flower specialization, calculated as the diversity of genera visited in the interaction database for every flower visiting bee, using the Shannon-Wiener index (Shannon 1948). Thirdly, we used the information on body size from the bee trait database. The body size was measured using the intertegular distance (ITD, in mm) as a proxy, which is the distance between the wing insertion points (Greenleaf et al., 2007). The cleptoparasitic bees were modelled in a similar way, except that the distribution of the interacting species was calculated from observation data from potential host bees (table S2 in supporting information) and the host specialization was the number of potential hosts in the literature (Peeters & Nieuwenhuijsen 2012). In both cases, the explanatory variables were standardized, centred, and a gamma distribution with an inverse link function was used. The gamma distribution is applicable for situations in which we want to speculate about the response variable without certainty about its distribution (Faraway 2016) and for ecological data with non-zero values (Foster & Bravington 2013). Model performance was quantified with the AICc and the R2, which represents the proportion of the variance in the dependent variable that the model explains. Modelled species that did not find a contribution (e.g. no features of the respective variable present in the model) of their interacting species to their distribution were left out of the analysis. The three explanatory variables resulted in eight possible combinations of variables and we evaluated the models using the AICc as described in Hurvich & Tsai 1989. The GLMs were developed in the stats package in base R version 4.0.3 (R Core Team 2020).