Data assembly: Infection data, niche modelling, phylogenies
We assembled infection data through a survey of peer-reviewed
literature. This survey resulted in an updated version (Supporting
Information) of the list published by Cruz-Laufer et al. (2021a).
For abundance weighting in downstream analyses, we also assembled
infection parameters including the number of examined hosts, infected
hosts, and parasites. If no infection parameters were reported, we
treated the report as a single infected specimen.
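The fallback rule for reports without infection parameters can be sketched as follows. This is a minimal Python illustration, not the workflow used in the study; the record layout (`InfectionReport`) and the function name `apply_default` are hypothetical. The sketch assumes that "a single infected specimen" means setting the number of infected hosts to 1 while leaving the other counts unknown, and that reports with only some parameters are left unchanged.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class InfectionReport:
    """One host-parasite report from the literature (hypothetical layout)."""
    host: str
    parasite: str
    hosts_examined: Optional[int] = None
    hosts_infected: Optional[int] = None
    parasites_counted: Optional[int] = None


def apply_default(report: InfectionReport) -> InfectionReport:
    """Treat a report without any infection parameters as one infected specimen."""
    nothing_reported = (
        report.hosts_examined is None
        and report.hosts_infected is None
        and report.parasites_counted is None
    )
    if nothing_reported:
        # Assumption: only the infected-host count is set; other counts stay unknown.
        return replace(report, hosts_infected=1)
    return report


# Placeholder names, not records from the data set.
bare = apply_default(InfectionReport("host A", "parasite X"))
full = apply_default(InfectionReport("host B", "parasite Y", 10, 3, 12))
```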
We built host niche dendrograms based on ecological, geographical, and
morphological data (Table 1) available in FishBase (Froese & Pauly
2000) and accessed through the R package rfishbase (Boettiger et al. 2012). Missing trophic level and habitat data
were added through a literature survey (see Supporting Information).
Dendrograms were built through hierarchical clustering in R (Pavoine et al. 2009) based on a Gower’s distance matrix (Gower 1971).
Gower’s distances were calculated using the function dist.ktab in
the R package ade4 v1.7.16 (Pavoine et al. 2009).
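To illustrate how a Gower's distance combines variables of different types, the following Python sketch implements the classic Gower (1971) dissimilarity: numeric variables contribute range-scaled absolute differences, categorical variables contribute a simple mismatch, and missing values are skipped. It does not reproduce the options of dist.ktab in ade4. The function name `gower_matrix` and the example values are hypothetical and not taken from Table 1.

```python
from typing import List, Sequence


def gower_matrix(rows: Sequence[Sequence], numeric: Sequence[bool]) -> List[List[float]]:
    """Pairwise Gower dissimilarities; None marks a missing value."""
    n_vars = len(numeric)
    # Range of each numeric variable over its non-missing values.
    ranges = []
    for k in range(n_vars):
        if numeric[k]:
            vals = [r[k] for r in rows if r[k] is not None]
            ranges.append(max(vals) - min(vals) if vals else 0.0)
        else:
            ranges.append(None)

    n = len(rows)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            total, used = 0.0, 0
            for k in range(n_vars):
                a, b = rows[i][k], rows[j][k]
                if a is None or b is None:
                    continue  # variable not comparable for this pair
                if numeric[k]:
                    d = abs(a - b) / ranges[k] if ranges[k] else 0.0
                else:
                    d = 0.0 if a == b else 1.0
                total += d
                used += 1
            # Average over the variables available for this pair.
            dist[i][j] = dist[j][i] = total / used if used else float("nan")
    return dist


# Hypothetical hosts: trophic level, habitat, maximum length (cm).
hosts = [
    [2.0, "benthopelagic", 60.0],
    [3.0, "benthopelagic", 20.0],
    [4.0, "pelagic", None],
]
d = gower_matrix(hosts, numeric=[True, False, True])
```

Averaging only over the variables available for each pair is what allows incomplete trait records to be kept in the matrix rather than dropped.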
As suggested by Clark & Clegg (2017), we accounted for uncertainty in
the host niche by applying a range of clustering algorithms
available in the hclust function in R (incl. ward.D2, single, complete, average, mcquitty, median, and centroid) (R Core Team 2021). We
tested for topological congruence of the resulting dendrograms using the
congruence among distance matrices (CADM) test (Legendre & Lapointe
2004; Campbell et al. 2011) in the R package ape v5.4 (Paradis & Schliep 2019).
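The statistic underlying the CADM test is Kendall's coefficient of concordance (W), computed on the ranked off-diagonal entries of the distance matrices. The Python sketch below computes W with the standard tie correction as an illustration only; it omits the permutation procedure that CADM uses to assess significance and is not the ape implementation. The function names and the toy matrices are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def upper_triangle(matrix: Sequence[Sequence[float]]) -> List[float]:
    """Unfold the upper triangle of a square distance matrix into a vector."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def kendall_w(matrices: Sequence[Sequence[Sequence[float]]]) -> float:
    """Kendall's W across distance matrices defined on the same objects."""
    vectors = [upper_triangle(m) for m in matrices]
    ranked = [average_ranks(v) for v in vectors]
    m, n = len(ranked), len(ranked[0])
    rank_sums = [sum(r[i] for r in ranked) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((x - mean_sum) ** 2 for x in rank_sums)
    # Tie correction: sum of (t^3 - t) over groups of tied values in each matrix.
    ties = sum(t ** 3 - t for v in vectors for t in Counter(v).values())
    return 12 * s / (m ** 2 * (n ** 3 - n) - m * ties)


# Toy 3 x 3 matrices: 'a' and 'b' order the pairs identically, 'c' reverses them.
a = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
b = [[0, 2, 4], [2, 0, 6], [4, 6, 0]]
c = [[0, 3, 2], [3, 0, 1], [2, 1, 0]]
w_same = kendall_w([a, b])
w_opposite = kendall_w([a, c])
```

W ranges from 0 (no concordance) to 1 (identical rank order of distances), so values near 1 across clustering algorithms would indicate congruent dendrograms.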
As no previous phylogenetic study on fishes covers all the species known
to host members of Cichlidogyrus, we conducted a new analysis
(see Appendix S1.1) based on DNA sequence data accessed on GenBank
(Appendix S2) to infer phylogenetic distances between hosts. For the
parasites, we included morphometric and phylogenetic data from
Cruz-Laufer et al. (2021b), i.e. morphological measurements and
100 randomly sampled Bayesian tree topologies from the post-burn-in
fraction.
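Drawing a random subset of trees from the post-burn-in fraction can be sketched in Python as follows. The burn-in proportion is not stated here, so `burnin_fraction` is a hypothetical parameter, as are the function name and the placeholder tree labels; the sketch also assumes sampling without replacement.

```python
import random
from typing import List, Optional, Sequence


def sample_post_burnin(
    trees: Sequence[str],
    burnin_fraction: float,
    n_samples: int,
    seed: Optional[int] = None,
) -> List[str]:
    """Discard the burn-in and draw trees at random without replacement."""
    if not 0 <= burnin_fraction < 1:
        raise ValueError("burnin_fraction must be in [0, 1)")
    start = int(len(trees) * burnin_fraction)
    kept = list(trees[start:])
    if n_samples > len(kept):
        raise ValueError("not enough post-burn-in trees to sample from")
    rng = random.Random(seed)  # fixed seed makes the draw reproducible
    return rng.sample(kept, n_samples)


# Placeholder labels standing in for the trees of a posterior sample.
posterior = [f"tree_{i}" for i in range(1000)]
subset = sample_post_burnin(posterior, burnin_fraction=0.25, n_samples=100, seed=1)
```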