2.2. Occurrence data
We used two sets of occurrence data, one at the species level and one at the genetic-cluster level. Occurrence records of M. geocarpum were compiled from our own collection sites and from the Global Biodiversity Information Facility (GBIF, www.gbif.org), a publicly available online database. Both the original population location points and the GBIF records were used in all subsequent analyses. Because combining different data sources into a large dataset (>500 occurrence records) is likely to introduce geographical or environmental sampling bias (Boria et al. 2014; Peterson et al. 2011), we reduced the number of records in Wallace, an R-based interactive workspace (Kass et al. 2020), using four complementary steps: 1) we removed occurrences collected before 1986 to match the period covered by the environmental layers and soil properties; 2) because GBIF records may be ambiguous as to species identity owing to synonymous names (M. geocarpum var. geocarpum, M. geocarpum var. tisserantii; Kerstingiella geocarpa, Kerstingiella tisserantii), we searched the online databases using the keywords Macrotyloma geocarpum, Kerstingiella geocarpa, var. geocarpa or var. geocarpum, and orphan legumes, then harmonized the GBIF records and discarded those reported as var. tisserantii or Kerstingiella tisserantii; 3) we spatially thinned occurrences located ≤ 10 km from other occurrences using the R package spThin (Aiello-Lammens et al. 2015); and 4) we manually checked isolated location points in Africa (in ArcGIS ver. 10.7.1) and removed occurrences in areas where M. geocarpum is not generally grown.
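Steps 1 and 3 can be illustrated with a minimal Python sketch. It is not the Wallace or spThin workflow used here: spThin applies a randomized thinning procedure repeated over several runs, whereas the sketch below uses a simple greedy pass that keeps a record only if it lies more than 10 km from every record already kept. The record fields (`id`, `lat`, `lon`, `year`), the function names, and the example coordinates are hypothetical and serve only to show the logic.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def filter_by_year(records, min_year=1986):
    """Step 1: drop records collected before min_year or lacking a year."""
    return [r for r in records if r.get("year") is not None and r["year"] >= min_year]


def thin_records(records, min_dist_km=10.0):
    """Step 3 (greedy stand-in for spThin): keep a record only if it is
    more than min_dist_km away from every record already kept."""
    kept = []
    for rec in records:
        if all(
            haversine_km(rec["lat"], rec["lon"], k["lat"], k["lon"]) > min_dist_km
            for k in kept
        ):
            kept.append(rec)
    return kept


# Hypothetical records, for illustration only (not data from this study).
records = [
    {"id": "A", "lat": 9.30, "lon": 2.30, "year": 1990},
    {"id": "B", "lat": 9.33, "lon": 2.32, "year": 2005},  # about 4 km from A
    {"id": "C", "lat": 10.50, "lon": 1.00, "year": 1975},  # collected before 1986
    {"id": "D", "lat": 7.00, "lon": 2.00, "year": 2010},
]

recent = filter_by_year(records)
thinned = thin_records(recent)
print([r["id"] for r in thinned])
```

A greedy pass depends on the order of the input records, which is one reason a randomized, repeated procedure such as that of spThin is preferable when the aim is to retain as many records as possible.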
The records of the defined genetic clusters, with their geographic coordinates, were filtered separately so that the retained points reflected the actual distribution of each population within agroclimatic zones. The filtered dataset comprised 64 occurrences in total (Pop1 = 22; Pop2 = 25; GBIF = 17), which were used in subsequent analyses (Table S1).