Acquisition of BOLD datasets
Access was granted to COI barcoding data derived from a BOLD
screening project totaling 184,585 arthropod specimens collected in 21
countries between 2010 and 2014. COI sequences
provided by BOLD were generally derived from templates prepared from
somatic tissues (legs are often used so that most of the specimen is
retained for further analyses if necessary), but in rare cases were
derived from abdominal tissues. The first dataset made available included 3,817
sequences deemed contaminants, defined as sequences not matching the
initial morphotaxa assignment. The second dataset included 55,366
specimens judged not to contain non-target amplicons ([dataset]
Zakharov, Ratnasingham, deWaard & Smith, 2020). The remaining 125,402
specimens were not made available, and the 55,366-specimen subsample
was used as a representative sample of the specimens from which the
contaminants had originated (Figure 1).
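
As a quick consistency check on the counts reported above, the following minimal Python sketch confirms that the three subsets sum to the project total and derives the corresponding proportions. The variable names are illustrative only and are not part of the original analysis.

```python
# Specimen counts as reported in the text (variable names are illustrative).
total_specimens = 184_585   # all specimens in the BOLD screening project
contaminants = 3_817        # sequences not matching initial morphotaxa assignment
clean_available = 55_366    # specimens judged not to contain non-target amplicons
not_available = 125_402     # specimens not made available

# The three subsets should partition the full project.
assert contaminants + clean_available + not_available == total_specimens

# Number of specimens actually available for analysis.
available_specimens = contaminants + clean_available

# Proportions relative to the whole screening project.
contaminant_fraction = contaminants / total_specimens
available_fraction = available_specimens / total_specimens

print(f"Available specimens: {available_specimens}")
print(f"Contaminant fraction of project: {contaminant_fraction:.2%}")
print(f"Available fraction of project: {available_fraction:.2%}")
```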