Limitations and potential improvements
The lower rate of FC sequence recovery compared to BR sequence recovery implies that factors associated with FC PCRs, rather than sampling or DNA extraction, were the main cause of failures to obtain complete barcode sequences. The most likely explanation is that one or both of the primers used in FC PCRs match the template DNA of the affected specimens poorly [48]. It is unclear which of the two FC primers (Ill_LCO1490 or Ill_C_R) causes this problem, but deficiencies of the LCO1490 / HCO2198 primer pair have been noted previously [49, 50], which points to LCO1490 as the more likely source. The failures were concentrated in certain Coleoptera and Hymenoptera families, suggesting that the primer sequences may need adjustment to improve outcomes for these groups.
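As an illustration of what a suboptimal primer match means in practice, the following minimal Python sketch counts the positions at which a primer, possibly containing IUPAC degenerate bases, cannot pair with its binding site. It is illustrative only: the function names and the short toy sequences are hypothetical and do not reproduce Ill_LCO1490, Ill_C_R, or any real specimen. Mismatches near the 3' end are reported separately because they are generally considered the most detrimental to amplification.

```python
# IUPAC nucleotide codes mapped to the bases each one can represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_mismatches(primer, site):
    """Return 0-based positions where the primer does not match the site.

    Both sequences are written 5'->3' on the same strand and must have
    the same length. A degenerate primer base matches any base it encodes.
    """
    if len(primer) != len(site):
        raise ValueError("primer and binding site must have equal length")
    return [
        i
        for i, (p, s) in enumerate(zip(primer.upper(), site.upper()))
        if s not in IUPAC[p]
    ]


def three_prime_mismatches(primer, site, window=5):
    """Return mismatch positions within the last `window` primer bases."""
    cutoff = len(primer) - window
    return [i for i in primer_mismatches(primer, site) if i >= cutoff]


# Hypothetical toy primer and two hypothetical binding sites.
toy_primer = "GATAYTGG"
site_a = "GATACTGA"  # differs from the primer at the 3'-terminal base
site_b = "GACATTGG"  # differs from the primer at an internal base

mismatches_a = primer_mismatches(toy_primer, site_a)
mismatches_b = primer_mismatches(toy_primer, site_b)
```

Applied to aligned reference sequences from the affected families, a tally of this kind could indicate which primer positions would benefit from added degeneracy.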
Improvements to our bioinformatic process may be possible. We identified FC and BR amplicons separately, and then attempted to align and merge only those with the expected taxonomic identities. It may seem intuitively simpler to merge all detected FC and BR sequences into putative barcodes, and then to identify the correct barcode among those based on taxonomy. However, unexpectedly high numbers of FC and BR sequences remained for many specimens after filtering and denoising, so this approach would have produced an exceedingly high number of pairwise FC and BR combinations, each requiring examination for correct taxonomic identity.
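The combinatorial argument can be sketched as follows. This minimal Python example is not our pipeline code: the function name, the taxon labels, and the sequence counts are hypothetical and chosen only to show how the number of candidate FC and BR pairs grows as a product when all sequences are merged first, compared with filtering each fragment by expected taxonomy before merging.

```python
def candidate_pairs(fc_taxa, br_taxa, expected=None):
    """Count the FC x BR sequence pairs that would have to be examined.

    fc_taxa and br_taxa hold one taxonomic label per denoised sequence.
    If `expected` is given, each list is first reduced to the sequences
    carrying that label (filter first, then merge); otherwise every FC
    sequence is paired with every BR sequence (merge first).
    """
    if expected is not None:
        fc_taxa = [t for t in fc_taxa if t == expected]
        br_taxa = [t for t in br_taxa if t == expected]
    return len(fc_taxa) * len(br_taxa)


# Hypothetical specimen: a few target sequences among many non-target ones.
fc = ["Carabidae"] * 2 + ["non-target A"] * 10 + ["non-target B"] * 8
br = ["Carabidae"] * 3 + ["non-target B"] * 12

merge_first = candidate_pairs(fc, br)                         # 20 * 15 pairs
filter_first = candidate_pairs(fc, br, expected="Carabidae")  # 2 * 3 pairs
```

In this toy case, merging first yields 300 candidate pairs for a single specimen, whereas filtering first leaves 6.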