Human DNA
When analyzing metabarcoding data for profiling vertebrate diversity, we
remove all human DNA reads under the assumption that they are
contamination. However, we found that a disproportionate share of the DNA
sequences from mosquito pools was identified as human, and the total
human read abundance in samples was well above the level of human DNA
found in the extraction and PCR negative controls. Thus, we reexamined
levels of human DNA found in all three iDNA metabarcoding datasets. We
first eliminated non-human contaminants and non-amplifying samples
(replicates with fewer than 500 total reads). With the remaining dataset,
we culled all non-human read sequences. We closely examined the amount
of human DNA in the extraction and PCR negative controls to determine a
read threshold for a human-positive sample. To be extremely conservative
in calling a sample positive for human DNA, we set the threshold
as the highest human read count in any negative control replicate for each
iDNA dataset. For example, if the highest read count for human DNA in a
negative control for the carrion fly metabarcoding data was 25,000 then
the threshold to be counted as a sample positive for human DNA was
25,000 reads in a replicate. We also eliminated samples
in which human DNA was not detected in both sample replicates. A relative
abundance index (RAI) was calculated as the number of samples positive
for human DNA at a site divided by the total number of samples at
that site.
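
The filtering and RAI calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the study: the function names, the data layout (one tuple of total and human reads per replicate), and all read counts are hypothetical. It also makes three assumptions that the text does not settle. The threshold is treated as inclusive (a replicate at exactly the threshold counts as positive), a sample needs at least two amplifying replicates to be called positive, and samples that fail the call remain in the denominator of the RAI.

```python
from collections import defaultdict

# From the text: replicates with fewer than 500 total reads are non-amplifying.
MIN_TOTAL_READS = 500


def human_threshold(negative_control_counts):
    """Return the highest human read count in any negative control replicate."""
    return max(negative_control_counts)


def is_positive(replicates, threshold):
    """Decide whether one sample is positive for human DNA.

    replicates: list of (total_reads, human_reads) tuples for the sample.
    Assumption: the sample needs at least two amplifying replicates, and
    each of them must reach the threshold (inclusive comparison).
    """
    amplified = [(t, h) for t, h in replicates if t >= MIN_TOTAL_READS]
    return len(amplified) >= 2 and all(h >= threshold for _, h in amplified)


def relative_abundance_index(samples, threshold):
    """Return {site: positive samples / total samples} for each site.

    samples: list of (site, replicates) pairs.
    Assumption: samples that are not positive stay in the denominator.
    """
    positive = defaultdict(int)
    total = defaultdict(int)
    for site, replicates in samples:
        total[site] += 1
        positive[site] += is_positive(replicates, threshold)
    return {site: positive[site] / total[site] for site in total}


# Made-up read counts, for illustration only.
negative_controls = [120, 25_000, 3_400]
samples = [
    ("site_A", [(90_000, 40_000), (85_000, 31_000)]),  # both replicates pass
    ("site_A", [(70_000, 30_000), (60_000, 2_000)]),   # only one replicate passes
    ("site_A", [(400, 390), (50_000, 45_000)]),        # one replicate non-amplifying
    ("site_B", [(80_000, 26_000), (75_000, 25_000)]),  # both at or above threshold
]

threshold = human_threshold(negative_controls)
rai = relative_abundance_index(samples, threshold)
```

With these invented inputs the threshold is 25,000 reads, one of three samples at `site_A` is positive, and the single sample at `site_B` is positive. If samples lacking human DNA in both replicates were instead dropped from the dataset entirely, the denominator in `relative_abundance_index` would need to exclude them.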