Results

Evaluation of Training

The performance of our model did not depend on the number of images used to train each species class (Fig. 4). Instead, precision during training varied greatly among species classes and was not a function of the number of images input into the model (Fig. 3). The class with the highest precision during training was armadillo (98%), trained on 186 images, while grey squirrel had the lowest precision during training (30%) despite being trained on 318 images. The raccoon, turkey, and deer classes all reached comparably high precision values while being trained on 88, 430, and 1,109 images, respectively (Fig. 3). Five classes were trained using fewer than 60 images across the training and test datasets combined (Table 2, see Supplementary material Appendix 3 for all IOU graphs). Result metrics for these classes also varied as a function of species traits rather than the number of images used to train the class.

Model Performance

To judge the performance of the model, we evaluated accuracy, precision, recall, and F-1 at several CTs. Metrics followed the same trends for both ID and CL purposes, with CL values running slightly below ID values (Table 5). On the test set, recall was inversely related to CT, while precision was directly related to it; precision was highest at 0.95 CT (ID: 90%, CL: 88%) and recall was highest at 0.50 CT (ID: 96%, CL: 89%). F-1 score was highest at the 0.70 CT for ID (86%) and at the 0.90 CT for CL (83%). The difference between accuracy and F-1 values demonstrates the effect of TNs, which count toward accuracy but not toward F-1 (Fig. ). Accuracy and F-1 were highest at 0.90 CT for the test data; therefore, we decided to use 0.90 CT for the validation set. The validation test yielded 93% accuracy, 68% precision, 86% recall, and a 76% F-1 score (Table ).