Results
Evaluation of Training
The performance of our model did not depend on the number of images used
to train each species class (Fig. 4). Precision during the training
process varied greatly among species classes and was not a function of the
number of images input into the model (Fig. 3). The class with the
highest precision during training was armadillo (98%) with 186 images
while grey squirrel had the lowest precision during training (30%),
despite being trained on 318 images. The raccoon, turkey, and deer
classes all resulted in comparably high precision values while being
trained using 88, 430, and 1,109 images, respectively (Fig. 3). Five
classes were trained using fewer than 60 images across the test and
training datasets (Table 2; see Supplementary material Appendix 3 for all
IOU graphs). Result metrics for these classes also varied as a function
of species traits rather than the number of images used to train the class.
Model Performance
To judge the performance of the model, we evaluated accuracy, precision,
recall, and F-1 at several CTs. Metrics followed the same trends for both
ID and CL purposes, with CL values running slightly below ID values
(Table 5). The test set produced recall values that were inversely
related to the CTs, while the precision values were directly related;
precision was highest at 0.95 CT (ID: 90%, CL: 88%) and recall was
highest at 0.50 CT (ID: 96%, CL: 89%). F-1 score was highest at the
0.70 CT for ID (86%) and 0.90 CT for CL (83%). The difference between
accuracy and F-1 values demonstrates the effect of TNs (Fig. ). Accuracy
and F-1 were highest at 0.90 CT for the test data; therefore, we decided
to use 0.90 CT for the validation set. The validation test resulted in
93% accuracy, 68% precision, 86% recall, and a 76% F-1 score (Table ).
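
The relationships among these metrics can be illustrated with a minimal Python sketch. It is not the code used in this study: the function names, the image-level scoring rule (an image counts as a positive prediction when its confidence meets the CT), and the sample scores are all hypothetical and serve only to show how the four metrics are derived from true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN). The only values taken from the text are the validation precision (68%) and recall (86%), used to check the reported F-1 score.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def metrics_from_counts(tp: int, fp: int, fn: int, tn: int) -> Metrics:
    """Compute accuracy, precision, recall, and F-1 from confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return Metrics(accuracy, precision, recall, f1_score(precision, recall))


def counts_at_threshold(
    samples: Iterable[Tuple[float, bool]], ct: float
) -> Tuple[int, int, int, int]:
    """Count TP, FP, FN, TN for (confidence, truly_present) pairs.

    Assumption for illustration: a sample is a positive prediction
    when its confidence is greater than or equal to the CT.
    """
    tp = fp = fn = tn = 0
    for confidence, truly_present in samples:
        predicted = confidence >= ct
        if predicted and truly_present:
            tp += 1
        elif predicted and not truly_present:
            fp += 1
        elif truly_present:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


# Hypothetical scores, not data from this study.
SAMPLES = [
    (0.99, True), (0.93, True), (0.85, True), (0.72, False),
    (0.65, True), (0.55, False), (0.40, False), (0.30, False),
]

if __name__ == "__main__":
    for ct in (0.50, 0.70, 0.90, 0.95):
        m = metrics_from_counts(*counts_at_threshold(SAMPLES, ct))
        print(f"CT={ct:.2f} precision={m.precision:.2f} "
              f"recall={m.recall:.2f} F-1={m.f1:.2f} accuracy={m.accuracy:.2f}")

    # F-1 implied by the reported validation precision and recall.
    print(f"validation F-1: {f1_score(0.68, 0.86):.2f}")
```

In this toy data, raising the CT removes low-confidence false positives (precision rises) while discarding some true detections (recall falls), matching the trend described above. Because TNs appear in the accuracy numerator but in neither precision nor recall, a dataset with many empty images can show high accuracy alongside a much lower F-1 score.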