1. INTRODUCTION
Recent reductions in the size and cost of autonomous data collection equipment have allowed ecologists to survey their study sites more thoroughly and efficiently (Acevedo & Villanueva-Rivera, 2006). Much work has examined the benefits of camera trap networks for detecting shy, retiring species whose detection probabilities decrease greatly in the presence of human researchers (O’Connell et al., 2010). However, many species for which remote surveying techniques are optimal are difficult to monitor properly with camera traps due to their small body sizes and/or preference for heavy vegetative cover (Newey et al., 2015). A number of these species, particularly interior forest birds, are much easier to detect via acoustic monitoring because of their frequent, far-carrying vocalizations, and battery-operated automated recording units (ARUs) have recently become a cost-effective option for researchers working with these species (Brandes, 2008). ARUs can operate in the field for much longer periods than human observers can (several days or weeks in many cases), can efficiently and safely survey remote areas early in the morning and late at night, and, like camera traps, minimize disturbance to sensitive species. Audio recorders also offer a wider detection range than camera traps because they do not require a direct line of sight, increasing both the area of coverage and the likelihood of detecting rare species. However, in order to use audio recordings from ARUs, vocalizations of the target species must first be detected among large quantities of survey audio. Identifying these detections efficiently and reliably presents a major challenge when developing a data-processing pipeline. Although this task remains a non-trivial consideration in study design, recent advances in machine learning (ML) classification techniques, coupled with dramatic increases in the availability and accessibility of powerful hardware, have made the process easier than ever (Kahl et al., 2019). We strongly believe that the application of ML techniques to the processing of large quantities of automated acoustic detection data will prove transformative for ecology and conservation, allowing researchers to tackle biological questions that have previously been impractical to answer.
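To make the shape of such a detection step concrete, the sketch below scores fixed-length windows of a long recording and reports those above a confidence threshold. Everything here is illustrative: the window and hop lengths are arbitrary choices, and `classify_window` is a crude energy-based placeholder standing in for a trained classifier such as the CNN-based approach of Kahl et al. (2019).

```python
# Minimal sketch of a sliding-window detector for long ARU recordings.
# Assumes 16-bit mono PCM WAV input; the scoring function below is a crude
# energy placeholder standing in for a trained ML classifier.
import wave
import numpy as np

WINDOW_S = 3.0  # analysis window length (seconds)
HOP_S = 1.5     # hop between successive windows (seconds)

def classify_window(samples: np.ndarray, rate: int) -> float:
    """Placeholder score: normalized RMS energy of the window.

    A real pipeline would replace this with a trained classifier
    (e.g. a CNN over spectrograms) returning P(target vocalization).
    """
    x = samples.astype(np.float64) / 32768.0
    return float(np.sqrt(np.mean(x * x)))

def detect(path: str, threshold: float = 0.1):
    """Yield (start_time_s, score) for windows scoring above threshold."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        audio = np.frombuffer(wav.readframes(wav.getnframes()), dtype=np.int16)
    win, hop = int(WINDOW_S * rate), int(HOP_S * rate)
    for start in range(0, len(audio) - win + 1, hop):
        score = classify_window(audio[start:start + win], rate)
        if score >= threshold:
            yield start / rate, score
```

A real deployment would batch windows through the classifier on a GPU rather than scoring them one at a time, but the control flow is the same.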
Several life history characteristics of tinamous (Tinamidae), a group of terrestrial birds that occur widely in the Neotropics, make them superb candidates for field-testing this type of audio-processing pipeline. Although a few species in this family occupy open habitats, most show a strong affinity for interior forest with thick vegetative cover (Bertelli & Tubaro, 2002). They are far more often heard than seen, and some species vocalize prolifically as part of the dawn and dusk choruses (Pérez‐Granados et al., 2020). This preference for interior forest, along with their large body sizes and terrestrial habits, makes tinamous particularly susceptible to anthropogenic habitat change, both through outright habitat loss and through increased hunting pressure in fragmented forest patches near populated areas (Thornton et al., 2012). Intensive life history research in the coming years will be critical to the conservation of tinamous and their habitats, and autonomous recording has the potential to revolutionize this line of inquiry.
Here we present the preliminary results of an ongoing field study that involves deploying ARUs at lowland Amazonian forest sites in Madre de Dios, Peru. Although this region tentatively supports some of the highest tinamou alpha diversity in the Neotropics (11 co-occurring species; eBird, 2017), little research has addressed which biological and ecological factors permit such high alpha diversity. We collected environmental audio of each day’s dawn and dusk choruses and designed a data pipeline that uses an ML audio classifier to identify tinamou vocalization events in the recordings and organize the detections into a spatiotemporal database for future use in occupancy models for the target species. To our knowledge, this technology has not previously been used to conduct community-level surveys of tinamous, and it represents a promising alternative to camera traps and traditional point-count surveys for studying elusive yet highly vocal bird taxa.
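A minimal sketch of this organization step is shown below, assuming a SQLite backing store; the table and column names are illustrative assumptions, not the schema used in the study.

```python
# Illustrative sketch of organizing classifier detections into a
# spatiotemporal database for later occupancy modelling. The schema and
# field names are assumptions for illustration, not the study's design.
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS detections (
    species     TEXT NOT NULL,  -- e.g. 'Tinamus major'
    site_id     TEXT NOT NULL,  -- ARU deployment site
    lat         REAL NOT NULL,  -- site latitude (decimal degrees)
    lon         REAL NOT NULL,  -- site longitude (decimal degrees)
    recorded_at TEXT NOT NULL,  -- ISO-8601 start time of detection window
    score       REAL NOT NULL   -- classifier confidence
);
"""

def open_db(path: str) -> sqlite3.Connection:
    """Open (or create) the detection database."""
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db

def store_detection(db, species, site_id, lat, lon, recorded_at, score):
    """Append one detection event."""
    db.execute(
        "INSERT INTO detections VALUES (?, ?, ?, ?, ?, ?)",
        (species, site_id, lat, lon, recorded_at, score),
    )
    db.commit()

# Site-by-survey detection histories for occupancy models can then be
# derived by aggregation, e.g. one row per site per survey day:
#   SELECT site_id, date(recorded_at), MAX(score) >= 0.8 AS detected
#   FROM detections GROUP BY site_id, date(recorded_at);
```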