Keywords:

Artificial Intelligence, Deep Learning, Species Classification, Neural Network, Pattern Recognition, Big Data

Introduction

Deep learning, a branch of machine learning, is an artificial intelligence approach that has been used for pattern recognition across multiple domains \cite{Shen_2017,Golden_2017,Min_2016,Heaton_2016,Esteva2019,Esteva2017}. Other machine learning approaches have been used for acoustic classification \cite{Aide_2013}, ecological modelling and the study of animal behaviour \cite{Olden_2008,Valletta_2017,Christin_2019}, but deep learning has demonstrated the ability to overcome several of their limitations. One challenge of conventional machine learning approaches is the need for substantial domain knowledge and high-level programming skills \cite{LeCun_2015,Christin_2019,inproceedings,NIPS2014_5347}. Further, the feature engineering step in machine learning is complex and often tedious, which discourages many from using these techniques. Deep learning removes the need for manual feature engineering because the algorithm learns the relevant features from the data automatically \cite{inbook}.
In ecology, however, the use of deep learning is still in its infancy. This is despite its potential to revolutionize applied ecology in the identification and classification of species, behavioural studies, population monitoring and citizen science, ecosystem management and conservation \cite{Christin_2019,Lamba_2019,Miao_2019,Ditria_2019}. Research articles continue to report new and interesting applications \cite{Terry_2020,Talas_2019,Priyadarshani_2020}. However, the techniques used remain cryptic and inaccessible to most ecologists, who are experts in their own domains but have no experience with these methods.
Ecology is particularly ripe for applications of deep learning owing to the growth over the past few years of complex ecological datasets, ranging from genomic to ecosystem-scale data and often referred to as Big Data. The Big Data derived from increasingly sophisticated automatic monitoring by sensors can no longer be processed manually, as doing so is repetitive and time consuming \cite{Weinstein_2017,Norouzzadeh_2018}. Deep learning is particularly well suited to the non-linear, complex data commonly encountered in ecology \cite{Christin_2019}. In fact, all winning methods for the most recent LifeCLEF contests have been deep learning-based \cite{Joly_2017}. Reviews and proposals for such applications have been put forward, and the field appears ripe for disruption \cite{Christin_2019,Lamba_2019}. Deep learning has been touted as a contender for solving problems with immediate application, ranging from illegal trafficking of wildlife products to large-scale automated ecosystem management tools, areas that are expensive and logistically difficult to manage \cite{Cantrell_2017,Christin_2019}.
Many of the challenges that prevented deep learning from having practical applications have been eliminated by advances in research on transfer learning and data augmentation \cite{Shorten_2019}. These advances have reduced the amount of data required to train accurate models. Furthermore, the recent wave of hardware innovation in GPUs and CPUs has accelerated this progress by reducing the cost of the processing power required for accurate model development.
Naturalists have been identifying species for the past two centuries, laying the foundations of ecological science. However, even today, most taxonomic and species identification work is still manual and reliant on a few domain experts. Therefore, to show non-experts how they can prototype these previously mysterious techniques, this paper walks through the various stages step by step and offers open-source code in the form of an annotated Jupyter Notebook that anybody can use to aim for expert-level accuracy on a supervised species classification task of their choice. The tutorial is designed so that it can be run in even the lowest-resourced environments, which could unlock applications in image-based taxon identification in ecology worldwide, including ones that are hard to imagine at the moment.
The code can be accessed from the Jupyter Notebook here: https://bit.ly/39woeLt