In ecology, however, the field is still in its infancy: a literature review
counted 46 papers, both peer-reviewed and non-peer-reviewed, as of
April 2018, most of them using CNNs and RNNs and none using the more recent technique of deep
reinforcement learning \cite{Christin_2019}. This is despite its potential to revolutionize
applied ecology in the identification and classification of species,
behavioural studies, population monitoring, citizen science,
ecosystem management and conservation \cite{Christin_2019,Lamba_2019}. Furthermore, the techniques used remain cryptic and inaccessible to most ecologists, who are experts in their own domains but have no experience with these methods.
Ecology is particularly ripe for applications of deep learning owing to the
explosion of complex ecological datasets over the past few years, ranging from
genomic to ecosystem-scale data and often referred to as big data. The big data
derived from increasingly sophisticated automatic monitoring by sensors can no
longer be processed manually, because doing so is repetitive, tedious,
time-consuming and sometimes too complex for human beings to comprehend \cite{Weinstein_2017,Norouzzadeh_2018}, hence the need for more efficient strategies.
Deep learning is better suited than other methods to the complex, non-linear
data commonly encountered in
ecology \cite{Christin_2019}. In fact, all winning methods in the most recent LifeCLEF
contests have been based on deep learning \cite{Joly_2017}. Reviews and proposals have been put forward, and the field
appears ripe for disruption \cite{Christin_2019,Lamba_2019}. Deep
learning has been touted as a contender for solving problems with immediate applications, ranging from
illegal trafficking of wildlife products to large-scale automated
ecosystem management tools, areas that are costly and logistically
demanding to manage \cite{Cantrell_2017,Christin_2019}.
But why now? Many of the challenges that prevented deep learning from having practical applications have been overcome by groundbreaking research on transfer learning and
data augmentation \cite{Shorten_2019}. This has reduced the amount of data required to build accurate, world-class models. Furthermore, the recent wave of
computer hardware innovation in GPUs and CPUs has accelerated progress by
reducing the cost of the processing power required for
accurate model development, which is heavy on matrix multiplications. The field's emergence from the AI winters of the
past, together with predictions of a ``singularity'', also presents interesting
opportunities for the future. Life scientists such as ecologists therefore need to jump on this
bandwagon and take life science to the next level.
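As a minimal illustration of why data augmentation reduces the number of images that must be collected, the sketch below produces several label-preserving variants of a single image. It is not the code from the accompanying notebook: the function names are hypothetical, and a small grid of numbers stands in for a photograph.

```python
# Minimal sketch of data augmentation in plain Python.
# Assumption: an image is represented as a list of rows of pixel values;
# all function names here are hypothetical and chosen for illustration.

def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Mirror the image top to bottom."""
    return [list(row) for row in image[::-1]]


def rotate_90_clockwise(image):
    """Rotate the image a quarter turn clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def augment(image):
    """Return the original image plus three variants that keep the same label."""
    return [
        [list(row) for row in image],
        flip_horizontal(image),
        flip_vertical(image),
        rotate_90_clockwise(image),
    ]


# A toy 2 x 3 "image"; a real photograph would be far larger.
toy_image = [[1, 2, 3],
             [4, 5, 6]]
variants = augment(toy_image)
```

Each labelled photograph yields four training examples here. A sea star photographed from above is still the same species when mirrored or rotated, which is the assumption that makes these particular transformations safe for this kind of image.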
However, all is not rosy: deep learning is still theory-heavy and
difficult to implement, and life scientists do not have the time or
expertise to delve into these powerful tools. Traditional deep
learning is a complicated path that involves evaluating the tools,
parameters, datasets, training time and computing power. For these reasons, it remains siloed in well-funded research labs and big technology
companies that rarely have the incentive to publicize their processes.
However, most taxonomic and species identification work is still manual and reliant on a few experts. Scientists have advanced identification work such as wildlife identification and analysis of fish abundance \cite{Miao_2019,Ditria_2019}. Classical naturalists have identified species for the non-expert over the past two centuries, laying the foundations of the ecological science that we thrive in today. Therefore, to illustrate to non-experts how easily they can prototype these previously mysterious techniques, this paper walks through the various stages step by step and offers open-source code in the form of an annotated Jupyter Notebook that anybody in the world can use to achieve world-class
accuracy on whatever supervised species classification task they want to carry out. The tutorial is designed so that it can be implemented in the lowest-resourced environments and
unlock applications in species identification in ecology the world over that we can hardly
imagine at the moment.
Previously, without tight synergies between computer science
professionals and ecologists, deep learning work on ecological datasets
has proved difficult despite its obvious benefits. The few deep learning
practitioners are in high demand from far wealthier giant companies,
starving the less well-funded ecological world of personnel. The roadmap followed here
was popularized by the fast.ai course created by Jeremy Howard and
Rachel Thomas, both scientists at the University of San Francisco, and by their enthusiastic students, who are now implementing these algorithms in other fields and turning whole modes of thinking in traditional industries inside out. This paper demonstrates how to build a simple species classifier that has world-class
accuracy. The code can be accessed from the Jupyter Notebook here:
https://bit.ly/39woeLt

IMPLEMENTATION

Sea stars are important species in our understanding of marine invertebrate communities. The intertidal relationship between the sea star \textit{Pisaster ochraceus} and the mussel \textit{Mytilus californianus} was used to coin the term keystone species \cite{Paine_1966}. Following that classical study, it is interesting to use sea stars as model species to prototype the classifier. Furthermore, sea stars have complicated morphology that can be a challenge even for human experts, which makes them a demanding test case for our AI system. Figure \ref{488749} illustrates our workflow for achieving a minimum viable product: