Whereas model training requires Graphics Processing Units (GPUs) to quickly carry out the matrix multiplications inherent to AI models, inference on a single image can be handled comfortably by the CPU. Hence, in the code we switch to the CPU for model inference, since it is cheaper and fast enough for this task. Although the fastai library offers more advanced ways to classify multiple images at once in a batch, in our case we classify only one image at a time \cite{howard2018fastai}.
Refer to Code Blocks 19 and 20 in the Jupyter Notebook.
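For readers without the notebook at hand, a minimal sketch of single-image inference on the CPU with fastai v2 might look as follows; the file names (\texttt{export.pkl}, \texttt{bird.jpg}) are placeholders, not the notebook's actual paths.

\begin{verbatim}
# Minimal sketch: single-image CPU inference with fastai v2.
# 'export.pkl' and 'bird.jpg' are placeholder file names.
from fastai.vision.all import load_learner, PILImage

# cpu=True maps all model weights onto the CPU, so no GPU
# is needed at inference time.
learn = load_learner('export.pkl', cpu=True)

# Classify a single image.
img = PILImage.create('bird.jpg')
pred_class, pred_idx, probs = learn.predict(img)
print(f"Predicted: {pred_class} (p={probs[pred_idx]:.4f})")
\end{verbatim}

For the batched alternative mentioned above, fastai can build a test dataloader over many images with \texttt{learn.dls.test\_dl(...)} and score them all at once via \texttt{learn.get\_preds(dl=...)}, which amortizes model overhead across the batch.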