Figure 3. Network structure of the SegNet model. SegNet has a symmetric structure consisting of an Encoder stage and a Decoder stage. The Encoder stage extracts high-level semantic information from the low-level local pixel values of the image through Convolution Layers, ReLU activations, Batch Normalization, and Pooling Layers. The Decoder stage uses the Pooling Indices, UpSampling Layers, Convolution Layers, ReLU activations, and Batch Normalization to recover the geometric shape of objects and to compensate for the detail lost when the feature maps are downsampled by the Pooling Layers of the Encoder stage. Finally, a Sigmoid function converts the extracted feature map into the output segmentation map. The model takes a three-channel RGB image as input and outputs a single-channel segmentation map.
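
The role of the Pooling Indices can be illustrated with a minimal pure-Python sketch on a single 4 x 4 feature map. This is not the implementation used in this work: the function names `max_pool_with_indices`, `max_unpool`, and `sigmoid` are hypothetical, the 2 x 2 window with stride 2 is an assumption, and the convolution, ReLU, and Batch Normalization layers are omitted. The Encoder records where each maximum was found, and the Decoder places each value back at that position during upsampling.

```python
import math


def max_pool_with_indices(x, k=2):
    """k x k max pooling with stride k over a 2D list.

    Returns the pooled map and, for every window, the (row, col)
    position of its maximum in the input (the pooling indices).
    """
    h, w = len(x), len(x[0])
    pooled, indices = [], []
    for i in range(0, h - h % k, k):
        pooled_row, index_row = [], []
        for j in range(0, w - w % k, k):
            window = [(r, c) for r in range(i, i + k) for c in range(j, j + k)]
            best = max(window, key=lambda rc: x[rc[0]][rc[1]])
            pooled_row.append(x[best[0]][best[1]])
            index_row.append(best)
        pooled.append(pooled_row)
        indices.append(index_row)
    return pooled, indices


def max_unpool(pooled, indices, out_h, out_w):
    """Upsample by writing each pooled value back to its recorded position.

    All other positions stay zero; in a full decoder the following
    convolution layers would fill in this sparse map.
    """
    out = [[0.0] * out_w for _ in range(out_h)]
    for pooled_row, index_row in zip(pooled, indices):
        for value, (r, c) in zip(pooled_row, index_row):
            out[r][c] = value
    return out


def sigmoid(v):
    """Map a real-valued score to the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


# Illustrative feature map (values chosen arbitrarily).
feature = [
    [1.0, 3.0, 2.0, 0.0],
    [4.0, 2.0, 1.0, 5.0],
    [0.0, 1.0, 7.0, 2.0],
    [2.0, 6.0, 3.0, 1.0],
]

pooled, indices = max_pool_with_indices(feature)   # Encoder side
restored = max_unpool(pooled, indices, 4, 4)       # Decoder side
# Per-pixel Sigmoid, giving a one-channel map with values in (0, 1).
probabilities = [[sigmoid(v) for v in row] for row in restored]
```

Because the maxima return to their original locations rather than to a fixed corner of each window, object boundaries are preserved through the upsampling step, which is the detail recovery the caption describes.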