The workflow of the CNN model on the training dataset, including the number of input and output features and the batch normalization steps, is shown in Figure 2. The first layer, the input layer, consists of neurons connected to the individual inputs; these inputs are passed unchanged to the next layer (the embedding layer) without any weights or biases. The embedding layer transforms each word into a vector of a pre-determined length: the vocabulary is first encoded as a series of integers, and the embedding layer then retrieves the embedding vector for each word index.

The convolutional layer is the core component of a CNN. It holds a collection of filters (128 in this case, Figure 2) whose parameters are learnt during training; in most cases the filter (kernel) size is smaller than the length of the input sequence. Each filter is convolved with the input to produce an activation map. Average pooling then constructs a down-sampled (pooled) feature map by computing the average over each patch of the activation map. Batch normalization ("Batch Norm") is applied between the layers of the network rather than to the raw data itself, and it operates on mini-batches instead of the entire dataset at once; this stabilizes learning and speeds up training. The Global Average Pooling-1D layer is used after the convolutional layers to reduce the dimensionality of the data.

A dense layer is a layer of neurons in which each neuron receives input from all the neurons in the preceding layer, making it "dense", i.e. fully connected. The output layer is typically the last layer in the network and consists of neurons representing the different classes or categories the network is trained to predict. Here, two dense layers were added, each with a ReLU activation function and batch normalization; the first dense layer has 256 units and the second has 128 units. The final dense layer has 8 units, one for each class in the classification task, and uses the softmax activation function, which converts the outputs into probabilities for each class. The output-layer neurons are connected to all the neurons in the previous layer, and their weights and biases determine the strength of the prediction for each class.
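
For concreteness, a minimal Keras sketch of the layer stack described above is given below. The filter count (128), the dense-layer sizes (256 and 128), the 8-unit softmax output, and the use of batch normalization follow the description; the vocabulary size, embedding dimension, sequence length, pooling size, convolution kernel size, and the convolution activation are not specified in the text and are assumed here purely for illustration.

```python
# Minimal sketch of the described CNN; hyperparameters marked "assumed"
# are not given in the text and are placeholders for illustration only.
import tensorflow as tf
from tensorflow.keras import layers, models

VOCAB_SIZE = 20000   # assumed vocabulary size
EMBED_DIM = 100      # assumed embedding vector length
SEQ_LEN = 200        # assumed (padded) sequence length
KERNEL_SIZE = 5      # assumed convolution kernel size

model = models.Sequential([
    # Input layer: integer-encoded word indices, no trainable weights.
    layers.Input(shape=(SEQ_LEN,)),
    # Embedding layer: looks up a fixed-length vector for each word index.
    layers.Embedding(input_dim=VOCAB_SIZE, output_dim=EMBED_DIM),
    # Convolutional layer with 128 filters (activation assumed to be ReLU).
    layers.Conv1D(filters=128, kernel_size=KERNEL_SIZE, activation="relu"),
    # Average pooling: down-samples each feature map by averaging patches.
    layers.AveragePooling1D(pool_size=2),
    # Batch normalization between layers, computed per mini-batch.
    layers.BatchNormalization(),
    # Global average pooling collapses the sequence dimension.
    layers.GlobalAveragePooling1D(),
    # Two dense layers (256 and 128 units) with ReLU and batch normalization.
    layers.Dense(256, activation="relu"),
    layers.BatchNormalization(),
    layers.Dense(128, activation="relu"),
    layers.BatchNormalization(),
    # Output layer: 8 units with softmax, one probability per class.
    layers.Dense(8, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```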