4. Experimental Evaluation
This section presents the experimental evaluation: the datasets, the baseline algorithms, the performance measures, the results, and supplementary information.
4.1. Materials and Methods
4.1.1. Methods and Datasets
Two classification methods, logistic regression and K-nearest neighbors, were chosen as baselines; both were compared with the BireyselValue method using the performance metrics described below. Six multiclass datasets from several domains were selected to evaluate the compression effect. The datasets were obtained from two repositories, as outlined in (\ref{384212}). Each dataset was randomly split into training and testing sets. No preprocessing, preparation, or cleaning steps were applied to the datasets.
However, to satisfy the conditions for employing the BireyselValue method outlined in (2.2), a dedicated function from the BireyselValue package \cite{dahman2024a} was employed; as a result, a balanced training dataset was used for all three methods. The sizes of the original datasets, the numbers of classes, the sizes of the training and testing sets, and the overall accuracy results are shown in Fig. (\ref{943876}) and Fig. (\ref{268286}). The scatter plots are drawn from two values per observation: the value computed from equation (\ref{eq:4}) and the index of the observation. Overall, each scatter plot shows how the classes of the corresponding dataset overlap.
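The evaluation protocol described above (random split, balanced training set, overall accuracy on the test set) can be illustrated with a minimal sketch. It is not the BireyselValue package's own function, whose behavior is not specified here. The sketch assumes that balancing is done by downsampling every class to the size of the smallest one, uses a plain K-nearest-neighbor classifier as the only baseline, and runs on synthetic data. The names `random_split`, `balance_by_downsampling`, `knn_predict`, and `overall_accuracy`, the 30% test fraction, and `k=3` are hypothetical choices made for illustration.

```python
import math
import random
from collections import Counter, defaultdict


def random_split(rows, test_fraction=0.3, seed=0):
    """Shuffle (features, label) rows and split them into train and test lists."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def balance_by_downsampling(train, seed=0):
    """Assumed balancing rule: keep as many rows per class as the smallest class has."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for features, label in train:
        by_class[label].append((features, label))
    smallest = min(len(group) for group in by_class.values())
    balanced = []
    for label in sorted(by_class):
        balanced.extend(rng.sample(by_class[label], smallest))
    return balanced


def knn_predict(train, features, k=3):
    """Majority vote among the k training rows closest in Euclidean distance."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], features))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def overall_accuracy(train, test, k=3):
    """Fraction of test rows whose predicted label matches the true label."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)


# Synthetic, imbalanced three-class data (not one of the paper's datasets).
rng = random.Random(42)
centers = {"a": (0.0, 0.0), "b": (5.0, 5.0), "c": (10.0, 0.0)}
sizes = {"a": 60, "b": 30, "c": 15}
rows = [
    ((cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5)), label)
    for label, (cx, cy) in centers.items()
    for _ in range(sizes[label])
]

train, test = random_split(rows, test_fraction=0.3, seed=1)
balanced_train = balance_by_downsampling(train, seed=1)
accuracy = overall_accuracy(balanced_train, test, k=3)
print(Counter(label for _, label in balanced_train))
print(f"overall accuracy: {accuracy:.3f}")
```

In this sketch only the training set is balanced; the test set keeps its original class proportions, so the reported overall accuracy reflects the class distribution of the held-out data.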