Abstract

In machine learning, many powerful classification algorithms can classify data with high accuracy. However, most of them lack transparency: they neither rationalize their decisions nor explain the data in a human-friendly form. This lack of transparency may hinder the use of machine learning in critical problems.
Visualization is the most widely used technique for understanding complex data. Complex data are likely to be high-dimensional, so visualizing them requires a dimensionality reduction algorithm that projects them into two- or three-dimensional space. Although visualization is very intuitive, effort is still needed to extract information in a human-friendly form, that is, semantics. This is especially true for labeled data, where the objective is to highlight class distributions and data structures. For such data, it is therefore beneficial to have a transparent classification algorithm that classifies labeled data while providing auxiliary information in the form of a semantic explanation.
In this paper, the Cumulative Fuzzy Class Membership Criterion (CFCMC), a recently proposed classification algorithm, is modified and used in a novel approach to extracting information about data structure and class distribution in labeled data directly in the form of semantics.
CFCMC is a novel fuzzy modeling approach based on creating a Cauchy-like bell-shaped function, treated as a fuzzy membership function, around each training input. Decision boundaries for each class in feature space are created by accumulating the fuzzy membership functions of all training inputs belonging to the same class. In the original algorithm, every pattern of the same class shares the same values of the fuzzy membership function parameters, which are optimized using the well-known Simulated Annealing method to achieve the highest possible classification accuracy on the training set. The modification of CFCMC is based on clustering each class of the input data using the k-means algorithm, with the number of clusters estimated via the gap statistic. Unlike in the first variant of CFCMC, the fuzzy membership function parameters are shared by every pattern of the same cluster rather than by every pattern of the same class.
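The accumulation step described above can be illustrated with a minimal sketch. The abstract does not give the exact membership function, so the form 1 / (1 + (d / width)^power) over Euclidean distance d, the parameter names width and power, and the helper names cumulative_scores and predict are all assumptions made for illustration. The sketch uses one parameter pair per class, as in the original variant, and leaves out the Simulated Annealing optimization, the k-means clustering, and the gap statistic.

```python
import numpy as np

def cumulative_scores(x, X_train, y_train, params):
    """Sum an assumed Cauchy-like membership over the training inputs of each class.

    params maps a class label to a hypothetical (width, power) pair that is
    shared by every training pattern of that class.
    """
    scores = {}
    for label, (width, power) in params.items():
        members = X_train[y_train == label]           # patterns of this class
        dist = np.linalg.norm(members - x, axis=1)    # distance to each pattern
        membership = 1.0 / (1.0 + (dist / width) ** power)
        scores[label] = float(np.sum(membership))     # cumulative class membership
    return scores

def predict(x, X_train, y_train, params):
    """Assign x to the class with the largest cumulative membership."""
    scores = cumulative_scores(x, X_train, y_train, params)
    return max(scores, key=scores.get)

# Toy data: two well-separated classes in two dimensions (illustrative only).
X_train = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
y_train = np.array([0, 0, 1, 1])
params = {0: (1.0, 2.0), 1: (1.0, 2.0)}   # parameter values chosen arbitrarily

query = np.array([0.0, 0.5])
predicted = predict(query, X_train, y_train, params)
```

In the modified variant, the same accumulation would apply, but the parameter pair would be looked up per cluster instead of per class.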
Semantic extraction is based on post-processing the CFCMC structure, which yields auxiliary output about data structures and similarity between classes. The result is an explanatory mechanism that explains the complexity of particular data in the form of semantics; it is able to explain why a problem is easy or hard.
The experiments focus on comparing the performance of CFCMC against other classifiers, namely Multi-layer Perceptron (MLP), Support Vector Machine (SVM), 1-Nearest Neighbor (1-NN), and Membership Function ARTMAP (MF ARTMAP), and on validating the explanatory mechanism of the extracted semantics by visualizing the classified data and comparing against the contingency tables obtained while evaluating classification performance.
Our initial empirical results show that CFCMC is not necessarily the best classifier, although in most cases it is not far from the best-performing methods, such as SVM and MLP. However, the semantic explanation offers useful information for understanding the problems and the generalization abilities of the classifier. This transparency potentially allows the classifier to be applied to real-world problems where compliance with user requirements is critical.