Yizhang Wang

and 3 more

Federated clustering (FC) is an emerging and important topic that aims to cluster all the data held by many heterogeneous clients/devices while prohibiting clients from sharing raw data. Existing works, however, have two problems: (1) federated learning performs well in independent and identically distributed (IID) scenarios, but in Non-IID scenarios (non-identical class distributions) it is hard to collaboratively train a clustering algorithm based on a global similarity measure while keeping all data local. (2) Some federated clustering algorithms perform well, but their communication costs are high, so it is difficult to balance communication costs and clustering effectiveness. In this paper, we propose a new federated k-means clustering framework that addresses both problems and balances communication costs against clustering effectiveness. (1) On the clients, we use the cluster centers (representative points) generated by K-means to represent the corresponding clusters, because these representative points form the density backbone of the clusters and can effectively preserve the structure of the local data. (2) On the server, we propose two methods for reprocessing the uploaded encrypted representative points to obtain better final cluster centers: one uses K-means, and the other takes the improved density peaks (density cores) as the final centers. The server then sends the final centers back to the clients. Finally, each client assigns its local data to the nearest center. The experimental results show that, in most cases, the proposed method performs better than several centralized (non-federated) classical clustering algorithms (K-means, DBSCAN and density peak clustering) and state-of-the-art (SOTA) centralized clustering algorithms. In particular, the proposed algorithms perform better than the SOTA federated clustering frameworks k-FED (ICML 2021) and MUFC (ICLR 2023).
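The client/server flow described above can be illustrated with a minimal sketch. It is not the authors' implementation: it covers only the K-means variant on the server, omits the encryption of the uploaded representative points and the density-core variant, and uses a plain Lloyd's-algorithm K-means written with the Python standard library. All function names and parameters (`kmeans`, `federated_kmeans`, `k_local`, `k_global`) are hypothetical and chosen for illustration only.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centers):
    """Index of the center closest to `point` (ties go to the lowest index)."""
    return min(range(len(centers)), key=lambda j: sq_dist(point, centers[j]))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; assumes k <= len(points)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # An empty group keeps its previous center.
        new_centers = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def federated_kmeans(client_data, k_local, k_global, seed=0):
    """One round: local K-means, server K-means on uploaded centers, local assignment."""
    # Client side: each client summarizes its data by k_local representative
    # points. Only these centers leave the client, never the raw data.
    # (Assumption: encryption of the uploaded points is omitted here.)
    uploaded = []
    for i, data in enumerate(client_data):
        uploaded.extend(kmeans(data, k_local, seed=seed + i))

    # Server side: re-cluster the uploaded representative points.
    global_centers = kmeans(uploaded, k_global, seed=seed)

    # Client side: assign each local point to the nearest final center.
    labels = [[nearest(p, global_centers) for p in data] for data in client_data]
    return global_centers, labels
```

As a usage example under a Non-IID split, suppose one client holds only points near the origin and another holds only points near (10, 10). Calling `federated_kmeans([client_a, client_b], k_local=2, k_global=2)` uploads four representative points in total, and the server's final centers separate the two groups even though neither client ever sees the other's data. The communication cost in this sketch is one upload of `k_local` points per client and one download of `k_global` points.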