
Exploring Strategies for Privacy-Preserving Machine Learning in Distributed Environments
  • Suresh Dodda
  • Anoop Kumar
  • Navin Kamuni (Corresponding Author: [email protected])
  • Madan Mohan Tito Ayyalasomayajula

Abstract

Machine Learning (ML) with distributed privacy preservation is growing in significance because it enables multi-party learning without sharing raw data. This is especially valuable for organizations that want to collaborate but cannot exchange data due to ethical, regulatory, or budgetary constraints. To address these issues, this study examines three privacy-preserving algorithms: regularized logistic regression with Differential Privacy (DP), stochastic gradient descent (SGD) with differentially private updates, and a distributed Lasso that shares gradients among data centers. Through these algorithms, the study highlights the trade-off between error rate and privacy. Both DP algorithms scale their sensitivity with the amount of training data, which improves error rates on large datasets and underscores the importance of training-data volume for model performance. Results demonstrate that with DP-SGD, the error rate can be further reduced by applying random projections to the data in advance.
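As a rough illustration of the DP-SGD approach described in the abstract, the sketch below trains a logistic regression model with per-example clipped, Laplace-noised gradient updates, after first reducing the feature dimension with a Gaussian random projection. This is a minimal sketch under stated assumptions: all parameter names and values (EPSILON, CLIP_NORM, PROJECTED_DIM, and so on) are illustrative and not taken from the paper, and the sketch ignores privacy-budget composition across updates.

```python
import numpy as np

# Illustrative hyperparameters -- assumptions, not values from the paper.
EPSILON = 1.0        # privacy budget per update (composition ignored here)
CLIP_NORM = 1.0      # gradient clipping bound, i.e. the per-example sensitivity
LEARNING_RATE = 0.1
N_EPOCHS = 5
PROJECTED_DIM = 20   # target dimension for the random projection

rng = np.random.default_rng(0)

def random_projection(X, k, rng):
    """Project features to k dimensions with a Gaussian random matrix."""
    d = X.shape[1]
    R = rng.normal(0.0, 1.0 / np.sqrt(k), size=(d, k))
    return X @ R

def dp_sgd_logistic(X, y, epsilon, rng):
    """Logistic regression (labels in {-1,+1}) trained with per-example
    clipped gradients plus Laplace noise scaled to sensitivity/epsilon."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(N_EPOCHS):
        for i in rng.permutation(n):
            margin = np.clip(y[i] * (X[i] @ w), -30.0, 30.0)
            grad = -y[i] * X[i] / (1.0 + np.exp(margin))
            # Clip so the per-example gradient norm is at most CLIP_NORM.
            grad *= min(1.0, CLIP_NORM / (np.linalg.norm(grad) + 1e-12))
            # Laplace noise calibrated to the clipped sensitivity.
            noise = rng.laplace(0.0, CLIP_NORM / epsilon, size=d)
            w -= LEARNING_RATE * (grad + noise)
    return w

# Toy synthetic data with labels in {-1, +1}.
X = rng.normal(size=(500, 100))
true_w = rng.normal(size=100)
y = np.sign(X @ true_w)

# Project first, then train privately on the lower-dimensional data.
X_proj = random_projection(X, PROJECTED_DIM, rng)
w = dp_sgd_logistic(X_proj, y, EPSILON, rng)
print("training accuracy:", (np.sign(X_proj @ w) == y).mean())
```

Projecting before training both shrinks the dimension of the noise vector added at each step and lowers the number of parameters, which is one plausible reading of why random projections reduce the error rate in the DP-SGD setting.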
Submitted to TechRxiv: 21 Apr 2024
Published in TechRxiv: 29 Apr 2024