Federated Learning and Privacy-Preserving Machine Learning

Current Developments in Federated Learning and Privacy-Preserving Machine Learning

The field of federated learning (FL) and privacy-preserving machine learning (PPML) has seen significant advances over the past week, driven by efforts to enhance privacy, security, and efficiency in distributed learning environments. The research community is increasingly addressing the challenges of data heterogeneity, privacy leakage, and adversarial threats, while also exploring novel applications in critical domains such as medical imaging and digital forensics.

General Trends and Innovations

  1. Federated Learning for Insider Threat Detection: There is growing interest in applying federated learning to detect insider threats in distributed environments, since it addresses the privacy concerns of sharing sensitive user-behavior data across multiple locations. Innovations in this area include generative models for handling non-independent and identically distributed (non-IID) data and self-normalizing neural networks for improving detection accuracy.

  2. Information-Theoretic Approaches to Privacy Metrics: Researchers are developing new information-theoretic metrics to quantify privacy leakage in machine learning systems. These metrics formalize the asymptotic behavior of privacy measures, providing a rigorous framework for evaluating how privacy degrades as the number of observations grows. This work extends previous research with a more general family of metrics that subsumes known measures such as mutual information and maximal leakage (a minimal sketch of these two measures appears after this list).

  3. Data Poisoning and Leakage in Federated Learning: The risks of data poisoning and leakage in FL are being investigated in depth. Recent studies highlight the importance of perturbing raw gradient updates with randomized noise to mitigate privacy threats (see the clip-and-noise sketch after this list). There is also a focus on understanding the impact of data poisoning attacks and on dynamic model perturbation techniques that strengthen both privacy protection and model resilience.

  4. Differentially Private Federated Learning: Advances in differentially private federated learning (DPFL) aim to improve model utility without compromising privacy. Novel methods leverage personalized model-sharing and sharpness-aware minimization to mitigate the adverse effects of noise addition and clipping (the core SAM update is sketched after this list). These approaches are shown to improve the privacy-utility trade-off, particularly in settings with heterogeneous data.

  5. Privacy-Preserving Techniques in Federated Learning: New privacy mechanisms are being introduced to balance privacy guarantees, communication efficiency, and model accuracy. Techniques such as correlated binary stochastic quantization and secure multi-party computation are being explored to achieve differential privacy while maintaining model performance (a toy quantizer is sketched after this list). These methods are particularly effective when data is distributed across many clients.

  6. Outlier Detection and Data Distribution Shifts: The detection of global outliers and data distribution shifts in FL is being studied as a privacy issue. Researchers are developing strategies to detect subtle temporal shifts in data distribution, which could reveal sensitive information about production processes or other private activities, and to flag global outliers with methods such as Isolation Forest (see the sketch after this list). These methods aim to provide better evaluation metrics for detecting distributional shifts than traditional approaches.

  7. Privacy-Preserving Data Provision in Digital Forensics: In the context of digital forensics, particularly for driverless taxis, new approaches are being proposed to ensure the privacy of data providers and investigators during data upload and access. These methods use cryptographic techniques to verify data integrity, control data access, and issue warrants in a privacy-preserving manner.

  8. Gradient Inversion and Privacy Analysis: The problem of gradient inversion in FL is being addressed from a cryptographic perspective. By formulating the input reconstruction problem as a Hidden Subset Sum Problem, researchers are able to achieve perfect input reconstruction, providing insights into the limitations of existing empirical attacks. This work also explores the use of secure data aggregation techniques to defend against such attacks.

  9. Decentralized Federated Learning and Privacy: The privacy implications of decentralized federated learning (DFL) are being re-evaluated through an information-theoretical lens. Studies show that DFL generally offers stronger privacy preservation than centralized FL, particularly in scenarios where a fully trusted server is not available. This work highlights the importance of considering graph topology and privacy attacks in evaluating information leakage.

  10. Federated Learning in Medical Imaging: The application of FL to medical imaging, particularly in assessing stenosis severity in coronary angiography, is gaining traction. Federated detection transformers are being proposed to improve model generalization while preserving data privacy. These methods are particularly useful in settings where large, diverse datasets are challenging to aggregate due to privacy concerns.
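
As a concrete companion to trend 2, here is a minimal sketch of the two classical leakage measures that the new metrics generalize, mutual information and maximal leakage, computed from a toy joint distribution. The distribution is illustrative and not taken from the cited paper.

```python
import numpy as np

# Toy joint distribution P(X, Y): rows index the secret X, columns the
# observation Y. The numbers are illustrative only.
P_xy = np.array([
    [0.30, 0.10],
    [0.05, 0.25],
    [0.10, 0.20],
])

P_x = P_xy.sum(axis=1)   # marginal P(X)
P_y = P_xy.sum(axis=0)   # marginal P(Y)

# Mutual information I(X;Y) = sum_{x,y} P(x,y) * log2(P(x,y) / (P(x)P(y)))
mask = P_xy > 0
mi = np.sum(P_xy[mask] * np.log2(P_xy[mask] / np.outer(P_x, P_y)[mask]))

# Maximal leakage L(X -> Y) = log2(sum_y max_x P(y|x))
P_y_given_x = P_xy / P_x[:, None]
max_leakage = np.log2(P_y_given_x.max(axis=0).sum())

print(f"mutual information: {mi:.4f} bits")
print(f"maximal leakage:    {max_leakage:.4f} bits")
```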
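
The clip-and-noise mitigation mentioned in trend 3 can be sketched in a few lines: each client bounds the norm of its raw update, then adds Gaussian noise calibrated to that bound. The clipping norm and noise multiplier below are placeholder hyperparameters, and the accounting needed for a formal (epsilon, delta) guarantee is omitted.

```python
import numpy as np

def privatize_update(update, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """Clip a client's raw gradient update to a fixed norm, then add
    Gaussian noise scaled to the clipping bound (the standard recipe
    behind differentially private FL aggregation)."""
    rng = rng if rng is not None else np.random.default_rng()
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / (norm + 1e-12))
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=update.shape)
    return clipped + noise

raw_update = np.array([0.8, -2.4, 1.5])  # toy per-client gradient
print(privatize_update(raw_update, rng=np.random.default_rng(0)))
```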
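
Trend 4's sharpness-aware minimization (SAM) component admits a compact sketch: ascend to a nearby worst-case point, then descend using the gradient taken there. The toy quadratic loss, learning rate, and radius rho are assumptions for illustration, not the cited paper's setup (which additionally personalizes which model parts are shared).

```python
import numpy as np

def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One sharpness-aware minimization step: perturb the weights toward
    the local worst case, then descend with the gradient at that point."""
    g = grad_fn(w)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)  # ascent direction
    return w - lr * grad_fn(w + eps)             # descend from the sharp point

# Toy quadratic loss L(w) = 0.5 * ||w||^2, whose gradient is simply w.
grad_fn = lambda w: w
w = np.array([1.0, -2.0])
for _ in range(5):
    w = sam_step(w, grad_fn)
print(w)  # shrinks toward the flat minimum at the origin
```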
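
For trend 5, the idea behind quantization driven by common randomness can be illustrated with an unbiased binary stochastic quantizer whose randomness comes from a seed shared across clients; sharing the seed correlates the clients' quantization noise. This is an intuition-level sketch, not the CorBin-FL mechanism itself.

```python
import numpy as np

def binary_quantize(x, c=1.0, shared_seed=42):
    """Quantize x in [-c, c] to {-c, +c} with P(+c) = (1 + x/c) / 2,
    so E[output] = x. A seed shared across clients (common randomness)
    makes their quantization errors correlated rather than independent."""
    u = np.random.default_rng(shared_seed).uniform(size=np.shape(x))
    return np.where(u < (1.0 + np.asarray(x) / c) / 2.0, c, -c)

x = np.array([0.2, -0.7, 0.5])
samples = [binary_quantize(x, shared_seed=s) for s in range(20000)]
print(np.mean(samples, axis=0))  # approaches x, confirming unbiasedness
```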
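
Trend 6 connects to the sourced work on global outlier detection with Isolation Forest; a minimal single-client sketch with scikit-learn follows. The synthetic data and contamination rate are made up, and a federated variant would additionally aggregate scores or models across clients.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
local_data = rng.normal(0.0, 1.0, size=(500, 4))  # one client's local data
local_data[:5] += 6.0                             # inject a few clear outliers

forest = IsolationForest(contamination=0.01, random_state=0).fit(local_data)
labels = forest.predict(local_data)               # -1 = outlier, +1 = inlier
print("flagged outliers:", int((labels == -1).sum()))
```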

Noteworthy Papers

  1. FedAT: Federated Adversarial Training for Distributed Insider Threat Detection: This paper introduces a novel FL approach for multiclass insider threat detection, addressing the challenges of non-IID data distribution and extreme class imbalance.

  2. DP$^2$-FedSAM: Enhancing Differentially Private Federated Learning Through Personalized Sharpness-Aware Minimization: The proposed method significantly improves the privacy-utility trade-off in DPFL, especially in heterogeneous data settings, by leveraging personalized model-sharing and sharpness-aware minimization.

  3. **

Sources

FedAT: Federated Adversarial Training for Distributed Insider Threat Detection

The Asymptotic Behaviour of Information Leakage Metrics

Data Poisoning and Leakage Analysis in Federated Learning

DP$^2$-FedSAM: Enhancing Differentially Private Federated Learning Through Personalized Sharpness-Aware Minimization

CorBin-FL: A Differentially Private Federated Learning Mechanism using Common Randomness

Data Distribution Shifts in (Industrial) Federated Learning as a Privacy Issue

Global Outlier Detection in a Federated Learning Setting with Isolation Forest

Towards Lightweight and Privacy-preserving Data Provision in Digital Forensics for Driverless Taxi

Perfect Gradient Inversion in Federated Learning: A New Paradigm from the Hidden Subset Sum Problem

Re-Evaluating Privacy in Centralized and Decentralized Learning: An Information-Theoretical and Empirical Study

FeDETR: a Federated Approach for Stenosis Detection in Coronary Angiography

PrivaMatch: A Privacy-Preserving DNA Matching Scheme for Forensic Investigation

UTrace: Poisoning Forensics for Private Collaborative Learning

Future-Proofing Medical Imaging with Privacy-Preserving Federated Learning and Uncertainty Quantification: A Review

PhD Forum: Efficient Privacy-Preserving Processing via Memory-Centric Computing

CryptoTrain: Fast Secure Training on Encrypted Dataset
