Federated Learning

Report on Current Developments in Federated Learning

General Direction of the Field

The field of Federated Learning (FL) is witnessing a significant shift towards addressing the complexities introduced by data heterogeneity and the dynamic nature of data distributions across clients. Recent advancements are focusing on developing more robust and adaptive algorithms that can handle the Non-Independent and Identically Distributed (Non-IID) data scenarios prevalent in real-world applications. This shift is driven by the need to improve model convergence rates, enhance model performance, and ensure the robustness of models against data perturbations and uncertainties.

One of the key areas of innovation is the development of personalized FL frameworks that can adapt to changing data distributions across clients. These frameworks aim to balance the trade-off between global model consistency and local model personalization, ensuring that the model generalizes well across diverse data environments. Techniques such as category decoupling, local data distribution reconstruction, and the use of generative models are being explored to mitigate the effects of data heterogeneity and improve the overall performance of FL systems.
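To make the global/local trade-off concrete, the following is a minimal sketch (not the pFedGRP method itself): a FedAvg-style weighted aggregation of client parameters, followed by a per-client interpolation between the shared global model and a locally fine-tuned one. The function names and the interpolation parameter `alpha` are illustrative assumptions, not part of any cited paper.

```python
import numpy as np

def aggregate_global(client_weights, client_sizes):
    """FedAvg-style aggregation: weight each client's parameter
    vector by its share of the total training data."""
    sizes = np.asarray(client_sizes, dtype=float)
    coeffs = sizes / sizes.sum()
    stacked = np.stack(client_weights)  # shape: (n_clients, dim)
    return np.tensordot(coeffs, stacked, axes=1)

def personalize(global_w, local_w, alpha):
    """Interpolate between the shared global model and a client's
    local model; alpha=0 is fully global, alpha=1 fully local."""
    return (1.0 - alpha) * global_w + alpha * local_w

# Two equally sized clients; the global model is their mean.
g = aggregate_global([np.array([0.0, 0.0]), np.array([2.0, 2.0])], [1, 1])
p = personalize(g, local_w=np.array([3.0, 3.0]), alpha=0.5)
```

In practice, `alpha` can be tuned per client (or learned), which is one simple way to let clients with highly non-IID data drift further from the global model.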

Another important trend is the integration of distributionally robust optimization (DRO) techniques into FL. These methods aim to train models that are resilient to uncertainties and perturbations in the data, which is particularly crucial in scenarios where data is subject to feature and label uncertainty. By incorporating ambiguity sets based on metrics like the Wasserstein distance, researchers are developing algorithms that can provide theoretical guarantees on the out-of-sample performance of the models, even in the presence of data heterogeneity.
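As a simplified illustration of the Wasserstein-ball idea (not the FDR-SVM formulation from the paper below), a known duality result states that for a 1-Lipschitz loss and a type-1 Wasserstein ambiguity set of radius eps around the empirical distribution, the worst-case expected loss reduces to the empirical loss plus eps times the dual norm of the weight vector. The sketch below applies this to the hinge loss, assuming a Euclidean ground metric and no label perturbations.

```python
import numpy as np

def wdro_hinge_objective(w, X, y, eps):
    """Distributionally robust hinge loss over a type-1 Wasserstein
    ball of radius eps (Euclidean ground metric, features only).
    Via duality, the worst case equals the empirical hinge loss
    plus eps * ||w||_2, so robustness acts as norm regularization."""
    margins = 1.0 - y * (X @ w)
    empirical = np.maximum(margins, 0.0).mean()
    return empirical + eps * np.linalg.norm(w)

# Perfectly separated toy data: the empirical hinge loss is zero,
# so the robust objective is just the eps * ||w|| penalty.
w = np.array([1.0, 0.0])
X = np.array([[2.0, 0.0], [-2.0, 0.0]])
y = np.array([1.0, -1.0])
robust_loss = wdro_hinge_objective(w, X, y, eps=0.1)
```

The radius eps directly controls the strength of the regularization, which is what yields the out-of-sample guarantees mentioned above: a larger ambiguity set hedges against larger distribution shifts at the cost of a more conservative model.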

The decentralization of FL is also gaining traction, with a focus on developing algorithms that can operate effectively in decentralized settings without the need for a central server. These algorithms leverage local statistical characteristics and neighbor-based communication to achieve consensus on the global data distribution, thereby improving convergence rates and model performance in non-IID scenarios.
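The neighbor-based consensus mechanism can be sketched as classic gossip averaging (a generic building block, not the FedEP algorithm itself): each client repeatedly replaces its local statistic with a weighted average of its neighbors' values, encoded by a doubly stochastic mixing matrix over the communication graph. For a connected graph, all clients converge to the global mean without any central server.

```python
import numpy as np

def gossip_consensus(local_stats, mixing, n_rounds):
    """Serverless consensus on a scalar statistic per client.
    `mixing` is a doubly stochastic matrix whose nonzero entries
    follow the communication graph; repeated multiplication drives
    every client's value toward the global average."""
    x = np.asarray(local_stats, dtype=float)
    for _ in range(n_rounds):
        x = mixing @ x  # each client averages over its neighbors
    return x

# Three fully connected clients with self-weight 0.5; their
# heterogeneous local statistics converge to the mean (3.0).
W = np.full((3, 3), 0.25) + 0.25 * np.eye(3)
consensus = gossip_consensus([0.0, 3.0, 6.0], W, n_rounds=60)
```

The convergence rate is governed by the second-largest eigenvalue of the mixing matrix, which is why graph topology matters so much for decentralized FL in non-IID settings.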

Noteworthy Papers

  • Personalized Federated Learning on Flowing Data Heterogeneity under Restricted Storage: Introduces a novel framework (pFedGRP) that addresses the challenges of dynamic data distributions in FL by reconstructing local data distributions and using a generator architecture for personalized aggregation.

  • FedCert: Federated Accuracy Certification: Proposes a method (FedCert) to evaluate the robustness of FL models against data perturbations by approximating certified accuracy based on client-specific data distributions, laying the groundwork for enhancing the dependability of decentralized learning.

  • A Federated Distributionally Robust Support Vector Machine with Mixture of Wasserstein Balls Ambiguity Set for Distributed Fault Diagnosis: Develops a distributionally robust SVM (FDR-SVM) using a Mixture of Wasserstein Balls ambiguity set, providing theoretical guarantees and practical algorithms for robust fault diagnosis in federated settings.

  • Distributionally Robust Clustered Federated Learning: A Case Study in Healthcare: Introduces Cross-silo Robust Clustered Federated Learning (CS-RCFL), which leverages Wasserstein distance to construct ambiguity sets and optimize clustering of clients to mitigate biases caused by heterogeneous data distributions.

  • FedEP: Tailoring Attention to Heterogeneous Data Distribution with Entropy Pooling for Decentralized Federated Learning: Proposes a decentralized FL aggregation algorithm (FedEP) that uses entropy pooling to tailor attention to heterogeneous data distributions, achieving faster convergence and higher test performance in non-IID scenarios.

Sources

Personalized Federated Learning on Flowing Data Heterogeneity under Restricted Storage

FedCert: Federated Accuracy Certification

A Federated Distributionally Robust Support Vector Machine with Mixture of Wasserstein Balls Ambiguity Set for Distributed Fault Diagnosis

Distributionally Robust Clustered Federated Learning: A Case Study in Healthcare

FedEP: Tailoring Attention to Heterogeneous Data Distribution with Entropy Pooling for Decentralized Federated Learning
