Privacy-Preserving Innovations in Machine Learning

Recent work on privacy-preserving technologies and machine learning has advanced three areas in particular: Quantitative Information Flow (QIF), Private Information Retrieval (PIR), and Differential Privacy (DP). In QIF, researchers are developing optimal solutions for minimizing information leakage in fixed systems against exact-guessing and s-distinguishing adversaries, with website fingerprinting defense as a practical application. Separately, the challenge of privately retrieving counterfactual explanations while keeping immutable features unchanged has been addressed.
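The leakage notion behind an exact-guessing adversary can be illustrated with a small calculation. The sketch below assumes that adversary is modeled by Bayes vulnerability, the standard QIF measure of the chance of guessing the secret in one try; the function names, the prior, and the channel matrix are illustrative and not taken from the cited paper.

```python
def prior_bayes_vulnerability(prior):
    """Best chance of guessing the secret before observing anything."""
    return max(prior)


def posterior_bayes_vulnerability(prior, channel):
    """Expected chance of guessing the secret after one observation.

    channel[x][y] is P(observation y | secret x); each row sums to 1.
    """
    n_secrets = len(prior)
    n_outputs = len(channel[0])
    return sum(
        max(prior[x] * channel[x][y] for x in range(n_secrets))
        for y in range(n_outputs)
    )


def multiplicative_leakage(prior, channel):
    """Factor by which observing the output improves the adversary's guess."""
    return posterior_bayes_vulnerability(prior, channel) / prior_bayes_vulnerability(prior)


# Hypothetical two-secret system with a noisy observable output.
prior = [0.5, 0.5]
channel = [
    [0.75, 0.25],
    [0.25, 0.75],
]
leakage = multiplicative_leakage(prior, channel)
```

A defense in this setting would aim to change the channel so that the leakage factor moves toward 1, the value at which observing the output gives the adversary no advantage.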

In DP, auditing procedures for DP-SGD with shuffling have been introduced, revealing that reported privacy guarantees can be significantly overestimated. This work underscores the importance of rigorous auditing to ensure the integrity of privacy claims in machine learning models. Furthermore, the study of learning problems in Euclidean spaces has led to new insights into the expressivity of reductions and the role of randomness, challenging previous assumptions about the VC dimension and its implications for learning algorithms.
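To show how an audit can contradict a claimed guarantee, the sketch below converts the error rates of a membership-style distinguishing attack into a lower bound on epsilon. It relies on the standard hypothesis-testing consequence of (epsilon, delta)-DP, namely FPR + e^epsilon * FNR >= 1 - delta and its symmetric counterpart. It is not the procedure of the cited paper, the function name is hypothetical, and a real audit would replace the point estimates with confidence bounds on the error rates.

```python
import math


def empirical_epsilon_lower_bound(fpr, fnr, delta=0.0):
    """Lower bound on epsilon implied by an attack's error rates.

    fpr: false positive rate of the attack (point estimate).
    fnr: false negative rate of the attack (point estimate).
    delta: the delta of the (epsilon, delta)-DP guarantee being audited.
    """
    candidates = [0.0]  # epsilon is never below zero
    if fnr > 0 and 1 - delta - fpr > 0:
        candidates.append(math.log((1 - delta - fpr) / fnr))
    if fpr > 0 and 1 - delta - fnr > 0:
        candidates.append(math.log((1 - delta - fnr) / fpr))
    return max(candidates)


# Hypothetical attack that errs 10% of the time in each direction.
eps_lower = empirical_epsilon_lower_bound(fpr=0.1, fnr=0.1)
```

If the bound computed this way exceeds the epsilon claimed for a training run, the claimed guarantee cannot hold for the mechanism that was actually executed.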

Two papers stand out. One proposes improved PIR schemes using matching vectors and derivatives, significantly reducing communication complexity. The other frames patent novelty as a textual entailment problem, introducing a novel dataset and demonstrating the effectiveness of large language models in predicting patent claim revisions. These contributions advance the theoretical underpinnings of privacy and machine learning and offer practical solutions that could impact real-world applications.

Sources

Self-Defense: Optimal QIF Solutions and Application to Website Fingerprinting

Private Counterfactual Retrieval With Immutable Features

To Shuffle or not to Shuffle: Auditing DP-SGD with Shuffling

On Reductions and Representations of Learning Problems in Euclidean Spaces

Improved PIR Schemes using Matching Vectors and Derivatives

PatentEdits: Framing Patent Novelty as Textual Entailment

Differentially Private Learning Beyond the Classical Dimensionality Regime

$d_X$-Privacy for Text and the Curse of Dimensionality
