Machine Learning Robustness, Privacy, and Security

Report on Current Developments in the Research Area

General Direction of the Field

Recent advances in this research area focus predominantly on enhancing the robustness, privacy, and security of machine learning models, particularly in the contexts of data attribution, adversarial attacks, and diffusion models. The field is moving toward more resilient and secure systems that can withstand adversarial manipulation and protect user privacy, a shift driven by growing awareness of the vulnerabilities of current models and their potential for malicious exploitation.

One of the key areas of innovation is the development of methods to detect and mitigate adversarial attacks on data attribution. Researchers are exploring novel techniques to quantify the contribution of individual training data points to model outputs, while also ensuring that these attributions are robust against adversarial tampering. This includes the design of sophisticated threat models and the creation of principled attack methods that can systematically manipulate data attribution.
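
To make the attack surface concrete, the sketch below computes a first-order, gradient-similarity proxy for influence-based data attribution. It is an illustrative stand-in, not the method of any cited paper: exact influence functions additionally involve an inverse-Hessian-vector product, and the `model`, `loss_fn`, and example tensors are assumed to exist. An attacker in this setting perturbs training points so that scores like these are systematically inflated or suppressed.

```python
# Minimal sketch of a first-order, influence-style attribution score.
# Assumptions: a trained PyTorch `model`, a `loss_fn`, and single (x, y)
# tensors for one training example and one test example.
import torch

def per_example_grad(model, loss_fn, x, y):
    """Flattened gradient of the loss at a single example."""
    loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
    params = [p for p in model.parameters() if p.requires_grad]
    grads = torch.autograd.grad(loss, params)
    return torch.cat([g.reshape(-1) for g in grads])

def influence_score(model, loss_fn, x_train, y_train, x_test, y_test):
    """Higher score => perturbing or removing the training point is expected
    to change the test loss more (first-order gradient-similarity proxy)."""
    g_train = per_example_grad(model, loss_fn, x_train, y_train)
    g_test = per_example_grad(model, loss_fn, x_test, y_test)
    return torch.dot(g_train, g_test).item()
```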

Another significant trend is the integration of secure multi-party computation (MPC) techniques into diffusion models to address privacy concerns. This approach protects user privacy during the sampling process by computing nonlinear activations securely, while reducing the computation and communication costs that MPC protocols typically incur. The focus is on versatile, universal frameworks that can be applied across a wide range of diffusion model-based tasks.
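
As a toy illustration of why nonlinear activations dominate MPC costs, the sketch below shows additive secret sharing over a ring: linear operations can be evaluated locally on shares, whereas nonlinearities require interaction and are usually replaced by low-degree polynomial approximations. This is a generic sketch under illustrative parameters, not CipherDM's actual protocol.

```python
# Toy additive secret sharing, the building block behind MPC-based sampling.
# Real protocols work over fixed-point rings, use Beaver triples for
# share-wise multiplication, and carefully approximate activations.
import random

MOD = 2**32  # illustrative ring size

def share(secret, n_parties=3):
    """Split an integer secret into n additive shares modulo MOD."""
    shares = [random.randrange(MOD) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % MOD)
    return shares

def reconstruct(shares):
    return sum(shares) % MOD

# Linear operations are "free": each party scales its own share locally,
# and the reconstruction equals the plaintext result.
x_shares = share(42)
wx_shares = [(3 * s) % MOD for s in x_shares]
assert reconstruct(wx_shares) == (3 * 42) % MOD

# Nonlinear activations (e.g., SiLU, Softmax in diffusion models) cannot be
# computed share-locally; they are the main source of communication cost and
# are typically approximated by polynomials evaluated under MPC.
```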

Additionally, there is a growing interest in developing tools that empower users to detect unauthorized use of their data in training deep learning models. These tools leverage membership inference techniques to trace data provenance and provide users with the ability to audit whether their data has been used without consent.
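
The sketch below shows a minimal loss-threshold membership check, the basic signal such auditing tools build on. It is a simplified illustration rather than the cited tool's actual procedure; `model`, `loss_fn`, the reference data, and the threshold are all assumptions.

```python
# Minimal loss-threshold membership inference sketch: data seen in training
# tends to have markedly lower loss than comparable held-out data.
import torch

@torch.no_grad()
def per_sample_losses(model, loss_fn, xs, ys):
    model.eval()
    logits = model(xs)
    return torch.stack([loss_fn(logits[i:i + 1], ys[i:i + 1])
                        for i in range(len(xs))])

def likely_member(model, loss_fn, user_x, user_y, ref_x, ref_y):
    """Flag the user's data as likely used in training if its average loss
    is well below the loss on reference data known to be held out."""
    user_loss = per_sample_losses(model, loss_fn, user_x, user_y).mean()
    ref_loss = per_sample_losses(model, loss_fn, ref_x, ref_y).mean()
    return (user_loss < 0.5 * ref_loss).item()  # 0.5 is an illustrative threshold
```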

Finally, the field is exploring new attack surfaces in distributed learning, particularly gradient inversion. Researchers are proposing methods that use denoising diffusion models as strong priors to enhance the recovery of sensitive information, even in large-batch settings.
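
The sketch below shows the core gradient-matching loop behind gradient inversion: the attacker optimizes dummy inputs so that their gradients reproduce the gradients observed from the victim's update. In the cited line of work, a denoising diffusion model supplies a much stronger prior than the simple total-variation term used here; function names, shapes, and hyperparameters are illustrative assumptions.

```python
# Minimal gradient-inversion-by-gradient-matching sketch (single example).
import torch

def gradient_inversion(model, loss_fn, observed_grads, label, input_shape,
                       steps=500, lr=0.1, tv_weight=1e-4):
    dummy_x = torch.randn(1, *input_shape, requires_grad=True)
    optimizer = torch.optim.Adam([dummy_x], lr=lr)
    params = [p for p in model.parameters() if p.requires_grad]

    for _ in range(steps):
        optimizer.zero_grad()
        loss = loss_fn(model(dummy_x), label)
        dummy_grads = torch.autograd.grad(loss, params, create_graph=True)
        # Match the gradients the attacker observed from the victim's update.
        match = sum(((dg - og) ** 2).sum()
                    for dg, og in zip(dummy_grads, observed_grads))
        # Weak hand-crafted image prior (total variation); a denoising
        # diffusion model would act as a far stronger prior at this step.
        tv = (dummy_x[..., 1:, :] - dummy_x[..., :-1, :]).abs().sum() + \
             (dummy_x[..., :, 1:] - dummy_x[..., :, :-1]).abs().sum()
        (match + tv_weight * tv).backward()
        optimizer.step()
    return dummy_x.detach()
```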

Noteworthy Papers

  • Influence-based Attributions can be Manipulated: This work demonstrates that influence-based attributions are vulnerable to adversarial manipulation, raising important questions about their reliability in adversarial settings.

  • CipherDM: Secure Three-Party Inference for Diffusion Model Sampling: CipherDM introduces a novel framework for secure sampling in diffusion models, significantly improving running time and reducing communication costs.

  • Adversarial Attacks on Data Attribution: This paper addresses the gap in understanding the adversarial robustness of data attribution methods, proposing two principled attack methods for manipulating attribution outcomes.

  • Catch Me if You Can: Detecting Unauthorized Data Use in Deep Learning Models: MembershipTracker empowers users to detect unauthorized data use, demonstrating high effectiveness across various settings.

  • Exploring User-level Gradient Inversion with a Diffusion Prior: This work introduces a novel gradient inversion attack that leverages diffusion models to enhance recovery of sensitive information.

  • CPSample: Classifier Protected Sampling for Guarding Training Data During Diffusion: CPSample prevents training data replication while preserving image quality, offering greater robustness against membership inference attacks.

Sources

Influence-based Attributions can be Manipulated

CipherDM: Secure Three-Party Inference for Diffusion Model Sampling

Adversarial Attacks on Data Attribution

Catch Me if You Can: Detecting Unauthorized Data Use in Deep Learning Models

Exploring User-level Gradient Inversion with a Diffusion Prior

CPSample: Classifier Protected Sampling for Guarding Training Data During Diffusion