Advancements in Adversarial Robustness and Model Security

Research in artificial intelligence is increasingly focused on building models that remain robust and secure under adversarial attacks and data perturbations. Recent work has concentrated on strengthening resistance to transferable adversarial examples, backdoor attacks, and input noise. Approaches such as trigger activation and proactive model robustification have shown promising results in improving model security. In parallel, evaluation frameworks such as FLUKE and RoMA enable more systematic, task-agnostic assessments of model vulnerability. Noteworthy papers in this area include TrojanDam, which proposes a backdoor defense mechanism that uses out-of-distribution samples, and Towards Model Resistant to Transferable Adversarial Examples via Trigger Activation, which introduces a new training paradigm for resisting transferable adversarial examples.
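As a rough illustration of the kind of evaluation this line of work targets, the sketch below estimates how well a target model withstands adversarial examples transferred from a separate surrogate model, using a single-step FGSM attack. This is a generic, hypothetical measurement harness, not the trigger-activation defense or any method from the cited papers; the models, data loader, and epsilon value are placeholders.

```python
# Hypothetical sketch: measuring robustness to *transferable* adversarial examples.
# Perturbations are crafted against a surrogate model, then evaluated on the target.
import torch
import torch.nn.functional as F


def fgsm_perturb(surrogate, x, y, eps=8 / 255):
    """Craft adversarial examples on the surrogate model with one FGSM step."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(surrogate(x_adv), y)
    loss.backward()
    x_adv = x_adv + eps * x_adv.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()


@torch.no_grad()
def transfer_robust_accuracy(target, surrogate, loader, eps=8 / 255):
    """Accuracy of `target` on adversarial examples crafted against `surrogate`."""
    correct, total = 0, 0
    for x, y in loader:
        # Gradients are needed only for crafting the perturbation.
        with torch.enable_grad():
            x_adv = fgsm_perturb(surrogate, x, y, eps)
        pred = target(x_adv).argmax(dim=1)
        correct += (pred == y).sum().item()
        total += y.numel()
    return correct / total
```

A higher transfer-robust accuracy relative to clean accuracy indicates that adversarial examples crafted on the surrogate transfer poorly to the target, which is the property the training-paradigm work above aims to strengthen.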

Sources

Towards Model Resistant to Transferable Adversarial Examples via Trigger Activation

TrojanDam: Detection-Free Backdoor Defense in Federated Learning through Proactive Model Robustification utilizing OOD Data

Impact of Noise on LLM-Models Performance in Abstraction and Reasoning Corpus (ARC) Tasks with Model Temperature Considerations

Property-Preserving Hashing for $\ell_1$-Distance Predicates: Applications to Countering Adversarial Input Attacks

FLUKE: A Linguistically-Driven and Task-Agnostic Framework for Robustness Evaluation

Evaluating the Vulnerability of ML-Based Ethereum Phishing Detectors to Single-Feature Adversarial Perturbations

Towards Robust LLMs: an Adversarial Robustness Measurement Framework
