The field of text-to-image (T2I) diffusion models is evolving rapidly, with a strong focus on safety, integrity, and efficiency in model deployment. Recent work introduces frameworks for mitigating harmful content generation, verifying model integrity, and improving the generalization and robustness of forensic detectors. One notable trend is the development of adaptable, efficient defenses against Not-Safe-for-Work (NSFW) prompts, alongside methods for verifying the integrity of black-box T2I models. There is also growing interest in model bias and the propagation of undesirable behaviors, studied through new training paradigms and attack strategies. Together, these advances support safer and more responsible use of T2I diffusion models across applications.
Noteworthy Papers
- Efficient Fine-Tuning and Concept Suppression for Pruned Diffusion Models: Introduces a bilevel optimization framework that jointly fine-tunes pruned diffusion models and suppresses unwanted concepts, improving both safety and efficiency.
- SafeCFG: Redirecting Harmful Classifier-Free Guidance for Safe Generation: Presents the Harmful Guidance Redirector (HGR), which redirects harmful classifier-free guidance during sampling to achieve both high safety and high generation quality.
- PromptLA: Towards Integrity Verification of Black-box Text-to-Image Diffusion Models: Proposes a novel prompt selection algorithm for efficient integrity verification of T2I models.
- A Bias-Free Training Paradigm for More General AI-generated Image Detection: Introduces B-Free, a bias-free training paradigm improving generalization and robustness of forensic detectors.
- AEIOU: A Unified Defense Framework against NSFW Prompts in Text-to-Image Models: Develops AEIOU, a versatile defense framework against NSFW prompts with high accuracy and efficiency.
- FameBias: Embedding Manipulation Bias Attack in Text-to-Image Models: Introduces FameBias, a T2I biasing attack that manipulates input embeddings without additional model training.
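Several of the papers above intervene in classifier-free guidance (CFG), the sampling rule that SafeCFG's HGR redirects. For context, a minimal sketch of standard CFG, which extrapolates from the unconditional noise estimate toward the prompt-conditioned one (the function name and toy arrays are illustrative; the redirection mechanism itself is not shown):

```python
import numpy as np

def cfg_noise_estimate(eps_uncond, eps_cond, guidance_scale):
    """Standard classifier-free guidance: push the noise prediction
    away from the unconditional estimate, toward the conditional one.
    guidance_scale = 1.0 recovers plain conditional sampling."""
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

# Toy placeholder estimates; in a real model these are U-Net outputs
# with the same shape as the latent image.
eps_uncond = np.zeros((2, 2))
eps_cond = np.ones((2, 2))
print(cfg_noise_estimate(eps_uncond, eps_cond, 7.5))  # all entries 7.5
```

Because the conditional term is what pulls generation toward the prompt, a harmful prompt makes this extrapolation itself the attack surface, which is why redirecting the guidance direction (rather than filtering the prompt) can preserve image quality for benign inputs.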