Personalized Image Generation and Text-to-Image Synthesis

Image generation and text-to-image synthesis are advancing rapidly, with current work focused on personalized and controllable generation. Recent developments aim to improve the quality and consistency of generated images and to incorporate user preferences and styles more faithfully. Notable techniques include disentangled representation learning, direct preference optimization, and self-supervised training. Together, these approaches are reported to improve the realism and coherence of generated images and to make it easier to tailor generation to specific user needs.
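Direct preference optimization is named above without detail, so the following is a minimal sketch of the generic pairwise DPO objective, not the method of any paper listed here. SUDO and DSPO adapt the idea to diffusion models and super-resolution in ways this sketch does not cover. The function names, the scalar log-probability inputs, and the default `beta` are assumptions chosen for illustration.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(
    policy_logp_preferred: float,
    policy_logp_rejected: float,
    ref_logp_preferred: float,
    ref_logp_rejected: float,
    beta: float = 0.1,  # illustrative default, not taken from any listed paper
) -> float:
    """Pairwise DPO loss for one (preferred, rejected) sample pair.

    Each argument is the log-probability that the trainable policy or the
    frozen reference model assigns to the preferred or rejected sample.
    """
    # How much more the policy likes each sample than the reference does.
    preferred_margin = policy_logp_preferred - ref_logp_preferred
    rejected_margin = policy_logp_rejected - ref_logp_rejected
    # The loss falls as the policy favors the preferred sample over the
    # rejected one by more than the reference model does.
    return -log_sigmoid(beta * (preferred_margin - rejected_margin))
```

When the policy and the reference agree on both samples, the loss equals log 2; it drops below that value once the policy shifts probability toward the preferred sample relative to the reference.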

Particularly noteworthy papers include:

DuoLoRA proposes a content-style personalization framework that is reported to outperform state-of-the-art methods.

RefVNLI introduces a cost-effective metric for evaluating subject-driven text-to-image generation that is reported to outperform existing baselines.

DreamO presents a unified framework for image customization that integrates multiple conditions.

SUDO optimizes both fine-grained details and global image quality in text-to-image diffusion models.

DSPO aligns instance-level human preferences in real-world image super-resolution using semantic guidance.

FreeGraftor enables precise subject identity transfer in subject-driven text-to-image generation without model fine-tuning or additional training.

DRC enhances personalized image generation via disentangled representation composition and mitigates the guidance collapse issue.

Sources

DuoLoRA : Cycle-consistent and Rank-disentangled Content-Style Personalization

Towards NSFW-Free Text-to-Image Generation via Safety-Constraint Direct Preference Optimization

SUDO: Enhancing Text-to-Image Diffusion Models with Self-Supervised Direct Preference Optimization

Insert Anything: Image Insertion via In-Context Editing in DiT

DSPO: Direct Semantic Preference Optimization for Real-World Image Super-Resolution

LLM-Enabled Style and Content Regularization for Personalized Text-to-Image Generation

FreeGraftor: Training-Free Cross-Image Feature Grafting for Subject-Driven Text-to-Image Generation

DreamO: A Unified Framework for Image Customization

DRC: Enhancing Personalized Image Generation via Disentangled Representation Composition

RefVNLI: Towards Scalable Evaluation of Subject-driven Text-to-image Generation
