Advances in Adversarial Robustness and Watermarking for Speech and Image Synthesis

Research in speech and image synthesis is advancing on two fronts: improving adversarial robustness and developing effective watermarking techniques. In speech, recent work combines generative adversarial networks (GANs) with optimal transport theory to improve the naturalness of generated samples, while watermarking methods such as temporal-aware robust watermarking and low-rank adaptation aim to protect speech synthesis models from unauthorized use. In imaging, frequency-domain learning with a kernel prior has improved the generalization of deep learning methods for blind image deblurring, and deepfake detection has advanced through spatial-frequency collaborative learning and hierarchical cross-modal fusion.

Notable papers include a Collective Learning Mechanism-based Optimal Transport GAN, which reports state-of-the-art results on non-parallel voice conversion, and SOLIDO, a novel generative watermarking method that produces high-fidelity watermarked speech and maintains high extraction accuracy under common attacks.
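As background on the low-rank adaptation (LoRA) technique that SOLIDO builds on, the sketch below shows the core idea in NumPy: a frozen weight matrix is augmented with a trainable low-rank update scaled by `alpha / r`. This is a generic illustration of LoRA, not SOLIDO's actual watermarking implementation; all names and dimensions here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r = 16, 16, 4          # illustrative sizes; r is the LoRA rank
alpha = 8.0                          # LoRA scaling hyperparameter

W = rng.standard_normal((d_out, d_in))       # frozen base weight
A = rng.standard_normal((r, d_in)) * 0.01    # trainable down-projection
B = np.zeros((d_out, r))                     # trainable up-projection, zero-initialized

def forward(x):
    # Base output plus the low-rank update (alpha/r) * B @ A @ x.
    # Only A and B would be trained; W stays frozen.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# With B zero-initialized, the adapted model exactly matches the base model.
assert np.allclose(forward(x), W @ x)
```

Zero-initializing `B` is the standard LoRA trick: training starts from the unmodified base model, and the adapter only adds the low-rank term `B @ A` as it learns.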
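The deblurring work above operates in the frequency domain. As classical background for that idea (not the paper's learned method), the sketch below implements Wiener deconvolution: dividing the blurred image's spectrum by the blur kernel's spectrum, regularized by a small noise term so near-zero frequencies do not blow up.

```python
import numpy as np

def wiener_deblur(blurred, kernel, snr=1e-3):
    # Pad the kernel's FFT to the image size, then apply the Wiener filter
    # conj(K) / (|K|^2 + snr) in the frequency domain.
    K = np.fft.fft2(kernel, s=blurred.shape)
    H = np.conj(K) / (np.abs(K) ** 2 + snr)
    return np.real(np.fft.ifft2(np.fft.fft2(blurred) * H))

# Illustrative round trip: blur a test image by circular convolution, then recover it.
img = np.zeros((16, 16))
img[4:12, 4:12] = 1.0
kernel = np.ones((3, 3)) / 9.0       # 3x3 box blur
K = np.fft.fft2(kernel, s=img.shape)
blurred = np.real(np.fft.ifft2(np.fft.fft2(img) * K))
restored = wiener_deblur(blurred, kernel, snr=1e-8)
```

The `snr` term trades sharpness for noise suppression; with noiseless synthetic data a tiny value recovers the image almost exactly, while real blurred images need a larger value.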

Sources

Collective Learning Mechanism based Optimal Transport Generative Adversarial Network for Non-parallel Voice Conversion

Frequency-domain Learning with Kernel Prior for Blind Image Deblurring

Protecting Your Voice: Temporal-aware Robust Watermarking

Fast Adversarial Training with Weak-to-Strong Spatial-Temporal Consistency in the Frequency Domain on Videos

SOLIDO: A Robust Watermarking Method for Speech Synthesis via Low-Rank Adaptation

Quantifying Source Speaker Leakage in One-to-One Voice Conversion

Towards Generalizable Deepfake Detection with Spatial-Frequency Collaborative Learning and Hierarchical Cross-Modal Fusion

CoheMark: A Novel Sentence-Level Watermark for Enhanced Text Quality

Unified Attacks to Large Language Model Watermarks: Spoofing and Scrubbing in Unauthorized Knowledge Distillation