Advances in Facial Expression Recognition and Human Image Synthesis

The field of facial expression recognition and human image synthesis is advancing rapidly, with a focus on improving the nuance and controllability of emotional expressions. Recent work has produced models that capture subtle variations in emotion and generate highly realistic human images. Notably, ordinal ranking and learning-to-rank frameworks have improved the ability of AI systems to interpret emotional nuance, while advances in diffusion-based methods have raised the quality and controllability of generated images. Research has also shown the value of disentangling factors such as viewpoint, pose, clothing, and identity in human image synthesis, yielding more flexible and generalizable models.

Noteworthy papers include Rank-O-ToM, a framework for enhancing affective Theory of Mind that leverages ordinal ranking to align confidence levels with the emotional spectrum; HAPI, a learning-to-rank framework that uses human feedback to generate more realistic and socially resonant robotic facial expressions; and NullSwap, a proactive defense that cloaks source-image identities and nullifies face swapping under a pure black-box scenario.
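To make the learning-to-rank idea concrete, here is a minimal sketch of a pairwise margin (hinge) ranking loss, the standard objective behind many such frameworks: items with a higher ordinal rank (e.g. stronger emotional intensity) should receive higher model scores by at least a margin. This is an illustrative example, not the actual implementation of Rank-O-ToM or HAPI; the function name and arguments are assumptions for this sketch.

```python
def pairwise_ranking_loss(scores, ranks, margin=1.0):
    """Average hinge loss over all ordered pairs.

    scores: model scores, one per item (e.g. predicted emotion intensity).
    ranks:  ground-truth ordinal ranks; ranks[i] < ranks[j] means item j
            should be scored higher than item i.
    For each such pair, penalize max(0, margin - (scores[j] - scores[i])).
    """
    loss, pairs = 0.0, 0
    for i in range(len(scores)):
        for j in range(len(scores)):
            if ranks[i] < ranks[j]:  # j should outrank i
                loss += max(0.0, margin - (scores[j] - scores[i]))
                pairs += 1
    return loss / max(pairs, 1)
```

With scores that respect the ordinal ranks by at least the margin, the loss is zero; a misordered pair contributes margin plus the size of the violation.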

Sources

Rank-O-ToM: Unlocking Emotional Nuance Ranking to Enhance Affective Theory-of-Mind

HAPI: A Model for Learning Robot Facial Expressions from Human Preferences

NullSwap: Proactive Identity Cloaking Against Deepfake Face Swapping

HunyuanPortrait: Implicit Condition Control for Enhanced Portrait Animation

DeClotH: Decomposable 3D Cloth and Human Body Reconstruction from a Single Image

MVPortrait: Text-Guided Motion and Emotion Control for Multi-view Vivid Portrait Animation

EmoHead: Emotional Talking Head via Manipulating Semantic Expression Parameters

Exploring Disentangled and Controllable Human Image Synthesis: From End-to-End to Stage-by-Stage

ITA-MDT: Image-Timestep-Adaptive Masked Diffusion Transformer Framework for Image-Based Virtual Try-On

Evaluating Facial Expression Recognition Datasets for Deep Learning: A Benchmark Study with Novel Similarity Metrics

Disentangled Source-Free Personalization for Facial Expression Recognition with Neutral Target Data

DynamiCtrl: Rethinking the Basic Structure and the Role of Text for High-quality Human Image Animation

High-Fidelity Diffusion Face Swapping with ID-Constrained Facial Conditioning

Follow Your Motion: A Generic Temporal Consistency Portrait Editing Framework with Trajectory Guidance
