Human-object interaction (HOI) synthesis is evolving rapidly, with a focus on more realistic and physically plausible models. Recent research uses diffusion-based methods, vision-language models, and multimodal priors to improve the accuracy and diversity of synthesized interactions. These advances enable the generation of more complex and realistic interactions between humans and objects, with applications in virtual reality, robotics, and animation. Novel synthesis frameworks in this area build on autoregressive diffusion models and layout-instructed diffusion models. Notable papers:
- Auto-Regressive Diffusion for Generating 3D Human-Object Interactions presents a novel autoregressive diffusion model that predicts the next continuous token in a sequence of human-object interactions.
- Re-HOLD: Video Hand Object Interaction Reenactment via adaptive Layout-instructed Diffusion Model introduces a specialized layout representation for hands and objects, enabling effective disentanglement of hand modeling and object adaptation.