Advancing 3D Generation and Human-Object Interaction Detection

Recent work in 3D generation and human-object interaction (HOI) detection has made significant progress, particularly in realism and interactivity. In 3D generation, there is a notable shift towards integrating physics-grounded motion synthesis with text-to-3D frameworks, enabling the creation of photo-realistic 3D objects with accurate physical behavior. This approach improves not only the fidelity of 3D models but also their dynamic interactions, making them better suited to virtual and mixed-reality applications.

In HOI detection, the focus has been on better representing diverse intra-category patterns and inter-category dependencies through prompt distribution learning. These methods help detectors identify uncommon visual patterns and distinguish between ambiguous HOIs, leading to more robust and versatile HOI recognition systems. In addition, spatial context learning further refines HOI detection by leveraging the background and surroundings, which is crucial when foreground instances are occluded or blurred.

Noteworthy developments include a unified model of 3D human-object interaction that operates in any direction across humans, objects, and interactions, improving both qualitative and quantitative results. Another standout is the synergy between 2D and 3D diffusion models for realistic image-to-3D generation, which yields high-fidelity geometry and texture while maintaining 3D consistency. Together, these innovations push the boundaries of 3D generation and HOI detection, paving the way for more immersive and interactive digital experiences.

Sources

Text-to-3D Gaussian Splatting with Physics-Grounded Motion Generation

TriDi: Trilateral Diffusion of 3D Humans, Objects, and Interactions

Gen-3Diffusion: Realistic Image-to-3D Generation via 2D & 3D Diffusion Synergy

Orchestrating the Symphony of Prompt Distribution Learning for Human-Object Interaction Detection

ContextHOI: Spatial Context Learning for Human-Object Interaction Detection

LIVE-GS: LLM Powers Interactive VR by Enhancing Gaussian Splatting

SimAvatar: Simulation-Ready Avatars with Layered Hair and Clothing
