Enhancing Realism and Functionality in 3D Modeling, Simulation, and Robotics

Advances in 3D Modeling, Simulation, and Robotics

Recent work in 3D modeling, simulation, and robotics has focused on making digital environments and robotic systems more realistic and more functional. In 3D modeling, progress in monocular depth estimation and food portion estimation builds on 3D reconstruction and generative models to improve accuracy and real-world applicability. In simulation, improvements in metric depth estimation and realistic material synthesis address long-standing challenges in scene understanding and material representation. In robotics, new simulation platforms support robotic skill learning in unbounded soft environments, enabling more efficient data collection and policy evaluation.

Noteworthy contributions include:

  • MFP3D: a framework for accurate food portion estimation from monocular images, supporting dietary monitoring.
  • MetricGold: an approach to metric depth estimation that leverages generative diffusion models for improved scene understanding.
  • UBSoft: a simulation platform for robotic skill learning in unbounded soft environments that reduces computational costs and storage requirements.
  • A method for 4D dynamic scene simulation that integrates multi-modal foundation models and video diffusion for greater realism and flexibility.
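Several of the contributions above connect a single image to 3D geometry by way of metric depth. As a minimal sketch of that link, and not the method of any paper listed here, the following back-projects a metric depth map into camera-frame 3D points with the standard pinhole camera model. The function name `backproject_depth`, the toy depth map, and the intrinsics `fx`, `fy`, `cx`, `cy` are all hypothetical values chosen for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def backproject_depth(
    depth: List[List[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Back-project a metric depth map into camera-frame 3D points.

    Assumes a pinhole camera: pixel (u, v) with depth z maps to
    x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    Non-positive depths are treated as invalid and skipped.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue  # no valid depth at this pixel
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Toy 2x2 depth map in meters; 0.0 marks a missing measurement.
depth = [[2.0, 2.0], [0.0, 4.0]]
# Hypothetical intrinsics, chosen only to keep the arithmetic simple.
cloud = backproject_depth(depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
```

Because the depth is metric, the resulting points are in real-world units, which is what makes downstream quantities such as object size or volume meaningful. With relative (scale-ambiguous) depth, the same code would produce a point cloud that is correct only up to an unknown scale.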

Sources

MFP3D: Monocular Food Portion Estimation Leveraging 3D Point Clouds

MetricGold: Leveraging Text-To-Image Latent Diffusion Models for Metric Depth Estimation

Towards Mitigating Sim2Real Gaps: A Formal Quantitative Approach

MVLight: Relightable Text-to-3D Generation via Light-conditioned Multi-View Diffusion

RoboGSim: A Real2Sim2Real Robotic Gaussian Splatting Simulator

NeuMaDiff: Neural Material Synthesis via Hyperdiffusion

MTFusion: Reconstructing Any 3D Object from Single Image Using Multi-word Textual Inversion

UBSoft: A Simulation Platform for Robotic Skill Learning in Unbounded Soft Environments

Automated 3D Physical Simulation of Open-world Scene with Gaussian Splatting

XMask3D: Cross-modal Mask Reasoning for Open Vocabulary 3D Semantic Segmentation

Unleashing the Potential of Multi-modal Foundation Models and Video Diffusion for 4D Dynamic Physical Scene Simulation
