Recent work in 3D scene understanding has shifted toward dynamic and interactive environments, where scene representations must be updated in real time and remain consistent over long horizons. Methods for tracking dynamic objects, particularly from egocentric viewpoints, now produce more accurate and smoother 6DoF object trajectories. These gains matter for robotic applications that must retrieve and manipulate objects precisely in changing environments. Combining generative models with motion field priors has made motion prediction more reliable, especially when observations are sparse, which in turn supports safer autonomous navigation. Probabilistic Gaussian superposition models have improved both the efficiency and the accuracy of 3D occupancy prediction by addressing the spatial sparsity of driving scenes. Overall, the field is moving toward more holistic and adaptive scene understanding, using machine learning techniques and probabilistic modeling to handle the complexity of dynamic and interactive environments.
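
To make the probabilistic Gaussian superposition idea mentioned above more concrete, the sketch below shows one plausible formulation. It treats each Gaussian as giving an independent probability that a point is occupied and combines them as 1 - prod(1 - p_i). This is an illustrative assumption, not the formulation of any specific method summarized here. The function names, the axis-aligned (diagonal) covariance, and the sample values are all hypothetical.

```python
import math

def gaussian_occupancy(point, mean, scale, opacity=1.0):
    """Probability that one axis-aligned Gaussian occupies `point`.

    Assumption: diagonal covariance given as per-axis standard
    deviations in `scale`; `opacity` caps the peak probability.
    """
    d2 = sum(((p - m) / s) ** 2 for p, m, s in zip(point, mean, scale))
    return opacity * math.exp(-0.5 * d2)

def superposed_occupancy(point, gaussians):
    """Combine per-Gaussian probabilities as 1 - prod(1 - p_i).

    A point is empty only if every Gaussian leaves it empty, so the
    result stays in [0, 1] however many Gaussians overlap.
    """
    p_empty = 1.0
    for mean, scale, opacity in gaussians:
        p_empty *= 1.0 - gaussian_occupancy(point, mean, scale, opacity)
    return 1.0 - p_empty

# Hypothetical scene: two Gaussians as (mean, per-axis scale, opacity).
gaussians = [
    ((0.0, 0.0, 0.0), (1.0, 1.0, 1.0), 1.0),
    ((4.0, 0.0, 0.0), (0.5, 0.5, 0.5), 0.8),
]

near = superposed_occupancy((4.0, 0.0, 0.0), gaussians)
far = superposed_occupancy((100.0, 100.0, 100.0), gaussians)
```

Under these assumptions, a point far from every Gaussian gets an occupancy close to zero. Empty regions of a sparse scene therefore need no dedicated primitives, which is one way to read the efficiency claim above.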