The field of 3D scene reconstruction and generation is advancing rapidly, with a focus on creating immersive, realistic scenes. Researchers are exploring novel approaches to challenges such as visual discontinuities, scene voids, and limited navigational freedom.

One notable direction is hierarchical layered 3D scene reconstruction, where frameworks integrate open-vocabulary segmentation models with diffusion models to restore occluded regions and generate high-quality scenes. Another is the creation of unified evaluation benchmarks for world generation, enabling the assessment of diverse approaches and revealing key insights into their strengths and weaknesses. Generative pipelines are also being developed to synthesize traversable 3D scenes from text prompts, using panoramic videos as intermediate representations to model 360-degree scene detail. Finally, NeRF-based generative methods are being proposed to produce realistic room-level indoor scenes from in-the-wild images, addressing the difficulty of estimating camera poses while achieving superior view-to-view consistency and semantic normality.

Noteworthy papers include:
- Scene4U, which proposes a hierarchical layered 3D scene reconstruction framework and achieves state-of-the-art results.
- WorldPrompter, which introduces a generative pipeline for synthesizing traversable 3D scenes from text prompts and demonstrates high-quality panoramic Gaussian splat reconstruction.
- LPA3D, which presents a NeRF-based generative approach that synthesizes realistic room-level indoor scenes from in-the-wild images, achieving superior view-to-view consistency and semantic normality.
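
The layered reconstruction direction described above can be sketched at a high level: segment a view into depth-ordered layers, restore the regions each layer occludes, and process the layers back-to-front. The sketch below is illustrative only; the segmentation and inpainting functions and the `Layer` structure are hypothetical stubs standing in for the actual models (e.g., an open-vocabulary segmenter and a diffusion inpainter), not the Scene4U implementation.

```python
from dataclasses import dataclass

# Hypothetical sketch of a hierarchical layered reconstruction pipeline:
# segment the input view into depth-ordered layers, inpaint what each
# layer occludes, then return the completed layers. All components here
# are placeholder stubs, not the actual Scene4U models.

@dataclass
class Layer:
    name: str          # open-vocabulary label for the layer
    depth_order: int   # 0 = nearest to the camera
    completed: bool = False

def segment_layers(image_path: str) -> list[Layer]:
    # Stand-in for an open-vocabulary segmentation model.
    return [Layer("foreground", 0), Layer("midground", 1), Layer("background", 2)]

def inpaint_occluded(layer: Layer) -> Layer:
    # Stand-in for a diffusion model restoring occluded regions.
    layer.completed = True
    return layer

def reconstruct(image_path: str) -> list[Layer]:
    # Process layers back-to-front so each inpainting step can
    # condition on the already-completed layers behind it.
    layers = sorted(segment_layers(image_path), key=lambda l: -l.depth_order)
    return [inpaint_occluded(l) for l in layers]

scene = reconstruct("panorama.png")
print([l.name for l in scene])          # back-to-front order
print(all(l.completed for l in scene))  # every layer restored
```

The back-to-front ordering reflects the intuition behind layered approaches: occluded background content must be restored before nearer layers can be composited over it without visual discontinuities.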