Advances in 3D Generation and Scene Understanding

The field of 3D generation and scene understanding is evolving rapidly, with a focus on more efficient methods for generating high-quality 3D models and understanding complex scenes. Recent research applies large language models, generative adversarial networks, and diffusion models to improve the accuracy and fidelity of 3D generation. There is also growing interest in methods that understand and generate 3D scenes in a more human-like way, for example by incorporating knowledge of object relationships and scene semantics. Noteworthy papers in this area include DreamLLM-3D, which presents an approach to affective dream reliving using large language models and 3D generative AI; HSM, which introduces a hierarchical framework for indoor scene generation with dense object arrangements; RelTriple, which improves furniture distribution by learning spacing relationships between objects and regions; and SparseFlex, which enables differentiable mesh reconstruction at high resolutions directly from rendering losses.
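To give a flavor of the "shape modeling from rendering losses" idea mentioned above, here is a minimal toy sketch (not SparseFlex's actual method, which operates on high-resolution sparse voxel fields): a single shape parameter is recovered by gradient descent on an image-space loss between a soft differentiable render and a target render. All function names and parameters here are illustrative assumptions.

```python
import numpy as np

def render(r, size=32, sharpness=8.0):
    """Soft silhouette of a circle of radius r on a [-1, 1]^2 pixel grid.

    The sigmoid falloff makes the silhouette differentiable in r,
    which is what lets a rendering loss drive shape optimization.
    """
    xs = np.linspace(-1.0, 1.0, size)
    xx, yy = np.meshgrid(xs, xs)
    dist = np.sqrt(xx**2 + yy**2)
    return 1.0 / (1.0 + np.exp(sharpness * (dist - r)))

def fit_radius(target_r, r0=0.2, lr=0.5, steps=200, eps=1e-4):
    """Recover the radius by descending the L2 rendering loss."""
    target = render(target_r)
    r = r0
    for _ in range(steps):
        # Finite-difference gradient of the loss w.r.t. the shape
        # parameter (a real system would use automatic differentiation).
        loss_plus = np.mean((render(r + eps) - target) ** 2)
        loss_minus = np.mean((render(r - eps) - target) ** 2)
        grad = (loss_plus - loss_minus) / (2 * eps)
        r -= lr * grad
    return r

fitted = fit_radius(target_r=0.6)
print(f"recovered radius: {fitted:.3f}")
```

The same loop structure scales up conceptually: replace the scalar radius with mesh or field parameters, the soft circle with a differentiable rasterizer, and finite differences with autograd.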

Sources

DreamLLM-3D: Affective Dream Reliving using Large Language Model and 3D Generative AI

iFlame: Interleaving Full and Linear Attention for Efficient Mesh Generation

HSM: Hierarchical Scene Motifs for Multi-Scale Indoor Scene Generation

PVChat: Personalized Video Chat with One-Shot Learning

Geometric Constrained Non-Line-of-Sight Imaging

Decorum: A Language-Based Approach For Style-Conditioned Synthesis of Indoor 3D Scenes

Global-Local Tree Search for Language Guided 3D Scene Generation

Training-Free Personalization via Retrieval and Reasoning on Fingerprints

MC-LLaVA: Multi-Concept Personalized Vision-Language Model

Learning 3D Object Spatial Relationships from Pre-trained 2D Diffusion Models

RelTriple: Learning Plausible Indoor Layouts by Integrating Relationship Triples into the Diffusion Process

MAR-3D: Progressive Masked Auto-regressor for High-Resolution 3D Generation

SparseFlex: High-Resolution and Arbitrary-Topology 3D Shape Modeling

CTRL-O: Language-Controllable Object-Centric Visual Representation Learning

Shape Generation via Weight Space Learning

Hi3DGen: High-fidelity 3D Geometry Generation from Images via Normal Bridging

Mono2Stereo: A Benchmark and Empirical Study for Stereo Conversion
