Advancements in 3D Reconstruction and Motion Capture Using Human Semantics

Recent work on 3D reconstruction and motion capture from multi-view images and videos is shifting toward human semantics and motion as cues that simplify camera calibration and scene reconstruction and make both more accurate. A notable trend is the use of human pose and shape estimates to initialize and refine camera parameters and dynamic scene representations, which reduces reliance on traditional calibration tools and background features. This approach addresses the challenges posed by inter-person interactions and occlusions, and it improves the efficiency and accuracy of reconstructing dynamic scenes from unsynchronized, uncalibrated videos. Separately, advances in neural implicit surface reconstruction of indoor scenes from sparse views show that novel priors and matching strategies can overcome scale ambiguity and improve reconstruction quality when input data is limited.
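To make the idea of using people as a calibration cue concrete, here is a minimal sketch, not the method of any paper listed here. It assumes that each camera already provides 3D joint positions for the same person in its own coordinate frame, for example from a monocular pose estimator, and that the joints correspond one to one. Under those assumptions, a rigid Kabsch alignment of the two joint sets gives an initial relative camera pose. The function name `relative_pose_from_joints` and the array layout are hypothetical.

```python
import numpy as np


def relative_pose_from_joints(joints_a, joints_b):
    """Estimate rotation R and translation t with joints_b ~= joints_a @ R.T + t.

    joints_a, joints_b: (N, 3) arrays of corresponding 3D joints, expressed in
    the coordinate frames of camera A and camera B (illustrative assumption).
    """
    joints_a = np.asarray(joints_a, dtype=float)
    joints_b = np.asarray(joints_b, dtype=float)

    # Center both point sets on their centroids.
    mu_a = joints_a.mean(axis=0)
    mu_b = joints_b.mean(axis=0)

    # 3x3 cross-covariance between the centered joint sets.
    H = (joints_a - mu_a).T @ (joints_b - mu_b)

    # Kabsch: SVD of the cross-covariance, with a reflection guard.
    U, _, Vt = np.linalg.svd(H)
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # Translation that maps the centroid of A onto the centroid of B.
    t = mu_b - R @ mu_a
    return R, t
```

In practice such an estimate would only serve as an initialization: monocular joint estimates are noisy and carry scale ambiguity, so a refinement step (for example, minimizing reprojection error) would normally follow.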

Noteworthy Papers

  • Simultaneously Recovering Multi-Person Meshes and Multi-View Cameras with Human Semantics: Introduces a method for dynamic multi-person motion capture with uncalibrated cameras, utilizing human semantics to estimate camera parameters and human meshes in a one-step reconstruction.
  • Humans as a Calibration Pattern: Dynamic 3D Scene Reconstruction from Unsynchronized and Uncalibrated Videos: Proposes a novel approach to generate dynamic neural fields from unsynchronized videos with unknown poses by leveraging human motion, achieving accurate spatiotemporal calibration and high-quality scene reconstruction.
  • Sparis: Neural Implicit Surface Reconstruction of Indoor Scenes from Sparse Views: Presents a new method for indoor surface reconstruction from sparse views, employing a novel prior based on inter-image matching information to enhance reconstruction accuracy.
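The temporal half of "spatiotemporal calibration" can be illustrated with a simple sketch. It is an assumption-laden stand-in, not the procedure from the papers above. It assumes each video has been reduced to a 1D motion signal per frame, such as the speed of a tracked joint, and that both videos share the same frame rate. The frame offset is then taken as the lag that maximizes the normalized correlation. The function name `estimate_frame_offset` and the parameter `max_lag` are hypothetical.

```python
import numpy as np


def estimate_frame_offset(signal_a, signal_b, max_lag):
    """Return the integer lag k that best aligns signal_a[i] with signal_b[i + k].

    signal_a, signal_b: 1D per-frame motion signals from two videos
    (illustrative assumption: same frame rate, same underlying motion).
    """
    a = np.asarray(signal_a, dtype=float)
    b = np.asarray(signal_b, dtype=float)

    # Standardize so the score does not depend on signal scale or bias.
    a = (a - a.mean()) / (a.std() + 1e-8)
    b = (b - b.mean()) / (b.std() + 1e-8)

    best_lag, best_score = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        # Pair a[i] with b[i + lag] over the overlapping frames only.
        if lag >= 0:
            x, y = a, b[lag:]
        else:
            x, y = a[-lag:], b
        n = min(len(x), len(y))
        if n < 2:
            continue
        score = float(np.dot(x[:n], y[:n]) / n)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

An integer-frame search like this only gives a coarse offset; sub-frame offsets or differing frame rates would need interpolation or a continuous-time model.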

