2026 Featured

Video Real2Sim (VR2S)

6.8300 final project: uses a video model to hallucinate orbital observations from one populated-scene photo, then reconstructs the scene with a pose-free 3D stack.

Project Page

Video Real2Sim explores a single-image route to scene reconstruction for populated real-world scenes. Given one photo, the pipeline uses a general video model to synthesize missing orbital observations, then feeds those views into a pose-free reconstruction stack.
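As a rough illustration of the data flow described above, here is a minimal Python sketch of the two-stage pipeline. It is not the project's code: `generate_orbit`, `reconstruct`, `select_views`, and `run_pipeline` are hypothetical names, the two model stages are passed in as plain callables, and the even frame-subsampling step is an assumption added for illustration rather than something the project description specifies.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")    # stand-in for whatever image type is used
Scene = TypeVar("Scene")    # stand-in for the reconstruction output


def select_views(frames: Sequence[Frame], count: int) -> List[Frame]:
    """Pick `count` evenly spaced frames, always keeping the first one.

    Assumption: subsampling is an illustrative step, not a documented
    part of the project.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    n = len(frames)
    if count >= n:
        return list(frames)
    if count == 1:
        return [frames[0]]
    # Integer arithmetic keeps indices distinct and ends on the last frame.
    return [frames[i * (n - 1) // (count - 1)] for i in range(count)]


def run_pipeline(
    photo: Frame,
    generate_orbit: Callable[[Frame], Sequence[Frame]],  # hypothetical video-model stage
    reconstruct: Callable[[List[Frame]], Scene],         # hypothetical pose-free stage
    num_views: int = 8,
) -> Scene:
    """One photo -> synthesized orbital frames -> pose-free reconstruction."""
    # The real photo stays first so the observed view is always included.
    frames = [photo] + list(generate_orbit(photo))
    views = select_views(frames, num_views)
    # No camera poses are passed along: the reconstruction stage is pose-free.
    return reconstruct(views)


if __name__ == "__main__":
    # Stub stages so the sketch runs without any model.
    fake_orbit = lambda p: [f"{p}_orbit{i}" for i in range(9)]
    fake_recon = lambda views: {"num_views": len(views), "views": views}
    print(run_pipeline("photo", fake_orbit, fake_recon, num_views=4))
```

Passing the stages in as callables keeps the sketch independent of any particular video model or reconstruction library, neither of which is named here.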

The goal is not to claim that video replaces geometry. The project asks a narrower question: when the observed image does not expose the back side, contact regions, or other useful scene evidence, can generated video provide enough additional observations to improve reconstruction? In the final report, the strongest gains appear in perceptual and back-side reconstruction quality; the report also states its metric-level caveats and small exceptions explicitly.

This was my final project for MIT 6.8300 Advances in Computer Vision.

Technologies & Topics

Computer Vision, 3D Reconstruction, Video Models, Real2Sim