UFO-4D: Unposed Feedforward 4D Reconstruction from Two Images

Google     Princeton University     Harvard University
ICLR 2026

Feedforward 4D reconstruction from just two unposed images

How it works

UFO-4D predicts dynamic 3D Gaussians and camera poses from a pair of unposed images in a single feedforward pass. This unified representation enables rendering of images, 3D geometry, and 3D motion at any intermediate view or timestamp using a single model. Through semi-supervised training with the differentiable 4D rasterizer, UFO-4D unlocks three major advantages:

  • A dense, self-supervised photometric loss to mitigate the scarcity of training data with 4D annotation.
  • Tight coupling of appearance, geometry, and motion with 3D Gaussians, for mutual regularization.
  • High-fidelity appearance, 3D geometry, and 3D motion rendering at any novel view or point in time.
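To make the first advantage concrete, the sketch below shows the shape of a dense photometric self-supervision term: a rendered view is compared pixel-wise against an input frame, so the signal requires only the image pair itself and no 4D annotation. The function name, the plain L1 form, and the toy inputs are our assumptions for illustration; in the actual pipeline the rendered image would come from the differentiable 4D rasterizer, and the paper's exact loss may differ.

```python
import numpy as np

def photometric_loss(rendered: np.ndarray, target: np.ndarray) -> float:
    """Dense L1 photometric loss between a rendered view and an input frame.

    Illustrative sketch: in training, `rendered` would be produced by the
    differentiable 4D rasterizer at the pose/timestamp of an input view,
    letting gradients flow back to the Gaussians and poses without any
    4D ground truth.
    """
    assert rendered.shape == target.shape
    diff = rendered.astype(np.float64) - target.astype(np.float64)
    return float(np.abs(diff).mean())

# Toy example: a constant-gray "rendering" vs. a black target image.
rendered = np.full((4, 4, 3), 0.25)
target = np.zeros((4, 4, 3))
loss = photometric_loss(rendered, target)  # 0.25
```

Because the loss is dense (one residual per pixel), it supplies far more supervision per training pair than sparse annotated correspondences would.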

4D Interpolation

Given the predicted dynamic 3D Gaussians, we can rasterize images, depth, and motion at interpolated timestamps and views. Both depth and motion are defined in the canonical camera coordinate frame. In the motion visualizations, only moving objects are highlighted in non-white colors.
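As a minimal sketch of what interpolating to an intermediate timestamp `alpha` could look like, the code below linearly interpolates Gaussian centers between the two input frames and interpolates the camera pose with a rotation slerp (via the axis-angle of the relative rotation) plus a translation lerp. The linear-motion assumption and all names here are ours for illustration, not the paper's exact formulation; rendering at the interpolated state would then proceed through the 4D rasterizer.

```python
import numpy as np

def interpolate_centers(mu0: np.ndarray, mu1: np.ndarray, alpha: float) -> np.ndarray:
    """Linearly interpolate (N, 3) Gaussian centers between the two input
    timestamps, alpha in [0, 1]. A linear-motion assumption for illustration."""
    return (1.0 - alpha) * mu0 + alpha * mu1

def interpolate_pose(R0, t0, R1, t1, alpha: float):
    """Interpolate a camera pose: slerp on rotation, lerp on translation."""
    R_rel = R1 @ R0.T  # relative rotation from pose 0 to pose 1
    cos_theta = np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    if theta < 1e-8:
        R_frac = np.eye(3)  # poses (nearly) share a rotation
    else:
        # Unit rotation axis from the skew-symmetric part of R_rel.
        w = (0.5 / np.sin(theta)) * np.array([
            R_rel[2, 1] - R_rel[1, 2],
            R_rel[0, 2] - R_rel[2, 0],
            R_rel[1, 0] - R_rel[0, 1],
        ])
        K = np.array([[0.0, -w[2], w[1]],
                      [w[2], 0.0, -w[0]],
                      [-w[1], w[0], 0.0]])
        a = alpha * theta
        # Rodrigues' formula for a rotation of angle a about axis w.
        R_frac = np.eye(3) + np.sin(a) * K + (1.0 - np.cos(a)) * (K @ K)
    return R_frac @ R0, (1.0 - alpha) * t0 + alpha * t1

# Toy example: halfway between identity and a 90-degree rotation about z.
R0, t0 = np.eye(3), np.zeros(3)
R1 = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
t1 = np.array([2.0, 0.0, 0.0])
R_mid, t_mid = interpolate_pose(R0, t0, R1, t1, 0.5)  # 45-degree rotation, t = [1, 0, 0]
```

A slerp on rotation (rather than lerping matrix entries) keeps every intermediate pose a valid rotation, which matters when rasterizing from the interpolated view.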

Small/medium motion

UFO-4D successfully interpolates and renders high-quality images, depth, and motion from predicted dynamic 3D Gaussians.


Input pairs
4D Interpolated image
Interpolated depth
Interpolated motion

Large motion

Even in scenarios with large motion, UFO-4D robustly produces all estimates from two unposed images.


Input pairs
4D Interpolated image
Interpolated depth
Interpolated motion

Extreme motion (stress test)

Under extreme motion or minimal overlap, UFO-4D still accurately estimates camera motion and static geometry, though recovering dynamic object motion becomes challenging.

Input pairs
4D Interpolated image
Interpolated depth
Interpolated motion

Qualitative comparison

UFO-4D provides superior depth and motion accuracy under significant camera rotation and large object motion. It effectively disentangles dynamic motion from ego-motion while preserving clear motion boundaries and geometric consistency.

Stereo4D

Input Pair
DynaDUSt3R
ZeroMSF
St4RTrack
UFO-4D (Ours)
Ground truth

Bonn

Input Pair
DynaDUSt3R
ZeroMSF
St4RTrack
UFO-4D (Ours)
Ground truth

KITTI

Input Pair
DynaDUSt3R
ZeroMSF
St4RTrack
UFO-4D (Ours)
Ground truth

BibTeX

@inproceedings{Hur:2026:UFO,
  title = {{UFO-4D}: Unposed Feedforward 4{D} Reconstruction from Two Images},
  author = {Hur, Junhwa and Herrmann, Charles and Peng, Songyou and Henzler, Philipp and Ma, Zeyu and Zickler, Todd and Sun, Deqing},
  booktitle = {ICLR},
  year = {2026}
}