UFO-4D: Unposed Feedforward 4D Reconstruction from Two Images

Junhwa Hur Charles Herrmann Songyou Peng Philipp Henzler Zeyu Ma Todd Zickler Deqing Sun

Google Princeton University Harvard University

ICLR 2026

Feedforward 4D reconstruction from just two unposed images

How it works

UFO-4D predicts dynamic 3D Gaussians and camera poses from a pair of unposed images in a single feedforward pass. This unified representation enables rendering of images, 3D geometry, and 3D motion at any intermediate view or timestamp with a single model. Semi-supervised training through the differentiable 4D rasterizer gives UFO-4D three major advantages:

A dense, self-supervised photometric loss to mitigate the scarcity of training data with 4D annotation.
Tight coupling of appearance, geometry, and motion with 3D Gaussians, for mutual regularization.
High-fidelity appearance, 3D geometry, and 3D motion rendering at any novel view or point in time.
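
As a rough illustration of the first point, a dense photometric loss can compare an image rasterized from the predicted Gaussians against the input image at every pixel, so no 4D annotation is needed for that term. The sketch below assumes a plain L1 penalty and an optional validity mask; the page does not state the exact form of the loss, and the function name `photometric_l1` is hypothetical.

```python
import numpy as np

def photometric_l1(rendered, target, valid=None):
    """Mean absolute color difference between a rendered and a target image.

    Assumption: a plain L1 penalty; the actual loss used by UFO-4D is not
    specified on this page.
    rendered, target: (H, W, 3) arrays; valid: optional (H, W) boolean mask.
    """
    rendered = np.asarray(rendered, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    # Per-pixel error, averaged over the color channels -> shape (H, W).
    err = np.abs(rendered - target).mean(axis=-1)
    if valid is None:
        return float(err.mean())
    valid = np.asarray(valid, dtype=bool)
    if not valid.any():
        # No valid pixels: return zero instead of averaging an empty set.
        return 0.0
    return float(err[valid].mean())
```

Because every pixel contributes a residual, the supervision is dense even when ground-truth depth or motion labels are sparse or missing.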

4D Interpolation

Given the predicted dynamic 3D Gaussians, we can rasterize images, depth, and motion at interpolated timestamps and views. Both depth and motion are defined in the canonical camera coordinate frame. In the motion visualization, only moving objects appear in non-white colors.
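
To make the idea of an intermediate timestamp and view concrete, here is a minimal sketch. It assumes that each Gaussian carries a center and a 3D motion vector in the canonical camera frame, that motion is linear in time, and that the intermediate camera is obtained by scaling the relative pose; none of these details are confirmed by the page, and `interpolate_centers` and `interpolate_pose` are hypothetical helpers.

```python
import numpy as np

def interpolate_centers(centers, motion, t):
    """Move Gaussian centers along their motion vectors (assumed linear in t)."""
    centers = np.asarray(centers, dtype=np.float64)
    motion = np.asarray(motion, dtype=np.float64)
    return centers + t * motion

def _skew(v):
    """Cross-product matrix of a 3-vector."""
    x, y, z = v
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])

def interpolate_pose(R, trans, t):
    """Scale a relative pose (R, trans) by t in [0, 1].

    The rotation is taken about the same axis by a fraction t of the angle
    (Rodrigues' formula); the translation is scaled linearly. This is a
    simple choice, not necessarily the one used by UFO-4D, and the axis
    extraction is unstable for rotations close to 180 degrees.
    """
    R = np.asarray(R, dtype=np.float64)
    trans = np.asarray(trans, dtype=np.float64)
    angle = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    if angle < 1e-8:
        R_t = np.eye(3)
    else:
        axis = np.array([R[2, 1] - R[1, 2],
                         R[0, 2] - R[2, 0],
                         R[1, 0] - R[0, 1]]) / (2.0 * np.sin(angle))
        K = _skew(axis)
        a = t * angle
        R_t = np.eye(3) + np.sin(a) * K + (1.0 - np.cos(a)) * (K @ K)
    return R_t, t * trans
```

Under these assumptions, rendering at timestamp t amounts to rasterizing the moved centers from the interpolated camera; t = 0 reproduces the first input and t = 1 the second.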

UFO-4D successfully interpolates and renders high-quality images, depth, and motion from predicted dynamic 3D Gaussians.

[Figure panels: input pairs, 4D interpolated image, interpolated depth, interpolated motion]

Even in scenarios with large motion, UFO-4D robustly produces all estimates from two unposed images.

[Figure panels: input pairs, 4D interpolated image, interpolated depth, interpolated motion]

Under extreme motion or minimal overlap, UFO-4D still estimates camera motion and static geometry accurately, though estimating the motion of dynamic objects becomes challenging.

[Figure panels: input pairs, 4D interpolated image, interpolated depth, interpolated motion]

Qualitative comparison

Compared with DynaDUSt3R, ZeroMSF, and St4RTrack, UFO-4D produces more accurate depth and motion under significant camera rotation and large object motion. It disentangles object motion from ego-motion while preserving sharp motion boundaries and geometric consistency.
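
The distinction between object motion and ego-motion can be illustrated with a small worked example. It assumes a relative pose (R, trans) that maps coordinates in the canonical (first) camera frame to the second camera frame, X2 = R X + trans; this convention and the helper `split_apparent_motion` are illustrative assumptions, not details taken from the page.

```python
import numpy as np

def split_apparent_motion(point, motion, R, trans):
    """Split the displacement seen between two camera frames into two parts.

    point:  3D position in the canonical camera frame at the first timestamp.
    motion: the object's own 3D motion in the canonical frame (zero if static).
    R, trans: assumed convention X2 = R @ X + trans (canonical -> second camera).
    Returns (ego, obj, total), where total = ego + obj.
    """
    point = np.asarray(point, dtype=np.float64)
    motion = np.asarray(motion, dtype=np.float64)
    R = np.asarray(R, dtype=np.float64)
    trans = np.asarray(trans, dtype=np.float64)
    # Displacement a static point would show purely because the camera moved.
    ego = R @ point + trans - point
    # Extra displacement caused by the object's own motion.
    obj = R @ motion
    return ego, obj, ego + obj

ego, obj, total = split_apparent_motion(
    point=[0.0, 0.0, 5.0], motion=[0.0, 1.0, 0.0],
    R=np.eye(3), trans=[1.0, 0.0, 0.0])
```

With motion expressed in the canonical frame, a static point has zero motion no matter how the camera moves, which is consistent with only moving objects appearing colored in the motion visualizations.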

[Figure: qualitative comparisons on Stereo4D, Bonn, and KITTI. Each example shows the input pair, DynaDUSt3R, ZeroMSF, St4RTrack, UFO-4D (Ours), and the ground truth.]

BibTeX

@inproceedings{Hur:2026:UFO,
  title = {{UFO-4D}: Unposed Feedforward 4{D} Reconstruction from Two Images},
  author = {Hur, Junhwa and Herrmann, Charles and Peng, Songyou and Henzler, Philipp and Ma, Zeyu and Zickler, Todd and Sun, Deqing},
  booktitle = {ICLR},
  year = {2026}
}
