








Note: Results are downsampled 4× for efficient online rendering. For a fair comparison, we do not mask out any points.
Owing to our group-wise inference scheme and the geometric priors of the pretrained video diffusion model, our model produces consistent 4D geometry under fast motion (row 1) and deceptive reflections in the water (row 2).