Four iPhones capture the subject simultaneously from different angles. The footage passes through a pose estimation stage, is masked per camera to isolate the subject cleanly, and is then fed into a multi-view synthesis model that generates the additional viewpoints needed for full 3D reconstruction. The final output is rendered as a 4D Gaussian Splat: a real-time, view-consistent volumetric representation of a moving human.
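
As a rough illustration of the per-camera masking step and of how simultaneous footage might be lined up before synthesis, here is a minimal Python sketch. It is not the project's actual code. The names `CameraClip`, `apply_mask`, `isolate_subject`, and `group_by_timestep` are hypothetical, frames are toy grayscale grids rather than real video, and the masks are assumed to come from some upstream segmentation step. Pose estimation, the multi-view synthesis model, and the Gaussian Splat rendering are deliberately left out.

```python
from dataclasses import dataclass
from typing import Dict, List

# Toy stand-ins (assumptions): a frame is a grid of grayscale intensities,
# a mask is a same-sized grid where 1 marks the subject and 0 the background.
Frame = List[List[int]]
Mask = List[List[int]]


@dataclass
class CameraClip:
    """Footage from one camera plus one subject mask per frame (hypothetical type)."""
    camera_id: str
    frames: List[Frame]
    masks: List[Mask]


def apply_mask(frame: Frame, mask: Mask) -> Frame:
    """Zero out background pixels so only the subject remains."""
    return [
        [pixel if keep else 0 for pixel, keep in zip(frame_row, mask_row)]
        for frame_row, mask_row in zip(frame, mask)
    ]


def isolate_subject(clip: CameraClip) -> List[Frame]:
    """Mask every frame of a single camera's clip."""
    if len(clip.frames) != len(clip.masks):
        raise ValueError(f"{clip.camera_id}: expected one mask per frame")
    return [apply_mask(f, m) for f, m in zip(clip.frames, clip.masks)]


def group_by_timestep(clips: List[CameraClip]) -> List[Dict[str, Frame]]:
    """Collect the masked frames of all cameras at each frame index.

    Assumes the cameras are already synchronized, so frame i from every
    camera shows the same instant. Extra trailing frames are dropped.
    """
    if not clips:
        return []
    masked = {clip.camera_id: isolate_subject(clip) for clip in clips}
    length = min(len(frames) for frames in masked.values())
    return [
        {camera_id: frames[t] for camera_id, frames in masked.items()}
        for t in range(length)
    ]


if __name__ == "__main__":
    clips = [
        CameraClip("cam0", frames=[[[5, 6], [7, 8]]], masks=[[[1, 0], [0, 1]]]),
        CameraClip("cam1", frames=[[[1, 2], [3, 4]]], masks=[[[0, 1], [1, 1]]]),
    ]
    for t, views in enumerate(group_by_timestep(clips)):
        print(t, views)
```

Each entry in the returned list holds one masked view per camera for a single instant, which is the kind of per-timestep bundle a multi-view synthesis model would plausibly consume. A real pipeline would operate on image arrays and would also need to handle clock drift between phones.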