egocentric video generation

EgoControl: Why First-Person Video Generation Needs the Whole Body, Not Just the Camera

EgoControl reframes egocentric video generation as embodied simulation. By conditioning diffusion models on future 3D full-body poses, it enables controllable, physically grounded first-person video prediction aligned with intended human motion.