Our model generates a diverse distribution of videos of the original subject, with substantial motion and realism. The right-most part shows the pixel diversity computed over 80 generated videos (red means higher diversity in pixel color). The person's head and body move significantly while the background stays fixed, and despite this diversity all videos look realistic.
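
The caption does not say how the diversity map is computed. One plausible choice, shown here only as a minimal sketch, is the per-pixel standard deviation of color across the generated videos, averaged over frames and channels. The function name `pixel_diversity`, the array layout `(N, T, H, W, C)`, and the synthetic data are assumptions made for illustration, not the authors' procedure.

```python
import numpy as np


def pixel_diversity(videos: np.ndarray) -> np.ndarray:
    """Return an (H, W) diversity map for a stack of generated videos.

    Assumed layout (hypothetical): videos has shape (N, T, H, W, C),
    i.e. N videos, T frames each, H x W pixels, C color channels.
    """
    if videos.ndim != 5:
        raise ValueError("expected an array of shape (N, T, H, W, C)")
    # Standard deviation across the N videos, for every frame, pixel, channel.
    std_across_videos = videos.std(axis=0)  # shape (T, H, W, C)
    # Average over frames and channels to get one value per pixel.
    return std_across_videos.mean(axis=(0, 3))  # shape (H, W)


# Synthetic stand-in for 80 generated videos: a fixed gray background
# with a central region whose colors differ from video to video.
rng = np.random.default_rng(0)
videos = np.full((80, 4, 8, 8, 3), 0.5)
videos[:, :, 2:6, 2:6, :] = rng.random((80, 4, 4, 4, 3))

diversity = pixel_diversity(videos)
print(diversity.shape)          # (8, 8)
print(diversity[0, 0] == 0.0)   # background pixel: no variation across videos
print(diversity[3, 3] > 0.0)    # central pixel: varies across videos
```

Under this assumed definition, a fixed background maps to values near zero, while regions where the subject moves differently in each video get high values, which a red colormap would highlight.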