Text-to-video models turn a text prompt into moving footage, handling motion, lighting, and, increasingly, sound. They make it possible to produce b-roll, concept visuals, and ads without filming, but consistency across shots and physical realism remain the hard problems.