Reading: Video Generation: From Frames to World Models


Welcome to Video Generation: From Frames to World Models.

A comprehensive, self-contained guide to how machines learn to generate video — moving, temporally coherent imagery — from the first video GANs to the latent video-diffusion transformers behind today's text-to-video systems. This is the third volume in a trilogy; it blends intuition, mathematics, and runnable code, and builds directly on its companions on machine learning and image generation.

This title is part of the ShriIra library and is free to read in full, right here — our small contribution to making world-class knowledge easy to reach.

A note on reading it: open the Contents menu at the top of the reader to jump between chapters, and use the Aa menu to set a comfortable text size, a theme (light, sepia, or night), and a single- or two-page layout. Your place is saved automatically, so you can always pick up where you left off.

We hope it serves you well.

— Shriira Press

Preface

Contents