Researchers define what counts as a world model, and text-to-video generators do not qualify
New framework clarifies AI world models, excluding text-to-video generators.

An international research team has established a new framework that defines AI world models in terms of three capabilities: perception, interaction, and memory. The definition notably excludes text-to-video models like Sora, which lack real-world feedback loops. The researchers also introduced OpenWorldLib, an open-source project designed to support the development and evaluation of world models through a modular approach.
Key Takeaways
1. The newly defined world model framework includes perception, interaction, and memory.
2. OpenWorldLib, an open-source project, integrates five modules for world model development.
3. Text-to-video models like Sora are explicitly excluded from this definition.