AI Insight
This paper introduces a systematic framework for understanding "world models" in AI, organizing them along two axes: three capability levels (from basic one-step prediction to self-revising models) and four governing-law regimes (physical, digital, social, and scientific domains). The authors synthesize over 400 research works and analyze more than 100 representative AI systems, providing a unified taxonomy that connects previously fragmented research communities working on robotics, language models, multi-agent systems, and scientific discovery tools.
Why it matters
As AI systems shift from passive text generation to goal-directed agents that interact with real environments, modeling environment dynamics becomes a central bottleneck. The taxonomy helps researchers identify the failure modes most likely in each domain, and the paper proposes decision-centric evaluation principles and a minimal reproducible evaluation package for world models that must predict and manipulate physical objects, navigate software, coordinate socially, or design scientific experiments.
arXiv:2604.22748v3 (announce type: replace)
Abstract: As AI systems move from generating text to accomplishing goals through sustained interaction, the ability to model environment dynamics becomes a central bottleneck. Agents that manipulate objects, navigate software, coordinate with others, or design experiments require predictive environment models, yet the term “world model” carries different meanings across research communities. We introduce a “levels × laws” taxonomy organized along two axes. The first defines three capability levels: L1 Predictor, which learns one-step local transition operators; L2 Simulator, which composes them into multi-step, action-conditioned rollouts that respect domain laws; and L3 Evolver, which autonomously revises its own model when predictions fail against new evidence. The second identifies four governing-law regimes: physical, digital, social, and scientific. These regimes determine what constraints a world model must satisfy and where it is most likely to fail. Using this framework, we synthesize over 400 works and summarize more than 100 representative systems spanning model-based reinforcement learning, video generation, web and GUI agents, multi-agent social simulation, and AI-driven scientific discovery. We analyze methods, failure modes, and evaluation practices across level-regime pairs, propose decision-centric evaluation principles and a minimal reproducible evaluation package, and outline architectural guidance, open problems, and governance challenges. The resulting roadmap connects previously isolated communities and charts a path from passive next-step prediction toward world models that can simulate, and ultimately reshape, the environments in which agents operate. Code and resources are available at: https://github.com/matrix-agent/awesome-agentic-world-modeling.
Source: Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond
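The three capability levels named in the abstract can be illustrated with a toy sketch. This is not the paper's code or method: the class names `Predictor`, `Simulator`, and `Evolver` mirror the level names, but the one-dimensional linear dynamics, the `gain` parameter, the clamping "law", and the `tolerance` threshold are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Predictor:
    """L1: a one-step local transition operator (toy linear model)."""
    gain: float = 1.0  # hypothetical learned parameter

    def predict(self, state: float, action: float) -> float:
        return state + self.gain * action


@dataclass
class Simulator:
    """L2: composes one-step predictions into an action-conditioned rollout."""
    predictor: Predictor
    lower: float = 0.0   # toy "domain law": the state stays within bounds
    upper: float = 10.0

    def rollout(self, state: float, actions: Sequence[float]) -> List[float]:
        trajectory = []
        for action in actions:
            state = self.predictor.predict(state, action)
            state = min(max(state, self.lower), self.upper)  # enforce the law
            trajectory.append(state)
        return trajectory


@dataclass
class Evolver:
    """L3: revises the model when a prediction fails against new evidence."""
    simulator: Simulator
    tolerance: float = 0.1  # hypothetical error threshold

    def observe(self, state: float, action: float, observed_next: float) -> bool:
        predicted = self.simulator.predictor.predict(state, action)
        if abs(predicted - observed_next) <= self.tolerance or action == 0:
            return False  # prediction held (or nothing to fit); keep the model
        # Re-fit the single parameter so the model explains the new evidence.
        self.simulator.predictor.gain = (observed_next - state) / action
        return True


model = Evolver(Simulator(Predictor(gain=1.0)))
before = model.simulator.rollout(0.0, [1.0, 1.0, 1.0])
revised = model.observe(state=0.0, action=1.0, observed_next=2.0)
after = model.simulator.rollout(0.0, [1.0, 1.0, 1.0])
print(before, revised, after)
```

In this sketch the rollout changes from `[1.0, 2.0, 3.0]` to `[2.0, 4.0, 6.0]` once the observed transition contradicts the prediction and the model revises itself. A real L3 system would revise far richer structure than a single scalar; the point here is only the layering of predict, roll out, and revise.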