Fei-Fei Li: AI Still Fails to Understand the Physical World

Professor Fei-Fei Li discusses spatial intelligence and world models for future AI development

Stanford computer science professor Fei-Fei Li believes artificial intelligence still lacks one crucial ability — to truly understand the physical world. The solution, she argues, lies in developing “spatial intelligence,” or what researchers now call world models.

The Missing Link in AI Evolution

Despite huge progress in large language models, today’s AI systems still struggle to reason about physical reality. “These systems must generate spatially coherent worlds that follow physical laws, process multimodal inputs — from images to actions — and predict how those worlds evolve,” Li said during a recent lecture.
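
To make those three requirements concrete, here is a minimal interface sketch in Python. It is an illustration only: the names Observation, WorldModel, observe, and predict are assumptions made for this article, not an API from World Labs or any of the labs discussed.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Observation:
    """One multimodal snapshot of the world (field names are illustrative)."""
    image: bytes   # an encoded camera frame
    action: str    # the agent's most recent action, e.g. "move_forward"


class WorldModel(Protocol):
    """Hypothetical interface mirroring Li's three requirements."""

    def observe(self, obs: Observation) -> None:
        """Ingest multimodal input (images, actions, ...)."""
        ...

    def predict(self, steps: int) -> list[Observation]:
        """Roll the internal state forward `steps` ticks, returning a
        spatially coherent trajectory that should obey physical laws."""
        ...
```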

The concept of “spatial intelligence” goes far beyond pattern recognition. It requires training models not just on language or visual data, but also on the underlying physics of real environments — mass, motion, causality, and continuity. Only by learning how the world actually works, she says, can AI develop human-like reasoning and perception.


World Labs and the Birth of Marble

In September, Li’s company, World Labs, released a beta version of Marble, an early world-model prototype capable of generating continuous, explorable 3D environments from text or visual prompts. Unlike traditional scene generators, Marble maintains consistent worlds with no loading screens or environmental resets, letting users navigate freely with no time limit.

The project demonstrated how next-generation AI could bridge simulation and cognition, turning static image generators into dynamic, physics-aware systems. “This is a glimpse of AI that doesn’t just describe the world — it lives in it,” Li explained.


Why World Models Are the Future

The idea of “world models” is rapidly gaining traction across research labs, from OpenAI and DeepMind to Stanford and World Labs. These systems aim to teach AI agents to understand cause and effect, anticipate physical interactions, and simulate outcomes before they happen — much like a human brain does when interacting with reality.
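
As a toy sketch of what “simulate outcomes before they happen” means in practice: the planner below rolls a stand-in dynamics model forward for each candidate action and commits only to the action with the best predicted result. Everything here (dynamics_model, reward, plan, the one-dimensional physics) is a simplified assumption for illustration, not code from any lab named above.

```python
import numpy as np


def dynamics_model(state: np.ndarray, action: int) -> np.ndarray:
    """Stand-in for a learned transition model: a ball under gravity
    whose vertical push depends on the chosen action."""
    pos, vel = state
    vel = vel - 9.8 * 0.1 + 0.5 * action   # crude Euler step, dt = 0.1
    pos = pos + vel * 0.1
    return np.array([pos, vel])


def reward(state: np.ndarray) -> float:
    """Stand-in objective: keep the ball near height 1.0."""
    return -abs(state[0] - 1.0)


def plan(state: np.ndarray, actions=(-1, 0, 1), horizon: int = 10) -> int:
    """Mentally simulate each candidate action (held constant over the
    horizon, for brevity) and act on the best predicted outcome."""
    def rollout(action: int) -> float:
        s = state.copy()
        total = 0.0
        for _ in range(horizon):
            s = dynamics_model(s, action)
            total += reward(s)
        return total
    return max(actions, key=rollout)


print(plan(np.array([1.0, 0.0])))   # picks the push that best counters gravity
```

A real world model would replace the hand-written dynamics_model with a learned, multimodal predictor, but the plan loop captures the core idea: cheap mental simulation before costly real action.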

Fei-Fei Li calls this the “next paradigm shift” in artificial intelligence. “Language models gave machines the ability to talk,” she said. “World models will give them the ability to think in space and time.”



Editorial Team — CoinBotLab
