World Models + Planning
Comprehensive analysis of world-models + planning architectures, which show 10-500x sample-efficiency gains over model-free RL (EfficientZero reached 194% of human performance with 100k environment steps versus DQN's 50M), while estimating only a 5-15% probability that the paradigm dominates at transformative AI, given LLM superiority on general tasks. Key systems include MuZero (superhuman on 57 Atari games without being given the rules) and DreamerV3 (the first algorithm to collect Minecraft diamonds from scratch). The paradigm offers distinctive safety advantages (inspectable beliefs, explicit goals) alongside risks from reward misgeneralization and mesa-optimization.
Quick Assessment

| Dimension | Assessment | Evidence |
|---|---|---|
| Current Dominance | Low (games/robotics only) | MuZero, DreamerV3 superhuman in games; limited general-task success |
| TAI Probability | 5-15% | Strong for structured domains; LLMs dominate general tasks |
| Sample Efficiency | 10-500x better than model-free | EfficientZero: 194% human performance with 100k steps (DQN needs 50M) |
| Interpretability | Partial | World-model predictions inspectable; learned representations opaque |
| Compute at Inference | High | AlphaZero: 80k positions/sec vs. Stockfish's 70M; relies on MCTS search |
| Scalability | Uncertain | Games proven; real-world complexity unproven at scale |
| Key Advocate | Yann LeCun (Meta) | Meta's Chief AI Scientist; advocates JEPA as a path to AGI |
World models + planning is an AI architecture paradigm fundamentally different from large language models (LLMs). Instead of learning to map inputs directly to outputs, these systems learn an explicit model of how the world works and use search/planning algorithms to find good actions.

This is the paradigm behind AlphaGo, MuZero, and the approach Yann LeCun advocates with JEPA (Joint-Embedding Predictive Architecture). The key idea: separate world understanding from decision making.

Estimated probability of being the dominant paradigm at transformative AI: 5-15%. The approach is powerful for structured domains but not yet competitive on general tasks. MuZero (Nature, 2020) achieved superhuman performance across Go, chess, shogi, and 57 Atari games without being told the rules. DreamerV3 (2023) became the first algorithm to collect diamonds in Minecraft from scratch, generalizing across 150+ diverse tasks with fixed hyperparameters.
Architecture
Key Components

| Component | Function | Learnable |
|---|---|---|
| State Encoder | Compress observations to a latent state | Yes |
| Dynamics Model | Predict how the state changes with actions | Yes |
| Reward Model | Predict rewards/values | Yes |
| Policy Network | Propose likely good actions | Yes |
| Value Network | Estimate long-term value | Yes |
| Search/Planning | Find the best action via lookahead | Algorithm (not learned) |
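To make the table concrete, here is a toy sketch of how these components fit together at decision time. Everything below is illustrative: the one-dimensional latent space, the hand-written dynamics and reward functions, and the random-shooting search (standing in for full MCTS) are assumptions, not code from MuZero or Dreamer.

```python
# Toy world-models + planning loop: encode, imagine rollouts in the learned
# model, pick the first action of the best imagined sequence.
import numpy as np

rng = np.random.default_rng(0)

def encode(obs):
    """State encoder: compress an observation into a latent state (learned in practice)."""
    return np.tanh(obs)

def dynamics(state, action):
    """Dynamics model: predict the next latent state given an action (learned)."""
    return np.tanh(0.9 * state + 0.5 * action)

def reward_model(state, action):
    """Reward model: predict the reward for taking `action` in `state` (learned)."""
    return -(state - 1.0) ** 2 - 0.01 * action ** 2

def plan(state, horizon=5, n_candidates=256):
    """Search/planning: score random action sequences inside the learned model
    and return the first action of the best one (lookahead, not learned)."""
    best_action, best_return = None, -np.inf
    for _ in range(n_candidates):
        seq = rng.uniform(-1.0, 1.0, size=horizon)
        s, total = state, 0.0
        for a in seq:                  # imagined rollout in latent space
            total += reward_model(s, a)
            s = dynamics(s, a)
        if total > best_return:
            best_return, best_action = total, seq[0]
    return best_action

obs = 0.0
for step in range(3):
    state = encode(obs)
    action = plan(state)
    obs = 0.9 * obs + action + 0.05 * rng.normal()   # toy environment step
    print(f"step {step}: action={action:+.3f}, obs={obs:+.3f}")
```

Real systems replace the random-shooting search with MCTS guided by the policy and value networks, but the division of labor between learned components and a non-learned search algorithm is the same.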
Key Properties

| Property | Rating | Assessment |
|---|---|---|
| White-box Access | PARTIAL | World model inspectable, but learned representations are opaque |
| Trainability | HIGH | Model-based RL, self-play, gradient descent |
| Predictability | MEDIUM | Explicit planning, but world-model errors compound (see sketch below) |
| Modularity | MEDIUM | Clear separation of world model, policy, value |
| Formal Verifiability | PARTIAL | Planning algorithm verifiable; world model less so |
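The compounding noted under Predictability can be made concrete with an illustrative-only calculation: assume the learned dynamics are off by about 1% per step (a number chosen for visibility, not a measured figure) and watch the model's imagined trajectory drift from the true one as the horizon grows.

```python
# Illustrative only: a 1% per-step dynamics mismatch (assumed) compounds
# multiplicatively, so long imagined rollouts drift far from reality.
def true_step(x):
    return 1.05 * x            # "true" environment dynamics

def model_step(x):
    return 1.05 * 1.01 * x     # learned model with ~1% per-step error

x_true, x_model = 1.0, 1.0
for h in range(1, 41):
    x_true, x_model = true_step(x_true), model_step(x_model)
    if h in (1, 5, 10, 20, 40):
        rel_err = abs(x_model - x_true) / abs(x_true)
        print(f"horizon {h:2d}: relative error {rel_err:6.1%}")
# ~1% at horizon 1, ~10% by horizon 10, ~49% by horizon 40
```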
LeCun's Position

Yann LeCun, Meta's VP and Chief AI Scientist, has been the most vocal advocate for world models as the path to AGI. His 2022 position paper "A Path Towards Autonomous Machine Intelligence" argues:

- LLMs are a "dead end": autoregressive token prediction does not produce genuine world understanding
- Prediction in embedding space: JEPA predicts high-level representations, not raw pixels/tokens
- Six-module architecture: perception, world model, cost, memory, action, configurator
- Hierarchical planning: multiple abstraction levels are needed for complex tasks
The JEPA Framework
JEPA (Joint Embedding Predictive Architecture) differs fundamentally from generative models:
| Aspect | Generative (LLMs, Diffusion) | JEPA |
|---|---|---|
| Prediction target | Raw tokens/pixels | Abstract embeddings |
| Uncertainty handling | Must model all details | Ignores irrelevant variation |
| Training signal | Reconstruction loss | Contrastive/predictive loss |
| Information focus | Surface-level patterns | Semantic structure |
Meta has released three JEPA implementations: I-JEPA (images, 2023), V-JEPA (video, 2024), and VL-JEPA (vision-language, 2024).
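A minimal sketch of a JEPA-style training step, assuming PyTorch. The tiny MLP encoder, linear predictor, EMA rate, and the random tensors standing in for two views of the same input are all assumptions for illustration; this is not Meta's I-JEPA code.

```python
# Sketch of joint-embedding predictive training: predict the embedding of a
# target view from a context view; the target comes from an EMA ("momentum")
# copy of the encoder, so the loss lives in embedding space, not pixel space.
import torch
import torch.nn as nn

dim = 64
encoder = nn.Sequential(nn.Linear(128, dim), nn.ReLU(), nn.Linear(dim, dim))
target_encoder = nn.Sequential(nn.Linear(128, dim), nn.ReLU(), nn.Linear(dim, dim))
target_encoder.load_state_dict(encoder.state_dict())
for p in target_encoder.parameters():
    p.requires_grad = False                      # target branch gets no gradients
predictor = nn.Linear(dim, dim)
opt = torch.optim.Adam([*encoder.parameters(), *predictor.parameters()], lr=1e-3)

# Stand-ins: in practice these would be two views of the same input
# (e.g., visible vs. masked image patches), not independent noise.
context, target = torch.randn(32, 128), torch.randn(32, 128)

pred = predictor(encoder(context))               # predict in embedding space...
with torch.no_grad():
    tgt = target_encoder(target)                 # ...not raw pixels/tokens
loss = nn.functional.mse_loss(pred, tgt)         # predictive loss on embeddings
opt.zero_grad()
loss.backward()
opt.step()

with torch.no_grad():                            # slow EMA update of target encoder
    for p, tp in zip(encoder.parameters(), target_encoder.parameters()):
        tp.mul_(0.99).add_(0.01 * p)
```

The point of the design is visible in the loss line: the model is never asked to reconstruct raw pixels or tokens, so it can ignore irrelevant surface variation and spend capacity on semantic structure.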
Counter-Arguments

| LeCun Claim | Counter | Assessment |
|---|---|---|
| LLMs don't understand | They demonstrate understanding on diverse benchmarks | Partially valid: benchmark performance vs. true understanding is debated |
| Autoregressive is limited | GPT-4/o1 show complex reasoning | Contested: the reasoning may still be pattern matching |
| Need explicit world model | An implicit world model may emerge in LLMs | Open question: the Sora debate suggests possible emergence |
| JEPA is superior | No JEPA system matches GPT-4's capability | Currently true: JEPA hasn't demonstrated general-task success |
Comparison with LLM Approaches

| Aspect | World Models | LLMs | Winner (2025) |
|---|---|---|---|
| Planning mechanism | Explicit MCTS search (80k pos/sec) | Implicit chain-of-thought | Context-dependent |
| World knowledge | Learned dynamics model | Compressed in weights | LLMs (broader) |
| Sample efficiency | 10-500x better (EfficientZero) | Requires billions of tokens | World models |
| Generalization | Compositional planning | In-context learning | LLMs (more flexible) |
| Task diversity | Requires domain-specific training | Single model, many tasks | LLMs |
| Current SOTA | Games, robotics control | Language, reasoning, code | Domain-dependent |
| Compute at inference | High (search required) | Lower (single forward pass) | LLMs |
| Interpretability | Partial (can query world model) | Low (weights opaque) | World models |
Sample Efficiency Comparison

| Algorithm | Steps to Human-Level (Atari) | Relative Efficiency |
|---|---|---|
| DQN (2015) | 50,000,000 | 1x (baseline) |
| Rainbow (2017) | 10,000,000 | 5x |
| SimPLe (2019) | 100,000 | 500x |
| EfficientZero (2021) | 100,000 | 500x (194% of human score) |
| DreamerV3 (2023) | Variable by task | Fixed hyperparameters across 150+ tasks |
Model-based approaches achieve comparable or superior performance with two to three orders of magnitude less data, which is critical for robotics and other real-world applications where data collection is expensive or dangerous.
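As a quick check on the arithmetic behind the "Relative Efficiency" column (baseline step count divided by each algorithm's step count):

```python
# Relative efficiency = baseline env-steps / algorithm env-steps (from the table).
steps = {
    "DQN (2015)": 50_000_000,        # baseline
    "Rainbow (2017)": 10_000_000,
    "SimPLe (2019)": 100_000,
    "EfficientZero (2021)": 100_000,
}
baseline = steps["DQN (2015)"]
for algo, n in steps.items():
    print(f"{algo:22s} {n:>11,d} steps -> {baseline / n:,.0f}x")
```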
Safety Research Implications

Research That Applies Well

| Research Area | Why It Applies |
|---|---|
| Reward modeling | Explicit reward models are central to the architecture |
| Goal specification | Planning objectives are visible |
| Corrigibility | Goals and the world model can potentially be modified directly |
| Interpretability of beliefs | World-model predictions can be queried (sketch below) |
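One way the "interpretability of beliefs" advantage could look in practice, as a hedged sketch: before executing a candidate plan, roll it out inside the learned model and surface the predicted states and rewards for review. The toy dynamics and reward functions below are hand-written stand-ins for learned models.

```python
# Sketch of inspectable beliefs: query the world model's predictions for a
# candidate plan before acting on it. Toy stand-in models, not learned ones.
def dynamics(state, action):
    return 0.9 * state + action          # learned dynamics model, in practice

def reward_model(state, action):
    return -(state - 1.0) ** 2           # learned reward model, in practice

def inspect_plan(state, plan):
    """Print the model's imagined trajectory for `plan` before execution."""
    for t, action in enumerate(plan):
        r = reward_model(state, action)  # predicted reward for this step
        state = dynamics(state, action)  # predicted next state
        print(f"t={t}: action={action:+.2f} -> state={state:+.3f}, reward={r:+.3f}")

inspect_plan(state=0.0, plan=[0.8, 0.4, 0.1])
```

No comparable hook exists for a model-free policy or an LLM, whose expectations about the world are entangled in opaque weights.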
Unique Safety Challenges

| Challenge | Description |
|---|---|
| Reward hacking | Planning will find unexpected ways to maximize the specified reward |
| World model exploitation | The agent may exploit inaccuracies in its own world model |
| Power-seeking | Planning may naturally discover instrumental strategies |
| Deceptive planning | Could an agent learn to simulate safe behavior while planning harm? |
Trajectory Assessment

Arguments For Growth (40-60% probability of increased importance)

| Factor | Evidence | Strength |
|---|---|---|
| Sample efficiency | 500x improvement demonstrated (EfficientZero) | Strong |
| Robotics demand | Physical tasks need prediction; 2024 survey shows growing adoption | Strong |
| LeCun's advocacy | Meta investing heavily in JEPA research | Moderate |
| Compositionality | Planning naturally combines learned skills | Moderate |
| Scaling evidence | DreamerV3 shows favorable scaling with model size | Moderate |
Arguments Against (40-60% probability of continued niche status)

| Factor | Evidence | Strength |
|---|---|---|
| LLMs dominating | GPT-4, Claude perform well on most general tasks | Strong |
| World model accuracy | Errors compound over the planning horizon | Strong |
| Computational cost | MCTS is expensive; AlphaZero searches 80k positions/sec vs. Stockfish's 70M | Moderate |
| Limited generalization | No world-model system matches LLM task diversity | Strong |
| Hybrid approaches emerging | LLM + world-model combinations may dominate both | Moderate |
Probability Assessment: Paradigm Dominance at TAI

| Scenario | Probability | Reasoning |
|---|---|---|
| World models dominant | 5-10% | Would require a breakthrough in scalable world modeling |
| Hybrid (LLM + world model) dominant | 25-40% | Combines strengths; active research area |
| LLMs dominant (current trajectory) | 40-55% | Empirically winning; massive investment |
| Novel paradigm | 10-20% | Unknown unknowns |
Key Uncertainties

| Uncertainty | Current Evidence | Resolution Timeline | Impact on Safety |
|---|---|---|---|
| Can world models scale to real-world complexity? | DreamerV3 handles 150+ tasks; robotics limited | 2-5 years | High: determines applicability |
| Will hybrid approaches dominate? | Active research (LLM + world model); no clear winner | | |