Longterm Wiki
Updated 2026-02-10
Summary

An interactive sortable table comparing 16 architecture scenarios across dimensions including likelihood, safety outlook, whitebox access, modularity, and formal verifiability. Supports grouped and unified views by category.
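The grouped/unified views described above amount to sorting a list of records, optionally bucketed by category. A minimal sketch of that data model, with hypothetical field names (not the wiki's actual schema) and illustrative values:

```python
from dataclasses import dataclass
from itertools import groupby

# Hypothetical schema for one table row; field names and values are
# illustrative, not the wiki's actual data model.
@dataclass
class Scenario:
    name: str
    category: str
    p_dominant: float  # midpoint of the probability range, e.g. 0.10 for "5-15%"
    safety: int        # illustrative 0-10 safety score

rows = [
    Scenario("Bare model access", "Deployment Patterns", 0.10, 5),
    Scenario("Heavy scaffolding", "Deployment Patterns", 0.325, 4),
    Scenario("Dense transformers", "Base Architectures", 0.0, 5),
]

# Unified view: sort every row by likelihood, descending.
unified = sorted(rows, key=lambda s: s.p_dominant, reverse=True)

# Grouped view: bucket rows by category, then sort within each bucket.
# groupby requires its input to be pre-sorted by the grouping key.
grouped = {
    cat: sorted(group, key=lambda s: s.p_dominant, reverse=True)
    for cat, group in groupby(sorted(rows, key=lambda s: s.category),
                              key=lambda s: s.category)
}
```

Range probabilities like "5-15%" would need a convention (here, the midpoint) before they sort cleanly.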


Architecture Scenarios Table

Columns:
- Probability this becomes dominant at TAI
- Trend
- Overall safety assessment (illustrative)
- Interpretability of internals
- Training approach
- Behavior predictability
- Component separation
- Formal verification possible
- Key papers
- Labs
- Safety pros
- Safety cons

Direct model access (Deployment Patterns)
Description: Direct model API/chat with basic prompting. No persistent memory, minimal tools. Like the ChatGPT web interface.
Probability dominant at TAI: 5-15%
Trend: Unlikely to stay dominant; scaffolding adds clear value.
Safety assessment (illustrative): 5/10, Mixed. Easy to study but limited interpretability; low capability ceiling reduces risk.
Interpretability: LOW. Model internals opaque; we see only inputs and outputs.
Training approach: HIGH. Standard RLHF on a base model.
Behavior predictability: MEDIUM. Single forward pass, somewhat predictable.
Component separation: LOW. Monolithic model.
Formal verification: LOW. The model itself is unverifiable.
Safety pros: Simple to analyze; no tool access means limited harm.
Safety cons: Model internals opaque; limited capability ceiling.

Basic scaffolding (Deployment Patterns)
Description: Model + basic tool use + simple chains. RAG, function calling, single-agent loops. Like GPT with plugins.
Probability dominant at TAI: 15-25%
Trend: Current sweet spot, but heavy scaffolding is catching up.
Safety assessment (illustrative): 5/10, Mixed. Tool use adds capability and risk; the scaffold provides some inspection.
Interpretability: MEDIUM. Scaffold code readable; model still opaque.
Training approach: HIGH. Model trained; scaffold is code.
Behavior predictability: MEDIUM. Tool calls add some unpredictability.
Component separation: MEDIUM. Clear tool boundaries.
Formal verification: PARTIAL. Scaffold code can be verified.
Safety pros: Scaffold logic inspectable; tool permissions controllable.
Safety cons: Tool use enables real-world harm; model decisions still opaque.

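The "tool permissions controllable" point is that the scaffold, being ordinary code, can gate every tool call regardless of what the model requests. A minimal sketch; all names here are hypothetical, and real function-calling APIs differ:

```python
# Minimal tool-use scaffold with an explicit allowlist. The scaffold,
# not the model, decides whether a requested tool call actually runs.
ALLOWED_TOOLS = {"search", "calculator"}

def calculator(expr: str) -> str:
    # Toy arithmetic evaluator for the sketch; a real scaffold would
    # sandbox this rather than call eval on filtered input.
    if not set(expr) <= set("0123456789+-*/(). "):
        raise ValueError("unsupported expression")
    return str(eval(expr))

TOOLS = {"calculator": calculator}  # installed tool implementations

def dispatch(tool_name: str, argument: str) -> str:
    """Gate a model-requested tool call through the allowlist."""
    if tool_name not in ALLOWED_TOOLS:
        return f"refused: {tool_name} is not permitted"
    if tool_name not in TOOLS:
        return f"error: {tool_name} not installed"
    return TOOLS[tool_name](argument)
```

Because the gate lives outside the model, it holds even if the model's internal decision process is opaque.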
Heavy scaffolding / agentic systems (Deployment Patterns)
Description: Multi-agent systems, complex orchestration, persistent memory, autonomous operation. Like Claude Code or Devin.
Probability dominant at TAI: 25-40%
Trend: Strong trend; scaffolding is getting cheaper and more valuable.
Safety assessment (illustrative): 4/10, Challenging. High capability with emergent behavior; the scaffold helps, but autonomy is risky.
Interpretability: MEDIUM-HIGH. Scaffold code fully readable; model calls are black boxes.
Training approach: LOW. Models trained separately; scaffold is engineered code.
Behavior predictability: LOW. Multi-step plans diverge unpredictably.
Component separation: HIGH. Explicit component architecture.
Formal verification: PARTIAL. Scaffold verifiable; model calls are not.
Safety pros: Scaffold code auditable; safety checks can be added in code; modular.
Safety cons: Emergent multi-step behavior; autonomy means less oversight; tool-use risk.

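The "can add safety checks in code" advantage can be made concrete: scaffold-level limits like step budgets, deny-rules, and audit logs are enforced in plain code outside the model. A sketch, where `propose_action` is a hypothetical stand-in for a model call:

```python
# Scaffold-level safety checks for an agent loop: a hard step budget,
# an illustrative deny-rule, and an audit log, all enforced in ordinary
# code outside the model. `propose_action` is a hypothetical stub.
def propose_action(step: int) -> str:
    return f"action-{step}"  # a real scaffold would query the model here

def run_agent(max_steps: int = 5) -> list[str]:
    audit_log: list[str] = []
    for step in range(max_steps):         # budget bounds autonomy
        action = propose_action(step)
        if action.startswith("delete"):   # illustrative deny-rule
            audit_log.append(f"BLOCKED {action}")
            continue
        audit_log.append(f"RAN {action}")
    return audit_log
```

These checks constrain what the agent does, not what it "intends"; the multi-step divergence noted above happens within whatever the checks allow.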
Dense transformers (Base Architectures)
Description: Standard transformer architecture, all parameters active. The current GPT/Claude/Llama architecture.
Probability dominant at TAI: (base architecture)
Trend: Orthogonal to deployment; combined with scaffolding choices.
Safety assessment (illustrative): 5/10, Mixed. Most studied but still opaque; interpretability improving, but slowly.
Interpretability: LOW. Weights exist, but mechanistic interpretability is still primitive.
Training approach: HIGH. Well-understood pretraining + RLHF.
Behavior predictability: LOW-MEDIUM. Emergent capabilities, phase transitions.
Component separation: LOW. Monolithic, end-to-end trained.
Formal verification: LOW. Billions of parameters, no formal guarantees.
Safety pros: Most studied architecture; some interpretability tools exist.
Safety cons: Internals still opaque; emergent deception possible; scale makes analysis hard.

Sparse / Mixture-of-Experts (Base Architectures)
Description: Mixture-of-Experts or other sparse architectures. Only a subset of parameters is active per token.
Probability dominant at TAI: (base architecture)
Trend: May become the default for efficiency; orthogonal to scaffolding.
Safety assessment (illustrative): 4/10, Mixed. Efficiency gains are good for the safety research budget, but routing adds complexity.
Interpretability: LOW. Same opacity as dense models, plus routing complexity.
Training approach: HIGH. Standard training plus load balancing.
Behavior predictability: LOW. Routing adds another layer of unpredictability.
Component separation: MEDIUM. Expert boundaries exist but interact.
Formal verification: LOW. Combinatorial explosion of expert paths.
Safety pros: Can study individual experts; more efficient means more testing budget.
Safety cons: Routing is another black box; hard to cover all expert combinations.

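The "combinatorial explosion of expert paths" can be quantified: with top-k routing, each MoE layer activates one of C(E, k) expert subsets, so a model with L such layers has C(E, k)^L distinct per-token paths. A sketch with illustrative numbers:

```python
import math

# With top-k routing, each MoE layer picks one of C(E, k) expert
# subsets, so L layers give C(E, k) ** L distinct expert paths per
# token. Parameter values below are illustrative.
def expert_paths(num_experts: int, top_k: int, num_layers: int) -> int:
    subsets_per_layer = math.comb(num_experts, top_k)
    return subsets_per_layer ** num_layers

# e.g. 8 experts, top-2 routing, 4 MoE layers: 28 ** 4 = 614,656 paths
paths = expert_paths(8, 2, 4)
```

Even this small configuration exceeds 600,000 paths, which is why exhaustive coverage of expert combinations is impractical at production scale.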
State-space models (Base Architectures)
Description: State-space models or SSM-transformer hybrids with linear-time inference.
Probability dominant at TAI: 5-15%
Trend: Promising efficiency, but transformers still dominate benchmarks.
Safety assessment (illustrative): Unknown. Too early to assess; different internals may help or hurt.
Interpretability: MEDIUM. Different internals, less studied.
Training approach: HIGH. Still gradient-based.
Behavior predictability: MEDIUM. Recurrence adds complexity.
Component separation: LOW. Similar to transformers.
Formal verification: UNKNOWN. Recurrence may help or hurt.
Labs: Cartesia, Together AI, Princeton.
Safety pros: More efficient; linear complexity.
Safety cons: Interpretability tools don't transfer; less studied.

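The "linear-time inference" claim comes from the recurrent form of an SSM: one constant-cost state update per token, so a length-n sequence costs O(n) rather than attention's O(n^2) pairwise interactions. A toy scalar sketch; the constants are illustrative, not a real Mamba/S4 parameterization:

```python
# Toy scalar state-space recurrence: h_t = a*h_{t-1} + b*x_t, y_t = c*h_t.
# One O(1) update per token means O(n) for the whole sequence, versus
# O(n^2) for pairwise attention. Constants are illustrative only.
def ssm_scan(xs: list[float], a: float = 0.9, b: float = 1.0,
             c: float = 0.5) -> list[float]:
    h, ys = 0.0, []
    for x in xs:          # single pass: one state update per token
        h = a * h + b * x
        ys.append(c * h)
    return ys
```

The flip side for interpretability is that all history is compressed into the running state `h`, whereas attention keeps every past token individually addressable.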
World models + planning (Base Architectures)
Description: Explicit learned world model with search/planning. More like AlphaGo than GPT.
Probability dominant at TAI: 5-15%
Trend: LeCun advocates it; not yet competitive for general tasks.
Safety assessment (illustrative): 6/10, Mixed. Explicit structure helps inspection, but goal misgeneralization risks are higher.
Interpretability: PARTIAL. World model inspectable but opaque.
Training approach: HIGH. Model-based RL, self-play.
Behavior predictability: MEDIUM. Explicit planning, but model errors compound.
Component separation: MEDIUM. Separate world model, policy, and value function.
Formal verification: PARTIAL. Planning verifiable; world model less so.
Labs: Google DeepMind, Meta FAIR, UC Berkeley.
Safety pros: Explicit goals; can inspect beliefs.
Safety cons: Goal misgeneralization; mesa-optimization.

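The "can inspect beliefs" advantage is that the agent's predictions and goals live in explicit data structures rather than only in weights. A toy one-step lookahead sketch; the transition table and rewards are stand-ins, not a real MuZero-style learned model:

```python
# Planning against an explicit world model: score each action by
# simulating it with a transition model, then pick the best. The model
# and rewards here are toy stand-ins for learned components.
WORLD_MODEL = {  # state -> action -> predicted next state (inspectable)
    "start": {"left": "cliff", "right": "goal"},
}
REWARD = {"cliff": -10.0, "goal": 1.0}

def plan(state: str) -> str:
    """One-step lookahead: the agent's 'beliefs' are plain data we can audit."""
    actions = WORLD_MODEL[state]
    return max(actions, key=lambda a: REWARD[actions[a]])
```

The inspection benefit degrades as the world model itself becomes a large learned network; then the planner is auditable but its inputs are not, which is the "model errors compound" caveat above.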
Neurosymbolic hybrids (Base Architectures)
Description: Neural + symbolic reasoning, knowledge graphs, or program synthesis.
Probability dominant at TAI: 3-10%
Trend: Long promised, rarely delivered at scale.
Safety assessment (illustrative): 7/10, Favorable. Symbolic components enable formal verification; hybrid boundaries remain a challenge.
Interpretability: PARTIAL. Symbolic parts clear, neural parts opaque.
Training approach: COMPLEX. Neural parts trainable; symbolic parts often hand-crafted.
Behavior predictability: MEDIUM. Explicit reasoning is more auditable.
Component separation: HIGH. Clear neural/symbolic separation.
Formal verification: PARTIAL. Symbolic parts formally verifiable.
Key papers: Neural Theorem Provers; AlphaProof (2024).
Labs: IBM Research, Google DeepMind, MIT-IBM Lab.
Safety pros: Auditable reasoning; formal verification possible.
Safety cons: Brittleness; hard to scale; boundary problems.

Provably safe AI (Base Architectures)
Description: Formally verified AI with mathematical safety guarantees. Davidad's agenda.
Probability dominant at TAI: 1-5%
Trend: Ambitious; unclear whether it is achievable for general capabilities.
Safety assessment (illustrative): 9/10, Favorable. If achievable, the best safety properties by design; large uncertainty about feasibility.
Interpretability: HIGH. Designed for formal analysis.
Training approach: DIFFERENT. Verified synthesis, not just SGD.
Behavior predictability: HIGH. Behavior bounded by proofs.
Component separation: HIGH. Compositional by design.
Formal verification: HIGH. This is the point.
Labs: ARIA (Davidad), MIRI.
Safety pros: Mathematical guarantees; auditable by construction.
Safety cons: May not scale; capability tax; world-model verification is hard.

Unknown paradigm (Base Architectures)
Description: Something we haven't thought of yet. A placeholder for model uncertainty.
Probability dominant at TAI: 5-15%
Trend: Epistemic humility; history suggests surprises.
Safety assessment (illustrative): Unknown. Cannot assess; all current safety research may or may not transfer.
Interpretability: ??? Depends on what emerges.
Training approach: ??? Unknown.
Behavior predictability: ??? No basis for prediction.
Component separation: ??? Unknown.
Formal verification: ??? Unknown.
Key papers / Labs: None listed.
Safety pros: Fresh start possible.
Safety cons: All current work may not transfer.

Biological computing (Alternative Compute)
Description: Actual biological neurons, brain organoids, or wetware computing.
Probability dominant at TAI: <1%
Trend: Fascinating, but far from TAI-relevant scale.
Safety assessment (illustrative): 3/10, Challenging. Deeply opaque; no existing safety tools apply; ethical complexities.
Interpretability: LOW. Biological systems are inherently opaque.
Training approach: UNKNOWN. Biological learning rules.
Behavior predictability: LOW. Noisy and variable.
Component separation: LOW. Highly interconnected.
Formal verification: LOW. Too complex.
Labs: Cortical Labs; various academic groups.
Safety pros: May have human-like values; energy efficient.
Safety cons: Ethical concerns; no interpretability tools; slow iteration.

Neuromorphic hardware (Alternative Compute)
Description: Spiking neural networks on specialized chips. Event-driven, analog.
Probability dominant at TAI: 1-3%
Trend: Efficiency gains are real, but not on the path to TAI.
Safety assessment (illustrative): Unknown. A different substrate with different properties; too early to assess.
Interpretability: PARTIAL. Architecture known, dynamics complex.
Training approach: DIFFERENT. Spike-timing plasticity.
Behavior predictability: MEDIUM. More brain-like.
Component separation: MEDIUM. Modular chip designs possible.
Formal verification: LOW. Analog dynamics are hard to verify.
Labs: Intel Labs, IBM Research, SynSense.
Safety pros: Energy efficient; robust.
Safety cons: Current tools don't transfer; less mature.

Whole brain emulation (Non-AI Paradigms)
Description: Upload/simulate a complete biological brain at sufficient fidelity. Requires scanning + simulation technology.
Probability dominant at TAI: <1%
Trend: Probably slower than AI; scanning technology is far away.
Safety assessment (illustrative): 5/10, Mixed. Human values by default, but speed-up and copyability create novel risks.
Interpretability: LOW. Brain structure visible but not interpretable.
Training approach: N/A. Copied from biological learning.
Behavior predictability: LOW. Human-like means unpredictable.
Component separation: LOW. Brains are highly interconnected.
Formal verification: LOW. Too complex, poorly understood.
Labs: Carboncopies; academic neuroscience.
Safety pros: Human values by default; a well-understood type of entity.
Safety cons: Ethics of copying minds; could run faster than real time; identity issues.

Human genetic enhancement (Non-AI Paradigms)
Description: IQ enhancement via embryo selection, polygenic screening, or direct genetic engineering.
Probability dominant at TAI: <0.5%
Trend: Too slow for the TAI race; incremental gains only.
Safety assessment (illustrative): 7/10, Favorable. Slow and controllable; enhanced humans still have human values.
Interpretability: LOW. Genetic effects are poorly understood.
Training approach: N/A. Biological development.
Behavior predictability: MEDIUM. Still human, but smarter.
Component separation: LOW. An integrated biological system.
Formal verification: LOW. Biological complexity.
Labs: Genomic Prediction; academic genetics.
Safety pros: Human values; slow and controllable; socially legible.
Safety cons: Ethical concerns; too slow to matter for TAI; inequality risks.

Brain-computer interfaces (Non-AI Paradigms)
Description: Neural interfaces that augment human cognition with AI/compute. Neuralink-style.
Probability dominant at TAI: <1%
Trend: Bandwidth limits; standalone AI is likely faster.
Safety assessment (illustrative): 5/10, Mixed. Human oversight built in, but security risks and bandwidth limits.
Interpretability: PARTIAL. Interface visible, brain opaque.
Training approach: HYBRID. Human learning + AI training.
Behavior predictability: LOW. Human in the loop means unpredictable.
Component separation: MEDIUM. Clear human/AI boundary.
Formal verification: LOW. The human component is unverifiable.
Labs: Neuralink, Synchron, BrainGate.
Safety pros: Human oversight built in; gradual augmentation.
Safety cons: Bandwidth limits; security risks; human bottleneck.

Collective intelligence (Non-AI Paradigms)
Description: Human-AI teams, prediction markets, deliberative democracy augmented by AI. Intelligence from coordination.
Probability dominant at TAI: (overlay)
Trend: Not exclusive with the other scenarios; already happening.
Safety assessment (illustrative): 7/10, Favorable. Human oversight is natural and the pace slower, but coordination is challenging.
Interpretability: PARTIAL. The process is visible; emergent behavior less so.
Training approach: N/A. Coordination protocols, not training.
Behavior predictability: MEDIUM. Depends on protocol design.
Component separation: HIGH. Explicitly modular by design.
Formal verification: PARTIAL. Protocols can be analyzed.
Key papers: Collective intelligence papers.
Safety pros: Human oversight; diverse perspectives; slower means more controllable.
Safety cons: Coordination failures; vulnerable to manipulation; may not scale.

16 scenarios across 4 categories