What's the Short Timeline Plan?
Credibility Rating
Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.
Rating inherited from publication venue: Alignment Forum
Written by Marius Hobbhahn of Apollo Research in January 2025, this post synthesizes near-term AI safety priorities under short-timeline assumptions and is notable for prompting community discussion on the absence of detailed public safety roadmaps.
Metadata
Summary
Marius Hobbhahn outlines a two-layer safety plan for scenarios where transformative AI arrives soon, arguing that current publicly available strategies are insufficiently detailed. Layer 1 focuses on near-term controls such as chain-of-thought (CoT) monitoring, AI control, and evals; Layer 2 addresses deeper alignment research, including interpretability and scalable oversight.
Key Points
- Short AI timelines are treated as plausible, requiring concrete safety plans that go beyond vague aspirations.
- Layer 1 priorities: maintaining human-legible chain-of-thought, improved monitoring, AI control methods, scheming detection, robust evals, and security.
- Layer 2 priorities: improved near-term alignment strategies, interpretability, scalable oversight, reasoning transparency, and a safety-first organizational culture.
- The post expresses concern that detailed, publicly available short-timeline safety plans are largely absent from the AI safety community.
- The author acknowledges known limitations and open questions, framing the post as a prompt for community discussion rather than a finished blueprint.
Cited by 1 page
| Page | Type | Quality |
|---|---|---|
| Short AI Timeline Policy Implications | Analysis | 62.0 |
Cached Content Preview
[What’s the short timeline plan?](https://www.alignmentforum.org/posts/bb5Tnjdrptu89rcyY/what-s-the-short-timeline-plan#)
28 min read
- [Short timelines are plausible](https://www.alignmentforum.org/posts/bb5Tnjdrptu89rcyY/what-s-the-short-timeline-plan#Short_timelines_are_plausible)
- [What do we need to achieve at a minimum?](https://www.alignmentforum.org/posts/bb5Tnjdrptu89rcyY/what-s-the-short-timeline-plan#What_do_we_need_to_achieve_at_a_minimum_)
- [Making conservative assumptions for safety progress](https://www.alignmentforum.org/posts/bb5Tnjdrptu89rcyY/what-s-the-short-timeline-plan#Making_conservative_assumptions_for_safety_progress)
- [So what's the plan?](https://www.alignmentforum.org/posts/bb5Tnjdrptu89rcyY/what-s-the-short-timeline-plan#So_what_s_the_plan_)
- [Layer 1](https://www.alignmentforum.org/posts/bb5Tnjdrptu89rcyY/what-s-the-short-timeline-plan#Layer_1)
- [Keep a paradigm with faithful and human-legible CoT](https://www.alignmentforum.org/posts/bb5Tnjdrptu89rcyY/what-s-the-short-timeline-plan#Keep_a_paradigm_with_faithful_and_human_legible_CoT)
- [Significantly better (CoT, action & white-box) monitoring](https://www.alignmentforum.org/posts/bb5Tnjdrptu89rcyY/what-s-the-short-timeline-plan#Significantly_better__CoT__action___white_box__monitoring)
- [Control (that doesn't assume human-legible CoT)](https://www.alignmentforum.org/posts/bb5Tnjdrptu89rcyY/what-s-the-short-timeline-plan#Control__that_doesn_t_assume_human_legible_CoT_)
- [Much deeper understanding of scheming](https://www.alignmentforum.org/posts/bb5Tnjdrptu89rcyY/what-s-the-short-timeline-plan#Much_deeper_understanding_of_scheming)
- [Evals](https://www.alignmentforum.org/posts/bb5Tnjdrptu89rcyY/what-s-the-short-timeline-plan#Evals)
- [Security](https://www.alignmentforum.org/posts/bb5Tnjdrptu89rcyY/what-s-the-short-timeline-plan#Security)
- [Layer 2](https://www.alignmentforum.org/posts/bb5Tnjdrptu89rcyY/what-s-the-short-timeline-plan#Layer_2)
- [Improved near-term alignment strategies](https://www.alignmentforum.org/posts/bb5Tnjdrptu89rcyY/what-s-the-short-timeline-plan#Improved_near_term_alignment_strategies)
- [Continued work on interpretability, scalable oversight, superalignment & co](https://www.alignmentforum.org/posts/bb5Tnjdrptu89rcyY/what-s-the-short-timeline-plan#Continued_work_on_interpretability__scalable_oversight__superalignment___co)
- [Reasoning transparency](https://www.alignmentforum.org/posts/bb5Tnjdrptu89rcyY/what-s-the-short-timeline-plan#Reasoning_transparency)
- [Safety first culture](https://www.alignmentforum.org/posts/bb5Tnjdrptu89rcyY/what-s-the-short-timeline-plan#Safety_first_culture)
- [Known limitations and open questions](https://www.alignmentforum.org/posts/bb5Tnjdrptu89rcyY/what-s-the-short-timeline-plan#Known_limitations_and_open_questions)
[AI Control](https://www.alignmentforum.org/w/ai-control) [AI Evaluations](https://www.alignmentforum.org/w/ai-evaluations) [Deceptive Alignment](https
... (truncated, 69 KB total)