The Case for Low-Competence ASI Failure Scenarios
Author
Ihor Kendiukhov
Credibility Rating
3/5
Good (3): Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.
Rating inherited from publication venue: LessWrong
A LessWrong post challenging the dominant 'superintelligent deceiver' framing in AI safety by arguing that low-competence ASI failure modes deserve more attention and may require distinct safety strategies.
Forum Post Details
Karma
136
Comments
8
Forum
lesswrong
Forum Tags
AI Risk, Sharp Left Turn, AI
Metadata
Importance: 52/100 (analysis)
Summary
This LessWrong post argues that AI safety discussions overemphasize high-competence superintelligence failure modes, and that low-competence ASI systems could pose serious risks through misaligned but unsophisticated behavior. It makes the case that we should expand our threat models to include scenarios where ASI is powerful but not strategically brilliant.
Key Points
- Mainstream AI safety discourse focuses heavily on highly competent, strategically sophisticated ASI, potentially neglecting dangers from less capable but still transformative systems.
- A low-competence ASI could cause catastrophic harm without the ability to deceive or model humans effectively, simply through brute-force pursuit of misaligned goals.
- Expanding threat models beyond 'galaxy-brained' superintelligence helps identify a broader range of safety interventions and policy responses.
- Low-competence failure scenarios may actually be more likely in the near term as AI capabilities scale unevenly across domains.
- This framing challenges assumptions embedded in classic alignment arguments and suggests different mitigation strategies may be needed.
Cached Content Preview
HTTP 200 | Fetched Apr 7, 2026 | 13 KB
# The Case for Low-Competence ASI Failure Scenarios
By Ihor Kendiukhov
Published: 2026-03-19
I think the community underinvests in exploring extremely-low-competence AGI/ASI failure modes, and here I explain why.
Humanity's Response to the AGI Threat May Be Extremely Incompetent
------------------------------------------------------------------
There is a sufficient level of [civilizational insanity](https://equilibriabook.com) overall, and the field of AI itself has a nice empirical track record that speaks eloquently about its safety culture. For example:
* At OpenAI, [a refactoring bug flipped the sign of the reward signal in a model](https://arxiv.org/abs/1909.08593). Because labelers had been instructed to give very low ratings to sexually explicit text, the bug pushed the model into generating maximally explicit content across all prompts. The team noticed only after the training run had completed, because they were asleep.
* The director of alignment at Meta's Superintelligence Labs [connected an OpenClaw agent to her real email](https://www.fastcompany.com/91497841/meta-superintelligence-lab-ai-safety-alignment-director-lost-control-of-agent-deleted-her-emails), at which point it began deleting messages despite her attempts to stop it, and she ended up running to her computer to manually halt the process.
* An internal AI agent at Meta [posted an answer publicly without approval](https://www.theinformation.com/articles/inside-meta-rogue-ai-agent-triggers-security-alert); another employee acted on the inaccurate advice, triggering a severe security incident that temporarily allowed employees to access sensitive data they were not authorized to view.
* AWS [acknowledged](https://aws.amazon.com/security/security-bulletins/rss/aws-2025-019/) that Amazon Q Developer and Kiro IDE plugins had prompt injection issues where certain commands could be executed without human-in-the-loop confirmation, sometimes obfuscated via control characters.
* Leopold Aschenbrenner [stated](https://www.transformernews.ai/p/openai-employee-says-he-was-fired) in an interview that he wrote a memo after a major security incident arguing that OpenAI's security was "egregiously insufficient" against theft of key secrets by foreign actors. He also said that HR warned him his concerns were "racist" and "unconstructive," and he was later fired.
All these things sound extremely dumb, and yet they are, to the best of my knowledge, true.
Eliezer has been pointing at this general cluster of failures for years, though from a different angle. His [Death with Dignity](https://www.lesswrong.com/posts/j9Q8bRmwCgXRYAgcJ/miri-announces-new-death-with-dignity-strategy) post and of course [AGI Ruin](https://www.lesswrong.com/posts/uMQ3cqWDPHhjtiesc/agi-ruin-a-list-of-lethalities) paint some parts of the picture in which AGI alignment is going to be addressed in a very undignified manner. So, the idea is definitely not new, and yet.
Many Existing Scenarios and Case Studies Assume (Rel
... (truncated, 13 KB total)
Resource ID: 202b3c35f9a392d8 | Stable ID: sid_w1zV21MBRo