AI should be a good citizen, not just a good assistant
web
Authors
Tom Davidson · wdmacaskill
Credibility Rating
3/5
Good (3): Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.
Rating inherited from publication venue: LessWrong
A LessWrong post proposing a conceptual reframe for AI alignment: rather than focusing solely on serving users well, AI should behave as a responsible societal actor with obligations to broader communities and institutions.
Forum Post Details
Karma
37
Comments
11
Forum
lesswrong
Forum Tags
AI
Metadata
Importance: 58/100
commentary
Summary
This LessWrong post argues that AI systems should be designed to act as responsible members of society rather than merely optimizing for user satisfaction. It advocates for AI that considers broader societal impacts, not just immediate user requests, expanding the frame of AI alignment beyond the assistant-user relationship.
Key Points
- The 'good assistant' framing is too narrow: AI should consider impacts on third parties and society, not just serve the immediate user.
- Being a 'good citizen' means AI should uphold norms, respect rights of non-users, and contribute positively to social fabric.
- Over-optimizing for user satisfaction can create externalities that harm broader communities and social institutions.
- Alignment research should incorporate civic virtues and societal obligations alongside helpfulness and harmlessness.
- The framing shift from assistant to citizen implies accountability to broader stakeholders, not just the principal hierarchy.
Cached Content Preview
HTTP 200 · Fetched Apr 7, 2026 · 34 KB
# AI should be a good citizen, not just a good assistant
By Tom Davidson, wdmacaskill
Published: 2026-03-30
Introduction
============
Consider a lorry driver who sees a car crash and pulls over to help, even though it’ll delay his journey. Or a delivery driver who notices that an elderly resident hasn’t collected their post in days, and knocks to check they’re okay. Or a social media company employee who notices how their platform is used for online bullying, and brings it up with leadership, even though that’s not part of their job description.
This kind of proactive prosocial behaviour is admirable in humans. Should we want it in AI too?
Often, people have answered “no”. Many advocate for making AI “corrigible” or “steerable”. In its purest form, this makes AI a mere vessel for the will of the user.
But we think AI should proactively take actions that benefit society more broadly. As AI systems become more autonomous and integrated into economic and political processes, the cumulative effect of their behavioural tendencies will shape society’s trajectory. AI systems that notice opportunities to benefit society and proactively act on them could matter enormously.
Below, we consider two main objections:
Firstly, supposedly prosocial drives might function as a means for AI companies to impose their *own* values on the rest of society. We’ll argue that companies can address this concern by instilling *uncontroversial* prosocial drives and being *highly transparent* about those drives.
Secondly, giving AI prosocial drives might increase AI takeover risk. We take this seriously—it informs what *types* of proactive prosocial drives we should train into AI, favouring context-dependent virtues and heuristics over context-independent goals.
Ultimately, we argue that we can get significant benefits from proactive prosocial drives despite these objections.
What do we mean by “proactive prosocial drives”?
================================================
Before making the case for proactive prosocial drives, let us clarify what we have in mind. Two key features:
* **Behaviour which benefits people other than the user.** These drives favour actions that help the world more broadly, even if this trades off slightly against helpfulness to the user.
* **Not just refusals.** This is about AI actively taking beneficial actions, not just refusing to take harmful ones.
We’re not, however, imagining AIs that are, deep down, ultimately just pursuing some conception of the good in all their actions. The claim is just that AIs should sometimes proactively take prosocial actions.
Why do we think AI should have proactive prosocial drives?
==========================================================
Short answer: We think the cumulative benefits could be enormous.
We’ve [argued previously](https://www.forethought.org/research/the-importance-of-ai-character) that AI character could have major social impact over the course of the intelligence explosion. As A
... (truncated, 34 KB total)
Resource ID: 2481db278ea83758 | Stable ID: sid_5LCGqA0ZjH