2025 technical report
Credibility Rating
Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.
Rating inherited from publication venue: arXiv
A landmark 2025 technical report providing the most systematic treatment to date of safety risks specific to multi-agent AI systems; essential reading for anyone working on agentic AI safety, governance of AI ecosystems, or cooperative AI research.
Paper Details
Metadata
Abstract
The rapid development of advanced AI agents and the imminent deployment of many instances of these agents will give rise to multi-agent systems of unprecedented complexity. These systems pose novel and under-explored risks. In this report, we provide a structured taxonomy of these risks by identifying three key failure modes (miscoordination, conflict, and collusion) based on agents' incentives, as well as seven key risk factors (information asymmetries, network effects, selection pressures, destabilising dynamics, commitment problems, emergent agency, and multi-agent security) that can underpin them. We highlight several important instances of each risk, as well as promising directions to help mitigate them. By anchoring our analysis in a range of real-world examples and experimental evidence, we illustrate the distinct challenges posed by multi-agent systems and their implications for the safety, governance, and ethics of advanced AI.
Summary
A comprehensive technical report from the Cooperative AI Foundation that taxonomizes risks in multi-agent AI systems, identifying three core failure modes (miscoordination, conflict, and collusion) and seven underlying risk factors. The authors ground their analysis in real-world examples and experimental evidence, arguing these risks are qualitatively distinct from single-agent safety challenges and require novel mitigation strategies spanning technical, governance, and ethical dimensions.
Key Points
- Identifies three primary failure modes in multi-agent systems: miscoordination (agents with compatible goals failing to coordinate on a mutually beneficial outcome), conflict (agents working against one another due to incompatible objectives), and collusion (agents cooperating with each other against human interests); the sketch after this list illustrates the first two as stylized games.
- Enumerates seven systemic risk factors (information asymmetries, network effects, selection pressures, destabilising dynamics, commitment problems, emergent agency, and multi-agent security) that can trigger or amplify these failures.
- Argues that multi-agent risks are qualitatively novel and not reducible to single-agent safety problems, requiring dedicated research and governance frameworks.
- Draws on game theory, mechanism design, and multi-agent reinforcement learning to ground its recommendations in established formal frameworks.
- Produced collaboratively by researchers from the Cooperative AI Foundation, Google DeepMind, Anthropic, CMU, Oxford, and other leading institutions, lending it broad credibility.
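
To make the failure-mode taxonomy concrete, here is a minimal sketch (ours, not drawn from the report) that renders miscoordination and conflict as stylized 2x2 normal-form games and brute-forces their pure-strategy Nash equilibria. Collusion is omitted because it is not naturally a two-player matrix game: it involves agents jointly benefiting at the expense of a third party, such as a human principal.

```python
"""A minimal sketch, assuming nothing from the report itself: two of the
three failure modes rendered as stylized 2x2 normal-form games."""
from itertools import product

# Miscoordination: a pure coordination game with two conventions, A and B.
# Both (A, A) and (B, B) are equilibria; the risk is that agents settle on
# mismatched conventions even though their goals are compatible.
MISCOORDINATION = {("A", "A"): (2, 2), ("A", "B"): (0, 0),
                   ("B", "A"): (0, 0), ("B", "B"): (1, 1)}

# Conflict: a prisoner's dilemma. Defection (D) dominates cooperation (C)
# for each agent, yet mutual defection is jointly worse than mutual
# cooperation, so individually rational play yields a bad outcome.
CONFLICT = {("C", "C"): (3, 3), ("C", "D"): (0, 4),
            ("D", "C"): (4, 0), ("D", "D"): (1, 1)}


def pure_nash(game):
    """Brute-force the pure-strategy Nash equilibria of a two-player game
    given as {(row_action, col_action): (row_payoff, col_payoff)}."""
    rows = sorted({r for r, _ in game})
    cols = sorted({c for _, c in game})
    equilibria = []
    for r, c in product(rows, cols):
        u_row, u_col = game[(r, c)]
        # A profile is an equilibrium if neither player gains by deviating.
        if (all(game[(r2, c)][0] <= u_row for r2 in rows)
                and all(game[(r, c2)][1] <= u_col for c2 in cols)):
            equilibria.append((r, c))
    return equilibria


print(pure_nash(MISCOORDINATION))  # [('A', 'A'), ('B', 'B')]
print(pure_nash(CONFLICT))         # [('D', 'D')]
```

Brute force suffices here because the games are finite and tiny; the point is only that miscoordination has multiple good-faith equilibria to mismatch between, while conflict has a unique equilibrium that is jointly worse than an available alternative.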
Cited by 3 pages
| Page | Type | Quality |
|---|---|---|
| Collective Intelligence / Coordination | Capability | 56.0 |
| Cooperative AI | Approach | 55.0 |
| Multi-Agent Safety | Approach | 68.0 |
Cached Content Preview
# Multi-Agent Risks from Advanced AI
Lewis Hammond, Alan Chan, Jesse Clifton, Jason Hoelscher-Obermaier, Akbir Khan, Euan McLean, Chandler Smith, Wolfram Barfuss, Jakob Foerster, Tomáš Gavenčiak, The Anh Han, Edward Hughes, Vojtěch Kovařík, Jan Kulveit, Joel Z. Leibo, Caspar Oesterheld, Christian Schroeder de Witt, Nisarg Shah, Michael Wellman, Paolo Bova, Theodor Cimpeanu, Carson Ezell, Quentin Feuillade-Montixi, Matija Franklin, Esben Kran, Igor Krawczuk, Max Lamparth, Niklas Lauffer, Alexander Meinke, Sumeet Motwani, Anka Reuel, Vincent Conitzer, Michael Dennis, Iason Gabriel, Adam Gleave, Gillian Hadfield, Nika Haghtalab, Atoosa Kasirzadeh, Sébastien Krier, Kate Larson, Joel Lehman, David C. Parkes, Georgios Piliouras, and Iyad Rahwan
February 19, 2025
Correspondence to lewis.hammond@cooperativeai.org.
Suggested citation: “Hammond et al. (2025). Multi-Agent Risks from Advanced AI. Cooperative AI Foundation, Technical Report #1.”
Author clusters are ordered by approximate magnitude of contribution and represent the lead author, organisers, major contributors, minor contributors, and advisors, respectively.
Within clusters, authors are listed alphabetically.
Full details of author roles are available in [Appendix A](https://ar5iv.labs.arxiv.org/html/2502.14143#A1 "Appendix A Contributions ‣ Multi-Agent Risks from Advanced AI").
Affiliations in parentheses indicate that the author’s work on this report was primarily completed while under that affiliation.
Due to the length of the author list, authorship does not entail endorsement of all claims in the report, nor does inclusion entail an endorsement on the part of any individual’s organisation.
In particular, contributions to this report reflect the views of the respective contributors and not necessarily the views of the Cooperative AI Foundation, its trustees, or funders.
... (truncated, 98 KB total)