[1912.06680] Dota 2 with Large Scale Deep Reinforcement Learning
Credibility Rating
Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.
Rating inherited from publication venue: arXiv
A landmark capabilities paper relevant to AI safety discussions about the rapid progression of RL systems, the role of scale in AI performance, and the challenges of aligning highly capable agents trained via self-play in complex environments.
Paper Details
Metadata
Summary
Describes OpenAI Five, a deep reinforcement learning system that achieved superhuman performance in the complex real-time strategy game Dota 2, defeating the world champion team. The paper details the training infrastructure, algorithmic choices, and scaling laws that enabled this milestone, using roughly 45,000 years of self-play experience. It serves as a landmark demonstration of what large-scale RL can achieve in long-horizon, partially observable, multi-agent environments.
Key Points
- OpenAI Five defeated the world champion Dota 2 team OG in a best-of-three series, demonstrating superhuman performance in a complex multi-agent game.
- Training required approximately 45,000 years of self-play experience accumulated through massive distributed compute infrastructure.
- The system uses a relatively simple LSTM-based architecture, suggesting scale and training time matter more than architectural complexity for such tasks.
- Demonstrates emergent teamwork and coordination among five independent agents without explicit communication mechanisms.
- Provides empirical evidence for scaling laws in RL: performance improved predictably with more compute, offering lessons for future large-scale AI systems.
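To give a sense of the scale these figures imply, the following is a back-of-the-envelope sketch in Python using only numbers quoted on this page (about 2 million frames per 2-second batch, roughly 45,000 years of self-play, 10 months of training). The function names, the assumption of uninterrupted training, and the 30-day month are illustrative assumptions, not details taken from the paper.

```python
# Rough throughput arithmetic from figures quoted on this page.
# Assumptions (not from the paper): training ran without interruption,
# and a month is treated as 30 days.

FRAMES_PER_BATCH = 2_000_000   # "approximately 2 million frames" (abstract)
SECONDS_PER_BATCH = 2          # "every 2 seconds" (abstract)
SELF_PLAY_YEARS = 45_000       # "roughly 45,000 years" (summary)
TRAINING_MONTHS = 10           # "10 months" (abstract)
DAYS_PER_MONTH = 30            # simplifying assumption


def frames_per_second(frames_per_batch: int, seconds_per_batch: int) -> float:
    """Average number of frames consumed by the optimizer per second."""
    return frames_per_batch / seconds_per_batch


def speedup_over_real_time(experience_years: int, wall_clock_months: int) -> float:
    """Years of game experience gathered per year of wall-clock time."""
    return experience_years * 12 / wall_clock_months


def experience_years_per_day(experience_years: int, wall_clock_months: int) -> float:
    """Average years of game experience gathered per wall-clock day."""
    return experience_years / (wall_clock_months * DAYS_PER_MONTH)


fps = frames_per_second(FRAMES_PER_BATCH, SECONDS_PER_BATCH)
speedup = speedup_over_real_time(SELF_PLAY_YEARS, TRAINING_MONTHS)
per_day = experience_years_per_day(SELF_PLAY_YEARS, TRAINING_MONTHS)

print(f"Optimizer throughput: {fps:,.0f} frames/s")
print(f"Experience vs. wall clock: {speedup:,.0f}x")
print(f"Self-play experience per day: {per_day:,.0f} years")
```

These are order-of-magnitude estimates only; the paper itself should be consulted for the exact compute and experience accounting.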
Cited by 1 page
| Page | Type | Quality |
|---|---|---|
| Deep Learning Revolution Era | Historical | 44.0 |
Cached Content Preview
[1912.06680] Dota 2 with Large Scale Deep Reinforcement Learning
Computer Science > Machine Learning
arXiv:1912.06680 (cs)
[Submitted on 13 Dec 2019]
Title: Dota 2 with Large Scale Deep Reinforcement Learning
Authors: OpenAI: Christopher Berner, Greg Brockman, Brooke Chan, Vicki Cheung, Przemysław Dębiak, Christy Dennison, David Farhi, Quirin Fischer, Shariq Hashme, Chris Hesse, Rafal Józefowicz, Scott Gray, Catherine Olsson, Jakub Pachocki, Michael Petrov, Henrique P. d.O. Pinto, Jonathan Raiman, Tim Salimans, Jeremy Schlatter, Jonas Schneider, Szymon Sidor, Ilya Sutskever, Jie Tang, Filip Wolski, Susan Zhang
Abstract: On April 13th, 2019, OpenAI Five became the first AI system to defeat the world champions at an esports game. The game of Dota 2 presents novel challenges for AI systems such as long time horizons, imperfect information, and complex, continuous state-action spaces, all challenges which will become increasingly central to more capable AI systems. OpenAI Five leveraged existing reinforcement learning techniques, scaled to learn from batches of approximately 2 million frames every 2 seconds. We developed a distributed training system and tools for continual training which allowed us to train OpenAI Five for 10 months. By defeating the Dota 2 world champion (Team OG), OpenAI Five demonstrates that self-play reinforcement learning can achieve superhuman performance on a difficult task.
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as: arXiv:1912.06680 [cs.LG] (or arXiv:1912.06680v1 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.1912.06680 (arXiv-issued DOI via DataCite)
Submission history
From: Filip Wolski
[v1] Fri, 13 Dec 2019 19:56:40 UTC (8,625 KB)
... (truncated, 5 KB total)033ae49bb240e454 | Stable ID: OWFhNjE2NT