Longterm Wiki
[1912.06680] Dota 2 with Large Scale Deep Reinforcement Learning

paper

Authors

OpenAI: Christopher Berner and 24 other authors

Credibility Rating

3/5
Good (3)

Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.

Rating inherited from publication venue: arXiv

A landmark capabilities paper relevant to AI safety discussions about the rapid progression of RL systems, the role of scale in AI performance, and the challenges of aligning highly capable agents trained via self-play in complex environments.

Paper Details

Citations: 2,088 (97 influential)
Year: 2019

Metadata

Importance: 62/100 · arXiv preprint · primary source

Summary

Describes OpenAI Five, a deep reinforcement learning system that reached superhuman performance in Dota 2, a complex multiplayer real-time strategy game, by defeating the reigning world champions (Team OG) on April 13, 2019. The paper details the distributed training infrastructure, the algorithmic choices, and how performance scaled with batch size and compute over 10 months of training, which amounted to roughly 45,000 years of self-play experience. It is a landmark demonstration of what large-scale RL can achieve in long-horizon, partially observable, multi-agent environments.

Key Points

  • OpenAI Five defeated the world champion Dota 2 team OG in a best-of-three series, demonstrating superhuman performance in a complex multi-agent game.
  • Training required approximately 45,000 years of self-play experience accumulated through massive distributed compute infrastructure.
  • The system uses a relatively simple LSTM-based architecture, suggesting scale and training time matter more than architectural complexity for such tasks.
  • Demonstrates emergent teamwork and coordination among five independent agents without explicit communication mechanisms.
  • Provides empirical evidence on scaling in RL: the paper reports that larger batch sizes and more compute sped up training, offering lessons for future large-scale AI systems.
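
To give a sense of the scale involved, the approximate figures quoted on this page can be combined in a short back-of-the-envelope calculation. This is only a sketch: the constants are the rounded numbers from the abstract and summary, the helper names are hypothetical, and a month is assumed to be one twelfth of a year.

```python
# Back-of-the-envelope arithmetic using the approximate figures quoted on this page.
# All constants are rounded values from the abstract and summary, not exact measurements.
FRAMES_PER_BATCH = 2_000_000   # "approximately 2 million frames" per batch (abstract)
SECONDS_PER_BATCH = 2          # one batch "every 2 seconds" (abstract)
SELF_PLAY_YEARS = 45_000       # "roughly 45,000 years" of self-play (summary)
TRAINING_MONTHS = 10           # trained "for 10 months" (abstract)


def frames_per_second(frames_per_batch: int, seconds_per_batch: float) -> float:
    """Average number of frames consumed by the optimizer per second."""
    return frames_per_batch / seconds_per_batch


def speedup_over_real_time(experience_years: float, wall_clock_months: float) -> float:
    """Ratio of accumulated game experience to wall-clock training time."""
    wall_clock_years = wall_clock_months / 12  # assumption: 12 equal months per year
    return experience_years / wall_clock_years


if __name__ == "__main__":
    fps = frames_per_second(FRAMES_PER_BATCH, SECONDS_PER_BATCH)
    ratio = speedup_over_real_time(SELF_PLAY_YEARS, TRAINING_MONTHS)
    print(f"Training throughput: about {fps:,.0f} frames per second")
    print(f"Experience gathered: about {ratio:,.0f}x faster than real time")
```

Under these assumptions the system consumed on the order of one million frames per second and gathered experience roughly 54,000 times faster than real time. The second figure is an aggregate over many games played in parallel, not the speed of any single game.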

Cited by 1 page

Page | Type | Quality
Deep Learning Revolution Era | Historical | 44.0

Cached Content Preview

HTTP 200 · Fetched Feb 22, 2026 · 5 KB
arXiv:1912.06680 (cs)
[Submitted on 13 Dec 2019]
Title: Dota 2 with Large Scale Deep Reinforcement Learning

Authors: OpenAI: Christopher Berner, Greg Brockman, Brooke Chan, Vicki Cheung, Przemysław Dębiak, Christy Dennison, David Farhi, Quirin Fischer, Shariq Hashme, Chris Hesse, Rafal Józefowicz, Scott Gray, Catherine Olsson, Jakub Pachocki, Michael Petrov, Henrique P. d.O. Pinto, Jonathan Raiman, Tim Salimans, Jeremy Schlatter, Jonas Schneider, Szymon Sidor, Ilya Sutskever, Jie Tang, Filip Wolski, Susan Zhang

Abstract: On April 13th, 2019, OpenAI Five became the first AI system to defeat the world champions at an esports game. The game of Dota 2 presents novel challenges for AI systems such as long time horizons, imperfect information, and complex, continuous state-action spaces, all challenges which will become increasingly central to more capable AI systems. OpenAI Five leveraged existing reinforcement learning techniques, scaled to learn from batches of approximately 2 million frames every 2 seconds. We developed a distributed training system and tools for continual training which allowed us to train OpenAI Five for 10 months. By defeating the Dota 2 world champion (Team OG), OpenAI Five demonstrates that self-play reinforcement learning can achieve superhuman performance on a difficult task.

Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as: arXiv:1912.06680 [cs.LG] (or arXiv:1912.06680v1 [cs.LG] for this version)
https://doi.org/10.48550/arXiv.1912.06680 (arXiv-issued DOI via DataCite)

Submission history
From: Filip Wolski
[v1] Fri, 13 Dec 2019 19:56:40 UTC (8,625 KB)

... (truncated, 5 KB total)
Resource ID: 033ae49bb240e454 | Stable ID: OWFhNjE2NT