Mastering the game of Go without human knowledge
David Silver, Julian Schrittwieser, Karen Simonyan, ..., Demis Hassabis (17 authors in total, all from DeepMind)
Summary
This paper presented AlphaGo Zero, which learned to play Go solely through self-play reinforcement learning without any human game data or handcrafted features, using a single neural network and a simpler tree search. Starting from random play, it discovered Go knowledge and novel strategies on its own. AlphaGo Zero surpassed all previous versions of AlphaGo, including the one that beat Lee Sedol.
Key findings
- Learned Go entirely from self-play with no human knowledge or supervised data.
- Used a single combined policy/value network and simplified Monte Carlo tree search.
- Defeated the earlier AlphaGo version that had beaten Lee Sedol by 100 games to 0.
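To illustrate what a single combined policy/value network means, here is a minimal sketch in plain Python. It is not the paper's architecture: the fully connected layers, the layer sizes, the 3x3 toy board, and every name (`PolicyValueNet`, `make_layer`, `forward`) are assumptions chosen for illustration. It only shows the shape of the idea: one shared trunk feeds two heads, one producing a probability distribution over moves and one producing a scalar position evaluation in [-1, 1].

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Return (weights, bias) for a dense layer with small random weights."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(layer, x):
    """Apply a dense layer: one dot product plus bias per output unit."""
    weights, bias = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


class PolicyValueNet:
    """Toy two-headed network (hypothetical; untrained, random weights)."""

    def __init__(self, n_inputs, n_moves, n_hidden=16, seed=0):
        rng = random.Random(seed)
        self.trunk = make_layer(n_inputs, n_hidden, rng)        # shared features
        self.policy_head = make_layer(n_hidden, n_moves, rng)   # move logits
        self.value_head = make_layer(n_hidden, 1, rng)          # position score

    def forward(self, board):
        hidden = [max(0.0, h) for h in linear(self.trunk, board)]  # ReLU
        policy = softmax(linear(self.policy_head, hidden))
        value = math.tanh(linear(self.value_head, hidden)[0])
        return policy, value


# Toy 3x3 board: +1 own stone, -1 opponent stone, 0 empty.
# Ten moves: nine board points plus a pass (an assumption of this sketch).
net = PolicyValueNet(n_inputs=9, n_moves=10)
board = [0, 1, 0, -1, 0, 0, 0, 0, 1]
policy, value = net.forward(board)
```

In a self-play setup of the kind the summary describes, the tree search would consult both outputs of such a network, and the network would in turn be trained on the outcomes of the games it plays; that training loop is omitted here.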
Cite this paper
David Silver, Julian Schrittwieser, Karen Simonyan, ..., & Demis Hassabis (2017). Mastering the game of Go without human knowledge. Nature. https://doi.org/10.1038/nature24270
@article{silver2017mastering,
author = {David Silver and Julian Schrittwieser and Karen Simonyan and Demis Hassabis and others},
title = {Mastering the game of Go without human knowledge},
journal = {Nature},
year = {2017},
doi = {10.1038/nature24270},
url = {https://doi.org/10.1038/nature24270}
}