Mastering the game of Go with deep neural networks and tree search
David Silver, Aja Huang, Chris J. Maddison, Demis Hassabis, et al. · 20 authors, Google DeepMind
Summary
This paper introduces AlphaGo, a system that combines deep convolutional neural networks with Monte Carlo tree search (MCTS). Policy networks are trained by supervised learning on human expert games and then refined by reinforcement learning from self-play; a value network is trained to predict the winner of self-play games. The policy network narrows the breadth of the search by proposing promising moves, and the value network reduces its depth by evaluating positions without playing them out to the end. AlphaGo defeated other Go programs and became the first program to beat a professional human Go player (Fan Hui) on a full-size board.
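How the two networks guide the tree search can be illustrated with a minimal sketch. It is not the authors' implementation: the names `Edge`, `select_move`, `leaf_value`, and `backup` are hypothetical, the constants `c_puct=5.0` and `mix=0.5` are illustrative defaults, and the network outputs are passed in as plain numbers instead of being computed. The sketch assumes a PUCT-style selection rule, in which each move's score is its mean value plus an exploration bonus proportional to the policy prior, and a leaf evaluation that blends a value-network estimate with a rollout outcome.

```python
import math
from dataclasses import dataclass


@dataclass
class Edge:
    """Search statistics for one move from a position (hypothetical structure)."""
    prior: float               # policy network probability for this move
    visits: int = 0            # number of simulations through this move
    total_value: float = 0.0   # sum of evaluations backed up through this move

    @property
    def q(self) -> float:
        # Mean evaluation so far; 0.0 for an unvisited move.
        return self.total_value / self.visits if self.visits else 0.0


def select_move(edges: dict[str, Edge], c_puct: float = 5.0) -> str:
    """Pick the move maximising mean value plus a prior-weighted bonus."""
    total_visits = sum(e.visits for e in edges.values())

    def score(move: str) -> float:
        e = edges[move]
        # The bonus shrinks as a move is visited more, so high-prior moves
        # are tried early (less breadth) and well-scoring moves win later.
        bonus = c_puct * e.prior * math.sqrt(total_visits) / (1 + e.visits)
        return e.q + bonus

    return max(edges, key=score)


def leaf_value(value_net_estimate: float, rollout_outcome: float,
               mix: float = 0.5) -> float:
    """Blend the value-network estimate with a rollout result (less depth)."""
    return (1 - mix) * value_net_estimate + mix * rollout_outcome


def backup(edge: Edge, value: float) -> None:
    """Record one simulation result on the edge that was traversed."""
    edge.visits += 1
    edge.total_value += value


# Usage: one simulation step from a position with two candidate moves.
edges = {
    "a": Edge(prior=0.6, visits=10, total_value=5.0),
    "b": Edge(prior=0.4),
}
chosen = select_move(edges)
backup(edges[chosen], leaf_value(value_net_estimate=0.2, rollout_outcome=1.0))
```

In this example the unvisited move is selected because its exploration bonus outweighs the other move's mean value; after the backup its statistics reflect one simulation with the blended evaluation.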
Key findings
- Combined policy and value networks with Monte Carlo tree search to evaluate and select moves in Go.
- Achieved a 99.8% win rate against other Go programs.
- First computer program to defeat a human professional Go player (5-0 against Fan Hui).
Cite this paper
David Silver, Aja Huang, Chris J. Maddison, Demis Hassabis, et al. (2016). Mastering the game of Go with deep neural networks and tree search. Nature. https://doi.org/10.1038/nature16961
@article{silver2016mastering,
author = {David Silver and Aja Huang and Chris J. Maddison and Demis Hassabis and others},
title = {Mastering the game of Go with deep neural networks and tree search},
journal = {Nature},
year = {2016},
doi = {10.1038/nature16961},
url = {https://doi.org/10.1038/nature16961}
}