Human-level control through deep reinforcement learning
Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, Demis Hassabis (19 authors, Google DeepMind)
Summary
The paper introduced the Deep Q-Network (DQN), which combines Q-learning with a deep convolutional network and two stabilizing techniques: experience replay and a separate target network. Trained end-to-end from raw pixels and game scores, a single architecture and hyperparameter set learned to play 49 Atari 2600 games. It performed at a level comparable to that of a professional human games tester on more than half of the titles, demonstrating that a general agent can learn directly from high-dimensional sensory input.
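The two stabilizing techniques named in the summary can be illustrated with a minimal sketch. It is not the paper's implementation: a small lookup table stands in for the convolutional Q-network, and every name and value here (`ReplayBuffer`, `td_target`, `q_update`, `sync_target`, `GAMMA`, the learning rate) is a hypothetical choice made for illustration. It shows only how sampled past transitions and a frozen copy of the value estimates enter the Q-learning update.

```python
import random
from collections import deque

GAMMA = 0.99  # discount factor; an illustrative value, not taken from this summary


class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state, done) transitions."""

    def __init__(self, capacity):
        # A deque with maxlen evicts the oldest transition once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def add(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size, rng=random):
        # Uniform sampling decorrelates the consecutive steps of an episode.
        return rng.sample(list(self.buffer), batch_size)


def td_target(reward, next_q_values, done, gamma=GAMMA):
    """Q-learning target, computed from the target copy's action values."""
    if done:
        return reward  # no bootstrapping past a terminal state
    return reward + gamma * max(next_q_values)


def q_update(q, q_target, batch, lr=0.1, gamma=GAMMA):
    """Move online estimates toward targets built from the frozen copy.

    q and q_target map state -> list of action values (a tabular stand-in
    for the online and target networks).
    """
    for state, action, reward, next_state, done in batch:
        y = td_target(reward, q_target[next_state], done, gamma)
        q[state][action] += lr * (y - q[state][action])


def sync_target(q):
    """Return a frozen copy of the online estimates, refreshed only periodically."""
    return {state: list(values) for state, values in q.items()}
```

In a training loop built on this sketch, transitions would be added to the buffer as the agent acts, `q_update` would run on sampled batches, and `sync_target` would be called only every so many updates so that the targets change slowly.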
Key findings
- Combined deep convolutional networks with Q-learning, using experience replay and target networks for stable training
- A single agent learned 49 Atari 2600 games from raw pixels, surpassing prior algorithms on most
- Achieved performance comparable to a professional human games tester on more than half of the games tested
Cite this paper
Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, & Demis Hassabis (2015). Human-level control through deep reinforcement learning. Nature. https://doi.org/10.1038/nature14236
@article{mnih2015humanlevel,
author = {Volodymyr Mnih and Koray Kavukcuoglu and David Silver and Andrei A. Rusu and Joel Veness and Marc G. Bellemare and Alex Graves and Martin Riedmiller and Andreas K. Fidjeland and Georg Ostrovski and Stig Petersen and Charles Beattie and Amir Sadik and Ioannis Antonoglou and Helen King and Dharshan Kumaran and Daan Wierstra and Shane Legg and Demis Hassabis},
title = {Human-level control through deep reinforcement learning},
journal = {Nature},
year = {2015},
doi = {10.1038/nature14236},
url = {https://doi.org/10.1038/nature14236}
}