Mastering the game of Go without human knowledge
This paper presented AlphaGo Zero, which learned to play Go solely through self-play reinforcement learning, using no human game data and no handcrafted features beyond the rules of the game. It used a single neural network that outputs both move probabilities and a position evaluation, together with a simpler tree search that relies on that network instead of Monte Carlo rollouts. Starting from random play, it discovered established Go knowledge and novel strategies on its own. AlphaGo Zero surpassed all previous versions of AlphaGo, including the version that defeated Lee Sedol.
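
To make the "single neural network" idea concrete, here is a minimal sketch of the kind of combined training loss the paper describes: a squared error on the predicted game outcome, plus a cross-entropy between the network's move probabilities and the search-derived move probabilities, plus an L2 penalty on the weights. The function name `combined_loss` and the argument names `p`, `v`, `pi`, `z`, `weights`, and `c` are illustrative assumptions, not taken from the summary above, and the default value of `c` is a placeholder. It scores a single position in plain Python and is not the authors' implementation.

```python
import math


def combined_loss(p, v, pi, z, weights=(), c=1e-4):
    """Illustrative per-position loss: (z - v)^2 - pi . log(p) + c * ||weights||^2.

    p       -- network move probabilities (assumed > 0 wherever pi > 0)
    v       -- network value estimate for the position
    pi      -- target move probabilities produced by the tree search
    z       -- final game outcome from the current player's view (+1 or -1)
    weights -- flat iterable of network parameters (hypothetical stand-in)
    c       -- L2 regularisation strength (placeholder value)
    """
    # Value head: squared error between the predicted and actual outcome.
    value_loss = (z - v) ** 2
    # Policy head: cross-entropy against the search probabilities.
    # Moves with zero target probability contribute nothing and are skipped.
    policy_loss = -sum(t * math.log(q) for t, q in zip(pi, p) if t > 0)
    # Weight decay on the (hypothetical) parameter vector.
    l2_penalty = c * sum(w * w for w in weights)
    return value_loss + policy_loss + l2_penalty


if __name__ == "__main__":
    # Two legal moves: the search strongly prefers the first, and the game was won.
    print(combined_loss(p=[0.5, 0.5], v=0.0, pi=[1.0, 0.0], z=1.0))
```

Because both terms share one network, each self-play position improves move selection and position evaluation at the same time, which is the practical meaning of using a single network instead of separate policy and value networks.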