Mastering the game of Go with deep neural networks and tree search

David Silver, Aja Huang, Chris J. Maddison, ..., Demis Hassabis · 20 authors in total, Google DeepMind

Published 27 January 2016 · Nature · Journal article


Summary

This paper introduced AlphaGo, a system that combines deep convolutional neural networks (a policy network and a value network) with Monte Carlo tree search. The networks are trained by supervised learning from human expert games and by reinforcement learning from self-play. The policy network narrows the breadth of the search by concentrating on promising moves, and the value network reduces its depth by evaluating positions directly instead of searching every line to the end of the game. AlphaGo defeated other Go programs and became the first program to beat a professional human Go player (Fan Hui) on a full-size board.

Key findings

Combined policy and value networks with Monte Carlo tree search to evaluate and select moves in Go.
Achieved a 99.8% win rate against other Go programs.
First computer program to defeat a human professional Go player (5-0 against Fan Hui).
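
To make the combination of networks and tree search concrete, the following is a minimal Python sketch of the move-selection step inside the search. It is an illustration under stated assumptions, not AlphaGo's implementation: the `Edge` class, the function names, and the default values of `c_puct` and `mixing` are hypothetical, and the policy and value networks are replaced by plain numbers. Each candidate move is scored by its average value so far plus an exploration bonus that is proportional to the policy network's prior and shrinks as the move is visited more often. A leaf position is scored by mixing a value-network estimate with a rollout outcome.

```python
import math
from dataclasses import dataclass


@dataclass
class Edge:
    """Search statistics for one candidate move (hypothetical structure)."""
    prior: float               # P(s, a): move probability from the policy network
    visits: int = 0            # N(s, a): how often the search has tried this move
    total_value: float = 0.0   # W(s, a): sum of backed-up leaf evaluations

    @property
    def q(self) -> float:
        """Mean backed-up value Q(s, a); 0.0 for an unvisited move."""
        return self.total_value / self.visits if self.visits else 0.0


def select_move(edges: dict, c_puct: float = 5.0):
    """Return the move maximising Q(s, a) + u(s, a).

    u(s, a) = c_puct * P(s, a) * sqrt(sum_b N(s, b)) / (1 + N(s, a)).
    c_puct is an exploration constant; the default here is illustrative.
    """
    sqrt_total = math.sqrt(sum(edge.visits for edge in edges.values()))

    def score(item):
        _, edge = item
        bonus = c_puct * edge.prior * sqrt_total / (1 + edge.visits)
        return edge.q + bonus

    return max(edges.items(), key=score)[0]


def leaf_value(value_estimate: float, rollout_outcome: float, mixing: float = 0.5) -> float:
    """Blend a value-network estimate with a rollout result (mixing is illustrative)."""
    return (1 - mixing) * value_estimate + mixing * rollout_outcome


def backup(edge: Edge, value: float) -> None:
    """Record one more simulation through this edge."""
    edge.visits += 1
    edge.total_value += value


# Usage: the rarely visited move "b" wins on its exploration bonus.
edges = {
    "a": Edge(prior=0.6, visits=10, total_value=2.0),
    "b": Edge(prior=0.4, visits=1, total_value=0.5),
}
chosen = select_move(edges)
backup(edges[chosen], leaf_value(value_estimate=0.2, rollout_outcome=1.0))
```

In this sketch a move with a high prior is explored early, but its bonus decays with each visit, so the search gradually shifts toward the moves whose backed-up values are actually highest.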

Subjects & keywords

Artificial Intelligence · deep learning · reinforcement learning · Monte Carlo tree search · game of Go · AlphaGo

Cite this paper

APA

Silver, D., Huang, A., Maddison, C. J., ... Hassabis, D. (2016). Mastering the game of Go with deep neural networks and tree search. Nature. https://doi.org/10.1038/nature16961

BibTeX

@article{silver2016mastering,
  author    = {David Silver and Aja Huang and Chris J. Maddison and others},
  title     = {Mastering the game of Go with deep neural networks and tree search},
  journal   = {Nature},
  year      = {2016},
  doi       = {10.1038/nature16961},
  url       = {https://doi.org/10.1038/nature16961}
}

Related in Artificial Intelligence

AI · 2023

Segment Anything

Alexander Kirillov, Eric Mintun and Nikhila Ravi

This paper introduces the Segment Anything project: a promptable image segmentation task, the Segment Anything Model (SAM), and the SA-1B dataset. SAM combines an image encoder, a flexible prompt encoder (points, boxes, masks, text), and a fast mask decoder to produce valid segmentation masks from arbitrary prompts. Trained on over 1 billion masks across 11 million images, SAM shows strong zero-shot transfer to many segmentation tasks without additional training.

IEEE/CVF International Conference on Computer Vision (ICCV) · Open access

AI · 2023

GPT-4 Technical Report

OpenAI

This technical report describes GPT-4, a large-scale multimodal Transformer model that accepts image and text inputs and produces text outputs. The report emphasizes that GPT-4 achieves human-level performance on a range of professional and academic benchmarks, and details infrastructure and optimization methods that allowed performance to be predicted from much smaller models. For competitive and safety reasons, the report withholds architecture, dataset, and training details.

arXiv · Open access

AI · 2023

LLaMA: Open and Efficient Foundation Language Models

Hugo Touvron, Thibaut Lavril and Gautier Izacard

The paper presents LLaMA, a family of foundation language models ranging from 7B to 65B parameters trained exclusively on publicly available datasets. It argues that strong performance can be reached without proprietary data and at smaller parameter counts than prior models. LLaMA-13B outperforms the much larger GPT-3 175B on most benchmarks, and LLaMA-65B is competitive with the best contemporary models such as Chinchilla-70B and PaLM-540B.

arXiv preprint (arXiv:2302.13971) · Open access