in-context learning — papers

AI · Advances in Neural Information Processing Systems 35 (NeurIPS 2022) · Jan 2022 · Open access

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, et al.

The paper shows that prompting a large language model with a few exemplars that include intermediate reasoning steps (a 'chain of thought') substantially improves its ability to solve multi-step reasoning problems. This reasoning ability emerges only in sufficiently large models and requires no fine-tuning. Across arithmetic, commonsense, and symbolic reasoning tasks, chain-of-thought prompting produces large gains, including a new state of the art on the GSM8K math word-problem benchmark.

chain-of-thought prompting · reasoning · large language models
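
The prompting technique summarized in this entry can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the `Exemplar` class, the `build_cot_prompt` helper, the "Q:/A:" layout, and the sample word problem are all hypothetical names and data chosen for the sketch. It only assembles the prompt text; sending it to a model is left out.

```python
from dataclasses import dataclass


@dataclass
class Exemplar:
    """One worked example (hypothetical structure for this sketch)."""
    question: str
    reasoning: str  # intermediate steps written out in natural language
    answer: str


def build_cot_prompt(exemplars, question):
    """Join worked exemplars, then pose the new question with an open answer slot."""
    blocks = [
        f"Q: {ex.question}\nA: {ex.reasoning} The answer is {ex.answer}."
        for ex in exemplars
    ]
    # The final block ends at "A:" so the model continues with its own reasoning.
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)


# Illustrative exemplar invented for this sketch (not taken from the paper).
exemplars = [
    Exemplar(
        question="A shelf holds 3 boxes with 4 books each. 2 books are removed. "
                 "How many books remain?",
        reasoning="3 boxes of 4 books is 12 books. 12 - 2 = 10.",
        answer="10",
    )
]

prompt = build_cot_prompt(
    exemplars,
    "A crate holds 6 trays with 5 eggs each. 4 eggs break. How many eggs are intact?",
)
print(prompt)
```

Dropping the `reasoning` text from each exemplar would give an ordinary few-shot prompt, so the only difference between the two prompt styles in this sketch is whether the worked steps appear before the answer.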

AI · Advances in Neural Information Processing Systems 33 (NeurIPS 2020) · Dec 2020 · Open access

Language Models are Few-Shot Learners

Tom B. Brown, Benjamin Mann, Nick Ryder, et al.

The paper presents GPT-3, an autoregressive language model with 175 billion parameters, and studies its ability to perform tasks from natural-language descriptions and a few examples, without gradient updates (in-context learning). Scaling the model dramatically improves few-shot performance across many NLP benchmarks, sometimes approaching fine-tuned systems. The authors also examine limitations, data contamination, and broader societal impacts of large language models.

gpt-3 · language model · few-shot learning · in-context learning