Keyword

language model

2 papers tagged “language model”

AI · Advances in Neural Information Processing Systems 33 (NeurIPS 2020) · Dec 2020 · Open access

Language Models are Few-Shot Learners

Tom B. Brown, Benjamin Mann, Nick Ryder, et al.

This paper presented GPT-3, an autoregressive language model with 175 billion parameters, and studied its ability to perform tasks from natural-language descriptions and a few examples without gradient updates (in-context learning). Scaling the model dramatically improved few-shot performance across many NLP benchmarks, sometimes approaching fine-tuned systems. The authors also examined limitations, data contamination, and broader societal impacts of large language models.

gpt-3 · language model · few-shot learning · in-context learning

AI · Proceedings of NAACL-HLT 2019 · Jun 2019 · Open access

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Jacob Devlin, Ming-Wei Chang, Kenton Lee and Kristina Toutanova

BERT is a language representation model pre-trained on large unlabeled corpora using masked language modeling and next-sentence prediction, yielding deeply bidirectional contextual representations. The pre-trained model can be fine-tuned with a single additional output layer to achieve strong performance across diverse downstream tasks. It set new state-of-the-art results on eleven NLP benchmarks at the time of publication.

bert · nlp · language model · pre-training