LLaMA: Open and Efficient Foundation Language Models
Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothee Lacroix, Baptiste Roziere, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, and Guillaume Lample (Meta AI / GenAI team)
Summary
The paper presents LLaMA, a family of foundation language models ranging from 7B to 65B parameters trained exclusively on publicly available datasets. It argues that strong performance can be reached without proprietary data and at smaller parameter counts than prior models. LLaMA-13B outperforms the much larger GPT-3 175B on most benchmarks, and LLaMA-65B is competitive with the best contemporary models such as Chinchilla-70B and PaLM-540B.
Key findings
- The authors train a range of models (7B to 65B parameters) using only publicly available data and release them to the research community.
- LLaMA-13B outperforms GPT-3 (175B) on most benchmarks despite being over 10x smaller.
- LLaMA-65B is competitive with leading models like Chinchilla-70B and PaLM-540B.
Cite this paper
Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothee Lacroix, Baptiste Roziere, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, & Guillaume Lample (2023). LLaMA: Open and Efficient Foundation Language Models. arXiv preprint (arXiv:2302.13971). https://arxiv.org/abs/2302.13971
@misc{touvron2023llama,
author = {Hugo Touvron and Thibaut Lavril and Gautier Izacard and Xavier Martinet and Marie-Anne Lachaux and Timothee Lacroix and Baptiste Roziere and Naman Goyal and Eric Hambro and Faisal Azhar and Aurelien Rodriguez and Armand Joulin and Edouard Grave and Guillaume Lample},
title = {LLaMA: Open and Efficient Foundation Language Models},
howpublished = {arXiv preprint arXiv:2302.13971},
year = {2023},
url = {https://arxiv.org/abs/2302.13971}
}