LLaMA: Open and Efficient Foundation Language Models

Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothee Lacroix, Baptiste Roziere, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, Guillaume Lample (Meta AI / GenAI team)

Published 27 February 2023 · arXiv preprint (arXiv:2302.13971) · Preprint

Summary

The paper presents LLaMA, a family of foundation language models ranging from 7B to 65B parameters trained exclusively on publicly available datasets. It argues that strong performance can be reached without proprietary data and at smaller parameter counts than prior models. LLaMA-13B outperforms the much larger GPT-3 175B on most benchmarks, and LLaMA-65B is competitive with the best contemporary models such as Chinchilla-70B and PaLM-540B.

Key findings

The authors train a range of models (7B to 65B parameters) only on publicly available data and release them to the research community.
LLaMA-13B outperforms GPT-3 (175B) on most benchmarks despite being over 10x smaller.
LLaMA-65B is competitive with leading models like Chinchilla-70B and PaLM-540B.

Subjects & keywords

Artificial Intelligence · foundation models · large language models · open models · efficient training · llama

Cite this paper

APA

Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Roziere, B., Goyal, N., Hambro, E., Azhar, F., Rodriguez, A., Joulin, A., Grave, E., & Lample, G. (2023). LLaMA: Open and Efficient Foundation Language Models. arXiv preprint arXiv:2302.13971. https://arxiv.org/abs/2302.13971

BibTeX

@misc{touvron2023llama,
  author    = {Hugo Touvron and Thibaut Lavril and Gautier Izacard and Xavier Martinet and Marie-Anne Lachaux and Timothee Lacroix and Baptiste Roziere and Naman Goyal and Eric Hambro and Faisal Azhar and Aurelien Rodriguez and Armand Joulin and Edouard Grave and Guillaume Lample},
  title     = {LLaMA: Open and Efficient Foundation Language Models},
  note      = {arXiv preprint arXiv:2302.13971},
  year      = {2023},
  url       = {https://arxiv.org/abs/2302.13971}
}

Related in Artificial Intelligence

AI · 2023

Segment Anything

Alexander Kirillov, Eric Mintun, Nikhila Ravi, et al.

This paper introduces the Segment Anything project: a promptable image segmentation task, the Segment Anything Model (SAM), and the SA-1B dataset. SAM combines an image encoder, a flexible prompt encoder (points, boxes, masks, text), and a fast mask decoder to produce valid segmentation masks from arbitrary prompts. Trained on over 1 billion masks across 11 million images, SAM shows strong zero-shot transfer to many segmentation tasks without additional training.

IEEE/CVF International Conference on Computer Vision (ICCV) · Open access

AI · 2023

GPT-4 Technical Report

OpenAI

This technical report describes GPT-4, a large-scale multimodal Transformer model that accepts image and text inputs and produces text outputs. The report emphasizes that GPT-4 achieves human-level performance on a range of professional and academic benchmarks, and details infrastructure and optimization methods that allowed performance to be predicted from much smaller models. For competitive and safety reasons, the report withholds architecture, dataset, and training details.

arXiv · Open access

AI · 2022

LoRA: Low-Rank Adaptation of Large Language Models

Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, et al.

The paper introduces LoRA, a parameter-efficient fine-tuning method that keeps the pretrained model weights frozen and instead learns small trainable low-rank decomposition matrices injected into the Transformer layers. This drastically cuts the number of trainable parameters and optimizer memory needed to adapt very large models to downstream tasks. The authors show LoRA matches or exceeds full fine-tuning quality across several models including GPT-3 175B while adding no extra inference latency.

International Conference on Learning Representations (ICLR 2022) · Open access