LLaMA: Open and Efficient Foundation Language Models
The paper presents LLaMA, a family of foundation language models ranging from 7B to 65B parameters trained exclusively on publicly available datasets. It argues that strong performance can be reached without proprietary data and at smaller parameter counts than prior models. LLaMA-13B outperforms the much larger GPT-3 175B on most benchmarks, and LLaMA-65B is competitive with the best contemporary models such as Chinchilla-70B and PaLM-540B.