Segment Anything
Alexander Kirillov, Eric Mintun, Nikhila Ravi · and 9 others (Meta AI)
Summary
This paper introduces the Segment Anything project: a promptable image segmentation task, the Segment Anything Model (SAM), and the SA-1B dataset. SAM combines an image encoder, a flexible prompt encoder (points, boxes, masks, text), and a fast mask decoder to produce valid segmentation masks from arbitrary prompts. Trained on over 1 billion masks across 11 million images, SAM shows strong zero-shot transfer to many segmentation tasks without additional training.
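The three-part layout described above (encode the image once, then encode each prompt and decode a mask cheaply) can be illustrated with a minimal sketch. Everything in it is a hypothetical stand-in rather than the authors' code or the released library: `ToyPromptableSegmenter`, `PointPrompt`, `BoxPrompt`, and the intensity-threshold "decoder" are assumptions chosen only to show the control flow, and the toy "encoder" simply copies the pixels instead of running a learned network.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union

Image = List[List[float]]  # grayscale image, row-major
Mask = List[List[bool]]


@dataclass(frozen=True)
class PointPrompt:
    x: int
    y: int


@dataclass(frozen=True)
class BoxPrompt:
    x0: int
    y0: int
    x1: int  # inclusive
    y1: int  # inclusive


Prompt = Union[PointPrompt, BoxPrompt]


class ToyPromptableSegmenter:
    """Hypothetical toy that mirrors the encoder / prompt encoder / decoder split."""

    def __init__(self, tolerance: float = 0.1) -> None:
        self.tolerance = tolerance
        self.encode_calls = 0
        self._embedding: Optional[Image] = None

    def set_image(self, image: Image) -> None:
        # "Image encoder": the expensive step, run once per image.
        # Assumption: here it is just a copy of the pixel values.
        self._embedding = [list(row) for row in image]
        self.encode_calls += 1

    def _encode_prompt(self, prompt: Prompt) -> Tuple[int, int, BoxPrompt]:
        # "Prompt encoder": map any prompt type to a seed pixel and a search window.
        assert self._embedding is not None
        height, width = len(self._embedding), len(self._embedding[0])
        if isinstance(prompt, PointPrompt):
            return prompt.x, prompt.y, BoxPrompt(0, 0, width - 1, height - 1)
        seed_x = (prompt.x0 + prompt.x1) // 2
        seed_y = (prompt.y0 + prompt.y1) // 2
        return seed_x, seed_y, prompt

    def predict(self, prompt: Prompt) -> Mask:
        # "Mask decoder": cheap per-prompt step that reuses the cached embedding.
        # Assumption: a pixel is in the mask if it lies inside the window and
        # its intensity is within `tolerance` of the seed pixel's intensity.
        if self._embedding is None:
            raise RuntimeError("call set_image() before predict()")
        seed_x, seed_y, win = self._encode_prompt(prompt)
        ref = self._embedding[seed_y][seed_x]
        return [
            [
                win.x0 <= x <= win.x1
                and win.y0 <= y <= win.y1
                and abs(value - ref) <= self.tolerance
                for x, value in enumerate(row)
            ]
            for y, row in enumerate(self._embedding)
        ]


# A 4x4 image with a bright 2x2 square in the top-left corner.
image = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]

segmenter = ToyPromptableSegmenter()
segmenter.set_image(image)  # encode once
point_mask = segmenter.predict(PointPrompt(x=0, y=0))  # prompt 1
box_mask = segmenter.predict(BoxPrompt(x0=2, y0=2, x1=3, y1=3))  # prompt 2
```

The point of the split is that `set_image` runs once while `predict` can be called repeatedly with different prompts against the same cached embedding, which is what makes interactive prompting inexpensive per query.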
Key findings
- Introduces a promptable segmentation task that lets a single model generate valid masks for diverse prompt types.
- SAM demonstrates strong zero-shot generalization, often competitive with or superior to prior fully supervised task-specific models.
- Releases SA-1B, the largest segmentation dataset to date with over 1 billion masks on 11 million images.
Cite this paper
Alexander Kirillov, Eric Mintun, Nikhila Ravi, and 9 others (Meta AI) (2023). Segment Anything. IEEE/CVF International Conference on Computer Vision (ICCV). https://doi.org/10.48550/arXiv.2304.02643
@inproceedings{kirillov2023segment,
author = {Alexander Kirillov and Eric Mintun and Nikhila Ravi and others},
title = {Segment Anything},
booktitle = {IEEE/CVF International Conference on Computer Vision (ICCV)},
year = {2023},
doi = {10.48550/arXiv.2304.02643},
url = {https://arxiv.org/abs/2304.02643}
}