YOLO Vision Detection Applications: Annotation (Part 2)

This post takes a short detour into SAM2, although the goal is still annotation.

SAM2

Segment Anything Model 2
SAM2 is Meta's second-generation successor to SAM. The main addition is support for video.
SAM2 official site

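Why use SAM2

SAM2 is mainly used for image segmentation.
In other words, it can pick out the objects in an image.

Using it to annotate images for YOLO therefore saves a great deal of effort.
This post gives a brief introduction to SAM2 as groundwork for semi-automatic YOLO image annotation with the SAM2 model.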

Installation

Following the official instructions, download and install it as follows:


git clone https://github.com/facebookresearch/sam2.git && cd sam2

pip install -e .
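
A quick way to confirm that the editable install worked is to check whether Python can locate the package. The sketch below assumes the package is importable under the name `sam2`, which is what the imports in the official sample suggest; `is_installed` is a hypothetical helper name, not part of SAM2.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if the import system can locate `package`."""
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    # "sam2" is assumed to be the package name, based on the official sample imports.
    print("sam2 installed:", is_installed("sam2"))
```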

Models

SAM2 provides models in four sizes:

Model	Size (M)	Speed (FPS)	SA-V test (J&F)	MOSE val (J&F)	LVOS v2 (J&F)
sam2.1_hiera_tiny	38.9	91.2	76.5	71.8	77.3
sam2.1_hiera_small	46	84.8	76.6	73.5	78.3
sam2.1_hiera_base_plus	80.8	64.1	78.2	73.7	78.2
sam2.1_hiera_large	224.4	39.5	79.5	74.6	80.6

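Download page
Download both the config.yaml file and the checkpoint.pt file. The two must match each other. At the time of writing, the latest models were released on September 29, 2024.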
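
Since a mismatched config and checkpoint is an easy mistake to make, the pairing can be kept in one place. This is a minimal sketch: only the large pairing appears in this post; the other three follow the naming in the official README as far as I can tell and should be double-checked on the download page. `config_for` is a hypothetical helper name.

```python
from pathlib import Path

# Checkpoint file name -> config path.
# Assumption: names follow the SAM 2.1 README; verify before relying on them.
CONFIG_FOR_CHECKPOINT = {
    "sam2.1_hiera_tiny.pt": "configs/sam2.1/sam2.1_hiera_t.yaml",
    "sam2.1_hiera_small.pt": "configs/sam2.1/sam2.1_hiera_s.yaml",
    "sam2.1_hiera_base_plus.pt": "configs/sam2.1/sam2.1_hiera_b+.yaml",
    "sam2.1_hiera_large.pt": "configs/sam2.1/sam2.1_hiera_l.yaml",
}


def config_for(checkpoint_path: str) -> str:
    """Return the config path that belongs to a checkpoint file."""
    name = Path(checkpoint_path).name
    if name not in CONFIG_FOR_CHECKPOINT:
        raise ValueError(f"unknown checkpoint: {name}")
    return CONFIG_FOR_CHECKPOINT[name]
```

For example, `config_for("./checkpoints/sam2.1_hiera_large.pt")` gives the large config path, so the two values never have to be typed separately.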

Usage

The official sample program is used below to give a brief tour of SAM2.

Image

Official sample program:

# Import the required libraries
import torch
from sam2.build_sam import build_sam2
from sam2.sam2_image_predictor import SAM2ImagePredictor

# Path to the model checkpoint
checkpoint = "./checkpoints/sam2.1_hiera_large.pt"
# Path to the matching model config file
model_cfg = "configs/sam2.1/sam2.1_hiera_l.yaml"
# Build the predictor
predictor = SAM2ImagePredictor(build_sam2(model_cfg, checkpoint))

# Run the prediction; <your_image> and <input_prompts> are placeholders to fill in
with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16):
    predictor.set_image(<your_image>)
    masks, _, _ = predictor.predict(<input_prompts>)
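
The sample discards the second and third return values of `predict`. In the official notebook the second value holds one quality score per candidate mask, which is handy for keeping only the best mask. The sketch below assumes `masks` has shape `(N, H, W)` and `scores` has shape `(N,)`; `best_mask` is a hypothetical helper, and it works on plain NumPy arrays so it can be tried without a model.

```python
import numpy as np


def best_mask(masks, scores):
    """Return (mask, score) for the candidate with the highest score.

    Assumes masks is (N, H, W) and scores is (N,).
    """
    masks = np.asarray(masks)
    scores = np.asarray(scores)
    if masks.shape[0] != scores.shape[0]:
        raise ValueError("masks and scores must have the same length")
    idx = int(np.argmax(scores))
    # Convert to a boolean mask so it can be used directly for indexing.
    return masks[idx].astype(bool), float(scores[idx])
```

With a real predictor this would be called as `mask, score = best_mask(masks, scores)` after `masks, scores, _ = predictor.predict(...)`.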

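A better way to get a feel for SAM2 is to work through the official example notebook:
image_predictor_example.ipynb
It can also be found in the notebooks folder of the cloned repository.
In that notebook, the object to segment is selected with a prompt point, where label 1 means "include":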

input_point = np.array([[380, 400]])
input_label = np.array([1])

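As the figure shows, the coordinates in the program are the selected point.
Figure: coordinates of the selected point
If two points are selected, the code becomes: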

input_point = np.array([[380, 400], [1200, 600]])
input_label = np.array([1, 1])

Figure: two selected points
If the second point marks an area to exclude, set its label to 0:

input_point = np.array([[380, 400], [1200, 600]])
input_label = np.array([1, 0])

Figure: an excluded point
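
The point prompts above follow one pattern: a list of coordinates plus a parallel list of labels, 1 for include and 0 for exclude. A small helper can build both arrays from separate include and exclude lists. This is a sketch, not part of SAM2: `point_prompt` is a hypothetical name, and the keys `point_coords` and `point_labels` are the keyword names the official notebook passes to `predict`.

```python
import numpy as np


def point_prompt(include, exclude=()):
    """Build point prompt arrays from include and exclude (x, y) points."""
    include = list(include)
    exclude = list(exclude)
    # Coordinates are stacked with the include points first.
    coords = np.array(include + exclude, dtype=np.float32).reshape(-1, 2)
    # Label 1 marks a point to include, label 0 a point to exclude.
    labels = np.array([1] * len(include) + [0] * len(exclude), dtype=np.int32)
    return {"point_coords": coords, "point_labels": labels}
```

With a predictor already set up, the result could be passed as `predictor.predict(**point_prompt([(380, 400)], [(1200, 600)]))`.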
If a rectangular box is needed, the two points are its opposite corners.
The code is:

input_box = np.array([1400, 550, 1600, 800])
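
When the two corners come from mouse clicks, they may arrive in any order. The sketch below normalizes them, assuming the box is expected as `[x_min, y_min, x_max, y_max]`, which matches the example values above; `box_from_corners` is a hypothetical helper name.

```python
import numpy as np


def box_from_corners(p1, p2):
    """Return [x_min, y_min, x_max, y_max] from two opposite corners."""
    (x1, y1), (x2, y2) = p1, p2
    # Sort each axis so the click order does not matter.
    return np.array([min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2)])
```

For example, clicking the top-right corner first, `box_from_corners((1600, 550), (1400, 800))`, still yields the box used above.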

To exclude a region inside the box, add a point with label 0:

input_point = np.array([[1525, 675]])
input_label = np.array([0])

Figure: exclusion inside the box
That is a brief explanation of image segmentation with SAM2.
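
Because the point of all this is semi-automatic YOLO annotation, it helps to see how a segmentation mask could become a YOLO label. The sketch below takes a single 2D mask, finds its bounding box, and formats a detection label line as `class x_center y_center width height`, all normalized to the image size. It is an illustration under assumptions, not the tool used in this series: `mask_to_yolo_line` is a hypothetical name and the class id is supplied by the caller.

```python
import numpy as np


def mask_to_yolo_line(mask, class_id=0):
    """Convert one 2D binary mask into a YOLO detection label line."""
    mask = np.asarray(mask).astype(bool)
    if mask.ndim != 2:
        raise ValueError("expected a 2D mask of shape (H, W)")
    if not mask.any():
        raise ValueError("mask is empty")
    h, w = mask.shape
    ys, xs = np.nonzero(mask)
    # Pixel-edge bounding box: max is exclusive, hence the +1.
    x_min, x_max = xs.min(), xs.max() + 1
    y_min, y_max = ys.min(), ys.max() + 1
    # Normalize the center and size by the image width and height.
    xc = (x_min + x_max) / 2 / w
    yc = (y_min + y_max) / 2 / h
    bw = (x_max - x_min) / w
    bh = (y_max - y_min) / h
    return f"{class_id} {xc:.6f} {yc:.6f} {bw:.6f} {bh:.6f}"
```

Each line produced this way would go into the `.txt` label file that shares its name with the image.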

INFO
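I wrote a GUI program for picking coordinates inside an image and will share it later.
For me, SAM2 is only a means to semi-automatic YOLO annotation, so how the coordinates are obtained is not the important part.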

Video

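Video segmentation works much like image segmentation. I will skip the official sample program here and point to the example notebook instead:
video_predictor_example.ipynb
It can likewise be found in the notebooks folder.
In practice, the video is sampled into images, and the result comes from segmenting and learning on those images, so I will not go into more detail.
What is worth covering is how to sample a video into images.
I will devote a separate post to that. (The SAM2 video example notebook does in fact explain it.)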
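
Until that post is written, here is a rough sketch of the sampling step. It mirrors the ffmpeg command that, as far as I recall, the video example notebook suggests: JPEG frames named by a zero-padded index starting at 0. It assumes `ffmpeg` is installed and on the PATH; `ffmpeg_frame_command` and `extract_frames` are hypothetical helper names.

```python
import subprocess
from pathlib import Path


def ffmpeg_frame_command(video, out_dir):
    """Build the ffmpeg command that writes frames as 00000.jpg, 00001.jpg, ..."""
    return [
        "ffmpeg",
        "-i", str(video),
        "-q:v", "2",            # high JPEG quality
        "-start_number", "0",   # first frame is numbered 0
        str(Path(out_dir) / "%05d.jpg"),
    ]


def extract_frames(video, out_dir):
    """Create out_dir if needed and run ffmpeg; raises if ffmpeg fails."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(ffmpeg_frame_command(video, out_dir), check=True)
```

Calling `extract_frames("clip.mp4", "frames")` would fill the `frames` folder with one JPEG per video frame.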

Once both examples run end to end, you should have a fairly intuitive feel for SAM2.


http://kevin.zone.id/2025/04/18/sam2/

Author: Kevin

Published on: April 18, 2025
