13. YOLO11
Ultralytics YOLO11 is the latest generation of computer vision models, delivering excellent performance and accuracy across computer vision tasks such as object detection, instance segmentation, image classification, pose estimation, oriented object detection, and object tracking.
Architecturally, YOLO11 is a modest evolution of YOLOv8: the main changes include adding a multi-head attention mechanism and introducing depthwise separable convolutions in the classification detection head, which together yield a clear improvement in performance and accuracy over YOLOv8.
For details on YOLO11, see: https://github.com/ultralytics/ultralytics .
This chapter briefly tests the YOLO11 object detection model and then deploys it on LubanCat RK-series boards.
13.1. A Quick Test of YOLO11
13.1.1. Environment Setup
Use Anaconda for environment management and create a yolo11 environment:
conda create -n yolo11 python=3.10
conda activate yolo11
# Configure a pip mirror
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple/
# Install the release package
pip install ultralytics
# Or install from source
git clone https://github.com/ultralytics/ultralytics.git
cd ultralytics
pip install -e .
# Check the version
(yolo11) llh@llh:/xxx$ yolo version
8.3.29
13.1.2. Model Inference Test
# Run a test prediction with the yolo11n.pt model
(yolo11) llh@llh:/xxx$ yolo predict model=yolo11n.pt source='https://ultralytics.com/images/bus.jpg'
Downloading https://github.com/ultralytics/assets/releases/download/v8.3.0/yolo11n.pt to 'yolo11n.pt'...
100%|██████████████████████████████████████████████████████████████████████████| 5.35M/5.35M [00:16<00:00, 346kB/s]
Ultralytics 8.3.29 🚀 Python-3.10.15 torch-2.5.1 CUDA:0 (NVIDIA GeForce RTX 4070 Ti SUPER, 16376MiB)
YOLO11n summary (fused): 238 layers, 2,616,248 parameters, 0 gradients, 6.5 GFLOPs
Downloading https://ultralytics.com/images/bus.jpg to 'bus.jpg'...
100%|███████████████████████████████████████████████████████████████████████| 134k/134k [00:00<00:00, 1.25MB/s]
image 1/1 /xxx/bus.jpg: 640x480 4 persons, 1 bus, 54.0ms
Speed: 5.3ms preprocess, 54.0ms inference, 96.6ms postprocess per image at shape (1, 3, 640, 480)
Results saved to /xxx/runs/detect/predict
💡 Learn more at https://docs.ultralytics.com/modes/predict
VS Code: view Ultralytics VS Code Extension ⚡ at https://docs.ultralytics.com/integrations/vscode
The results are saved in the runs/detect/predict directory.
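The `640x480` input shape reported in the log above comes from ultralytics' aspect-preserving resize: bus.jpg is 1080x810 (h x w), the longer side is scaled down to the default imgsz of 640, and each side is rounded up to a multiple of the model stride (32). A minimal sketch of that shape computation (the function name is mine, not an ultralytics API):

```python
import math

def inference_shape(h, w, imgsz=640, stride=32):
    """Scale (h, w) so the longer side becomes imgsz, keep the aspect
    ratio, and round each side up to a multiple of the stride."""
    r = imgsz / max(h, w)                # downscale ratio
    nh, nw = round(h * r), round(w * r)  # resized, aspect preserved
    # pad each side up to the nearest stride multiple
    nh = math.ceil(nh / stride) * stride
    nw = math.ceil(nw / stride) * stride
    return nh, nw

# bus.jpg is 1080x810 (h x w); the log reports "640x480"
print(inference_shape(1080, 810))  # -> (640, 480)
```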
13.2. Model Conversion
13.2.1. Exporting the ONNX Model
Model export uses ultralytics_yolo11, a version of ultralytics optimized specifically for RKNN. Without affecting the output results and without requiring the model to be retrained, it makes the following changes:
The output structure is modified to remove the in-graph post-processing (the post-processing structure is unfriendly to quantization);
The DFL structure performs poorly on the NPU, so it is moved out of the model into the post-processing stage; in most cases this improves inference performance;
An extra output branch with the sum of the class confidences is added, to speed up threshold filtering during post-processing.
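The last two changes above can be sketched in plain Python. DFL predicts each box edge as a discrete distribution over (by default) 16 bins, decoded on the CPU as the expectation after a softmax; the summed-confidence output lets post-processing reject a grid cell with a single comparison instead of scanning all class scores. The function names and the bin count are assumptions for illustration:

```python
import math

REG_MAX = 16  # number of DFL bins per box edge (YOLOv8/YOLO11 default)

def dfl_decode(logits):
    """Decode one box edge: softmax over REG_MAX bin logits, then take
    the expectation (sum of bin_index * probability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return sum(i * e / s for i, e in enumerate(exps))

def keep_cell(class_scores, threshold):
    """The extra summed-confidence output means one comparison per grid
    cell instead of checking every class score individually."""
    return sum(class_scores) >= threshold

# A distribution sharply peaked at bin 4 decodes to an offset close to 4.0
peaked = [0.0] * REG_MAX
peaked[4] = 10.0
print(dfl_decode(peaked))
```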
git clone https://github.com/airockchip/ultralytics_yolo11.git
# Edit the model file path in ./ultralytics/cfg/default.yaml; it defaults to yolo11n.pt
# Export the ONNX model
export PYTHONPATH=./
python ./ultralytics/engine/exporter.py
The exported ONNX model is written to the same directory; use netron to inspect its inputs and outputs.
13.2.2. Converting to an RKNN Model
Using the rknn-toolkit2 tool and a short script, convert the ONNX model into an RKNN model:
(toolkit2.2) llh@llh:/xxx/yolo11$ python onnx2rknn.py ./yolo11n.onnx rk3588 i8
I rknn-toolkit2 version: 2.2.0
--> Config model
done
--> Loading model
I Loading : 100%|██████████████████████████████████████████████| 174/174 [00:00<00:00, 81918.16it/s]
done
--> Building model
I OpFusing 0: 100%|██████████████████████████████████████████████| 100/100 [00:00<00:00, 571.25it/s]
I OpFusing 1 : 100%|█████████████████████████████████████████████| 100/100 [00:00<00:00, 279.63it/s]
I OpFusing 0 : 100%|█████████████████████████████████████████████| 100/100 [00:00<00:00, 164.75it/s]
I OpFusing 1 : 100%|█████████████████████████████████████████████| 100/100 [00:00<00:00, 152.38it/s]
I OpFusing 2 : 100%|█████████████████████████████████████████████| 100/100 [00:00<00:00, 144.98it/s]
I OpFusing 0 : 100%|█████████████████████████████████████████████| 100/100 [00:00<00:00, 104.63it/s]
I OpFusing 1 : 100%|█████████████████████████████████████████████| 100/100 [00:00<00:00, 100.10it/s]
I OpFusing 2 : 100%|██████████████████████████████████████████████| 100/100 [00:01<00:00, 75.79it/s]
W build: found outlier value, this may affect quantization accuracy
const name abs_mean abs_std outlier value
model.23.cv3.1.1.1.conv.weight 0.55 0.60 -12.173
model.23.cv3.0.0.0.conv.weight 0.25 0.35 -15.593
I GraphPreparing : 100%|███████████████████████████████████████| 228/228 [00:00<00:00, 10247.11it/s]
I Quantizating : 100%|████████████████████████████████████████████| 228/228 [00:18<00:00, 12.04it/s]
# output omitted ....................................
I rknn buiding done.
done
--> Export rknn model
done
13.3. Deployment Test on the Board
Fetch the companion examples, then put the RKNN model obtained above into the model directory:
# Pull the tutorial's companion examples (the test examples may not be fully up to date)
git clone https://gitee.com/LubanCat/lubancat_ai_manual_code.git
# Or pull the rknn_model_zoo examples
# git clone https://github.com/airockchip/rknn_model_zoo.git
Important
If you deploy a model you trained yourself, convert it to RKNN as described above; when deploying, pay attention to the number of classes and the class names: update the coco_80_labels_list.txt file and macros such as OBJ_CLASS_NUM in the example code accordingly.
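The class count matters because this export layout ties the output tensor shapes to it: each detection scale produces 64 DFL channels (4 edges x 16 bins), one channel per class, and one summed-confidence channel. A small sketch (the helper name and defaults are mine) that, with the default 80 COCO classes, reproduces the nine output tensors printed by the demo later in this chapter:

```python
def rknn_output_shapes(num_classes=80, input_size=640,
                       strides=(8, 16, 32), reg_max=16):
    """Per detection scale: DFL box branch, per-class scores,
    and the summed-confidence branch."""
    shapes = []
    for s in strides:
        g = input_size // s  # grid size at this stride
        shapes += [(1, 4 * reg_max, g, g),   # e.g. [1, 64, 80, 80]
                   (1, num_classes, g, g),   # e.g. [1, 80, 80, 80]
                   (1, 1, g, g)]             # e.g. [1, 1, 80, 80]
    return shapes

# Default COCO model: 3 scales x 3 branches = 9 output tensors
for shp in rknn_output_shapes():
    print(shp)
```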
Switch to the examples/yolo11/cpp directory and build the examples (this tutorial tests on a LubanCat 4, so the target is set to rk3588). The yolo11_videocapture_demo example uses the system's default OpenCV; if you do not need it, comment it out in CMakeLists.txt.
# Specify the target; if the board has more than 4 GB of RAM, add -d
cat@lubancat:/xxx/examples/yolo11/cpp$ ./build-linux.sh -t rk3588
./build-linux.sh -t rk3588
===================================
TARGET_SOC=rk3588
INSTALL_DIR=/xxx/examples/yolo11/cpp/install/rk3588_linux
BUILD_DIR=/xxx/examples/yolo11/cpp/build/build_rk3588_linux
ENABLE_DMA32=OFF
DISABLE_RGA=OFF
BUILD_TYPE=Release
ENABLE_ASAN=OFF
CC=aarch64-linux-gnu-gcc
CXX=aarch64-linux-gnu-g++
===================================
-- The C compiler identification is GNU 10.2.1
-- The CXX compiler identification is GNU 10.2.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/aarch64-linux-gnu-gcc - skipped
-- Detecting C compile features
# output omitted ...............................
[ 95%] Linking CXX executable yolo11_videocapture_demo
[100%] Linking CXX executable yolo11_videocapture_demo_zero_copy
[100%] Built target yolo11_videocapture_demo
[100%] Built target yolo11_videocapture_demo_zero_copy
[ 8%] Built target fileutils
[ 16%] Built target imageutils
[ 33%] Built target yolo11_videocapture_demo_zero_copy
[ 41%] Built target imagedrawing
[ 58%] Built target yolo11_image_demo_zero_copy
[ 75%] Built target yolo11_videocapture_demo
[ 91%] Built target yolo11_image_demo
[100%] Built target audioutils
Four executables are generated in the install/rk3588_linux directory; the ones ending in zero_copy support zero-copy input.
Next, run the yolo11_image_demo example, which reads an image, runs detection, and saves the result in the current directory.
cat@lubancat:/xxx$ ./yolo11_image_demo model/yolo11n.rknn model/bus.jpg
load lable ./model/coco_80_labels_list.txt
model input num: 1, output num: 9
input tensors:
index=0, name=images, n_dims=4, dims=[1, 640, 640, 3], n_elems=1228800, size=1228800, fmt=NHWC, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003922
output tensors:
index=0, name=462, n_dims=4, dims=[1, 64, 80, 80], n_elems=409600, size=409600, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-44, scale=0.135466
index=1, name=onnx::ReduceSum_476, n_dims=4, dims=[1, 80, 80, 80], n_elems=512000, size=512000, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003027
index=2, name=480, n_dims=4, dims=[1, 1, 80, 80], n_elems=6400, size=6400, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003036
index=3, name=487, n_dims=4, dims=[1, 64, 40, 40], n_elems=102400, size=102400, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-38, scale=0.093536
index=4, name=onnx::ReduceSum_501, n_dims=4, dims=[1, 80, 40, 40], n_elems=128000, size=128000, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003476
index=5, name=505, n_dims=4, dims=[1, 1, 40, 40], n_elems=1600, size=1600, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003922
index=6, name=512, n_dims=4, dims=[1, 64, 20, 20], n_elems=25600, size=25600, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-42, scale=0.084287
index=7, name=onnx::ReduceSum_526, n_dims=4, dims=[1, 80, 20, 20], n_elems=32000, size=32000, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003903
index=8, name=530, n_dims=4, dims=[1, 1, 20, 20], n_elems=400, size=400, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003922
model is NHWC input fmt
model input height=640, width=640, channel=3
origin size=640x640 crop size=640x640
input image: 640 x 640, subsampling: 4:2:0, colorspace: YCbCr, orientation: 1
scale=1.000000 dst_box=(0 0 639 639) allow_slight_change=1 _left_offset=0 _top_offset=0 padding_w=0 padding_h=0
rga_api version 1.10.1_[0]
rknn_run
bus @ (91 136 554 440) 0.944
person @ (108 236 224 535) 0.894
person @ (211 240 284 509) 0.855
person @ (476 230 559 521) 0.816
person @ (79 358 118 516) 0.396
write_image path: out.png width=640 height=640 channel=3 data=0x55b715aee0
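All the output tensors in the log above are asymmetric INT8 (qnt_type=AFFINE), so post-processing must dequantize each raw value q back to float as scale * (q - zp) before comparing against thresholds. A minimal sketch using the zp/scale pair printed for output index 1 (the function name is mine):

```python
def dequantize(q, zp, scale):
    """Affine INT8 -> float: real_value = scale * (q - zp)."""
    return scale * (q - zp)

# Output index 1 above reports zp=-128, scale=0.003027
print(dequantize(-128, -128, 0.003027))  # quantized minimum -> 0.0
print(dequantize(127, -128, 0.003027))   # quantized maximum of the range
```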
Test the yolo11_videocapture_demo example, which uses OpenCV to read from a camera or a video file:
# This tutorial connects a USB camera to a LubanCat 4, then runs the program to detect the scene shown on another screen:
cat@lubancat:/xxx$ ./yolo11_videocapture_demo model/yolo11n.rknn 0
load lable ./model/coco_80_labels_list.txt
model input num: 1, output num: 9
input tensors:
index=0, name=images, n_dims=4, dims=[1, 640, 640, 3], n_elems=1228800, size=1228800,
fmt=NHWC, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003922
output tensors:
# output omitted ......................