7. YOLOv5 (Object Detection)

YOLOv5 is a single-stage object detection algorithm. It is a family of detection architectures and models pretrained on the COCO dataset, representing Ultralytics' open-source research into future vision AI methods and incorporating lessons learned and best practices from thousands of hours of research and development. The latest YOLOv5 v7.0 release provides YOLOv5n, YOLOv5s, YOLOv5m, YOLOv5l, YOLOv5x and other variants, and supports segmentation and classification in addition to object detection.

The basic principle of YOLOv5, briefly, is: the whole image is divided into a grid of cells, and each cell predicts the class and location of objects whose centers fall inside it; the overlapping candidate boxes are then filtered by their intersection-over-union (IoU) values (non-maximum suppression), and the surviving boxes are output as the final predictions.
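
To make the box-filtering step concrete, below is a minimal NumPy sketch of IoU computation and greedy non-maximum suppression. It is only an illustration of the idea; YOLOv5 itself performs this step in its non_max_suppression utility.

import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, all in [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter + 1e-9)

def nms(boxes, scores, iou_thres=0.45):
    """Keep the highest-scoring box, drop boxes overlapping it too much, repeat."""
    order = scores.argsort()[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        order = rest[iou(boxes[i], boxes[rest]) <= iou_thres]
    return keep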

This chapter runs a quick test of YOLOv5 and then deploys it to a LubanCat board for testing.

Tip

Test environment: LubanCat RK board running Debian 10/11; PC running WSL2 (Ubuntu 20.04); PyTorch 2.1.2; YOLOv5 v7.0; airockchip/yolov5 v6.2.

7.1. Setting Up the YOLOv5 Environment

Install Python and the related dependency libraries, then clone the YOLOv5 repository.

# Install Anaconda as described in the earlier environment setup tutorial, then create an environment with conda
conda create -n yolov5 python=3.9
conda activate yolov5

# Clone the latest yolov5 (v7.0 at the time of writing); a specific version branch can also be checked out
# git clone https://github.com/ultralytics/yolov5.git -b v7.0
git clone https://github.com/ultralytics/yolov5
cd yolov5

# Install the dependencies
pip3 install -r requirements.txt

# Enter the Python shell and check the installed environment
(yolov5) llh@anhao:~/yolov5$ python
Python 3.9.18 (main, Sep 11 2023, 13:41:44)
[GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> import utils
>>> display=utils.notebook_init()
Checking setup...
YOLOv5 🚀 v7.0-253-g63555c8 Python-3.9.18 torch-2.1.2+cu121 CUDA:0 (NVIDIA GeForce RTX 3060 Laptop GPU, 6144MiB)
Setup complete (12 CPUs, 15.6 GB RAM, 142.6/1006.9 GB disk)

7.2. A Quick Test of YOLOv5

Next we run a quick test of Ultralytics/YOLOv5 and then deploy it to a LubanCat board.

7.2.1. Getting the Pretrained Weights

Download the yolov5s.pt, yolov5m.pt, yolov5l.pt or yolov5x.pt weight files, which can be obtained directly from here. The suffixes n, s, m, l and x indicate the network width and depth: n is the smallest, the fastest, and the least accurate.
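
If a browser download is inconvenient, the weights can also be fetched with a short Python snippet; the release URL below assumes the v7.0 assets on GitHub and may need adjusting for other versions or file names.

import urllib.request

# Download the pretrained yolov5s weights from the GitHub release assets (example URL)
url = "https://github.com/ultralytics/yolov5/releases/download/v7.0/yolov5s.pt"
urllib.request.urlretrieve(url, "yolov5s.pt")
print("saved yolov5s.pt")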

7.2.2. Running a Quick Test

Enter the yolov5 source directory and place the weight file downloaded above in the current directory; the two test images are located in ./data/images/.

# Quick test
# --source specifies the test data, which can be images, video, etc.
# --weights specifies the weight file path; it can be your own trained model, or a named model that will be downloaded automatically
(yolov5) llh@anhao:~/yolov5$ python3 detect.py --source ./data/images/ --weights yolov5s.pt
detect: weights=['yolov5s.pt'], source=./data/images/, data=data/coco128.yaml, imgsz=[640, 640], conf_thres=0.25,
iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_csv=False, save_conf=False, save_crop=False,
nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs/detect,
name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False, vid_stride=1
YOLOv5 🚀 v7.0-253-g63555c8 Python-3.9.18 torch-2.1.2+cu121 CUDA:0 (NVIDIA GeForce RTX 3060 Laptop GPU, 6144MiB)
Fusing layers...
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients, 16.4 GFLOPs
image 1/2 /home/llh/yolov5/data/images/bus.jpg: 640x480 4 persons, 1 bus, 73.5ms
image 2/2 /home/llh/yolov5/data/images/zidane.jpg: 384x640 2 persons, 2 ties, 71.4ms
Speed: 1.0ms pre-process, 72.4ms inference, 159.0ms NMS per image at shape (1, 3, 640, 640)
Results saved to runs/detect/exp

The output above shows, in order, the basic configuration, the network parameters, the detection results, the processing speed, and finally the save location. Switch to the runs/detect/exp directory to view the result images (one of them is shown below).

(image: detection result)
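
Besides detect.py, the same detection can also be run through the PyTorch Hub interface (mentioned in the export log later); a minimal sketch, which downloads yolov5s automatically on first use:

import torch

# Load the pretrained yolov5s model from the ultralytics/yolov5 hub entry
model = torch.hub.load('ultralytics/yolov5', 'yolov5s')
results = model('data/images/bus.jpg')   # run inference on one of the test images
results.print()                          # print detected classes, counts and speed
results.save()                           # save the rendered image under runs/detect/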

7.2.3. Exporting the RKNN Model and a Quick Test

Next we export an RKNN model based on yolov5s.pt:

1. Convert to yolov5s.onnx (a torchscript or other format would also work); this requires installing the onnx environment.

# Install the onnx environment
pip3 install -r requirements.txt onnx onnx-simplifier

# Specify the weight file yolov5s.pt, or your own trained best.pt, and export the onnx model with the following command
python3 export.py --weights yolov5s.pt --include  onnx

# Or export torchscript with the following command
python3 export.py --weights yolov5s.pt --include  torchscript

Export the onnx model with export.py:

(yolov5) llh@anhao:~/yolov5$ python export.py --weights yolov5s.pt --include  onnx
export: data=data/coco128.yaml, weights=['yolov5s.pt'], imgsz=[640, 640], batch_size=1, device=cpu, half=False,
inplace=False, keras=False, optimize=False, int8=False, per_tensor=False, dynamic=False, simplify=False, opset=17,
verbose=False, workspace=4, nms=False, agnostic_nms=False, topk_per_class=100, topk_all=100,
iou_thres=0.45, conf_thres=0.25, include=['onnx']
YOLOv5 🚀 v7.0-253-g63555c8 Python-3.9.18 torch-2.1.2+cu121 CPU

Fusing layers...
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients, 16.4 GFLOPs

PyTorch: starting from yolov5s.pt with output shape (1, 255, 80, 80) (14.1 MB)

ONNX: starting export with onnx 1.12.0...
ONNX: export success ✅ 0.8s, saved as yolov5s.onnx (27.6 MB)

Export complete (2.4s)
Results saved to /home/llh/yolov5
Detect:          python detect.py --weights yolov5s.onnx
Validate:        python val.py --weights yolov5s.onnx
PyTorch Hub:     model = torch.hub.load('ultralytics/yolov5', 'custom', 'yolov5s.onnx')
Visualize:       https://netron.app

A yolov5s.onnx file is generated in the current directory. We can visualize the model with Netron:

(image: yolov5s.onnx visualized in Netron)
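
Besides the web app, Netron can also be launched locally from Python; a small sketch, assuming the netron package has been installed with pip install netron:

import netron

# Start a local Netron server and open the exported model in the browser
netron.start('yolov5s.onnx')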

2. Re-export

As the figure above shows, the exported onnx model ends with the Detect layer. To adapt the model to RKNN we remove this structure (the sigmoid at the tail of the model is kept) so that the three feature maps are output directly. See here for details.

This requires modifying YOLOv5's detection head and the model export code. Note that the changes below are only a quick modification for testing and are used only for model export.

models/yolo.py
# In models/yolo.py, modify the forward function of the Detect class

# def forward(self, x):
#     z = []  # inference output
#     for i in range(self.nl):
#         x[i] = self.m[i](x[i])  # conv
#         bs, _, ny, nx = x[i].shape  # x(bs,255,20,20) to x(bs,3,20,20,85)
#         x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()

#         if not self.training:  # inference
#             if self.dynamic or self.grid[i].shape[2:4] != x[i].shape[2:4]:
#                 self.grid[i], self.anchor_grid[i] = self._make_grid(nx, ny, i)

#             if isinstance(self, Segment):  # (boxes + masks)
#                 xy, wh, conf, mask = x[i].split((2, 2, self.nc + 1, self.no - self.nc - 5), 4)
#                 xy = (xy.sigmoid() * 2 + self.grid[i]) * self.stride[i]  # xy
#                 wh = (wh.sigmoid() * 2) ** 2 * self.anchor_grid[i]  # wh
#                 y = torch.cat((xy, wh, conf.sigmoid(), mask), 4)
#             else:  # Detect (boxes only)
#                 xy, wh, conf = x[i].sigmoid().split((2, 2, self.nc + 1), 4)
#                 xy = (xy * 2 + self.grid[i]) * self.stride[i]  # xy
#                 wh = (wh * 2) ** 2 * self.anchor_grid[i]  # wh
#                 y = torch.cat((xy, wh, conf), 4)
#             z.append(y.view(bs, self.na * nx * ny, self.no))

#     return x if self.training else (torch.cat(z, 1), ) if self.export else (torch.cat(z, 1), x)

def forward(self, x):
    z = []  # inference output
    for i in range(self.nl):
        # conv head + sigmoid only; box decoding is left to the deployment-side post-processing
        z.append(torch.sigmoid(self.m[i](x[i])))

    return z

export.py
# In the run() function of export.py, change:

    if half and not coreml:
        im, model = im.half(), model.half()  # to FP16
-   shape = tuple((y[0] if isinstance(y, tuple) else y).shape)  # model output shape
+   shape = tuple((y[0] if (isinstance(y, tuple) or (isinstance(y, list))) else y).shape)  # model output shape
    metadata = {'stride': int(max(model.stride)), 'names': model.names}  # model metadata
    LOGGER.info(f"\n{colorstr('PyTorch:')} starting from {file} with output shape {shape} ({file_size(file):.1f} MB)")

After the modifications, run the export command again: python export.py --weights yolov5s.pt --include onnx. Then visualize the model with Netron:

(image: re-exported yolov5s.onnx with three output feature maps, shown in Netron)

3. Convert to an RKNN model

Here the onnx model from the previous step is converted to an RKNN model. This requires installing RKNN-Toolkit2 and related tools; refer to the earlier development environment chapter.

onnx2rknn.py (see the companion examples)
# The import and constants below are not shown in the companion script and are filled
# in here as examples; adjust the paths and target platform to your own setup.
from rknn.api import RKNN

MODEL_PATH = './yolov5s.onnx'          # onnx model exported above (example path)
RKNN_MODEL = './model/yolov5s.rknn'    # where the rknn model will be saved
DATASET_PATH = './dataset.txt'         # text file listing quantization calibration images (example)
DEFAULT_QUANT = True                   # enable int8 quantization
platform = 'rk3588'                    # tutorial tested on lubancat-4

if __name__ == '__main__':

    # Create RKNN object
    rknn = RKNN(verbose=False)

    # Pre-process config
    print('--> Config model')
    rknn.config(mean_values=[[0, 0, 0]], std_values=[[255, 255, 255]],
                target_platform=platform)
    print('done')

    # Load model
    print('--> Loading model')
    ret = rknn.load_onnx(model=MODEL_PATH)
    #ret = rknn.load_pytorch(model=MODEL_PATH, input_size_list=[[1, 3, 640, 640]])
    if ret != 0:
        print('Load model failed!')
        exit(ret)
    print('done')

    # Build model
    print('--> Building model')
    ret = rknn.build(do_quantization=DEFAULT_QUANT, dataset=DATASET_PATH)
    if ret != 0:
        print('Build model failed!')
        exit(ret)
    print('done')

    # Export rknn model
    print('--> Export rknn model')
    ret = rknn.export_rknn(RKNN_MODEL)
    if ret != 0:
        print('Export rknn model failed!')
        exit(ret)
    print('done')

    # Accuracy analysis; results are written to the ./snapshot directory
    #print('--> Accuracy analysis')
    #ret = rknn.accuracy_analysis(inputs=['./subset/000000052891.jpg'])
    #if ret != 0:
    #    print('Accuracy analysis failed!')
    #    exit(ret)
    #print('done')

    # Release
    rknn.release()

Run the model conversion script to export the RKNN model:

# The tutorial is tested on lubancat-4 (target platform set to rk3588)
(toolkit2_1.6) llh@YH-LONG:~/xxx/yolov5$ python3 test.py
W __init__: rknn-toolkit2 version: 1.6.0+81f21f4d
--> Config model
done
--> Loading model
W load_onnx: It is recommended onnx opset 19, but your onnx model opset is 17!
W load_onnx: Model converted from pytorch, 'opset_version' should be set 19 in torch.onnx.export for successful convert!
Loading : 100%|████████████████████████████████████████████████| 134/134 [00:00<00:00, 18444.97it/s]
done
--> Building model
W build: found outlier value, this may affect quantization accuracy
const name               abs_mean    abs_std     outlier value
model.0.conv.weight      0.83        1.39        14.282
GraphPreparing : 100%|██████████████████████████████████████████| 145/145 [00:00<00:00, 5296.85it/s]
Quantizating : 100%|█████████████████████████████████████████████| 145/145 [00:00<00:00, 265.99it/s]
# ... output omitted ...
done
--> Export rknn model
done

# The exported rknn model is saved to ./model/yolov5s.rknn

4. Testing the model on the connected board

We can also use RKNN-Toolkit2 for connected-board testing, performance and memory evaluation, and so on. The board must be connected to the PC via USB or the network, the adb connection must be working, and rknn_server must be running on the board.
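
The test script itself comes from the companion examples and is not listed here; the sketch below only outlines the rough flow of such a connected-board test with RKNN-Toolkit2. The paths are examples, and the box decoding/NMS step is left to the companion post-processing code.

import cv2
from rknn.api import RKNN

rknn = RKNN()
rknn.load_rknn('./model/yolov5s.rknn')      # the rknn model exported above
rknn.init_runtime(target='rk3588')          # run on the connected board via adb + rknn_server

# Prepare one input image (RGB, 640x640, matching the model input)
img = cv2.imread('./data/images/bus.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img = cv2.resize(img, (640, 640))

outputs = rknn.inference(inputs=[img])      # three feature maps: 80x80, 40x40, 20x20
# ... decode the feature maps, apply NMS and draw boxes (see the companion example)
rknn.release()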

# Test the model on the connected board; get the script from the companion examples
(toolkit2_1.6) llh@YH-LONG:~/xxx/yolov5$ python test.py
W __init__: rknn-toolkit2 version: 1.6.0+81f21f4d
*************************
all device(s) with adb mode:
192.168.103.152:5555
*************************
I NPUTransfer: Starting NPU Transfer Client, Transfer version 2.1.0 (b5861e7@2020-11-23T11:50:36)
D RKNNAPI: ==============================================
D RKNNAPI: RKNN VERSION:
D RKNNAPI:   API: 1.6.0 (535b468 build@2023-12-11T09:05:46)
D RKNNAPI:   DRV: rknn_server: 1.5.0 (17e11b1 build: 2023-05-18 21:43:39)
D RKNNAPI:   DRV: rknnrt: 1.6.0 (9a7b5d24c@2023-12-13T17:31:11)
D RKNNAPI: ==============================================
D RKNNAPI: Input tensors:
D RKNNAPI:   index=0, name=images, n_dims=4, dims=[1, 640, 640, 3], n_elems=1228800, size=1228800,
 w_stride = 0, size_with_stride = 0, fmt=NHWC, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003922
D RKNNAPI: Output tensors:
D RKNNAPI:   index=0, name=output0, n_dims=4, dims=[1, 255, 80, 80], n_elems=1632000, size=1632000,
 w_stride = 0, size_with_stride = 0, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003905
D RKNNAPI:   index=1, name=360, n_dims=4, dims=[1, 255, 40, 40], n_elems=408000, size=408000,
w_stride = 0, size_with_stride = 0, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003917
D RKNNAPI:   index=2, name=362, n_dims=4, dims=[1, 255, 20, 20], n_elems=102000, size=102000,
w_stride = 0, size_with_stride = 0, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003918
--> Running model
done
class        score      xmin, ymin, xmax, ymax
--------------------------------------------------
person       0.816     [ 211,  241,  285,  518]
person       0.812     [ 472,  230,  561,  522]
person       0.790     [ 115,  233,  207,  546]
person       0.455     [  79,  334,  122,  517]
 bus         0.770     [  88,  131,  557,  464]
Save results to result.jpg!

The test results are saved to result.jpg.

7.2.4. Deploying to the LubanCat Board

After exporting the RKNN model, it can be deployed on the board with RKNN Toolkit Lite2, applying post-processing such as drawing boxes to the results; it can also be deployed with the rknpu2 C/C++ API. For example code, see the rknn_model_zoo repository.
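
For the Python route, the sketch below outlines on-board inference with RKNN Toolkit Lite2; the paths are examples, and the box decoding/NMS again follows the companion post-processing code.

import cv2
from rknnlite.api import RKNNLite

rknn_lite = RKNNLite()
rknn_lite.load_rknn('./yolov5s.rknn')        # the exported rknn model copied to the board
rknn_lite.init_runtime()                     # run on the board's NPU

img = cv2.imread('./model/bus.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img = cv2.resize(img, (640, 640))

outputs = rknn_lite.inference(inputs=[img])  # three feature maps to decode + NMS
rknn_lite.release()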

For the C/C++ route, pull the example code to the board and build it; note that this example currently only runs on Linux.

# Pull the companion example code, or the rknn_model_zoo repository, to the board
cat@lubancat:~/$ git clone https://gitee.com/LubanCat/lubancat_ai_manual_code.git
cat@lubancat:~/$ cd lubancat_ai_manual_code/example/yolov5/cpp

# Build the example; the tutorial is tested on lubancat-4 (pass the rk3588 parameter)
cat@lubancat:~/lubancat_ai_manual_code/example/yolov5/cpp$ ./build-linux.sh -t rk3588
./build-linux.sh -t rk3588
===================================
TARGET_SOC=rk3588
INSTALL_DIR=/home/cat/lubancat_ai_manual_code/example/yolov5/cpp/install/rk3588_linux
BUILD_DIR=/home/cat/lubancat_ai_manual_code/example/yolov5/cpp/build/build_rk3588_linux
CC=aarch64-linux-gnu-gcc
CXX=aarch64-linux-gnu-g++
===================================
-- The C compiler identification is GNU 10.2.1
-- The CXX compiler identification is GNU 10.2.1
# ... output omitted ...
[100%] Linking CXX executable rknn_yolov8_demo
[100%] Built target rknn_yolov8_demo
[100%] Built target rknn_yolov8_demo
Install the project...
# ... output omitted ...

Copy the previously exported RKNN model to the install/rk3588_linux directory, then run the program to perform model inference.

# ./rknn_yolov5_demo <model_path> <image_path>
cat@lubancat:~/lubancat_ai_manual_code/example/yolov5/cpp/install/rk3588_linux$ ./rknn_yolov5_demo ./yolov5s.rknn ./model/bus.jpg
load lable ./model/coco_80_labels_list.txt
model input num: 1, output num: 3
input tensors:
index=0, name=images, n_dims=4, dims=[1, 640, 640, 3], n_elems=1228800, size=1228800, fmt=NHWC, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003922
output tensors:
index=0, name=output0, n_dims=4, dims=[1, 255, 80, 80], n_elems=1632000, size=1632000, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003905
index=1, name=360, n_dims=4, dims=[1, 255, 40, 40], n_elems=408000, size=408000, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003917
index=2, name=362, n_dims=4, dims=[1, 255, 20, 20], n_elems=102000, size=102000, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003918
model is NHWC input fmt
model input height=640, width=640, channel=3
scale=1.000000 dst_box=(0 0 639 639) allow_slight_change=1 _left_offset=0 _top_offset=0 padding_w=0 padding_h=0
src width=640 height=640 fmt=0x1 virAddr=0x0x7f95dfe040 fd=0
dst width=640 height=640 fmt=0x1 virAddr=0x0x5590d37cf0 fd=0
src_box=(0 0 639 639)
dst_box=(0 0 639 639)
color=0x72
rga_api version 1.10.0_[2]
rknn_run
person @ (211 241 285 518) 0.816
person @ (472 230 561 522) 0.812
person @ (115 233 207 546) 0.790
bus @ (88 131 557 464) 0.770
person @ (79 334 122 517) 0.455
rknn run and process use 29.048000 ms

The result is saved to out.jpg in the current directory; view the image:

(image: detection result out.jpg)

If the test reports RGA-related problems, refer to Rockchip_FAQ_RGA_CN.md.

7.3. A Quick Test of airockchip/yolov5

The tests above used the official Ultralytics YOLOv5 v7.0. Next we use the airockchip/yolov5 repository, whose YOLOv5 is optimized for deployment on RKNPU devices:

  • Optimized Focus/SPPF blocks, giving better performance with identical results;

  • Changed output nodes, removing the post_process from the model (post-processing is unfriendly to quantization);

  • ReLU used instead of SiLU as the activation layer (the replacement only takes effect when training a new model with this repository).

Clone the latest airockchip/yolov5 repository and switch to the rk_opt@v6.2.1 branch; using the default master branch works the same way.

# The rk_opt@v6.2.1 branch was used for testing; for object detection, master works the same way and the steps are similar.
git clone -b rk_opt@v6.2.1 https://github.com/airockchip/yolov5.git
cd yolov5

Get the weight file and retrain the model, or use a model you trained with the official Ultralytics repository, and then use this repository to export a model adapted to RKNPU.

# You can add a dataset configuration file for a custom dataset, modify the model configuration, etc., and then train with this repository
# The tutorial simply retrains on coco128, using the pretrained weights yolov5s.pt
(yolov5) llh@anhao:~/yolov5$ python3 train.py --data coco128.yaml --weights yolov5s.pt  --img 640

# Training automatically downloads the weights and dataset; the training output is extensive and includes:
...
Optimizer stripped from runs/train/exp/weights/last.pt, 14.9MB
Optimizer stripped from runs/train/exp/weights/best.pt, 14.9MB

Validating runs/train/exp/weights/best.pt...
...
# The training analysis and weights are saved in the runs/train/exp/ directory

The final model is saved as best.pt in the runs/train/exp/weights/ directory; next we export this model to torchscript or onnx.

# export.py parameters
# --weights specifies the path of the weight file
# --rknpu specifies the platform (supported: rk1808, rv1109, rv1126, rk3399pro, rk3566, rk3568, rk3588, rv1103, rv1106)
# --include sets the export format; torchscript is exported by default, onnx can also be specified
python3 export.py --weights runs/train/exp/weights/best.pt --rknpu rk3588
export: data=data/coco128.yaml, weights=['runs/train/exp4/weights/best.pt'], imgsz=[640, 640],
batch_size=1, device=cpu, half=False, inplace=False, train=False, keras=False, optimize=False, int8=False,
dynamic=False, simplify=False, opset=12, verbose=False, workspace=4, nms=False, agnostic_nms=False, topk_per_class=100,
topk_all=100, iou_thres=0.45, conf_thres=0.25, include=['torchscript'], rknpu=rk3588
YOLOv5 🚀 v6.2-4-g23a20ef3 Python-3.9.18 torch-2.1.2+cu121 CPU

Fusing layers...
Model summary: 213 layers, 7225885 parameters, 0 gradients, 16.4 GFLOPs
---> save anchors for RKNN
[[10.0, 13.0], [16.0, 30.0], [33.0, 23.0], [30.0, 61.0], [62.0, 45.0], [59.0, 119.0], [116.0, 90.0], [156.0, 198.0], [373.0, 326.0]]

PyTorch: starting from runs/train/exp4/weights/best.pt with output shape (1, 255, 80, 80) (14.2 MB)

TorchScript: starting export with torch 2.1.2+cu121...
TorchScript: export success, saved as runs/train/exp4/weights/best.torchscript (27.9 MB)

Export complete (2.82s)
Results saved to /mnt/f/wsl_file/wsl_ai/yolov5/airockchip/yolov5-6.2.1/runs/train/exp4/weights
Detect:          python detect.py --weights runs/train/exp4/weights/best.torchscript
Validate:        python val.py --weights runs/train/exp4/weights/best.torchscript
PyTorch Hub:     model = torch.hub.load('ultralytics/yolov5', 'custom', 'runs/train/exp4/weights/best.torchscript')
Visualize:       https://netron.app

# Or use the following command to export an onnx model, which generates best.onnx in runs/train/exp/weights; note that onnx may need to be installed in the environment.
python3 export.py --weights runs/train/exp/weights/best.pt --include onnx --rknpu rk3588

The exported torchscript model is saved as best.torchscript in the corresponding model directory runs/train/exp/weights.

7.3.1. Converting to RKNN and Testing on the Connected Board

The test reuses the conversion script from earlier: run it directly to export the RKNN model file, or use the tools here for model conversion, evaluation, deployment, and so on.
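
If the torchscript model is converted instead of the onnx one, load_onnx is simply replaced with load_pytorch; a minimal sketch (the paths and platform are examples, adjust them to your environment):

from rknn.api import RKNN

rknn = RKNN(verbose=False)
rknn.config(mean_values=[[0, 0, 0]], std_values=[[255, 255, 255]],
            target_platform='rk3588')
rknn.load_pytorch(model='runs/train/exp/weights/best.torchscript',
                  input_size_list=[[1, 3, 640, 640]])
rknn.build(do_quantization=True, dataset='./dataset.txt')  # dataset.txt lists calibration images
rknn.export_rknn('./model/best.rknn')
rknn.release()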

# Run the conversion script to convert the torchscript or onnx model to rknn.
# Adjust the model path, target platform, etc. in pt2rknn.py for your own environment.
(toolkit2_1.6) llh@YH-LONG:~/xxx/yolov5$ python pt2rknn.py
W __init__: rknn-toolkit2 version: 1.6.0+81f21f4d
--> Config model
done
--> Loading model
PtParse: 100%|███████████████████████████████████████████████████| 728/728 [00:01<00:00, 638.25it/s]
Loading : 100%|██████████████████████████████████████████████████| 129/129 [00:00<00:00, 225.55it/s]
done
--> Building model
W build: found outlier value, this may affect quantization accuracy
const name     abs_mean    abs_std     outlier value
weight.24      0.81        1.33        13.767
GraphPreparing : 100%|██████████████████████████████████████████| 149/149 [00:00<00:00, 5248.43it/s]
Quantizating : 100%|██████████████████████████████████████████████| 149/149 [00:01<00:00, 92.63it/s]
# ... output omitted ...
done
--> Export rknn model
done

Then use rknn-toolkit2 to connect to the board for model testing, evaluation, and so on. The board must be connected to the PC via USB or the network, the adb connection must be working, and rknn_server must be running.

# Test the model on the connected board; the result is saved to result.jpg
(toolkit2_1.6) llh@YH-LONG:~/xxx/yolov5$ python test.py
W __init__: rknn-toolkit2 version: 1.6.0+81f21f4d
*************************
all device(s) with adb mode:
192.168.103.152:5555
*************************
I NPUTransfer: Starting NPU Transfer Client, Transfer version 2.1.0 (b5861e7@2020-11-23T11:50:36)
D RKNNAPI: ==============================================
D RKNNAPI: RKNN VERSION:
D RKNNAPI:   API: 1.6.0 (535b468 build@2023-12-11T09:05:46)
D RKNNAPI:   DRV: rknn_server: 1.5.0 (17e11b1 build: 2023-05-18 21:43:39)
D RKNNAPI:   DRV: rknnrt: 1.6.0 (9a7b5d24c@2023-12-13T17:31:11)
D RKNNAPI: ==============================================
D RKNNAPI: Input tensors:
D RKNNAPI:   index=0, name=images, n_dims=4, dims=[1, 640, 640, 3], n_elems=1228800, size=1228800, w_stride = 0,
# ... output omitted ...
--> Running model
done
class        score      xmin, ymin, xmax, ymax
--------------------------------------------------
person       0.884     [ 209,  244,  286,  506]
person       0.868     [ 478,  238,  559,  526]
person       0.825     [ 110,  238,  230,  534]
person       0.339     [  79,  354,  122,  516]
 bus         0.705     [  94,  129,  553,  468]
Save results to result.jpg!

7.3.2. Deploying to the Board

On the board, deployment can use rknn-toolkit-lite2 (Python API) or RKNPU (C/C++ API). Next we follow the deployment examples provided by the rknn_model_zoo repository for a deployment test.

# Pull the example (the same deployment program as in the earlier test); if it has not been built yet, build it with the following command; the tutorial is tested on lubancat-4 (pass the rk3588 parameter)
cat@lubancat:~/lubancat_ai_manual_code/example/yolov5/cpp$ ./build-linux.sh -t rk3588

After building, several executables are generated (saved in the install/rk3588_linux directory). The tutorial runs inference on an image; switch to that directory and run the program:

# ./rknn_yolov5_demo <model_path> <image_path>
cat@lubancat:~/lubancat_ai_manual_code/example/yolov5/cpp$ ./rknn_yolov5_demo ./yolov5s.rknn ./model/bus.jpg
load lable ./model/coco_80_labels_list.txt
model input num: 1, output num: 3
input tensors:
index=0, name=images, n_dims=4, dims=[1, 640, 640, 3], n_elems=1228800, size=1228800, fmt=NHWC, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003922
output tensors:
index=0, name=output, n_dims=4, dims=[1, 255, 80, 80], n_elems=1632000, size=1632000, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003860
index=1, name=283, n_dims=4, dims=[1, 255, 40, 40], n_elems=408000, size=408000, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003922
index=2, name=285, n_dims=4, dims=[1, 255, 20, 20], n_elems=102000, size=102000, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003915
model is NHWC input fmt
model input height=640, width=640, channel=3
scale=1.000000 dst_box=(0 0 639 639) allow_slight_change=1 _left_offset=0 _top_offset=0 padding_w=0 padding_h=0
src width=640 height=640 fmt=0x1 virAddr=0x0x7f834b2040 fd=0
dst width=640 height=640 fmt=0x1 virAddr=0x0x55789d1520 fd=0
src_box=(0 0 639 639)
dst_box=(0 0 639 639)
color=0x72
rga_api version 1.10.0_[2]
rknn_run
person @ (209 244 286 506) 0.884
person @ (478 238 559 526) 0.868
person @ (110 238 230 534) 0.825
bus @ (94 129 553 468) 0.705
person @ (79 354 122 516) 0.339
rknn run and process use 21.987000 ms

The result is saved to out.jpg in the current directory. For more model test programs, see the rknn_model_zoo repository.