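6. YoloX (Object Detection)
YOLOX is an anchor-free version of YOLO with a simpler design and better performance. It aims to bridge the gap between the research and industrial communities.
For details of the YoloX model architecture, see https://arxiv.org/abs/2107.08430 .
This chapter runs a simple YoloX inference test, then deploys and tests the model on LubanCat RK-series boards.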
6.1. Simple YoloX Test
Create a virtual environment, clone the YOLOX repository source, and install the dependencies.
# Create a virtual environment
conda create -n yolox python=3.9
conda activate yolox
# Install YOLOX
git clone https://github.com/Megvii-BaseDetection/YOLOX.git
cd YOLOX
pip3 install -r requirements.txt
pip3 install -v -e .
Get the weight file from YOLOX, then run an inference test:
# Test the yolox-s model; download the weights (trained on the COCO dataset):
cd YOLOX
wget https://github.com/Megvii-BaseDetection/YOLOX/releases/download/0.1.1rc0/yolox_s.pth
# Run image inference with the project's tools/demo.py; the test image is assets/dog.jpg from the YOLOX project
python tools/demo.py image -n yolox-s -c yolox_s.pth --path assets/dog.jpg --conf 0.25 --nms 0.45 --tsize 640 --save_result --device cpu
2024-10-15 17:23:06.080 | INFO | __main__:main:259 - Args: Namespace(demo='image', experiment_name='yolox_s',
name='yolox-s', path='assets/dog.jpg', camid=0, save_result=True, exp_file=None, ckpt='../yolox_s.pth',
device='cpu', conf=0.25, nms=0.45, tsize=640, fp16=False, legacy=False, fuse=False, trt=False)
2024-10-15 17:23:06.368 | INFO | __main__:main:269 - Model Summary: Params: 8.97M, Gflops: 26.93
2024-10-15 17:23:06.369 | INFO | __main__:main:282 - loading checkpoint
ckpt = torch.load(ckpt_file, map_location="cpu")
2024-10-15 17:23:06.652 | INFO | __main__:main:286 - loaded checkpoint done.
2024-10-15 17:23:06.784 | INFO | __main__:inference:165 - Infer time: 0.0971s
2024-10-15 17:23:06.788 | INFO | __main__:image_demo:202 - Saving detection result in ./YOLOX_outputs/yolox_s/vis_res/2024_10_15_17_23_06/dog.jpg
The result is saved in the YOLOX_outputs directory under the current directory, where the result image can be viewed.
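The `--conf 0.25` and `--nms 0.45` flags in the demo command above are the score threshold and the IoU threshold used during post-processing. The following is a minimal, class-agnostic sketch of that filtering step in plain Python. It is an illustration only, not YOLOX's own implementation (which runs on tensors and applies NMS per class); the function names `iou` and `filter_and_nms` and the sample boxes are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]          # (x1, y1, x2, y2)
Det = Tuple[float, float, float, float, float]   # box + score


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_and_nms(dets: List[Det], conf_thres: float = 0.25,
                   nms_thres: float = 0.45) -> List[Det]:
    """Drop low-score detections, then apply greedy NMS."""
    # Keep only detections at or above the score threshold, best first.
    candidates = sorted((d for d in dets if d[4] >= conf_thres),
                        key=lambda d: d[4], reverse=True)
    kept: List[Det] = []
    for det in candidates:
        # A candidate survives if it does not overlap any kept box too much.
        if all(iou(det[:4], k[:4]) <= nms_thres for k in kept):
            kept.append(det)
    return kept


# Hypothetical detections: two overlapping boxes, one separate, one weak.
sample = [
    (0, 0, 10, 10, 0.9),
    (1, 1, 10, 10, 0.8),
    (20, 20, 30, 30, 0.7),
    (0, 0, 5, 5, 0.1),
]
result = filter_and_nms(sample)
print(result)
```

Raising `--conf` removes more weak detections, while lowering `--nms` suppresses overlapping boxes more aggressively.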
For more YOLOX usage tutorials, see the YOLOX documentation.
6.2. YoloX Model Export
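For model export we use the airockchip/yolox repository, whose YOLOX has been optimized for deployment on RKNPU devices:
The focus/SPPF blocks are optimized for better performance with the same results;
The output nodes are changed and post_process is removed from the model (post-processing is unfriendly to quantization);
Use a YOLOX model you trained yourself, or get the weight file from YOLOX, then export the ONNX model.
# Optionally create a fresh virtual environment, then fetch the repository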
git clone https://github.com/airockchip/YOLOX.git
cd YOLOX
pip3 install -r requirements.txt
# Test the yolox-s model; download the weights:
wget https://github.com/Megvii-BaseDetection/YOLOX/releases/download/0.1.1rc0/yolox_s.pth
# Export the ONNX model
python3 tools/export_onnx.py --output-name yolox_s.onnx -n yolox-s -c yolox_s.pth --rknpu
2024-10-15 16:09:15.700 | INFO | __main__:main:70 - args value: Namespace(output_name='yolox_s.onnx', input='images',
output='output', opset=11, batch_size=1, dynamic=False, no_onnxsim=False, exp_file=None, experiment_name=None, name='yolox-s',
ckpt='../yolox_s.pth', rknpu=True, opts=[], decode_in_inference=False)
/xxx/tools/export_onnx.py:85: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value),
ckpt = torch.load(ckpt_file, map_location="cpu")
2024-10-15 16:09:16.116 | INFO | __main__:main:100 - loading checkpoint done.
/xxx/tools/export_onnx.py:103: FutureWarning: 'torch.onnx._export' is
deprecated in version 1.12.0 and will be removed in 2.0. Please use `torch.onnx.export` instead.
torch.onnx._export(
2024-10-15 16:09:16.902 | INFO | __main__:main:113 - generated onnx model named yolox_s.onnx
2024-10-15 16:09:17.364 | INFO | __main__:main:129 - generated simplified onnx model named yolox_s.onnx
The ONNX model is written to the current directory; then use the toolkit2 tool to convert it into an RKNN model.
(toolkit2.2) llh@llh:/xxx/python$ python convert.py ./yolox_s.onnx rk3588 i8
I rknn-toolkit2 version: 2.2.0
--> Config model
done
--> Loading model
I Loading : 100%|██████████████████████████████████████████████| 169/169 [00:00<00:00, 34709.50it/s]
done
--> Building model
I OpFusing 0: 100%|██████████████████████████████████████████████| 100/100 [00:00<00:00, 747.18it/s]
I OpFusing 1 : 100%|█████████████████████████████████████████████| 100/100 [00:00<00:00, 565.37it/s]
I OpFusing 2 : 100%|█████████████████████████████████████████████| 100/100 [00:00<00:00, 155.85it/s]
I GraphPreparing : 100%|███████████████████████████████████████| 195/195 [00:00<00:00, 11410.76it/s]
I Quantizating : 100%|████████████████████████████████████████████| 195/195 [00:33<00:00, 5.78it/s]
W build: The default input dtype of 'images' is changed from 'float32' to 'int8' in rknn model for performance!
Please take care of this change when deploy rknn model with Runtime API!
W build: The default output dtype of 'output' is changed from 'float32' to 'int8' in rknn model for performance!
Please take care of this change when deploy rknn model with Runtime API!
W build: The default output dtype of '788' is changed from 'float32' to 'int8' in rknn model for performance!
Please take care of this change when deploy rknn model with Runtime API!
W build: The default output dtype of 'output.1' is changed from 'float32' to 'int8' in rknn model for performance!
Please take care of this change when deploy rknn model with Runtime API!
I rknn building ...
I rknn buiding done.
done
--> Export rknn model
done
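The stage names in the log above (Config model, Loading model, Building model, Export rknn model) follow the usual RKNN-Toolkit2 flow. The following is a minimal sketch of what a script like convert.py might do, not the actual script. The helper names `wants_quantization` and `convert`, the `fp` option, the calibration list file, and the mean/std values are assumptions made for illustration. The toolkit import is deferred so the helper can be read and tested without rknn-toolkit2 installed.

```python
def wants_quantization(dtype: str) -> bool:
    """Map a command-line dtype argument to a quantization flag.

    'i8' matches the command shown above; 'fp' is an assumed name
    for a non-quantized build.
    """
    if dtype == "i8":
        return True
    if dtype == "fp":
        return False
    raise ValueError(f"unsupported dtype: {dtype!r}")


def convert(onnx_path: str, platform: str, dtype: str,
            dataset_txt: str, out_path: str) -> None:
    """Convert an ONNX model to RKNN (requires rknn-toolkit2)."""
    from rknn.api import RKNN  # imported lazily; only needed here

    rknn = RKNN(verbose=False)
    try:
        # Assumed preprocessing: raw 0..255 pixels, no normalization.
        rknn.config(mean_values=[[0, 0, 0]], std_values=[[1, 1, 1]],
                    target_platform=platform)
        if rknn.load_onnx(model=onnx_path) != 0:
            raise RuntimeError("load_onnx failed")
        # dataset_txt is a hypothetical text file listing calibration images.
        if rknn.build(do_quantization=wants_quantization(dtype),
                      dataset=dataset_txt) != 0:
            raise RuntimeError("build failed")
        if rknn.export_rknn(out_path) != 0:
            raise RuntimeError("export_rknn failed")
    finally:
        rknn.release()


# Example call (not executed here because it needs the toolkit and files):
# convert("./yolox_s.onnx", "rk3588", "i8", "./dataset.txt", "./yolox.rknn")
```

With quantization enabled, the build step needs representative calibration images, which is why the log shows a Quantizating stage that takes far longer than the others.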
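6.3. YoloX Model Deployment
On-board deployment uses RKNPU (the C/C++ interface). Refer to the deployment example provided by the rknn_model_zoo repository to deploy and test.
# Fetch the example code, or fetch the examples from the rknn_model_zoo repository, and build directly on the board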
cat@lubancat:~/$ git clone https://gitee.com/LubanCat/lubancat_ai_manual_code.git
cat@lubancat:~/$ cd lubancat_ai_manual_code/example/yolox/cpp
# If the program has not been built yet, run the command below to build it; this tutorial was tested on lubancat-4/5 (hence the rk3588 argument)
cat@lubancat:~/lubancat_ai_manual_code/example/yolox/cpp$ ./build-linux.sh -t rk3588 -r
./build_linux.sh -t rk3588 -r
===================================
TARGET_SOC=rk3588
INSTALL_DIR=/home/cat/xxx/examples/yolox/cpp/install/rk3588_linux
BUILD_DIR=/home/cat/xxx/examples/yolox/cpp/build/build_rk3588_linux
ENABLE_DMA32=
DISABLE_RGA=ON
CC=aarch64-linux-gnu-gcc
CXX=aarch64-linux-gnu-g++
===================================
-- The C compiler identification is GNU 10.2.1
-- The CXX compiler identification is GNU 10.2.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/aarch64-linux-gnu-gcc - skipped
-- Detecting C compile features
# (output omitted) .............
[ 75%] Building CXX object CMakeFiles/rknn_yolox_demo.dir/main.cc.o
[ 91%] Building CXX object CMakeFiles/rknn_yolox_demo.dir/rknpu2/yolox.cc.o
[100%] Linking CXX executable rknn_yolox_demo
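# (output omitted) .............
After the build completes, an executable is generated (saved in the install/rk3588_linux directory). This tutorial runs inference on an image; switch to that directory and run the program: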
# ./rknn_yolox_demo <model_path> <image_path>
cat@lubancat:~/xxx/install/rk3588_linux$ ./rknn_yolox_demo ./yolox.rknn model/bus.jpg
load lable ./model/coco_80_labels_list.txt
model input num: 1, output num: 3
input tensors:
index=0, name=images, n_dims=4, dims=[1, 640, 640, 3], n_elems=1228800, size=1228800, fmt=NHWC, type=INT8, qnt_type=AFFINE, zp=-128, scale=1.000000
output tensors:
index=0, name=output, n_dims=4, dims=[1, 85, 80, 80], n_elems=544000, size=544000, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-28, scale=0.022949
index=1, name=788, n_dims=4, dims=[1, 85, 40, 40], n_elems=136000, size=136000, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-26, scale=0.024599
index=2, name=output.1, n_dims=4, dims=[1, 85, 20, 20], n_elems=34000, size=34000, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-19, scale=0.021201
model is NHWC input fmt
model input height=640, width=640, channel=3
init_yolox_model use: 35.960999 ms
origin size=640x640 crop size=640x640
input image: 640 x 640, subsampling: 4:2:0, colorspace: YCbCr, orientation: 1
scale=1.000000 dst_box=(0 0 639 639) allow_slight_change=1 _left_offset=0 _top_offset=0 padding_w=0 padding_h=0
convert image use cpu
finish
convert_image_with_letterbox use: 25.207001 ms
rknn_inputs_set use: 1.952000 ms
rknn_run use: 28.274000 ms
rknn_outputs_get use: 0.419000 ms
post_process use: 0.117000 ms
inference_yolox_model use: 56.014999 ms
bus @ (87 137 550 428) 0.930
person @ (103 237 223 535) 0.896
person @ (210 235 286 513) 0.871
person @ (474 235 559 519) 0.831
person @ (80 328 118 516) 0.499
write_image path: result.png width=640 height=640 channel=3 data=0x559024524
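The log above lists INT8 tensors with AFFINE quantization parameters (`zp`, `scale`) and three output heads of 85 channels (4 box values, 1 objectness score, 80 class scores) on 80x80, 40x40, and 20x20 grids. Because post-processing was removed from the model, the application has to dequantize and decode these values itself. The sketch below shows the arithmetic for a single grid cell. It assumes the common YOLOX decoding (center offset plus grid index, exponential width and height, both multiplied by the stride); the function names and the sample cell values are hypothetical, and the example's C++ code may differ in detail.

```python
import math
from typing import Sequence, Tuple


def quantize(x: float, zp: int, scale: float) -> int:
    """Affine quantization to int8: q = round(x / scale) + zp, clamped."""
    q = round(x / scale) + zp
    return max(-128, min(127, q))


def dequantize(q: int, zp: int, scale: float) -> float:
    """Inverse of affine quantization: x = (q - zp) * scale."""
    return (q - zp) * scale


def decode_cell(raw: Sequence[float], col: int, row: int,
                stride: int) -> Tuple[float, float, float, float]:
    """Turn one cell's (tx, ty, tw, th) into an (x1, y1, x2, y2) box."""
    tx, ty, tw, th = raw
    cx = (tx + col) * stride      # box center in input-image pixels
    cy = (ty + row) * stride
    w = math.exp(tw) * stride     # box size in input-image pixels
    h = math.exp(th) * stride
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


# Input tensor in the log: zp=-128, scale=1.0, so pixel 0..255 maps to -128..127.
print(quantize(0, -128, 1.0), quantize(255, -128, 1.0))

# First output head in the log: zp=-28, scale=0.022949, 80x80 grid (stride 8).
ZP, SCALE, STRIDE = -28, 0.022949, 8
q_cell = (-6, 16, 33, 63)  # hypothetical int8 values for tx, ty, tw, th
raw_cell = [dequantize(q, ZP, SCALE) for q in q_cell]
print(decode_cell(raw_cell, col=10, row=20, stride=STRIDE))
```

The stride follows from the input size and the grid size: 640 / 80 = 8, 640 / 40 = 16, and 640 / 20 = 32 for the three heads.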
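The install/rk3588_linux directory also contains a yolox_videocapture_demo example that supports opening a camera. Here we test by opening the camera and running detection on an image shown on another monitor.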
# ./rknn_yolox_demo <model_path> <video path/capture>
load lable ./model/coco_80_labels_list.txt
model input num: 1, output num: 3
input tensors:
index=0, name=images, n_dims=4, dims=[1, 640, 640, 3], n_elems=1228800, size=1228800, fmt=NHWC, type=INT8, qnt_type=AFFINE, zp=-128, scale=1.000000
output tensors:
index=0, name=output, n_dims=4, dims=[1, 85, 80, 80], n_elems=544000, size=544000, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-28, scale=0.022949
index=1, name=788, n_dims=4, dims=[1, 85, 40, 40], n_elems=136000, size=136000, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-26, scale=0.024599
index=2, name=output.1, n_dims=4, dims=[1, 85, 20, 20], n_elems=34000, size=34000, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-19, scale=0.021201
model is NHWC input fmt
model input height=640, width=640, channel=3
scale=1.000000 dst_box=(0 0 639 639) allow_slight_change=1 _left_offset=0 _top_offset=0 padding_w=0 padding_h=0
rga_api version 1.10.1_[0]
convert_image_with_letterbox use: 1.553000 ms
rknn_inputs_set use: 2.731000 ms
rknn_run use: 29.483999 ms
rknn_outputs_get use: 0.437000 ms
post_process use: 0.048000 ms
-- inference_yolox_model use: 34.460999 ms
bus @ (84 132 557 435) 0.951
person @ (106 235 218 534) 0.896
person @ (212 239 285 510) 0.871
person @ (474 236 559 520) 0.831
person @ (80 326 119 517) 0.535
handbag @ (212 367 252 418) 0.258
# (output omitted) ........
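The `scale=... _left_offset=... _top_offset=... padding_w=... padding_h=...` lines in the logs above come from letterbox preprocessing, which resizes the source frame to fit the 640x640 model input while keeping its aspect ratio. The sketch below shows one common way to compute these parameters and to map a detected box back to source-image coordinates. It assumes centered padding and truncating integer sizes; the function names are hypothetical and the example's own rounding may differ.

```python
from typing import Tuple


def letterbox_params(src_w: int, src_h: int,
                     dst_w: int = 640, dst_h: int = 640) -> Tuple[float, int, int]:
    """Return (scale, left_offset, top_offset) for a centered letterbox."""
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w, new_h = int(src_w * scale), int(src_h * scale)
    # The remaining space is split evenly on both sides.
    left = (dst_w - new_w) // 2
    top = (dst_h - new_h) // 2
    return scale, left, top


def to_source_coords(box: Tuple[float, float, float, float], scale: float,
                     left: int, top: int) -> Tuple[float, float, float, float]:
    """Map a box from model-input coordinates back to the source image."""
    x1, y1, x2, y2 = box
    return ((x1 - left) / scale, (y1 - top) / scale,
            (x2 - left) / scale, (y2 - top) / scale)


# A 640x640 image needs no scaling or padding, matching the logs above.
print(letterbox_params(640, 640))

# A hypothetical 1280x720 camera frame is halved and padded top and bottom.
scale, left, top = letterbox_params(1280, 720)
print(scale, left, top)
print(to_source_coords((0, 140, 640, 500), scale, left, top))
```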
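Note: flash the board with a desktop system image and connect a monitor, confirm the camera's device ID, and modify the example's camera settings, such as the supported resolution and the MJPG format.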
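Before editing the example, it can help to check which resolution and pixel format the camera accepts. The following is a rough sketch using OpenCV's Python bindings, assuming `opencv-python` is installed on the board. The function names, the device ID 0, and the 640x480 resolution are placeholders, not values taken from the example. The FourCC helper is plain Python and computes the same integer as `cv2.VideoWriter_fourcc(*"MJPG")`.

```python
from typing import Tuple


def fourcc_code(tag: str) -> int:
    """Pack a four-character code such as 'MJPG' into its integer form."""
    if len(tag) != 4:
        raise ValueError("a FourCC tag must have exactly four characters")
    # The first character occupies the lowest byte.
    return sum(ord(ch) << (8 * i) for i, ch in enumerate(tag))


def probe_camera(device_id: int = 0, width: int = 640, height: int = 480,
                 tag: str = "MJPG") -> Tuple[int, int]:
    """Request a format and resolution, then report what the driver applied."""
    import cv2  # imported lazily; requires opencv-python

    cap = cv2.VideoCapture(device_id)
    try:
        if not cap.isOpened():
            raise RuntimeError(f"cannot open camera {device_id}")
        cap.set(cv2.CAP_PROP_FOURCC, fourcc_code(tag))
        cap.set(cv2.CAP_PROP_FRAME_WIDTH, width)
        cap.set(cv2.CAP_PROP_FRAME_HEIGHT, height)
        # The driver may silently fall back to another supported mode.
        return (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
                int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    finally:
        cap.release()


print(hex(fourcc_code("MJPG")))
# Example call on the board (needs a connected camera):
# print(probe_camera(device_id=0, width=640, height=480))
```

If the reported size differs from the requested one, the camera probably does not support that mode, and the values in the example should be changed to a mode it does support.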