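8. YOLOX

YOLOX is an anchor-free version of YOLO with a simpler design and better performance. It aims to bridge the gap between the research and industrial communities.

For details of the YOLOX model architecture, see: https://arxiv.org/abs/2107.08430 .

This chapter briefly tests the YOLOX object detection model, then deploys and tests it on LubanCat RK-series boards.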

8.1. YOLOX

On a PC, create a virtual environment, clone the YOLOX repository source, and install its dependencies.

# Create a virtual environment
conda create -n yolox python=3.9
conda activate yolox

# Install YOLOX
git clone https://github.com/Megvii-BaseDetection/YOLOX.git
cd YOLOX
pip3 install -r requirements.txt
pip3 install -v -e .

Get the weight file from the YOLOX repository, then run a test inference:

# Test the yolox-s model; get the weights (trained on the COCO dataset):
cd YOLOX
wget https://github.com/Megvii-BaseDetection/YOLOX/releases/download/0.1.1rc0/yolox_s.pth

# Run inference on a test image with the project's tools/demo.py; the test image is assets/dog.jpg from the YOLOX project
python tools/demo.py image -n yolox-s -c yolox_s.pth --path assets/dog.jpg --conf 0.25 --nms 0.45 --tsize 640 --save_result --device cpu

2024-10-15 17:23:06.080 | INFO     | __main__:main:259 - Args: Namespace(demo='image', experiment_name='yolox_s',
 name='yolox-s', path='assets/dog.jpg', camid=0, save_result=True, exp_file=None, ckpt='../yolox_s.pth',
  device='cpu', conf=0.25, nms=0.45, tsize=640, fp16=False, legacy=False, fuse=False, trt=False)
2024-10-15 17:23:06.368 | INFO     | __main__:main:269 - Model Summary: Params: 8.97M, Gflops: 26.93
2024-10-15 17:23:06.369 | INFO     | __main__:main:282 - loading checkpoint
ckpt = torch.load(ckpt_file, map_location="cpu")
2024-10-15 17:23:06.652 | INFO     | __main__:main:286 - loaded checkpoint done.
2024-10-15 17:23:06.784 | INFO     | __main__:inference:165 - Infer time: 0.0971s
2024-10-15 17:23:06.788 | INFO     | __main__:image_demo:202 - Saving detection result in ./YOLOX_outputs/yolox_s/vis_res/2024_10_15_17_23_06/dog.jpg

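The result is saved in the YOLOX_outputs directory under the current directory, where the result image can be viewed.

For more YOLOX usage tutorials, see the YOLOX documentation.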
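The demo command above passes --conf 0.25 and --nms 0.45. As a rough illustration of what these two thresholds control, the following is a minimal pure-Python sketch of confidence filtering followed by class-wise greedy non-maximum suppression. It is not the YOLOX implementation; the function names and the sample boxes are hypothetical, and it assumes boxes are given as (x1, y1, x2, y2).

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(dets, conf_thres=0.25, nms_thres=0.45):
    """dets: list of (box, score, class_id); returns the detections that survive."""
    # Step 1: drop candidates below the confidence threshold, best score first.
    cands = sorted((d for d in dets if d[1] >= conf_thres),
                   key=lambda d: d[1], reverse=True)
    kept = []
    for box, score, cls in cands:
        # Step 2: keep a box only if it does not overlap an already kept box
        # of the same class by more than the NMS threshold.
        if all(k[2] != cls or iou(box, k[0]) <= nms_thres for k in kept):
            kept.append((box, score, cls))
    return kept


if __name__ == "__main__":
    # Hypothetical detections, for illustration only.
    sample = [
        ((0, 0, 100, 100), 0.90, 0),
        ((5, 5, 105, 105), 0.80, 0),      # heavy overlap with the first box
        ((200, 200, 300, 300), 0.60, 1),
        ((0, 0, 50, 50), 0.10, 0),        # below the confidence threshold
    ]
    for det in filter_detections(sample):
        print(det)
```

Raising --conf removes more low-score candidates before suppression; lowering --nms suppresses overlapping boxes more aggressively.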

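8.2. YOLOX Model Export

For model export we use the airockchip/yolox repository, whose YOLOX has been optimized for deployment on RKNPU devices:

Optimizes the focus/SPPF block, obtaining better performance with the same result;
Changes the output nodes and removes post_process from the model (post-processing is unfriendly to quantization).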
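Because post-processing is removed from the exported model, box decoding has to run outside the network. The following is a minimal sketch of how one grid cell of one output head might be decoded, assuming the standard YOLOX anchor-free decoding and an 85-value layout (4 box values, 1 objectness, 80 class scores). It also assumes that objectness and class scores are already probabilities; whether that holds for a given export should be checked. The function name decode_cell is hypothetical, and stride is the input size divided by the grid size.

```python
import math


def decode_cell(raw, col, row, stride):
    """Decode one grid cell of a YOLOX-style output head.

    raw: sequence of 85 floats [tx, ty, tw, th, objectness, 80 class scores]
    col, row: cell indices inside the grid
    stride: input size divided by grid size (for example 640 / 80 = 8)
    Returns ((x1, y1, x2, y2), score, class_id) in model-input pixels.
    """
    tx, ty, tw, th, obj = raw[:5]
    cls_scores = raw[5:]
    # Anchor-free decoding: the centre is an offset from the cell index,
    # width and height are exponentials scaled by the stride.
    cx = (tx + col) * stride
    cy = (ty + row) * stride
    w = math.exp(tw) * stride
    h = math.exp(th) * stride
    # Final score is objectness times the best class score (assumption:
    # both are already in the range [0, 1]).
    class_id = max(range(len(cls_scores)), key=cls_scores.__getitem__)
    score = obj * cls_scores[class_id]
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2), score, class_id


if __name__ == "__main__":
    # Hypothetical cell values, for illustration only.
    raw = [0.5, 0.5, 0.0, 0.0, 0.9] + [0.0] * 80
    raw[5 + 2] = 0.8
    print(decode_cell(raw, col=10, row=20, stride=8))
```

Boxes decoded this way would still need confidence filtering and non-maximum suppression before being drawn.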

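Use a YOLOX model you trained yourself, or get the weight file from YOLOX, then export the ONNX model.

Tip

For a self-trained model, modify the OBJ_CLASS_NUM parameter, the coco_80_labels_list.txt file, and so on in the deployment example later in this chapter.

# Optionally create a fresh virtual environment, then get the source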
git clone https://github.com/airockchip/YOLOX.git
cd YOLOX
pip3 install -r requirements.txt

# Test the yolox-s model; get the weights:
wget https://github.com/Megvii-BaseDetection/YOLOX/releases/download/0.1.1rc0/yolox_s.pth

# Export the ONNX model
python3 tools/export_onnx.py --output-name yolox_s.onnx -n yolox-s -c yolox_s.pth --rknpu

2024-10-15 16:09:15.700 | INFO     | __main__:main:70 - args value: Namespace(output_name='yolox_s.onnx', input='images',
output='output', opset=11, batch_size=1, dynamic=False, no_onnxsim=False, exp_file=None, experiment_name=None, name='yolox-s',
 ckpt='../yolox_s.pth', rknpu=True, opts=[], decode_in_inference=False)
/xxx/tools/export_onnx.py:85: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value),
ckpt = torch.load(ckpt_file, map_location="cpu")
2024-10-15 16:09:16.116 | INFO     | __main__:main:100 - loading checkpoint done.
/xxx/tools/export_onnx.py:103: FutureWarning: 'torch.onnx._export' is
deprecated in version 1.12.0 and will be removed in 2.0. Please use `torch.onnx.export` instead.
torch.onnx._export(
2024-10-15 16:09:16.902 | INFO     | __main__:main:113 - generated onnx model named yolox_s.onnx
2024-10-15 16:09:17.364 | INFO     | __main__:main:129 - generated simplified onnx model named yolox_s.onnx

The ONNX model is written to the current directory. Then use the toolkit2 tool (rknn-toolkit2) to convert it to an RKNN model.

(toolkit2.2) llh@llh:/xxx/python$ python convert.py  ./yolox_s.onnx rk3588 i8
I rknn-toolkit2 version: 2.2.0
--> Config model
done
--> Loading model
I Loading : 100%|██████████████████████████████████████████████| 169/169 [00:00<00:00, 34709.50it/s]
done
--> Building model
I OpFusing 0: 100%|██████████████████████████████████████████████| 100/100 [00:00<00:00, 747.18it/s]
I OpFusing 1 : 100%|█████████████████████████████████████████████| 100/100 [00:00<00:00, 565.37it/s]
I OpFusing 2 : 100%|█████████████████████████████████████████████| 100/100 [00:00<00:00, 155.85it/s]
I GraphPreparing : 100%|███████████████████████████████████████| 195/195 [00:00<00:00, 11410.76it/s]
I Quantizating : 100%|████████████████████████████████████████████| 195/195 [00:33<00:00,  5.78it/s]
W build: The default input dtype of 'images' is changed from 'float32' to 'int8' in rknn model for performance!
                    Please take care of this change when deploy rknn model with Runtime API!
W build: The default output dtype of 'output' is changed from 'float32' to 'int8' in rknn model for performance!
                    Please take care of this change when deploy rknn model with Runtime API!
W build: The default output dtype of '788' is changed from 'float32' to 'int8' in rknn model for performance!
                    Please take care of this change when deploy rknn model with Runtime API!
W build: The default output dtype of 'output.1' is changed from 'float32' to 'int8' in rknn model for performance!
                    Please take care of this change when deploy rknn model with Runtime API!
I rknn building ...
I rknn buiding done.
done
--> Export rknn model
done

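8.3. YOLOX Model Deployment

Deployment on the board uses RKNPU (C/C++ interface). Refer to the deployment example provided by the rknn_model_zoo repository for the deployment test.

# Fetch the example code, or fetch the examples from the rknn_model_zoo repository, and build directly on the board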
cat@lubancat:~/$ git clone https://gitee.com/LubanCat/lubancat_ai_manual_code.git
cat@lubancat:~/$ cd lubancat_ai_manual_code/example/yolox/cpp

# If the program has not been built yet, run the command below to build it; this tutorial tests on lubancat-4/5 (target parameter rk3588)
cat@lubancat:~/lubancat_ai_manual_code/example/yolox/cpp$ ./build-linux.sh -t rk3588 -r
./build_linux.sh -t rk3588 -r
===================================
TARGET_SOC=rk3588
INSTALL_DIR=/home/cat/xxx/examples/yolox/cpp/install/rk3588_linux
BUILD_DIR=/home/cat/xxx/examples/yolox/cpp/build/build_rk3588_linux
ENABLE_DMA32=
DISABLE_RGA=ON
CC=aarch64-linux-gnu-gcc
CXX=aarch64-linux-gnu-g++
===================================
-- The C compiler identification is GNU 10.2.1
-- The CXX compiler identification is GNU 10.2.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/aarch64-linux-gnu-gcc - skipped
-- Detecting C compile features
# (output omitted)
[ 75%] Building CXX object CMakeFiles/rknn_yolox_demo.dir/main.cc.o
[ 91%] Building CXX object CMakeFiles/rknn_yolox_demo.dir/rknpu2/yolox.cc.o
[100%] Linking CXX executable rknn_yolox_demo
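# (output omitted)

After the build completes, an executable is generated (saved in the install/rk3588_linux directory). This tutorial tests inference on an image; switch to that directory and run the program: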

# ./rknn_yolox_demo <model_path> <image_path>
cat@lubancat:~/xxx/install/rk3588_linux$ ./rknn_yolox_demo ./yolox.rknn model/bus.jpg
load lable ./model/coco_80_labels_list.txt
model input num: 1, output num: 3
input tensors:
index=0, name=images, n_dims=4, dims=[1, 640, 640, 3], n_elems=1228800, size=1228800, fmt=NHWC, type=INT8, qnt_type=AFFINE, zp=-128, scale=1.000000
output tensors:
index=0, name=output, n_dims=4, dims=[1, 85, 80, 80], n_elems=544000, size=544000, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-28, scale=0.022949
index=1, name=788, n_dims=4, dims=[1, 85, 40, 40], n_elems=136000, size=136000, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-26, scale=0.024599
index=2, name=output.1, n_dims=4, dims=[1, 85, 20, 20], n_elems=34000, size=34000, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-19, scale=0.021201
model is NHWC input fmt
model input height=640, width=640, channel=3
init_yolox_model use: 35.960999 ms
origin size=640x640 crop size=640x640
input image: 640 x 640, subsampling: 4:2:0, colorspace: YCbCr, orientation: 1
scale=1.000000 dst_box=(0 0 639 639) allow_slight_change=1 _left_offset=0 _top_offset=0 padding_w=0 padding_h=0
convert image use cpu
finish
convert_image_with_letterbox use: 25.207001 ms
rknn_inputs_set use: 1.952000 ms
rknn_run use: 28.274000 ms
rknn_outputs_get use: 0.419000 ms
post_process use: 0.117000 ms
inference_yolox_model use: 56.014999 ms
bus @ (87 137 550 428) 0.930
person @ (103 237 223 535) 0.896
person @ (210 235 286 513) 0.871
person @ (474 235 559 519) 0.831
person @ (80 328 118 516) 0.499
write_image path: result.png width=640 height=640 channel=3 data=0x559024524
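The tensor descriptions in the log above list type=INT8 and qnt_type=AFFINE with a zp (zero point) and scale per tensor. Under the usual affine quantization convention, a float value is recovered as (q - zp) * scale. The following is a minimal sketch of that mapping, using the zp and scale values printed for the 80x80 output and for the input tensor; the helper names are hypothetical and are not the functions used by the demo.

```python
def dequantize(q, zp, scale):
    """Map an INT8 value back to float: real = (q - zp) * scale."""
    return (q - zp) * scale


def quantize(x, zp, scale):
    """Map a float to INT8 with rounding and clamping to [-128, 127]."""
    q = round(x / scale) + zp
    return max(-128, min(127, q))


# zp and scale of the 80x80 output tensor, copied from the log above.
OUT_ZP, OUT_SCALE = -28, 0.022949
# zp and scale of the input tensor, copied from the log above.
IN_ZP, IN_SCALE = -128, 1.0

if __name__ == "__main__":
    # Largest float the 80x80 output can represent.
    print(dequantize(127, OUT_ZP, OUT_SCALE))
    # A float threshold can be quantized once, so raw INT8 outputs can be
    # compared against it without dequantizing every element.
    print(quantize(0.25, OUT_ZP, OUT_SCALE))
    # With zp=-128 and scale=1.0, pixel values 0..255 map onto -128..127.
    print(quantize(0, IN_ZP, IN_SCALE), quantize(255, IN_ZP, IN_SCALE))
    # Element count of the first output matches n_elems in the log.
    print(1 * 85 * 80 * 80)
```

Because each output tensor has its own zp and scale, values from different heads are only comparable after dequantization or after the threshold is quantized separately for each head.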

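The install/rk3588_linux directory also contains a yolox_videocapture_demo example, which supports opening a camera or a video file (this tutorial tests a USB camera). The test opens the camera and runs detection on an image shown on another monitor.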

# ./yolox_videocapture_demo <model_path> <video path/capture>
load lable ./model/coco_80_labels_list.txt
model input num: 1, output num: 3
input tensors:
index=0, name=images, n_dims=4, dims=[1, 640, 640, 3], n_elems=1228800, size=1228800, fmt=NHWC, type=INT8, qnt_type=AFFINE, zp=-128, scale=1.000000
output tensors:
index=0, name=output, n_dims=4, dims=[1, 85, 80, 80], n_elems=544000, size=544000, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-28, scale=0.022949
index=1, name=788, n_dims=4, dims=[1, 85, 40, 40], n_elems=136000, size=136000, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-26, scale=0.024599
index=2, name=output.1, n_dims=4, dims=[1, 85, 20, 20], n_elems=34000, size=34000, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-19, scale=0.021201
model is NHWC input fmt
model input height=640, width=640, channel=3
scale=1.000000 dst_box=(0 0 639 639) allow_slight_change=1 _left_offset=0 _top_offset=0 padding_w=0 padding_h=0
rga_api version 1.10.1_[0]
convert_image_with_letterbox use: 1.553000 ms
rknn_inputs_set use: 2.731000 ms
rknn_run use: 29.483999 ms
rknn_outputs_get use: 0.437000 ms
post_process use: 0.048000 ms
-- inference_yolox_model use: 34.460999 ms
bus @ (84 132 557 435) 0.951
person @ (106 235 218 534) 0.896
person @ (212 239 285 510) 0.871
person @ (474 236 559 520) 0.831
person @ (80 326 119 517) 0.535
handbag @ (212 367 252 418) 0.258
# (output omitted)
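Both logs above report a convert_image_with_letterbox step with scale=1.000000 and zero padding, because the test image is already 640x640. For other source sizes, a letterbox resize keeps the aspect ratio and pads the remainder, and detected boxes must be mapped back to the source image. The following is a minimal sketch of that arithmetic, assuming centred padding and truncation to integers; the actual example code may round or align differently, and the function names are hypothetical.

```python
def letterbox_params(src_w, src_h, dst_w=640, dst_h=640):
    """Return (scale, new_w, new_h, pad_left, pad_top) for an aspect-preserving resize."""
    # Use the smaller ratio so the whole image fits inside the target.
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w, new_h = int(src_w * scale), int(src_h * scale)
    # Centre the resized image; the remaining border is padding.
    pad_left = (dst_w - new_w) // 2
    pad_top = (dst_h - new_h) // 2
    return scale, new_w, new_h, pad_left, pad_top


def to_source_coords(box, scale, pad_left, pad_top):
    """Map an (x1, y1, x2, y2) box from model-input pixels back to the source image."""
    x1, y1, x2, y2 = box
    return ((x1 - pad_left) / scale, (y1 - pad_top) / scale,
            (x2 - pad_left) / scale, (y2 - pad_top) / scale)


if __name__ == "__main__":
    # 640x640 source, as in the logs: no scaling and no padding.
    print(letterbox_params(640, 640))
    # Hypothetical 1280x720 camera frame, for illustration only.
    scale, new_w, new_h, pad_left, pad_top = letterbox_params(1280, 720)
    print(scale, new_w, new_h, pad_left, pad_top)
    print(to_source_coords((0, 140, 640, 500), scale, pad_left, pad_top))
```

This matters for the camera test in particular, since camera frames are usually not square.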

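When testing a USB camera, confirm the camera's device number and modify settings such as the supported resolution and the MJPG format in the yolox_videocapture_demo.cc example. For a MIPI camera, OpenCV must be configured to convert to RGB format, and the resolution must be set as well.

8.4. Reference Links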

https://github.com/Megvii-BaseDetection/YOLOX

https://github.com/airockchip/rknn_model_zoo

https://github.com/airockchip/YOLOX