20-Class Object Detection
The example code for this section is located at 百度云盘资料\野火K210 AI视觉相机\1-教程文档_例程源码\例程\10-KPU\voc20_object_detect\voc20_object_detect.py
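Introduction
This example performs object detection for the 20 classes of the PASCAL VOC dataset. The 20 object classes are: aeroplane, bicycle, bird, boat, bottle, bus, car, cat, chair, cow, diningtable, dog, horse, motorbike, person, pottedplant, sheep, sofa, train, tvmonitor. The figure below shows the example running on the actual device.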

Example
import sensor, image, time, lcd
from maix import KPU
import gc

lcd.init()
sensor.reset(dual_buff=True)            # Reset and initialize the sensor. It will
                                        # run automatically, call sensor.run(0) to stop
sensor.set_pixformat(sensor.RGB565)     # Set pixel format to RGB565 (or GRAYSCALE)
sensor.set_framesize(sensor.QVGA)       # Set frame size to QVGA (320x240)
#sensor.set_vflip(1)
sensor.skip_frames(time = 1000)         # Wait for settings to take effect.
clock = time.clock()                    # Create a clock object to track the FPS.

od_img = image.Image(size=(320,256))    # Network input buffer (320x256)
obj_name = ("aeroplane","bicycle", "bird","boat","bottle","bus","car","cat","chair","cow","diningtable", "dog","horse", "motorbike","person","pottedplant", "sheep","sofa", "train", "tvmonitor")
anchor = (1.3221, 1.73145, 3.19275, 4.00944, 5.05587, 8.09892, 9.47112, 4.84053, 11.2364, 10.0071)

kpu = KPU()
print("ready load model")
#kpu.load_kmodel(0x300000, 1536936)
kpu.load_kmodel("/sd/KPU/voc20_object_detect/voc20_detect.kmodel")
kpu.init_yolo2(anchor, anchor_num=5, img_w=320, img_h=240, net_w=320 , net_h=256 ,layer_w=10 ,layer_h=8, threshold=0.5, nms_value=0.2, classes=20)

i = 0
while True:
    i += 1
    print("cnt :", i)
    clock.tick()                        # Update the FPS clock.
    img = sensor.snapshot()
    a = od_img.draw_image(img, 0,0)     # Copy the frame into the network input buffer
    od_img.pix_to_ai()
    kpu.run_with_output(od_img)
    dect = kpu.regionlayer_yolo2()
    fps = clock.fps()
    if len(dect) > 0:
        print("dect:",dect)
        for l in dect :
            # l[0..3] are the box x, y, w, h; l[4] is the class index
            a = img.draw_rectangle(l[0],l[1],l[2],l[3], color=(0, 255, 0))
            a = img.draw_string(l[0],l[1], obj_name[l[4]], color=(0, 255, 0), scale=1.5)
    a = img.draw_string(0, 0, "%2.1ffps" %(fps), color=(0, 60, 128), scale=1.0)
    lcd.display(img)
    gc.collect()

kpu.deinit()
Code Walkthrough
import sensor, image, time, lcd
from maix import KPU
import gc
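These modules provide camera control (sensor), image processing (image), timing (time), LCD display (lcd), access to the KPU neural network processor (maix.KPU), and memory management (gc).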
lcd.init()
sensor.reset(dual_buff=True)            # Reset and initialize the sensor. It will
                                        # run automatically, call sensor.run(0) to stop
sensor.set_pixformat(sensor.RGB565)     # Set pixel format to RGB565 (or GRAYSCALE)
sensor.set_framesize(sensor.QVGA)       # Set frame size to QVGA (320x240)
#sensor.set_vflip(1)
sensor.skip_frames(time = 1000)         # Wait for settings to take effect.
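Initialize the LCD, reset the camera with double buffering enabled, set the pixel format to RGB565 and the frame size to QVGA (320x240 pixels), then skip frames for 1000 ms so the settings take effect.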
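As a quick sanity check on these settings, the memory needed for one frame follows directly from the resolution and the pixel format. The sketch below is plain Python arithmetic that runs on a desktop interpreter and makes no MaixPy calls; the helper name frame_bytes is hypothetical, and the only assumption is that RGB565 stores each pixel in 2 bytes.

```python
# Plain-Python arithmetic; frame_bytes is a hypothetical helper, not a MaixPy API.
QVGA_W, QVGA_H = 320, 240        # QVGA resolution selected by sensor.set_framesize
RGB565_BYTES_PER_PIXEL = 2       # 5 + 6 + 5 bits = 16 bits per pixel


def frame_bytes(width, height, bytes_per_pixel):
    """Return the size in bytes of one uncompressed frame."""
    return width * height * bytes_per_pixel


size = frame_bytes(QVGA_W, QVGA_H, RGB565_BYTES_PER_PIXEL)
print(size, "bytes =", size / 1024, "KiB")
```

One QVGA RGB565 frame therefore occupies 153600 bytes, or 150 KiB.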
clock = time.clock()                    # Create a clock object to track the FPS.
Create a clock object, clock, to track the frame rate (FPS).
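To illustrate the idea behind the tick()/fps() pair, here is a rough desktop-Python analogue. FpsCounter is a hypothetical class written for this illustration and is not how MaixPy implements time.clock(); timestamps are passed in explicitly so that the behavior is easy to follow.

```python
# Desktop-Python sketch of an FPS tracker; FpsCounter is hypothetical and is
# not the MaixPy implementation of time.clock().
class FpsCounter:
    def __init__(self):
        self._last = None            # timestamp of the most recent tick, in seconds

    def tick(self, now):
        """Mark the start of a frame at time `now` (seconds)."""
        self._last = now

    def fps(self, now):
        """Return frames per second based on the time elapsed since the last tick."""
        elapsed = now - self._last
        return 1.0 / elapsed if elapsed > 0 else 0.0


counter = FpsCounter()
counter.tick(now=10.00)                      # frame processing starts
print("%2.1ffps" % counter.fps(now=10.05))   # the frame took 50 ms
```

In the example, clock.tick() is called at the top of each loop iteration and clock.fps() after inference, so the reported value reflects the capture and inference time of that frame.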
od_img = image.Image(size=(320,256))
obj_name = ("aeroplane","bicycle", "bird","boat","bottle","bus","car","cat","chair","cow","diningtable", "dog","horse", "motorbike","person","pottedplant", "sheep","sofa", "train", "tvmonitor")
anchor = (1.3221, 1.73145, 3.19275, 4.00944, 5.05587, 8.09892, 9.47112, 4.84053, 11.2364, 10.0071)
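Create the image object od_img (320x256) that is fed to the detection network, and define obj_name, the tuple of 20 class names, and anchor, the anchor box sizes used by YOLOv2.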
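The anchor tuple is flat, so it may help to see it grouped into one (width, height) pair per anchor box. The sketch below is plain Python and only illustrative. It assumes the usual YOLOv2 convention that anchors are expressed in grid-cell units, and it takes the cell size from the net_w=320 and layer_w=10 values that the example passes to init_yolo2.

```python
# Values copied from the example; the grouping shown here is illustrative only.
anchor = (1.3221, 1.73145, 3.19275, 4.00944, 5.05587, 8.09892,
          9.47112, 4.84053, 11.2364, 10.0071)
anchor_num = 5

# Group the flat tuple into (width, height) pairs, one per anchor box.
pairs = [(anchor[i], anchor[i + 1]) for i in range(0, len(anchor), 2)]
assert len(pairs) == anchor_num

# Assumption: anchors are in grid-cell units, and one cell spans
# net_w / layer_w = 320 / 10 = 32 input pixels.
CELL_PX = 320 // 10
for w, h in pairs:
    print("anchor %.2f x %.2f cells ~ %d x %d px"
          % (w, h, round(w * CELL_PX), round(h * CELL_PX)))
```

Ten numbers give five pairs, which matches anchor_num=5 in the init_yolo2 call.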
kpu = KPU()
print("ready load model")
#kpu.load_kmodel(0x300000, 1536936)
kpu.load_kmodel("/sd/KPU/voc20_object_detect/voc20_detect.kmodel")
kpu.init_yolo2(anchor, anchor_num=5, img_w=320, img_h=240, net_w=320 , net_h=256 ,layer_w=10 ,layer_h=8, threshold=0.5, nms_value=0.2, classes=20)
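Instantiate the KPU (neural network processor) object and load the pretrained kmodel from the SD card. Then initialize the YOLOv2 detector with the anchors, image size, network input size, output layer size, confidence threshold, non-maximum suppression (NMS) threshold, and number of classes.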
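The size parameters passed to init_yolo2 are related to one another, and a small plain-Python check makes the relationship explicit. Nothing here calls MaixPy. The per-cell output layout (x, y, w, h, objectness, plus one score per class for each anchor) is the standard YOLOv2 layout and is an assumption about this particular model.

```python
# Plain-Python check of the init_yolo2 geometry; no MaixPy calls are made.
img_w, img_h = 320, 240          # camera frame (QVGA)
net_w, net_h = 320, 256          # network input size
layer_w, layer_h = 10, 8         # output grid size
anchor_num, classes = 5, 20

# Downsampling factor between the network input and the output grid.
stride_w = net_w // layer_w
stride_h = net_h // layer_h
assert stride_w == stride_h == 32

# Rows left unused when the 240-row frame is drawn at (0, 0) in the 256-row input.
pad_rows = net_h - img_h

# Assumption (standard YOLOv2 layout): every anchor predicts x, y, w, h,
# objectness and one score per class at each grid cell.
values_per_cell = anchor_num * (5 + classes)
total_boxes = layer_w * layer_h * anchor_num

print(stride_w, pad_rows, values_per_cell, total_boxes)
```

Under these assumptions the network input is 16 rows taller than the camera frame, presumably because 240 is not a multiple of the 32-pixel stride while 256 is. The grid then yields 400 candidate boxes per frame before thresholding and NMS.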
i = 0
while True:
    i += 1
    print("cnt :", i)
    clock.tick()                        # Update the FPS clock.
    img = sensor.snapshot()
    a = od_img.draw_image(img, 0,0)
    od_img.pix_to_ai()
    kpu.run_with_output(od_img)
    dect = kpu.regionlayer_yolo2()
    fps = clock.fps()
    if len(dect) > 0:
        print("dect:",dect)
        for l in dect :
            a = img.draw_rectangle(l[0],l[1],l[2],l[3], color=(0, 255, 0))
            a = img.draw_string(l[0],l[1], obj_name[l[4]], color=(0, 255, 0), scale=1.5)
    a = img.draw_string(0, 0, "%2.1ffps" %(fps), color=(0, 60, 128), scale=1.0)
    lcd.display(img)
    gc.collect()
Capture a frame, draw it onto od_img, convert od_img to the format the neural network expects, run the network, and fetch the detection results.
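The detection results are filtered according to the threshold and nms_value arguments given to init_yolo2. As background, the sketch below shows what greedy non-maximum suppression generally does. It is a generic plain-Python illustration, not the KPU firmware's implementation; it ignores class labels for brevity, and the candidate boxes are made-up values.

```python
# Greedy NMS sketch in plain Python. This illustrates the general technique;
# it is not the KPU firmware's implementation. Boxes are (x, y, w, h, score).

def iou(a, b):
    """Intersection over union of two (x, y, w, h, ...) boxes."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    inter_w = max(0, min(ax2, bx2) - max(a[0], b[0]))
    inter_h = max(0, min(ay2, by2) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, score_threshold=0.5, nms_value=0.2):
    """Drop low-score boxes, then keep the best box of each overlapping group."""
    candidates = sorted((b for b in boxes if b[4] >= score_threshold),
                        key=lambda b: b[4], reverse=True)
    kept = []
    for box in candidates:
        # Keep a box only if it does not overlap too much with a better one.
        if all(iou(box, k) <= nms_value for k in kept):
            kept.append(box)
    return kept


# Hypothetical candidate boxes for illustration.
boxes = [(50, 50, 100, 100, 0.9),    # strong detection
         (55, 55, 100, 100, 0.7),    # overlaps the first box heavily
         (200, 40, 60, 80, 0.6),     # separate object
         (10, 10, 20, 20, 0.3)]      # below the confidence threshold
print(nms(boxes))
```

With the example's values, a lower nms_value removes more overlapping boxes, and a higher threshold discards more low-confidence candidates.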
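Drawing the detection results and FPS:
If any objects are detected, draw a bounding box and class name for each one on the image, then draw the current FPS (frames per second). Finally, display the image on the LCD and run garbage collection to free memory.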
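The drawing code reads each detection entry by index: l[0] to l[3] are passed to draw_rectangle as x, y, w, h, and l[4] selects the class name. The plain-Python sketch below applies the same indexing to a made-up detection list so that the mapping can be tried without the hardware. describe is a hypothetical helper, and any fields beyond the first five are simply ignored.

```python
# Plain-Python sketch; describe() is a hypothetical helper, not a MaixPy API.
obj_name = ("aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car",
            "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike",
            "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor")


def describe(dect):
    """Turn raw detection entries into (label, (x, y, w, h)) tuples."""
    results = []
    for l in dect:
        x, y, w, h, class_id = l[0], l[1], l[2], l[3], l[4]
        results.append((obj_name[class_id], (x, y, w, h)))
    return results


# Hypothetical detection list, shaped like the entries the example indexes.
sample = [[60, 40, 120, 150, 14], [200, 100, 80, 60, 11]]
for label, box in describe(sample):
    print(label, box)
```

Class index 14 maps to person and 11 to dog, following the order of obj_name.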
kpu.deinit()
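When the program ends, deinitialize the KPU to release its resources. As written, the while True loop never exits, so this line is reached only if the loop is changed to terminate.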
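One possible way to make sure the cleanup runs is to wrap the loop in try/finally. The sketch below shows the pattern in desktop Python. FakeKPU is a hypothetical stand-in so the snippet runs without the hardware, and the interrupt is simulated; this is a suggestion, not part of the original example.

```python
# Sketch of a cleanup pattern. FakeKPU is a hypothetical stand-in so the
# snippet runs on desktop Python; on the device the real KPU object is used.
class FakeKPU:
    def __init__(self):
        self.released = False

    def deinit(self):
        self.released = True


def run(kpu, max_frames):
    """Process frames and always release the KPU afterwards."""
    frames = 0
    try:
        while True:
            frames += 1                  # stands in for snapshot + inference
            if frames >= max_frames:
                raise KeyboardInterrupt  # simulate the user stopping the script
    except KeyboardInterrupt:
        pass
    finally:
        kpu.deinit()                     # runs on normal exit and on errors
    return frames


kpu = FakeKPU()
print(run(kpu, max_frames=3), kpu.released)
```

The same structure could be applied to the example by placing the while True loop inside the try block and kpu.deinit() inside finally.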