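Handwritten Digit Recognition¶
The example for this section is located at 百度云盘资料\野火K210 AI视觉相机\1-教程文档_例程源码\例程\10-KPU\mnist\mnist.py
Introduction¶
Handwritten digit recognition identifies digits in the frames captured by the camera. The figure below shows a demo running on the actual hardware.
Example¶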
import sensor, image, time, lcd
from maix import KPU
import gc

lcd.init(freq=15000000)
sensor.reset(dual_buff=True)         # Reset and initialize the sensor. It will
                                     # run automatically, call sensor.run(0) to stop
sensor.set_pixformat(sensor.RGB565)  # Set pixel format to RGB565 (or GRAYSCALE)
sensor.set_framesize(sensor.QVGA)    # Set frame size to QVGA (320x240)
sensor.set_windowing((224, 224))
sensor.skip_frames(time = 1000)      # Wait for settings take effect.
clock = time.clock()                 # Create a clock object to track the FPS.

kpu = KPU()
kpu.load_kmodel("/sd/KPU/mnist/uint8_mnist_cnn_model.kmodel")

while True:
    gc.collect()
    img = sensor.snapshot()
    img_mnist1 = img.to_grayscale(1)         # convert to gray
    img_mnist2 = img_mnist1.resize(112, 112)
    a = img_mnist2.invert()                  # invert picture as mnist need
    a = img_mnist2.strech_char(1)            # preprocessing pictures, eliminate dark corner
    a = img_mnist2.pix_to_ai()
    out = kpu.run_with_output(img_mnist2, getlist=True)
    max_mnist = max(out)
    index_mnist = out.index(max_mnist)
    #score = KPU.sigmoid(max_mnist)
    display_str = "num: %d" % index_mnist
    print(display_str)
    a = img.draw_string(4, 3, display_str, color=(0, 0, 0), scale=2)
    lcd.display(img)

kpu.deinit()
Example Walkthrough¶
import sensor, image, time, lcd
from maix import KPU
import gc
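Import the required libraries: sensor (camera control), image (image processing), time, lcd (LCD display control), KPU (the neural network accelerator of the Kendryte K210), and gc (garbage collection).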
lcd.init(freq=15000000)
sensor.reset(dual_buff=True)         # Reset and initialize the sensor. It will
                                     # run automatically, call sensor.run(0) to stop
sensor.set_pixformat(sensor.RGB565)  # Set pixel format to RGB565 (or GRAYSCALE)
sensor.set_framesize(sensor.QVGA)    # Set frame size to QVGA (320x240)
sensor.set_windowing((224, 224))
sensor.skip_frames(time = 1000)      # Wait for settings take effect.
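Initialize the LCD with a clock frequency of 15 MHz.
Reset and initialize the camera in double-buffer mode, set the pixel format to RGB565, set the frame size to QVGA (320x240), and crop the frame to a 224x224 window.
Skip frames for 1000 ms (the argument is a time, not a frame count) so that the new settings can take effect.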
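The window arithmetic can be sketched in plain Python. This is only an illustration: it assumes that passing a (width, height) tuple to sensor.set_windowing selects a window centered in the frame, as in OpenMV-style firmware, and the helper name centered_window is hypothetical.

```python
def centered_window(frame_w, frame_h, win_w, win_h):
    """Return (x, y, w, h) of a win_w x win_h window centered in the frame.

    Assumption: the sensor driver centers the window when only a size is given.
    """
    if win_w > frame_w or win_h > frame_h:
        raise ValueError("window must fit inside the frame")
    x = (frame_w - win_w) // 2
    y = (frame_h - win_h) // 2
    return (x, y, win_w, win_h)


# QVGA frame (320x240) cropped to the 224x224 window used in the example
roi = centered_window(320, 240, 224, 224)
print(roi)  # (48, 8, 224, 224)
```

Under that assumption, 48 columns are dropped on each side and 8 rows at the top and bottom, so the digit should be held near the center of the camera's view.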
clock = time.clock()                 # Create a clock object to track the FPS.
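Create a clock object for tracking the frame rate. Note that the example never calls clock.tick() or clock.fps(), so no frame rate is actually reported.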
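To show what tracking the frame rate means, here is a minimal sketch in plain Python. FpsCounter is a hypothetical stand-in, not the MaixPy clock implementation, and timestamps are passed in by hand so the result is deterministic.

```python
class FpsCounter:
    """Hypothetical stand-in for a frame clock: call tick() once per frame."""

    def __init__(self):
        self._last = None   # timestamp of the previous frame, in ms
        self._fps = 0.0

    def tick(self, now_ms):
        # Frame rate is the reciprocal of the time between two consecutive ticks.
        if self._last is not None and now_ms > self._last:
            self._fps = 1000.0 / (now_ms - self._last)
        self._last = now_ms

    def fps(self):
        return self._fps


counter = FpsCounter()
for t in (0, 50, 100):      # three frames, 50 ms apart
    counter.tick(t)
print(counter.fps())  # 20.0
```

On the board, the usual pattern is to call clock.tick() at the top of the loop and print clock.fps() at the end of each iteration.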
kpu = KPU()
kpu.load_kmodel("/sd/KPU/mnist/uint8_mnist_cnn_model.kmodel")
Create the KPU object and load the model from the SD card at /sd/KPU/mnist/uint8_mnist_cnn_model.kmodel.
while True:
    gc.collect()
    img = sensor.snapshot()
    img_mnist1 = img.to_grayscale(1)         # convert to gray
    img_mnist2 = img_mnist1.resize(112, 112)
    a = img_mnist2.invert()                  # invert picture as mnist need
    a = img_mnist2.strech_char(1)            # preprocessing pictures, eliminate dark corner
    a = img_mnist2.pix_to_ai()
    out = kpu.run_with_output(img_mnist2, getlist=True)
    max_mnist = max(out)
    index_mnist = out.index(max_mnist)
    #score = KPU.sigmoid(max_mnist)
    display_str = "num: %d" % index_mnist
    print(display_str)
    a = img.draw_string(4, 3, display_str, color=(0, 0, 0), scale=2)
    lcd.display(img)
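Run garbage collection to free memory.
Capture a frame and convert it to a grayscale image.
Downscale the grayscale image to 112x112.
Invert the image, because MNIST digits are light strokes on a dark background.
Preprocess the image with strech_char to remove dark corners.
Convert the image to the format the KPU can process (pix_to_ai).
Run the model on the KPU and fetch the output as a list.
Find the maximum value and its index; the index is the recognized digit.
Format the result as a string and print it to the console.
Draw the string on the image and display the image on the LCD.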
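The final steps (take the largest output and use its index as the digit) can be tried off the board in plain Python. The sketch below also derives a confidence value with a softmax; that part is an assumption, since the source only hints at a score in a commented-out line, and it presumes the model returns raw scores rather than probabilities. The function name classify and the sample scores are made up for illustration.

```python
import math


def classify(scores):
    """Return (digit, confidence) from a list of raw class scores."""
    # Index of the largest score; on ties the first one wins,
    # which matches out.index(max(out)) in the example.
    digit = max(range(len(scores)), key=lambda i: scores[i])
    # Softmax, shifted by the top score for numerical stability.
    top = scores[digit]
    exps = [math.exp(s - top) for s in scores]
    confidence = exps[digit] / sum(exps)
    return digit, confidence


# Made-up scores for illustration only, not real model output
example = [0.1, 0.3, 0.2, 4.0, 0.1, 0.0, 0.2, 0.5, 0.3, 0.1]
digit, confidence = classify(example)
print("num: %d (%.2f)" % (digit, confidence))
```

A confidence value like this could be used to suppress the display when no digit is in view, because the plain argmax always reports some digit even for an empty frame.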
kpu.deinit()
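Release the KPU resources. As written, the while True loop has no exit condition, so this call is never reached; it takes effect only if the loop is given a way to end.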
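One common way to make sure the cleanup call actually runs is to wrap the loop in try/finally. The sketch below shows the pattern in plain Python; FakeKPU and main_loop are hypothetical stand-ins so that it runs without the board, and it is not taken from the original example.

```python
class FakeKPU:
    """Hypothetical stub standing in for maix.KPU so the sketch runs anywhere."""

    def __init__(self):
        self.released = False

    def run(self, frame):
        if frame is None:
            raise RuntimeError("no frame")
        return frame

    def deinit(self):
        self.released = True


def main_loop(kpu, frames):
    processed = 0
    try:
        for frame in frames:   # stands in for `while True:` plus sensor.snapshot()
            kpu.run(frame)
            processed += 1
    except RuntimeError:
        pass                   # leave the loop on error
    finally:
        kpu.deinit()           # always runs, even when the loop raised
    return processed


kpu = FakeKPU()
count = main_loop(kpu, [1, 2, None, 4])
print(count, kpu.released)  # 2 True
```

The same structure applies on the board: put the while True loop inside try and kpu.deinit() inside finally.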