Custom Keras model running on OpenCV AI Kit (OAK-1)
You have the peanut butter (a TF/Keras model), and you have the chocolate (an OAK-1). Now how do you get them to work together so the combination is amazing?
In this article I am going to take you through all of the steps I went through to run a custom-built Keras model from a previous project on the OAK-1 device. There may well be better ways to handle this process, and if you know of one, please leave a comment.
Building a Face Mask Detector
Wait, don't leave. I realize the world does not need another face mask detector; however, this article is really about how to migrate a model to run on an OAK-1. Once we understand the process we can start to look at different models.
The model I am going to use came from the PyImageSearch blog, "COVID-19: Face Mask Detector with OpenCV, Keras/TensorFlow, and Deep Learning" by Adrian Rosebrock. I am not going to cover the development of the model, but if you are interested in seeing how to use transfer learning to build this detector, please take a look at his post.
I am going to pick up where his blog ends.
High Level Process
Luxonis, the company that created the 'OpenCV AI Kit' devices, provides documentation on how to convert a model to run on the 'MyriadX' processor.
The image below (from the Luxonis documentation, via the OpenCV Courses site) highlights the steps we are going to cover.
As you can see, there are not many steps involved. However, making sure you get each step correct can be challenging at first.
Step 1: Start with a frozen trained TensorFlow Model
This is where you should follow along with the PyImageSearch blog and then come back here when you are done.
The only thing we need to change is the way we save the model. In the original blog post, Adrian saved the model like so:
# serialize the model to disk
print("[INFO] saving mask detector model...")
model.save(args["model"], save_format="h5")
I added another line:
model.save('mask_detector')
This creates a directory called mask_detector, which contains a file called saved_model.pb. This is the frozen TensorFlow model that we need to start with.
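If you want a quick sanity check before moving on, you can reload the SavedModel directory and confirm the input signature matches what we expect (this check is optional and not part of the conversion itself):

import tensorflow as tf

# reload the SavedModel directory and inspect the expected input
model = tf.keras.models.load_model("mask_detector")
print(model.inputs)  # should show a (None, 224, 224, 3) float32 input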
Step 2: Use OpenVINO Model Optimizer
The VPU (vision processing unit) that powers the OAK is the MyriadX from Intel, which uses the OpenVINO platform. As such, we need to convert TensorFlow models into the format understood by OpenVINO.
There are a couple of great resources for background material.
- Towards Data Science Medium article, in which the author covers the parameters needed to run the model optimizer.
- OpenVINO Converting a Model to Intermediate Representation (IR)
In this step we are going to use the OpenVINO Model Optimizer to generate an intermediate representation, with a model config file that has the .xml extension, and a model weights file that has a .bin extension.
Before we can look at the command line to run the model optimizer we have to understand a little about the model used and the preprocessing that happened.
The Luxonis devices' cameras output values as:
- BGR color format
- Pixel values in the 0 to 255 range
The base model used was MobileNetV2, along with the Keras MobileNetV2 preprocessing. This means the image characteristics need to be:
- Image size of 224x224
- Image must be in RGB format
- Input pixel values are scaled between -1 and 1
- Input shape to the model is [1,224,224,3]
The image size requirement will be handled when we create the DepthAI pipeline. The color format, pixel scaling, and input shape will be handled with the model optimization.
In my version of the PyImageSearch blog I installed an additional OpenVINO library:
pip install openvino-dev
This package gives us access to the model optimizer script, which can be invoked as a module:
python -m mo <additional parameters to be discussed soon>
Let's take a look at the options for the 'mo' script that we will use to handle the preprocessing for our model.
Image must be in RGB format
- --reverse_input_channels
Because the OAK device comes from OpenCV, and OpenCV uses the BGR color format, we have to reverse the input channels. We could handle this in the DepthAI pipeline, but I think it makes the most sense to make it part of the model optimizer conversion.
Input pixel values are scaled between -1 and 1
The goal is to take the 0 to 255 image values from the OAK device and scale them to -1 to 1.
This will be a combination of a couple of values. I encourage you to read the OpenVINO Model Optimizer link above; it describes the parameters that can be used to scale the input. It was not initially obvious to me how to handle the conversion; only after talking with the Luxonis team and reading the Towards Data Science article did it become clear to me.
To handle the conversion we actually need two parameters:
- --mean_values [127.5,127.5,127.5] --scale_values [127.5,127.5,127.5]
The three values correspond to the R, G, and B channels, and the scaling happens using the following formula (using just the red channel as an example):
RScaled = (RValue - mean_value) / scale_value
RScaled(255) = (255 - 127.5) / 127.5 = 1
RScaled(0) = (0 - 127.5) / 127.5 = -1
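If you want to verify the arithmetic yourself, a few lines of NumPy will do it:

import numpy as np

# apply (value - mean) / scale to the pixel extremes
pixels = np.array([0.0, 127.5, 255.0])
print((pixels - 127.5) / 127.5)  # [-1.  0.  1.]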
Input shape to the model is [1,224,224,3]
There are a couple of options here.
--input_shape=[1,224,224,3]
The key here is the ‘1’ for the first dimension because we are going to pass a single image into the deployed model.
The other option is to specify the batch size:
--batch 1
This assumes [224,224,3] for the input shape, with a single image being passed through the model at a time.
You only need to specify either input_shape or batch, but not both in this case.
Finally we can call the model optimizer (‘mo’) script with the necessary parameters.
python -m mo --reverse_input_channels --batch 1 --mean_values [127.5,127.5,127.5] --scale_values [127.5,127.5,127.5] --saved_model_dir ./mask_detector --output_dir openvino_model
Running the above command will produce an OpenVINO Intermediate Representation (IR), consisting of a .xml and a .bin file, in the 'openvino_model' directory.
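If you want to confirm the IR loads correctly before moving on, you can read it back with the OpenVINO Python API. Here is a sketch assuming the 2021.x IECore interface (newer OpenVINO releases use a different API):

from openvino.inference_engine import IECore

# read the IR back and print its input/output names as a sanity check
ie = IECore()
net = ie.read_network(model="openvino_model/saved_model.xml",
                      weights="openvino_model/saved_model.bin")
print(list(net.input_info.keys()), list(net.outputs.keys()))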
Step 3: Compile OpenVINO Intermediate Representation into Myriad blob
Once we have the intermediate representation we need to compile it into an OpenVINO model blob.
There are two ways to do this:
- Luxonis Online Myriad Blob Converter
- Luxonis blobconverter module
The first option uses the Luxonis online blob converter web application. You can select the .xml and .bin files, and it will convert the model to a Myriad blob and let you save it.
If you would like to make this part of a local build process, you can install the blobconverter package and call it directly.
First, install the blobconverter package:
pip install blobconverter
You can find information on the options on the blobconverter pypi page.
blobconverter --openvino-xml ./openvino_model/saved_model.xml --openvino-bin ./openvino_model/saved_model.bin --shaves 6 --output-dir ./openvino_model --no-cache
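The package also exposes a Python API, so the same conversion can live inside a build script. Here is a minimal sketch based on the options documented on the PyPI page:

import blobconverter

# compile the OpenVINO IR into a MyriadX blob for 6 SHAVE cores
blob_path = blobconverter.from_openvino(
    xml="./openvino_model/saved_model.xml",
    bin="./openvino_model/saved_model.bin",
    data_type="FP16",
    shaves=6,
    output_dir="./openvino_model",
)
print(blob_path)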
The script I used to convert the trained model can be found in my gist. It is part of my GitHub repo containing my work going through the PyImageSearch blog post; you can find the OpenVINO IR files and the Myriad blob file in the openvino_models directory.
Step 4: Run the custom model on OAK device
Once we have a Myriad blob file, we are ready to run the model on an OAK device.
For this, I have the example file in my depthai-sandbox repo in the custom_models/face_mask directory.
Since we covered most of this pipeline in previous articles, I want to focus on the two new nodes: ImageManip and NeuralNetwork.
ImageManip
In this pipeline, the ColorCamera was set up to produce 400x400 images for display purposes. However, the model expects 224x224, so an ImageManip node was added to resize the image.
# create an ImageManip node to resize camera frames to the model's 224x224 input
manip = pipeline.create(depthai.node.ImageManip)
manip.initialConfig.setResize(224, 224)
NeuralNetwork
The NeuralNetwork is created and the path to the MyriadX blob file is set.
# create the NeuralNetwork node and point it at the compiled MyriadX blob
custom_nn = pipeline.create(depthai.node.NeuralNetwork)
custom_nn.setBlobPath("./saved_model_openvino_2021.4_6shave.blob")
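To tie the nodes together, the camera preview feeds the resizer, the resizer feeds the network, and the results come back to the host through an XLinkOut node. A sketch, assuming a ColorCamera node named cam as in the previous articles:

# wire camera -> resizer -> neural network -> host
cam.preview.link(manip.inputImage)
manip.out.link(custom_nn.input)

xout_nn = pipeline.create(depthai.node.XLinkOut)
xout_nn.setStreamName("nn")
custom_nn.out.link(xout_nn.input)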
The last element is to read the predictions from the neural network queue. To do that, we need to read the values from the last layer of the model. There are two ways to get the name of the last layer as it is known in the OpenVINO IR:
- Open the .xml representation and search for a layer with 'Softmax' in its name:
<layer id="290" name="StatefulPartitionedCall/model/dense_1/Softmax" type="SoftMax" version="opset1">
- Run the face_mask_tf.py script with the following snippet added:
# list every layer name, type, and shape reported by the IR
for layer in in_nn.getAllLayers():
    print(f"Layer name: {layer.name}, Type: {layer.dataType}, Dimensions: {layer.dims}")
This will print the name of every layer, including the last one. Using that name, call 'getLayerFp16' on the returned NeuralNetwork message; this returns the mask and no-mask probabilities.
# read the prediction from the output layer of the custom model
mask, no_mask = in_nn.getLayerFp16('StatefulPartitionedCall/model/dense_1/Softmax')
# print the results of the prediction
print(f"Mask[{round(mask,1)}], No Mask[{round(no_mask,1)}]")
Results
With the OAK version running on the left and the model running on my MacBook Pro on the right, you can see that both are predicting mask/no mask really well. It was not my intention to make the two visually identical, just to see whether I could run the same model on the OAK device as on my MacBook.
Closing Thoughts
If you made it this far, thank you for reading this article and I hope you found it helpful.
While I believe the process to convert another model would follow the same steps, the details, especially what the model expects for input and how the data needs to be prepared, will likely change.