Converting for i.MX 93¶
To run models with NPU acceleration on devices supported by the eIQ AI Toolkit, the model must first be converted to the quantized TFLite format. Depending on the specific platform, an additional conversion step may be required.
For example, when deploying a model on i.MX 93 devices with NPU acceleration, the quantized TFLite model must be converted into a format optimized for execution on the i.MX 93 NPU. This process modifies the original quantized TFLite model to ensure compatibility and performance.
eIQ AI Toolkit uses the Olive (https://microsoft.github.io/Olive/) framework for these conversions. Olive applies passes, where each pass represents a single transformation of a model. Passes can also be chained together to perform multiple transformations in sequence.
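As an illustration, a chained run is expressed by listing multiple passes in a single request, executed in the order given. The snippet below mirrors the request body used later in this guide; ExamplePass is a hypothetical name used only to show the shape of a chain:
[ ]:
# Illustration only: passes execute in the order they appear in the list.
# "ExamplePass" is a hypothetical name used to show the shape of a chain;
# VelaConversion is the pass used later in this guide.
chained_passes = [
    {"type": "ExamplePass", "config": {}},
    {"type": "VelaConversion", "config": {}},
]
print(chained_passes)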
This guide demonstrates how to:
Load a quantized TFLite model into the eIQ AI Toolkit
Convert the quantized TFLite model into an i.MX 93 NPU-compatible format
Download the converted model for use in an i.MX 93 application
This guide requires the eIQ AI Toolkit backend to be running. If you haven’t set it up yet, refer to the following tutorial: eIQ AI Toolkit setup & launch
[ ]:
import requests
from pathlib import Path
# Set your eIQ AI Toolkit URL:
AI_TOOLKIT_BACKEND_URL = "http://localhost:8000"
Load quantized TFLite model¶
Loading any type of model into the eIQ AI Toolkit involves two steps:
Specify the model metadata
Upload the model file
The metadata must always include the type of model being uploaded. For example, if you are uploading a PyTorch model, set the type to pytorch; for an ONNX model, use onnx, and so on.
For a TFLite model, no additional metadata is required.
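As an illustration, the metadata for these types differs only in the model_type value (some types may require fields beyond what is shown here; this guide covers only the TFLite case):
[ ]:
# Illustration: the model_type value identifies the format of the uploaded model
pytorch_metadata = {"model_type": "pytorch"}  # PyTorch model
onnx_metadata = {"model_type": "onnx"}  # ONNX model
tflite_metadata = {"model_type": "tflite"}  # TFLite; no additional metadata required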
1. Prepare quantized TFLite model¶
If you already have a model prepared, simply update the path to point to its location. If you do not yet have a model, set the path to the location where the model should be saved. (Refer to the following sections for instructions on downloading a sample model.)
[ ]:
# Modify the path to your TFLite model
model_path = Path("path_to_quantized_tflite_model.tflite")
Use the following script to download the example model:
Note: Skip this step if you already have your own model.
[ ]:
example_model_url = "https://eiq.nxp.com/training-materials/_misc/models/mobilenet_v3-small_224_1.0_uint8.tflite"
with open(model_path, "wb") as f:
    response = requests.get(url=example_model_url)
    response.raise_for_status()  # fail early instead of writing an error body to the file
    f.write(response.content)
2. Specify metadata¶
In this section, we will upload the model metadata to the eIQ AI Toolkit. This step involves specifying only the type of model being uploaded.
[ ]:
MODELS_API_URL = f"{AI_TOOLKIT_BACKEND_URL}/models"
model_metadata = {
    "model_type": "tflite",
}
response = requests.post(MODELS_API_URL, json=model_metadata)
response_data = response.json()
model_uuid = response_data["data"]["model"]["uuid"]
3. Upload model¶
Now we can upload the model file.
[ ]:
with open(model_path, "rb") as model_file:
response = requests.post(
url=f"{AI_TOOLKIT_BACKEND_URL}/models/{model_uuid}", # Model identifier is part of the request URL
files={
"model_file": model_file,
}
)
print(response.json())
After uploading the model metadata and file, you can verify the model’s registration and readiness status using the following endpoint. If the status remains in_progress, call the endpoint repeatedly until it changes to ready.
[ ]:
response = requests.get(f"{AI_TOOLKIT_BACKEND_URL}/models/{model_uuid}")
data = response.json()
print(f'Model status: {data["data"]["model"]["status"]}')
print(f'Model status description: {data["data"]["model"]["status_description"]}')
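If you prefer not to re-run the cell by hand, a small polling loop can wait for the status to change. This is a minimal sketch; the 1-second interval is an arbitrary choice:
[ ]:
import time

# Poll the model endpoint until the status leaves "in_progress"
while True:
    response = requests.get(f"{AI_TOOLKIT_BACKEND_URL}/models/{model_uuid}")
    status = response.json()["data"]["model"]["status"]
    if status != "in_progress":
        break
    time.sleep(1)  # arbitrary polling interval

print(f"Model status: {status}")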
Conversion to i.MX 93-specific format¶
Now we can initiate the conversion by calling this endpoint, which executes the VelaConversion pass.
[ ]:
OPTIMIZATIONS_API_URL = f"{AI_TOOLKIT_BACKEND_URL}/optimizations"
RUN_OPTIMIZATION_API_URL = f"{OPTIMIZATIONS_API_URL}/run"
pass_config = {
    "model_uuid": model_uuid,
    "passes": [
        {
            "type": "VelaConversion",
            "config": {
                # VelaConversion does not require any configuration parameters
            },
        }
    ],
}
optimization_response = requests.post(RUN_OPTIMIZATION_API_URL, json=pass_config)
optimization_response_data = optimization_response.json()
optimization_uuid = optimization_response_data["data"]["optimization"]["uuid"]
The conversion process is now running. You can check its status by calling this endpoint repeatedly until the status changes to success.
[ ]:
response = requests.get(f"{OPTIMIZATIONS_API_URL}/{optimization_uuid}")
data = response.json()
status = data["data"]["optimization"]["status"]
print(f"Conversion status: {status}")
if status == "success":
artifact_id = data["data"]["optimization"]["artifacts"][0]["artifact_id"]
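As with the model upload, you can poll instead of re-running the cell manually. The sketch below retries a bounded number of times so a failed conversion does not loop forever; the retry count and interval are arbitrary choices:
[ ]:
import time

# Poll the optimization endpoint until the VelaConversion pass reports success;
# the retry bound keeps a failed run from looping forever
for _ in range(60):  # arbitrary upper bound on retries
    response = requests.get(f"{OPTIMIZATIONS_API_URL}/{optimization_uuid}")
    optimization = response.json()["data"]["optimization"]
    if optimization["status"] == "success":
        artifact_id = optimization["artifacts"][0]["artifact_id"]
        break
    time.sleep(5)  # arbitrary polling interval
else:
    raise RuntimeError(f"Conversion did not succeed, last status: {optimization['status']}")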
Download converted model¶
Once the conversion is complete and successful, you can download the resulting model and use it in your i.MX 93 applications.
[ ]:
# Change the destination path to your preferred location
dest_model_path = Path("imx_converted_model.tflite")
[ ]:
download_response = requests.get(f"{AI_TOOLKIT_BACKEND_URL}/optimizations/{optimization_uuid}/resources/{artifact_id}")
download_response.raise_for_status()  # ensure the request succeeded before writing the file
with dest_model_path.open("wb") as f:
    f.write(download_response.content)
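A quick sanity check confirms the file was written (the size shown will vary by model):
[ ]:
# Confirm the converted model was written to disk
print(f"Saved {dest_model_path.stat().st_size} bytes to {dest_model_path}")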