ONNX to TF Lite¶
eIQ AI Toolkit uses the Olive (https://microsoft.github.io/Olive/) framework for model quantization and conversion. This framework applies passes, where each pass represents a single transformation of a model. Passes can also be chained together to perform multiple transformations in sequence.
This guide explains how to use the TFLiteConversion pass to convert a quantized ONNX model into a quantized TFLite model. It demonstrates how to use the eIQ AI Toolkit API to:
Load a quantized ONNX model into the application
Run the conversion to a TFLite model
Retrieve the converted model
This guide requires the eIQ AI Toolkit backend to be running. If you haven’t set it up yet, please refer to the following tutorial: eIQ AI Toolkit setup & launch
[ ]:
import requests
from pathlib import Path
# Set your eIQ AI Toolkit url:
AI_TOOLKIT_BACKEND_URL = "http://localhost:8000"
Load ONNX model¶
Loading any model into the application involves two steps:
Specify the model metadata
Upload the model file
The metadata must include the model type you are uploading. For example, use pytorch for a PyTorch model or onnx for an ONNX model.
For ONNX models, no additional metadata is required.
1. Prepare ONNX model¶
If you already have a model, update the path to point to its location. If you don’t have a model yet, set the path to the location where the model will be saved. (See the following sections for instructions on downloading a sample model.)
[ ]:
# Modify the path to your ONNX model
model_path = Path("path_to_quantized_onnx_model.onnx")
Use the following script to download the example model:
Note: Skip this step if you already have your own model.
[ ]:
example_model_url = "https://eiq.nxp.com/training-materials/_misc/models/quantized_model.onnx"
response = requests.get(url=example_model_url)
response.raise_for_status()  # Fail early on HTTP errors instead of writing an error page to disk
with open(model_path, "wb") as f:
    f.write(response.content)
2. Specify metadata¶
In this section, we’ll upload the model metadata to eIQ AI Toolkit. This step only involves specifying the type of model you are uploading.
[ ]:
MODELS_API_URL = f"{AI_TOOLKIT_BACKEND_URL}/models"
model_metadata = {
    "model_type": "onnx",
}
response = requests.post(MODELS_API_URL, json=model_metadata)
response.raise_for_status()  # Surface HTTP errors before parsing the body
response_data = response.json()
model_uuid = response_data["data"]["model"]["uuid"]
3. Upload model¶
Now we can upload the model file.
[ ]:
with open(model_path, "rb") as model_file:
    response = requests.post(
        url=f"{AI_TOOLKIT_BACKEND_URL}/models/{model_uuid}",  # Model identifier is part of the request URL
        files={
            "model_file": model_file,
        },
    )
print(response.json())
After uploading the model metadata and file, you can verify its registration and readiness using the following endpoint. If the status is still in_progress, repeat the check until it changes to ready.
[ ]:
response = requests.get(f"{AI_TOOLKIT_BACKEND_URL}/models/{model_uuid}")
data = response.json()
print(f'Model status: {data["data"]["model"]["status"]}')
print(f'Model status description: {data["data"]["model"]["status_description"]}')
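Rather than re-running the cell above by hand, the repeated check can be wrapped in a small polling helper. This is only a sketch: the `fetch_status` callable stands in for the GET request in the previous cell, and the status values (ready, in_progress) are the ones described in this guide.

```python
import time

def poll_until(fetch_status, done_states=("ready",), interval_s=2.0, max_attempts=30):
    """Call fetch_status() repeatedly until it returns a terminal status.

    fetch_status is any zero-argument callable returning a status string,
    e.g. a wrapper around the GET /models/{uuid} request above.
    """
    for _ in range(max_attempts):
        status = fetch_status()
        if status in done_states:
            return status
        time.sleep(interval_s)
    raise TimeoutError(f"Status still not in {done_states} after {max_attempts} attempts")
```

With the request from the previous cell it could be used as `poll_until(lambda: requests.get(f"{AI_TOOLKIT_BACKEND_URL}/models/{model_uuid}").json()["data"]["model"]["status"])`; adjust the interval and attempt limit to your setup.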
Convert to quantized TFLite¶
In the code below, you’ll notice that no parameters are set for the TFLiteConversion pass, so all values use their defaults. To view the list of available parameters, check the passes endpoint shown below.
[ ]:
available_passes_response = requests.get(f"{AI_TOOLKIT_BACKEND_URL}/optimizations/passes")
available_passes = available_passes_response.json()
# This prints configuration parameters only for TFLiteConversion pass. Feel free to change it and explore
# other passes as well.
tflite_conversion_pass_config = next(
    _pass for _pass in available_passes["data"]["passes"]
    if _pass["type"] == "TFLiteConversion"
)
print(tflite_conversion_pass_config)
Let’s run the conversion.
[ ]:
OPTIMIZATIONS_API_URL = f"{AI_TOOLKIT_BACKEND_URL}/optimizations"
RUN_OPTIMIZATION_API_URL = f"{OPTIMIZATIONS_API_URL}/run"
pass_config = {
    "model_uuid": model_uuid,
    "passes": [
        {
            "type": "TFLiteConversion",
            "config": {
                # Feel free to add specific TFLite conversion parameters here if needed
            },
        }
    ],
}
optimization_response = requests.post(RUN_OPTIMIZATION_API_URL, json=pass_config)
optimization_response_data = optimization_response.json()
optimization_uuid = optimization_response_data["data"]["optimization"]["uuid"]
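As mentioned in the introduction, Olive passes can be chained, and the same request body accepts more than one entry in the passes list. The sketch below only illustrates the payload shape; "SomeOtherPass" is a hypothetical placeholder, not a real pass name — use the passes endpoint shown earlier to discover what is actually available.

```python
import json

# Hypothetical example of chaining two passes in one optimization request.
# "SomeOtherPass" is a placeholder type, not an actual eIQ AI Toolkit pass.
chained_config = {
    "model_uuid": "replace-with-your-model-uuid",
    "passes": [
        {"type": "SomeOtherPass", "config": {}},     # placeholder: runs first
        {"type": "TFLiteConversion", "config": {}},  # runs on the first pass's output
    ],
}
print(json.dumps(chained_config, indent=2))
```

The passes are applied in list order, so the TFLite conversion would receive the output of the preceding transformation.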
The conversion is now running. You can check its status by calling this endpoint. Repeat the check until the status changes to success.
[ ]:
response = requests.get(f"{OPTIMIZATIONS_API_URL}/{optimization_uuid}")
data = response.json()
status = data["data"]["optimization"]["status"]
print(f"Conversion status: {status}")
if status == "success":
    artifact_id = data["data"]["optimization"]["artifacts"][0]["artifact_id"]
Download quantized TFLite model¶
Once the conversion is complete and successful, you can download the resulting model.
[ ]:
# Change the destination path to your preferred location
dest_model_path = Path("quantized_tflite_model.tflite")
[ ]:
download_response = requests.get(f"{AI_TOOLKIT_BACKEND_URL}/optimizations/{optimization_uuid}/resources/{artifact_id}")
with dest_model_path.open("wb") as f:
    f.write(download_response.content)
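As a quick sanity check, a TFLite model is a FlatBuffer whose file identifier "TFL3" sits at byte offset 4. The helper below is a convenience sketch, not part of the eIQ AI Toolkit API, and can confirm the downloaded file at least looks like a TFLite model:

```python
def looks_like_tflite(data: bytes) -> bool:
    """Return True if the bytes carry the TFLite FlatBuffer identifier.

    TFLite files store the identifier "TFL3" at byte offset 4.
    This is a cheap plausibility check, not a full validation.
    """
    return len(data) >= 8 and data[4:8] == b"TFL3"

# Example: check the file written in the previous cell
# print(looks_like_tflite(dest_model_path.read_bytes()))
```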