i.MX 95¶
This guide walks you through the process of profiling a TensorFlow Lite model on the i.MX 95 MPU using the eIQ AI Toolkit online profiling feature. It covers the essential steps required to register your hardware and model, and highlights useful endpoints for profiling tasks.
What You’ll Learn:
How to register the i.MX 95 MPU device in eIQ AI Toolkit
How to upload and register a TF Lite model for profiling
Key eIQ AI Toolkit API endpoints relevant to model profiling
Note: This guide is specifically focused on profiling TF Lite models. Other model formats may require different steps or configurations.
Note: This guide was developed and run using Python 3.11.
This guide requires the eIQ AI Toolkit backend to be running. If you haven’t set it up yet, please refer to the following tutorial: eIQ AI Toolkit setup & launch
[ ]:
import requests
from pathlib import Path
# Set your eIQ AI Toolkit url:
AI_TOOLKIT_BACKEND_URL = "http://localhost:8000"
Device¶
Ensure that this device is properly set up and accessible by the eIQ AI Toolkit backend before initiating the profiling process:
[ ]:
device_url = "http://your_device_url"
device_name = "your_custom_device_name"
response = requests.post(
    url=f"{AI_TOOLKIT_BACKEND_URL}/devices",
    json={
        "url_address": device_url,
        "name": device_name,
    },
)
print(response.json())
Once the device has been successfully added to eIQ AI Toolkit, its details can be retrieved using the following endpoint:
[ ]:
response = requests.get(
    url=f"{AI_TOOLKIT_BACKEND_URL}/devices/info",
    params={
        "url_address": device_url,
    },
)
print(response.json())
To proceed with profiling, the device details must report device_available: True.
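As a convenience, the availability flag can be pulled out of the response with a small helper. This is a sketch only: it assumes the /devices/info payload nests the device details under data → device, mirroring the other responses in this guide; adjust the keys if your backend version structures the payload differently.

```python
def is_device_available(info_payload: dict) -> bool:
    """Return the device_available flag from a /devices/info response.

    Assumes the payload nests device details under data -> device;
    returns False if any expected key is missing.
    """
    device = info_payload.get("data", {}).get("device", {})
    return bool(device.get("device_available", False))
```

For example, `is_device_available(response.json())` can gate the rest of the notebook instead of inspecting the printed dict by eye.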
Model¶
If you already have a trained model ready, simply update the path to point to its location. If you don’t have a trained model yet, set the path to a location where the model should be saved. (See the following sections for instructions on how to download a sample model.)
[ ]:
model_path = Path("your_model_path.tflite")
Use the following script to download the example model:
Note: Skip this step if you already have your own model.
[ ]:
example_model_url = "https://eiq.nxp.com/training-materials/_misc/models/mobilenet_v3-small_224_1.0_uint8.tflite"
with open(model_path, "wb") as f:
    response = requests.get(url=example_model_url)
    f.write(response.content)
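Before uploading, it can be worth sanity-checking that the downloaded file really is a TF Lite flatbuffer: the format stores the file identifier "TFL3" at byte offset 4. A minimal check (not part of the toolkit API, just a local safeguard against truncated or HTML error-page downloads):

```python
def looks_like_tflite(path) -> bool:
    """Check for the TFLite flatbuffer file identifier ("TFL3" at offset 4)."""
    with open(path, "rb") as f:
        header = f.read(8)
    return len(header) == 8 and header[4:8] == b"TFL3"
```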
Upload model to eIQ AI Toolkit¶
Uploading a model to eIQ AI Toolkit consists of two steps:
Upload Metadata - This includes information such as the model name, format (e.g., TF Lite), input/output shapes, and other relevant attributes.
Upload Model File - After the metadata is registered, the actual model file (e.g., .tflite) is uploaded to the platform.
Submit the metadata:
[ ]:
response = requests.post(
    url=f"{AI_TOOLKIT_BACKEND_URL}/models",
    params={
        "model_name": "your_custom_model_name",
    },
    json={
        "model_type": "tflite"
    },
)
data = response.json()
print(data)
model_uuid = data["data"]["model"]["uuid"]  # Assigned model identifier
Upload the model file:
[ ]:
with open(model_path, "rb") as model_file:
    response = requests.post(
        url=f"{AI_TOOLKIT_BACKEND_URL}/models/{model_uuid}",  # Model identifier is part of the request URL
        files={
            "model_file": model_file,
        },
    )
print(response.json())
After uploading the model metadata and file, you can verify the model’s registration and readiness status using the following endpoint:
[ ]:
response = requests.get(f"{AI_TOOLKIT_BACKEND_URL}/models/{model_uuid}")
data = response.json()
print(f'Model status: {data["data"]["model"]["status"]}')
print(f'Model status description: {data["data"]["model"]["status_description"]}')
The model can be used for profiling once its status is reported as ready.
Convert model to Neutron NPU specific format¶
Before running the conversion, two important parameters must be configured:
Neutron target
Neutron flavor (version)
The code below displays all available options for these parameters. Neutron target should be set to imx95. Neutron flavor should match the exact version of Yocto you are using.
If you are following this guide to convert and use a model on your device, make sure to adjust these parameters accordingly.
[ ]:
available_passes_response = requests.get(f"{AI_TOOLKIT_BACKEND_URL}/optimizations/passes")
available_passes = available_passes_response.json()
# This prints configuration parameters only for NeutronConversion pass. Feel free to change it and explore
# other passes as well.
neutron_conversion_pass_config = next(_pass for _pass in available_passes["data"]["passes"] if _pass["type"] == "NeutronConversion")
print(neutron_conversion_pass_config)
[ ]:
OPTIMIZATIONS_API_URL = f"{AI_TOOLKIT_BACKEND_URL}/optimizations"
RUN_OPTIMIZATION_API_URL = f"{OPTIMIZATIONS_API_URL}/run"
pass_config = {
"model_uuid": model_uuid,
"passes": [
{
"type": "NeutronConversion",
"config": {
"target": "imx95",
"flavor": "LF6.12.49_2.2.0" # Change this to match your Yocto version
}
}
]
}
optimization_response = requests.post(RUN_OPTIMIZATION_API_URL, json=pass_config)
optimization_response_data = optimization_response.json()
optimization_uuid = optimization_response_data["data"]["optimization"]["uuid"]
The conversion process is now running. You can check its status by calling this endpoint repeatedly until the status changes to success.
[ ]:
response = requests.get(f"{OPTIMIZATIONS_API_URL}/{optimization_uuid}")
data = response.json()
status = data["data"]["optimization"]["status"]
print(f"Conversion status: {status}")
if status == "success":
    artifact_id = data["data"]["optimization"]["artifacts"][0]["artifact_id"]
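Instead of re-running the cell above by hand, the status check can be wrapped in a small polling loop. This is a generic sketch: fetch_status is any zero-argument callable returning the current status string, and the terminal status names are assumptions based on the values used in this guide.

```python
import time

def poll_until(fetch_status, done=("success",), failed=("failed",),
               interval=2.0, timeout=300.0):
    """Call fetch_status() every `interval` seconds until it returns a
    terminal status or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status in done:
            return status
        if status in failed:
            raise RuntimeError(f"Job ended with status: {status}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Still '{status}' after {timeout} seconds")
        time.sleep(interval)
```

For the conversion above, a suitable callable would be one that GETs the optimization endpoint and extracts data → optimization → status, e.g. `lambda: requests.get(f"{OPTIMIZATIONS_API_URL}/{optimization_uuid}").json()["data"]["optimization"]["status"]`.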
Once the conversion is complete and successful, you can copy the model produced by the NeutronConversion pass into My models and use it on your Neutron device.
Profiling¶
To start online profiling, invoke the endpoint /profiling/run_on_device.
You will need the following parameters:
Model identifier – the unique ID of the model you uploaded
Device URL – the address of the target device (e.g., the i.MX 95 MPU)
Delegate – the profiling hardware parameter (see cell below)
Run name (optional) – a custom name for the profiling session, useful for tracking and organizing results
[ ]:
profiling_delegate = "npu" # The two available options are "cpu" and "npu"
profiling_run_name = "example_online_profiling_tflite_model"
print(f"Model identifier: {model_uuid}")
print(f"Device url: {device_url}")
print(f"Delegate: {profiling_delegate}")
print(f"Custom profiling run name: {profiling_run_name}")
Request the profiling:
[ ]:
response = requests.post(
    url=f"{AI_TOOLKIT_BACKEND_URL}/profiling/run_on_device",
    json={
        "model_uuid": model_uuid,
        "target_url": device_url,
        "delegate": profiling_delegate,
        "name": profiling_run_name,
    },
)
data = response.json()
print(data)
profiling_uuid = data["data"]["profiling"]["uuid"]  # Assigned identifier of the requested profiling run
After initiating the profiling job, you can monitor its progress using the following API call:
[ ]:
response = requests.get(
    url=f"{AI_TOOLKIT_BACKEND_URL}/profiling/{profiling_uuid}"  # Profiling run identifier is part of the request URL
)
data = response.json()
print(data)
print(f'Profiling status: {data["data"]["profiling"]["status"]}')
print(f'Profiling status description: {data["data"]["profiling"]["status_description"]}')
Once the profiling status is marked as success, you can proceed to analyze the results. If the status is still in_progress, re-run the status check (the cell above) until the profiling completes.
[ ]:
profiling_data = data["data"]["profiling"]
print(profiling_data)
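The profiling payload is nested JSON, so a pretty-printed dump is much easier to scan than the raw dict printed above. A small formatting helper (the exact structure of the payload depends on your backend version, so this only changes how it is displayed):

```python
import json

def pretty(payload: dict) -> str:
    """Render a nested response payload as indented, key-sorted JSON."""
    return json.dumps(payload, indent=2, sort_keys=True, default=str)
```

For example, `print(pretty(profiling_data))` shows the per-field structure of the profiling results with one key per line.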