i.MX 9x with eIQ Neutron NPU

This guide walks you through profiling a TensorFlow Lite model with the eIQ Neutron NPU using the eIQ AI Toolkit simulated profiling feature. It covers the essential steps required to register a model and highlights useful endpoints for profiling tasks.

Simulated profiling does not require access to a physical board with eIQ Neutron NPU. Instead, it provides an estimated performance profile based on representative hardware characteristics.

What you will learn:

  • How to upload and register a TF Lite model for profiling

  • Key eIQ AI Toolkit API endpoints relevant to model profiling

Note: eIQ Neutron NPU profiling supports only TF Lite models. Other model formats require conversion to TF Lite.

Note: This guide was developed and run using Python 3.11.

This guide requires the eIQ AI Toolkit backend to be running. If you haven’t set it up yet, please refer to the following tutorial: eIQ AI Toolkit setup & launch

[ ]:
import requests
from pathlib import Path

# Set your eIQ AI Toolkit url:
AI_TOOLKIT_BACKEND_URL = "http://localhost:8000"
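Before continuing, you can optionally confirm the backend is reachable. This is a minimal sketch; `backend_reachable` is an illustrative helper (not part of the eIQ AI Toolkit API) and only checks that the base URL answers a plain GET:

```python
import requests


def backend_reachable(base_url: str, timeout: float = 3.0) -> bool:
    """Return True if the eIQ AI Toolkit backend answers at base_url."""
    try:
        # Any HTTP response (even an error page) means the server is up
        requests.get(base_url, timeout=timeout)
        return True
    except requests.exceptions.RequestException:
        return False
```

If this returns False, revisit the setup tutorial linked above before running the remaining cells.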

Model

If you already have a trained model ready, simply update the path to point to its location. If you don’t have a trained model yet, set the path to a location where the model should be saved. (See the following sections for instructions on how to download a sample model.)

[ ]:
model_path = Path("your_model_path.tflite")

Use the following script to download the example model:

Note: Skip this step if you already have your own model.

[ ]:
example_model_url = "https://eiq.nxp.com/training-materials/_misc/models/mobilenet_v3-small_224_1.0_uint8.tflite"

response = requests.get(example_model_url)
response.raise_for_status() # Stop here if the download failed

model_path.write_bytes(response.content)
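To catch a corrupted or incomplete download early, you can check the file for the TF Lite flatbuffer identifier. This is a minimal heuristic sketch; `looks_like_tflite` is an illustrative helper, relying on the fact that TF Lite flatbuffer files carry the `TFL3` identifier at byte offset 4:

```python
from pathlib import Path


def looks_like_tflite(path: Path) -> bool:
    """Heuristic: TF Lite flatbuffers store the 'TFL3' file identifier
    in bytes 4..8 of the file."""
    data = path.read_bytes()
    return len(data) > 8 and data[4:8] == b"TFL3"
```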

Upload model to eIQ AI Toolkit

Uploading a model to eIQ AI Toolkit consists of two steps:

  1. Upload Metadata: This includes information such as the model name, format (e.g., TF Lite), input/output shapes, and other relevant attributes.

  2. Upload Model File: After the metadata is registered, the actual model file (e.g., .tflite) is uploaded to the platform.
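The two steps can also be wrapped in a single helper. This is a minimal sketch; `upload_model` and the injectable `session` argument are illustrative, while the endpoints and payloads match the requests shown in this guide:

```python
import requests
from pathlib import Path


def upload_model(base_url: str, model_name: str, model_path: Path, session=None) -> str:
    """Register model metadata, then upload the model file.
    Returns the model identifier (UUID) assigned by the backend."""
    s = session or requests.Session()
    # Step 1: register metadata (model name and format)
    response = s.post(
        f"{base_url}/models",
        params={"model_name": model_name},
        json={"model_type": "tflite"},
    )
    model_uuid = response.json()["data"]["model"]["uuid"]
    # Step 2: upload the model file under the assigned identifier
    with open(model_path, "rb") as model_file:
        s.post(f"{base_url}/models/{model_uuid}", files={"model_file": model_file})
    return model_uuid
```

Passing a custom `session` is only a convenience for testing; in the notebook you can call the endpoints directly as shown below.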

Submit the metadata:

[ ]:
response = requests.post(
    url=f"{AI_TOOLKIT_BACKEND_URL}/models",
    params={
        "model_name": "your_custom_model_name",
    },
    json={
        "model_type": "tflite"
    }
)

data = response.json()
print(data)
model_uuid = data["data"]["model"]["uuid"] # Assigned model identifier

Upload the model file:

[ ]:
with open(model_path, "rb") as model_file:
    response = requests.post(
        url=f"{AI_TOOLKIT_BACKEND_URL}/models/{model_uuid}", # Model identifier is part of the request URL
        files={
            "model_file": model_file,
        }
    )

print(response.json())

After uploading the model metadata and file, you can verify the model’s registration and readiness status using the following endpoint:

[ ]:
response = requests.get(f"{AI_TOOLKIT_BACKEND_URL}/models/{model_uuid}")
data = response.json()
print(f'Model status: {data["data"]["model"]["status"]}')
print(f'Model status description: {data["data"]["model"]["status_description"]}')

The model can be used for profiling once its status is reported as ready.

Profiling

To start simulated profiling, invoke the endpoint /profiling/run_simulated.

You will need the following parameters:

  • Model identifier – the unique ID of the model you uploaded

  • Engine – the engine running the simulation; for the eIQ Neutron NPU, the engine is neutron

  • Parameters – to profile on the eIQ Neutron NPU, a simulation target must be specified

  • Run name (optional) – a custom name for the profiling session, useful for tracking and organizing results

The /profiling/supported_simulated_types endpoint returns information about the simulated profiling parameters:

[ ]:
response = requests.get(f"{AI_TOOLKIT_BACKEND_URL}/profiling/supported_simulated_types")

data = response.json()
print(data)

types = data["data"]["types"]
neutron_types = [x for x in types if x["engine"] == "neutron"]
print(f"Neutron: {neutron_types[0]}")

Set the parameters used for the profiling request:

[ ]:
profiling_engine = "neutron" # The engine simulating the eIQ Neutron NPU
profiling_target = "imx93" # See available target options above
profiling_run_name = "example_imx93_profiling_tflite_model"

print(f"Model identifier: {model_uuid}")
print(f"Engine: {profiling_engine}")
print(f"Target: {profiling_target}")
print(f"Custom profiling run name: {profiling_run_name}")

Request the profiling:

[ ]:
response = requests.post(
    url=f"{AI_TOOLKIT_BACKEND_URL}/profiling/run_simulated",
    json={
        "model_uuid": model_uuid,
        "engine": profiling_engine,
        "parameters": {
            "target": profiling_target
        },
        "name": profiling_run_name,
    }
)

data = response.json()
print(data)
profiling_uuid = data["data"]["profiling"]["uuid"] # Assigned identifier of the requested profiling run

After initiating the profiling job, you can monitor its progress using the following API call:

[ ]:
response = requests.get(
    url=f"{AI_TOOLKIT_BACKEND_URL}/profiling/{profiling_uuid}" # Profiling run identifier is part of the request URL
)

data = response.json()
print(data)
print(f'Profiling status: {data["data"]["profiling"]["status"]}')
print(f'Profiling status description: {data["data"]["profiling"]["status_description"]}')

Once the profiling status is marked as success, you can proceed to analyze the results. If the status is still in_progress, re-run the status check (cell above) until the profiling completes.
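Instead of re-running the cell by hand, the status check can be wrapped in a small polling loop. This is a minimal sketch; `wait_for_status` is an illustrative helper, and the `failed` terminal status is an assumption — this guide only confirms the success and in_progress statuses:

```python
import time


def wait_for_status(fetch_status, done=("success",), failed=("failed",),
                    interval=2.0, max_attempts=60):
    """Call fetch_status() repeatedly until it returns a terminal value."""
    for _ in range(max_attempts):
        status = fetch_status()
        if status in done:
            return status
        if status in failed:
            # Assumed terminal error status; adjust to your backend's values
            raise RuntimeError(f"profiling ended with status: {status}")
        time.sleep(interval)
    raise TimeoutError("profiling did not finish within the polling window")
```

To wire it to the endpoint above, pass a lambda that performs the GET request on the profiling run and extracts `["data"]["profiling"]["status"]` from the response.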

[ ]:
profiling_data = data["data"]["profiling"]
print(profiling_data)