{ "cells": [ { "metadata": {}, "cell_type": "markdown", "source": "# ONNX Quantization", "id": "21ec8cef20db4115" }, { "cell_type": "markdown", "id": "c06c3b4b-01c8-4b99-9b30-7c8bd7a676cb", "metadata": {}, "source": [ "**eIQ AI Toolkit** performs quantization using the **Olive** (https://microsoft.github.io/Olive/) framework.\n", "Olive uses **passes**, where each pass represents a single transformation of the model.\n", "Passes can also be chained together to apply multiple transformations in sequence.\n", "\n", "To quantize an ONNX model, a **calibration dataset** is required.\n", "This guide will show you how to use the **ONNX2Quant** pass to quantize an ONNX model.\n", "Specifically, it will demonstrate how to use the **eIQ AI Toolkit** API to:\n", "- Load an ONNX model into the application\n", "- Load a calibration dataset into the application\n", "- Run ONNX quantization\n", "- Retrieve the quantized model\n", "\n", "This guide requires the **eIQ AI Toolkit** backend to be running.\n", "If you haven't set it up yet, please refer to the following tutorial:\n", "[eIQ AI Toolkit setup & launch](../tools/aiToolkit/installRun.ipynb)" ] }, { "metadata": {}, "cell_type": "code", "source": [ "import requests\n", "from pathlib import Path\n", "\n", "# Set your eIQ AI Toolkit url:\n", "AI_TOOLKIT_BACKEND_URL = \"http://localhost:8000\"" ], "id": "f5b88f16f253162b", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": "## Load ONNX model", "id": "bd7954fa8d49e9a5" }, { "cell_type": "markdown", "id": "8b6e9db6-6c7a-4ea0-93f4-3934c9990e80", "metadata": {}, "source": [ "Loading any type of model into an application involves two steps:\n", "1. Specify the model metadata\n", "2. Upload the model file\n", "\n", "The metadata must always include the type of model being uploaded. 
For example, if you are uploading a PyTorch model, set the type to `pytorch`; for an ONNX model, use `onnx`, and so on.\n", "\n", "For an ONNX model, no additional metadata is required." ] }, { "metadata": {}, "cell_type": "markdown", "source": "### 1. Prepare ONNX model", "id": "19efcc457efd98f5" }, { "metadata": {}, "cell_type": "markdown", "source": [ "If you already have a model prepared, simply update the path to point to its location.\n", "If you do not yet have a model, set the path to the location where the model should be saved.\n", "(Refer to the following sections for instructions on downloading a sample model.)" ], "id": "ee8e66edf05117c1" }, { "cell_type": "code", "id": "0786131d-a766-4172-8d3d-8f67d9c2dd62", "metadata": {}, "source": [ "# Modify the path to your ONNX model\n", "model_path = Path(\"path_to_onnx_model.onnx\")" ], "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": [ "Use the following script to download the example model:\n", "\n", "*Note: Skip this step if you already have your own model.*" ], "id": "c374fef72f6ff68b" }, { "metadata": {}, "cell_type": "code", "source": [ "example_model_url = \"https://eiq.nxp.com/training-materials/_misc/models/model.onnx\"\n", "\n", "with open(model_path, \"wb\") as f:\n", " response = requests.get(\n", " url=example_model_url\n", " )\n", " f.write(response.content)" ], "id": "306a139eba7dd2b5", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": "### 2. Specify metadata", "id": "48dd7379780ab86a" }, { "metadata": {}, "cell_type": "markdown", "source": "In this section, we will upload the model metadata to **eIQ AI Toolkit**. 
This step involves specifying only the type of model being uploaded.", "id": "c1381034da115702" }, { "cell_type": "code", "id": "ed6da5f0-31c9-4b2a-953e-86f76d74f6cd", "metadata": {}, "source": [ "import requests\n", "\n", "MODELS_API_URL = f\"{AI_TOOLKIT_BACKEND_URL}/models\"\n", "\n", "model_metadata = {\n", " \"model_type\": \"onnx\",\n", " }\n", "\n", "response = requests.post(MODELS_API_URL, json=model_metadata)\n", "response_data = response.json()\n", "model_uuid = response_data[\"data\"][\"model\"][\"uuid\"]" ], "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": "### 3. Upload model", "id": "4ef7a906ba4ceeee" }, { "metadata": {}, "cell_type": "markdown", "source": "Now we can upload the model file.", "id": "46cde98286e58e6c" }, { "metadata": {}, "cell_type": "code", "source": [ "with open(model_path, \"rb\") as model_file:\n", " response = requests.post(\n", " url=f\"{AI_TOOLKIT_BACKEND_URL}/models/{model_uuid}\", # Model identifier is part of the request URL\n", " files={\n", " \"model_file\": model_file,\n", " }\n", " )\n", "\n", "print(response.json())" ], "id": "f277298c9178d7ac", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": "After uploading the model metadata and file, you can verify the model’s registration and readiness status using the following endpoint. 
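Rather than re-running the check cell by hand, the wait can be automated with a small polling helper. This is only a sketch: `fetch_status`, the interval, the timeout, and the `failed` state are illustrative assumptions, not part of the eIQ AI Toolkit API; pass in any zero-argument callable that returns the current status string.

```python
import time

def wait_for_status(fetch_status, target="ready", failed="failed",
                    interval_s=1.0, timeout_s=300.0):
    """Poll fetch_status() until it returns `target`.

    Raises RuntimeError if the assumed `failed` state is reported,
    and TimeoutError if `target` is not reached within `timeout_s`.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = fetch_status()
        if status == target:
            return status
        if status == failed:
            raise RuntimeError(f"operation ended with status: {status}")
        time.sleep(interval_s)
    raise TimeoutError(f"status did not reach '{target}' within {timeout_s}s")

# Demonstration with a stubbed status sequence (no backend required):
statuses = iter(["in_progress", "in_progress", "ready"])
print(wait_for_status(lambda: next(statuses), interval_s=0.01))  # -> ready
```

With the backend running, `fetch_status` could wrap the status GET request shown in this guide.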
If the status remains `in_progress`, call the endpoint repeatedly until it changes to `ready`.", "id": "75a635d9a083929a" }, { "metadata": {}, "cell_type": "code", "source": [ "response = requests.get(f\"{AI_TOOLKIT_BACKEND_URL}/models/{model_uuid}\")\n", "data = response.json()\n", "print(f'Model status: {data[\"data\"][\"model\"][\"status\"]}')\n", "print(f'Model status description: {data[\"data\"][\"model\"][\"status_description\"]}')" ], "id": "e0be7bb67a83a36c", "outputs": [], "execution_count": null }, { "cell_type": "markdown", "id": "8871b96a-47b9-45a5-912d-7e4824586054", "metadata": {}, "source": [ "## Upload calibration dataset\n", "Quantizing an ONNX model requires a calibration dataset to be provided. To achieve this, you need to upload the dataset to the application as a *.zip* file. The calibration dataset must follow the correct directory structure.\n", "For example, for a model with **two** inputs named *input_name_1* and *input_name_2*, the expected dataset structure is as follows:\n", "```\n", "dataset.zip\n", "├── input_name_1\n", "│ ├── sample_1.npy\n", "│ ├── sample_2.npy\n", "│ └── ...\n", "└── input_name_2\n", " ├── sample_1.npy\n", " ├── sample_2.npy\n", " └── ...\n", "```\n", "You can have multiple *.npy* files in each input folder representing different calibration samples." ] }, { "metadata": {}, "cell_type": "markdown", "source": "### 1. 
Prepare dataset", "id": "66b161f0305a68fc" }, { "metadata": {}, "cell_type": "markdown", "source": [ "If you already have a dataset prepared, simply update the path to point to its location.\n", "If you do not yet have a dataset, set the path to the location where the dataset should be saved.\n", "(Refer to the following sections for instructions on downloading a sample dataset.)" ], "id": "faf6d69c2ebd5da8" }, { "cell_type": "code", "id": "b9013292-f77d-4938-bb61-c9bd5ca10843", "metadata": {}, "source": [ "# Modify the path if you already have a dataset\n", "dataset_path = Path(\"path_to_calibration_dataset.zip\")" ], "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": [ "Use the following script to download the example dataset:\n", "\n", "*Note: Skip this step if you already have your own dataset.*" ], "id": "ee40fbcdd70c1360" }, { "metadata": {}, "cell_type": "code", "source": [ "example_dataset_url = \"https://eiq.nxp.com/training-materials/_misc/datasets/kws_calib.zip\"\n", "\n", "with open(dataset_path, \"wb\") as f:\n", " response = requests.get(\n", " url=example_dataset_url\n", " )\n", " f.write(response.content)" ], "id": "eb5a9818d52c4ac4", "outputs": [], "execution_count": null }, { "cell_type": "markdown", "id": "50e0bba1-8590-484d-b4b4-cd3b8b713dc8", "metadata": {}, "source": "### 2. Upload dataset" }, { "metadata": {}, "cell_type": "markdown", "source": "Uploading a dataset **does not** require two steps as it does when uploading models. 
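If you still need to assemble a calibration archive in the structure shown earlier, a short script can generate it. This is a sketch only: the input names, sample count, shape, and random data below are illustrative placeholders; real calibration samples must match your model's input names, shapes, and data distribution.

```python
import zipfile
from pathlib import Path

import numpy as np

# Illustrative values -- replace with your model's actual inputs.
input_names = ["input_name_1", "input_name_2"]
num_samples = 4
sample_shape = (1, 49, 10)

archive_path = Path("dataset.zip")
with zipfile.ZipFile(archive_path, "w") as archive:
    for input_name in input_names:
        for i in range(1, num_samples + 1):
            sample = np.random.rand(*sample_shape).astype(np.float32)
            npy_path = Path(f"sample_{i}.npy")
            np.save(npy_path, sample)  # serialize one calibration sample
            # Place it under <input_name>/ inside the archive.
            archive.write(npy_path, arcname=f"{input_name}/sample_{i}.npy")
            npy_path.unlink()

print(sorted(zipfile.ZipFile(archive_path).namelist())[:3])
```

The resulting *dataset.zip* can then be pointed to by `dataset_path` above.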
You can upload a dataset with a single endpoint call:", "id": "2c0047d72bb8f37c" }, { "cell_type": "code", "id": "32e7f157-395a-4bdb-984a-584d9af3269a", "metadata": {}, "source": [ "DATASETS_API_URL = f\"{AI_TOOLKIT_BACKEND_URL}/datasets\"\n", "uploaded_dataset_name = \"My calibration dataset\"\n", "\n", "with dataset_path.open(\"rb\") as zip_file:\n", " files = {\n", " \"dataset_file\": (\"dataset.zip\", zip_file, \"application/zip\")\n", " }\n", " data = {\n", " \"dataset_name\": uploaded_dataset_name,\n", " \"dataset_type\": \"calibration\"\n", " }\n", " response = requests.post(DATASETS_API_URL, files=files, data=data)\n", "\n", "response_data = response.json()\n", "dataset_uuid = response_data[\"data\"][\"dataset\"][\"uuid\"]" ], "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": "After uploading the dataset archive, you can verify the dataset’s status using the following endpoint. If the status remains `in_progress`, call the endpoint repeatedly until it changes to `ready`.", "id": "587ebfb5b64a21fa" }, { "metadata": {}, "cell_type": "code", "source": [ "response = requests.get(f\"{DATASETS_API_URL}/{dataset_uuid}\")\n", "data = response.json()\n", "print(f'Dataset status: {data[\"data\"][\"dataset\"][\"status\"]}')\n", "print(f'Dataset status description: {data[\"data\"][\"dataset\"][\"status_description\"]}')" ], "id": "1c08df22551189ff", "outputs": [], "execution_count": null }, { "cell_type": "markdown", "id": "d567011d-4c57-4b69-846a-39f4c85e786b", "metadata": {}, "source": "## Quantization" }, { "metadata": {}, "cell_type": "markdown", "source": [ "Now that both the dataset and model are prepared, you can run the **quantization pass**.\n", "If you want to see which configuration parameters can be set for this pass, send a request to the following endpoint:" ], "id": "6c2242207de07668" }, { "metadata": {}, "cell_type": "code", "source": [ "available_passes_response = 
requests.get(f\"{AI_TOOLKIT_BACKEND_URL}/optimizations/passes\")\n", "available_passes = available_passes_response.json()\n", "\n", "# This prints the configuration parameters only for the ONNX2Quant pass. Feel free to change it\n", "# and explore other passes as well.\n", "onnx2quant_pass_config = next(_pass for _pass in available_passes[\"data\"][\"passes\"] if _pass[\"type\"] == \"ONNX2Quant\")\n", "print(onnx2quant_pass_config)" ], "id": "25aa71e05490c2f9", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": "Now run the quantization:", "id": "5d02995235a88797" }, { "cell_type": "code", "id": "8546db25-8722-4112-8659-872cf4b01e9c", "metadata": {}, "source": [ "OPTIMIZATIONS_API_URL = f\"{AI_TOOLKIT_BACKEND_URL}/optimizations\"\n", "RUN_OPTIMIZATION_API_URL = f\"{OPTIMIZATIONS_API_URL}/run\"\n", "\n", "pass_config = {\n", " \"model_uuid\": model_uuid,\n", " \"passes\": [\n", " {\n", " \"type\": \"ONNX2Quant\",\n", " \"config\": {\n", " \"allow_opset_10_and_lower\": \"false\",\n", " \"dataset_uuid\": dataset_uuid,\n", " }\n", " },\n", " ]\n", "}\n", "\n", "optimization_response = requests.post(RUN_OPTIMIZATION_API_URL, json=pass_config)\n", "optimization_response_data = optimization_response.json()\n", "optimization_uuid = optimization_response_data[\"data\"][\"optimization\"][\"uuid\"]" ], "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": "The quantization process is now running. 
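Because the status response nests its fields a few levels deep, a tiny helper can keep the check readable. This is a sketch: `summarize_optimization` is an illustrative name, and the payload shape simply mirrors the fields accessed in this guide.

```python
def summarize_optimization(payload):
    """Return (status, artifact_id); artifact_id is None until success."""
    optimization = payload["data"]["optimization"]
    status = optimization["status"]
    artifact_id = None
    if status == "success":
        # The quantized model is exposed as the first artifact.
        artifact_id = optimization["artifacts"][0]["artifact_id"]
    return status, artifact_id

# Demonstration with a stubbed payload (no backend required):
payload = {"data": {"optimization": {"status": "success",
                                     "artifacts": [{"artifact_id": "a1"}]}}}
print(summarize_optimization(payload))  # -> ('success', 'a1')
```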
You can check its status by calling this endpoint repeatedly until the status changes to `success`.", "id": "f8bbde52496d4ef3" }, { "metadata": {}, "cell_type": "code", "source": [ "response = requests.get(f\"{OPTIMIZATIONS_API_URL}/{optimization_uuid}\")\n", "response_data = response.json()\n", "status = response_data[\"data\"][\"optimization\"][\"status\"]\n", "print(f\"Quantization status: {status}\")\n", "\n", "if status == \"success\":\n", " artifact_id = response_data[\"data\"][\"optimization\"][\"artifacts\"][0][\"artifact_id\"]" ], "id": "c07e83dd616e50c", "outputs": [], "execution_count": null }, { "cell_type": "markdown", "id": "b8bfa04d-deee-4738-8e0e-f451381d0c98", "metadata": {}, "source": "## Download quantized model" }, { "cell_type": "code", "id": "91ffb356-0a66-43a1-98a0-0aa7978a4311", "metadata": {}, "source": [ "# Change model path to your location\n", "dest_model_path = Path(\"quantized_model.onnx\")" ], "outputs": [], "execution_count": null }, { "cell_type": "code", "id": "78e2cba0-b83a-4436-aa9f-28c8e4788ee8", "metadata": {}, "source": [ "download_response = requests.get(f\"{AI_TOOLKIT_BACKEND_URL}/optimizations/{optimization_uuid}/resources/{artifact_id}\")\n", "\n", "with dest_model_path.open(\"wb\") as f:\n", " f.write(download_response.content)" ], "outputs": [], "execution_count": null } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.3" } }, "nbformat": 4, "nbformat_minor": 5 }