TF Lite Quantizer

This tool quantizes a FLOAT (FP16/FP32) TensorFlow Lite model using profiling-guided quantization. The quantizer uses dynamic range information from a profile file (generated by tflite-profiler) to determine the optimal quantization parameters for each tensor.

Prerequisites

  • Model must be a FLOAT TensorFlow Lite model.

  • Profile file must be generated using tflite-profiler with the same model and representative dataset.

Usage

The tflite-quantizer has the following syntax:

tflite-quantizer --input <input_model_path> --profile <profile_file_path> [OPTIONS]

Required Parameters

  • --input - The path to the input FLOAT TensorFlow Lite model.

  • --profile - The path to the input profile file in CSV format generated by tflite-profiler. The profile file contains dynamic range information with the following structure:

    <tensor1_index>,<tensor1_min>,<tensor1_max>,<tensor1_histogram_bin1>,<tensor1_histogram_bin2>,...,<tensor1_histogram_binN>,
    <tensor2_index>,<tensor2_min>,<tensor2_max>,<tensor2_histogram_bin1>,<tensor2_histogram_bin2>,...,<tensor2_histogram_binN>,
    ...
    
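For illustration, a row of this profile format could be parsed as sketched below. This is not the tool's actual parser; it only assumes the comma-separated layout and trailing comma shown above:

```python
def parse_profile_row(row):
    """Parse one profile row: tensor index, min, max, then histogram bins.

    Illustrative sketch only -- the real tflite-quantizer parser may differ.
    """
    # A trailing comma produces an empty final field; drop empty fields.
    fields = [f for f in row if f != ""]
    tensor_index = int(fields[0])
    t_min = float(fields[1])
    t_max = float(fields[2])
    histogram = [float(b) for b in fields[3:]]
    return tensor_index, t_min, t_max, histogram

# Example row in the format shown above (values are made up):
row = "7,-1.5,2.5,0,3,10,3,0,".split(",")
idx, t_min, t_max, hist = parse_profile_row(row)
```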

Optional Parameters

  • --output - The path for the output quantized TensorFlow Lite model. If not provided, the output model is written to the same directory as the input model, with the suffix _quantized appended to the filename.

  • --graph-name - For models with multiple graphs, this specifies the name of the graph to quantize. This parameter is ignored for single-graph models.
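The default output naming described above can be sketched as follows (a sketch of the documented behavior, not the tool's exact code):

```python
from pathlib import Path

def default_output_path(input_path):
    """If --output is omitted: same directory, '_quantized' appended
    to the filename stem. Sketch of the documented default behavior."""
    p = Path(input_path)
    return str(p.with_name(p.stem + "_quantized" + p.suffix))

# e.g. "models/net.tflite" -> "models/net_quantized.tflite"
```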

Quantization Configuration Parameters

Global Quantization Options

  • --quantize-inputs - Whether to quantize the input graph placeholders. Options: True, False. Default is True.

  • --quantize-outputs - Whether to quantize the output graph placeholders. Options: True, False. Default is True.

  • --quantize-constants - Whether to quantize the constant tensors (weights). Options: True, False. Default is True.

  • --quantize-variables - Whether to quantize the variable tensors (activations). Options: True, False. Default is True.

Data Type Configuration

  • --quantization-constant-type - The data type for quantizing constant tensors (weights). Options: INT8, INT16, INT32. Default is INT8.

  • --quantization-constant-type-bias - The data type for quantizing bias tensors in Conv2D, DepthwiseConv2D, TransposeConv2D, Conv3D, TransposeConv3D, and FullyConnected operations. Options: INT8, INT16, INT32. Default is INT32.

  • --quantization-variable-type - The data type for quantizing variable tensors (activations). Options: INT8, INT16, INT32. Default is INT8.
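The INT32 default for biases reflects the common TensorFlow Lite convention of tying the bias scale to the product of the input scale and the weight scale, with zero point 0. A sketch under that assumption (this document does not state the tool's exact bias handling):

```python
def quantize_bias(bias, input_scale, weight_scale):
    """Quantize a float bias vector to INT32.

    Assumes the common TFLite convention: bias_scale = input_scale *
    weight_scale and zero point 0. The tool's actual behavior may differ.
    """
    bias_scale = input_scale * weight_scale
    q = [round(b / bias_scale) for b in bias]
    # Clamp to the INT32 range.
    lo, hi = -(2 ** 31), 2 ** 31 - 1
    return [max(lo, min(hi, v)) for v in q], bias_scale
```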

Granularity and Schema Configuration

  • --quantization-constant-granularity - The granularity for quantizing constant weight tensors in Conv2D, DepthwiseConv2D, TransposeConv2D, Conv3D, and TransposeConv3D operations. Options: PTQ (Per-Tensor Quantization), PCQ (Per-Channel Quantization). Default is PCQ.

  • --quantization-constant-schema - The schema for quantizing constant tensors (weights). Options: Asymmetric, Symmetric, SymmetricWithPower2Scale. Default is Symmetric.

  • --quantization-variable-schema - The schema for quantizing variable tensors (activations). Options: Asymmetric, Symmetric, SymmetricWithPower2Scale. Default is Asymmetric.
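The schemas differ in how the scale and zero point are derived from a tensor's profiled range. A minimal sketch using the usual definitions of these schemas (the rounding direction for the power-of-two scale is an assumption, not stated in this document):

```python
import math

def asymmetric_params(t_min, t_max, num_bits=8):
    """Asymmetric: map [min, max] onto the full signed integer range.
    Returns (scale, zero_point)."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    # The representable range must include zero.
    t_min, t_max = min(t_min, 0.0), max(t_max, 0.0)
    scale = (t_max - t_min) / (qmax - qmin)
    zero_point = round(qmin - t_min / scale)
    return scale, zero_point

def symmetric_params(t_min, t_max, num_bits=8):
    """Symmetric: scale from max(|min|, |max|), zero point fixed at 0."""
    qmax = 2 ** (num_bits - 1) - 1
    return max(abs(t_min), abs(t_max)) / qmax, 0

def symmetric_pow2_params(t_min, t_max, num_bits=8):
    """SymmetricWithPower2Scale: symmetric, but the scale is rounded up
    to the nearest power of two (rounding direction is an assumption)."""
    scale, _ = symmetric_params(t_min, t_max, num_bits)
    return 2.0 ** math.ceil(math.log2(scale)), 0
```

Power-of-two scales allow the dequantization multiply to be replaced by a bit shift, which is why this schema exists alongside plain Symmetric.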

Custom Quantization Options

  • --quantization-variable-custom-symmetric - Enables special symmetric quantization for the variable inputs of Mul and BatchMatMul operations. Options: True, False. Default is False.

Profiling Configuration

  • --quantization-calibration-method - The method used to determine the quantization range for each tensor. Options: MinMax (takes the min/max values from profiling), Percentile (removes outliers according to the value specified in --quantization-percentile-val). Default is MinMax.

  • --quantization-percentile-val - Percentile-based clipping value. Preserves the central portion of the histogram, trimming both tails symmetrically. Default is 99.9.
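Percentile calibration can be pictured as trimming equal probability mass from both tails of the profiled histogram before computing the quantization range. A sketch, assuming uniform histogram bins over [min, max] (an assumption about the profile format, not the tool's actual algorithm):

```python
def percentile_range(t_min, t_max, histogram, percentile=99.9):
    """Derive a clipped [min, max] keeping the central `percentile`
    percent of observed values, trimming both tails equally.

    Assumes `histogram` holds counts over uniform bins on [t_min, t_max].
    """
    total = sum(histogram)
    tail = total * (100.0 - percentile) / 200.0  # mass to drop per tail
    bin_width = (t_max - t_min) / len(histogram)

    # Walk in from the left until the tail budget is spent.
    acc, lo = 0.0, 0
    while lo < len(histogram) and acc + histogram[lo] <= tail:
        acc += histogram[lo]
        lo += 1
    # Walk in from the right.
    acc, hi = 0.0, len(histogram) - 1
    while hi >= 0 and acc + histogram[hi] <= tail:
        acc += histogram[hi]
        hi -= 1

    return t_min + lo * bin_width, t_min + (hi + 1) * bin_width
```

Sparse outlier bins at the extremes are dropped, so a single stray activation no longer stretches the quantization range for the whole tensor.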

Operator-Level Control

Operator Type Filtering

  • --quantize-operator-types-except - Exclude specific operator types from quantization. Accepts comma-separated list of operator names or indices (e.g., ADD,AVERAGE_POOL_2D or 0,1).

  • --quantize-operator-types-only - Only quantize specific operator types. Accepts comma-separated list of operator names or indices (e.g., ADD,AVERAGE_POOL_2D or 0,1).

Besides individual operators, there is also the option to exclude from quantization macro operators that are sensitive to quantization, such as Layer Norm. If layernorm is specified in the --quantize-operator-types-except flag, the lowered pattern of Layer Norm is identified and excluded from quantization entirely.

Note:

  • The except and only options can be combined for both operator types and instances.

  • Operator identification: use the operator index or the name of the operator's first output tensor (operators do not have standalone names).

Debugging and Validation Options

  • --run-after-quantize - Validates the quantized model by running it with the TFLite interpreter using dummy input data. This option only works with standard quantization configurations supported by TFLite interpreter (e.g., INT8 with default parameters). Options: True, False. Default is False.

  • --verbose - Enables detailed console output including quantization constraints applied to the graph. Options: True, False. Default is False.

Example

tflite-quantizer --input model.tflite --profile model_profile.csv --output model_quantized.tflite --verbose True

This command quantizes model.tflite using the profile information from model_profile.csv, saves the result as model_quantized.tflite, and enables verbose output for debugging.