Atriva Documentation

Atriva Edge AI Platform

Guides, tutorials, and API references to help you build and deploy Edge AI applications.

Preparation

This section describes how to convert ONNX models to the OpenVINO IR format, quantize them for CPU/NPU, and validate compatibility.

OpenVINO IR is required for:

  • Optimal CPU performance
  • iGPU/NPU execution
  • INT8 hardware acceleration

1. Export Model to ONNX

PyTorch example (the model below is only a stand-in; export whichever nn.Module you trained):

import torch
import torchvision

model = torchvision.models.resnet18(weights=None).eval()  # illustrative model
dummy_input = torch.randn(1, 3, 224, 224)                 # shape must match your real input

torch.onnx.export(
    model,
    dummy_input,
    "model.onnx",
    opset_version=17,
    input_names=["input"],
    output_names=["output"],
)

TensorFlow:

python -m tf2onnx.convert --saved-model ./saved --output model.onnx
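
Either way, it is worth sanity-checking the exported graph before conversion. A minimal check with the onnx package (assuming it is installed):

import onnx

# Load the exported graph and run the structural checker.
m = onnx.load("model.onnx")
onnx.checker.check_model(m)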

2. Convert ONNX → OpenVINO IR

Use Model Optimizer:

mo --input_model model.onnx --output_dir ./ir --compress_to_fp16

Produces:

  • model.xml
  • model.bin

FP16 recommended for:

  • CPU
  • iGPU
  • NPU

To keep the weights in FP32 instead:

mo --input_model model.onnx --data_type FP32
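
Recent OpenVINO releases also expose the converter as a Python API, which is convenient inside build scripts. A sketch, assuming openvino.tools.mo is available in your installation:

import os

from openvino.tools.mo import convert_model
from openvino.runtime import serialize

# Convert in memory, then write model.xml / model.bin to ./ir.
os.makedirs("ir", exist_ok=True)
ov_model = convert_model("model.onnx", compress_to_fp16=True)
serialize(ov_model, "ir/model.xml", "ir/model.bin")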

3. Quantization (INT8)

Use POT (Post-Training Optimization Tool). The Python API is importable from openvino.tools.pot:

from openvino.tools.pot import DataLoader, IEEngine, load_model, save_model

CLI version:

pot -m model.xml -d config.json -o output/
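
For the Python API route, the end-to-end flow looks roughly like this. This is a sketch: the calibration data, paths, and the DataLoader return convention (data, annotation) follow the 2022.x POT examples and may need adjusting for your POT version:

import numpy as np
from openvino.tools.pot import DataLoader, IEEngine, load_model, save_model, create_pipeline

class CalibrationLoader(DataLoader):
    # Feeds preprocessed calibration tensors to POT.
    def __init__(self, samples):
        self._samples = samples

    def __len__(self):
        return len(self._samples)

    def __getitem__(self, index):
        # DefaultQuantization only needs data; the annotation can be None.
        return self._samples[index], None

# Replace with real, preprocessed samples matching the model input layout.
samples = [np.random.rand(3, 224, 224).astype(np.float32) for _ in range(300)]

model = load_model({"model_name": "model", "model": "model.xml", "weights": "model.bin"})
engine = IEEngine(config={"device": "CPU"}, data_loader=CalibrationLoader(samples))
algorithms = [{"name": "DefaultQuantization",
               "params": {"target_device": "CPU", "preset": "performance", "stat_subset_size": 300}}]

quantized = create_pipeline(algorithms, engine).run(model)
save_model(quantized, save_path="int8", model_name="model_int8")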

Benefits:

  • Faster CPU execution
  • Required for optimal NPU performance

4. Validate the IR Model

from openvino.runtime import Core

ie = Core()
model = ie.read_model("model.xml")
compiled = ie.compile_model(model, "CPU")  # swap "CPU" for "GPU" or "NPU" to target other devices
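
Then push a dummy tensor through the compiled model to confirm the graph actually executes (the shape below assumes a 224x224 RGB input):

import numpy as np

dummy = np.random.rand(1, 3, 224, 224).astype(np.float32)
result = compiled([dummy])[compiled.output(0)]
print("output shape:", result.shape)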

5. Check Input/Output Shapes

for inp in model.inputs:
    print(inp.get_any_name(), inp.shape, inp.element_type)

Ensure:

  • NCHW or NHWC matches your preprocessing
  • Dimensions that should be dynamic (such as batch) really are dynamic (-1), not fixed
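
If the batch dimension came out static but your pipeline needs it dynamic, you can relax it before compiling. A sketch for a single-input model with 224x224 RGB inputs:

from openvino.runtime import PartialShape

model.reshape(PartialShape([-1, 3, 224, 224]))  # -1 marks the batch dimension as dynamic
compiled = ie.compile_model(model, "CPU")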

6. Layout Conversion

If your model expects NHWC but your preprocessing pipeline produces NCHW, tell the Model Optimizer to convert the input layout from NHWC to NCHW:

mo --input_model model.onnx --layout "input(NHWC->NCHW)"
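
The same conversion can also be applied at load time with the preprocessing API instead of re-running the Model Optimizer. A sketch, assuming a single-input model:

from openvino.preprocess import PrePostProcessor
from openvino.runtime import Core, Layout

ie = Core()
model = ie.read_model("model.xml")

ppp = PrePostProcessor(model)
ppp.input().tensor().set_layout(Layout("NCHW"))  # what your pipeline actually feeds
ppp.input().model().set_layout(Layout("NHWC"))   # what the model was trained with
model = ppp.build()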

7. Batch Size

Static batch:

mo --input_model model.onnx --batch 4

Dynamic batch:

mo --input_model model.onnx --input_shape "[?,3,224,224]"

8. Best Practices

  • Prefer FP16 for GPU/NPU
  • Prefer INT8 for CPU (if the accuracy drop is acceptable)
  • Always validate the model with a real input
  • Use dynamic input shapes for video analytics pipelines