diff --git a/src/routes/components/winarm.svelte b/src/routes/components/winarm.svelte
index fa121e6c3ea94..2f1e42f602469 100644
--- a/src/routes/components/winarm.svelte
+++ b/src/routes/components/winarm.svelte
@@ -28,16 +28,12 @@

Optimizing models for the NPU

 ONNX is a standard format for representing ML models authored in frameworks like PyTorch, TensorFlow, and others. ONNX Runtime can run any ONNX model; however, to make use of the NPU,
-you currently need to use the following steps:
-
-1. Run the tools provided in the SNPE SDK on your model to generate a binary file.
-2. Include the contents of the binary file as a node in the ONNX graph.
-
+you currently need to quantize the ONNX model to a QDQ model.
 See our C# tutorial for an example of how this is done.
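
For readers who want a feel for the new QDQ workflow before opening the C# tutorial, here is a minimal sketch in Python using the `quantize_static` API from `onnxruntime.quantization` with `QuantFormat.QDQ`. The model paths, input name, tensor shape, and the random calibration data are placeholders, not taken from this page; real calibration should feed representative inputs.

```python
import numpy as np
from onnxruntime.quantization import (
    CalibrationDataReader,
    QuantFormat,
    QuantType,
    quantize_static,
)


class RandomCalibrationReader(CalibrationDataReader):
    # Supplies a handful of calibration batches. The random data here is a
    # placeholder; use representative model inputs in real calibration.
    def __init__(self, input_name="input", shape=(1, 3, 224, 224), batches=8):
        self._samples = iter(
            [{input_name: np.random.rand(*shape).astype(np.float32)}
             for _ in range(batches)]
        )

    def get_next(self):
        # Return the next input feed, or None when calibration data is exhausted.
        return next(self._samples, None)


quantize_static(
    model_input="model.onnx",        # float32 source model (placeholder path)
    model_output="model.qdq.onnx",   # output with QuantizeLinear/DequantizeLinear pairs
    calibration_data_reader=RandomCalibrationReader(),
    quant_format=QuantFormat.QDQ,    # emit a QDQ model rather than QOperator form
    activation_type=QuantType.QUInt8,
    weight_type=QuantType.QUInt8,
)
```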

 Many models can be optimized for the NPU using this process. Even if a model cannot be optimized
-for NPU by the SNPE SDK, it can still be run by ONNX Runtime on the CPU.
+for the NPU, it can still be run by ONNX Runtime on the CPU.
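
As a companion sketch, the CPU fallback works through ONNX Runtime's provider preference list: providers are tried in order, and any nodes (or whole models) the NPU backend cannot take are assigned to the CPU provider. The example below assumes the QNN execution provider as the NPU backend; the provider name, the `backend_path` option value, and the input name and shape are assumptions, not taken from this page.

```python
import numpy as np
import onnxruntime as ort

# Provider order is a preference list: ONNX Runtime tries the NPU provider
# first and falls back to the CPU provider for anything it cannot handle.
session = ort.InferenceSession(
    "model.qdq.onnx",
    providers=["QNNExecutionProvider", "CPUExecutionProvider"],
    provider_options=[{"backend_path": "QnnHtp.dll"}, {}],
)

# Hypothetical input name and shape; they must match the quantized model.
input_tensor = np.random.rand(1, 3, 224, 224).astype(np.float32)
outputs = session.run(None, {"input": input_tensor})
```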

Getting Help

For help with ONNX Runtime, you can start a discussion on GitHub or file an issue.