From c08de75d667cc5387e1e6d67e72d9e330669811a Mon Sep 17 00:00:00 2001
From: Jeff Mendenhall <61433971+jeffmend@users.noreply.github.com>
Date: Mon, 11 Dec 2023 20:01:05 -0800
Subject: [PATCH] Update winarm.svelte

Update winarm instructions to be current.

---
 src/routes/components/winarm.svelte | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/src/routes/components/winarm.svelte b/src/routes/components/winarm.svelte
index fa121e6c3ea94..2f1e42f602469 100644
--- a/src/routes/components/winarm.svelte
+++ b/src/routes/components/winarm.svelte
@@ -28,16 +28,12 @@

 Optimizing models for the NPU
 
 ONNX is a standard format for representing ML models authored in frameworks like PyTorch, TensorFlow, and others. ONNX Runtime can run any ONNX model; however, to make use of the NPU,
- you currently need to use the following steps:
-
- 1. Run the tools provided in the SNPE SDK on your model to generate a binary file.
-
- 2. Include the contents of the binary file as a node in the ONNX graph.
-
+ you currently need to quantize the ONNX model to a QDQ model.
 See our C# tutorial for an example of how this is done.
 
 Many models can be optimized for the NPU using this process. Even if a model cannot be optimized
- for NPU by the SNPE SDK, it can still be run by ONNX Runtime on the CPU.
+ for the NPU, it can still be run by ONNX Runtime on the CPU.
 
 Getting Help
 
 For help with ONNX Runtime, you can start a discussion on GitHub or file an issue.