Update documentation for ovep rel-5.4 #21909

Merged · merged 1 commit on Sep 4, 2024
8 changes: 4 additions & 4 deletions docs/build/eps.md
@@ -260,13 +260,13 @@ See more information on the OpenVINO™ Execution Provider [here](../execution-p
### Prerequisites
{: .no_toc }

- 1. Install the OpenVINO™ offline/online installer from Intel<sup>®</sup> Distribution of OpenVINO™ Toolkit **Release 2024.1** for the appropriate OS and target hardware:
-    * [Windows - CPU, GPU, NPU](https://www.intel.com/content/www/us/en/developer/tools/openvino-toolkit/download.html?VERSION=v_2023_1_0&OP_SYSTEM=WINDOWS&DISTRIBUTION=ARCHIVE).
-    * [Linux - CPU, GPU](https://www.intel.com/content/www/us/en/developer/tools/openvino-toolkit/download.html?VERSION=v_2023_1_0&OP_SYSTEM=LINUX&DISTRIBUTION=ARCHIVE)
+ 1. Install the OpenVINO™ offline/online installer from Intel<sup>®</sup> Distribution of OpenVINO™ Toolkit **Release 2024.3** for the appropriate OS and target hardware:
+    * [Windows - CPU, GPU, NPU](https://www.intel.com/content/www/us/en/developer/tools/openvino-toolkit/download.html?PACKAGE=OPENVINO_BASE&VERSION=v_2024_3_0&OP_SYSTEM=WINDOWS&DISTRIBUTION=ARCHIVE).
+    * [Linux - CPU, GPU](https://www.intel.com/content/www/us/en/developer/tools/openvino-toolkit/download.html?PACKAGE=OPENVINO_BASE&VERSION=v_2024_3_0&OP_SYSTEM=LINUX&DISTRIBUTION=ARCHIVE)

Follow [documentation](https://docs.openvino.ai/2024/home.html) for detailed instructions.

- *2024.1 is the current recommended OpenVINO™ version. [OpenVINO™ 2023.1](https://docs.openvino.ai/archive/2023.1/home.html) is the minimal OpenVINO™ version requirement.*
+ *2024.3 is the current recommended OpenVINO™ version. [OpenVINO™ 2023.3](https://docs.openvino.ai/2023.3/home.html) is the minimal OpenVINO™ version requirement.*

2. Configure the target hardware with the specific follow-on instructions:
* To configure Intel<sup>®</sup> Processor Graphics(GPU) please follow these instructions: [Windows](https://docs.openvino.ai/latest/openvino_docs_install_guides_configurations_for_intel_gpu.html#gpu-guide-windows), [Linux](https://docs.openvino.ai/latest/openvino_docs_install_guides_configurations_for_intel_gpu.html#linux)
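Once the toolkit is installed, a quick sanity check is to print the runtime version and compare it against the 2023.3 minimum. A minimal sketch, assuming the OpenVINO Python bindings were installed alongside the toolkit:

```
# Sketch: requires the OpenVINO Python package from the installed toolkit.
import openvino.runtime as ov_rt

# Expect a 2024.3 build; anything below the 2023.3 minimum is unsupported.
print(ov_rt.get_version())
```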
37 changes: 28 additions & 9 deletions docs/execution-providers/OpenVINO-ExecutionProvider.md
@@ -20,7 +20,7 @@ Accelerate ONNX models on Intel CPUs, GPUs, NPU with Intel OpenVINO™ Execution
## Install

Pre-built packages and Docker images are published for OpenVINO™ Execution Provider for ONNX Runtime by Intel for each release.
- * OpenVINO™ Execution Provider for ONNX Runtime Release page: [Latest v5.2 Release](https://github.com/intel/onnxruntime/releases)
+ * OpenVINO™ Execution Provider for ONNX Runtime Release page: [Latest v5.4 Release](https://github.com/intel/onnxruntime/releases)
* Python wheels Ubuntu/Windows: [onnxruntime-openvino](https://pypi.org/project/onnxruntime-openvino/)
* Docker image: [openvino/onnxruntime_ep_ubuntu20](https://hub.docker.com/r/openvino/onnxruntime_ep_ubuntu20)
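For the Python wheel, a minimal usage sketch looks like the following (the model path is a placeholder, and the device string is one of those listed in the configuration options below; CPU fallback is kept as a safety net):

```
# pip install onnxruntime-openvino
import onnxruntime as ort

# Route supported subgraphs to the OpenVINO EP; unsupported ops fall back to CPU.
session = ort.InferenceSession(
    "model.onnx",  # placeholder path
    providers=["OpenVINOExecutionProvider", "CPUExecutionProvider"],
    provider_options=[{"device_type": "CPU"}, {}],
)
print(session.get_providers())  # verify the OpenVINO EP was registered
```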

@@ -30,10 +30,9 @@ ONNX Runtime OpenVINO™ Execution Provider is compatible with three latest rel

|ONNX Runtime|OpenVINO™|Notes|
|---|---|---|
+ |1.19.0|2024.3|[Details](https://github.com/intel/onnxruntime/releases/tag/v5.4)|
|1.18.0|2024.1|[Details](https://github.com/intel/onnxruntime/releases/tag/v5.3)|
|1.17.1|2023.3|[Details](https://github.com/intel/onnxruntime/releases/tag/v5.2)|
- |1.16.0|2023.1|[Details](https://github.com/intel/onnxruntime/releases/tag/v5.1)|
- |1.15.0|2023.0|[Details](https://github.com/intel/onnxruntime/releases/tag/v5.0.0)|
- |1.14.0|2022.3|[Details](https://github.com/intel/onnxruntime/releases/tag/v4.3)|
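To confirm which pairing is in use at runtime, a short sketch (exact version strings vary by build):

```
import onnxruntime as ort

print(ort.__version__)                # e.g. "1.19.0", which pairs with OpenVINO 2024.3
print(ort.get_available_providers())  # should list "OpenVINOExecutionProvider"
```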

## Build

@@ -200,8 +199,30 @@ For more information on Multi-Device plugin of OpenVINO™, please refer to the
[Intel OpenVINO™ Multi Device Plugin](https://docs.openvino.ai/latest/openvino_docs_OV_UG_Running_on_multiple_devices.html).

### Export OpenVINO Compiled Blob
- Export the OpenVINO compiled blob as an ONNX model. Using this ONNX model for subsequent inferences avoids model recompilation and could have a positive impact on Session creation time. The exported model is saved to the same directory as the source model with the suffix -ov_{device}_blob.onnx where device can be one of the supported like CPU or NPU. This feature is currently enabled for fully supported models only.
- Refer to [Configuration Options](#configuration-options) for more information about using these runtime options.
+ Export the OpenVINO compiled blob as an ONNX model. Using this ONNX model for subsequent inferences avoids model recompilation and could have a positive impact on Session creation time. This feature is currently enabled for fully supported models only. It is controlled through the following ORT session config keys:
+ ```
+ Ort::SessionOptions session_options;
+
+ // Enable the EP context feature to dump the partitioned graph, which includes the EP context, into an Onnx file.
+ // "0": disable. (default)
+ // "1": enable.
+ session_options.AddConfigEntry(kOrtSessionOptionEpContextEnable, "1");
+
+ // Flag specifying whether to embed the EP context into a single Onnx model or pass a bin path.
+ // "0": dump the EP context into a separate file and keep the file name in the Onnx model.
+ // "1": dump the EP context into the Onnx model. (default)
+ session_options.AddConfigEntry(kOrtSessionOptionEpContextEmbedMode, "1");
+
+ // Specify the file path for the Onnx model which has the EP context.
+ // Defaults to <actual_model_path>/original_file_name_ctx.onnx if not specified.
+ session_options.AddConfigEntry(kOrtSessionOptionEpContextFilePath, "./ov_compiled_epctx.onnx");
+
+ // Create the session with these options (env is an Ort::Env created earlier).
+ Ort::Session session(env, <path_to_model_file>, session_options);
+ ```
+ Refer to [Session Options](https://github.com/microsoft/onnxruntime/blob/main/include/onnxruntime/core/session/onnxruntime_session_options_config_keys.h) for more information about session options.
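For the Python API, a sketch of the same configuration — assuming the raw string keys below are the values the kOrtSessionOptionEpContext* constants map to in the header linked above; the output file name is illustrative:

```
import onnxruntime as ort

so = ort.SessionOptions()
# Raw keys assumed to correspond to the kOrtSessionOptionEpContext* constants.
so.add_session_config_entry("ep.context_enable", "1")
so.add_session_config_entry("ep.context_embed_mode", "1")
so.add_session_config_entry("ep.context_file_path", "./ov_compiled_epctx.onnx")

session = ort.InferenceSession(
    "model.onnx",  # placeholder path
    sess_options=so,
    providers=["OpenVINOExecutionProvider"],
)
```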

### Enable QDQ Optimizations Passes
Optimizes ORT quantized models for the NPU device so that QDQ nodes are kept only for supported ops, optimizing for performance and accuracy. Generally this feature gives better performance/accuracy with ORT optimizations disabled; a sketch of enabling it follows.
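A sketch of enabling the optimizer from Python via provider options (model path is a placeholder; the flag spelling follows the configuration options table below):

```
import onnxruntime as ort

session = ort.InferenceSession(
    "quantized_model.onnx",  # placeholder QDQ-quantized model
    providers=["OpenVINOExecutionProvider"],
    provider_options=[{"device_type": "NPU", "enable_qdq_optimizer": "True"}],
)
```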
@@ -239,8 +260,7 @@ The session configuration options are passed to SessionOptionsAppendExecutionPro

```
OrtOpenVINOProviderOptions options;
options.device_type = "GPU";
options.precision = "FP32";
options.device_type = "GPU_FP32";
options.num_of_threads = 8;
options.cache_dir = "";
options.context = 0x123456ff;
```

@@ -277,7 +297,6 @@ The following table lists all the available configuration options for API 2.0 an
| context | string | OpenCL Context | void* | This option is only available when OpenVINO EP is built with OpenCL flags enabled. It takes in the remote context, i.e. the cl_context address, as a void pointer.|
| enable_opencl_throttling | string | True/False | boolean | This option enables OpenCL queue throttling for GPU devices (reduces CPU utilization when using GPU). |
| enable_qdq_optimizer | string | True/False | boolean | This option enables QDQ Optimization to improve model performance and accuracy on NPU. |
- | export_ep_ctx_blob | string | True/False | boolean | This option enables exporting the OpenVINO Compiled Blob as an ONNX Operator EPContext. |
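As an illustration, several of these options can be passed together from Python (a sketch; option names follow the table above, and the model path and cache directory are placeholders):

```
import onnxruntime as ort

session = ort.InferenceSession(
    "model.onnx",  # placeholder path
    providers=["OpenVINOExecutionProvider"],
    provider_options=[{
        "device_type": "GPU",
        "num_of_threads": "8",
        "cache_dir": "./ov_cache",           # enables model caching
        "enable_opencl_throttling": "True",  # reduces CPU utilization with GPU
    }],
)
```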


Valid Hetero or Multi or Auto Device combinations: