[QNN EP] Make QNN EP a shared library #23120
Merged
…osed by the provider bridge.
…evert this in favor of doing the transpose manually in QNN EP
…entType(), DataTypeImpl::TensorTypeFromONNXEnum()
…tions not available in the provider bridge.
…hat does not need to add new functionality to the provider bridge
…d bug in qnn_configs_helper
HectorSVC
previously approved these changes
Jan 10, 2025
onnxruntime/core/providers/shared_library/provider_wrappedtypes.h
…ternal qnn ep apis (can no longer do that)
adrianlizarraga
added a commit
that referenced
this pull request
Jan 17, 2025
…23402)

### Description
- Fixes segfault when the function that cleans up HTP memory handles uses an invalid Logger.
- Fixes unit test that compares output from QNN EP with exact float values. QNN HTP runs float32 models with float16 precision, so need to use a tolerance in the comparison.

### Motivation and Context
Fixes issues with using QNN HTP memory sharing on Windows ARM64. This is also needed to test HTP shared memory with #23120.
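The tolerance point in the commit message above can be illustrated with a small standalone sketch. It is not the unit test from the referenced commit: the helper names `to_float16` and `all_close` and the tolerance values are assumptions chosen for illustration. It only shows why values that pass through float16 precision fail an exact comparison but pass a tolerance-based one.

```python
import math
import struct


def to_float16(x: float) -> float:
    """Round x to IEEE 754 half precision and return it as a Python float.

    Hypothetical helper: it stands in for the precision loss of running a
    float32 model with float16 precision.
    """
    return struct.unpack("<e", struct.pack("<e", x))[0]


def all_close(actual, expected, rel_tol=1e-3, abs_tol=1e-3):
    """Element-wise comparison with a tolerance (tolerance values are assumptions)."""
    return len(actual) == len(expected) and all(
        math.isclose(a, e, rel_tol=rel_tol, abs_tol=abs_tol)
        for a, e in zip(actual, expected)
    )


# Reference values as a full-precision backend might produce them.
expected = [0.1, 0.2, 0.3, 1.5]
# The same values after being rounded to float16.
actual = [to_float16(v) for v in expected]

exact_match = actual == expected              # exact comparison
tolerant_match = all_close(actual, expected)  # tolerance-based comparison
```

With these inputs the exact comparison fails because 0.1, 0.2 and 0.3 are not representable in half precision, while the tolerance-based comparison succeeds.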
jywu-msft
approved these changes
Jan 22, 2025
adrianlizarraga
added a commit
that referenced
this pull request
Jan 23, 2025
#23467)

### Description
Fixes QNN EP builds due to missing function in provider bridge API: `logging::LoggingManager::HasDefaultLogger()`

### Motivation and Context
A [recent PR](#23120) made QNN EP a shared library. A [different PR](#23435) added use of a new function to QNN EP that was not part of the provider bridge API. The CI did not catch it because main was not merged into the first PR before merging.
ashrit-ms
pushed a commit
that referenced
this pull request
Jan 23, 2025
### Description
- Makes QNN EP a shared library **by default** when building with `--use_qnn` or `--use_qnn shared_lib`. Generates the following build artifacts:
  - **Windows**: `onnxruntime_providers_qnn.dll` and `onnxruntime_providers_shared.dll`
  - **Linux**: `libonnxruntime_providers_qnn.so` and `libonnxruntime_providers_shared.so`
  - **Android**: Not supported. Must build QNN EP as a static library.
- Allows QNN EP to still be built as a static library with `--use_qnn static_lib`. This is primarily for the Android QNN AAR package.
- Unit tests run for both the static and shared QNN EP builds.

### Detailed changes
- Updates Java bindings to support both shared and static QNN EP builds.
- Provider bridge API:
  - Adds logging sink ETW to the provider bridge. Allows EPs to register ETW callbacks for ORT logging.
  - Adds a variety of methods for onnxruntime objects that are needed by QNN EP.
- QNN EP:
  - Adds `ort_api.h` and `ort_api.cc` that encapsulates the API provided by ORT in a manner that allows the EP to be built as either a shared or static library.
  - Adds custom function to transpose weights for Conv and Gemm (instead of adding util to provider bridge API).
  - Adds custom function to quantize data for LeakyRelu (instead of adding util to provider bridge API).
  - Adds custom ETW tracing for QNN profiling events:
    - shared library: defines its own TraceLogging provider handle
    - static library: uses ORT's TraceLogging provider handle and existing telemetry provider.
- ORT-QNN Packages:
  - **Python**: Pipelines build QNN EP as a shared library by default. User can build a local python wheel with QNN EP as a static library by passing `--use_qnn static_lib`.
  - **NuGet**: Pipelines build QNN EP as a shared library by default. `build.py` currently enforces QNN EP to be built as a shared library. Can add support for building a QNN NuGet package with static later if deemed necessary.
  - **Android**: Pipelines build QNN EP as a **static library**. `build.py` enforces QNN EP to be built as a static library. Packaging multiple shared libraries into an Android AAR package is not currently supported due to the added need to also distribute a shared libcpp.so library.

### Motivation and Context
<!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->
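The mapping from build mode and platform to build artifacts described above can be summarized in a short sketch. This is not code from `build.py`: the function `expected_qnn_artifacts` and the table `SHARED_ARTIFACTS` are hypothetical names, and returning an empty list for a static build is an assumption (the description lists separate provider libraries only for the shared build).

```python
from typing import List

# Provider libraries listed in the description for the shared QNN EP build.
SHARED_ARTIFACTS = {
    "windows": ["onnxruntime_providers_qnn.dll", "onnxruntime_providers_shared.dll"],
    "linux": ["libonnxruntime_providers_qnn.so", "libonnxruntime_providers_shared.so"],
}


def expected_qnn_artifacts(platform: str, use_qnn: str = "shared_lib") -> List[str]:
    """Return the QNN provider libraries expected for a platform and build mode.

    Hypothetical helper for illustration only. `use_qnn` mirrors the value
    passed after `--use_qnn`; the default is `shared_lib`, matching the
    "shared library by default" behavior in the description.
    """
    platform = platform.lower()
    if use_qnn not in ("shared_lib", "static_lib"):
        raise ValueError(f"unknown --use_qnn value: {use_qnn!r}")
    if use_qnn == "static_lib":
        # Assumption: a static build produces no separate provider libraries.
        return []
    if platform == "android":
        # The description states that the shared build is not supported on Android.
        raise ValueError("Android must build QNN EP as a static library")
    if platform not in SHARED_ARTIFACTS:
        raise ValueError(f"no artifact list known for platform: {platform!r}")
    return list(SHARED_ARTIFACTS[platform])
```

For example, `expected_qnn_artifacts("linux")` returns the two `.so` names listed for Linux, and `expected_qnn_artifacts("android", "static_lib")` returns an empty list under the assumption stated above.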
ashrit-ms
added a commit
that referenced
this pull request
Jan 23, 2025
### Description
This PR is to update the win-ort-main branch to the tip main branch as of 2025-01-23.

### PR List
- ddf0d37 [QNN EP] Add LoggingManager::HasDefaultLogger() to provider bridge API (#23467)
- 05fbbdf [QNN EP] Make QNN EP a shared library (#23120)
- 1336566 Add custom vcpkg ports (#23456)
- 2e1173c Update the compile flags for vcpkg packages (#23455)
- 1f628a9 [Mobile] Add BrowserStack Android MAUI Test (#23383)
- 009cae0 [js/webgpu] Optimize ConvTranspose (Continue) (#23429)
- 04a4a69 Use onnx_protobuf.h to suppress some GCC warnings (#23453)
- 2e3b62b Suppress some strict-aliasing related warnings in WebGPU EP (#23454)
- b708f9b Bump ruff from 0.9.1 to 0.9.2 (#23427)
- c0afc66 [WebNN] Remove workarounds for TFLite backend (#23406)
- 8a821ff Bump vite from 6.0.7 to 6.0.11 in /js/web/test/e2e/exports/testcases/vite-default (#23446)
- 220c1a2 Make ORT and Dawn use the same protobuf/abseil source code (#23447)
- b7b5792 Change MacOS-13 to ubuntu on for android-java-api-aar-test.yml. (#23444)
- 19d0d2a WIP: Dp4MatMulNBits accuracy level 4 matmul for WebGPU EP (#23365)
- 95b8eff [QNN EP]: Clean up QNN logging resources if an error occurs during initialization (#23435)
- 626134c Bump clang-format from 19.1.6 to 19.1.7 (#23428)
- 0cf9753 Fix eigen external deps (#23439)
- f9440ae Moving RN_CI Android Testing to Linux (#23422)
- 1aa5902 [QNN EP] workaround for QNN validation bug for Tanh with uint16 quantized output (#23432)
- 7f5582a Seperate RN andriod and IOS into 2 separated Stages. (#23400)
- 73deac2 Implement some missing element wise Add/Sub/Mul/Div/Neg operations for CPU and CUDA EPs (#23090)
- 949fe42 Upgrade Java version from react-native/android to Java 17 (#23066)
- 0892c23 Update Qnn SDK default version to 2.30 (#23411)
- 94c099b Fix type cast build error (#23423)
- d633e57 [WebNN EP] Fix AddInitializersToSkip issues (#23354)
- e988ef0 [QNN EP] Fix regression for MatMul with two quantized/dynamic uint16 inputs (#23419)
- 7538795 Update onnxruntime binary size checks ci pipeline's docker image (#23405)
- 6c5ea41 Revert "[QNN EP] Clean up correctly from a partial setup (#23320)" (#23420)
- e866804 Enable comprehension simplification in ruff rules (#23414)
- 0a5f1f3 bugfix: string_view of invalid memory (#23417)
- 4cc38e0 fix crash when first input of BatchNormalization is 1-D (#23387)
- 0334414 Target py310 and modernize codebase with ruff (#23401)
- 87341ac [QNN EP] Fix segfault when unregistering HTP shared memory handles (#23402)

### Motivation and Context
This update includes the change to make QNN-EP a shared library.

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Adrian Lizarraga <[email protected]>
Co-authored-by: Justin Chu <[email protected]>
Co-authored-by: Yulong Wang <[email protected]>
Co-authored-by: Edward Chen <[email protected]>
Co-authored-by: Changming Sun <[email protected]>
Co-authored-by: Peishen Yan <[email protected]>
Co-authored-by: Tianlei Wu <[email protected]>
Co-authored-by: Hector Li <[email protected]>
Co-authored-by: Jian Chen <[email protected]>
Co-authored-by: Alexis Tsogias <[email protected]>
Co-authored-by: junchao-zhao <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: sushraja-msft <[email protected]>
Co-authored-by: Wanming Lin <[email protected]>
Co-authored-by: Jiajia Qin <[email protected]>
Co-authored-by: Caroline Zhu <[email protected]>