forked from microsoft/onnxruntime
Commit
Enable QNN HTP spill fill buffer setting to save RAM usage. (microsoft#22853)

### Description
Enable the QNN HTP spill fill buffer setting to reduce RAM usage. This feature is available after QNN 2.28. The QNN context binary must be regenerated.

https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-50/htp_backend.html#qnn-htp-backend-api

Requirements:
1. Regenerate the ONNX model with the QNN context binary by setting the EP option enable_htp_spill_fill_buffer = 1.
2. Works for a model with multiple context binaries. Need to manually merge the 2 ONNX models with context binaries into 1 ONNX model.
3. Requires a Linux platform if the context binary is generated offline, since the QnnSystem lib is not available for the Windows x86_64 platform.

No extra steps are needed when running model inference.

The generated EPContext node will have a max_size attribute with the maximum spill fill buffer size for the context binary.

<img width="353" alt="image" src="https://github.com/user-attachments/assets/a3bf48be-a8da-4381-8a1d-3f2558eea37d">

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
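
The regeneration step in requirement 1 could look roughly like the following Python sketch. It is illustrative only, not the code from this commit. The helper names (`build_context_gen_options`, `generate_context_model`, `read_max_sizes`), the file paths, and the default `libQnnHtp.so` backend path are assumptions. The `ep.context_enable` and `ep.context_file_path` session entries and the `backend_path` provider option are existing ONNX Runtime settings, and `enable_htp_spill_fill_buffer` is the option named in the description. Reading `max_size` as an integer attribute is also an assumption based on the description.

```python
from typing import Dict, Tuple


def build_context_gen_options(
    context_path: str,
    backend_path: str = "libQnnHtp.so",  # assumption: HTP backend library name on Linux
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (session config entries, QNN provider options) for context generation."""
    session_entries = {
        "ep.context_enable": "1",              # dump an ONNX model with EPContext nodes
        "ep.context_file_path": context_path,  # where the generated model is written
    }
    provider_options = {
        "backend_path": backend_path,
        "enable_htp_spill_fill_buffer": "1",   # EP option named in the commit description
    }
    return session_entries, provider_options


def generate_context_model(model_path: str, context_path: str) -> None:
    """Create a session so the QNN EP writes the context model (needs a QNN-enabled build)."""
    import onnxruntime as ort  # imported lazily so the option builder works without it

    session_entries, provider_options = build_context_gen_options(context_path)
    so = ort.SessionOptions()
    for key, value in session_entries.items():
        so.add_session_config_entry(key, value)
    # Session creation triggers compilation and context binary generation.
    ort.InferenceSession(
        model_path,
        sess_options=so,
        providers=["QNNExecutionProvider"],
        provider_options=[provider_options],
    )


def read_max_sizes(context_model_path: str) -> Dict[str, int]:
    """Collect the max_size attribute of each EPContext node (assumed to be an int)."""
    import onnx  # imported lazily; only needed for inspection

    model = onnx.load(context_model_path, load_external_data=False)
    sizes: Dict[str, int] = {}
    for node in model.graph.node:
        if node.op_type != "EPContext":
            continue
        for attr in node.attribute:
            if attr.name == "max_size":
                sizes[node.name] = attr.i
    return sizes
```

A call such as `generate_context_model("model.onnx", "model_ctx.onnx")` followed by `read_max_sizes("model_ctx.onnx")` would be one way to confirm that the regenerated model carries the `max_size` attribute; both file names are placeholders.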
Showing 12 changed files with 208 additions and 50 deletions.