[Mobile] NNAPI on android does not accelerate with device GPU #18107
Comments
It depends on the model. If the model has dynamic input shapes, NNAPI can't be used. I'd suggest setting the logging severity to VERBOSE and checking the output for 'Node placements' to see which nodes, if any, are on NNAPI.
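The verbose-logging suggestion above can be sketched with the ORT C++ API; this is a minimal illustration (the application log identifier `"NNAPITest"` is a placeholder, not from the original thread):

```cpp
#include <onnxruntime_cxx_api.h>

int main() {
  // Creating the environment with VERBOSE severity makes ORT emit
  // 'Node placements' lines showing which nodes run on which EP.
  Ort::Env env(ORT_LOGGING_LEVEL_VERBOSE, "NNAPITest");

  // The severity can also be overridden per session.
  Ort::SessionOptions session_options;
  session_options.SetLogSeverityLevel(ORT_LOGGING_LEVEL_VERBOSE);
  return 0;
}
```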
Thank you so much for your quick reply! Yes, the model does have dynamic input shapes. If that's the case, is there an alternative way to get hardware acceleration? I set dynamic input shapes because I'm deploying an LLM on Android with onnxruntime, and the input to the LLM can have different sizes at runtime.
Unfortunately we don't have any way currently if the input shapes can't be made fixed (there are tools for that).
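The fixed-shape conversion mentioned above is typically done with the `make_dynamic_shape_fixed` helper in the onnxruntime Python tools; a sketch, where the dimension name `batch_size` and the file names are illustrative placeholders, not the reporter's actual model:

```shell
# Replace 'batch_size' with the dim_param name your model actually uses
# (inspect the model's inputs, e.g. with Netron, to find it).
python -m onnxruntime.tools.make_dynamic_shape_fixed \
  --dim_param batch_size --dim_value 1 \
  model.onnx model.fixed.onnx
```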
Thank you so much! I have changed the model to use static input shapes. However, when I run inference, NNAPI automatically selects google-edgetpu, which is not currently supported by onnxruntime. I do have google-armnn, though. May I know how to make NNAPI use armnn only? Or do I have to use the ArmNN execution provider? The log is shown below.
Again, thanks for your support!
Because you're telling NNAPI not to use the CPU, it's failing. We don't control what NNAPI chooses beyond forcing it not to use the CPU; it will pick whatever it thinks is best. Note that the model isn't going to run well with NNAPI anyway, because it is split into so many different partitions. If you run https://onnxruntime.ai/docs/tutorials/mobile/helpers/model-usability-checker.html it will tell you which operator(s) are not currently supported. There is an ArmNN execution provider that you could try in a custom build, but as it hasn't been updated for three years it a) may not build (IIRC the original usage when it was added did not involve Android), and b) it's not clear it would provide any value as-is. https://onnxruntime.ai/docs/build/eps.html#armnn
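The usability checker linked above is run as an onnxruntime Python tool; a sketch, with the model file name as a placeholder:

```shell
# Reports which operators/partitions are supported by the NNAPI and
# CoreML EPs, and flags anything blocking mobile deployment.
python -m onnxruntime.tools.check_onnx_model_mobile_usability \
  model.fixed.onnx
```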
Thanks!
Describe the issue
Recently, I attempted to integrate the NNAPIExecutionProvider into my ORT session using the NNAPI_FLAG_CPU_DISABLED flag. I previously tested inference with the NNAPI_FLAG_CPU_ONLY flag. Unfortunately, I observed no GPU acceleration during inference. Below is the log output when I launch my Android app:
The device I'm currently using is a Google Pixel 7 Pro.
To reproduce
The session with the NNAPI provider is created with the following code:
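The reporter's code block was not captured in this page; as a hedged illustration, session creation with the NNAPI EP and CPU fallback disabled typically looks like the following with the C++ API (the model path and env name are placeholders):

```cpp
#include <onnxruntime_cxx_api.h>
#include <nnapi_provider_factory.h>  // NNAPI EP factory (Android builds)

int main() {
  Ort::Env env(ORT_LOGGING_LEVEL_VERBOSE, "MyApp");
  Ort::SessionOptions session_options;

  // NNAPI_FLAG_CPU_DISABLED forces NNAPI to never fall back to its
  // own CPU implementation (nnapi-reference).
  uint32_t nnapi_flags = NNAPI_FLAG_CPU_DISABLED;
  Ort::ThrowOnError(OrtSessionOptionsAppendExecutionProvider_Nnapi(
      session_options, nnapi_flags));

  Ort::Session session(env, "model.fixed.onnx", session_options);
  return 0;
}
```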
Urgency
This is urgent.
Platform
Android
OS Version
13
ONNX Runtime Installation
Released Package
Compiler Version (if 'Built from Source')
No response
Package Name (if 'Released Package')
onnxruntime-android
ONNX Runtime Version or Commit ID
1.16.0
ONNX Runtime API
C++/C
Architecture
ARM64
Execution Provider
NNAPI
Execution Provider Library Version
No response