
[Mobile] NNAPI on android does not accelerate with device GPU #18107

Closed
zjc664656505 opened this issue Oct 26, 2023 · 6 comments
Labels
ep:ArmNN issues related to Arm NN execution provider platform:mobile issues related to ONNX Runtime mobile; typically submitted using template

Comments

@zjc664656505

Describe the issue

Recently, I attempted to integrate the NNAPIExecutionProvider into my ORT session using the NNAPI_FLAG_CPU_DISABLED flag. I had previously tested inference with the NNAPI_FLAG_CPU_ONLY flag. Unfortunately, I observed no GPU acceleration during inference.

Below is the log output when I initiate my Android app:

2023-10-25 22:39:46.710 26548-26548 Compatibil...geReporter com.example.llm_generative           D  Compat change id reported: 171979766; UID 10255; state: ENABLED
2023-10-25 22:39:46.719 26548-26548 ziparchive              com.example.llm_generative           W  Unable to open '/data/app/~~LGMsFvnWmfEV_S9dsvEivg==/com.example.llm_generative-d8JOVP0l6g1n8LJRMMrzfw==/base.dm': No such file or directory
2023-10-25 22:39:46.719 26548-26548 ziparchive              com.example.llm_generative           W  Unable to open '/data/app/~~LGMsFvnWmfEV_S9dsvEivg==/com.example.llm_generative-d8JOVP0l6g1n8LJRMMrzfw==/base.dm': No such file or directory
2023-10-25 22:39:46.868 26548-26548 GraphicsEnvironment     com.example.llm_generative           V  ANGLE Developer option for 'com.example.llm_generative' set to: 'default'
2023-10-25 22:39:46.868 26548-26548 GraphicsEnvironment     com.example.llm_generative           V  ANGLE GameManagerService for com.example.llm_generative: false
2023-10-25 22:39:46.868 26548-26548 GraphicsEnvironment     com.example.llm_generative           V  Neither updatable production driver nor prerelease driver is supported.
2023-10-25 22:39:46.870 26548-26548 NetworkSecurityConfig   com.example.llm_generative           D  No Network Security Config specified, using platform default
2023-10-25 22:39:46.871 26548-26548 NetworkSecurityConfig   com.example.llm_generative           D  No Network Security Config specified, using platform default
2023-10-25 22:39:46.880 26548-26581 vulkan                  com.example.llm_generative           D  searching for layers in '/data/app/~~LGMsFvnWmfEV_S9dsvEivg==/com.example.llm_generative-d8JOVP0l6g1n8LJRMMrzfw==/lib/arm64'
2023-10-25 22:39:46.880 26548-26581 vulkan                  com.example.llm_generative           D  searching for layers in '/data/app/~~LGMsFvnWmfEV_S9dsvEivg==/com.example.llm_generative-d8JOVP0l6g1n8LJRMMrzfw==/base.apk!/lib/arm64-v8a'
2023-10-25 22:39:47.031 26548-26548 Tokenizer Load          com.example.llm_generative           D  Loading Tokenizer
2023-10-25 22:39:49.134 26548-26548 Tokenizer Load          com.example.llm_generative           D  Tokenizer Loaded!
2023-10-25 22:39:57.926 26548-26548 Module Path             com.example.llm_generative           D  /data/user/0/com.example.llm_generative/cache/module_0.onnx
2023-10-25 22:40:00.642 26548-26548 Module Path             com.example.llm_generative           D  /data/user/0/com.example.llm_generative/cache/module_1.onnx
2023-10-25 22:40:09.543 26548-26548 Module Path             com.example.llm_generative           D  /data/user/0/com.example.llm_generative/cache/module_2.onnx
2023-10-25 22:40:10.775 26548-26548 Manager                 com.example.llm_generative           I  DeviceManager::DeviceManager
2023-10-25 22:40:10.775 26548-26548 ServerFlag              com.example.llm_generative           W  Failed to parse result of GetServerConfigurableFlag, errno=34
2023-10-25 22:40:10.775 26548-26548 Manager                 com.example.llm_generative           I  findAvailableDevices
2023-10-25 22:40:10.783 26548-26548 Manager                 com.example.llm_generative           I  Found interface google-edgetpu (version = 2.0)
2023-10-25 22:40:10.783 26548-26548 Manager                 com.example.llm_generative           I  Found interface google-armnn (version = ArmNN)
2023-10-25 22:40:10.785 26548-26548 libc                    com.example.llm_generative           W  Access denied finding property "ro.mediatek.platform"
2023-10-25 22:40:10.785 26548-26548 libc                    com.example.llm_generative           W  Access denied finding property "ro.chipname"
2023-10-25 22:40:10.785 26548-26548 libc                    com.example.llm_generative           W  Access denied finding property "ro.hardware.chipname"

The device I'm currently using is a Google Pixel 7 Pro.

To reproduce

The session with the NNAPI provider is created with the following code:

#include <string>

#include "onnxruntime_cxx_api.h"
// NNAPI provider factory header: defines the NNAPI_FLAG_* values and
// OrtSessionOptionsAppendExecutionProvider_Nnapi.
#include "nnapi_provider_factory.h"

// class for loading inference model paths
struct ArtifactPaths {
    std::string inference_model_path;

    ArtifactPaths(const std::string &inference_model_path) :
            inference_model_path(inference_model_path) {}
};

// class for creating onnx session
struct SessionCache {
    ArtifactPaths artifact_paths;
    Ort::Env ort_env;
    Ort::SessionOptions session_options;
    Ort::Session inference_session;

    SessionCache(const std::string &inference_model_path) :
            artifact_paths(inference_model_path),
            ort_env(ORT_LOGGING_LEVEL_WARNING, "distributed inference demo"),
            session_options(),
            inference_session(CreateInferenceSession(inference_model_path)) {
    }

private:
    Ort::Session CreateInferenceSession(const std::string &model_path) {
        // Set NNAPI execution provider
        uint32_t nnapi_flag = 0;
        nnapi_flag |= NNAPI_FLAG_CPU_DISABLED;
        Ort::ThrowOnError(OrtSessionOptionsAppendExecutionProvider_Nnapi(session_options, nnapi_flag));

        // Set graph optimization level
        session_options.SetGraphOptimizationLevel(GraphOptimizationLevel::ORT_ENABLE_ALL);

        // Return the created inference session
        return Ort::Session(ort_env, model_path.c_str(), session_options);
    }
};

Urgency

This is urgent.

Platform

Android

OS Version

13

ONNX Runtime Installation

Released Package

Compiler Version (if 'Built from Source')

No response

Package Name (if 'Released Package')

onnxruntime-android

ONNX Runtime Version or Commit ID

1.16.0

ONNX Runtime API

C++/C

Architecture

ARM64

Execution Provider

NNAPI

Execution Provider Library Version

No response

@zjc664656505 zjc664656505 added the platform:mobile issues related to ONNX Runtime mobile; typically submitted using template label Oct 26, 2023
@github-actions github-actions bot added the ep:ArmNN issues related to Arm NN execution provider label Oct 26, 2023
@skottmckay
Contributor

It depends on the model as to what is possible. If the model has dynamic input shapes, NNAPI can't be used.

I'd suggest setting the logging severity to VERBOSE and checking the output for 'Node placements' to see which nodes, if any, are assigned to NNAPI.
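A minimal sketch of how the logging severity could be raised, assuming the same Ort::Env and Ort::SessionOptions setup as in the issue (the env name and exact placement are illustrative):

```cpp
#include "onnxruntime_cxx_api.h"

// Create the environment with VERBOSE logging so messages such as
// "Node placements" show up in logcat.
Ort::Env ort_env(ORT_LOGGING_LEVEL_VERBOSE, "distributed inference demo");

Ort::SessionOptions session_options;
// ORT_LOGGING_LEVEL_VERBOSE == 0; this raises the per-session
// log severity as well.
session_options.SetLogSeverityLevel(ORT_LOGGING_LEVEL_VERBOSE);
```

With this in place, the node-placement lines appear in the app's logcat output when the session is created.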

@zjc664656505
Author

zjc664656505 commented Oct 26, 2023

Thank you so much for your quick reply! Yes, the model does have dynamic input shapes. If that is the case, is there an alternative way to get hardware acceleration?

The reason I set dynamic input shapes is that I'm deploying an LLM on Android with onnxruntime, and the input to the LLM can have different sizes at runtime.

@skottmckay
Contributor

Unfortunately we don't currently have any way to accelerate if the input shapes can't be made fixed (tools for that here).
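For reference, fixing a dynamic dimension can be done with the onnxruntime Python tooling; a sketch, where the model file names and the `batch_size` dimension name are illustrative:

```shell
# Replace a named dynamic dimension with a fixed value, writing a new model.
python -m onnxruntime.tools.make_dynamic_shape_fixed \
    --dim_param batch_size --dim_value 1 \
    model.onnx model.fixed.onnx
```

The tool also accepts `--input_name`/`--input_shape` to pin the full shape of a specific input instead of a single named dimension.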

@zjc664656505
Author

zjc664656505 commented Oct 26, 2023

Thank you so much!

I have changed the model to use static input shapes. However, when I run inference, NNAPI automatically selects google-edgetpu, which onnxruntime does not currently support. I do have google-armnn, though.

May I know how to make NNAPI use armnn only?

Or do I have to use the ArmNN execution provider?

The log is shown below

2023-10-26 01:06:06.094  3948-3948  ServerFlag              com.example.llm_generative           W  Failed to parse result of GetServerConfigurableFlag, errno=34
2023-10-26 01:06:06.094  3948-3948  Manager                 com.example.llm_generative           I  findAvailableDevices
2023-10-26 01:06:06.146  3948-3948  Manager                 com.example.llm_generative           I  Found interface google-edgetpu (version = 2.0)
2023-10-26 01:06:06.146  3948-3948  Manager                 com.example.llm_generative           I  Found interface google-armnn (version = ArmNN)
2023-10-26 01:06:06.154  3948-3948  libc                    com.example.llm_generative           W  Access denied finding property "ro.mediatek.platform"
2023-10-26 01:06:06.154  3948-3948  libc                    com.example.llm_generative           W  Access denied finding property "ro.chipname"
2023-10-26 01:06:06.154  3948-3948  libc                    com.example.llm_generative           W  Access denied finding property "ro.hardware.chipname"
2023-10-26 01:06:06.278  3948-3948  onnxruntime             com.example.llm_generative           W   [W:onnxruntime:distributed inference demo, nnapi_execution_provider.cc:225 GetCapability] NnapiExecutionProvider::GetCapability, number of partitions supported by NNAPI: 47 number of nodes in the graph: 460 number of nodes supported by NNAPI: 397
2023-10-26 01:06:06.280  3948-3948  onnxruntime             com.example.llm_generative           W   [W:onnxruntime:distributed inference demo, nnapi_execution_provider.cc:225 GetCapability] NnapiExecutionProvider::GetCapability, number of partitions supported by NNAPI: 47 number of nodes in the graph: 460 number of nodes supported by NNAPI: 397
2023-10-26 01:06:06.290  3948-3948  TypeManager             com.example.llm_generative           I  Failed to read /vendor/etc/nnapi_extensions_app_allowlist ; No app allowlisted for vendor extensions use.
2023-10-26 01:06:06.554  3948-3948  libc++abi               com.example.llm_generative           E  terminating due to uncaught exception of type Ort::Exception: The model cannot run using the current set of target devices, [Name: [google-edgetpu], Type [4]],  ,
2023-10-26 01:06:06.564  3948-3948  libc                    com.example.llm_generative           A  Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 3948 (binder:3948_4), pid 3948 (binder:3948_4)

Again, thanks for your support!

@skottmckay
Contributor

Because you're telling NNAPI not to use the CPU, it's failing. We don't control which device NNAPI chooses beyond forcing it not to use the CPU; it will pick whatever it thinks is best.

Note that the model isn't going to run well with NNAPI due to there being so many different partitions. If you run https://onnxruntime.ai/docs/tutorials/mobile/helpers/model-usability-checker.html it will tell you which operators are not currently supported.
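The checker linked above ships in the onnxruntime Python package; a sketch, with the model path being illustrative:

```shell
# Reports per-op NNAPI/CoreML support and the resulting partition count.
python -m onnxruntime.tools.check_onnx_model_mobile_usability model.onnx
```

A high partition count in its output is the same symptom visible in the GetCapability log line above (47 partitions for 460 nodes), which causes costly hand-offs between NNAPI and the CPU fallback.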

There is an Arm NN execution provider that you could maybe do a custom build with, but as that hasn't been updated for three years, a) it may not build (IIRC the original usage when it was added did not involve Android), and b) it's not clear it would provide any value as-is. https://onnxruntime.ai/docs/build/eps.html#armnn

@zjc664656505
Author

Thanks!
