
[Mobile] QNN-EP graph preparation failed #21800

Closed
edupuis-psee opened this issue Aug 20, 2024 · 3 comments
Assignees
Labels
ep:QNN issues related to QNN execution provider platform:mobile issues related to ONNX Runtime mobile; typically submitted using template

Comments

@edupuis-psee

Describe the issue

I'm struggling to run inference of an ONNX model with the QNN EP on an Android device; graph preparation fails with the following trace:

[2022-06-15 22:54:47.876] [trace] graph_prepare.cc:204:ERROR:could not create op: q::flat_from_vtcm
[2022-06-15 22:54:47.876] [trace] graph_prepare.cc:1377:ERROR:Op 0x102b4800000023 preparation failed with err:-1
[2022-06-15 22:54:47.876] [trace]  <E> "GridSample" generated: could not create op
[2022-06-15 22:54:47.876] [trace]  <E> RouterFastRPC graph prepare failed 12
[2022-06-15 22:54:47.876] [trace]  <V> Async property not supported. Skipping register Async context
[2022-06-15 22:54:47.876] [trace]  <E> Failed to finalize graph (id: 1) with err 1002
[2022-06-15 22:54:47.876] [trace]  <V> Wake up free backend (id: 1)'s thread(s)
[2022-06-15 22:54:47.876] [trace]  <I> QnnGraph_finalize done. status 0x3ea
[2022-06-15 22:54:47.876] [error] Failed to finalize QNN graph.

The model:
(screenshot of the exported ONNX graph)

The QNN EP config:

    std::unordered_map<std::string, std::string> qnn_options;
    qnn_options["backend_path"] = "libQnnHtp.so";
    qnn_options["profiling_level"] = "basic";
    qnn_options["profiling_file_path"] = qnn_profiling.string();
    qnn_options["htp_graph_finalization_optimization_mode"] = "3";
    qnn_options["htp_performance_mode"] = "burst";
    qnn_options["rpc_control_latency"] = "100";
    qnn_options["htp_arch"] = "69";
    qnn_options["soc_model"] = "36";
    qnn_options["vtcm_mb"] = "8";
    qnn_options["qnn_context_priority"] = "high";

Tested with both QNN 2.24 and 2.25, and with ORT 1.18.1 and 1.19.

The grid is large (4K resolution), so it might be a VTCM memory issue with the tiling, but I have no way to confirm that. Does anyone know how I can check?

Interestingly enough, removing the multiplication and subtraction operations works around the failure.

I wonder if there is any way to run this specific op on another EP, but I'm very new to execution providers and haven't found the corresponding documentation yet.

Thank you in advance

To reproduce

To obtain a minimal model that reproduces the issue:

import torch
import torch.nn as nn
import torch.onnx

# Define the model that includes grid_sample operation
class GridSampleModel(nn.Module):
    def forward(self, x, grid):
        return 0.5 * nn.functional.grid_sample(x * 3, grid - 0.5, mode='bilinear', padding_mode='zeros', align_corners=False)

# Create example tensors for the input and grid
x = torch.randn(1, 1, 720, 1280)  # Example input tensor (N, C, H, W)
grid = torch.randn(1, 3072, 4096, 2)  # Example grid tensor (N, H_out, W_out, 2)

# Initialize the model
model = GridSampleModel()

# Set the model to evaluation mode
model.eval()

# Path to save the ONNX model
onnx_path = "grid_sample_model.onnx"

# Export the model
torch.onnx.export(
    model,                        # model being run
    (x, grid),                    # model input (or a tuple for multiple inputs)
    onnx_path,                    # where to save the model (can be a file or file-like object)
    export_params=True,           # store the trained parameter weights inside the model file
    opset_version=16,             # the ONNX version to export the model to
    do_constant_folding=True,     # whether to execute constant folding for optimization
    input_names=['input', 'grid'],   # the model's input names
    output_names=['output'],      # the model's output names
)

print(f"Model exported to {onnx_path}")

Urgency

No response

Platform

Android

OS Version

12

ONNX Runtime Installation

Built from Source

Compiler Version (if 'Built from Source')

ndk26c

Package Name (if 'Released Package')

onnxruntime-android

ONNX Runtime Version or Commit ID

1.19.0

ONNX Runtime API

C++/C

Architecture

X64

Execution Provider

Other / Unknown

Execution Provider Library Version

qnn-v2.25.0.240728104910_97711

@edupuis-psee edupuis-psee added the platform:mobile issues related to ONNX Runtime mobile; typically submitted using template label Aug 20, 2024
@github-actions github-actions bot added the ep:QNN issues related to QNN execution provider label Aug 20, 2024
@HectorSVC
Contributor

Are you trying to run an fp32 model on the HTP backend? Try setting enable_htp_fp16_precision. The HTP backend doesn't really support fp32; fp32 is only there for functionality verification.
Try with the settings below:
qnn_options["backend_path"] = "libQnnHtp.so";
qnn_options["profiling_level"] = "basic";
qnn_options["profiling_file_path"] = qnn_profiling.string();
qnn_options["htp_graph_finalization_optimization_mode"] = "3";
qnn_options["htp_performance_mode"] = "burst";
qnn_options["rpc_control_latency"] = "100";
qnn_options["soc_model"] = "36";
qnn_options["enable_htp_fp16_precision"] = "1";
qnn_options["qnn_context_priority"] = "high";

I don't have a device with soc_model=36, but I tried the offline context binary generation and it worked. You can try that as well: on a Linux x86 or Windows x86 system, run the command below (Windows x86 for example):
onnxruntime_perf_test.exe -n -e qnn -i "backend_path|QnnHtp.dll soc_model|36 htp_graph_finalization_optimization_mode|3 enable_htp_fp16_precision|1" -C "ep.context_enable|1" -m times -r 1 -I .\QNN_issues\grid_sample_model.onnx
It will create an ONNX model with the QNN context binary embedded, grid_sample_model.onnx_ctx.onnx. You can run that model on the device. I attached the one I generated.
grid_sample_model.onnx_ctx.zip
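
On the device, the generated _ctx.onnx model is then loaded like any other model with QNN-EP registered (a minimal sketch assuming the C++ API; paths and option values are placeholders):

    // Minimal sketch: loading the generated context-binary model on device.
    #include <onnxruntime_cxx_api.h>
    #include <string>
    #include <unordered_map>

    Ort::Session LoadContextModel(Ort::Env& env) {
        Ort::SessionOptions session_options;
        std::unordered_map<std::string, std::string> qnn_options;
        qnn_options["backend_path"] = "libQnnHtp.so";
        session_options.AppendExecutionProvider("QNN", qnn_options);
        // The _ctx.onnx model embeds the pre-compiled QNN context, so the
        // expensive on-device graph finalization step is avoided.
        return Ort::Session(env, "grid_sample_model.onnx_ctx.onnx", session_options);
    }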

@HectorSVC
Contributor

@edupuis-psee Any updates?

@edupuis-psee
Author

Thank you for your help, I was indeed able to run the model on the device thanks to fp16 precision.
