
[ONNXRuntimeError] How to compare the precision layer by layer with engine(for example, tensorRT) if I have a custom operator in my onnx (and a corresponding plugin in TensorRT)? #22938

Open
MyraYu2022 opened this issue Nov 25, 2024 · 2 comments
Labels
build build issues; typically submitted using template ep:TensorRT issues related to TensorRT execution provider

Comments

@MyraYu2022

Describe the issue

I used to compare my TensorRT engine against the ONNX model layer by layer with the polygraphy tool. Now my ONNX model contains a custom operator I wrote myself, with a corresponding plugin in TensorRT. How can I compare them layer by layer using polygraphy?

I wrote the .sh script like this:

 polygraphy run my_model_with_custom_operator.onnx  \
                --onnxrt --trt  \
                --trt-outputs mark all  \
                --onnx-outputs mark all  \
                --atol 1e-3  --rtol 1e-3  \
                --fail-fast  \
                --val-range [0,1]  \
                --verbose 

And I get the error below:
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Fatal error: customPlugin(-1) is not a registered function/op

Thank you very much.

Environment

TensorRT Version:
8.4.0.6

NVIDIA Driver Version:
535.183.01

CUDA Version:
11.3

CUDNN Version:
8.3.2.44

Operating System:

Python Version (if applicable):
3.9.7

PyTorch Version (if applicable):
1.13.0

Urgency

No response

Target platform

onnxruntime

Build script

 polygraphy run my_model_with_custom_operator.onnx  \
                --onnxrt --trt  \
                --trt-outputs mark all  \
                --onnx-outputs mark all  \
                --atol 1e-3  --rtol 1e-3  \
                --fail-fast  \
                --val-range [0,1]  \
                --verbose 

Error / output

onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Fatal error: customPlugin(-1) is not a registered function/op

Visual Studio Version

No response

GCC / Compiler Version

No response

@MyraYu2022 MyraYu2022 added the build build issues; typically submitted using template label Nov 25, 2024
@github-actions github-actions bot added the ep:TensorRT issues related to TensorRT execution provider label Nov 25, 2024
@skottmckay
Contributor

I don't know what polygraphy is, but you need to register the custom operator library in the session options before creating the inference session.

https://onnxruntime.ai/docs/api/python/api_summary.html#onnxruntime.SessionOptions.register_custom_ops_library
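
A minimal sketch of that registration flow, assuming the paths are placeholders for your model and compiled custom-op library:

```python
def make_session(model_path, custom_op_lib_path):
    """Create an InferenceSession with a custom-op library registered.

    Registration must happen on the SessionOptions *before* the session is
    created; otherwise ORT fails with "<op>(-1) is not a registered function/op".
    """
    import onnxruntime as ort  # imported lazily so the sketch stays self-contained

    so = ort.SessionOptions()
    so.register_custom_ops_library(custom_op_lib_path)
    return ort.InferenceSession(model_path, sess_options=so)


# Hypothetical usage -- adjust both paths to your own files:
# sess = make_session("my_model_with_custom_operator.onnx", "./mylib.so")
```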

@MyraYu2022
Author

> I don't know what polygraphy is, but you need to register the custom operator library in the session options before creating the inference session.
>
> https://onnxruntime.ai/docs/api/python/api_summary.html#onnxruntime.SessionOptions.register_custom_ops_library

Thank you very much, it really helps a lot. But when I used the Python API register_custom_ops_library on onnxruntime.SessionOptions, I got an error.

I wrote my Python code like this:

options = onnxruntime.SessionOptions()
options.register_custom_ops_library("path of mylib.so")

And I got this error:
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Failed to load library mylib.so with error: mylib.so: undefined symbol: OrtGetApiBase

I would like to know: is something wrong with my .so file?
I wrote and built my .so file with reference to the example in onnxruntime/test/testdata/custom_op_library;
my custom_op_library.cc file looks like this:

// Define ORT_API_MANUAL_INIT before including the C++ API header so the
// wrapper does not reference OrtGetApiBase at library load time.
#define ORT_API_MANUAL_INIT
#include "onnxruntime_cxx_api.h"
#include "onnxruntime_lite_custom_op.h"

#include <memory>

static const char* cust_lib = "riscv_test";

OrtStatus* ORT_API_CALL RegisterCustomOps(OrtSessionOptions* options, const OrtApiBase* api) {
   // Initialize the API from the pointer ORT hands in, instead of calling OrtGetApiBase().
   Ort::InitApi(api->GetApi(ORT_API_VERSION));
   Ort::CustomOpDomain domain{cust_lib};

   // CustomKernel is the kernel function implementing the "Custom" op.
   std::unique_ptr<Ort::Custom::OrtLiteCustomOp> custom{
       Ort::Custom::CreateLiteCustomOp("Custom", "CPUExecutionProvider", CustomKernel)};
   domain.Add(custom.get());

   Ort::UnownedSessionOptions session_options(options);
   session_options.Add(domain);
   // Keep the domain alive for the session's lifetime (as in the ORT test example).
   AddOrtCustomOpDomainToContainer(std::move(domain));
   return nullptr;
}

Thanks.
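
A build sketch for the library, under the assumption that the undefined OrtGetApiBase symbol comes from the C++ wrapper referencing it at load time; defining ORT_API_MANUAL_INIT (and calling Ort::InitApi from RegisterCustomOps) avoids that reference. The paths here are placeholders for your own setup:

```shell
# Build the custom-op library WITHOUT linking against onnxruntime itself;
# the host process supplies the ORT API through RegisterCustomOps.
# ORT_INCLUDE is an assumption -- point it at your onnxruntime headers.
ORT_INCLUDE=/path/to/onnxruntime/include

g++ -shared -fPIC -std=c++17 \
    -DORT_API_MANUAL_INIT \
    -I"$ORT_INCLUDE" \
    custom_op_library.cc \
    -o mylib.so

# Sanity check: the built library should no longer reference OrtGetApiBase.
nm -D mylib.so | grep -i OrtGetApiBase || echo "no OrtGetApiBase reference (good)"
```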
