
MAC can't find libonnxruntime_providers_shared.dylib #17532

Closed
pfeatherstone opened this issue Sep 13, 2023 · 14 comments
Labels
platform:mobile issues related to ONNX Runtime mobile; typically submitted using template

Comments

@pfeatherstone

pfeatherstone commented Sep 13, 2023

Describe the issue

I'm trying to run a model using CoreML. Some of the ops aren't supported, so I assume it tries to run parts of the model on the CPU. At that point it calls dlopen("libonnxruntime_providers_shared.dylib") and fails.

The full trace:

2023-09-13 12:38:46.849 Engine[29033:276472] 2023-09-13 12:38:46.849410 [W:onnxruntime:, helper.cc:66 IsInputSupported] Dynamic shape is not supported for now, for input:/Split_output_0
2023-09-13 12:38:46.849 Engine[29033:276472] 2023-09-13 12:38:46.849932 [W:onnxruntime:, helper.cc:66 IsInputSupported] Dynamic shape is not supported for now, for input:/Split_output_1
2023-09-13 12:38:46.849 Engine[29033:276472] 2023-09-13 12:38:46.849961 [W:onnxruntime:, helper.cc:66 IsInputSupported] Dynamic shape is not supported for now, for input:/Squeeze_output_0
2023-09-13 12:38:46.849 Engine[29033:276472] 2023-09-13 12:38:46.849981 [W:onnxruntime:, helper.cc:66 IsInputSupported] Dynamic shape is not supported for now, for input:/Squeeze_1_output_0
2023-09-13 12:38:46.850 Engine[29033:276472] 2023-09-13 12:38:46.849999 [W:onnxruntime:, helper.cc:66 IsInputSupported] Dynamic shape is not supported for now, for input:/Gather_9_output_0
2023-09-13 12:38:46.850 Engine[29033:276472] 2023-09-13 12:38:46.850017 [W:onnxruntime:, helper.cc:66 IsInputSupported] Dynamic shape is not supported for now, for input:/Reshape_2_output_0
2023-09-13 12:38:46.850 Engine[29033:276472] 2023-09-13 12:38:46.850040 [W:onnxruntime:, helper.cc:66 IsInputSupported] Dynamic shape is not supported for now, for input:/Range_output_0
2023-09-13 12:38:46.850 Engine[29033:276472] 2023-09-13 12:38:46.850058 [W:onnxruntime:, helper.cc:66 IsInputSupported] Dynamic shape is not supported for now, for input:/Reshape_4_output_0
2023-09-13 12:38:46.850 Engine[29033:276472] 2023-09-13 12:38:46.850076 [W:onnxruntime:, helper.cc:66 IsInputSupported] Dynamic shape is not supported for now, for input:/Unsqueeze_10_output_0
2023-09-13 12:38:46.850 Engine[29033:276472] 2023-09-13 12:38:46.850096 [W:onnxruntime:, helper.cc:66 IsInputSupported] Dynamic shape is not supported for now, for input:/Split
2023-09-13 12:38:46.850 Engine[29033:276472] 2023-09-13 12:38:46.850114 [W:onnxruntime:, helper.cc:66 IsInputSupported] Dynamic shape is not supported for now, for input:/Split_token_2
2023-09-13 12:38:46.850 Engine[29033:276472] 2023-09-13 12:38:46.850317 [W:onnxruntime:, coreml_execution_provider.cc:91 GetCapability] CoreMLExecutionProvider::GetCapability, number of partitions supported by CoreML: 14 number of nodes in the graph: 336 number of nodes supported by CoreML: 251
2023-09-13 12:38:48.099 Engine[29033:276472] 2023-09-13 12:38:48.099533 [W:onnxruntime:, session_state.cc:1169 VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2023-09-13 12:38:48.099 Engine[29033:276472] 2023-09-13 12:38:48.099576 [W:onnxruntime:, session_state.cc:1171 VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
libc++abi: terminating with uncaught exception of type Ort::Exception: /Users/runner/work/1/s/onnxruntime/core/session/provider_bridge_ort.cc:1080 void onnxruntime::ProviderSharedLibrary::Ensure() [ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_shared.dylib with error: dlopen(libonnxruntime_providers_shared.dylib, 0x000A): tried: 'libonnxruntime_providers_shared.dylib' (no such file), '/System/Volumes/Preboot/Cryptexes/OSlibonnxruntime_providers_shared.dylib' (no such file), '/libonnxruntime_providers_shared.dylib' (no such file), '/System/libonnxruntime_providers_shared.dylib' (no such file), '/usr/lib/libonnxruntime_providers_shared.dylib' (no such file, not in dyld cache), 'libonnxruntime_providers_shared.dylib' (no such file), '/usr/local/lib/libonnxruntime_providers_shared.dylib' (no such file), '/usr/lib/libonnxruntime_providers_shared.dylib' (no such file, not in dyld cache)

Interestingly, if I run on CPU only, I don't get this error...
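For context, a minimal sketch of how the CoreML EP is typically registered via the C API helper (this is not the reporter's actual code; the include paths and model filename are assumptions). Nodes CoreML can't handle fall back to the CPU EP automatically:

```cpp
// Sketch: create a session with the CoreML EP and implicit CPU fallback.
#include <onnxruntime_cxx_api.h>
#include <coreml_provider_factory.h>  // declares OrtSessionOptionsAppendExecutionProvider_CoreML

int main() {
  Ort::Env env{ORT_LOGGING_LEVEL_WARNING, "Engine"};
  Ort::SessionOptions opts;
  // 0 == COREML_FLAG_USE_NONE; unsupported nodes run on the CPU EP.
  Ort::ThrowOnError(OrtSessionOptionsAppendExecutionProvider_CoreML(opts, 0));
  Ort::Session session{env, "yolov5n.onnx", opts};
  return 0;
}
```

Registering CoreML this way should not pull in libonnxruntime_providers_shared.dylib, since the CoreML EP is statically linked.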

To reproduce

I'm running yolov5n.onnx

Urgency

ASAP

Platform

Mac

OS Version

Ventura 13.2.1

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.15.1

ONNX Runtime API

C++

Architecture

ARM64

Execution Provider

CoreML

Execution Provider Library Version

CoreML 1.0.0

@github-actions github-actions bot added the platform:mobile issues related to ONNX Runtime mobile; typically submitted using template label Sep 13, 2023
@pfeatherstone
Author

I've built onnxruntime from source on a Mac and it doesn't produce the library libonnxruntime_providers_shared.dylib. I think this is a bug in the code: it shouldn't be looking for such a library, since everything appears to be statically linked on macOS.

@snnn
Member

snnn commented Sep 13, 2023

onnxruntime_providers_coreml is a static library. It should not need to use libonnxruntime_providers_shared.dylib.

@snnn
Member

snnn commented Sep 13, 2023

@skottmckay could you please help?

@pfeatherstone
Author

It looks like there is some code buried somewhere that calls dlopen("libonnxruntime_providers_shared"), and that should not be executed on macOS or iOS.

@snnn
Member

snnn commented Sep 13, 2023

In onnxruntime_providers.cmake we have:

if (NOT onnxruntime_MINIMAL_BUILD AND NOT onnxruntime_EXTENDED_MINIMAL_BUILD
                                  AND NOT ${CMAKE_SYSTEM_NAME} MATCHES "Darwin|iOS"
                                  AND NOT CMAKE_SYSTEM_NAME STREQUAL "Android"
                                  AND NOT CMAKE_SYSTEM_NAME STREQUAL "Emscripten")

onnxruntime_add_shared_library(onnxruntime_providers_shared ...)

endif()

So the library would never be built on macOS. However, in setup.py we have things like:

    libs.extend(["libonnxruntime_providers_shared.dylib"])
    libs.extend(["libonnxruntime_providers_dnnl.dylib"])
    libs.extend(["libonnxruntime_providers_tensorrt.dylib"])
    libs.extend(["libonnxruntime_providers_cuda.dylib"])

They probably would not work. I know macOS doesn't have CUDA, and certainly not TensorRT. But what about dnnl? @jywu-msft, should we remove this code?

Or, should we make it work on macOS? @RyanUnderhill , is there a reason why building onnxruntime_providers_shared library is disabled on macOS?

@RyanUnderhill
Member

@snnn I think the change wasn't to disable it, but to only enable it on supported platforms. The logic was inverted in this change from 'do it on these platforms' to 'don't do it on these platforms': 1fa6d8f

Is there any use for it on macOS?

@pfeatherstone
Author

Is there a reason why it only looks for this library when using CoreML, but not when using the CPU only?

@pfeatherstone
Author

Is CoreML officially supported, or is it a beta feature?

@snnn
Member

snnn commented Sep 13, 2023

Since you built it from source, can you set a breakpoint at onnxruntime/core/session/provider_bridge_ort.cc:1080 and give us the call stack?
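For anyone reproducing this, the breakpoint can be set from lldb roughly like so (binary name taken from the log above; the exact prompts are illustrative):

```shell
lldb ./Engine
(lldb) breakpoint set --file provider_bridge_ort.cc --line 1080
(lldb) run
(lldb) bt   # print the call stack once the breakpoint is hit
```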

@pfeatherstone
Author

I'll have to rebuild in Debug, but sure, will do.

@snnn
Member

snnn commented Sep 13, 2023

Thank you!

@pfeatherstone
Author

* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
    frame #0: 0x0000000110ef6a20 libonnxruntime.1.15.1.dylib`onnxruntime::ProviderSharedLibrary::Ensure(this=0x0000000114035c68) at provider_bridge_ort.cc:1080:5
   1077
   1078     auto full_path = Env::Default().GetRuntimePath() +
   1079                      PathString(LIBRARY_PREFIX ORT_TSTR("onnxruntime_providers_shared") LIBRARY_EXTENSION);
-> 1080     ORT_THROW_IF_ERROR(Env::Default().LoadDynamicLibrary(full_path, true /*shared_globals on unix*/, &handle_));
   1081
   1082     void (*PProvider_SetHost)(void*);
   1083     ORT_THROW_IF_ERROR(Env::Default().GetSymbolFromLibrary(handle_, "Provider_SetHost", (void**)&PProvider_SetHost));
Target 0: (Engine) stopped.

@pfeatherstone
Author

Is this helpful?

@pfeatherstone
Author

OK, this is all my fault. My code was accidentally trying to use CUDA: it chooses an execution provider at runtime depending on some args, and I had some arguments the wrong way round, which sent it down the CUDA path. It all works now. Sorry for the confusion.


3 participants