[Mobile] Segmentation fault after repeated inference #21082
Comments
Hard to say without a stack trace with symbol names. ORT will do most allocations during model initialization and the first inference. After that it uses a cache for memory, so segfaults would typically be an out-of-memory scenario or bad input (e.g. an input tensor is freed while ORT is still using it). If you're building from source, can you build a debug version? You may need to ensure the Android build doesn't strip the binary of symbols, as it typically does. Does the issue happen if you run on the Android emulator? It would be easier to debug if it did. Another option would be to copy onnxruntime_perf_test to the phone using adb (use /data/local/tmp), along with the model, and run it. You can specify the number of iterations or the amount of time to run for, and it can generate dummy input data.
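For reference, a sketch of the adb steps described above. The paths are the suggested /data/local/tmp; the flag names (`-I` for generated dummy inputs, `-r` for repetition count) are from memory, so verify them with `onnxruntime_perf_test -h` on your build:

```shell
# Copy the perf test binary and model to a writable, executable location on the device
adb push onnxruntime_perf_test /data/local/tmp/
adb push model.onnx /data/local/tmp/
adb shell chmod +x /data/local/tmp/onnxruntime_perf_test

# Run many iterations with generated dummy input data to try to trigger the crash
# (-I generates dummy inputs, -r sets the iteration count; check -h to confirm)
adb shell /data/local/tmp/onnxruntime_perf_test -I -r 1000 /data/local/tmp/model.onnx
```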
Hi @skottmckay, thanks for your response. I have created an MRE in the form of a demo app that has the bug. Please check out this repo. The bug is reproducible on the Android emulator: it crashes anywhere in the range of 100-1000 inference runs, which should only take a few minutes to reach. Does this help in debugging? I would also like to provide a stack trace of the crash, but I don't know how to get that on the native layer. Any pointers you can give me for that? In any case, I appreciate the help :)
This issue looks the same as: , which I solved by including the generated header files.
@laurenspriem is it reproducible by running onnxruntime_perf_test in a shell on the emulator? If so, that would rule out the issue being in the Flutter plugin you're using (which we don't own). It may be possible to get symbols using ndk-stack: https://developer.android.com/ndk/guides/ndk-stack.html
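A sketch of symbolizing the crash with ndk-stack, assuming you still have the unstripped libonnxruntime.so from your build (the symbol directory path below is the ndk-build convention and just an illustration; point `-sym` at wherever your unstripped .so files live):

```shell
# Capture the tombstone/backtrace that Android prints to logcat on the SIGSEGV
adb logcat > crash.txt

# Translate the raw frame addresses into symbol names using the unstripped libraries
ndk-stack -sym ./obj/local/arm64-v8a -dump crash.txt
```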
I am trying to run onnxruntime_perf_test in the emulator as you suggested. However, it stops and gives me the following output:

/onnxruntime/onnxruntime/test/onnx/TestCase.cc:705 OnnxTestCase::OnnxTestCase(const std::string &, std::unique_ptr<TestModelInfo>, double, double) test case dir doesn't exist

Any clue what is going wrong?
Are you running with the option that generates dummy input data (mentioned above)? Otherwise you need to create a test case directory with input data in serialized protobuf files, which is the same input format that onnx_test_runner requires.
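For reference, the test case directory layout that onnx_test_runner expects looks roughly like this (directory and file names below are illustrative; the `test_data_set_N` convention with serialized TensorProto `.pb` files is the standard ONNX test data format):

```
my_test_case/
├── model.onnx
└── test_data_set_0/
    ├── input_0.pb       # serialized onnx TensorProto for the first input
    └── output_0.pb      # expected output, also a serialized TensorProto
```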
Thanks for the help! In the end the issue indeed seemed to be in the package we were using and not in ONNX Runtime. We have since switched to using ONNX Runtime for mobile directly, through Flutter Platform Channels, which has resolved the issue.
Describe the issue
I am getting a segmentation fault (SIGSEGV) after repeated inference runs on mobile, which crashes the app. The issue only comes up after more than 300 inference runs, but beyond that point it comes up consistently. For context, I am using ORT in a Flutter app through FFI.
Error logs
To reproduce
The issue is reproducible by letting the app continuously run inference. Since this is happening in the app, it's a bit hard to give a clear and easy MRE. If nothing comes up from the error logs alone I'll try to create a dummy app that reproduces the issue and share the code for it here.
Urgency
I don't know how urgent this issue is to ORT, but for our app it's quite urgent.
Platform
Android
OS Version
Android 14 (and other versions)
ONNX Runtime Installation
Built from Source
Compiler Version (if 'Built from Source')
No response
Package Name (if 'Released Package')
None
ONNX Runtime Version or Commit ID
v1.15.0
ONNX Runtime API
C++/C
Architecture
ARM64
Execution Provider
Default CPU
Execution Provider Library Version
No response