Onnxruntime Memory [Web] #18165
Comments
Hello, there is an InferenceSession release method that you can call once you've completed your predictions.
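For reference, a minimal sketch of that pattern, assuming onnxruntime-web; the model path and the feed name `input` are placeholders, not from this thread:

```typescript
import * as ort from 'onnxruntime-web';

async function predictOnce(): Promise<void> {
  // 'model.onnx' and the feed name 'input' are placeholders.
  const session = await ort.InferenceSession.create('model.onnx');
  const feeds = {
    input: new ort.Tensor('float32', new Float32Array([1, 2, 3, 4]), [1, 4]),
  };
  const results = await session.run(feeds);
  console.log(results);

  // Release the native resources held by the session once predictions
  // are complete.
  await session.release();
}
```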
Hi @carzh, also the entire 300 MB is occupied during session creation in wasm. I ran a memory profile for this process, and my overall memory profile size is only around 20 MB, which is the same as my ONNX model size (20 MB). However, the Chrome task manager shows this tab occupying >300 MB of memory. So I debugged the code and found that the memory shot up at `_OrtCreateSession`.
That's strange -- does the issue persist when using ORT 1.16.1?
If this issue happens inside `_OrtCreateSession`, it indicates that the memory growth happens inside ONNX Runtime's model initialization step. ONNX Runtime does a lot of work during initialization: loading the model graph, applying graph optimizers and transformers, allocating tensors, prepacking weights, and initializing kernels. There might be a few possible ways to reduce the memory consumption; see the sketch below.
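A hedged sketch along those lines, using options from the onnxruntime-web SessionOptions API; whether each helps is model-dependent, and `model.onnx` is a placeholder path:

```typescript
import * as ort from 'onnxruntime-web';

// SessionOptions that trade initialization work for lower memory use.
// (Assumes an ES module with top-level await.)
const session = await ort.InferenceSession.create('model.onnx', {
  graphOptimizationLevel: 'disabled', // skip graph optimizers/transformers
  enableMemPattern: false,            // disable memory-pattern planning
  enableCpuMemArena: false,           // disable the CPU memory arena
});
```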
Hello @fs-eire, I appreciate your valuable suggestions, but unfortunately none of the steps I tried reduced the memory usage. Additionally, I encountered an issue while making predictions without using ZipMap. I'm seeking further guidance to address this problem.
Could you share your 2 KB model (the one that consumes 300 MB of memory at runtime)?
This issue has been automatically marked as stale due to inactivity and will be closed in 30 days if no further activity occurs. If further support is needed, please provide an update and/or more details.
Hi @fs-eire, but the memory is not released after the prediction in any model. I have also tried removing the model instance in JS. Is there a way to free up this memory once my prediction process is complete?
WebAssembly memory cannot shrink. This means the wasm memory will keep the size of the peak memory usage, even if the code marks some part as "free". Regarding removing "ZipMap": I am not an expert on ONNX exporters, but I assume there are options that control the model export so that it contains no ZipMap operator and no non-tensor inputs/outputs. You can check whether the exporter you use offers such options.
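Given that wasm memory only grows, one practical pattern is to keep a single long-lived session rather than creating and releasing one per prediction, so the peak stays at one initialization's worth. A sketch, with `model.onnx` and the feed name `input` as placeholders:

```typescript
import * as ort from 'onnxruntime-web';

// Because wasm memory never shrinks, reuse one long-lived session
// instead of creating a new one per prediction; the peak footprint
// then stays at a single initialization.
let session: ort.InferenceSession | undefined;

async function getSession(): Promise<ort.InferenceSession> {
  if (!session) {
    session = await ort.InferenceSession.create('model.onnx');
  }
  return session;
}

async function predict(data: Float32Array, dims: number[]) {
  const s = await getSession();
  return s.run({ input: new ort.Tensor('float32', data, dims) });
}
```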
Describe the issue
While my ONNX model functions excellently in ONNX Runtime Web, I've encountered an issue where creating an InferenceSession results in a substantial memory usage increase of approximately 300 MB, and this memory is not released. I'm monitoring this memory usage through the Chrome task manager.
This behavior persists even with a simple ONNX model that's just 2 KB in size. I'm curious about the reason behind this memory increase and am looking for a way to clear this memory or release the session object once I've completed my predictions.
To reproduce
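A minimal sketch of the scenario described above (the model path is a placeholder; compare the tab's footprint in the Chrome task manager before and after session creation):

```typescript
import * as ort from 'onnxruntime-web';

// Placeholder model path; per the report, even a ~2 KB model shows
// the jump. Check the tab in Chrome's task manager before and after
// this call. (Assumes an ES module with top-level await.)
const session = await ort.InferenceSession.create('./tiny-model.onnx', {
  executionProviders: ['wasm'],
});
```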
Urgency
High Priority. Need to fix this issue!
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
1.15.1
Execution Provider
'wasm'/'cpu' (WebAssembly CPU)