[Performance] Increased memory usage when loading from bytes #21165
Comments
Version 1.15.1 is rather old. Is this still an issue with the latest release?
That is a pretty sizable regression in terms of memory usage in any case! Was there a particular version between 1.15.1 and 1.18.0 that caused the even worse memory usage?
Well, we jumped directly from the 1.15.1 we were using to 1.18.0 for this test, but I just did a quick check and I can already see this increased memory usage with 1.16.1.
We have tested it with version 1.18.1, but it shows the same memory profile.
This issue has been automatically marked as stale due to inactivity and will be closed in 30 days if no further activity occurs. If further support is needed, please provide an update and/or more details.
Thanks for all the additional information @ignogueiras! I'm afraid I don't have a good guess what the origin of your problem might be. But maybe you can try it again with the latest release from today?
Hello again @cbourjau. I did some more tests today and I am starting to doubt my previous results, as I am unable to reproduce them now. I keep seeing the same memory profile when loading from bytes and from a filepath. I am using a different machine right now, so could it be related to the different hardware? I'll keep doing more tests; maybe I am forgetting some steps from my old runs. What I can still see is a regression in the latest versions with respect to v1.15.1. As you can see, the profiles have an almost identical shape: first there is a resource load, and then some memory is released, approximately the size of the model. But in the newer versions, before this release, more memory is allocated, again about the size of the model, negating the subsequent memory release.
Describe the issue
Until now we were creating our Ort::Session object by passing it the path of our model (.onnx file).
Now we are trying to create the Session object from bytes already read into a std::vector. Although everything seems to work correctly, we have detected higher memory consumption, approximately the size of the model.
We are reasonably sure that the vector is being released correctly, so we have the impression that creating the Session is making a copy that is not being released.
Is this expected? Or are we doing something incorrectly?
To reproduce
We observe a much bigger memory usage when doing this:
rather than this:
session = std::make_shared<Ort::Session>(env, "/path/to/model/file.onnx", session_options);
Urgency
Not really urgent; we are just curious about this case, as we want to load the models from memory eventually.
Platform
Linux
OS Version
Ubuntu 22.04
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.15.1
ONNX Runtime API
C++
Architecture
X64
Execution Provider
Default CPU
Execution Provider Library Version
No response
Model File
No response
Is this a quantized model?
Yes