Deserialize bytes to array on the GPU directly #436
Comments
For this use case, it might be worth looking at KvikIO's There are some examples at the top of this PR: #135. Admittedly, this doesn't answer the question of why things are not working in your case, but maybe it provides a path forward.
Thanks for the pointer. My objective is to understand the differences and compare the performance of I believe the However, when profiling the code using both Profiler output for
The Profiler output for
Looks like most of the time is spent inside the To summarize, my questions are:
Could you please share the code used in the second case? It is hard to comment on what is happening there without knowing what was done.
When reading a binary file, If GDS isn't available, Now, if GDS is available and the data is larger than However, even when GDS isn't available, |
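To make the "reading a binary file" case concrete, here is a minimal host-only sketch using just the Python standard library. It is an illustration under stated assumptions, not KvikIO's implementation: the helper names `write_raw`, `read_raw`, and `roundtrip` are hypothetical, and the file is assumed to hold raw float32 values with no header or pickle framing. For such a file, turning bytes into an array is only a reinterpretation of the buffer, so there is no separate decoding step for a profiler to show.

```python
import array
import os
import tempfile


def write_raw(path, values):
    """Write float32 values as raw bytes (no header, no pickle framing)."""
    with open(path, "wb") as f:
        array.array("f", values).tofile(f)


def read_raw(path):
    """Read raw bytes into a preallocated buffer and view them as float32."""
    nbytes = os.path.getsize(path)
    buf = bytearray(nbytes)  # preallocated destination (host memory in this sketch)
    with open(path, "rb") as f:
        n = f.readinto(buf)  # bytes are copied straight into the buffer
    if n != nbytes:
        raise IOError(f"short read: expected {nbytes} bytes, got {n}")
    # Reinterpret the bytes as float32; no per-element decoding happens here.
    # With KvikIO, the analogous step would be kvikio.CuFile(path, "r").read(gpu_buf),
    # where gpu_buf is a preallocated device array (e.g. CuPy); not run in this sketch.
    return memoryview(buf).cast("f")


def roundtrip(values):
    """Write values to a temporary file and read them back as a list."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "data.bin")
        write_raw(path, values)
        return list(read_raw(path))
```

If the file was instead written with a format that carries metadata or pickled objects, parsing that framing is typically host-side work, which may be worth checking against the profile.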
Thanks Mads! 🙏 Do we document somewhere how to check whether KvikIO is able to use GDS? I think this might be a useful diagnostic test for Akshay (and future users) to run through to confirm they have a working configuration.
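A rough sketch of such a diagnostic is below. It assumes the installed KvikIO version exposes `kvikio.defaults.compat_mode()`; the return type has varied between releases (a boolean in older ones), so the code only interprets the boolean case and otherwise reports the raw value. The function name `describe_kvikio_io_path` is hypothetical, and the check says nothing about whether a particular file system supports GDS.

```python
def describe_kvikio_io_path():
    """Return a short, human-readable note on KvikIO's compatibility mode."""
    try:
        import kvikio.defaults as kvikio_defaults
    except Exception as exc:  # not installed, or CUDA libraries missing
        return f"kvikio could not be imported: {exc!r}"

    # Assumption: this attribute exists in the installed version.
    compat_mode = getattr(kvikio_defaults, "compat_mode", None)
    if compat_mode is None:
        return "this kvikio version does not expose defaults.compat_mode()"

    try:
        mode = compat_mode()
    except Exception as exc:
        return f"defaults.compat_mode() failed: {exc!r}"

    if mode is True:
        return "compatibility mode is on: reads go through POSIX I/O, not GDS"
    if mode is False:
        return "compatibility mode is off: KvikIO will try to use GDS (cuFile)"
    # Newer releases may return a non-boolean value; report it without guessing.
    return f"defaults.compat_mode() returned {mode!r}"


if __name__ == "__main__":
    print(describe_kvikio_io_path())
```

Running this in the same environment as the profiled script is the point of the exercise, since the result can depend on environment variables and on which libraries are visible at import time.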
I am using `kvikio` to read an array stored in a file on disk directly onto the GPU. On the GPU, I want to deserialize the file content into an array. My understanding is that the deserialization should happen on the GPU itself, i.e., no host CPU is involved.
However, when I profile the above code using `nsys`, I don't see any activity on the GPU corresponding to the deserialization. Also, when looking at the CPU utilization of my code, it seems that the CPU is doing the work of deserialization.
Why is this the case?