Refcount Error in Graph Code? #166
Comments
As further evidence that this is a refcount bug, I've updated the example code to isolate the bug to the Buffer that is passed to the memcpy graph node as an argument. If you set …
@vzhurba01 could you take a look? My guess is that this is our implicit requirement for the Python bindings that Python objects such as …
Yeah, this would be one of our implicit requirements.
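The requirement above means the caller must keep the Python objects passed to graph nodes alive for the whole lifetime of the graph. A minimal sketch of that workaround in plain Python follows; all names here (`Buffer`, `GraphHolder`, `add_memcpy_arg`) are illustrative stand-ins, not part of the cuda-python API:

```python
import weakref


class Buffer:
    """Illustrative stand-in for a Python object that owns device memory."""

    def __init__(self, data: bytes):
        self.data = data


class GraphHolder:
    """Hypothetical wrapper that pins node arguments for the graph's lifetime."""

    def __init__(self):
        self._refs = []  # strong references outlive the building function

    def add_memcpy_arg(self, buf: Buffer) -> Buffer:
        self._refs.append(buf)  # keep buf alive as long as the graph lives
        return buf


def build(holder: GraphHolder):
    buf = Buffer(b"this is a test")
    holder.add_memcpy_arg(buf)
    return weakref.ref(buf)  # weakref lets us observe whether buf survives


holder = GraphHolder()
ref = build(holder)
# buf survived the function return because the holder pins a strong reference
assert ref() is not None
```

The design choice is simply to give the graph wrapper ownership of every argument's lifetime, so callers don't have to reason about which references escape their local scope.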
Thanks, Vlad! I've created #175 to track the need to document our requirements. @apowers313 does this answer your question?
btw, forgot to say: we are building a pythonic abstraction over the low-level bindings (#70), and CUDA graphs will be covered in a future beta release (#111). @apowers313 it'd be nice if you could also share your use cases with us so that we can better understand your needs, and @vzhurba01 we should probably take lifetime management into account in the API design 🙂
Thanks for the quick reply; hopefully the documentation helps future users avoid this footgun. I'm building a system that strings together multiple feature extractors that depend on each other's inputs and outputs, and it's much more efficient to build a graph of feature extractors than to do memory transfers for each of them. Along the way I realized that there's a simple architecture for automatically handling inputs, outputs, and kernel dependencies, so I built an FFI library that automatically builds graphs for kernel calls and hides a lot of the details from users: https://github.com/atoms-org/cuda-ffi (still a work in progress, but ~80% complete)
@apowers313 Thanks a lot for sharing; I wish we could have learned about your needs sooner to help you save some time 😅 (cc @aterrel for vis) I am pleased to share that an official solution for exactly your needs is being built; it's called …. As mentioned earlier, we'll cover CUDA graphs in a future release. If possible, please give it a try and let us know if you have any feedback or questions!
In the meantime, if you are OK with a field-tested, third-party solution (but with NVIDIA support) for executing your C++ kernels in Python, I would encourage you to try out CuPy's …
(Let me move this issue to the Discussion board.) |
This issue was moved to a discussion.
You can continue the conversation there. Go to discussion →
I'm running into a problem with creating a graph. A recreation of the problem is here and the quick summary is that the code does something like:
The first graph execution works. I can also instantiate and execute the graph multiple times before the function returns and it works fine and prints out the correct string.
The second graph execution has the exact same memory address as the first instantiation. The "this is a test" message prints fine, the pointer address prints fine, and then the final line is "passed argument was:" followed by garbage.

My best guess is that there is a refcount that gets decremented when the function returns, and since the graph isn't hanging on to a copy of the memory, it's freed or something? Is this a bug in the CUDA Python code, or is there something I'm missing?
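The suspected mechanism can be reproduced in plain CPython without CUDA at all: if only a raw pointer escapes the function while the owning Python object's last strong reference dies at return, the memory is freed immediately. This is a sketch of that hypothesis (the `Buffer` class and `ptr` attribute are illustrative, not the actual cuda-python code):

```python
import weakref


class Buffer:
    """Stand-in for the Python object that owns the device memory."""

    def __init__(self, data: bytes):
        self.data = data
        self.ptr = id(self.data)  # stand-in for a raw device pointer


def build_graph_args():
    buf = Buffer(b"this is a test")
    alive = weakref.ref(buf)  # lets us observe when buf is collected
    return buf.ptr, alive     # only the raw pointer escapes the scope


ptr, alive = build_graph_args()
# In CPython the refcount hits zero on return, so the buffer is already
# gone by the time we get here; anything dereferencing ptr would read
# freed memory, which matches the garbage output described above.
assert alive() is None
```

This matches the observed behavior: executions made while the building function is still on the stack see valid memory, and executions made after it returns see whatever the allocator has reused that address for.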