feat(nvidia): Add cupti support #323
This file needs comments. I have no idea what is happening here.
Please document what the functions `bufferRequested`, `bufferCompleted`, and `callbackHandler` do and when CUPTI calls them.
After discussing the ring buffer a bit with @tilsche, we think it would be better not to let individual events be split across the wrap-around. That removes the need for all the mallocs and the tracking thereof, and simplifies the code at the same time. Also, `fill` shouldn't be necessary.
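To illustrate the suggestion, here is a minimal sketch of a ring buffer that never splits an event across the wrap-around. It is not the code from this PR: the class name `RingBuffer`, the `push`/`pop` interface, and the 4-byte size header are assumptions made for illustration, and the sketch is single-threaded. When an event does not fit into the space left at the end of the storage, the writer records where the valid data ends and continues at offset 0, so the reader can always hand out a contiguous view without allocating.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <string_view>
#include <vector>

// Hypothetical sketch, not the lo2s implementation.
class RingBuffer
{
public:
    explicit RingBuffer(std::size_t capacity) : data_(capacity) {}

    // Stores one event contiguously. Returns false if it does not fit.
    bool push(std::string_view event)
    {
        const std::uint32_t size = static_cast<std::uint32_t>(event.size());
        const std::size_t need = sizeof(size) + event.size();

        if (!wrapped_ && data_.size() - head_ < need)
        {
            // Not enough room at the end: skip it and restart at offset 0,
            // provided the reader has already freed enough space there.
            if (tail_ < need)
                return false;
            end_ = head_; // valid data in the upper part stops here
            head_ = 0;
            wrapped_ = true;
        }
        else if (wrapped_ && tail_ - head_ < need)
        {
            return false; // free space is [head_, tail_) and it is too small
        }

        std::memcpy(data_.data() + head_, &size, sizeof(size));
        std::memcpy(data_.data() + head_ + sizeof(size), event.data(), event.size());
        head_ += need;
        return true;
    }

    // Returns a view into the buffer (no copy, no malloc). The view is only
    // valid until a later push reuses that space.
    std::optional<std::string_view> pop()
    {
        if (wrapped_ && tail_ == end_)
        {
            // Upper part is drained: follow the writer to offset 0.
            tail_ = 0;
            wrapped_ = false;
        }
        if (!wrapped_ && tail_ == head_)
            return std::nullopt; // empty

        std::uint32_t size;
        std::memcpy(&size, data_.data() + tail_, sizeof(size));
        std::string_view event(data_.data() + tail_ + sizeof(size), size);
        tail_ += sizeof(size) + size;

        if (!wrapped_ && tail_ == head_)
            head_ = tail_ = 0; // empty again: reuse the whole storage
        return event;
    }

private:
    std::vector<char> data_;
    std::size_t head_ = 0; // next write offset
    std::size_t tail_ = 0; // next read offset
    std::size_t end_ = 0;  // end of valid data in the upper part while wrapped_
    bool wrapped_ = false; // true while the writer is "behind" the reader
};
```

The trade-off is that a few bytes at the end of the storage may stay unused on each wrap, in exchange for dropping the reassembly buffers and their bookkeeping.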
0dc5946 to 826026d
I have these comments still pending. Don't know if they still apply.
f5b30d8 to 3701440
I blame vscode
08da629 to 1bddf27
732d27c to c950a74
7884ac2 to 4485359
4485359 to b50f989
2d7e02c to e9c2193
e9c2193 to 9ae4917
This commit adds a `--nvidia` option that injects a library into the program under measurement. The library records entry into and exit from CUDA kernels via CUPTI.
We might consider bumping the CMake requirement to 3.24 with this version, as older versions of FindCUDAToolkit.cmake fail to detect the CUPTI headers correctly [1].
This implements #294
[1] https://gitlab.kitware.com/cmake/cmake/-/merge_requests/7608