[AMD] remove driver C compilation steps #5228
base: main
if launch_enter_hook or launch_exit_hook:
    raise NotImplementedError
Presumably these are actually used somehow? But I couldn't tell from the currently emitted trampoline. Happy to add support for this as well.
These hooks are used by proton to record profile scope information.
ce1b20e to b65d8bf
b65d8bf to a607c85
Have you checked if this impacts launch latency? IIRC ctypes is more for convenience than performance.
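One rough way to get a feel for the per-call cost in question is a micro-benchmark of a trivial foreign call. The sketch below is only a stand-in: it times libc's `abs` through ctypes against Python's builtin `abs`, and the helper name `per_call_seconds` is hypothetical. It does not exercise the actual kernel launch path, so the numbers indicate ctypes call overhead only, not launch latency.

```python
import ctypes
import ctypes.util
import timeit

# Load the C standard library as a stand-in for a C runtime API.
# On POSIX, CDLL(None) also works if find_library returns None.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declaring the signature avoids per-call guessing of argument types.
libc.abs.argtypes = [ctypes.c_int]
libc.abs.restype = ctypes.c_int


def per_call_seconds(fn, n=100_000):
    """Average wall-clock time of one call to fn over n calls."""
    return timeit.timeit(fn, number=n) / n


# Cost of crossing into C via ctypes versus staying in the interpreter.
ctypes_cost = per_call_seconds(lambda: libc.abs(-1))
python_cost = per_call_seconds(lambda: abs(-1))

print(f"ctypes abs:  {ctypes_cost * 1e9:.0f} ns/call")
print(f"builtin abs: {python_cost * 1e9:.0f} ns/call")
```

A meaningful answer for this PR would still need timing of real launches before and after the change, since argument packing and hook handling are not covered here.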
Python can directly call C APIs via ctypes. Since the HIP APIs we use are indeed C APIs, the AMD driver does not need to compile anything (neither the kernel launch trampoline nor the utils).
The added file hip.py is generated from HIP headers using clang2py (and then trimmed down to include only our used APIs). Note, I've tested this on our mi3002x.
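For readers unfamiliar with the approach, here is a minimal sketch of calling a C API from Python via ctypes with no compilation step. It is not the code in this PR. The helper names `c_strlen` and `hip_device_count` are hypothetical, libc's `strlen` is used so the example runs anywhere, and the HIP part assumes the runtime is installed as `libamdhip64.so` and simply returns `None` when it cannot be loaded.

```python
import ctypes
import ctypes.util

# Load the C standard library; no compiler or extension module is involved.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Describe the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(s: bytes) -> int:
    """Call the C strlen function directly from Python."""
    return libc.strlen(s)


def hip_device_count(lib_name="libamdhip64.so"):
    """Query hipGetDeviceCount through ctypes.

    Returns None if the library cannot be loaded or the call fails.
    The library name is an assumption about the local ROCm install.
    """
    try:
        hip = ctypes.CDLL(lib_name)
    except OSError:
        return None
    # C signature: hipError_t hipGetDeviceCount(int *count);
    hip.hipGetDeviceCount.argtypes = [ctypes.POINTER(ctypes.c_int)]
    hip.hipGetDeviceCount.restype = ctypes.c_int
    count = ctypes.c_int(0)
    status = hip.hipGetDeviceCount(ctypes.byref(count))
    # Status 0 is hipSuccess.
    return count.value if status == 0 else None


print(c_strlen(b"triton"))
print(hip_device_count())
```

Generated bindings such as those produced by clang2py mostly automate the `argtypes` and `restype` declarations shown by hand above.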