Have runtime args update directly into device cmd for FD #8504
Right now, with my changes, updating unique rt args directly in the device cmd is straightforward and requires basically no API change for the user. Common rt args are a single vector on the user/kernel side, but that vector maps to multiple device cmd locations because we mcast per core range in the set, so the user can't just update one location in place. I don't think this is very user-friendly: we want to disable SetRuntimeArgs on a cache hit for unique rt args so that users aren't doing unnecessary copies, which means that for unique args users only call Get and update in place, but for common args they need to do Get + Set.
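To make the asymmetry concrete, here is a minimal toy sketch of the two update paths. It is not the tt-metal implementation: `ToyDeviceCmd`, `ToyKernel`, `make_cmd`, `get_unique_args`, `get_common_args`, and `set_common_args` are all hypothetical names, and the flat buffer layout (unique args first, then one copy of the common args per core range) is an assumption chosen only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy model of a cached device command (NOT the real tt-metal layout).
// Unique args live at exactly one offset; common args are duplicated once per
// multicast core range.
struct ToyDeviceCmd {
    std::vector<uint32_t> words;              // flat command buffer
    std::size_t unique_offset = 0;            // single location for unique args
    std::vector<std::size_t> common_offsets;  // one location per core range
    std::size_t num_common = 0;               // number of common args
};

// Hypothetical host-side kernel state: one common-args vector per kernel.
struct ToyKernel {
    std::vector<uint32_t> common_args;
};

// Build a command with the assumed layout: [unique | common copy 0 | common copy 1 | ...].
ToyDeviceCmd make_cmd(std::size_t num_unique, std::size_t num_common, std::size_t num_core_ranges) {
    ToyDeviceCmd cmd;
    cmd.num_common = num_common;
    cmd.words.assign(num_unique + num_core_ranges * num_common, 0);
    for (std::size_t i = 0; i < num_core_ranges; ++i) {
        cmd.common_offsets.push_back(num_unique + i * num_common);
    }
    return cmd;
}

// Unique args: Get hands back a pointer into the command itself, so writing
// through it updates the command in place and no Set call is needed.
uint32_t* get_unique_args(ToyDeviceCmd& cmd) {
    return cmd.words.data() + cmd.unique_offset;
}

// Common args: Get returns the single host-side vector, which is not
// any of the command's copies.
std::vector<uint32_t>& get_common_args(ToyKernel& kernel) {
    return kernel.common_args;
}

// Common args: Set must loop and copy the vector into every per-core-range location.
void set_common_args(ToyDeviceCmd& cmd, const std::vector<uint32_t>& args) {
    assert(args.size() == cmd.num_common);
    for (std::size_t offset : cmd.common_offsets) {
        std::copy(args.begin(), args.end(),
                  cmd.words.begin() + static_cast<std::ptrdiff_t>(offset));
    }
}
```

In this sketch, a write through `get_unique_args` is visible in the command immediately, while a write to the vector returned by `get_common_args` reaches the command only after `set_common_args` has copied it to each core-range location. If the command kept a single copy of the common args, the Get path could return a pointer into it and the Set step would become unnecessary.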
* | kernel_id | ID of the kernel that will receive the runtime args | KernelHandle (uint64_t) | | Yes |
* | runtime_args | The runtime args to be written | const RuntimeArgsData & | | Yes |
*/
void SetCommonRuntimeArgs(const Program &program, KernelHandle kernel_id, const RuntimeArgsData &runtime_args);
I think you need to add this to docs/source/tt-metalium/tt_metal/apis/host_apis/runtime_args/runtime_args.rst
…r fast dispatch. TODO: Fix/uplift common runtime args update to remove looping in assemble_runtime_args
… device command. Currently, because common args occupy multiple locations in the device cmd (one copy is added per mcast/core group), users must call Get, then Set, for common args to update all locations. TODO: have the device cmd retain only one copy of the data so that the mapping is one-to-one and the Set requirement can be removed.