Extract variable allocation strategy to trait to allow backends to use more optimal strategies #139
};

let out_var = *out;
let output = context.create_local_binding(item);
If I do the following:
let mut out = x + y;
Do we call this function? Would it be problematic since the out value may be reused?
Creating a mutable local calls Init::init on the initializer, which assigns it to a new mutable local variable. So in wgpu this would be
let _10 = _8 + _9;
l_1_6 = _10;
or in SPIR-V
%10 = AddOp %8 %9
StoreOp %10 %out_var
See this code
pub fn init_expand<F>(context: &mut CubeContext, input: ExpandElement, func: F) -> ExpandElement
where
    F: Fn(UnaryOperator) -> Operator,
{
    if input.can_mut() {
        return input;
    }
    let input_var: Variable = *input;
    let item = input.item();
    let out = context.create_local_variable(item);
    let out_var = *out;
    let op = func(UnaryOperator {
        input: input_var,
        out: out_var,
    });
    context.register(op);
    out
}
can_mut will always return false for plain variables (i.e. local bindings).
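To make the behaviour described above concrete, here is a minimal, self-contained sketch of the same decision. It is a toy model, not CubeCL's API: `ToyVar`, `ToyContext`, `add`, and `init_mutable` are hypothetical names, mutability is reduced to "is this a local or a binding", and the emitted strings only mimic the wgpu-style output quoted earlier.

```rust
/// Toy stand-in for a variable: an immutable SSA-style binding or a mutable
/// local. Hypothetical type, not the real CubeCL `Variable`.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ToyVar {
    Binding(u32),
    Local(u32),
}

impl ToyVar {
    /// Simplified take on `can_mut`: bindings are never mutable.
    fn can_mut(self) -> bool {
        matches!(self, ToyVar::Local(_))
    }

    fn name(self) -> String {
        match self {
            ToyVar::Binding(id) => format!("_{id}"),
            ToyVar::Local(id) => format!("l_{id}"),
        }
    }
}

/// Toy context that hands out ids and records the emitted statements.
#[derive(Default)]
struct ToyContext {
    next_id: u32,
    ops: Vec<String>,
}

impl ToyContext {
    fn fresh_id(&mut self) -> u32 {
        let id = self.next_id;
        self.next_id += 1;
        id
    }

    fn create_local_binding(&mut self) -> ToyVar {
        ToyVar::Binding(self.fresh_id())
    }

    fn create_local_variable(&mut self) -> ToyVar {
        ToyVar::Local(self.fresh_id())
    }

    /// `x + y`: the result goes into a fresh immutable binding.
    fn add(&mut self, lhs: ToyVar, rhs: ToyVar) -> ToyVar {
        let out = self.create_local_binding();
        self.ops
            .push(format!("let {} = {} + {};", out.name(), lhs.name(), rhs.name()));
        out
    }

    /// `let mut out = <input>`: reuse the input if it is already mutable,
    /// otherwise copy it into a new mutable local.
    fn init_mutable(&mut self, input: ToyVar) -> ToyVar {
        if input.can_mut() {
            return input;
        }
        let out = self.create_local_variable();
        self.ops.push(format!("{} = {};", out.name(), input.name()));
        out
    }
}

/// Models `let mut out = x + y;` and returns the mutable local plus the ops.
fn demo() -> (ToyVar, Vec<String>) {
    let mut ctx = ToyContext::default();
    let x = ctx.create_local_binding();
    let y = ctx.create_local_binding();
    let sum = ctx.add(x, y);
    let out = ctx.init_mutable(sum);
    (out, ctx.ops)
}
```

In this sketch the binding produced by the addition is never handed out as the mutable `out`; the mutable local is always a separate variable, which is the property the review question was about.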
Perfect, that answers my question :)
Extracts the variable allocation/reuse strategy into a trait to allow backends to generate variables in a way that's beneficial to the output format.
- create_local_binding and create_local_variable for intermediates and mutable variables respectively.
- ReuseAllocator, which replicates the current behaviour, as well as HybridAllocator, which uses SSA for intermediates and the normal reuse allocator for variables.
- LocalBinding generation for wgpu only for now. CUDA is a future goal, which would also enable switching local allocators per kernel, but I want the burn test suite to work on CUDA before I can comfortably port wgpu to HybridAllocator, which performs as well as or better than ReusingAllocator and produces more optimized SPIR-V.
Testing
Both cubecl and burn fully pass their test suites. Behaviour of HybridAllocator is validated with runtime tests.
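As a rough illustration of the design this PR describes, the sketch below models the allocation strategy as a trait with two implementations. Everything in it is an assumption made for illustration: the trait `ToyAllocator`, its methods, the `ToyReuse` and `ToyHybrid` types, and the integer-id pool are hypothetical and far simpler than the real allocators. In particular, an explicit `release` call stands in for whatever mechanism the real code uses to decide that a variable can be reused, and the reuse strategy is assumed to treat intermediates as reusable locals.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Var {
    /// Immutable, single-assignment intermediate.
    Binding(u32),
    /// Mutable local that may be recycled.
    Local(u32),
}

/// Hypothetical strategy trait: each backend picks an implementation.
trait ToyAllocator {
    fn create_local_variable(&mut self) -> Var;
    fn create_local_binding(&mut self) -> Var;
    /// Marks a variable as no longer needed (stand-in for real liveness tracking).
    fn release(&mut self, var: Var);
}

/// Pool of local ids that hands back released ids before minting new ones.
#[derive(Default)]
struct ReusePool {
    next_id: u32,
    free: Vec<u32>,
}

impl ReusePool {
    fn take(&mut self) -> u32 {
        if let Some(id) = self.free.pop() {
            return id;
        }
        let id = self.next_id;
        self.next_id += 1;
        id
    }
}

/// Reuse strategy: intermediates and variables both come from the pool.
#[derive(Default)]
struct ToyReuse {
    pool: ReusePool,
}

impl ToyAllocator for ToyReuse {
    fn create_local_variable(&mut self) -> Var {
        Var::Local(self.pool.take())
    }

    fn create_local_binding(&mut self) -> Var {
        // Assumption: with pure reuse, intermediates are ordinary locals.
        Var::Local(self.pool.take())
    }

    fn release(&mut self, var: Var) {
        if let Var::Local(id) = var {
            self.pool.free.push(id);
        }
    }
}

/// Hybrid strategy: SSA-style bindings for intermediates, pooled locals otherwise.
#[derive(Default)]
struct ToyHybrid {
    pool: ReusePool,
    next_binding: u32,
}

impl ToyAllocator for ToyHybrid {
    fn create_local_variable(&mut self) -> Var {
        Var::Local(self.pool.take())
    }

    fn create_local_binding(&mut self) -> Var {
        let id = self.next_binding;
        self.next_binding += 1;
        Var::Binding(id)
    }

    fn release(&mut self, var: Var) {
        // Bindings are never recycled; only locals go back to the pool.
        if let Var::Local(id) = var {
            self.pool.free.push(id);
        }
    }
}

/// Allocates an intermediate, releases it, then allocates another one.
fn two_intermediates<A: ToyAllocator>(alloc: &mut A) -> (Var, Var) {
    let first = alloc.create_local_binding();
    alloc.release(first);
    let second = alloc.create_local_binding();
    (first, second)
}
```

With `ToyReuse`, the second intermediate recycles the id of the first; with `ToyHybrid`, each intermediate gets a fresh binding, which is the shape an SSA-based output format such as SPIR-V prefers.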