Extract variable allocation strategy to trait to allow backends to use more optimal strategies #139

Merged: 17 commits, Sep 23, 2024

Conversation

wingertge
Contributor

Extracts the variable allocation/reuse strategy into a trait to allow backends to generate variables in a way that's beneficial to the output format.

  • Splits local creation into create_local_binding for intermediates and create_local_variable for mutable variables.
  • Implements ReuseAllocator, which replicates the current behaviour, and HybridAllocator, which uses SSA for intermediates and the normal reuse allocator for variables.
  • Implements LocalBinding generation for wgpu only for now. CUDA is a future goal, which would also enable switching local allocators per kernel, but I want the burn test suite to work on CUDA before I can comfortably port it.
  • Switches the default local allocator for wgpu to HybridAllocator, which performs as well as or better than ReuseAllocator and produces more optimized SPIR-V.
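To make the split above concrete, here is a minimal sketch of how an allocation strategy could sit behind a trait. It is not the cubecl implementation: `Var`, the `&'static str` type tags, the `release` hook, and the trait name `LocalAllocator` are all assumptions made for illustration. Only `create_local_binding`, `create_local_variable`, `ReuseAllocator`, and `HybridAllocator` are names taken from this PR.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for an IR variable (not the cubecl type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Var {
    /// Immutable SSA-style intermediate, assigned exactly once.
    Binding { id: u32 },
    /// Mutable local that may be written to repeatedly.
    Local { id: u32 },
}

/// Strategy trait (name assumed): each backend picks how locals are handed out.
pub trait LocalAllocator {
    fn create_local_binding(&mut self, ty: &'static str) -> Var;
    fn create_local_variable(&mut self, ty: &'static str) -> Var;
    /// Hypothetical hook signalling that a variable is no longer in use.
    fn release(&mut self, ty: &'static str, var: Var);
}

/// Hands out mutable locals for everything and recycles released ones per type.
#[derive(Default)]
pub struct ReuseAllocator {
    next_id: u32,
    free: HashMap<&'static str, Vec<u32>>,
}

impl ReuseAllocator {
    fn alloc(&mut self, ty: &'static str) -> Var {
        // Prefer a previously released local of the same type.
        if let Some(id) = self.free.get_mut(ty).and_then(|pool| pool.pop()) {
            return Var::Local { id };
        }
        let id = self.next_id;
        self.next_id += 1;
        Var::Local { id }
    }
}

impl LocalAllocator for ReuseAllocator {
    fn create_local_binding(&mut self, ty: &'static str) -> Var {
        self.alloc(ty)
    }
    fn create_local_variable(&mut self, ty: &'static str) -> Var {
        self.alloc(ty)
    }
    fn release(&mut self, ty: &'static str, var: Var) {
        if let Var::Local { id } = var {
            self.free.entry(ty).or_default().push(id);
        }
    }
}

/// Fresh SSA bindings for intermediates, reuse only for mutable variables.
#[derive(Default)]
pub struct HybridAllocator {
    next_binding: u32,
    variables: ReuseAllocator,
}

impl LocalAllocator for HybridAllocator {
    fn create_local_binding(&mut self, _ty: &'static str) -> Var {
        // Bindings are never recycled, so every intermediate gets a new id.
        let id = self.next_binding;
        self.next_binding += 1;
        Var::Binding { id }
    }
    fn create_local_variable(&mut self, ty: &'static str) -> Var {
        self.variables.create_local_variable(ty)
    }
    fn release(&mut self, ty: &'static str, var: Var) {
        // Released bindings are ignored by the inner reuse allocator.
        self.variables.release(ty, var);
    }
}
```

In this sketch a released intermediate comes back under the same id with the reuse strategy, while the hybrid strategy always yields a new binding id and recycles only mutable locals.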

Testing

Both cubecl and burn fully pass their test suites. Behaviour of HybridAllocator is validated with runtime tests.

};

let out_var = *out;
let output = context.create_local_binding(item);
Member

If I do the following:

let mut out = x + y;

Do we call this function? Would it be problematic since the out value may be reused?

Contributor Author

Creating a mutable local calls Init::init on the initializer, which assigns it to a new mutable local variable. So in wgpu this would be

let _10 = _8 + _9;
l_1_6 = _10;

or in SPIR-V

%10 = AddOp %8 %9
StoreOp %10 %out_var

See this code

pub fn init_expand<F>(context: &mut CubeContext, input: ExpandElement, func: F) -> ExpandElement
where
    F: Fn(UnaryOperator) -> Operator,
{
    if input.can_mut() {
        return input;
    }

    let input_var: Variable = *input;
    let item = input.item();

    let out = context.create_local_variable(item);
    let out_var = *out;

    let op = func(UnaryOperator {
        input: input_var,
        out: out_var,
    });

    context.register(op);

    out
}

can_mut will always return false for plain variables (i.e. local bindings).
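As a rough model of the flow described above, the sketch below shows why `let mut out = x + y;` never mutates the intermediate itself: the sum lands in a binding, and initialization copies it into a fresh mutable local. `Var`, `Context`, `add`, `init`, and the printed naming scheme are hypothetical simplifications, and `can_mut` is reduced to "is this a mutable local", whereas the real check may involve more conditions.

```rust
use std::fmt;

/// Hypothetical, simplified variable kinds (not the cubecl types).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Var {
    Binding(u32),
    Local(u32),
}

impl Var {
    /// Simplification: only mutable locals can be mutated in place.
    pub fn can_mut(self) -> bool {
        matches!(self, Var::Local(_))
    }
}

impl fmt::Display for Var {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Var::Binding(id) => write!(f, "_{}", id),
            Var::Local(id) => write!(f, "l_{}", id),
        }
    }
}

/// Hypothetical context that records emitted statements as strings.
#[derive(Default)]
pub struct Context {
    next_binding: u32,
    next_local: u32,
    pub ops: Vec<String>,
}

impl Context {
    pub fn create_local_binding(&mut self) -> Var {
        let id = self.next_binding;
        self.next_binding += 1;
        Var::Binding(id)
    }

    pub fn create_local_variable(&mut self) -> Var {
        let id = self.next_local;
        self.next_local += 1;
        Var::Local(id)
    }

    /// A binary op writes its result to a fresh immutable binding.
    pub fn add(&mut self, lhs: Var, rhs: Var) -> Var {
        let out = self.create_local_binding();
        self.ops.push(format!("let {} = {} + {};", out, lhs, rhs));
        out
    }

    /// Mirrors the shape of init_expand: reuse the input if it is already
    /// mutable, otherwise copy it into a new mutable local.
    pub fn init(&mut self, input: Var) -> Var {
        if input.can_mut() {
            return input;
        }
        let out = self.create_local_variable();
        self.ops.push(format!("{} = {};", out, input));
        out
    }
}
```

Calling `init` on the result of `add` emits a separate assignment into a mutable local, so later writes to `out` go to that local and leave the binding untouched. Calling `init` on an already mutable local returns it unchanged and emits nothing.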

Member

@nathanielsimard, Sep 23, 2024

Perfect, that answers my question :)

@nathanielsimard merged commit 0f77be9 into tracel-ai:main on Sep 23, 2024
5 checks passed
@wingertge deleted the variable_allocator branch on September 24, 2024 at 10:53