
[FIX] Resolve #19 by fixing broken import #20

Closed · wants to merge 1 commit

Conversation

EtienneChollet

No description provided.

@balbasty (Owner)

I am not sure what to do here....

  1. My functions take multiple input tensors, so it seems we do need custom_fwd/custom_bwd, even though I only use pure PyTorch functions under the hood (see here)
  2. But I don't love having the device hardcoded like you currently have. What if CUDA is available but we are running some piece of code on the CPU? Is it going to crash?
  3. We anyway need a wrapper around custom_fwd/custom_bwd to support both the "old version" (which does not have the device_type argument) and the "new version" (which requires the device_type argument).
  4. Does it mean we must define different Function classes for CPU and GPU? This sounds so weird... Maybe I can define a generic wrapper that takes an undecorated Function class and returns two different classes decorated with the cuda and cpu decorators?
  5. Anyway, I need tests for this mixed-precision stuff. I don't think I have ever actually tried running the code in mixed precision.

@EtienneChollet do you want to help with this? (you don't have to!)

@balbasty (Owner)

I am closing this PR as the issue was fixed (differently) in #21.

balbasty closed this Sep 13, 2024