Can this project be combined with PyTorch (torch) for training? #102

Open
WindCanDie opened this issue Oct 10, 2024 · 3 comments

Comments

@WindCanDie

I would like to know whether this project can be combined with PyTorch (torch) for training.

@2niuhe

2niuhe commented Oct 11, 2024

I have the same question. This project seems to focus on translating a described computation graph into code, and all the provided examples illustrate forward inference only. For the backward pass, you would have to describe the corresponding backward computation graph yourself. This is just my understanding, so it may not be correct.
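To make the idea concrete, here is a minimal sketch of how a separately described forward and backward could be plugged into PyTorch training through `torch.autograd.Function`. The names `external_forward_kernel`, `external_backward_kernel`, `ExternalSiLU`, and `train_step` are hypothetical and are not part of this project; the two "kernels" are plain PyTorch stand-ins for whatever code the project would generate, and SiLU was picked only because its derivative is short.

```python
import torch


def external_forward_kernel(x: torch.Tensor) -> torch.Tensor:
    # Hypothetical stand-in for a generated forward kernel.
    # It computes SiLU: x * sigmoid(x).
    return x * torch.sigmoid(x)


def external_backward_kernel(x: torch.Tensor, grad_out: torch.Tensor) -> torch.Tensor:
    # Hypothetical stand-in for the hand-described backward graph.
    # d/dx [x * sigmoid(x)] = s * (1 + x * (1 - s)), where s = sigmoid(x).
    s = torch.sigmoid(x)
    return grad_out * s * (1 + x * (1 - s))


class ExternalSiLU(torch.autograd.Function):
    """Tells autograd which forward and backward routines belong together."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)  # keep the input for the backward pass
        return external_forward_kernel(x)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        return external_backward_kernel(x, grad_out)


def train_step(weight, inputs, targets, optimizer):
    # One ordinary PyTorch training step that routes through the custom op.
    optimizer.zero_grad()
    pred = ExternalSiLU.apply(inputs @ weight)
    loss = torch.nn.functional.mse_loss(pred, targets)
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    torch.manual_seed(0)
    weight = torch.randn(4, 2, requires_grad=True)
    optimizer = torch.optim.SGD([weight], lr=0.1)
    inputs, targets = torch.randn(8, 4), torch.randn(8, 2)
    for step in range(5):
        print(step, train_step(weight, inputs, targets, optimizer))
```

In this pattern autograd never looks inside the kernels, so the backward routine has to be mathematically consistent with the forward one; PyTorch will not derive it for you.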

@daneren

daneren commented Oct 12, 2024

I would also like to know the answer to this.

@Edenzzzz

Edenzzzz commented Oct 22, 2024

Even if you use Triton (rather than torch.compile), you will need to write your own backward pass. torch.compile doesn't work for complex operators involving transposes and the like, such as FlashAttention.
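As a rough illustration of what "writing your own backward" involves for an attention-style operator, the sketch below hand-derives the gradients of naive scaled dot-product attention and checks them numerically with `torch.autograd.gradcheck`. It is an assumption-laden toy: `NaiveAttention` is a hypothetical name, the code is plain PyTorch rather than Triton, and it stores the full probability matrix, which FlashAttention specifically avoids. Only the transpose-heavy gradient math carries over.

```python
import math

import torch


class NaiveAttention(torch.autograd.Function):
    """softmax(Q K^T / sqrt(d)) V with a hand-written backward pass."""

    @staticmethod
    def forward(ctx, q, k, v):
        scale = 1.0 / math.sqrt(q.shape[-1])
        # p holds the attention probabilities, shape (..., n_q, n_k).
        p = torch.softmax((q @ k.transpose(-2, -1)) * scale, dim=-1)
        ctx.save_for_backward(q, k, v, p)
        ctx.scale = scale
        return p @ v

    @staticmethod
    def backward(ctx, grad_out):
        q, k, v, p = ctx.saved_tensors
        # O = P V  =>  dV = P^T dO  and  dP = dO V^T
        grad_v = p.transpose(-2, -1) @ grad_out
        grad_p = grad_out @ v.transpose(-2, -1)
        # Row-wise softmax backward: dS = P * (dP - sum(dP * P))
        grad_s = p * (grad_p - (grad_p * p).sum(dim=-1, keepdim=True))
        # S = scale * Q K^T  =>  dQ = scale * dS K  and  dK = scale * dS^T Q
        grad_q = (grad_s @ k) * ctx.scale
        grad_k = (grad_s.transpose(-2, -1) @ q) * ctx.scale
        return grad_q, grad_k, grad_v


def check_backward(batch=2, seq=3, dim=4):
    # gradcheck compares the hand-written gradients against finite
    # differences; it expects double-precision inputs.
    torch.manual_seed(0)
    q, k, v = (
        torch.randn(batch, seq, dim, dtype=torch.double, requires_grad=True)
        for _ in range(3)
    )
    return torch.autograd.gradcheck(NaiveAttention.apply, (q, k, v))


if __name__ == "__main__":
    print("hand-written backward matches finite differences:", check_backward())
```

A numerical check like this is a cheap way to catch a misplaced transpose before the kernel is used for training.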
