[ops] Add a sharded attention operation for SDPA #381

rsuderman · 2024-10-30T20:18:48Z

The existing implementation invokes the `torch` SDPA operator directly. This change rewires the call to go through the `ops` system so that a sharded SDPA operation can be dispatched. It also includes a sharded implementation, which is compared against the unsharded version.
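For readers unfamiliar with the idea, here is a rough, dependency-free sketch of what comparing a sharded and an unsharded SDPA can look like. It is not the code from this PR: it uses plain Python lists instead of `torch` tensors, and the helper names (`sdpa`, `multi_head_sdpa`, `split_heads`, `sharded_sdpa`) are hypothetical. It also assumes the sharding is done along the head axis, which the description does not state.

```python
import math
import random


def sdpa(q, k, v):
    """Unsharded reference for one head: softmax(q @ k^T / sqrt(d)) @ v.

    q, k, v are lists of rows with shape (seq_len, head_dim).
    """
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for q_row in q:
        scores = [scale * sum(a * b for a, b in zip(q_row, k_row)) for k_row in k]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * v_row[j] for w, v_row in zip(weights, v)) / total
            for j in range(len(v[0]))
        ])
    return out


def multi_head_sdpa(q, k, v):
    """Apply sdpa independently to every head (inputs are lists of heads)."""
    return [sdpa(qh, kh, vh) for qh, kh, vh in zip(q, k, v)]


def split_heads(x, shard_count):
    """Split a list of heads into contiguous shards.

    Assumes len(x) is divisible by shard_count.
    """
    per_shard = len(x) // shard_count
    return [x[i * per_shard:(i + 1) * per_shard] for i in range(shard_count)]


def sharded_sdpa(q, k, v, shard_count):
    """Hypothetical sharded variant: each shard attends over its own heads."""
    shards = zip(split_heads(q, shard_count),
                 split_heads(k, shard_count),
                 split_heads(v, shard_count))
    out = []
    for q_s, k_s, v_s in shards:
        # Concatenating per-shard results along the head axis "unshards" them.
        out.extend(multi_head_sdpa(q_s, k_s, v_s))
    return out


def random_heads(rng, heads, seq_len, head_dim):
    """Build a (heads, seq_len, head_dim) nested list of random floats."""
    return [[[rng.uniform(-1.0, 1.0) for _ in range(head_dim)]
             for _ in range(seq_len)] for _ in range(heads)]


rng = random.Random(0)
q = random_heads(rng, heads=4, seq_len=3, head_dim=8)
k = random_heads(rng, heads=4, seq_len=3, head_dim=8)
v = random_heads(rng, heads=4, seq_len=3, head_dim=8)

expected = multi_head_sdpa(q, k, v)            # unsharded path
actual = sharded_sdpa(q, k, v, shard_count=2)  # sharded path

# The two paths should agree element-wise.
for head_a, head_e in zip(actual, expected):
    for row_a, row_e in zip(head_a, head_e):
        for a, e in zip(row_a, row_e):
            assert math.isclose(a, e, rel_tol=1e-9, abs_tol=1e-12)
```

Because attention heads do not interact, splitting along the head axis lets each shard run independently, and concatenating the results should reproduce the unsharded output. A split along another axis would need extra communication between shards.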

rsuderman requested a review from KyleHerndon October 30, 2024 20:18

KyleHerndon approved these changes Oct 30, 2024


rsuderman force-pushed the sharded_attention branch 2 times, most recently from f5a2327 to 8d5cb89 October 30, 2024 22:13

[ops] Add a sharded attention operation for SDPA

cb15e76


rsuderman force-pushed the sharded_attention branch from 8d5cb89 to cb15e76 October 31, 2024 18:25

rsuderman merged commit 2d46caa into nod-ai:main Oct 31, 2024
4 checks passed

rsuderman deleted the sharded_attention branch October 31, 2024 18:30
