*: integrate with collective instances when ready #116

rohany · 2022-05-01T19:12:26Z

As soon as the collective instance branch is ready for testing, we need to move to it and introduce new concepts in the lowerer and mapper to correctly handle creating collective instances. This is a three-phase process.

1. Simple "replicated tensor" computations, removing all of the hard-coded manual replication along the way. Codes that come to mind are:

   - SpMV weak scale
   - TTV
   - TTMC
   - MTTKRP

2. More complicated launch patterns where subsets of launches need pieces of tensors:

   - Johnson's Algorithm
   - COSMA

3. 2D matrix computations that do lock-step broadcast communication, such as SUMMA. This step will be the hardest, as it requires changing a lot of code. The problem with the current approach is that it does one 2D launch, and then each launched sub-task launches a bunch more tasks. It's likely that we will need to convert this into a 3D launch with a projection functor that understands the ordering between tasks (also generated by DISTAL) and then chooses collectives to use for each row/column.

   - PUMMA
   - SUMMA
   - 2.5D MatMul
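
To make the 3D launch idea for the SUMMA-style codes concrete, here is a rough sketch in plain Python. It is not Legion or DISTAL code: `project` and `collective_groups` are hypothetical names, and the tile mapping assumes the standard SUMMA layout where step `k` of task `(i, j)` reads tile `A[i, k]` and tile `B[k, j]`. It only shows that the tasks sharing a tile at a given step form a row or a column of the grid, which is the set a collective would need to cover.

```python
from collections import defaultdict

def project(point):
    """Hypothetical projection: map a 3D launch point (i, j, k) to tile coordinates."""
    i, j, k = point
    return {"A": (i, k), "B": (k, j), "C": (i, j)}

def collective_groups(grid_x, grid_y, steps):
    """Group point tasks by the A or B tile they read.

    Every group is a candidate for one collective: all tasks in it
    need the same tile during the same step k.
    """
    groups = defaultdict(list)
    for k in range(steps):              # ordering between tasks: step k before k + 1
        for i in range(grid_x):
            for j in range(grid_y):
                tiles = project((i, j, k))
                groups[("A", tiles["A"])].append((i, j, k))
                groups[("B", tiles["B"])].append((i, j, k))
    return dict(groups)

groups = collective_groups(2, 2, 2)
# Readers of A[0, 1] are row 0 of the grid at step 1.
row_readers = groups[("A", (0, 1))]
# Readers of B[1, 0] are column 0 of the grid at step 1.
col_readers = groups[("B", (1, 0))]
```

In this sketch each A tile is shared along a grid row and each B tile along a grid column, so a real projection functor would have to expose that grouping to whatever chooses the collectives.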

rohany · 2022-05-02T16:39:50Z

It's possible that for the third case, we can do something with the outer partitioning, rather than adjusting the launch space. We can just move the k loop to the outside, and do launches of index space tasks over the machine that use collectives!
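
A minimal simulation of that alternative, again in plain Python rather than Legion: the `k` loop sits on the outside and each iteration stands in for one 2D index launch over the processor grid. The helper names (`matmul`, `add`, `summa_k_outer`) and the tile layout are assumptions made for illustration; the reads of `a_tiles[i][k]` and `b_tiles[k][j]` mark where a row or column collective would be used.

```python
def matmul(x, y):
    """Dense product of two small matrices stored as lists of rows."""
    inner = len(y)
    return [[sum(x[r][t] * y[t][c] for t in range(inner)) for c in range(len(y[0]))]
            for r in range(len(x))]

def add(x, y):
    """Elementwise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]

def summa_k_outer(a_tiles, b_tiles, grid, tile):
    """Simulate SUMMA on a grid x grid machine with the k loop outermost."""
    c_tiles = [[[[0] * tile for _ in range(tile)] for _ in range(grid)]
               for _ in range(grid)]
    for k in range(grid):
        # One 2D index launch per k: every (i, j) point task runs this body.
        for i in range(grid):
            for j in range(grid):
                a = a_tiles[i][k]   # shared by all tasks in row i at step k
                b = b_tiles[k][j]   # shared by all tasks in column j at step k
                c_tiles[i][j] = add(c_tiles[i][j], matmul(a, b))
    return c_tiles

# 2x2 grid with 1x1 tiles: A = [[1, 2], [3, 4]], B = [[5, 6], [7, 8]].
a_tiles = [[[[1]], [[2]]], [[[3]], [[4]]]]
b_tiles = [[[[5]], [[6]]], [[[7]], [[8]]]]
c_tiles = summa_k_outer(a_tiles, b_tiles, grid=2, tile=1)
```

The launch space stays 2D here, and the ordering between steps comes from the sequential outer loop instead of from a projection functor over a 3D launch.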
