DML EP EinSum make more generic to avoid EP fallback #21114

Merged Jun 21, 2024 (6 commits)


Contributor

@fdwr fdwr commented Jun 20, 2024

Problem

Newer models that use more novel equations (e.g. bhwc,hkc->bhwk in Segment Anything's encoder, or bqc,bchw->bqhw) cause fallback from DML to CPU, which hurts performance. The EP used pattern matching to map the more common equations to existing DML operators, but the number of permutations was prohibitive, so it could not catch them all.

Solution

So, ditch the static mapping, and instead handle any 1-input or 2-input case via remapped strides and a mini-graph of elementwise multiplication & sum reduction (as if DML had a DML_OPERATOR_DOT_PRODUCT that took axes). A subset of mappings still exists for performance (GEMM, pure reduction, transpose...), but they are identified generally rather than via a pattern table. Also...

  • ✅ Diagonals are supported now (e.g. iji->i).
  • ✅ Removes any remaining DML-specific EinSum GTEST_SKIP statements.
  • ✅ Handles any cases up to 8 unique labels (DML dimension limit is 8D).
  • ⚠️ Equations with >= 3 inputs, or with arbitrary-rank inputs via ellipsis, are not handled, but we have yet to come across a model that needs them.
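
To make the multiply-then-reduce idea concrete, here is a minimal CPU-side sketch, not the EP's actual code. It assumes a hypothetical `Tensor` struct and hypothetical helpers `LabelStrides`, `Offset`, and `EinSum`. Each operand gets one stride per label: a label the operand lacks has stride 0 (broadcast), and a repeated label such as the `i` in `iji` gets the sum of its strides, which walks the diagonal. The real change builds DML operators rather than looping on the CPU.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical tensor type for illustration only (not a DML or ORT type).
struct Tensor {
    std::vector<uint32_t> sizes;  // one size per label character, in label order
    std::vector<float> data;      // densely packed, row-major
};

// Map each label to its element stride within one operand.
// Repeated labels accumulate their strides, which selects the diagonal.
static std::map<char, size_t> LabelStrides(const std::string& labels,
                                           const std::vector<uint32_t>& sizes) {
    std::map<char, size_t> strides;
    size_t stride = 1;
    for (size_t i = labels.size(); i-- > 0;) {
        strides[labels[i]] += stride;
        stride *= sizes[i];
    }
    return strides;
}

// Element offset for the current index; labels absent from `strides` act as stride 0.
static size_t Offset(const std::map<char, size_t>& strides,
                     const std::vector<char>& labels,
                     const std::vector<uint32_t>& index) {
    size_t offset = 0;
    for (size_t k = 0; k < labels.size(); ++k) {
        auto it = strides.find(labels[k]);
        if (it != strides.end()) offset += index[k] * it->second;
    }
    return offset;
}

// Evaluate e.g. "ij,jk->ik" by visiting every combination of label indices,
// multiplying the inputs, and accumulating into the output element.
// Sizes of a label shared between operands are assumed to agree (not validated).
Tensor EinSum(const std::vector<std::string>& inputLabels,
              const std::vector<Tensor>& inputs,
              const std::string& outputLabels) {
    assert(inputLabels.size() == inputs.size());
    std::map<char, uint32_t> labelSizes;
    std::vector<std::map<char, size_t>> inputStrides;
    for (size_t n = 0; n < inputs.size(); ++n) {
        assert(inputLabels[n].size() == inputs[n].sizes.size());
        for (size_t i = 0; i < inputLabels[n].size(); ++i)
            labelSizes[inputLabels[n][i]] = inputs[n].sizes[i];
        inputStrides.push_back(LabelStrides(inputLabels[n], inputs[n].sizes));
    }

    Tensor output;
    size_t outputCount = 1;
    for (char c : outputLabels) {
        output.sizes.push_back(labelSizes.at(c));
        outputCount *= output.sizes.back();
    }
    output.data.assign(outputCount, 0.0f);
    const std::map<char, size_t> outputStrides = LabelStrides(outputLabels, output.sizes);

    // Flatten the unique labels into parallel arrays for an odometer-style loop.
    std::vector<char> labels;
    std::vector<uint32_t> sizes;
    size_t total = 1;
    for (const auto& entry : labelSizes) {
        labels.push_back(entry.first);
        sizes.push_back(entry.second);
        total *= entry.second;
    }

    std::vector<uint32_t> index(labels.size(), 0);
    for (size_t step = 0; step < total; ++step) {
        float product = 1.0f;  // elementwise multiply across the inputs
        for (size_t n = 0; n < inputs.size(); ++n)
            product *= inputs[n].data[Offset(inputStrides[n], labels, index)];
        output.data[Offset(outputStrides, labels, index)] += product;  // sum reduction
        for (size_t k = labels.size(); k-- > 0;) {  // advance the odometer
            if (++index[k] < sizes[k]) break;
            index[k] = 0;
        }
    }
    return output;
}
```

With this sketch, `EinSum({"ij", "jk"}, {a, b}, "ik")` behaves as a matrix multiply, `"ij->ji"` as a transpose, and `"iji->i"` sums along `j` over the diagonal of the two `i` axes, all through the same loop and with no per-equation pattern table.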

ℹ️ Note that even with this change, where all nodes are assigned to DML (no CPU fallback), we still end up with multiple partitions because the ONNX EinSum shape inference logic by @peishenyan isn't in ORT yet ⏳.
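
The support limits listed in the description (at most 2 inputs, no ellipsis, at most 8 unique labels) can be pictured as a simple equation check. The function below is a hypothetical sketch written for this discussion; `IsEinSumEquationSupported` is an assumed name, and the EP's real validation may be structured quite differently.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical check mirroring the limits stated in the PR description.
bool IsEinSumEquationSupported(const std::string& equation) {
    const size_t maximumLabelCount = 8;  // DML tensors are limited to 8 dimensions

    // Ellipsis (arbitrary-rank inputs) is not handled.
    if (equation.find("...") != std::string::npos) return false;

    // Count inputs by the commas before "->" (or in the whole string if implicit).
    const std::string inputPart = equation.substr(0, equation.find("->"));
    size_t inputCount = 1;
    for (char c : inputPart) {
        if (c == ',') ++inputCount;
    }

    // Collect the unique label characters across the whole equation.
    std::set<char> labels;
    for (char c : equation) {
        if (std::isalpha(static_cast<unsigned char>(c))) labels.insert(c);
    }

    return inputCount <= 2 && labels.size() <= maximumLabelCount;
}
```

Under these assumptions, `bhwc,hkc->bhwk`, `bqc,bchw->bqhw`, and `iji->i` pass, while a three-input equation such as `ab,bc,cd->ad` or an ellipsis form such as `...ij,jk->...ik` would still be rejected.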

@fdwr fdwr requested review from PatriceVignola and smk2007 June 20, 2024 07:20
PatriceVignola previously approved these changes Jun 20, 2024
@fdwr fdwr added the ep:DML issues related to the DirectML execution provider label Jun 20, 2024
@smk2007
Member

smk2007 commented Jun 20, 2024

void TensorDesc::EnsureStridesExist()

nit: noexcept


Refers to: onnxruntime/core/providers/dml/DmlExecutionProvider/src/TensorDesc.cpp:367 in 17b9461.

smk2007 previously approved these changes Jun 21, 2024
@fdwr
Contributor Author

fdwr commented Jun 21, 2024

/azp run Big Models,orttraining-amd-gpu-ci-pipeline (Linux_Build_manylinux)


Azure Pipelines successfully started running 1 pipeline(s).

@fdwr
Contributor Author

fdwr commented Jun 21, 2024

The Linux errors and orttraining appear unrelated. Merging...

@fdwr fdwr merged commit ac21626 into main Jun 21, 2024
95 of 99 checks passed
@fdwr fdwr deleted the users/dwayner/EinSum branch June 21, 2024 18:46
3 participants