
Distinguish between DML and the generic 'GPU' term. This is needed for packaging DML EP in the same ORT GPU pkg. #22597

Closed
pranavsharma wants to merge 20 commits from the package_dml branch

Conversation

pranavsharma
Contributor

@pranavsharma pranavsharma commented Oct 25, 2024

Description

We want to package the DML EP in the same package as the ORT GPU package. This is one of the changes required to do so; two other changes are also required.

Motivation and Context

Users want the DML, TensorRT and CUDA EPs in the same ORT GPU package so that they can easily switch between them.
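Once all three EPs ship in one package, an application could pick an EP at session-creation time. The helper below is a hypothetical sketch (not part of the ORT API): it orders a preference list against whatever providers a build reports as available; with onnxruntime installed, `available` would come from `onnxruntime.get_available_providers()` and the result would be passed as `InferenceSession(model_path, providers=...)`.

```python
# Provider names as reported by ONNX Runtime, in preference order.
PREFERENCE = [
    "TensorrtExecutionProvider",
    "CUDAExecutionProvider",
    "DmlExecutionProvider",
    "CPUExecutionProvider",
]

def choose_providers(available):
    """Return the preferred providers that are actually available,
    keeping CPU as the final fallback."""
    chosen = [p for p in PREFERENCE if p in available]
    if "CPUExecutionProvider" not in chosen:
        chosen.append("CPUExecutionProvider")
    return chosen
```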

snnn
snnn previously approved these changes Oct 25, 2024
fdwr
fdwr previously approved these changes Oct 25, 2024
Contributor

@fdwr fdwr left a comment


Looks okay to me. FYI @sumitsays.

@pranavsharma pranavsharma dismissed stale reviews from fdwr and snnn via 381f55f October 25, 2024 19:36
fdwr
fdwr previously approved these changes Oct 25, 2024
Contributor

@fdwr fdwr left a comment


👍 Sumit/Patrice, do you have any concerns?

@pranavsharma
Contributor Author

> 👍 Sumit/Patrice, do you have any concerns?

Looks like there are some DML test failures. I'm investigating. @sumitsays / @fdwr / @PatriceVignola: if something looks obvious, let me know. Thanks!

@tianleiwu
Contributor

Do we plan to support using the DML and CUDA EPs in the same session? If so, some parts, like GPU data transfer, need to support transferring tensors between DML and CUDA memory. If not, we might need to add a check to prevent that usage.

Contributor

@github-actions github-actions bot left a comment


You can commit the suggested changes from lintrunner.

Suggested change in onnxruntime/python/onnxruntime_pybind_ortvalue.cc (resolved)
@snnn
Member

snnn commented Oct 29, 2024

> Do we plan to support using the DML and CUDA EPs in the same session? If so, some parts, like GPU data transfer, need to support transferring tensors between DML and CUDA memory. If not, we might need to add a check to prevent that usage.

No. They cannot be both enabled in the same process.
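Since the two EPs cannot be enabled together in the same process, a load-time guard along these lines could reject the combination up front. This is a hypothetical sketch, not the actual ORT check:

```python
# GPU execution providers that must not coexist in one process.
MUTUALLY_EXCLUSIVE = {"DmlExecutionProvider", "CUDAExecutionProvider"}

def validate_providers(requested):
    """Raise if mutually exclusive GPU providers are requested together;
    otherwise return the request unchanged."""
    conflict = MUTUALLY_EXCLUSIVE & set(requested)
    if len(conflict) > 1:
        raise ValueError(
            "DML and CUDA execution providers cannot both be enabled in "
            f"the same process; requested: {sorted(conflict)}"
        )
    return list(requested)
```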

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@pranavsharma pranavsharma force-pushed the package_dml branch 2 times, most recently from 737a9c6 to a5e52fa Compare October 29, 2024 06:40
Contributor

@github-actions github-actions bot left a comment


You can commit the suggested changes from lintrunner.

Suggested change in onnxruntime/python/onnxruntime_pybind_ortvalue.cc (resolved)
fs-eire and others added 6 commits October 29, 2024 04:01
### Description

This change resolves issue No. 3 described in #22615.

### Description
* Leverage the `common-variables.yml` template and reduce usage of hardcoded
trt_version

https://github.com/microsoft/onnxruntime/blob/8391b24447fcca4c01599b3270255fbf76ac8a21/tools/ci_build/github/azure-pipelines/templates/common-variables.yml#L2-L7
* Across all CI YAMLs, this PR reduces hardcoded trt_version occurrences
from 40 to 6 by importing trt_version from `common-variables.yml`
* Apply TRT 10.5 and re-enable the control flow op test


### Motivation and Context
- Reduce hardcoded trt_version usage across all CI YAMLs

### Next refactor PR
Will reduce hardcoded trt_version usage in `.dockerfile`, `.bat`, and the
remaining 2 yml files (download_win_gpu_library.yml and set-winenv.yml,
which are step-template YAMLs that can't import variables)
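The refactor pattern is roughly the following sketch. The variable name `common_trt_version` is illustrative; see the linked `common-variables.yml` for the actual variable names:

```yaml
# In a pipeline yml: import the shared variables template instead of
# hardcoding the TensorRT version in each file.
variables:
  - template: templates/common-variables.yml

steps:
  - script: echo "Using TensorRT $(common_trt_version)"
    displayName: Print TRT version
```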
### Description
Allow some classes to be default constructed.
The effect is the same as constructing them with nullptr.
Make the default ctor visible from the base classes.

### Motivation and Context
Multiple customers complained that when storing Ort::Value in a
std::vector, the vector cannot be resized.

We enable that by allowing Ort::Value to be default constructed.
### Description
The issue can happen with multiple sessions when ETW captureState /
rundown is triggered.

Resolves a use-after-free issue.

Tested with a local unit test that creates/destroys multiple sessions while
continually enabling and disabling ETW. The test currently requires an
Admin prompt, so it is not checked in.

### Motivation and Context
ORT should not crash
…mers/models/whisper (#22641)

Bumps [onnx](https://github.com/onnx/onnx) from 1.16.1 to 1.17.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/onnx/onnx/releases">onnx's
releases</a>.</em></p>
<blockquote>
<h2>v1.17.0</h2>
<p>ONNX v1.17.0 is now available with exciting new features! We would
like to thank everyone who contributed to this release!
Please visit <a href="https://onnx.ai/">onnx.ai</a> to learn more about
ONNX and associated projects.</p>
<h1>Key Updates</h1>
<h2>ai.onnx Opset 22</h2>
<ul>
<li>Update to support bfloat16:
<ul>
<li><a
href="https://onnx.ai/onnx/operators/onnx__Acos.html#acos-22">Acos</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__Acosh.html#acosh-22">Acosh</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__Asin.html#asin-22">Asin</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__Asinh.html#asinh-22">Asinh</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__Atan.html#atan-22">Atan</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__Atanh.html#atanh-22">Atanh</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__AveragePool.html#averagepool-22">AveragePool</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__Bernoulli.html#bernoulli-22">Bernoulli</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__Conv.html#conv-22">Conv</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__ConvTranspose.html#convtranspose-22">ConvTranspose</a>,
<a href="https://onnx.ai/onnx/operators/onnx__Cos.html#cos-22">Cos</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__Cosh.html#cosh-22">Cosh</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__DeformConv.html#deformconv-22">DeformConv</a>,
<a href="https://onnx.ai/onnx/operators/onnx__Det.html#det-22">Det</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__Dropout.html#dropout-22">Dropout</a>,
<a href="https://onnx.ai/onnx/operators/onnx__Elu.html#elu-22">Elu</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__EyeLike.html#eyelike-22">EyeLike</a>,
<a href="https://onnx.ai/onnx/operators/onnx__GRU.html#gru-22">GRU</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__GlobalAveragePool.html#globalaveragepool-22">GlobalAveragePool</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__GlobalLpPool.html#globallppool-22">GlobalLpPool</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__GlobalMaxPool.html#globalmaxpool-22">GlobalMaxPool</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__GridSample.html#gridsample-22">GridSample</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__HardSigmoid.html#hardsigmoid-22">HardSigmoid</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__HardSwish.html#hardswish-22">HardSwish</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__InstanceNormalization.html#instancenormalization-22">InstanceNormalization</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__LSTM.html#lstm-22">LSTM</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__LpNormalization.html#lpnormalization-22">LpNormalization</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__LpPool.html#lppool-22">LpPool</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__MaxPool.html#maxpool-22">MaxPool</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__MaxRoiPool.html#maxroipool-22">MaxRoiPool</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__MaxUnpool.html#maxunpool-22">MaxUnpool</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__Mish.html#mish-22">Mish</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__Multinomial.html#multinomial-22">Multinomial</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__NegativeLogLikelihoodLoss.html#negativeloglikelihoodloss-22">NegativeLogLikelihoodLoss</a>,
<a href="https://onnx.ai/onnx/operators/onnx__RNN.html#rnn-22">RNN</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__RandomNormal.html#randomnormal-22">RandomNormal</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__RandomNormalLike.html#randomnormallike-22">RandomNormalLike</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__RandomUniform.html#randomuniform-22">RandomUniform</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__RandomUniformLike.html#randomuniformlike-22">RandomUniformLike</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__RoiAlign.html#roialign-22">RoiAlign</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__Round.html#round-22">Round</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__Selu.html#selu-22">Selu</a>,
<a href="https://onnx.ai/onnx/operators/onnx__Sin.html#sin-22">Sin</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__Sinh.html#sinh-22">Sinh</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__Softplus.html#softplus-22">Softplus</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__Softsign.html#softsign-22">Softsign</a>,
<a href="https://onnx.ai/onnx/operators/onnx__Tan.html#tan-22">Tan</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__ThresholdedRelu.html#thresholdedrelu-22">ThresholdedRelu</a></li>
</ul>
</li>
</ul>
<h2>Python Changes</h2>
<ul>
<li>Support for numpy &gt;= 2.0</li>
</ul>
<h1>Bug fixes and infrastructure improvements</h1>
<ul>
<li>Fix Check URLs errors <a
href="https://redirect.github.com/onnx/onnx/pull/5972">5972</a></li>
<li>Use CMAKE_PREFIX_PATH in finding libprotobuf <a
href="https://redirect.github.com/onnx/onnx/pull/5975">5975</a></li>
<li>Bump main VERSION_NUMBER to 1.17.0 <a
href="https://redirect.github.com/onnx/onnx/pull/5968">5968</a></li>
<li>Fix source and pip tar.gz builds on s390x systems <a
href="https://redirect.github.com/onnx/onnx/pull/5984">5984</a></li>
<li>Fix unique_name <a
href="https://redirect.github.com/onnx/onnx/pull/5992">5992</a></li>
<li>Fix SegFault bug in shape inference <a
href="https://redirect.github.com/onnx/onnx/pull/5990">5990</a></li>
<li>Fix onnx.compose when connecting subgraphs <a
href="https://redirect.github.com/onnx/onnx/pull/5991">5991</a></li>
<li>Fix conversion from split 11 to split 18 <a
href="https://redirect.github.com/onnx/onnx/pull/6020">6020</a></li>
<li>Update error messages for NegativeLogLikelihoodLoss inference
function <a
href="https://redirect.github.com/onnx/onnx/pull/6021">6021</a></li>
<li>Generalize input/output number check in shape inference <a
href="https://redirect.github.com/onnx/onnx/pull/6005">6005</a></li>
<li>Replace rank inference with shape inference for Einsum op <a
href="https://redirect.github.com/onnx/onnx/pull/6010">6010</a></li>
<li>build from source instruction with latest cmake change <a
href="https://redirect.github.com/onnx/onnx/pull/6038">6038</a></li>
<li>Handle OneHot's depth value during shape inference <a
href="https://redirect.github.com/onnx/onnx/pull/5963">5963</a></li>
<li>Not to install cmake in pyproject.toml on Windows <a
href="https://redirect.github.com/onnx/onnx/pull/6045">6045</a></li>
<li>fix a skipped shape infer code <a
href="https://redirect.github.com/onnx/onnx/pull/6049">6049</a></li>
<li>Include the &quot;.onnxtext&quot; extension in supported
serialization format <a
href="https://redirect.github.com/onnx/onnx/pull/6051">6051</a></li>
<li>Allow ReferenceEvaluator to return intermediate results <a
href="https://redirect.github.com/onnx/onnx/pull/6066">6066</a></li>
<li>Fix 1 typo in numpy_helper.py <a
href="https://redirect.github.com/onnx/onnx/pull/6041">6041</a></li>
<li>Remove benchmarking code <a
href="https://redirect.github.com/onnx/onnx/pull/6076">6076</a></li>
<li>Prevent crash on import after GCC 8 builds <a
href="https://redirect.github.com/onnx/onnx/pull/6048">6048</a></li>
<li>Check graph outputs are defined <a
href="https://redirect.github.com/onnx/onnx/pull/6083">6083</a></li>
<li>Enable additional ruff rules <a
href="https://redirect.github.com/onnx/onnx/pull/6032">6032</a></li>
<li>Add missing shape inference check for DequantizeLinear <a
href="https://redirect.github.com/onnx/onnx/pull/6080">6080</a></li>
<li>Add bfloat16 to all relevant ops <a
href="https://redirect.github.com/onnx/onnx/pull/6099">6099</a></li>
<li>fix(ci): install python dependencies with --only-binary :all: in
manylinux <a
href="https://redirect.github.com/onnx/onnx/pull/6120">6120</a></li>
<li>fix: install google-re2 with --only-binary option <a
href="https://redirect.github.com/onnx/onnx/pull/6129">6129</a></li>
<li>Specify axis parameter for DequantizeLinear when input rank is 1 <a
href="https://redirect.github.com/onnx/onnx/pull/6095">6095</a></li>
<li>Pin onnxruntime to 1.17.3 for release CIs <a
href="https://redirect.github.com/onnx/onnx/pull/6143">6143</a></li>
<li>Fix INT4 TensorProto byte size is 5x larger than expected with
negative values <a
href="https://redirect.github.com/onnx/onnx/pull/6161">6161</a></li>
<li>Mitigate tarball directory traversal risks <a
href="https://redirect.github.com/onnx/onnx/pull/6164">6164</a></li>
<li>Fix reference implementation for ScatterND with 4D tensors <a
href="https://redirect.github.com/onnx/onnx/pull/6174">6174</a></li>
<li>Addition of group &gt; 1 in test and in backend for ConvTranspose <a
href="https://redirect.github.com/onnx/onnx/pull/6175">6175</a></li>
<li>Support for bfloat16 for binary, unary operators in reference
implementation <a
href="https://redirect.github.com/onnx/onnx/pull/6166">6166</a></li>
<li>Refactor windows workflow to work on standard windows <a
href="https://redirect.github.com/onnx/onnx/pull/6190">6190</a></li>
<li>Fix a few crashes while running shape inference <a
href="https://redirect.github.com/onnx/onnx/pull/6195">6195</a></li>
<li>Update onnx to work with numpy&gt;=2.0 <a
href="https://redirect.github.com/onnx/onnx/pull/6196">6196</a></li>
<li>Use sets to improve performance of dfs search <a
href="https://redirect.github.com/onnx/onnx/pull/6213">6213</a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/onnx/onnx/commit/b8baa8446686496da4cc8fda09f2b6fe65c2a02c"><code>b8baa84</code></a>
Set version 1.17.0 for official release (<a
href="https://redirect.github.com/onnx/onnx/issues/6405">#6405</a>)</li>
<li><a
href="https://github.com/onnx/onnx/commit/6d77b808217f442170d105131836aa4820c0f43f"><code>6d77b80</code></a>
[Cherry-Pick] Fix main url checks (<a
href="https://redirect.github.com/onnx/onnx/issues/6312">#6312</a>) (<a
href="https://redirect.github.com/onnx/onnx/issues/6327">#6327</a>)</li>
<li><a
href="https://github.com/onnx/onnx/commit/174938d8b7d48f27b5c491626c6a474f5f5b829a"><code>174938d</code></a>
[Cherry-Pick] Fix protobuf pkg 5.28.0 failing on Windows (<a
href="https://redirect.github.com/onnx/onnx/issues/6342">#6342</a>) (<a
href="https://redirect.github.com/onnx/onnx/issues/6347">#6347</a>)</li>
<li><a
href="https://github.com/onnx/onnx/commit/f18d5931adc7b44ae5a2afd74e21ed51bcf2bc63"><code>f18d593</code></a>
[Cherry-Pick] Remove unused variables (<a
href="https://redirect.github.com/onnx/onnx/issues/6303">#6303</a>) (<a
href="https://redirect.github.com/onnx/onnx/issues/6324">#6324</a>)</li>
<li><a
href="https://github.com/onnx/onnx/commit/c58890537f466b9b294f6dd038dd826f9907e03d"><code>c588905</code></a>
Set version in rel-1.17.0 to 1.17.0rc1 (<a
href="https://redirect.github.com/onnx/onnx/issues/6317">#6317</a>)</li>
<li><a
href="https://github.com/onnx/onnx/commit/4392c2c9ae30cd10d199bd31fc7b272a6f842824"><code>4392c2c</code></a>
Prepare for rel-1.17.0 (<a
href="https://redirect.github.com/onnx/onnx/issues/6281">#6281</a>)</li>
<li><a
href="https://github.com/onnx/onnx/commit/cb54169e4f2b52861cf5ec546d244ea4b2d09964"><code>cb54169</code></a>
Update ort filter to 1.20.0 to skip tests known to fail with ort 1.19.0
(<a
href="https://redirect.github.com/onnx/onnx/issues/6306">#6306</a>)</li>
<li><a
href="https://github.com/onnx/onnx/commit/99e1fd352c05c3176770080824fd7a8c474c97c0"><code>99e1fd3</code></a>
Bump reviewdog/action-misspell from 1.21.0 to 1.23.0 (<a
href="https://redirect.github.com/onnx/onnx/issues/6268">#6268</a>)</li>
<li><a
href="https://github.com/onnx/onnx/commit/19205655059e1654ba2d44478bc3a1c75af7830f"><code>1920565</code></a>
Bump ossf/scorecard-action from 2.3.3 to 2.4.0 (<a
href="https://redirect.github.com/onnx/onnx/issues/6273">#6273</a>)</li>
<li><a
href="https://github.com/onnx/onnx/commit/2e8f2289b91d5670e1c661ab9119178b24197219"><code>2e8f228</code></a>
Bump mypy from 1.10.1 to 1.11.1 (<a
href="https://redirect.github.com/onnx/onnx/issues/6275">#6275</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/onnx/onnx/compare/v1.16.1...v1.17.0">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=onnx&package-manager=pip&previous-version=1.16.1&new-version=1.17.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/microsoft/onnxruntime/network/alerts).

</details>

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
BUG #22031

Optimize two situations:
1. Increase workgroupSize if only one workgroup is dispatched.
2. Avoid transpose if not necessary.

With this PR and PR #22577, the overall time of the demucs model improves
from 154.60 ms to 106.36 ms on my dGPU.
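The first optimization can be sketched as pure dispatch-size logic. This is a hypothetical illustration of the heuristic, not the actual WebGPU EP code; the default and maximum workgroup sizes are assumed values:

```python
DEFAULT_WORKGROUP_SIZE = 64
MAX_WORKGROUP_SIZE = 256  # assumed device limit for illustration

def plan_dispatch(total_invocations):
    """Pick (workgroup_size, num_workgroups). When the default size would
    dispatch only a single workgroup, grow the workgroup so the work is
    spread over more threads instead of a longer per-thread loop."""
    size = DEFAULT_WORKGROUP_SIZE
    num_groups = (total_invocations + size - 1) // size  # ceil division
    if num_groups == 1:
        size = min(MAX_WORKGROUP_SIZE, max(total_invocations, 1))
        num_groups = (total_invocations + size - 1) // size
    return size, num_groups
```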
indygit and others added 10 commits October 29, 2024 17:13
### Description
[DML EP] Update DML to 1.15.4



### Motivation and Context
We want customers to use the latest DirectML.
### JSEP Ops that need updating

- [x] Cast
- [x] ReduceMax
- [x] ReduceMin
- [x] Squeeze
- [x] Unsqueeze
- [x] Transpose
- [x] AveragePool
- [x] Flatten
- [x] Pad
- [x] If
### Description

This PR adds the actual implementation of the WebGPU EP based on
#22318.

This change includes the following:

<details>
<summary><b>core framework of WebGPU EP</b></summary>

  - WebGPU EP factory classes for:
    - handling WebGPU options
    - creating WebGPU EP instance
    - creating WebGPU context
  - WebGPU Execution Provider classes
    - GPU Buffer allocator
    - data transfer
  - Buffer management classes
    - Buffer Manager
    - BufferCacheManager
      - DisabledCacheManager
      - SimpleCacheManager
      - LazyReleaseCacheManager
      - BucketCacheManager
  - Program classes
    - Program (base)
    - Program Cache Key
    - Program Manager
  - Shader helper classes
    - Shader Helper
    - ShaderIndicesHelper
    - ShaderVariableHelper
  - Utils
    - GPU Query based profiler
    - compute context
    - string utils
  - Misc
    - Python binding WebGPU support (basic)
</details>

<details>
<summary><b>Kernel implementation</b></summary>


  - onnx.ai (default opset):
- Elementwise (math): Abs, Neg, Floor, Ceil, Reciprocal, Sqrt, Exp, Erf,
Log, Sin, Cos, Tan, Asin, Acos, Atan, Sinh, Cosh, Asinh, Acosh, Atanh,
Tanh, Not, Cast
- Elementwise (activation): Sigmoid, HardSigmoid, Clip, Elu, Relu,
LeakyRelu, ThresholdedRelu, Gelu
- Binary (math): Add, Sub, Mul, Div, Pow, Equal, Greater,
GreaterOrEqual, Less, LessOrEqual
    - (Tensors): Shape, Reshape, Squeeze, Unsqueeze
    - Where
    - Transpose
    - Concat
    - Expand
    - Gather
    - Tile
    - Range
    - LayerNormalization
  - com.microsoft
    - FastGelu
    - MatMulNBits
    - MultiHeadAttention
    - RotaryEmbedding
    - SkipLayerNormalization
    - LayerNormalization
    - SimplifiedLayerNormalization
    - SkipSimplifiedLayerNormalization

</details>

<details>
<summary><b>Build, test and CI pipeline integration</b></summary>

  - build works for Windows, macOS and iOS
  - support onnxruntime_test_all and python node test
  - added a new unit test for `--use_external_dawn` build flag.
  - updated macOS pipeline to build with WebGPU support
  - added a new pipeline for WebGPU Windows

</details>

This change does not include:

- Node.js binding support for WebGPU (will be a separate PR)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@pranavsharma pranavsharma requested a review from a team as a code owner October 30, 2024 06:27