
[MoE] Add TP and Mixtral MoE #19945

Merged: 20 commits merged into main from wangye/moe_tp on Mar 20, 2024
Conversation

wangyems (Contributor) commented on Mar 16, 2024

Description

1. Support Tensor Parallelism in ShardedMoE (see the sharding sketch after this list).
2. Make the necessary code changes to support Mixtral MoE.
3. Fix a bug related to using IOBinding in the test script (an IOBinding usage sketch follows the CodeQL note below).
4. Fix the input size limitation.
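For items 1 and 2, the sketch below illustrates in plain PyTorch (not the ONNX Runtime ShardedMoE kernel itself) how a Mixtral-style top-2 MoE layer can be sharded for tensor parallelism: each expert's gate/up projections are split column-wise and its down projection row-wise along the intermediate dimension, so every rank holds 1/tp_size of every expert and the partial outputs are summed with an all-reduce. Class names, dimensions, and routing details are illustrative assumptions, not code from this PR.

```python
import torch
import torch.distributed as dist
from torch import nn


class TensorParallelExpert(nn.Module):
    """One expert FFN holding only this rank's slice of the intermediate dim.

    In a real setup, the rank-`tp_rank` slice of the full checkpoint would be
    loaded into w1/w3 (column-parallel) and w2 (row-parallel).
    """

    def __init__(self, hidden, inter, tp_rank, tp_size):
        super().__init__()
        assert inter % tp_size == 0, "intermediate size must be divisible by tp_size"
        shard = inter // tp_size
        self.w1 = nn.Linear(hidden, shard, bias=False)  # gate proj (column-parallel)
        self.w3 = nn.Linear(hidden, shard, bias=False)  # up proj   (column-parallel)
        self.w2 = nn.Linear(shard, hidden, bias=False)  # down proj (row-parallel)

    def forward(self, x):
        # Partial result for this rank; ranks are summed by the caller.
        return self.w2(nn.functional.silu(self.w1(x)) * self.w3(x))


class ShardedMixtralMoE(nn.Module):
    def __init__(self, hidden=4096, inter=14336, n_experts=8, top_k=2,
                 tp_rank=0, tp_size=1):
        super().__init__()
        self.top_k = top_k
        self.tp_size = tp_size
        self.router = nn.Linear(hidden, n_experts, bias=False)
        self.experts = nn.ModuleList(
            TensorParallelExpert(hidden, inter, tp_rank, tp_size)
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: [tokens, hidden]
        probs = torch.softmax(self.router(x), dim=-1)
        weights, chosen = torch.topk(probs, self.top_k)        # [tokens, top_k]
        weights = weights / weights.sum(dim=-1, keepdim=True)  # renormalize top-k
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = chosen[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k:k + 1] * expert(x[mask])
        if self.tp_size > 1:
            dist.all_reduce(out)  # sum each rank's partial expert outputs
        return out
```

Presumably the ShardedMoE contrib op applies the same partitioning at the kernel level, with each rank's graph holding only its shard of the expert weights and a collective combining the partial results.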

Motivation and Context

Code scanning / CodeQL check notice (test): Module is imported with 'import' and 'import from'

Module 'onnxruntime' is imported with both 'import' and 'import from'. Flagged lines:

from onnx import TensorProto, helper
from torch import nn

import onnxruntime
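The note flags that the test script mixes `import onnxruntime` with `from onnxruntime import ...`; keeping a single import style resolves it. The sketch below keeps only the module-level import and also shows the general IOBinding pattern that item 3 of the description refers to. The model path, tensor names, shapes, and dtype are placeholders, not values from this PR's test script.

```python
import numpy as np
import torch

import onnxruntime  # referenced as onnxruntime.X everywhere; no `from onnxruntime import ...`

# Placeholder model and I/O names.
sess = onnxruntime.InferenceSession(
    "mixtral_moe.onnx", providers=["CUDAExecutionProvider"]
)

x = torch.randn(2, 32, 4096, dtype=torch.float16, device="cuda")
y = torch.empty(2, 32, 4096, dtype=torch.float16, device="cuda")

# Bind GPU buffers directly so no host/device copies happen inside the run.
binding = sess.io_binding()
binding.bind_input(
    name="input", device_type="cuda", device_id=0,
    element_type=np.float16, shape=tuple(x.shape), buffer_ptr=x.data_ptr(),
)
binding.bind_output(
    name="output", device_type="cuda", device_id=0,
    element_type=np.float16, shape=tuple(y.shape), buffer_ptr=y.data_ptr(),
)
sess.run_with_iobinding(binding)  # results are written directly into `y`
```

Pre-binding both input and output buffers keeps the tensors on the GPU across runs, which is the usual reason these parity tests use IOBinding instead of plain sess.run().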
wangyems requested a review from tianleiwu on Mar 19, 2024 at 20:21.
wangyems merged commit 6ff31e0 into main on Mar 20, 2024, with 95 checks passed.
wangyems deleted the wangye/moe_tp branch on Mar 20, 2024 at 04:28.
TedThemistokleous pushed a commit to TedThemistokleous/onnxruntime that referenced this pull request May 7, 2024