[js/webgpu] Allow binary ops with scalar to use the vectorize path #17589

Merged 2 commits on Sep 22, 2023

@qjia7 qjia7 commented Sep 18, 2023

Description

  1. For binary ops, the number of components is always 4, so the dispatchGroup should be `{x: Math.ceil(outputSize / 64 /* workgroup size */ / 4 /* component size */)}` instead of `{x: Math.ceil(outputSize / 64 /* workgroup size */ / (vectorize ? 4 : 1) /* vec size */)}`.

  2. If either a or b has only one element, we can still use the vectorized path, since the same value is broadcast to every component.
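
The two points above can be illustrated with a small CPU-side sketch. It is not the code from this PR: the names `WORKGROUP_SIZE`, `COMPONENTS`, `dispatchGroupX`, `oldDispatchGroupX`, and `binaryOpVectorized` are hypothetical, and the sketch assumes that a and b either have the same length or that one of them has exactly one element.

```typescript
// Hypothetical constants mirroring the values quoted in the description.
const WORKGROUP_SIZE = 64;
const COMPONENTS = 4;

// Workgroup count after the change: every invocation covers 4 output elements.
function dispatchGroupX(outputSize: number): number {
  return Math.ceil(outputSize / WORKGROUP_SIZE / COMPONENTS);
}

// Workgroup count before the change, kept only for comparison.
function oldDispatchGroupX(outputSize: number, vectorize: boolean): number {
  return Math.ceil(outputSize / WORKGROUP_SIZE / (vectorize ? 4 : 1));
}

// CPU stand-in for the vectorized path: each outer iteration plays the role of
// one invocation producing up to 4 outputs. A one-element input is read at
// index 0 for every lane, which is the broadcast the description relies on.
// Assumption: a and b have equal length, or one of them has length 1.
function binaryOpVectorized(
  a: Float32Array,
  b: Float32Array,
  op: (x: number, y: number) => number,
): Float32Array {
  const outputSize = Math.max(a.length, b.length);
  const out = new Float32Array(outputSize);
  for (let base = 0; base < outputSize; base += COMPONENTS) {
    for (let lane = 0; lane < COMPONENTS && base + lane < outputSize; lane++) {
      const i = base + lane;
      const x = a.length === 1 ? a[0] : a[i];
      const y = b.length === 1 ? b[0] : b[i];
      out[i] = op(x, y);
    }
  }
  return out;
}
```

For `outputSize = 1000`, `dispatchGroupX(1000)` gives 4 workgroups, while `oldDispatchGroupX(1000, false)` gives 16. If each invocation always handles 4 elements, as item 1 states, the old non-vectorized count would be four times larger than needed. Calling `binaryOpVectorized` with a five-element `a` and a one-element `b` applies the single `b` value to every output element.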

@qjia7
Contributor Author

qjia7 commented Sep 18, 2023

@fs-eire @guschmue Please take a look, thanks.

@fs-eire
Contributor

fs-eire commented Sep 22, 2023

/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows CPU CI Pipeline,Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline

@fs-eire
Contributor

fs-eire commented Sep 22, 2023

/azp run Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-amd-gpu-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-python-checks-ci-pipeline,onnxruntime-binary-size-checks-ci-pipeline

@azure-pipelines

Azure Pipelines successfully started running 2 pipeline(s).


@qjia7 qjia7 requested a review from fs-eire September 22, 2023 01:17
@fs-eire
Contributor

fs-eire commented Sep 22, 2023

/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows CPU CI Pipeline,Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline

@fs-eire
Contributor

fs-eire commented Sep 22, 2023

/azp run Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-amd-gpu-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-python-checks-ci-pipeline,onnxruntime-binary-size-checks-ci-pipeline

@azure-pipelines

Azure Pipelines successfully started running 2 pipeline(s).


@fs-eire fs-eire merged commit cd3fb37 into microsoft:main Sep 22, 2023
kleiti pushed a commit to kleiti/onnxruntime that referenced this pull request Mar 22, 2024
siweic0 pushed a commit to siweic0/onnxruntime-web that referenced this pull request May 9, 2024