[Feature Request] Add Wasm Relaxed SIMD support and integer dot product instructions for ONNX Runtime Web #22533

Open
jing-bao opened this issue Oct 22, 2024 · 3 comments
Labels
feature request request for unsupported feature or enhancement platform:web issues related to ONNX Runtime web; typically submitted using template

Comments

@jing-bao

Describe the feature request

Wasm Relaxed SIMD includes integer dot product instructions, which map to VNNI instructions on x86-64 platforms with AVX-VNNI (on ARM they may map to SDOT, but I haven't tested this) and can greatly improve QGemm performance. Supporting Relaxed SIMD may also enable further optimizations in the future.

I have some local patches that add a Wasm Relaxed SIMD build and VNNI dispatch for QGemmU8X8 to MLAS; they speed up the Segment Anything Model by about 1.15x. Are such modifications welcome?

Describe scenario use case

Many Web models are quantized, and can benefit from the integer dot product instructions.
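To make the integer dot product concrete, here is a scalar reference model of what a 4-way int8 dot product with int32 accumulation computes. It is only a sketch for illustration, not MLAS code: the function name `dotI8x16AddI32x4` is hypothetical, and it assumes the second operand stays in the 7-bit range 0..127, where the Relaxed SIMD instruction `i32x4.relaxed_dot_i8x16_i7x16_add_s` gives the same result on every platform.

```typescript
// Scalar model of a 16-lane int8 dot product accumulated into 4 int32 lanes.
// Hypothetical helper for illustration; assumes every b[i] is in 0..127.
function dotI8x16AddI32x4(
  a: Int8Array,
  b: Int8Array,
  acc: Int32Array,
): Int32Array {
  if (a.length !== 16 || b.length !== 16 || acc.length !== 4) {
    throw new RangeError("expected 16 + 16 int8 lanes and 4 int32 lanes");
  }
  const out = new Int32Array(4);
  for (let lane = 0; lane < 4; lane++) {
    // Each int32 lane accumulates the products of four adjacent int8 pairs.
    let sum = acc[lane];
    for (let k = 0; k < 4; k++) {
      sum += a[4 * lane + k] * b[4 * lane + k];
    }
    out[lane] = sum;
  }
  return out;
}

// Usage: a = 1..16, b = all 2, accumulator = [10, 20, 30, 40].
const a = Int8Array.from({ length: 16 }, (_, i) => i + 1);
const b = new Int8Array(16).fill(2);
const result = dotI8x16AddI32x4(a, b, Int32Array.from([10, 20, 30, 40]));
console.log(Array.from(result));
```

A single SIMD instruction replaces the whole inner loop for all four lanes at once, which is where the QGemm speedup would come from.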

@jing-bao jing-bao added the feature request request for unsupported feature or enhancement label Oct 22, 2024
@github-actions github-actions bot added the platform:web issues related to ONNX Runtime web; typically submitted using template label Oct 22, 2024
@fs-eire
Contributor

fs-eire commented Oct 22, 2024

Hi @jing-bao, it is definitely welcome if you can help contribute to MLAS to support Relaxed SIMD for WebAssembly!

I have a question about Relaxed SIMD support: if the browser or Node.js version does not support Relaxed SIMD, will the WebAssembly module simply fail to load, or is there still a way to fall back to the old code?

@jing-bao
Author

We definitely don't want it to fail when Relaxed SIMD is not supported. One possible solution I have in mind:

We can run a small JS + Wasm snippet to test whether the browser or Node.js supports Relaxed SIMD, as https://github.com/GoogleChromeLabs/wasm-feature-detect does. We would then need extra logic in the ONNX Runtime JS code to choose between ort-wasm-relaxedsimd-threaded.wasm and ort-wasm-simd-threaded.wasm.

From your knowledge of ONNX Runtime, is this the right approach?
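The detection idea above could be sketched roughly as follows. This is not the ONNX Runtime implementation: `supportsRelaxedSimd` and `pickWasmBinary` are hypothetical names, and the probe bytes are assumed to encode a tiny module whose only function uses `i8x16.relaxed_swizzle`, in the style of wasm-feature-detect. `WebAssembly.validate` returns false, rather than throwing, when the engine rejects the module.

```typescript
// Tiny Wasm module: one function returning v128 that calls
// i8x16.relaxed_swizzle (opcode 0xfd 0x100). Engines without
// Relaxed SIMD should fail to validate it.
const RELAXED_SIMD_PROBE = new Uint8Array([
  0, 97, 115, 109, 1, 0, 0, 0,            // "\0asm", version 1
  1, 5, 1, 96, 0, 1, 123,                 // type section: () -> v128
  3, 2, 1, 0,                             // function section: one func, type 0
  10, 15, 1, 13, 0,                       // code section, one body, no locals
  65, 1, 253, 15,                         // i32.const 1; i8x16.splat
  65, 2, 253, 15,                         // i32.const 2; i8x16.splat
  253, 128, 2,                            // i8x16.relaxed_swizzle
  11,                                     // end
]);

// Hypothetical helper: true only if the engine validates the probe module.
function supportsRelaxedSimd(): boolean {
  // Accessed via `any` so the sketch compiles without DOM typings.
  const wasm = (globalThis as any).WebAssembly;
  if (!wasm || typeof wasm.validate !== "function") {
    return false;
  }
  return wasm.validate(RELAXED_SIMD_PROBE) === true;
}

// Hypothetical helper: choose between the two file names from this thread.
function pickWasmBinary(relaxedSimd: boolean): string {
  return relaxedSimd
    ? "ort-wasm-relaxedsimd-threaded.wasm"
    : "ort-wasm-simd-threaded.wasm";
}

console.log(pickWasmBinary(supportsRelaxedSimd()));
```

Keeping detection separate from file selection means the fallback path can be exercised on any engine, regardless of whether it supports Relaxed SIMD.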

@fs-eire
Contributor

fs-eire commented Oct 23, 2024

I think that should work.
