[Feature Request] Add Wasm Relaxed SIMD support and integer dot product instructions for ONNX Runtime Web #22533
Labels: feature request, platform:web
Describe the feature request
Wasm Relaxed SIMD includes integer dot product instructions, which map to VNNI instructions on x86-64 platforms with AVX-VNNI (on ARM probably SDOT, but I haven't tested that) and can greatly improve QGemm performance. Supporting Relaxed SIMD may also open the door to further optimizations in the future.
I have some local patches that add a Wasm Relaxed SIMD build and VNNI dispatch for QGemmU8X8 to MLAS, and they give a roughly 1.15x speedup on the Segment Anything Model. Would such changes be welcome?
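For reference, here is a minimal scalar sketch of what the accumulating dot product instruction (`i32x4.relaxed_dot_i8x16_i7x16_add_s` in the Relaxed SIMD proposal) computes per 128-bit vector. It is a model for illustration only, not the code from the patches: the names `I8x16`, `I32x4` and `DotI8x16I7x16Add` are hypothetical, and the sketch assumes the second operand stays in the 7-bit range 0..127, since results outside that range are implementation-defined.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical lane types standing in for a 128-bit Wasm vector.
using I8x16 = std::array<int8_t, 16>;
using I32x4 = std::array<int32_t, 4>;

// Scalar model: each 32-bit output lane is the accumulator lane plus the sum
// of four adjacent 8-bit products a[i] * b[i].
I32x4 DotI8x16I7x16Add(const I8x16& a, const I8x16& b, const I32x4& acc) {
    I32x4 out{};
    for (int lane = 0; lane < 4; ++lane) {
        int32_t sum = 0;
        for (int k = 0; k < 4; ++k) {
            const int8_t bv = b[4 * lane + k];
            // Assumption of this sketch: b is a 7-bit value (0..127).
            // With the top bit set, real engines may give different results.
            assert(bv >= 0);
            sum += static_cast<int32_t>(a[4 * lane + k]) * static_cast<int32_t>(bv);
        }
        // Add through uint32_t so that overflow wraps instead of being
        // undefined behavior in C++.
        out[lane] = static_cast<int32_t>(static_cast<uint32_t>(acc[lane]) +
                                         static_cast<uint32_t>(sum));
    }
    return out;
}
```

A quantized GEMM inner loop would apply this step repeatedly along the K dimension, so a single instruction replaces the separate widen, multiply and add steps that plain Wasm SIMD needs.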
Describe scenario use case
Many web models are quantized and can benefit from integer dot product instructions.