[WebNN EP] TFLite backend only supports limit ranges for Clip #20863
It looks to me that this change prioritizes implementation details over the standard specification. Although WebNN currently uses TFLite, that does not necessarily mean it will always use it in every environment, and even if it does, a future version of TFLite may support arbitrary min/max attributes for Clip/Clamp. We cannot control which browser versions and platforms users run, so they may be on a different version. This is why a standard is important: in my understanding, the code should stick to the spec as much as possible rather than to the status of a particular underlying implementation.

There are a few PRs doing something similar, and I assume it's because Chromium WebNN's underlying CPU engine switched from XNNPACK to TFLite. I understand there are technical reasons to do this to make WebNN work end to end. In the long term, however, it would probably be better to introduce a way for users to learn the gap between the spec and the implementation, such as feature detection via some WebNN API or a WebNN version. A developer should have a better way to know whether a node is supported in the current WebNN environment than tracking the detailed status of the underlying engine, which is supposed to be transparent.
Thanks @fs-eire, very good point! It's really painful for users at the current stage, as both the spec and the implementation of WebNN are still evolving very actively. @fujunwei is working on these unsupported constraints, filling the gaps by emulation or other methods. We will eventually close the gap, and this table will be kept up to date with the latest op support status and constraints. In the long term, once everything is stable and the gap is small (e.g. WebNN passes the Origin Trial in Chromium), we can give users more change information along with the Chrome version. WDYT? cc/ @huningxin, @ibelem
Thank you for the information. Is there an existing discussion or proposal about feature detection or versioning?
I think the gap may always exist, because everything is moving: the spec will update and accept new operators (or perhaps deprecate less-used ones), and the implementation will upgrade too. So I am not optimistic about "everything is stable", and a standard way to manage the gap seems more reasonable to me.
This issue webmachinelearning/webnn#463 discusses exposing the support status of operators/types for each backend.
I am not aware of discussion about versioning. @huningxin, do you know?
You are right; I mean we can add more change information once the gap is smaller and changes are less frequent. We have a WebNN Status page that maintains the op implementation status for each backend with Chrome version info. Maybe we can add Chrome and ORT-Web version info to this table in the future.
BTW, @fs-eire, do you know how the WebGPU EP manages versioning?
@fs-eire, thanks for your feedback!
+1
Before an underlying runtime like TFLite gets that support, the WebNN implementation can emulate it by composition. In the WebNN spec, each operator that can be decomposed comes with emulation sample code. For example:

```js
if (options.minValue === undefined) {
  if (options.maxValue === undefined) {
    return input;
  } else {
    return builder.min(input, builder.constant(options.maxValue));
  }
} else {
  if (options.maxValue === undefined) {
    return builder.max(input, builder.constant(options.minValue));
  } else {
    return builder.min(
        builder.max(input, builder.constant(options.minValue)),
        builder.constant(options.maxValue));
  }
}
```

The Chromium implementation of "[TFLite] Support other range for Clamp operator" can refer to the above sample code for decomposition, so frameworks should stick to the spec for consistency.
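The branching in the spec sample can be captured in a small planning helper that decides which element-wise ops emulate a clamp for a given pair of bounds. This is a hypothetical sketch for illustration, not part of the WebNN or ORT APIs:

```javascript
// Hypothetical planner: given clamp options, list the ops that emulate it.
// Mirrors the spec's emulation sample: an undefined bound is skipped, and
// no bounds at all means clamp is the identity (empty plan).
function planClampDecomposition(options = {}) {
  const ops = [];
  if (options.minValue !== undefined) {
    // Enforcing the lower bound: clamp(x) >= minValue is max(x, minValue).
    ops.push({ op: 'max', value: options.minValue });
  }
  if (options.maxValue !== undefined) {
    // Enforcing the upper bound: clamp(x) <= maxValue is min(x, maxValue).
    ops.push({ op: 'min', value: options.maxValue });
  }
  return ops;
}
```

A backend that cannot express an arbitrary clamp range could walk such a plan and emit one `max` and/or one `min` node instead of a single Clamp.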
Correct, webmachinelearning/webnn#463 is the right one for feature detection discussion.
Generally, Web specs don't have versioning, and the WebNN spec follows the same principle. The WebNN spec is now in CR status for browser prototyping and developer preview; this kind of interop feedback would be great input to the WG. Once the spec moves into a more stable status, browser implementations should stick to the latest released version and maintain backward compatibility.
@Honry @huningxin, thank you for your detailed explanation. This link is helpful. Perhaps I can add it somewhere in the ORT documentation.
Technically, WebGPU does feature detection based on "capabilities". For example, f16 support can be detected by using this API. Good to know that WebNN has this discussion: webmachinelearning/webnn#463.
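For reference, the WebGPU capability check mentioned above boils down to querying `GPUAdapter.features`, which is a set-like object (`GPUSupportedFeatures`); `'shader-f16'` is the feature name for f16 shader support. The helper below is a hypothetical wrapper, written so it can also be exercised with a plain `Set` outside a browser:

```javascript
// GPUAdapter.features is set-like, so any object with a has() method works.
function hasFeature(features, name) {
  return features.has(name);
}

// In a browser (sketch):
//   const adapter = await navigator.gpu.requestAdapter();
//   const f16Supported = hasFeature(adapter.features, 'shader-f16');
```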
I think it is clear that versioning for WebNN needs work.
/azp run ONNX Runtime Web CI Pipeline,Windows GPU CI Pipeline,Linux Android Emulator QNN CI Pipeline
/azp run Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,Windows ARM64 QNN CI Pipeline,Windows CPU CI Pipeline
/azp run Windows GPU TensorRT CI Pipeline,onnxruntime-binary-size-checks-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,Windows x64 QNN CI Pipeline,Big Models
Azure Pipelines successfully started running 3 pipeline(s).
Azure Pipelines successfully started running 9 pipeline(s).
Azure Pipelines successfully started running 7 pipeline(s).
Wanming, can we implement the min/max decomposition in ORT when using the WebNN CPU backend, then remove it when Junwei implements it in Chromium's CPU backend? Then we'd have fewer fragmented partitions, and it would be closer to the expected end result. In any case, I'm glad you added the Chromium todo comment, and I still approve knowing that proper support in the web API is coming soon.
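To make the "limited ranges" in the issue title concrete: a TFLite-backed Clamp can only map directly to a fused activation for a few fixed (min, max) pairs, and anything else needs a min/max decomposition. The range list below is an illustrative assumption based on TFLite's fused activations (RELU, RELU_N1_TO_1, RELU6), not an authoritative table:

```javascript
// Illustrative only: clamp ranges a TFLite-style fused activation could
// express directly, corresponding to RELU, RELU_N1_TO_1, and RELU6.
const FUSED_CLAMP_RANGES = [
  { min: 0, max: Infinity },
  { min: -1, max: 1 },
  { min: 0, max: 6 },
];

// Returns true when clamp(min, max) maps directly to a fused activation;
// any other range would require decomposing into max/min ops.
function clampMapsToFusedActivation(min, max) {
  return FUSED_CLAMP_RANGES.some((r) => r.min === min && r.max === max);
}
```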
👍 Good point, I will follow up.