[WebNN EP] TFLite backend only supports limit ranges for Clip #20863
It looks to me that this change prioritizes implementation details over the standard specification. Although WebNN currently uses TFLite, that does not necessarily mean it will always use it in every environment, and even if it does, a future version of TFLite may support arbitrary min/max attributes for Clip/Clamp. We cannot control which browser versions and platforms users run, so they may be on a different version. This is why a standard is important: in my understanding, the code should stick to the spec as much as possible rather than to the status of a particular underlying implementation.

There are a few PRs doing something similar, and I assume it's because Chromium WebNN's underlying CPU engine switched from XNNPACK to TFLite. I understand there are technical reasons to do this to make WebNN work end to end. In the long term, however, it would probably be better to introduce a way for users to learn the gap between the spec and the implementation, such as feature detection via some WebNN API or a WebNN version. A developer should have a better way to know whether a node is supported in the current WebNN environment than tracking the detailed status of the underlying engine, which is supposed to be transparent.
Thanks @fs-eire, very good point! It's really painful for users at the current stage, as both the spec and the implementation of WebNN are still evolving very actively. @fujunwei is working on these unsupported constraints, filling the gaps by emulation or other methods. We will eventually close the gap, and this table will be kept up to date with the latest op support status and constraints. In the long term, once everything is stable and the gap is small (e.g. WebNN passes the Origin Trial in Chromium), we can give users more change information along with the Chrome version. WDYT? cc/ @huningxin, @ibelem
Thank you for the information. Is there an existing discussion or proposal about feature detection or versioning?
I think the gap may always exist, because everything is moving: the spec will update and accept new operators (or perhaps deprecate less-used ones), and the implementation will upgrade too. So I am not optimistic about "everything is stable", and a standard way to manage the gap seems more reasonable to me.
This issue webmachinelearning/webnn#463 discusses exposing the support status of operators/types for each backend.
I am not aware of discussion about versioning. @huningxin, do you know?
You are right; I mean we can add more change information once the gap is smaller and changes are less frequent. We have a WebNN Status page that maintains the op implementation status for each backend with Chrome version info. Maybe we can add Chrome and ORT-Web version info to this table in the future.
BTW, @fs-eire, do you know how the WebGPU EP manages versioning?
@fs-eire, thanks for your feedback!
+1
Before an underlying runtime like TFLite gets that support, the WebNN implementation can emulate it by composition. In the WebNN spec, each operator that can be decomposed comes with emulation sample code. For example:

```js
if (options.minValue === undefined) {
  if (options.maxValue === undefined) {
    return input;
  } else {
    return builder.min(input, builder.constant(options.maxValue));
  }
} else {
  if (options.maxValue === undefined) {
    return builder.max(input, builder.constant(options.minValue));
  } else {
    return builder.min(
        builder.max(input, builder.constant(options.minValue)),
        builder.constant(options.maxValue));
  }
}
```

The Chromium implementation of "[TFLite] Support other range for Clamp operator" can refer to the above sample code for decomposition, so frameworks should stick to the spec for consistency.
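The branching in the spec sample can be captured in a small planning helper that decides which element-wise ops emulate a clamp for a given pair of bounds. This is a hypothetical sketch for illustration, not part of the WebNN or ORT APIs:

```javascript
// Hypothetical planner: given clamp options, list the ops that emulate it.
// Mirrors the spec's emulation sample: an undefined bound is skipped, and
// no bounds at all means clamp is the identity (empty plan).
function planClampDecomposition(options = {}) {
  const ops = [];
  if (options.minValue !== undefined) {
    // Enforcing the lower bound: clamp(x) >= minValue is max(x, minValue).
    ops.push({ op: 'max', value: options.minValue });
  }
  if (options.maxValue !== undefined) {
    // Enforcing the upper bound: clamp(x) <= maxValue is min(x, maxValue).
    ops.push({ op: 'min', value: options.maxValue });
  }
  return ops;
}
```

A backend that cannot express an arbitrary clamp range could walk such a plan and emit one `max` and/or one `min` node instead of a single Clamp.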
Correct, webmachinelearning/webnn#463 is the right one for feature detection discussion.
Generally, Web specs don't have versioning, and the WebNN spec follows the same principle. The WebNN spec is now in CR status for browser prototyping and developer preview; this kind of interop feedback would be great input to the WG. Once the spec moves into a more stable status, browser implementations should stick to the latest released version and maintain backward compatibility.
@Honry @huningxin, thank you for your detailed explanation. This link is helpful. Perhaps I can add it somewhere in the ORT documentation.
Technically, WebGPU does feature detection based on "capabilities". For example, f16 support can be detected by using this API. Good to know that WebNN has this discussion: webmachinelearning/webnn#463.
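For reference, the WebGPU capability check mentioned above boils down to querying `GPUAdapter.features`, which is a set-like object (`GPUSupportedFeatures`); `'shader-f16'` is the feature name for f16 shader support. The helper below is a hypothetical wrapper, written so it can also be exercised with a plain `Set` outside a browser:

```javascript
// GPUAdapter.features is set-like, so any object with a has() method works.
function hasFeature(features, name) {
  return features.has(name);
}

// In a browser (sketch):
//   const adapter = await navigator.gpu.requestAdapter();
//   const f16Supported = hasFeature(adapter.features, 'shader-f16');
```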
I think it is clear that versioning for WebNN needs work.
/azp run ONNX Runtime Web CI Pipeline,Windows GPU CI Pipeline,Linux Android Emulator QNN CI Pipeline
/azp run Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,Windows ARM64 QNN CI Pipeline,Windows CPU CI Pipeline
/azp run Windows GPU TensorRT CI Pipeline,onnxruntime-binary-size-checks-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,Windows x64 QNN CI Pipeline,Big Models
Azure Pipelines successfully started running 3 pipeline(s).
Azure Pipelines successfully started running 9 pipeline(s).
Azure Pipelines successfully started running 7 pipeline(s).
Wanming, can we implement the min/max decomposition in ORT when using the WebNN CPU backend, then remove it when Junwei implements it in Chromium's CPU backend? Then we'd have fewer fragmented partitions, and it would be closer to the expected end result. In any case, I'm glad you added the Chromium todo comment, and I still approve knowing that proper support in the web API is coming soon.
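To make the "limited ranges" in the issue title concrete: a TFLite-backed Clamp can only map directly to a fused activation for a few fixed (min, max) pairs, and anything else needs a min/max decomposition. The range list below is an illustrative assumption based on TFLite's fused activations (RELU, RELU_N1_TO_1, RELU6), not an authoritative table:

```javascript
// Illustrative only: clamp ranges a TFLite-style fused activation could
// express directly, corresponding to RELU, RELU_N1_TO_1, and RELU6.
const FUSED_CLAMP_RANGES = [
  { min: 0, max: Infinity },
  { min: -1, max: 1 },
  { min: 0, max: 6 },
];

// Returns true when clamp(min, max) maps directly to a fused activation;
// any other range would require decomposing into max/min ops.
function clampMapsToFusedActivation(min, max) {
  return FUSED_CLAMP_RANGES.some((r) => r.min === min && r.max === max);
}
```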
👍 Good point, I will follow up.