Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve "underlying platform" references in spec #462

Open
inexorabletash opened this issue Sep 8, 2023 · 4 comments
Open

Improve "underlying platform" references in spec #462

inexorabletash opened this issue Sep 8, 2023 · 4 comments

Comments

@inexorabletash
Copy link
Member

inexorabletash commented Sep 8, 2023

The spec currently uses algorithms of the form (just examples copy/pasted from different places

  1. Make a request to the underlying platform to:
    1. Let opImpl be an implementation-defined platform operator for the slice operation, given starts and sizes.
    2. ...
    3. Create an implementation-defined platform operand outputImpl to represent the output, given output and opImpl.
    4. ...
  2. Connect input.[[operand]] as input to opImpl.
  3. Connect output.[[operand]] as output to opImpl.
  • "implementation-defined" isn't really helpful here, since it never explains what this does.
  • "platform operator" is never defined
  • "the slice operation" isn't defined
  • "connect" isn't defined

Ideally this would all be defined. Here's one way to approach it, imagining a much simpler interface:

First we define the abstract notion of an op:

An op represents a mathematical operation as part of a graph. An op has zero or more inputs (ops) and one output (an op), and an algorithm which, when invoked, takes a corresponding number of input values and produces an output value.

In a dedicated section of the spec, we'd rigorously define each op's algorithm:

An addition op is an op with two inputs (both numbers) and the following algorithm:

  1. Let a be the first input value.
  2. Let b be the second input value.
  3. Return a + b.

An average op is an op with one input (a list) and the following algorithm:

  1. Let a be the input.
  2. Let size be a's size.
  3. Assert: size is greater than 0.
  4. Let sum be 0.
  5. For each value of a:
    1. Let sum be sum + value.
  6. Return sum / size.

Obviously these are overly simple, but in the spec right now the more complex ops are currently only defined by links. I'd tackle this incrementally, but it should be in the spec itself! See Web Audio for examples e.g. https://webaudio.github.io/web-audio-api/#fft-windowing-and-smoothing-over-time. These could be inline with the builder methods or as a separate section of the spec.

And then in a builder method for add() it'd be something like:

  1. Let output be the result of creating an MLOperand given this and desc.
  2. Let opImpl be a new addition op.
  3. Set opImpl's inputs to inputs.
  4. Set opImpl's outputs to output.[[operand]].
  5. Set output.[[operator]] to opImpl.

That's probably not quite correct but hopefully the intention is clear.

More notes:

  • In the above I'm mixing the [[internal slot]] notation with the "a has a b" notation. Sorry about that.
  • Execution for a graph can be defined with an algorithm now - walking over the graph and passing input values/output values around.
  • The input/output hookup is very similar for most builder methods, so maybe that can be factored out into a common algorithm.
  • The current builder methods all have the common form of (1) validate inputs (2) construct and hook up the operator. The second part is always inside a clause like: "If any of the following sub-steps fail, throw an "OperationError" DOMException." - but it's not clear to me what is expected to fail. If implementations will actually be calling into native APIs at the builder step and that's expected to fail, we should define the reasons - e.g. is it memory, is it that the platform might not support an op, etc. That could be specified like:
  1. Let opImpl be a new addition op. If this fails for implementation-defined reasons, throw an "OperationError" DOMException.

Okay, that's a lot of rambling and as noted I may be missing important details, so I'll stop here. Let's iterate (here or a PR?) until we come up with a good approach that balances readability, maintainability, and precision.

cc: @huningxin

@zolkis
Copy link
Collaborator

zolkis commented Sep 8, 2023

Thanks @inexorabletash for formalizing these suggestions. I've been also thinking about what can we factor out from the quite repetitive texts in the algorithms. The intent was to first finish the first approach to the algorithms separately for each op, and then do exactly what you are suggesting now. It's good time for it.

The challenge is that in most algorithms we have similarly looking prose, but the underlying implementation can be quite different from one case to the other. It might be misleading to unify at syntactic level, and miss the semantic differences.

A first target to formalize could be how to express in a common formalized way the sequence specified for ops. For operators, it could be reduced by defining the create MLActivation algorithm, then refer to it from e.g. the elu() steps:

  1. Let op be the result of creating an MLActivation given this, "elu" and options.
  2. Return op.

Whereas for the ops, it refers to the underlying platform specific to the elu() implementation. There the question is how to parametrize these steps given the op_name , while binding the impl's op-specific internal slot for operator and output and create a "macro", meta-algorithm for that. It should be doable, and we can attempt it in a way similar you described above.

I was thinking about defining a map for the op algorithm name --> impl internal slot, and use that from the prose. But it would be quite verbose and futile: it should also be possible to just say that in prose, inputting the op_alg_name as a macro-input (expressed with prose).

@huningxin
Copy link
Contributor

@inexorabletash , thanks for initiating this thread! I'd agree this is an area needs to be improved.

In the current spec, the "underlying platform" / "platform op" is a kind of abstraction from native APIs used in implementation, such as XNNPACK and DirectML. We assume the "platform op" would hold the WebNN op semantics, as you mentioned, usually defined by links rather than defined by algorithm steps. For some cases, it would cause ambiguity, for example, the issue 358: Please clarify interpolation algorithm for resample2d.

IIUC, the op abstraction sounds like to implement WebNN op without relying on any (native) ML APIs. I'd agree it would make the spec to be more precise and "self-contained". It reminds me the webnn-baseline project which is a pure JavaScript implementation for WPT test data generation. It doesn't rely on any ML libraries (e.g., TensorFlow.js, BTW, which is used by webnn-polyfill for performance reason) either. All ops have a JavaScript implementation, for example: https://github.com/webmachinelearning/webnn-baseline/blob/main/src/binary.js.

function binary(inputA, inputB, binaryFunc) {
  const outputShape = getBroadcastShape(inputA.shape, inputB.shape);
  const inputABroadcast = broadcast(inputA, outputShape);
  const inputBBroadcast = broadcast(inputB, outputShape);
  const outputSize = sizeOfShape(outputShape);
  const output = new Tensor(outputShape);
  for (let i = 0; i < outputSize; ++i) {
    const a = inputABroadcast.getValueByIndex(i);
    const b = inputBBroadcast.getValueByIndex(i);
    const c = binaryFunc(a, b);
    output.setValueByIndex(i, c);
  }
  return output;
}

This would add a few more things into your addition op algorithm to make it element-wise addition, some initial thoughts:

  1. We may need to support tensor (multi-dimensional array with a uniform type). The element-wise addition op should operate on tensor instead of numbers. In current spec, Operand interface already has the dimensions and data type in the [[descriptor]] internal slot, we may extend it with an [[array]] internal slot that holds the actual tensor data. With that, the implementation-defined platform [[operand]] internal slot won't be required anymore. We may also need to define additional algorithms for accessing the tensor data, such as getValueByIndex, setValueByIndex, getValueByLocation and setValueByLocation used in above element-wise binary and other op implementations. For example: https://github.com/webmachinelearning/webnn-baseline/blob/main/src/lib/tensor.js
  2. We may need to define the getBroadcastShape and broadcast algorithms instead of referring to NumPy doc. For example: https://github.com/webmachinelearning/webnn-baseline/blob/main/src/lib/broadcast.js
  3. The binaryFunc is a mathematical operation working on numbers, it could invoke addition op algorithm.
  4. The algorithm of "high-level" ops may be written by composing the algorithms of "low-level" ops, for example: https://github.com/webmachinelearning/webnn-baseline/blob/main/src/batch_normalization.js. That would also align with the pseudo decomposing code in the spec, such as for batchNormalization

Loop in @wchao1115 for comments.

@inexorabletash
Copy link
Member Author

IIUC, the op abstraction sounds like to implement WebNN op without relying on any (native) ML APIs.

Correct. We would expect that real implementations would optimize heavily using platform APIs, parallel instructions, hardware accelerators, etc etc. Using the JS code to help craft the algorithm implementations seems like a great idea!

This would add a few more things into your addition op algorithm to make it element-wise addition,

To be clear, I was really just showing the style I'd follow rather than giving a concrete example for inclusion in the spec. Tensors etc. definitely need to be used rather than just primitive inputs.

Speaking of style, I wrote:

  1. Let a be the first input value.
  2. Let b be the second input value.

But we could use a style introducing the algorithm to simplify this, e.g. "takes two inputs (a and b, both numbers)".

inexorabletash added a commit to inexorabletash/webnn that referenced this issue Jan 26, 2024
This change introduces a new section for Algorithms, following APIs,
to collect algorithms referenced throughout the specification.

A section for Broadcasting is introduced, which defines broadcasting
shapes and gives an explicit algorithm matching WebNN implementations
of NumPy's General Broadcasting Rules. Definitions for "broadcastable"
and "unidirectionally broadcastable" are introduced. The previous
definition of "broadcast-shapes" is removed in favor of these new
algorithms.

For webmachinelearning#324, webmachinelearning#462, and potentially webmachinelearning#523.
inexorabletash added a commit to inexorabletash/webnn that referenced this issue Jan 26, 2024
This change introduces a new section for Algorithms, following APIs,
to collect algorithms referenced throughout the specification.

A section for Broadcasting is introduced, which defines broadcasting
shapes and gives an explicit algorithm matching WebNN implementations
of NumPy's General Broadcasting Rules. Definitions for "broadcastable"
and "unidirectionally broadcastable" are introduced. The previous
definition of "broadcast-shapes" is removed in favor of these new
algorithms.

For webmachinelearning#324, webmachinelearning#378, webmachinelearning#462, and potentially webmachinelearning#523.
inexorabletash added a commit to inexorabletash/webnn that referenced this issue Jan 27, 2024
This change introduces a new section for Algorithms, following APIs,
to collect algorithms referenced throughout the specification.

A section for Broadcasting is introduced, which defines broadcasting
shapes and gives an explicit algorithm matching WebNN implementations
of NumPy's General Broadcasting Rules. Definitions for "broadcastable"
and "unidirectionally broadcastable" are introduced. The previous
definition of "broadcast-shapes" is removed in favor of these new
algorithms.

For webmachinelearning#324, webmachinelearning#378, webmachinelearning#462, and potentially webmachinelearning#523.
inexorabletash added a commit to inexorabletash/webnn that referenced this issue Jan 27, 2024
This change introduces a new section for Algorithms, following APIs,
to collect algorithms referenced throughout the specification.

A section for Broadcasting is introduced, which defines broadcasting
shapes and gives an explicit algorithm matching WebNN implementations
of NumPy's General Broadcasting Rules. Definitions for "broadcastable"
and "unidirectionally broadcastable" are introduced. The previous
definition of "broadcast-shapes" is removed in favor of these new
algorithms.

For webmachinelearning#324, webmachinelearning#378, webmachinelearning#462, and potentially webmachinelearning#523.
inexorabletash added a commit to inexorabletash/webnn that referenced this issue Jan 30, 2024
The default linking for [=object=] is to FileAPI's blob URL entry
object member[1], which is definitely wrong.

Replace such references with:

* Infra's ordered map[2], when a dictionary is intended, as that's
  what a JavaScript object is translated into via bindings.

* Links to new definitions for "platform operator" and "platform
  operand" which include the "implementation-defined" phrase so links
  to the terms can be greatly simplified. These definitions should
  be further expanded to include "connect", "input" and "output" but
  that can be tackled later.

This reduces the number of "implementation-defined" phrases to just
a handful in the spec, which helps with webmachinelearning#462.

[1] https://www.w3.org/TR/FileAPI/#blob-url-entry-object

[2] https://infra.spec.whatwg.org/#ordered-map
inexorabletash added a commit to inexorabletash/webnn that referenced this issue Jan 30, 2024
The default linking for [=object=] is to FileAPI's blob URL entry
object member[1], which is definitely wrong.

Replace such references with:

* Infra's ordered map[2], when a dictionary is intended, as that's
  what a JavaScript object is translated into via bindings.

* Links to new definitions for "platform operator" and "platform
  operand" which include the "implementation-defined" phrase so links
  to the terms can be greatly simplified. These definitions should
  be further expanded to include "connect", "input" and "output" but
  that can be tackled later.

This reduces the number of "implementation-defined" phrases to just
a handful in the spec, which helps with webmachinelearning#462.

[1] https://www.w3.org/TR/FileAPI/#blob-url-entry-object

[2] https://infra.spec.whatwg.org/#ordered-map
inexorabletash added a commit to inexorabletash/webnn that referenced this issue Jan 31, 2024
The default linking for [=object=] is to FileAPI's blob URL entry
object member[1], which is definitely wrong.

Replace such references with:

* Infra's ordered map[2], when a dictionary is intended, as that's
  what a JavaScript object is translated into via bindings.

* Links to new definitions for "platform operator" and "platform
  operand" which include the "implementation-defined" phrase so links
  to the terms can be greatly simplified. These definitions should
  be further expanded to include "connect", "input" and "output" but
  that can be tackled later.

This reduces the number of "implementation-defined" phrases to just
a handful in the spec, which helps with webmachinelearning#462.

[1] https://www.w3.org/TR/FileAPI/#blob-url-entry-object

[2] https://infra.spec.whatwg.org/#ordered-map

Co-authored-by: Anssi Kostiainen <[email protected]>
inexorabletash added a commit to inexorabletash/webnn that referenced this issue Feb 2, 2024
The default linking for [=object=] is to FileAPI's blob URL entry
object member[1], which is definitely wrong.

Replace such references with:

* Infra's ordered map[2], when a dictionary is intended, as that's
  what a JavaScript object is translated into via bindings.

* Links to new definitions for "platform operator" and "platform
  operand" which include the "implementation-defined" phrase so links
  to the terms can be greatly simplified. These definitions should
  be further expanded to include "connect", "input" and "output" but
  that can be tackled later.

This reduces the number of "implementation-defined" phrases to just
a handful in the spec, which helps with webmachinelearning#462.

[1] https://www.w3.org/TR/FileAPI/#blob-url-entry-object

[2] https://infra.spec.whatwg.org/#ordered-map

Co-authored-by: Anssi Kostiainen <[email protected]>
Co-authored-by: Ningxin Hu <[email protected]>
inexorabletash added a commit to inexorabletash/webnn that referenced this issue Feb 5, 2024
This change introduces a new section for Algorithms, following APIs,
to collect algorithms referenced throughout the specification.

A section for Broadcasting is introduced, which defines broadcasting
shapes and gives an explicit algorithm matching WebNN implementations
of NumPy's General Broadcasting Rules. Definitions for "broadcastable"
and "unidirectionally broadcastable" are introduced. The previous
definition of "broadcast-shapes" is removed in favor of these new
algorithms.

For webmachinelearning#324, webmachinelearning#378, webmachinelearning#462, and potentially webmachinelearning#523.
inexorabletash added a commit to inexorabletash/webnn that referenced this issue Feb 6, 2024
This change introduces a new section for Algorithms, following APIs,
to collect algorithms referenced throughout the specification.

A section for Broadcasting is introduced, which defines broadcasting
shapes and gives an explicit algorithm matching WebNN implementations
of NumPy's General Broadcasting Rules. Definitions for "broadcastable"
and "unidirectionally broadcastable" are introduced. The previous
definition of "broadcast-shapes" is removed in favor of these new
algorithms.

For webmachinelearning#324, webmachinelearning#378, webmachinelearning#462, and potentially webmachinelearning#523.
inexorabletash added a commit to inexorabletash/webnn that referenced this issue Feb 7, 2024
This change introduces a new section for Algorithms, following APIs,
to collect algorithms referenced throughout the specification.

A section for Broadcasting is introduced, which defines broadcasting
shapes and gives an explicit algorithm matching WebNN implementations
of NumPy's General Broadcasting Rules. Definitions for "broadcastable"
and "unidirectionally broadcastable" are introduced. The previous
definition of "broadcast-shapes" is removed in favor of these new
algorithms.

For webmachinelearning#324, webmachinelearning#378, webmachinelearning#462, and potentially webmachinelearning#523.
inexorabletash added a commit to inexorabletash/webnn that referenced this issue Feb 12, 2024
This change introduces a new section for Algorithms, following APIs,
to collect algorithms referenced throughout the specification.

A section for Broadcasting is introduced, which defines broadcasting
shapes and gives an explicit algorithm matching WebNN implementations
of NumPy's General Broadcasting Rules. Definitions for "broadcastable"
and "unidirectionally broadcastable" are introduced. The previous
definition of "broadcast-shapes" is removed in favor of these new
algorithms.

Use broadcasting definition in expand(), rather than bespoke steps

For webmachinelearning#324, webmachinelearning#378, webmachinelearning#462, and potentially webmachinelearning#523.

Co-authored-by: Dwayne Robinson <[email protected]>
inexorabletash added a commit to inexorabletash/webnn that referenced this issue Feb 12, 2024
This change introduces a new section for Algorithms, following APIs,
to collect algorithms referenced throughout the specification.

A section for Broadcasting is introduced, which defines broadcasting
shapes and gives an explicit algorithm matching WebNN implementations
of NumPy's General Broadcasting Rules. Definitions for "broadcastable"
and "unidirectionally broadcastable" are introduced. The previous
definition of "broadcast-shapes" is removed in favor of these new
algorithms.

Use broadcasting definition in expand(), rather than bespoke steps

For webmachinelearning#324, webmachinelearning#378, webmachinelearning#462, and potentially webmachinelearning#523.

Co-authored-by: Dwayne Robinson <[email protected]>
fdwr added a commit that referenced this issue Feb 13, 2024
* New content: Add definition for shape broadcasting

This change introduces a new section for Algorithms, following APIs,
to collect algorithms referenced throughout the specification.

A section for Broadcasting is introduced, which defines broadcasting
shapes and gives an explicit algorithm matching WebNN implementations
of NumPy's General Broadcasting Rules. Definitions for "broadcastable"
and "unidirectionally broadcastable" are introduced. The previous
definition of "broadcast-shapes" is removed in favor of these new
algorithms.

Use broadcasting definition in expand(), rather than bespoke steps

For #324, #378, #462, and potentially #523.

Co-authored-by: Dwayne Robinson <[email protected]>

* Fix prelu parameter order

---------

Co-authored-by: Dwayne Robinson <[email protected]>
@inexorabletash
Copy link
Member Author

Restating this issue:

  1. The processing model for graph execution should be made more rigorous. At an extreme, it would look like pseudocode for an implementation of compute().

  2. The behavior of each operator should be explained rigorously; even for basic math ops there are cases to consider around overflow/underflow. I don't think we want to go as far as pseudocode for each, but in some cases that might be appropriate. Some example issues that fall into this second category:

And commentary: although spec purity is great, I don't think this is particularly high priority; tackling the second category of issues on a case-by-case basis as needed to resolve interop issues seems fine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants