Synchronously validate input operands/activations #572
If we wanted to go the "define connect" route, it might be something like:
Some "connect" phrases currently reference the MLOperand, and some reference the platform operand instead; referencing the MLOperands consistently would simplify things.
We'll want something similar for MLActivations, i.e. either an explicit step or by defining steps for "register it as activation".
When "connect" references the platform operand, as it does here, I suppose it means the process of calling the native platform API to "connect" the platform operand to the platform operator as input (or output). Actually, I think it should be a sub-step of "2. Make a request to the underlying platform to:".
Agreed. If we go this route, I think we should also define the "operator" concept within the spec. In a WebNN graph, MLOperands are edges; they need to be connected to nodes (operators) to construct a graph.
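The edges-and-nodes framing can be sketched as a tiny data-structure model. This is a hypothetical illustration, not spec text; `makeOperator` and the record fields are invented names.

```javascript
// Sketch of the graph model discussed above: operators are nodes,
// MLOperand-like records are edges. All names are illustrative.
function makeOperator(kind, inputs, options = {}) {
  const operator = { kind, inputs, options, outputs: [] };
  const output = { operator, shape: null }; // edge produced by this node
  operator.outputs.push(output);
  return output;
}

// Graph inputs are edges with no producing operator.
const a = { operator: null, shape: [2, 2] };
const b = { operator: null, shape: [2, 2] };

// Connecting edges a and b to an "add" node yields a new output edge.
const sum = makeOperator("add", [a, b]);
console.log(sum.operator.kind);          // "add"
console.log(sum.operator.inputs.length); // 2
```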
I think it might be useful to think of these validation steps in tiers:
Currently this is specified as validate MLOperand and includes checks like that
These steps should also be run (synchronously) for each
This is where we finally talk to the platform API:
I propose that (1) and (2) should be run (synchronously) for each The
Or maybe: The
Ideally (though there might be some quirks which make this not feasible) none of the
Thoughts?
FYI, we asked around for precedent for labeling blocks of substeps. https://html.spec.whatwg.org/multipage/webappapis.html#update-the-rendering has some examples (italics before a clause), and our handy spec experts are not opposed to doing so if it improves readability. I like the idea of the "validate" and "connect" blocks, and maybe "calculate output shape" as a block as well.
+1
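As a rough illustration of what a builder method might look like when split into the labeled blocks mentioned above, here's a hypothetical `relu()` sketch in plain JavaScript. The validate / calculate output shape / connect structure follows the discussion; all names and checks are illustrative, not spec text.

```javascript
// Hypothetical builder-method structure: validate, then calculate
// output shape, then connect. Names are invented for illustration.
function relu(builder, input) {
  // Validate: the input must be an operand from this builder.
  if (input.builder !== builder) {
    throw new TypeError("operand was created by a different builder");
  }

  // Calculate output shape: relu is elementwise, so the shape passes through.
  const outputShape = input.shape.slice();

  // Connect: create the operator node and wire the input/output edges.
  const operator = { kind: "relu", inputs: [input], outputs: [] };
  const output = { builder, shape: outputShape, operator };
  operator.outputs.push(output);
  return output;
}

const b = { name: "builder" };
const x = { builder: b, shape: [2, 3], operator: null };
const y = relu(b, x);
console.log(y.shape); // [ 2, 3 ]
```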
Building on the conversation above, here's a sketch of something concrete. If this looks like a decent starting point I can put up a PR for detailed nitpicks.
Programming Model / Overview: Replace bits about operands with the following:
Insert the following near the build() description:
MLActivation interface: Update to the slot description:
MLOperand interface: Update the internal slot description:
And delete [[operand]].

Standard MLGraphBuilder Operand-vending Method Structure: The whatever(input, options) method steps are:
This looks great to me! There are a couple of things I notice about this framework which we should follow up on elsewhere:
Thanks @inexorabletash! The proposal looks great to me.
Could you please elaborate a bit on how the operator internal concept is defined? Is it an internal interface definition for a specific operation? For example, something like the Chromium prototype's
As sketched above, it's just describing the abstract concept of an operator. Like algorithms and internal slots, these spec concept objects are useful for defining behavior, but aren't script visible, so only matching the observable behavior is required. But they very often do map 1:1 with implementations. That Conv2d example from the Chromium prototype impl is what this would map to.
As discussed in webmachinelearning#572:
- Define an "operator" concept, with inputs, outputs, activations.
- Defer "platform operator" and "platform operand" to build.
- Standardize the graph connection steps across builder methods.
- Simplify activation, input and constant steps.

Not covered in this change:
- Erroring if input's [[builder]] doesn't match `this`
- Build algorithm - covered by webmachinelearning#448 and webmachinelearning#457
- gru() is missing steps to populate output. Added "Issue" note.
- Introducing "Validate arguments" section for each method.
- Introducing "Calculate output shape" section for each method.

For webmachinelearning#549 and webmachinelearning#572.
* Content: Define operand concept, simplify graph connection steps

  As discussed in #572:
  - Define an "operator" concept, with inputs, outputs, activations.
  - Defer "platform operator" and "platform operand" to build.
  - Standardize the graph connection steps across builder methods.
  - Simplify activation, input and constant steps.

  Not covered in this change:
  - Erroring if input's [[builder]] doesn't match `this`
  - Build algorithm - covered by #448 and #457
  - gru() is missing steps to populate output. Added "Issue" note.
  - Introducing "Validate arguments" section for each method.
  - Introducing "Calculate output shape" section for each method.

  For #549 and #572.

* Trivial fixes - missing spaces, wrong operation name

* lstm(): Fix output2 shape calculation

  In tip-of-tree, the "desc" MLOperandDescriptor is populated, used, then modified and conditionally used again (if `returnSequence` is true) when creating "output2". This was broken in previous commits in this branch. Restore the intent, using a separate "desc2" MLOperandDescriptor only conditionally populated and then conditionally used.

* Update index.bs

  Co-authored-by: Ningxin Hu <[email protected]>

* Initial review feedback:
  - Add other operand inputs
  - PreLU -> PReLU
  - Define split() output shapes calculations
  - Introduce definition for computational graph with inputs and constants
  - Revise Programming Model section

Co-authored-by: Ningxin Hu <[email protected]>
Concrete remaining work following #591
And then for these:
... in practice, these end up being empty for many ops and very intertwined for others; i.e., it's hard to separate argument validation from output shape calculation. Is this actually valuable? Opinions (and PRs) welcome!
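To illustrate the intertwining: a hypothetical sketch of a concat()-style output shape calculation, where the validation checks fall out naturally while computing the shape. The function name and error choices are assumptions for illustration, not spec text.

```javascript
// Validation and output shape calculation are interleaved: each check
// happens at the point where the shape computation needs that property.
function concatOutputShape(shapes, axis) {
  const rank = shapes[0].length;
  if (axis < 0 || axis >= rank) throw new TypeError("axis out of range"); // validation
  let concatSize = 0;
  for (const shape of shapes) {
    if (shape.length !== rank) throw new TypeError("rank mismatch");      // validation
    for (let d = 0; d < rank; d++) {
      if (d !== axis && shape[d] !== shapes[0][d]) {
        throw new TypeError("non-axis dimension mismatch");               // validation
      }
    }
    concatSize += shape[axis];                                            // shape calc
  }
  const out = shapes[0].slice();
  out[axis] = concatSize;
  return out;
}

console.log(concatOutputShape([[2, 3], [2, 5]], 1)); // [ 2, 8 ]
```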
Previously the spec had a "validate `MLOperand`" helper that (1) ensured the operand was from the passed `MLGraphBuilder` and (2) that the operand was internally consistent, and this was called during (3) `build()` and (4) only `concat()` among the vending methods.

- (1) is needed but can be done when the `MLOperand` is created, giving better feedback, so (3) isn't needed.
- (2) is not needed - `MLOperands` are immutable so they can't be created in a bad state.
- (4) should be expanded to all `MLOperand` creations that take input `MLOperand`s.

This renames the helper, ensures it is called by every `MLOperand` vending method that takes `MLOperand` inputs, and drops it from `build()` (although that will probably collide with PR webmachinelearning#603 which should land first).

Similar validation is added for `MLActivation`s.

For webmachinelearning#572
* Synchronously validate input operands/activations

  Previously the spec had a "validate `MLOperand`" helper that (1) ensured the operand was from the passed `MLGraphBuilder` and (2) that the operand was internally consistent, and this was called during (3) `build()` and (4) only `concat()` among the vending methods.

  - (1) is needed but can be done when the `MLOperand` is created, giving better feedback, so (3) isn't needed.
  - (2) is not needed - `MLOperands` are immutable so they can't be created in a bad state.
  - (4) should be expanded to all `MLOperand` creations that take input `MLOperand`s.

  This renames the helper, ensures it is called by every `MLOperand` vending method that takes `MLOperand` inputs, and drops it from `build()` (although that will probably collide with PR #603 which should land first).

  Similar validation is added for `MLActivation`s.

  For #572

* Apply suggestions from code review

  Missed a few "if it exists" clauses

  Co-authored-by: Ningxin Hu <[email protected]>

Co-authored-by: Ningxin Hu <[email protected]>
Places that create ops use an inconsistent mix of simple phrases, camelCase, Title Case, ACRONYMS, and "quoted strings". The most common was camelCase, but the wording can be weird, and the bulk-defined binary/unary/logical/pooling/reduction ops and activations use a "quotedCamelCase", so I went with that. See webmachinelearning#591 (comment) for more commentary. Resolves webmachinelearning#572
For the "remaining work", namely:
My suggestion is: add as needed, e.g. when a method's steps could be broken up to improve readability. But adding to all methods would just be clutter.
When calling a builder method on MLGraphBuilder that takes an MLOperand or MLActivation as an input (i.e. anything but input() and constant()), there is no check that the input is from the same builder.
The only place in the spec that does this is the validate MLOperand steps, which take an operand and a builder.
This algorithm is only called in:
@a-sully raised this in #552 and it was previously mentioned (but without specifics) in #234
Should failure here be synchronous? This is implied by the behavior of concat() and the open-ended "If any of the following sub-steps fail, ..." text in the builder methods. I don't see a reason for async failure (except in build() itself, since you can't synchronously fail a method that returns a promise), so I'd strongly prefer synchronous failure. (If async, build() should be augmented to traverse the whole graph.)
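A minimal model of the synchronous same-builder check, in plain JavaScript. Everything here is an illustrative assumption, not spec text: the class and method names stand in for the real interfaces, and TypeError is just a placeholder since the exception type is an open question.

```javascript
// Model of the proposed behavior: each builder method validates its
// input operands synchronously, before any graph work happens.
class MLGraphBuilderModel {
  createOperand(shape) {
    return { builder: this, shape }; // stand-in for an internal [[builder]] slot
  }
  #validateInput(operand) {
    // Proposed: throw synchronously if the operand came from another builder.
    if (operand.builder !== this) {
      throw new TypeError("input operand was created by a different MLGraphBuilder");
    }
  }
  concat(inputs, axis) {
    for (const input of inputs) this.#validateInput(input);
    return this.createOperand(null); // output shape calculation elided
  }
}

const builder1 = new MLGraphBuilderModel();
const builder2 = new MLGraphBuilderModel();
const x = builder1.createOperand([2, 2]);
const y = builder2.createOperand([2, 2]);

builder1.concat([x, x], 0); // ok: both operands are from builder1
try {
  builder1.concat([x, y], 0); // fails synchronously: y is from builder2
} catch (e) {
  console.log(e instanceof TypeError); // true
}
```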
Assuming we want synchronous failure we could spec this in a few ways:
Either way, concat() needs to be aligned with the other methods (fix it to pass this.[[builder]], or remove the redundant steps).
This also raises the question of what exception type it should be:
Additionally... is there anything other than "same builder" that's not currently validated, but that should be?