Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should MLBufferView + MLOperandDescriptor be strongly typed (as a MLTensor)? #275

Closed
wacky6 opened this issue Jun 22, 2022 · 4 comments
Closed
Labels

Comments

@wacky6
Copy link

wacky6 commented Jun 22, 2022

Some issues with using a dict:

  • dimensions isn't validated against TypedArray.bytesLength. I can set arbitrary dimensions (like [4,4,4] to a Float32Array that only has 1 element.
  • Hard to create an overloaded WebNN dispatch() and WebML compute() for single input/output graphs.
    • WebIDL can't distinguish between dict<NodeName, dict<data, dimensions>> and dict<NodeName "data", NodeName "dimensions">, thus fails to bind JavaScript object to C++ implementation.

Discovered this during WebML prototyping for https://crbug.com/1338067

@wacky6
Copy link
Author

wacky6 commented Jun 23, 2022

Considerations:

  • MLTensor class would allow us to validate dimensions and throw errors early on (instead of at graph computation time).
  • A work-around for non-JS-standard float point types (Support for non-IEEE 754 float point types #252), by adding (or inferring) dataType to constructor?
    • new MLTensor(float32Array, dimensions) -> Infer dataType as Float32Array
    • new MLTensor(arrayBuffer, dimensions, dataType) -> dataType required for untyped array

@wacky6 wacky6 changed the title Should MLTensor be strongly typed? Should MLBufferView + MLOperandDescriptor be strongly typed (as a MLTensor)? Jun 23, 2022
@huningxin
Copy link
Contributor

Thanks @wacky6 .

  • dimensions isn't validated against TypedArray.bytesLength. I can set arbitrary dimensions (like [4,4,4] to a Float32Array that only has 1 element.

Could it be validated by checking TypedArray.length against dimensions?

  • WebIDL can't distinguish between dict<NodeName, dict<data, dimensions>> and dict<NodeName "data", NodeName "dimensions">, thus fails to bind JavaScript object to C++ implementation.

This is an issue.

I suppose this would also support other fixed point types, e.g., int4, am I correct?

  • new MLTensor(float32Array, dimensions) -> Infer dataType as Float32Array

My question is whether the content of float32Array is copied into the constructed MLTensor. As MLTesnor would be used in the busy compute loop, the cost of memory copy and GC (new objects constructed for each compute).

@wacky6
Copy link
Author

wacky6 commented Jun 24, 2022

Could it be validated by checking TypedArray.length against dimensions?

I think so. TypedArray.length === dimensions_multiplied_together. For array buffer, should be byteLength === data_type_size * dimensions_multiplied_together.

I suppose this would also support other fixed point types, e.g., int4, am I correct?

In theory yes. But we should clearly define data layout of an arrayBuffer. Things like endiness, padding, etc.

This can be a stretch goal, we can add support for them later. :)

My question is whether the content of float32Array is copied into the constructed MLTensor. As MLTesnor would be used in the busy compute loop, the cost of memory copy and GC (new objects constructed for each compute).

I don't think copy is necessary at constructor time. TypedArray have a fixed length, it can't change in size (it can only be reshaped).

MLTensor can hold a reference to the typedArray (or arrayBuffer). This also allows web applications to modify tensor's content by modifying the typedArray's content.

When MLTensor is used in compute() or dispatch(), the data can be copied (to avoid TOCTOU bugs) if necessary.

@a-sully
Copy link
Contributor

a-sully commented Mar 27, 2024

MLBufferView was removed in #569, and this point:

MLTensor class would allow us to validate dimensions and throw errors early on (instead of at graph computation time).

is covered by the proposed "Calculate the output shape" section of #572

Can we close this issue?

@wacky6 wacky6 closed this as completed Mar 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants