Content: Specify output size calculations #582

199 changes: 194 additions & 5 deletions index.bs
@@ -1217,7 +1217,7 @@ partial interface MLGraphBuilder {
<div class=algorithm-steps>
1. [=Assert=]: |op| is one of "argMin", "argMax".
1. If |options|.{{MLArgMinMaxOptions/axes}} [=map/exists=], if any of its elements is not in [=the range=] 0 to |input|'s [=MLOperand/rank=], exclusive, then [=exception/throw=] a "{{DataError}}" {{DOMException}}.
1. Let |outputShape| be the result of invoking the underlying implementation for calculating reduction output dimensions, given |options|.
1. Let |outputShape| be the result of [=MLGraphBuilder/calculating reduction output sizes=] given |input|'s [=MLOperand/shape=], |options|.{{MLArgMinMaxOptions/axes}} (if it [=map/exists=]), and |options|.{{MLArgMinMaxOptions/keepDimensions}}.
1. Let |desc| be a new {{MLOperandDescriptor}}.
1. Set |desc|.{{MLOperandDescriptor/dataType}} to {{MLOperandDataType/"int64"}}.
1. Set |desc|.{{MLOperandDescriptor/dimensions}} to |outputShape|.
@@ -1794,7 +1794,28 @@ partial interface MLGraphBuilder {
</div>

<details open algorithm>
<summary>
To <dfn for=MLGraphBuilder>calculate conv output size</dfn> given unsigned integers |inputSize|, |filterSize|, |beginningPadding|, |endingPadding|, |stride| and |dilation|, perform these steps. They return a number.
</summary>
<div class=algorithm-steps>
1. Let |effectiveFilterSize| be ( |filterSize| - 1 ) * |dilation| + 1.
1. Let |outputSize| be ( |inputSize| - |effectiveFilterSize| + |beginningPadding| + |endingPadding| ) / |stride| + 1.
Contributor


Should we handle any errors, such as |outputSize| overflowing or underflowing?

Member Author


That's a good question... the specs I've been involved in usually operate in an abstract space with infinite precision, unlimited range, etc., with very limited checking.

I noticed the Chromium prototype impl is using checked math (i.e. testing underflow/overflow), and given the sizes that ML deals with, it seems like this is going to be a practical concern.

We can spec this any way we want... ideas welcome. We should look at other specs for inspiration, too.

Collaborator

@fdwr, Feb 27, 2024


Hmm, I've always found when implementing specs (like bidi, line breaking, vertical layout...) that I was grateful they left out that degree of fine validation and instead focused on the algorithm itself. Any validation specific to the nature of the operators (e.g. maybe an operator only supports even sizes) belongs in there, but too much otherwise muddies an already complex document. Maybe we can even call that out somewhere in a common section, that the implementation should handle overflow/underflow/safe limit checks, but not as extra prose for each operator.

1. Return |outputSize|.
</div>
</details>
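
As an illustration of the steps above (not part of the spec text), here is a minimal Python sketch. The function name `conv_output_size` is hypothetical. It uses `fractions.Fraction` so the division stays exact, on the assumption that the algorithm returns an unrounded number and the caller applies floor or ceiling afterwards. Python integers do not overflow, so the checked-math question raised in the review thread is not modeled here.

```python
from fractions import Fraction


def conv_output_size(input_size, filter_size, beginning_padding,
                     ending_padding, stride, dilation):
    """Hypothetical helper mirroring "calculate conv output size".

    Returns an exact, possibly fractional, number. Rounding is left to
    the caller, as in the spec steps.
    """
    # A dilated filter spans (filter_size - 1) * dilation + 1 input elements.
    effective_filter_size = (filter_size - 1) * dilation + 1
    padded_extent = (input_size - effective_filter_size
                     + beginning_padding + ending_padding)
    return Fraction(padded_extent, stride) + 1
```

For example, a size-224 input with a size-7 filter, padding of 3 on each side, stride 2 and dilation 1 gives 225/2, which floors to 112. The sketch does no validation, so a filter larger than the padded input yields a zero or negative result rather than an error.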

<details open algorithm>
<summary>
To <dfn for=MLGraphBuilder>calculate conv2d output sizes</dfn> given unsigned integers |inputHeight|, |inputWidth|, |filterHeight| and |filterWidth|, [=/list=] of 4 unsigned integers |padding|, [=/list=] of 2 unsigned integers |strides|, and [=/list=] of 2 unsigned integers |dilations|, perform these steps. They return a [=/list=] of 2 numbers.
</summary>
<div class=algorithm-steps>
1. Let |outputHeight| be the result of [=MLGraphBuilder/calculating conv output size=] given |inputHeight|, |filterHeight|, |padding|[0], |padding|[1], |strides|[0] and |dilations|[0].
1. Let |outputWidth| be the result of [=MLGraphBuilder/calculating conv output size=] given |inputWidth|, |filterWidth|, |padding|[2], |padding|[3], |strides|[1] and |dilations|[1].
1. Return « |outputHeight|, |outputWidth| ».
</div>
</details>
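
The 2D variant simply applies the 1D calculation once per spatial axis. A rough Python sketch follows, with hypothetical function names and the padding order taken from the steps above: `padding[0]` and `padding[1]` apply to height, `padding[2]` and `padding[3]` to width. Results are exact fractions, on the assumption that the caller does the rounding.

```python
from fractions import Fraction


def conv_output_size(input_size, filter_size, beginning_padding,
                     ending_padding, stride, dilation):
    """Hypothetical 1D helper; returns an exact, unrounded number."""
    effective_filter_size = (filter_size - 1) * dilation + 1
    padded_extent = (input_size - effective_filter_size
                     + beginning_padding + ending_padding)
    return Fraction(padded_extent, stride) + 1


def conv2d_output_sizes(input_height, input_width, filter_height,
                        filter_width, padding, strides, dilations):
    """Hypothetical helper mirroring "calculate conv2d output sizes".

    padding is [top, bottom, left, right]; strides and dilations are
    [height, width]. Returns [output_height, output_width], unrounded.
    """
    output_height = conv_output_size(input_height, filter_height,
                                     padding[0], padding[1],
                                     strides[0], dilations[0])
    output_width = conv_output_size(input_width, filter_width,
                                    padding[2], padding[3],
                                    strides[1], dilations[1])
    return [output_height, output_width]
```

Keeping the per-axis helper separate means the same function can serve pooling, which uses window dimensions in place of filter sizes.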

<details open algorithm>
<summary>
The <dfn method for=MLGraphBuilder>conv2d(|input|, |filter|, |options|)</dfn> method steps are:
</summary>
@@ -1817,7 +1838,47 @@ partial interface MLGraphBuilder {
1. If |options|.{{MLConv2dOptions/bias}} [=map/exists=]:
1. If |options|.{{MLConv2dOptions/bias}}'s [=MLOperand/rank=] is not 1, then [=exception/throw=] a {{TypeError}}.
1. If |options|.{{MLConv2dOptions/bias}}'s [=MLOperand/dataType=] is not the same as |input|'s [=MLOperand/dataType=], then [=exception/throw=] a {{TypeError}}.
1. Let |outputShape| be the result of invoking the underlying implementation for calculating output dimensions, given |options|.
1. *Calculate the output shape:*
1. Switch on |options|.{{MLConv2dOptions/inputLayout}}:
<dl class=switch>
: {{MLInputOperandLayout/"nchw"}}
::
1. Let |batches| be |inputShape|[0].
1. Let |channels| be |inputShape|[1].
1. Let |inputHeight| be |inputShape|[2].
1. Let |inputWidth| be |inputShape|[3].
: {{MLInputOperandLayout/"nhwc"}}
::
1. Let |batches| be |inputShape|[0].
1. Let |inputHeight| be |inputShape|[1].
1. Let |inputWidth| be |inputShape|[2].
1. Let |channels| be |inputShape|[3].
</dl>
1. Let |filterShape| be |filter|'s [=MLOperand/shape=].
1. Switch on |options|.{{MLConv2dOptions/filterLayout}}:
<dl class=switch>
: {{MLConv2dFilterOperandLayout/"hwio"}}
::
1. Let |filterHeight| be |filterShape|[0].
1. Let |filterWidth| be |filterShape|[1].
: {{MLConv2dFilterOperandLayout/"ohwi"}}
: {{MLConv2dFilterOperandLayout/"ihwo"}}
::
1. Let |filterHeight| be |filterShape|[1].
1. Let |filterWidth| be |filterShape|[2].
: {{MLConv2dFilterOperandLayout/"oihw"}}
::
1. Let |filterHeight| be |filterShape|[2].
1. Let |filterWidth| be |filterShape|[3].
</dl>
1. Let |outputSizes| be the result of [=MLGraphBuilder/calculating conv2d output sizes=] given |inputHeight|, |inputWidth|, |filterHeight|, |filterWidth|, |options|.{{MLConv2dOptions/padding}}, |options|.{{MLConv2dOptions/strides}}, and |options|.{{MLConv2dOptions/dilations}}.
1. Switch on |options|.{{MLConv2dOptions/inputLayout}}:
<dl class=switch>
: {{MLInputOperandLayout/"nchw"}}
:: Let |outputShape| be « |batches|, |channels|, floor( |outputSizes|[0] ), floor( |outputSizes|[1] ) ».
: {{MLInputOperandLayout/"nhwc"}}
:: Let |outputShape| be « |batches|, floor( |outputSizes|[0] ), floor( |outputSizes|[1] ), |channels| ».
</dl>
1. If |outputShape| is not the same as |options|.{{MLConv2dOptions/bias}}'s [=MLOperand/shape=], then [=exception/throw=] a "{{DataError}}" {{DOMException}}.
1. Let |desc| be a new {{MLOperandDescriptor}}.
1. Set |desc|.{{MLOperandDescriptor/dataType}} to |input|'s [=MLOperand/dataType=].
@@ -1943,6 +2004,28 @@ partial interface MLGraphBuilder {
*outputSize = (inputSize - 1) ** *stride + (filterSize - 1) ** *dilation + 1 - beginningPadding - endingPadding + outputPadding*
</div>

<details open algorithm>
<summary>
To <dfn for=MLGraphBuilder>calculate convtranspose output size</dfn> given unsigned integers |inputSize|, |filterSize|, |beginningPadding|, |endingPadding|, |stride|, |dilation|, and |outputPadding|, perform these steps. They return a number.
</summary>
<div class=algorithm-steps>
1. Let |effectiveFilterSize| be ( |filterSize| - 1 ) * |dilation| + 1.
1. Let |outputSize| be ( |inputSize| - 1 ) * |stride| + |effectiveFilterSize| - |beginningPadding| - |endingPadding| + |outputPadding|.
1. Return |outputSize|.
</div>
</details>
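
For comparison with the forward convolution case, here is a small Python sketch of the transposed calculation. The function name `conv_transpose_output_size` is hypothetical. Only integer multiplication, addition and subtraction are involved, so the result is always an integer when the inputs are.

```python
def conv_transpose_output_size(input_size, filter_size, beginning_padding,
                               ending_padding, stride, dilation,
                               output_padding):
    """Hypothetical helper mirroring "calculate convtranspose output size"."""
    # A dilated filter spans (filter_size - 1) * dilation + 1 elements.
    effective_filter_size = (filter_size - 1) * dilation + 1
    return ((input_size - 1) * stride + effective_filter_size
            - beginning_padding - ending_padding + output_padding)
```

With an input of size 3, a filter of size 3, stride 2, dilation 1 and no padding, the result is 7. The sketch does not check for a negative result, which is possible when the padding exceeds the unpadded extent.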

<details open algorithm>
<summary>
To <dfn for=MLGraphBuilder>calculate convtranspose2d output sizes</dfn> given unsigned integers |inputHeight|, |inputWidth|, |filterHeight| and |filterWidth|, [=/list=] of 4 unsigned integers |padding|, [=/list=] of 2 unsigned integers |strides|, [=/list=] of 2 unsigned integers |dilations|, and [=/list=] of 2 unsigned integers |outputPadding|, perform these steps. They return a [=/list=] of 2 numbers.
</summary>
<div class=algorithm-steps>
1. Let |outputHeight| be the result of [=MLGraphBuilder/calculating convtranspose output size=] given |inputHeight|, |filterHeight|, |padding|[0], |padding|[1], |strides|[0], |dilations|[0], and |outputPadding|[0].
1. Let |outputWidth| be the result of [=MLGraphBuilder/calculating convtranspose output size=] given |inputWidth|, |filterWidth|, |padding|[2], |padding|[3], |strides|[1], |dilations|[1] and |outputPadding|[1].
1. Return « |outputHeight|, |outputWidth| ».
</div>
</details>
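
The 2D transposed case can be sketched the same way, by calling the 1D calculation once per axis. The function names below are hypothetical, and the argument layout follows the steps above: `padding` is ordered as height-begin, height-end, width-begin, width-end, and the other lists are ordered as height, width.

```python
def conv_transpose_output_size(input_size, filter_size, beginning_padding,
                               ending_padding, stride, dilation,
                               output_padding):
    """Hypothetical 1D helper for one spatial axis."""
    effective_filter_size = (filter_size - 1) * dilation + 1
    return ((input_size - 1) * stride + effective_filter_size
            - beginning_padding - ending_padding + output_padding)


def conv_transpose2d_output_sizes(input_height, input_width, filter_height,
                                  filter_width, padding, strides, dilations,
                                  output_padding):
    """Hypothetical helper mirroring "calculate convtranspose2d output sizes".

    Returns [output_height, output_width].
    """
    output_height = conv_transpose_output_size(
        input_height, filter_height, padding[0], padding[1],
        strides[0], dilations[0], output_padding[0])
    output_width = conv_transpose_output_size(
        input_width, filter_width, padding[2], padding[3],
        strides[1], dilations[1], output_padding[1])
    return [output_height, output_width]
```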

<details open algorithm>

<summary>
@@ -1971,7 +2054,46 @@ partial interface MLGraphBuilder {
1. If |options|.{{MLConvTranspose2dOptions/bias}} [=map/exists=]:
1. If |options|.{{MLConvTranspose2dOptions/bias}}'s [=MLOperand/rank=] is not 1, then [=exception/throw=] a {{TypeError}}.
1. If |options|.{{MLConvTranspose2dOptions/bias}}'s [=MLOperand/dataType=] is not the same as |input|'s [=MLOperand/dataType=], then [=exception/throw=] a {{TypeError}}.
1. Let |outputShape| be the result of invoking the underlying implementation for calculating output dimensions, given |options|.
1. *Calculate the output shape:*
1. Switch on |options|.{{MLConvTranspose2dOptions/inputLayout}}:
<dl class=switch>
: {{MLInputOperandLayout/"nchw"}}
::
1. Let |batches| be |inputShape|[0].
1. Let |channels| be |inputShape|[1].
1. Let |inputHeight| be |inputShape|[2].
1. Let |inputWidth| be |inputShape|[3].
: {{MLInputOperandLayout/"nhwc"}}
::
1. Let |batches| be |inputShape|[0].
1. Let |inputHeight| be |inputShape|[1].
1. Let |inputWidth| be |inputShape|[2].
1. Let |channels| be |inputShape|[3].
</dl>
1. Let |filterShape| be |filter|'s [=MLOperand/shape=].
1. Switch on |options|.{{MLConvTranspose2dOptions/filterLayout}}:
<dl class=switch>
: {{MLConvTranspose2dFilterOperandLayout/"iohw"}}
::
1. Let |filterHeight| be |filterShape|[2].
1. Let |filterWidth| be |filterShape|[3].
: {{MLConvTranspose2dFilterOperandLayout/"hwoi"}}
::
1. Let |filterHeight| be |filterShape|[0].
1. Let |filterWidth| be |filterShape|[1].
: {{MLConvTranspose2dFilterOperandLayout/"ohwi"}}
::
1. Let |filterHeight| be |filterShape|[1].
1. Let |filterWidth| be |filterShape|[2].
</dl>
1. Let |outputSizes| be the result of [=MLGraphBuilder/calculating convtranspose2d output sizes=] given |inputHeight|, |inputWidth|, |filterHeight|, |filterWidth|, |options|.{{MLConvTranspose2dOptions/padding}}, |options|.{{MLConvTranspose2dOptions/strides}}, |options|.{{MLConvTranspose2dOptions/dilations}}, and |options|.{{MLConvTranspose2dOptions/outputPadding}}.
1. Switch on |options|.{{MLConvTranspose2dOptions/inputLayout}}:
<dl class=switch>
: {{MLInputOperandLayout/"nchw"}}
:: Let |outputShape| be « |batches|, |channels|, floor( |outputSizes|[0] ), floor( |outputSizes|[1] ) ».
: {{MLInputOperandLayout/"nhwc"}}
:: Let |outputShape| be « |batches|, floor( |outputSizes|[0] ), floor( |outputSizes|[1] ), |channels| ».
</dl>
1. If |outputShape| is not the same as |options|.{{MLConvTranspose2dOptions/bias}}'s [=MLOperand/shape=], then [=exception/throw=] a "{{DataError}}" {{DOMException}}.
1. Let |desc| be a new {{MLOperandDescriptor}}.
1. Set |desc|.{{MLOperandDescriptor/dataType}} to |input|'s [=MLOperand/dataType=].
@@ -4407,6 +4529,54 @@ partial interface MLGraphBuilder {
</pre>
</div>

<details open algorithm>
<summary>
To <dfn for=MLGraphBuilder>calculate pool2d output sizes</dfn> given {{MLInputOperandLayout}} |layout|, [=/list=] of 4 unsigned integers |inputShape|, {{MLRoundingType}} |roundingType|, [=/list=] of 2 unsigned integers |windowDimensions|, [=/list=] of 4 unsigned integers |padding|, [=/list=] of 2 unsigned integers |strides|, [=/list=] of 2 unsigned integers |dilations|, and optional [=/list=] of 2 unsigned integers |outputSizes|, perform these steps. They return a [=/list=] of 4 unsigned integers.
</summary>
<div class=algorithm-steps>
1. Switch on |layout|:
<dl class=switch>
: {{MLInputOperandLayout/"nchw"}}
::
1. Let |batches| be |inputShape|[0].
1. Let |channels| be |inputShape|[1].
1. Let |inputHeight| be |inputShape|[2].
1. Let |inputWidth| be |inputShape|[3].
: {{MLInputOperandLayout/"nhwc"}}
::
1. Let |batches| be |inputShape|[0].
1. Let |inputHeight| be |inputShape|[1].
1. Let |inputWidth| be |inputShape|[2].
1. Let |channels| be |inputShape|[3].
</dl>
1. If |outputSizes| is given, then:
1. Let |outputHeight| be |outputSizes|[0].
1. Let |outputWidth| be |outputSizes|[1].
1. Otherwise:
1. Let |outputSizes| be the result of [=MLGraphBuilder/calculating conv2d output sizes=] given |inputHeight|, |inputWidth|, |windowDimensions|[0], |windowDimensions|[1], |padding|, |strides|, and |dilations|.
1. Let |outputHeight| be |outputSizes|[0].
1. Let |outputWidth| be |outputSizes|[1].
1. Switch on |roundingType|:
<dl class=switch>
: {{MLRoundingType/"floor"}}
::
1. Set |outputWidth| to floor(|outputWidth|).
1. Set |outputHeight| to floor(|outputHeight|).
: {{MLRoundingType/"ceil"}}
::
1. Set |outputWidth| to ceiling(|outputWidth|).
1. Set |outputHeight| to ceiling(|outputHeight|).
</dl>
1. Switch on |layout|:
<dl class=switch>
: {{MLInputOperandLayout/"nchw"}}
:: Return « |batches|, |channels|, |outputHeight|, |outputWidth| ».
: {{MLInputOperandLayout/"nhwc"}}
:: Return « |batches|, |outputHeight|, |outputWidth|, |channels| ».
</dl>
</div>
</details>
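
To make the pooling shape calculation concrete, here is a hedged Python sketch. It assumes that an explicitly supplied `output_sizes` is used directly and that the sizes are otherwise computed from the window, padding, strides and dilations and then rounded. All function names are hypothetical, and `None` stands in for "not given".

```python
import math
from fractions import Fraction


def conv_output_size(input_size, filter_size, beginning_padding,
                     ending_padding, stride, dilation):
    """Hypothetical 1D helper; returns an exact, unrounded number."""
    effective_filter_size = (filter_size - 1) * dilation + 1
    padded_extent = (input_size - effective_filter_size
                     + beginning_padding + ending_padding)
    return Fraction(padded_extent, stride) + 1


def pool2d_output_sizes(layout, input_shape, rounding_type,
                        window_dimensions, padding, strides, dilations,
                        output_sizes=None):
    """Hypothetical helper mirroring "calculate pool2d output sizes"."""
    # Unpack the 4-D input shape according to the layout.
    if layout == "nchw":
        batches, channels, input_height, input_width = input_shape
    elif layout == "nhwc":
        batches, input_height, input_width, channels = input_shape
    else:
        raise ValueError(f"unknown layout: {layout!r}")

    if output_sizes is not None:
        # Assumption: explicit sizes take precedence over computed ones.
        output_height, output_width = output_sizes
    else:
        output_height = conv_output_size(
            input_height, window_dimensions[0], padding[0], padding[1],
            strides[0], dilations[0])
        output_width = conv_output_size(
            input_width, window_dimensions[1], padding[2], padding[3],
            strides[1], dilations[1])

    # Rounding is a no-op when the sizes are already integers.
    if rounding_type == "floor":
        round_fn = math.floor
    elif rounding_type == "ceil":
        round_fn = math.ceil
    else:
        raise ValueError(f"unknown rounding type: {rounding_type!r}")
    output_height = round_fn(output_height)
    output_width = round_fn(output_width)

    if layout == "nchw":
        return [batches, channels, output_height, output_width]
    return [batches, output_height, output_width, channels]
```

A 5x5 input with a 2x2 window and stride 2 has an unrounded size of 5/2 per axis, so `"floor"` gives 2 and `"ceil"` gives 3. This is the case where the rounding type changes the output shape.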

<details open algorithm>
<summary>
To <dfn for="MLGraphBuilder" data-lt="pooling-op">create pooling operation</dfn> given [=string=] |op|, {{MLOperand}} |input| and {{MLPool2dOptions}} |options|, run the following steps:
@@ -4430,7 +4600,7 @@ partial interface MLGraphBuilder {
1. Let |desc| be a copy of |input|.{{MLOperand/[[descriptor]]}}.
1. If any of the following sub-steps fail, [=exception/throw=] an "{{OperationError}}" {{DOMException}}.
1. Make a request to the underlying platform to:
1. Calculate the output dimensions given |input| and |options|. Set |desc|.{{MLOperandDescriptor/dimensions}} to that.
1. Set |desc|.{{MLOperandDescriptor/dimensions}} to the result of [=MLGraphBuilder/calculating pool2d output sizes=] given |options|.{{MLPool2dOptions/layout}}, |input|'s [=MLOperand/shape=], |options|.{{MLPool2dOptions/roundingType}}, |options|.{{MLPool2dOptions/windowDimensions}}, |options|.{{MLPool2dOptions/padding}}, |options|.{{MLPool2dOptions/strides}}, |options|.{{MLPool2dOptions/dilations}}, and |options|.{{MLPool2dOptions/outputSizes}} (if it [=map/exists=]).
1. Let |output| be the result of [=creating an MLOperand=] given [=this=] and |desc|.
1. Let |opImpl| be [=platform operator=] for the |op| pooling operation, given |options|.
1. Set |output|.{{MLOperand/[[operator]]}} to |opImpl|.
@@ -4589,14 +4759,33 @@ partial interface MLGraphBuilder {
- *SumSquare*: Compute the sum of the square of all the input values along the axes.
</div>

<details open algorithm>
<summary>
To <dfn for="MLGraphBuilder">calculate reduction output sizes</dfn>, given a [=/list=] of unsigned integers |inputShape|, an optional [=/list=] of unsigned integers |axes|, and [=/boolean=] |keepDimensions|, perform the following steps. They return a new [=/list=] of unsigned integers.
</summary>
<div class=algorithm-steps>
1. Let |inputSize| be |inputShape|'s [=list/size=].
1. If |axes| is not given, let |axes| be [=the range=] 0 to |inputSize|, exclusive.
1. If |keepDimensions| is true, then:
1. Let |outputShape| be a [=list/clone=] of |inputShape|.
1. [=list/For each=] |axis| of |axes|:
1. Set |outputShape|[|axis|] to 1.
1. Otherwise:
1. Let |outputShape| be an empty [=/list=].
1. [=list/For each=] |index| in [=the range=] 0 to |inputSize|, exclusive:
1. If |axes| does not [=list/contain=] |index|, then [=list/append=] |inputShape|[|index|].
1. Return |outputShape|.
</div>
</details>
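
The reduction steps above translate fairly directly into code. The following Python sketch is illustrative only. The function name `reduction_output_sizes` is hypothetical, and passing `None` for `axes` stands in for "not given". It assumes the axes have already been validated against the input rank, as the calling algorithms do.

```python
def reduction_output_sizes(input_shape, axes, keep_dimensions):
    """Hypothetical helper mirroring "calculate reduction output sizes".

    axes may be None, meaning "not given", in which case every axis is
    reduced.
    """
    input_size = len(input_shape)
    if axes is None:
        axes = list(range(input_size))

    if keep_dimensions:
        # Reduced axes stay in the shape with size 1.
        output_shape = list(input_shape)
        for axis in axes:
            output_shape[axis] = 1
    else:
        # Reduced axes are dropped from the shape entirely.
        output_shape = [input_shape[index]
                        for index in range(input_size)
                        if index not in axes]
    return output_shape
```

Reducing shape `[2, 3, 4]` over axis 1 gives `[2, 4]`, or `[2, 1, 4]` when dimensions are kept. Reducing over all axes without keeping dimensions gives an empty list, which is a scalar shape.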

<details open algorithm>
<summary>
To <dfn for="MLGraphBuilder" data-lt="reduce-op">create reduce operation</dfn> given [=string=] |op|, {{MLOperand}} |input| and {{MLReduceOptions}} |options|, run the following steps:
</summary>
<div class=algorithm-steps>
1. [=Assert=]: |op| is one of "reduceL1", "reduceL2", "reduceLogSum", "reduceLogSumExp", "reduceMax", "reduceMean", "reduceMin", "reduceProduct", "reduceSum", "reduceSumSquare".
1. If |options|.{{MLReduceOptions/axes}} [=map/exists=], if any of its elements is not in [=the range=] 0 to |input|'s [=MLOperand/rank=], exclusive, then [=exception/throw=] a "{{DataError}}" {{DOMException}}.
1. Let |outputShape| be the result of invoking the underlying implementation for calculating reduction output dimensions, given |options|.
1. Let |outputShape| be the result of [=MLGraphBuilder/calculating reduction output sizes=] given |input|'s [=MLOperand/shape=], |options|.{{MLReduceOptions/axes}} (if it [=map/exists=]), and |options|.{{MLReduceOptions/keepDimensions}}.
1. Let |desc| be a new {{MLOperandDescriptor}}.
1. Set |desc|.{{MLOperandDescriptor/dataType}} to |input|'s [=MLOperand/dataType=].
1. Set |desc|.{{MLOperandDescriptor/dimensions}} to |outputShape|.