Content: Specify output size calculations #582

Merged

Member

@inexorabletash inexorabletash commented Feb 24, 2024

This covers:

  • Reduction ops (including argMin/argMax)
  • conv2d
  • convTranspose2d
  • Pooling ops (which rely on conv2d)

Fixes #500



@inexorabletash
Member Author

This is basically a translation of the Chromium C++ implementation into spec-ese, although parameter validation remains separated out. That can be revisited. And like the Chromium implementation, there's a lot of logic sharing among the ops.

We could also go further with helper ops, e.g. unpacking and re-packing operand layouts given an MLOperandLayout.

@zolkis
Collaborator

zolkis commented Feb 24, 2024

Wow, this is big! Thanks!

@inexorabletash
Member Author

> Wow, this is big! Thanks!

Yeah... I didn't know what I was getting myself into.

</summary>
<div class=algorithm-steps>
1. Let |effectiveFilterSize| be ( |filterSize| - 1 ) * |dilation| + 1.
1. Let |outputSize| be ( |inputSize| - |effectiveFilterSize| + |beginningPadding| + |endingPadding| ) / |stride| + 1.
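
As a rough illustration of the two quoted steps, here is a minimal Python sketch. The function name `calculate_conv_output_size` and the parameter names are hypothetical, and the use of floor division is an assumption: the quoted steps do not say how a non-integer quotient is rounded.

```python
def calculate_conv_output_size(input_size, filter_size, stride, dilation,
                               beginning_padding, ending_padding):
    """Sketch of the quoted steps for a single spatial dimension."""
    # Step 1: a dilated filter spans (filterSize - 1) * dilation + 1 input elements.
    effective_filter_size = (filter_size - 1) * dilation + 1
    # Step 2: count the positions the filter can occupy across the padded input.
    # ASSUMPTION: floor division; the quoted steps leave rounding unspecified.
    return (input_size - effective_filter_size
            + beginning_padding + ending_padding) // stride + 1
```

For example, an input of size 5 with a filter of size 3, stride 1, dilation 1, and no padding yields an output size of 3 under these assumptions.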
Contributor

Should we handle any errors, such as |outputSize| overflowing or underflowing?

Member Author

That's a good question... the specs I've been involved in usually operate in an abstract space with infinite precision, unlimited range, etc., with very limited checking.

I noticed the Chromium prototype impl is using checked math (i.e. testing underflow/overflow), and given the sizes that ML deals with, it seems like this is going to be a practical concern.

We can spec this any way we want... ideas welcome. We should look at other specs for inspiration, too.
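
To make the "checked math" idea concrete, here is a hedged Python sketch of the output size calculation with explicit range checks. It is not the Chromium code: the helper name `checked_conv_output_size`, the unsigned 32-bit limit, and the choice of exceptions are all assumptions made for illustration, as is the floor division.

```python
UINT32_MAX = 2**32 - 1  # ASSUMED limit; the thread does not name an integer type


def checked_conv_output_size(input_size, filter_size, stride, dilation,
                             beginning_padding, ending_padding):
    """Hypothetical checked variant of the output size steps."""
    operands = (input_size, filter_size, stride, dilation,
                beginning_padding, ending_padding)
    if any(not 0 <= value <= UINT32_MAX for value in operands):
        raise ValueError("operand outside the assumed unsigned 32-bit range")
    if filter_size == 0 or stride == 0 or dilation == 0:
        raise ValueError("filter size, stride and dilation must be at least 1")

    # Overflow check on the intermediate product.
    effective_filter_size = (filter_size - 1) * dilation + 1
    if effective_filter_size > UINT32_MAX:
        raise OverflowError("effective filter size overflows")

    # Overflow check on the padded input size.
    padded_size = input_size + beginning_padding + ending_padding
    if padded_size > UINT32_MAX:
        raise OverflowError("padded input size overflows")

    # Underflow check: the subtraction must not go negative.
    if padded_size < effective_filter_size:
        raise OverflowError("output size underflows: filter exceeds padded input")

    # ASSUMPTION: floor division, as in many implementations.
    return (padded_size - effective_filter_size) // stride + 1
```

Python integers do not overflow on their own, so the limits are enforced by hand here; in a language with fixed-width integers the same checks would typically come from a checked-arithmetic helper.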

Collaborator

@fdwr fdwr Feb 27, 2024

Hmm, I've always found when implementing specs (like bidi, line breaking, vertical layout...) that I was grateful they left out that degree of fine validation and instead focused on the algorithm itself. Any validation specific to the nature of the operators (e.g. maybe an operator only supports even sizes) belongs in there, but too much otherwise muddies an already complex document. Maybe we can even call that out somewhere in a common section, that the implementation should handle overflow/underflow/safe limit checks, but not as extra prose for each operator.

Collaborator

@fdwr fdwr left a comment

😮 This must have taken a while. It's certainly an improvement over "Let outputShape be the result of invoking the underlying implementation for calculating output dimensions, given options". Approved after typo fix. Thanks J.

@inexorabletash inexorabletash marked this pull request as draft February 27, 2024 19:04
@inexorabletash
Member Author

inexorabletash commented Feb 27, 2024

Marking as draft. Once #587 merges I'll push a rebased version. I have it locally and it's much smaller; most of the lines now handle the nchw/nhwc layout unpacking/packing.
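
The nchw/nhwc unpacking and packing mentioned above can be sketched roughly as follows. This is an illustrative Python sketch only: the helper names `unpack_layout` and `pack_layout` are hypothetical, and the spec text in the PR may structure these steps differently.

```python
def unpack_layout(shape, layout):
    """Return (batches, channels, height, width) from a 4-D shape."""
    if layout == "nchw":
        batches, channels, height, width = shape
    elif layout == "nhwc":
        batches, height, width, channels = shape
    else:
        raise ValueError(f"unsupported layout: {layout!r}")
    return batches, channels, height, width


def pack_layout(batches, channels, height, width, layout):
    """Inverse of unpack_layout: build a 4-D shape in the given layout."""
    if layout == "nchw":
        return [batches, channels, height, width]
    if layout == "nhwc":
        return [batches, height, width, channels]
    raise ValueError(f"unsupported layout: {layout!r}")
```

With helpers like these, an op can unpack the input shape once, compute the output height and width independently of layout, and pack the result back in the same layout.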

@inexorabletash inexorabletash marked this pull request as ready for review February 27, 2024 21:52
@inexorabletash
Member Author

Okay, rebased - sorry about the forced push, but it should be ready for another review now.

Contributor

@huningxin huningxin left a comment

LGTM, thanks much!

@huningxin huningxin merged commit 69e7ad6 into webmachinelearning:main Feb 28, 2024
1 check passed
github-actions bot added a commit that referenced this pull request Feb 28, 2024
SHA: 69e7ad6
Reason: push, by huningxin

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@inexorabletash inexorabletash deleted the content-output-shapes branch February 28, 2024 02:26
Successfully merging this pull request may close these issues.

Define "calculating output dimensions"
5 participants