New content: Add definition for shape broadcasting #534
Conversation
I'm still pretty hazy on definitions, so please look at this very skeptically. Re: introducing a new "Algorithms" section - this is a pattern I've adopted from other specs. I thought I'd give it a whirl here and see what everyone thinks. It's easy enough to drop and put these definitions anywhere else to keep the overall spec structure unchanged.
Force-pushed 8438403 to 74aef99.
Okay, this is updated to match the Chromium impl, I think. But... that may not be what's desired. @a-sully mentions that unidirectional broadcasting seems specific to ONNX and that other frameworks do things to work around this during export.
AFAICT ONNX is an outlier in making a distinction between "unidirectional" and "multidirectional" broadcasting. This distinction is not made by other major frameworks.
The "unidirectional broadcastable" constraint of some ONNX ops (e.g. …) can be worked around during export. Numpy's broadcasting rules are the clear standard across the industry and are, IMHO, the standard that we should expose to the web. It then becomes up to the user agent to ensure that the constraints of the underlying framework (e.g. unidirectional broadcasting for ONNX, lack of inferred broadcasting specifications for XLA) are satisfied.
We follow Numpy broadcast rules for ops in the Chromium implementation; I think it's the golden "standard". +1 to adopt this throughout the spec (I didn't realize the spec's matmul referred to an implementation-defined broadcast 😱).
Is the ONNX unidirectional broadcast issue easy to work around? The linked prelu() patch seems "easy enough". Should we (from the spec's perspective) expect implementations (e.g. the Chromium DML backend) to do such easy workarounds? A sketch of one such workaround follows.
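To make the workaround question concrete, here's a hedged sketch, building on the `broadcast_shapes` helper above; the function name `prelu_operand_shapes` is hypothetical, loosely modeled on the kind of export-time fix the linked prelu() patch performs:

```python
def prelu_operand_shapes(x_shape, slope_shape):
    """Export-time workaround sketch: ONNX PRelu requires slope to be
    unidirectionally broadcastable to x, i.e. broadcasting must leave
    x's shape unchanged."""
    target = broadcast_shapes(x_shape, slope_shape)
    if target == tuple(x_shape):
        # slope already satisfies the unidirectional rule; pass it through.
        return tuple(x_shape), tuple(slope_shape)
    # Otherwise, insert explicit Reshape/Expand nodes so both operands
    # arrive at the bidirectional result shape up front; the unidirectional
    # check then passes trivially.
    return target, target

# x: (1, 4) with slope: (3, 1) is fine bidirectionally but fails the
# unidirectional rule; the workaround expands both operands to (3, 4).
assert prelu_operand_shapes((1, 4), (3, 1)) == ((3, 4), (3, 4))
```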
Force-pushed 74aef99 to 0712df2.
LGTM with a comment, thanks much!
@fdwr and @wchao1115, please also take a look. Thanks!
Force-pushed 1b63ff7 to cf22546.
Thank you Josh.
Huh, I would have presumed …
Yep, easy with a little reshape of leading 1's.
Whichever broadcasting behavior we have for prelu, unidirectional or bidirectional, it's trivial either way for backends to insert leading 1's as needed. (The Chromium DML backend already does this for DML, since DML expects the elementwise tensor descriptions of each input to have been massaged to a consistent rank first, with any higher-level concepts like broadcasting already resolved.) In the worst case, backends can utilize whatever logic they have for …
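The "insert leading 1's" step amounts to a one-line reshape. A tiny sketch (the helper name `align_rank` is hypothetical):

```python
def align_rank(shape, rank):
    """Prepend 1's so that `shape` has exactly `rank` dimensions, as a
    backend might do before describing elementwise inputs to DML."""
    assert len(shape) <= rank
    return (1,) * (rank - len(shape)) + tuple(shape)

assert align_rank((6, 5), 4) == (1, 1, 6, 5)
```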
Force-pushed 312a9e8 to 962ec76.
History was getting messy - I force-pushed a new squashed commit. It rolls in #564 to get the build working, but I'll rebase once that merges.
This change introduces a new section for Algorithms, following APIs, to collect algorithms referenced throughout the specification. A section for Broadcasting is introduced, which defines broadcasting shapes and gives an explicit algorithm matching WebNN implementations of NumPy's General Broadcasting Rules. Definitions for "broadcastable" and "unidirectionally broadcastable" are introduced. The previous definition of "broadcast-shapes" is removed in favor of these new algorithms.

Use broadcasting definition in expand() rather than bespoke steps.

For webmachinelearning#324, webmachinelearning#378, webmachinelearning#462, and potentially webmachinelearning#523.

Co-authored-by: Dwayne Robinson <[email protected]>
Force-pushed 962ec76 to 7ce286f.
Josh: 🤔 Except for larger branch integrations where we want to retain history, squashing on merge is the general policy for cleaner history. Then the CR author never needs to explicitly squash their changes, because force pushes in GitHub yield this unfortunate result for reviewers trying to minimize the review delta since last time...
LGTM. 👞✨
No worries... I tend to squash locally, force-push, and always review the full diff anyway to see what the actual change is going to be. I'll try to refrain from squashing/force-pushing here going forward, though!
One more tiny fix after re-review, which I'll just submit (per your earlier permission) to complete this. Thanks Josh.
This change introduces a new section for Algorithms, following APIs, to collect algorithms referenced throughout the specification.
A section for Broadcasting is introduced, which defines broadcasting shapes and gives an explicit algorithm matching WebNN implementations of NumPy's General Broadcasting Rules. Definitions for "broadcastable" and "unidirectionally broadcastable" are introduced. The previous definition of "broadcast-shapes" is removed in favor of these new algorithms.
For #324, #462, and potentially #523.
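In terms of a shape-broadcasting algorithm like the `broadcast_shapes` sketch above, the two new definitions can be read roughly as follows (a paraphrase, not the spec's exact steps):

```python
def broadcastable(shape_a, shape_b):
    """Two shapes are broadcastable if bidirectional broadcasting succeeds."""
    try:
        broadcast_shapes(shape_a, shape_b)
        return True
    except ValueError:
        return False

def unidirectionally_broadcastable(from_shape, to_shape):
    """from_shape is unidirectionally broadcastable to to_shape if
    broadcasting succeeds and leaves to_shape unchanged."""
    return broadcastable(from_shape, to_shape) and \
        broadcast_shapes(from_shape, to_shape) == tuple(to_shape)
```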