
Conventions: Use "rank" for variable names, when appropriate #627

Merged

Conversation

@inexorabletash (Member) commented Mar 28, 2024

Fixes #588



@inexorabletash inexorabletash requested a review from fdwr March 28, 2024 21:11
@inexorabletash (Member Author)

@fdwr - can you take a look?

I think the remaining uses of "size" in variable names refer to element counts or magnitudes (e.g. width and height). Maybe I missed some? I was on the fence about the broadcast algorithms that still use sizeA/sizeB. I'm happy to convert those over, but wanted your opinion.

If you want to search the rendered spec for "size", I find it handy to remove the IDL and code samples first, e.g. by running this in the console: [...document.querySelectorAll('pre')].forEach(n=>n.remove())

@fdwr (Collaborator) commented Mar 29, 2024

Thanks Josh 🙂. Scanning for all "size" occurrences, I only found one other operator to replace with rank, plus some dubious size-related statements. Do these proposed edits look valid to you?

batchNormalization's current wording kinda makes it sound like the bias can only be a scalar (1 element) rather than a 1D array:

    1. If |options|.{{MLBatchNormalizationOptions/scale}} [=map/exists=]:
-       1. If its [=list/size=] is not 1, then [=exception/throw=] a {{TypeError}}.
+       1. If its [=MLOperand/rank=] is not 1, then [=exception/throw=] a {{TypeError}}.
        1. If |options|.{{MLBatchNormalizationOptions/scale}}'s [=MLOperand/shape=][0] is not equal to |input|'s [=MLOperand/shape=][|options|.{{MLBatchNormalizationOptions/axis}}], then [=exception/throw=] a {{TypeError}}.
    1. If |options|.{{MLBatchNormalizationOptions/bias}} [=map/exists=]:
-       1. If its [=list/size=] is not 1, then [=exception/throw=] a {{TypeError}}.
+       1. If its [=MLOperand/rank=] is not 1, then [=exception/throw=] a {{TypeError}}.
        1. If |options|.{{MLBatchNormalizationOptions/bias}}'s [=MLOperand/shape=][0] is not equal to |input|'s [=MLOperand/shape=][|options|.{{MLBatchNormalizationOptions/axis}}], then [=exception/throw=] a {{TypeError}}.
    1. *Make graph connections:*
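The corrected checks above can be sketched in JavaScript (an illustrative sketch only; the function and parameter names are assumptions, not spec identifiers): scale and bias must be 1-D operands whose single dimension matches the input's size along the chosen axis.

```javascript
// Sketch of the corrected batchNormalization validation for scale/bias.
// operandShape: shape of the scale or bias operand (array of dimensions).
// inputShape:   shape of the input operand.
// axis:         the batchNormalization axis option.
// Returns true when the operand passes both checks above.
function validateScaleOrBias(operandShape, inputShape, axis) {
  // Rank must be 1 (a 1-D array, not "list size 1").
  if (operandShape.length !== 1) return false;
  // Its single dimension must match the input's size along `axis`.
  if (operandShape[0] !== inputShape[axis]) return false;
  return true;
}

// validateScaleOrBias([3], [2, 3, 4, 5], 1)    → true  (1-D, length 3)
// validateScaleOrBias([3, 1], [2, 3, 4, 5], 1) → false (rank 2, not 1)
```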

Since the rest of these are size related, but not about the rank, would you rather these be a separate CR?

reduce currently sounds like it preserves any input dimensions of length 1, but actually it sets reduced output dimensions to length 1. Also list size looks wrong because it's not about the list size, but rather the dimension values inside the list:

    : <dfn>keepDimensions</dfn>
    ::
-        If true, retains reduced dimensions with [=list/size=] 1.
+        If true, the output has the same rank as the input, setting any reduced dimensions to size 1.
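To make the proposed wording concrete, here is an illustrative sketch (names are assumptions, not spec text) of how keepDimensions affects a reduction's output shape: the output keeps the input's rank and each reduced dimension becomes size 1, rather than dimensions of size 1 being "retained".

```javascript
// Compute a reduction's output shape.
// inputShape:     the input operand's shape (array of dimensions).
// axes:           indices of the dimensions being reduced.
// keepDimensions: if true, reduced dimensions are kept with size 1.
function reduceOutputShape(inputShape, axes, keepDimensions) {
  const output = [];
  for (let i = 0; i < inputShape.length; i++) {
    if (axes.includes(i)) {
      // Reduced dimension: kept as size 1, or dropped entirely.
      if (keepDimensions) output.push(1);
    } else {
      output.push(inputShape[i]);
    }
  }
  return output;
}

// reduceOutputShape([2, 3, 4], [1], true)  → [2, 1, 4] (same rank as input)
// reduceOutputShape([2, 3, 4], [1], false) → [2, 4]
```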

slice checking the list size for 0 is redundant because there's a check below that the list size == the input rank anyway (and arguably rejecting an empty slice list for scalars is wrong anyway). I believe this statement was intended to check each dimension length != 0 inside the list rather than the list size:

<details open algorithm>
  <summary>
    The <dfn method for=MLGraphBuilder>slice(|input|, |starts|, |sizes|)</dfn> method steps are:
  </summary>
    1. If [=MLGraphBuilder/validating operand=] with [=this=] and |input| returns false, then [=exception/throw=] a {{TypeError}}.
-   1. If |sizes|'s [=list/size=] is 0, then [=exception/throw=] a {{TypeError}}.
+   1. If any of |sizes|'s have size 0, then [=exception/throw=] a {{TypeError}}.
    1. If |starts|'s [=list/size=] and |sizes|'s [=list/size=] are not both equal to |input|'s [=MLOperand/rank=], then [=exception/throw=] a {{TypeError}}.
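The intended distinction — checking the dimension values inside the list, not the list's size — can be sketched like this (an illustrative sketch; the function name and boolean-return shape are assumptions, since the spec throws a TypeError instead):

```javascript
// Sketch of the intended slice() validation.
// inputRank: the input operand's rank.
// starts, sizes: arrays of per-dimension slice offsets and lengths.
function validateSlice(inputRank, starts, sizes) {
  // Intended check: reject any dimension length of 0 *inside* sizes,
  // not a sizes list whose own size is 0.
  if (sizes.some((size) => size === 0)) return false;
  // Both lists must match the input's rank (this also rejects an
  // empty list for a non-scalar input, making a separate check redundant).
  if (starts.length !== inputRank || sizes.length !== inputRank) return false;
  return true;
}

// validateSlice(2, [0, 1], [2, 0]) → false (a requested size is 0)
// validateSlice(2, [0, 1], [2, 3]) → true
```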

I'm okay with sizeA here because it's actually about the shape's size. Though, somebody reported to me that "Let sizeA be A's size" was confusing because it sounded like it meant tensor A (since we often use capital letters for tensors), but in this case A is actually a shape. How about:

- To unidirectionally broadcast the shapes A and B, perform the following steps. A and B are [lists](https://infra.spec.whatwg.org/#list) of positive integers, representing the dimensions of tensors, and the steps return a new [list](https://infra.spec.whatwg.org/#list) of positive integers, or failure.
+ To unidirectionally broadcast the shapes shapeA and shapeB, perform the following steps. shapeA and shapeB are [lists](https://infra.spec.whatwg.org/#list) of positive integers, representing the dimensions of tensors, and the steps return a new [list](https://infra.spec.whatwg.org/#list) of positive integers, or failure.

- 1. Let |sizeA| be |A|'s [=list/size=].
+ 1. Let |sizeA| be |shapeA|'s [=list/size=].
- 1. Let |sizeB| be |B|'s [=list/size=].
+ 1. Let |sizeB| be |shapeB|'s [=list/size=].
1. If |sizeB| > |sizeA|, then return failure.
- 1. Let |paddedB| be a [=list/clone=] of |B|.
+ 1. Let |paddedB| be a [=list/clone=] of |shapeB|.
1. While |paddedB|'s [=list/size=] is less than |sizeA|, [=list/prepend=] 1 to |paddedB|.
1. Let |outputShape| be a new [=/list=].
1. [=list/For each=] |index| in [=the range=] 0 to |sizeA|, exclusive:
-    1. Let |dimA| be |A|[|index|].
+    1. Let |dimA| be |shapeA|[|index|].
    1. Let |dimB| be |paddedB|[|index|].
    1. If |dimA| is not equal to |dimB| and |dimA| is not equal to 1, then return failure.
    1. [=list/Append=] |dimA| to |outputShape|.
1. Return |outputShape|.
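The renamed algorithm above transcribes directly into JavaScript (an illustrative sketch mirroring the quoted steps verbatim, not spec code; the function name is an assumption, and null stands in for "failure"):

```javascript
// Unidirectionally broadcast shapeA and shapeB, per the steps above.
// shapeA, shapeB: arrays of positive integers (tensor dimensions).
// Returns the output shape, or null for failure.
function unidirectionalBroadcast(shapeA, shapeB) {
  const sizeA = shapeA.length;
  const sizeB = shapeB.length;
  if (sizeB > sizeA) return null;
  // Clone shapeB, then prepend 1s until it reaches sizeA.
  const paddedB = Array(sizeA - sizeB).fill(1).concat(shapeB);
  const outputShape = [];
  for (let index = 0; index < sizeA; index++) {
    const dimA = shapeA[index];
    const dimB = paddedB[index];
    // Mirrors the quoted failure condition exactly.
    if (dimA !== dimB && dimA !== 1) return null;
    outputShape.push(dimA);
  }
  return outputShape;
}

// unidirectionalBroadcast([2, 3], [2, 3]) → [2, 3]
// unidirectionalBroadcast([2], [2, 3])    → null (sizeB > sizeA)
```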

@inexorabletash (Member Author)

I only found one other operator to replace with rank

Just to confirm: this was referring to the batchNormalization issues, not a separate issue?

would you rather these be a separate CR?

Nah, too much noise, this is all related.

, plus some dubious size-related statements. Do these proposed edits look valid to you?

Yes - incorporated verbatim in 38f9261, except in slice():

-   1. If |sizes|'s [=list/size=] is 0, then [=exception/throw=] a {{TypeError}}.
+   1. If any of |sizes|'s have size 0, then [=exception/throw=] a {{TypeError}}.

I made this change instead:

-    1. If |sizes|'s [=list/size=] is 0, then [=exception/throw=] a {{TypeError}}.
+    1. If any of |sizes|'s [=list/items=] are 0, then [=exception/throw=] a {{TypeError}}.

.. and made sure |A| became |shapeA| etc. everywhere. I'll add a convention note.

@fdwr (Collaborator) left a comment:

LGTM JB. TY. 😎

@fdwr (Collaborator) commented Apr 3, 2024

this was referring to the batchNormalization issues, not a separate issue?

Yep, it looked like you found every other case of size -> rank for other operators.

@huningxin Any thoughts on this one?

@inexorabletash inexorabletash requested a review from huningxin April 3, 2024 19:48
@huningxin (Contributor) left a comment:

LGTM, thanks @inexorabletash !

@huningxin huningxin merged commit e52f163 into webmachinelearning:main Apr 4, 2024
2 checks passed
github-actions bot added a commit that referenced this pull request Apr 4, 2024
SHA: e52f163
Reason: push, by huningxin

@inexorabletash inexorabletash deleted the conventions-rank-not-size branch April 4, 2024 18:22

Successfully merging this pull request may close these issues.

Rename inputSize variables as inputRank in algorithms
3 participants