Wording change: Tighten up output shape calculation algorithms #523
Conversation
For matmul(), the current text is:
Should the former be |
Looks good. Thanks.
Force-pushed from 49e76e6 to 9596953.
index.bs (outdated):
1. [=map/For each=] |index| in [=the range=] 0 to |size|, exclusive:
1. Set |shape|[|index|] to the maximum of |shapeA|[|index|] and |shapeB|[|index|].
1. [=list/Append=] the maximum of |shapeA|[|index|] and |shapeB|[|index|] to |shape|.
Nit: [=list/append=] for consistency?
Would |index| be out-of-bounds if the size of a particular shape is less than |size|?
And I think the previous algorithm itself has an issue: if the size of a shape is less than the maximum size, that shape should have 1s prepended until its size equals the maximum size before entering this loop.
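For illustration, the prepend-1s alignment described above could be sketched like this (hypothetical helper name, not spec text):

```python
def align_ranks(shape_a, shape_b):
    """Prepend 1s to the shorter shape so both shapes have the same rank,
    matching NumPy-style right-aligned broadcasting."""
    size = max(len(shape_a), len(shape_b))
    pad = lambda s: [1] * (size - len(s)) + list(s)
    return pad(shape_a), pad(shape_b)
```

With both shapes at the same rank, the per-index maximum loop above is safe from out-of-bounds access.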
Re: [=list/append=] - It's written with a capital A so that the start of the sentence is capitalized. Bikeshed is smart enough to case-insensitive match the mixed-case version, but not smart enough to capitalize the start of the sentence.
Re: previous algorithm - agreed, this looks "sus".
With the caveat that I'm still new to tensors, my intuition at least according to the prose ("If either a or b is N-dimensional where N > 2, it is treated as a stack of matrices with dimensions corresponding to the last two indices...") is that we need the end result to be:
- sizeA == sizeB (achieved by prepending - how does appending come in???)
- sizeA > 1 (via appending/prepending; alternately we reject per "Simplify matmul op" #470)
- sizeB > 1 (via appending/prepending; alternately we reject per "Simplify matmul op" #470)
- shapeA[sizeA - 1] = shapeB[sizeB - 2] (i.e. columns in first matrix = rows in second matrix)
... and the algorithm as written doesn't seem to either try to achieve this or enforce it.
Maybe someone that knows could weigh in with a more precise description or pseudocode? I can translate to spec language!
Chromium implementation does this:
- if sizeA < 2 || sizeB < 2, throw -- per "Simplify matmul op" #470
- colsA = shapeA[sizeA - 1]
- rowsA = shapeA[sizeA - 2]
- colsB = shapeB[sizeB - 1]
- rowsB = shapeB[sizeB - 2]
- if colsA != rowsB, throw -- the 2D matrices must match
- slicedA = shapeA[0 : sizeA - 2]
- slicedB = shapeB[0 : sizeB - 2]
- if sizeA > 2 && sizeB > 2
  - b = broadcastShapes(slicedA, slicedB, bidirectional = true) or throw
  - outputShape = b with [rowsA, colsB] appended
- else if sizeA == 2 && sizeB == 2
  - outputShape = [rowsA, colsB]
- else (sizeA == 2 or sizeB == 2)
  - outputShape = ((sizeA > sizeB) ? slicedA : slicedB) with [rowsA, colsB] appended
... where broadcastShapes() is a whole 'nother can of worms.
I notice https://webmachinelearning.github.io/webnn/#mlgraphbuilder-broadcast-shapes says it is implementation-defined - is that true!?!?!?
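A runnable Python sketch of those steps (my `broadcast_shapes` here is a minimal NumPy-style stand-in for Chromium's `broadcastShapes`, not its actual code):

```python
from itertools import zip_longest

def broadcast_shapes(shape_a, shape_b):
    """Bidirectional (NumPy-style) broadcast of two shapes; raises on mismatch."""
    result = []
    # Walk the shapes right-to-left, padding the shorter one with 1s.
    for a, b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if a != b and a != 1 and b != 1:
            raise ValueError(f"shapes not broadcastable: {shape_a} vs {shape_b}")
        result.append(max(a, b))
    return result[::-1]

def matmul_output_shape(shape_a, shape_b):
    size_a, size_b = len(shape_a), len(shape_b)
    if size_a < 2 or size_b < 2:
        raise ValueError("inputs must be at least 2-D")  # reject, per #470
    rows_a, cols_a = shape_a[-2], shape_a[-1]
    rows_b, cols_b = shape_b[-2], shape_b[-1]
    if cols_a != rows_b:
        raise ValueError("columns of A must equal rows of B")
    # Broadcasting an empty batch against a non-empty one also covers the
    # "one input is exactly 2-D" branches of the pseudocode above.
    batch = broadcast_shapes(shape_a[:-2], shape_b[:-2])
    return batch + [rows_a, cols_b]
```

For example, `matmul_output_shape([1, 2, 3, 4], [5, 1, 4, 6])` yields `[5, 2, 3, 6]`.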
I put #534 up to define broadcasting shapes, so for this PR we could wait for that to land, then rebase this on top of it and I can translate the above into spec text. I'm feeling close to actually understanding what's going on!
I notice https://webmachinelearning.github.io/webnn/#mlgraphbuilder-broadcast-shapes says it is implementation-defined - is that true!?!?!?
broadcast shouldn't be implementation-defined, thanks for putting a PR to fix it.
Latest update includes a commit that rewrites the matmul output shape calc to match Chromium's impl. PTAL?
@huningxin I see you're good with the rest of the CR, and Joshua's last iteration addressed this comment too. So I merged it, but please let me know when you're back from vacation if there's any other little feedback.
Force-pushed from 9596953 to 93bd980.
This change introduces a new section for Algorithms, following APIs, to collect algorithms referenced throughout the specification. A section for Broadcasting is introduced, which defines broadcasting shapes and gives an explicit algorithm matching WebNN implementations of NumPy's General Broadcasting Rules. Definitions for "broadcastable" and "unidirectionally broadcastable" are introduced. The previous definition of "broadcast-shapes" is removed in favor of these new algorithms. For webmachinelearning#324, webmachinelearning#462, and potentially webmachinelearning#523.
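As a rough illustration of the two definitions the commit describes, here is a minimal sketch (my own function names, not the spec's algorithms):

```python
def broadcastable(shape_a, shape_b):
    """Bidirectional: each right-aligned dimension pair must match or one be 1."""
    for a, b in zip(reversed(shape_a), reversed(shape_b)):
        if a != b and a != 1 and b != 1:
            return False
    return True

def unidirectionally_broadcastable(shape_from, shape_to):
    """One direction only: shape_from may stretch its 1s; shape_to is fixed."""
    if len(shape_from) > len(shape_to):
        return False
    return all(f == t or f == 1
               for f, t in zip(reversed(shape_from), reversed(shape_to)))
```

Note the asymmetry: `[2, 3]` broadcasts bidirectionally with `[1, 3]`, but does not broadcast *to* `[1, 3]` unidirectionally.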
Force-pushed from 93bd980 to c5314ea.
Force-pushed from c5314ea to 5af908b.
Force-pushed from 5af908b to 4e6cd98.
* New content: Add definition for shape broadcasting This change introduces a new section for Algorithms, following APIs, to collect algorithms referenced throughout the specification. A section for Broadcasting is introduced, which defines broadcasting shapes and gives an explicit algorithm matching WebNN implementations of NumPy's General Broadcasting Rules. Definitions for "broadcastable" and "unidirectionally broadcastable" are introduced. The previous definition of "broadcast-shapes" is removed in favor of these new algorithms. Use broadcasting definition in expand(), rather than bespoke steps For #324, #378, #462, and potentially #523. Co-authored-by: Dwayne Robinson <[email protected]> * Fix prelu parameter order --------- Co-authored-by: Dwayne Robinson <[email protected]>
Force-pushed from 4e6cd98 to 30acaf4.
Rebased and force-pushed - sorry about that. There are two commits - the first is unchanged from before, the second is a rewrite of the matmul shape calculations to match the Chromium impl. Please take a look?
We can reduce some redundant GEMM steps, but otherwise LGTM sir.
Co-authored-by: Dwayne Robinson <[email protected]>
👍 Thanks J.
SHA: a2f7e0a Reason: push, by fdwr Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
For gemm() - make lists that are mutated be explicit clones using Infra terminology, and simplify the wording reversing the lists.
For matmul() - make lists that are mutated be explicit clones using Infra terminology, use append/prepend definitions from Infra, convert a variable change from "let" to "set" and drop use of "array".
For #395