Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bounded Dynamism RFC #194

Closed
wants to merge 17 commits into from
Closed

Bounded Dynamism RFC #194

wants to merge 17 commits into from

Conversation

smit-hinsu
Copy link
Contributor

@smit-hinsu smit-hinsu commented Sep 26, 2022

RFC describing the current state and future plans for bounded dynamism in StableHLO. This includes the tensor type with bounds, op semantics and bounded dynamism ops.

@burmako burmako self-requested a review September 26, 2022 06:18
@burmako burmako added the RFC label Sep 26, 2022
@burmako burmako self-assigned this Sep 27, 2022
Copy link
Contributor

@burmako burmako left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your proposal and for your patience as we're working out the StableHLO evolution process!

I think that this RFC provides a good summary of the current state of bounded dynamism in *HLO opsets in (P1)-(P3). Both the statically-shaped spec (spec_draft.md) and the unbounded dynamism RFC (#8) are still in the works, but once they are finished I believe that this RFC will work well as the delta that specifically introduced bounded dynamism.

My main comment is about (P4) which I believe needs further work, both with respect to technical details of the proposal and the organizational details of what this proposal means for the community. PTAL!

docs/bounded_dynamism.md Outdated Show resolved Hide resolved
docs/bounded_dynamism.md Outdated Show resolved Hide resolved
docs/bounded_dynamism.md Outdated Show resolved Hide resolved
docs/bounded_dynamism.md Outdated Show resolved Hide resolved
docs/bounded_dynamism.md Outdated Show resolved Hide resolved
docs/bounded_dynamism.md Outdated Show resolved Hide resolved
docs/bounded_dynamism.md Outdated Show resolved Hide resolved
docs/bounded_dynamism.md Outdated Show resolved Hide resolved
docs/bounded_dynamism.md Outdated Show resolved Hide resolved
@smit-hinsu smit-hinsu force-pushed the main branch 2 times, most recently from 3861065 to 2c3021c Compare October 7, 2022 05:28
Copy link

@sherhut sherhut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some drive-by comments.

docs/bounded_dynamism.md Outdated Show resolved Hide resolved
docs/bounded_dynamism.md Outdated Show resolved Hide resolved
docs/bounded_dynamism.md Outdated Show resolved Hide resolved
@smit-hinsu
Copy link
Contributor Author

Thanks for the feedback!

I am talking with stakeholders to further refine this proposal and expand on the future vision and roadmap.

Should be able to have an update in a week.

docs/bounded_dynamism.md Outdated Show resolved Hide resolved
docs/bounded_dynamism.md Outdated Show resolved Hide resolved
@burmako burmako mentioned this pull request Nov 24, 2022
@burmako burmako assigned smit-hinsu and unassigned burmako Nov 27, 2022
burmako pushed a commit that referenced this pull request Dec 6, 2022
Implements TODO to add verification for bounds.

This is in preparation for moving bounds out of experimental as part of
#194

I considered doing this in the RFC itself but better to do the code
change separately and immediately given that these invariants are
already assumed during type inference.
@smit-hinsu
Copy link
Contributor Author

Addressed review comments. Thanks to @burmako for extensive offline discussions and help!

Please take another look.

@smit-hinsu smit-hinsu removed their assignment Dec 8, 2022
@burmako burmako self-assigned this Dec 8, 2022
Copy link

@rjpower rjpower left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for sharing this. Overall it looks good, and it's great to see this being documented thoroughly. I'm not an expert on bounded dynamism, so I'll reserve judgement for folks who have more experience here.

I mostly left nits on wording and examples, though I'd like to better understand how we plan to unify the different mechanisms here.

rfcs/20220925-bounded-dynamism.md Outdated Show resolved Hide resolved
rfcs/20220925-bounded-dynamism.md Outdated Show resolved Hide resolved
rfcs/20220925-bounded-dynamism.md Show resolved Hide resolved
rfcs/20220925-bounded-dynamism.md Outdated Show resolved Hide resolved
rfcs/20220925-bounded-dynamism.md Outdated Show resolved Hide resolved
rfcs/20220925-bounded-dynamism.md Outdated Show resolved Hide resolved
rfcs/20220925-bounded-dynamism.md Outdated Show resolved Hide resolved
rfcs/20220925-bounded-dynamism.md Outdated Show resolved Hide resolved
rfcs/20220925-bounded-dynamism.md Outdated Show resolved Hide resolved
rfcs/20220925-bounded-dynamism.md Outdated Show resolved Hide resolved
Copy link
Contributor

@burmako burmako left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Smit! Thank you for the updated RFC and for your continued hard work on working out a story for bounded dynamism in StableHLO.

To the best of my understanding, at the high level this RFC can be summarized as follows:

  1. Let's document bounded tensor type, GetDimensionSizeOp and SetDimensionSizeOp that have been part of the StableHLO opset since its creation when it was forked from MHLO.
  2. Bounded dynamism can probably be reframed as a special case of unbounded dynamism, so let's pave the way towards migrating from the former to the latter in the long term.

Below, I will provide feedback on these two parts separately:

1) Some of the concepts that are used to discuss part 1 are rather informal in comparison with the spec, but I think that's fine. We haven't yet fully worked out the StableHLO formalism (although it feels that we're just a few weeks away from it being relatively complete), and in the meanwhile we have to be able to make forward progress, which this RFC achieves.

The only major change to part 1 that I would request (in addition to a bunch of smaller comments) is for this PR to also update spec.md with sections for GetDimensionSizeOp and SetDimensionSizeOp.

2) As far as part 2 goes, my understanding is that it is currently a hypothesis that bounded dynamism can be reframed a special case of unbounded dynamism. This hypothesis looks plausible to me, but until we have strong theoretical and/or experimental evidence, I don't think we should recommend that "Producers are encouraged to use the unbounded dynamic operations" just yet.

Similarly, I would request to mark P4 as aspirational and sharpen it a little bit. For example, at the moment, it is not quite clear to me what the expected takeaway from P4 should be. It compares unbounded and bounded programs that do something similar, calls out the potential (i.e. plausible but not yet realized) benefits of unbounded programs and... that's it. Perhaps reformulate P4 or add some kind of a conclusion at the end?

3) Finally, I would like to better understand the proposed sequencing of this RFC with the upcoming dynamism RFC. E.g. P2 says "All ops that support dynamic operands or results can have bounds specified for them", but we haven't yet documented which StableHLO ops should document dynamic operands or result.

Does this mean that the proposal is to wait until the dynamism RFC is worked out and then review this RFC? Or the idea is to discuss this RFC in the context of what's currently roughly planned for the dynamism RFC and seek an advance approval?

rfcs/20220925-bounded-dynamism.md Outdated Show resolved Hide resolved
rfcs/20220925-bounded-dynamism.md Outdated Show resolved Hide resolved
rfcs/20220925-bounded-dynamism.md Outdated Show resolved Hide resolved
rfcs/20220925-bounded-dynamism.md Outdated Show resolved Hide resolved
rfcs/20220925-bounded-dynamism.md Outdated Show resolved Hide resolved
rfcs/20220925-bounded-dynamism.md Outdated Show resolved Hide resolved
rfcs/20220925-bounded-dynamism.md Outdated Show resolved Hide resolved
rfcs/20220925-bounded-dynamism.md Outdated Show resolved Hide resolved
rfcs/20220925-bounded-dynamism.md Outdated Show resolved Hide resolved
rfcs/20220925-bounded-dynamism.md Outdated Show resolved Hide resolved
@burmako burmako assigned smit-hinsu and unassigned burmako Dec 19, 2022
rfcs/20220925-bounded-dynamism.md Outdated Show resolved Hide resolved
rfcs/20220925-bounded-dynamism.md Outdated Show resolved Hide resolved
rfcs/20220925-bounded-dynamism.md Outdated Show resolved Hide resolved
rfcs/20220925-bounded-dynamism.md Outdated Show resolved Hide resolved
rfcs/20220925-bounded-dynamism.md Show resolved Hide resolved
rfcs/20220925-bounded-dynamism.md Show resolved Hide resolved
rfcs/20220925-bounded-dynamism.md Outdated Show resolved Hide resolved
Copy link

@miladm miladm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates. I made a few comments below for you to help address.

Overall, I take it that the existing set_dimension_size and get_dimension_size APIs work as intended (i.e. identical to what the downstream XLA provides); the new dynamism APIs proposed here unlock unbounded dynamism which is theorized to be more flexible than the XLA bounded dynamism API. Correct me if I missed a key takeaway here plz.

rfcs/20220925-bounded-dynamism.md Show resolved Hide resolved
rfcs/20220925-bounded-dynamism.md Outdated Show resolved Hide resolved
rfcs/20220925-bounded-dynamism.md Show resolved Hide resolved
docs/bounded_dynamism.md Outdated Show resolved Hide resolved

Bounded dynamism allows programs to represent the maximum runtime size that a
particular dynamic dimension of a tensor can have. This makes it possible to run
such programs on hardware that doesn't support dynamic tensors but could support
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is mostly a limitation of the software stack rather than the hardware when this happens. Now it may be that the hardware makes it difficult and or the resultant program's performance or space requirements get too great for the HW, but it's not an intrinsic HW issue. Perhaps hardware -> platforms.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, better to be generic.

to separate approval. Note that producers that generate unbounded programs
don't need this
in StableHLO.
* Provide a transformation that converts StableHLO programs to bounded StableHLO
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is bound propagation in scope? E.g., given program with bounds on inputs/certain ops, propagate it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not. There is some related work in #622


## Detailed Proposal

### (P1) Bounded tensor type using the encoding field in the RankedTensorType
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets start with adding a "wrapper" class so that one can use LLVM style RTTI (e.g., dyn_cast) and then use its convenience methods. In particular this would allow flipping extension attribute to an interface later if needed without changes in users. The implementation details of how stored and actual type remains the same, but C++ API is more shielded from underlying [in particular I do want to be able to reference a context as in value inference above]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice idea!

Filed #812

`RankedTensorType` encoding field. Bounds in `TypeExtensionsAttr` is an
`int64_t` array of size equal to rank of the tensor. Values corresponding to
static dimensions must be `ShapedType::kDynamicSize` which is printed as `?` in
the IR. Disallowing a static bound value for static dimensions makes the IR
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternative is to only represent bounds for dynamic dims (e.g., size of bounds == number of dynamic dims).

Now XLA today uses bounds here to effectively represent a view into a padded structure (that and the API implementations thereof are implementation details of the interpretations there rather than represented in StableHLO or type system) and in that world padding of static shapes make sense. As the padded values represent the shape of the underlying memory. Now Tensor type here does not represent memory, but just to say one can still have dynamic or static dims in tensor type while having static bounds and it would make sense for some users. (its not a use case I'm particularly interested in as those should be memrefs as there is explicit memory).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we went with the simple list because it will provide easier accessor but that doesn't have to part of the IR given your comment about wrapper that could provide other views. I think we should at least weigh this.

%arg3: tensor<2xf32>,
%arg4: tensor<4xf32>) {
// %arg0 is compatible with %arg1, %arg2 and %arg3 as bounded types could have
// tensor<2xf32> type during runtime.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same could be said for a tensor<?xf32> and tensor<*xf32> input too. Is that allowed operand?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

%arg4: tensor<4xf32>) {
// %arg0 is compatible with %arg1, %arg2 and %arg3 as bounded types could have
// tensor<2xf32> type during runtime.
%0 = "stablehlo.add"(%arg0, %arg1) : (tensor<?xf32, #stablehlo.type_extensions<bounds = [3]>>, tensor<?xf32, #stablehlo.type_extensions<bounds = [2]>>) -> tensor<?xf32, #stablehlo.type_extensions<bounds = [2]>>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From type inference point of view, this op here and its behavior effectively adds constraint that arg0 is bounded by 2 and the next does the same for arg2, and 70 makes it required to be. So given elementwise constraint on add we can therefore backpropagate that these aren't actually dynamic but all are tensor<2xf32>. Is that consistent?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, we could do that. For verification, we end up only checking the direct and not doing aggressive analysis to infer more information but that can be done separately.

```

Currently, the StableHLO dialect uses the MLIR ranked tensor type to represent
bounds. In the future we plan to introduce StableHLO type which supports bounds,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where it is good if you used the little wrapper class approach above to shield C++ changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filed #812

### (P3) get\_dimension\_size / set\_dimension\_size ops

The `get_dimension_size` op takes a tensor and a dimension index and returns the
runtime size as `tensor<i32>`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason to not use tensor<index>?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of generalizing to i64 and index as well for consistency. But given there is no requirement now. Seemed better to follow XLA until some use-case comes up.

Currently, the StableHLO dialect uses the MLIR ranked tensor type to represent
bounds. In the future we plan to introduce StableHLO type which supports bounds,
along with a custom pretty printing format. There will be a separate RFC on
this. Also, the proposal will follow StableHLO backward compatibility policies
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be good to show how function calls will be handled. E.g., are we keeping to the constraints imposed by func.func and func.call ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, that is imposed by func op and we are not making any changes there.

burmako pushed a commit that referenced this pull request Dec 21, 2022
Verification is missing two constraints. Filed
#790 for that.

RFC discussing this op: #194
@smit-hinsu
Copy link
Contributor Author

Update: We decided to merge this with #8 and have a single proposal covering both bounded and unbounded. Follow that issue to get updates and make any suggestions.

@smit-hinsu smit-hinsu closed this Dec 22, 2022
@burmako burmako mentioned this pull request Jul 5, 2023
@GleasonK GleasonK mentioned this pull request Dec 12, 2023
GleasonK added a commit that referenced this pull request Feb 5, 2024
The current design of dynamism in MHLO and StableHLO has been
practically useful. There are success stories of it connecting JAX,
PyTorch and TensorFlow to a number of compilers, in a mix of research
and production environments. This RFC aims to leverage existing support
for dynamism in the StableHLO dialect, discuss improvements to the
existing design and then formalize the improved design in the StableHLO
opset.

The main challenge with writing this RFC was that it affects the entire
opset. The current design involves a lot of corner cases, so it took
about a year of practical evaluation by the author - primarily within
JAX native serialization, but also correlating with other ongoing and
prior projects - to distill the design into just a few general design
principles. I think I'm happy with the outcome, but please take a look
at the "Summary" section for what that entails.

This RFC addresses a considerable chunk of community feedback on the
existing design, but some feedback is deliberately left out of scope for
this RFC to enable incremental progress while areas which require
additional alignment are developed in parallel. See sections "Community
feedback" and "Out of scope" for details.

The only open question at the moment is interoperability with HLO with
respect to bounded dynamism. The proposed representation for bounded
dynamic types in StableHLO is in parity with the representation for
bounded dynamic types in HLO. However, I'm not sure whether this parity
covers 100% of bounded dynamism functionality. For example: 1) there
appears to be a mismatch in how broadcasts are represented, 2) there is
misalignment in representations of dynamic windows (HLO has a high-level
representation: VALID and SAME, whereas StableHLO expects the producers
to dynamically compute window sizes). Nonetheless, I think that this
shouldn't block the initial review of the RFC, since there's a lot of
stuff to discuss - and in the meanwhile, I'll be working on confirming
interoperability with HLO.

Finally, I'd like to acknowledge Smit Hinsu's work on the [Bounded
Dynamism RFC](#194) from Q4
2022, which was superseded by this work. The representation for bounded
dynamic types in the StableHLO dialect was designed and implemented by
Smit, and Smit's proposal to allow bounded dynamic types everywhere is
compatible with the more general proposal from this RFC to enable
dynamism for all size-related program elements. Furthermore, Smit
contributed the formal spec for get_dimension_size as well as the
informal spec for set_dimension_size.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.