Reduce use of DataLayouts internals #1934
Merged
This is basically a peel off from #1929, and extends #1932. This PR:

- Reduces use of `parent(data)`, and especially `size(parent(data))`
- Drops `get_n_items`, and instead only integers are passed (when needed)
- Uses `UniversalSize` to leverage static parameters more
- Uses `DataLayouts.array_size` and `DataLayouts.farray_size`, which return static sizes of the datalayouts (`array_size` uses a field dimension of 1, which is often helpful)

This does increase some inference failures, because I've passed some additional variables through to GPU kernels using `Val`, which should result in statically known offsets, potentially improving performance of some kernels (likely by no more than 30%).
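
To illustrate the static-size idea in isolation, here is a minimal sketch in plain Julia. It does not use ClimaCore: `ToyLayout`, `static_size`, `dynamic_size`, and `sum_first` are hypothetical names invented for this example, and the real `DataLayouts.array_size` / `DataLayouts.farray_size` may be implemented differently. It only shows the general pattern of reading a size from a type parameter rather than from the backing array, and of passing a count to a kernel-style function through `Val`.

```julia
# Hypothetical toy layout (NOT ClimaCore's DataLayouts API): the logical size
# is stored as a type parameter, so it is known at compile time.
struct ToyLayout{S, A <: AbstractArray}
    array::A
end
ToyLayout{S}(array::A) where {S, A <: AbstractArray} = ToyLayout{S, A}(array)

# Static size: read from the type, never from the backing array.
static_size(::ToyLayout{S}) where {S} = S

# Dynamic size: inspects the backing array at run time.
dynamic_size(data::ToyLayout) = size(data.array)

# Kernel-style function that receives the item count as a `Val`, so `N` is a
# compile-time constant inside the method body.
function sum_first(data::ToyLayout, ::Val{N}) where {N}
    acc = zero(eltype(data.array))
    for i in 1:N
        acc += data.array[i]
    end
    return acc
end

data = ToyLayout{(4, 3)}(ones(Float64, 4, 3))
n_items = prod(static_size(data))
# Building `Val(n_items)` from a run-time integer is itself a dynamic
# dispatch; it is only free when the value is already a compile-time constant.
total = sum_first(data, Val(n_items))
```

In this sketch the call site that constructs `Val(n_items)` is where type inference can become harder, which is one plausible way the trade-off described above shows up: the method body sees a constant, while the caller may need a dynamic dispatch to reach it.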