Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add pointer and length types to WasmType. #1423

Merged
merged 3 commits into from
Feb 20, 2024

Conversation

sunfishcode
Copy link
Member

In anticipation of memory64, provenance in Rust, and guest bindings with instrumented pointers, add Pointer and Length types to WasmType, and use it in the ABI for list, strings, and argument/return value pointers. Consumers that don't have anything special to do with these can handle them both the same as i32.

And, because the variant type Canonical ABI can unify pointers with integer types like i64, add a Pointer64 type as well, which represents a conceptual union of a pointer and an i64. Consumers that don't have anything special to do with this type can handle it the same as an i64.

In anticipation of memory64, provenance in Rust, and guest bindings with
instrumented pointers, add `Pointer` and `Length` types to `WasmType`,
and use it in the ABI for list, strings, and argument/return value
pointers. Consumers that don't have anything special to do with these can
handle them both the same as `i32`.

And, because the variant type Canonical ABI can unify pointers with
integer types like `i64`, add a `Pointer64` type as well, which
represents a conceptual union of a pointer and an `i64`. Consumers that
don't have anything special to do with this type can handle it the same
as an `i64`.
@sunfishcode
Copy link
Member Author

One awkward thing about this is that we no longer get signature deduping between two types that differ only in i32 vs. pointer, as seen in the test change here.

@pchickey
Copy link
Contributor

Can you say more about provenance in Rust, and guest bindings with instrumented pointers?

I'm apprehensive that this is the right way to solve this problem - right now the WasmType corresponds to the spec Number Types and these pointer and length types don't appear anywhere in the spec. I would need to understand the intended way these new AST nodes are used to give a more concrete suggestion, though.

@alexcrichton
Copy link
Member

Not deduping some types I think is fine, everything is dedup'd at the runtime layer anyway typically.

One thing that I'd find helpful to review this, which I think might also help address @pchickey's concern, could this be integrated with wit-bindgen to see what the changes there might be? I think that's the main consumer of WasmType apart from Wasmtime and Wasmtime would probaby look quite similar to wit-bindgen

@sunfishcode
Copy link
Member Author

Rust is adopting a semantic model where pointers conceptually carry additional "provenance" information which describes what objects may be accessed. We should ideally make the bindings use pointer types like *mut c_void instead of i32 when carrying pointer values at the Rust level. Pointers codegen down to i32 at the Wasm level, but give the Rust compiler better information at the Rust level.

For instrumented pointers, I'm thinking about systems that use fat-pointer techniques to add spatial or temporal memory safety bookkeeping to pointers. Not everyone will want to do this, but for people who do, the bindings generator will want to know which values are pointers.

Here's a preview of what this looks like it wit-bindgen:

sunfishcode/wit-bindgen@a939252

It's not done yet; in particular, the RET_AREA variable needs bigger changes, but it does illustrate the high-level idea.

@alexcrichton
Copy link
Member

Oh nice! That does look pretty self-contained yeah.

Some questions I have from this approach:

  • What is Pointer64? I would naively expect Pointer and Length types plus some external configuration about what size they are.
  • The lowering for Length in wit-component is always i32, but I would have otherwise expected something about threading through the memory's type that the canon lower was attached to to the point of conversion. That way you'd convert Length in the context of a memory type to a primitive, in which case you'd pick the index size.

@sunfishcode
Copy link
Member Author

sunfishcode commented Feb 20, 2024

Pointer64 is for when there's eg. a variant<string, s64> and the Canonical ABI unifies the string's pointer with the i64. Neither *mut T nor i64 is sufficient, because *mut T might not be 64-bit, and i64 doesn't carry the provenance. So it seems we need a Pointer64 type which is like a union between a pointer and an i64 and able to carry either.

@sunfishcode
Copy link
Member Author

Concerning Length, this PR doesn't implement memory64; it's just adding some infrastructure that I expect memory64 will need. I agree, memory64 will need to thread the linear-memory index type through more places.

@alexcrichton
Copy link
Member

Ah I see, I misunderstood Pointer64. Perhaps that could be named PointerOrI64 to help disambiguate a bit?

I'm also not sure how this will be represented in Rust. In the bindings you're representing it currently as (*mut T, *mut T) but I'm not sure that'll do the right thing of expanding to two arguments? (or if it does is it guaranteed to do so? I feel like I still haven't lived down "Alex made wasm32-unknown-unknown entirely incompatible with C" so I'm hesitant to rely on ABI lowerings as-is in rustc again.

@sunfishcode
Copy link
Member Author

PointerOrI64 makes sense.

Yeah, (*mut T, *mut T) isn't great. I still need to figure out something to do there, because it turns out #[repr(transparent)] on a union is unstable. But that's entirely on the wit-bindgen side.

@alexcrichton alexcrichton added this pull request to the merge queue Feb 20, 2024
@alexcrichton
Copy link
Member

Ok looks good to me, and thanks again for this!

Merged via the queue into bytecodealliance:main with commit 80ce319 Feb 20, 2024
17 checks passed
@sunfishcode sunfishcode deleted the sunfishcode/stuff branch February 20, 2024 22:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants