Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wasm u128/i128 is improperly aligned #133991

Closed
V0ldek opened this issue Dec 7, 2024 · 7 comments · Fixed by llvm/llvm-project#119204
Closed

wasm u128/i128 is improperly aligned #133991

V0ldek opened this issue Dec 7, 2024 · 7 comments · Fixed by llvm/llvm-project#119204
Labels
A-align Area: alignment control (`repr(align(N))` and so on) C-bug Category: This is a bug. O-wasm Target: WASM (WebAssembly), http://webassembly.org/ T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@V0ldek
Copy link
Contributor

V0ldek commented Dec 7, 2024

The u128 and i128 types on the wasm targets has alignment of 8. This is inconsistent with the alignment on all Tier-1 targets.

Compile this code:

#![crate_type = "cdylib"]
#[no_mangle]
pub extern "C" fn align_of_u128() -> usize {
    std::mem::align_of::<u128>()
}

Godbolt link.

The body of this function on all Tier-1 targets is "return 16", while on wasm32-unknown-unknown, wasm32-unknown-emscripten, and wasm32-wasi it compiles to "return 8".

This causes arrays that are aligned on wasm to no longer be aligned when pointers are passed through FFI to the host. For example, creating a Vec<u128> in wasm and returning it as slice to the host causes misalignment.

I am not sure if this is a compiler bug or if this is for some reason intentional, but it is very annoying. A workaround is to declare your own this wrapper to force alignment and use it throughout:

#[repr(C, align(16))]
struct au128(u128);

however that doesn't play well with external libraries that use u128.

Meta

This happens both on stable 1.83 (see the Godbolt link above) and my nightly from 2024-12-05

rustc --version --verbose:

rustc 1.85.0-nightly (c94848c04 2024-12-05)
binary: rustc
commit-hash: c94848c046d29f9a80c09aae758e27e418a289f2
commit-date: 2024-12-05
host: x86_64-unknown-linux-gnu
release: 1.85.0-nightly
LLVM version: 19.1.5
@V0ldek V0ldek added the C-bug Category: This is a bug. label Dec 7, 2024
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Dec 7, 2024
@jieyouxu jieyouxu added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. O-wasm Target: WASM (WebAssembly), http://webassembly.org/ A-align Area: alignment control (`repr(align(N))` and so on) and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Dec 7, 2024
@hanna-kruppe
Copy link
Contributor

hanna-kruppe commented Dec 7, 2024

Quite a few non-wasm targets have eight byte alignment for u128 and i128 (e.g., 32-bit Arm and RISC-V targets). I don't know of any that are tier 1 right now, but tiers can change over time, tier 1 is a very high bar, and many tier 2 targets are widely used. In fact, some tier 1 targets had 8-byte-aligned u128 until earlier this year. So I would strongly advice against assuming any particular alignment for those types.

That said, in general i128 and u128 should be interoperable with C's 128-bit integer types when they exist. I don't know when Clang started supporting these types in wasm, but WebAssembly/tool-conventions#240 indicates that they're supported now with 16 byte alignment. So Rust may have to increase the alignment of these types on the wasm targets anyway for interop with C code that's also compiled to wasm. That doesn't change the fact that non-wasm code shouldn't assume it agrees with wasm code on the alignment.

A workaround is to declare your own this wrapper to force alignment and use it throughout:

Another solution is to use u128 on the wasm side but not on the host side, at least not for talking about slices pointing into wasm memory. You can avoid assuming any alignment for things in wasm memory and do unaligned reads via uN::from_le_bytes. Explicitly using little endian byte order also makes the host code portable to big endian platforms (wasm is always little-endian).

@V0ldek
Copy link
Contributor Author

V0ldek commented Dec 7, 2024

The reason I mention Tier 1 targets is that wasm is a bit "special", especially wasi, in that it often needs to interop with host functions. Different alignments between x86 and idk PowerPC are unlikely to matter, whereas wasm is by necessity embedded in another target.

@hanna-kruppe
Copy link
Contributor

hanna-kruppe commented Dec 7, 2024

Sure, but the host code isn't limited to tier 1 Rust targets. As a concrete example, wasmtime explicitly supports several targets that are or were tier 2 in rustc (currently Linux s390x and Linux riscv64, also macOS aarch64 was only promoted to Rust tier 1 recently). Rust target tiers are basically never relevant for "do I need to worry about this target when writing portable code?" because the tiers don't have any inherent meaning, they only reflect how much a target is currently tested in rust-lang/rust CI.

@workingjubilee
Copy link
Member

Yes.

We do not intend to alter our wasm ABI conformance based on non-wasm targets.

We do have to alter the alignment to match the ABI documented by tool-conventions.

@workingjubilee
Copy link
Member

workingjubilee commented Dec 8, 2024

cc @sunfishcode and @alexcrichton Should this be done upstream in LLVM? Like was done in llvm/llvm-project@dbad963 for SPARC.

sunfishcode added a commit to sunfishcode/llvm-project that referenced this issue Dec 9, 2024
Clang [defaults to aligning `__int128_t` to 16 bytes], while LLVM
`datalayout` strings [default to aligning `i128` to 8 bytes]. Wasm is
currently using the defaults for both, so it's inconsistent. Fix this
by adding `-i128:128` to Wasm's `datalayout` string so that it aligns
`i28` to 16 bytes too.

This is similar to dbad963 for SPARC.

This fixes rust-lang/rust#133991; see that issue for further discussion.

[defaults to aligning `__int128_t` to 16 bytes]: https://github.com/llvm/llvm-project/blob/f8b4182f076f8fe55f9d5f617b5a25008a77b22f/clang/lib/Basic/TargetInfo.cpp#L77
[default to aligning `i128` to 8 bytes]: https://llvm.org/docs/LangRef.html#langref-datalayout
sunfishcode added a commit to sunfishcode/llvm-project that referenced this issue Dec 9, 2024
Clang [defaults to aligning `__int128_t` to 16 bytes], while LLVM
`datalayout` strings [default to aligning `i128` to 8 bytes]. Wasm is
currently using the defaults for both, so it's inconsistent. Fix this
by adding `-i128:128` to Wasm's `datalayout` string so that it aligns
`i28` to 16 bytes too.

This is similar to llvm/llvm-project@dbad963 for SPARC.

This fixes rust-lang/rust#133991; see that issue for further discussion.

[defaults to aligning `__int128_t` to 16 bytes]: https://github.com/llvm/llvm-project/blob/f8b4182f076f8fe55f9d5f617b5a25008a77b22f/clang/lib/Basic/TargetInfo.cpp#L77
[default to aligning `i128` to 8 bytes]: https://llvm.org/docs/LangRef.html#langref-datalayout
sunfishcode added a commit to sunfishcode/llvm-project that referenced this issue Dec 9, 2024
Clang [defaults to aligning `__int128_t` to 16 bytes], while LLVM
`datalayout` strings [default to aligning `i128` to 8 bytes]. Wasm is
currently using the defaults for both, so it's inconsistent. Fix this
by adding `-i128:128` to Wasm's `datalayout` string so that it aligns
`i128` to 16 bytes too.

This is similar to llvm/llvm-project@dbad963 for SPARC.

This fixes rust-lang/rust#133991; see that issue for further discussion.

[defaults to aligning `__int128_t` to 16 bytes]: https://github.com/llvm/llvm-project/blob/f8b4182f076f8fe55f9d5f617b5a25008a77b22f/clang/lib/Basic/TargetInfo.cpp#L77
[default to aligning `i128` to 8 bytes]: https://llvm.org/docs/LangRef.html#langref-datalayout
@sunfishcode
Copy link
Member

sunfishcode commented Dec 9, 2024

It seems the situation is that Clang defaults to aligning __int128_t to 16 bytes, while LLVM datalayout strings default to aligning i128 to 8 bytes. Wasm is currently using the defaults for both, so it's inconsistent.

It appears the fix for SPARC was to add an explicit i128 entry to the datalayout string to override the default. That sounds like it would be a reasonable fix for Wasm too.

Another option would be to change clang, to align __int128_t to 8 bytes. Wasm doesn't have a i128 type and seems unlikely to add one for the foreseeable future (there is a discussion about i128 arithmetic, but that would use pairs of i64s). That would be a C ABI break, but __int128_t isn't that widely used, and technically it's not documented ABI yet, as WebAssembly/tool-conventions#240 hasn't landed yet. However, if Wasm ever adds a 128-bit atomic, it seems likely to require 16-byte alignment, since aarch64 and x86_64 both do, and clang would want to use it for _Atomic __int128_t and std::atomic<__int128_t>, and I don't know of any precedent for having alignment differ between atomic and non-atomic types.

So overall, adding -i128:128 to Wasm's datalayout string sounds like the best path to me. I've now posted llvm/llvm-project#119204 with a patch similar to llvm/llvm-project@dbad963, for WebAssembly.

@workingjubilee
Copy link
Member

Thanks!

sunfishcode added a commit to sunfishcode/llvm-project that referenced this issue Dec 9, 2024
Clang [defaults to aligning `__int128_t` to 16 bytes], while LLVM
`datalayout` strings [default to aligning `i128` to 8 bytes]. Wasm is
currently using the defaults for both, so it's inconsistent. Fix this
by adding `-i128:128` to Wasm's `datalayout` string so that it aligns
`i128` to 16 bytes too.

This is similar to llvm/llvm-project@dbad963
for SPARC.

This fixes rust-lang/rust#133991; see that issue for further discussion.

[defaults to aligning `__int128_t` to 16 bytes]: https://github.com/llvm/llvm-project/blob/f8b4182f076f8fe55f9d5f617b5a25008a77b22f/clang/lib/Basic/TargetInfo.cpp#L77
[default to aligning `i128` to 8 bytes]: https://llvm.org/docs/LangRef.html#langref-datalayout
sunfishcode added a commit to sunfishcode/llvm-project that referenced this issue Dec 10, 2024
Clang [defaults to aligning `__int128_t` to 16 bytes], while LLVM
`datalayout` strings [default to aligning `i128` to 8 bytes]. Wasm is
currently using the defaults for both, so it's inconsistent. Fix this
by adding `-i128:128` to Wasm's `datalayout` string so that it aligns
`i128` to 16 bytes too.

This is similar to llvm/llvm-project@dbad963
for SPARC.

This fixes rust-lang/rust#133991; see that issue for further discussion.

[defaults to aligning `__int128_t` to 16 bytes]: https://github.com/llvm/llvm-project/blob/f8b4182f076f8fe55f9d5f617b5a25008a77b22f/clang/lib/Basic/TargetInfo.cpp#L77
[default to aligning `i128` to 8 bytes]: https://llvm.org/docs/LangRef.html#langref-datalayout
sunfishcode added a commit to llvm/llvm-project that referenced this issue Dec 10, 2024
Clang [defaults to aligning `__int128_t` to 16 bytes], while LLVM
`datalayout` strings [default to aligning `i128` to 8 bytes]. Wasm is
currently using the defaults for both, so it's inconsistent. Fix this by
adding `-i128:128` to Wasm's `datalayout` string so that it aligns
`i128` to 16 bytes too.

This is similar to
[dbad963](dbad963)
for SPARC.

This fixes rust-lang/rust#133991; see that issue for further discussion.

[defaults to aligning `__int128_t` to 16 bytes]:
https://github.com/llvm/llvm-project/blob/f8b4182f076f8fe55f9d5f617b5a25008a77b22f/clang/lib/Basic/TargetInfo.cpp#L77
[default to aligning `i128` to 8 bytes]:
https://llvm.org/docs/LangRef.html#langref-datalayout
broxigarchen pushed a commit to broxigarchen/llvm-project that referenced this issue Dec 10, 2024
Clang [defaults to aligning `__int128_t` to 16 bytes], while LLVM
`datalayout` strings [default to aligning `i128` to 8 bytes]. Wasm is
currently using the defaults for both, so it's inconsistent. Fix this by
adding `-i128:128` to Wasm's `datalayout` string so that it aligns
`i128` to 16 bytes too.

This is similar to
[llvm/llvm-project@dbad963](llvm@dbad963)
for SPARC.

This fixes rust-lang/rust#133991; see that issue for further discussion.

[defaults to aligning `__int128_t` to 16 bytes]:
https://github.com/llvm/llvm-project/blob/f8b4182f076f8fe55f9d5f617b5a25008a77b22f/clang/lib/Basic/TargetInfo.cpp#L77
[default to aligning `i128` to 8 bytes]:
https://llvm.org/docs/LangRef.html#langref-datalayout
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-align Area: alignment control (`repr(align(N))` and so on) C-bug Category: This is a bug. O-wasm Target: WASM (WebAssembly), http://webassembly.org/ T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants