Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Perform NaN randomization as part of the interpretation of float bits #279

Conversation

sunfishcode
Copy link
Member

This removes the THE_HOST_WANTS_TO option, which appeared to give hosts permission to interpret NaN bits how they "want to". The change here is to say that all core-wasm NaN bitpatterns are interpreted as the same component-model NaN value.

It's my understanding that the random_nan_bits function isn't meant to be the precise algorithm that nondeterministic-profile implementations must use, so this doesn't require hosts to do any new randomization work.

This change also fixes what appears to be a bug:
lift_flat_variant/lower_flat_variant were calling reinterpret_i32_as_float/reinterpret_float_as_i32 without performing NaN scrambling. By making NaN scrambling be part of interpretation, we ensure that it's performed anywhere interpretation is performed.

@lukewagner
Copy link
Member

Good catch on the missing maybe_scramble_nan* in the flat variant cases, we should fix that. However, I don't think this PR causes "all core-wasm NaN bitpatterns are interpreted as the same component-model NaN value" since it still produces semantically distinct NaN values upon lifting and lowering. Moreover, although your change is technically equivalent (saying you can non-deterministically choose to non-deterministically scramble NaN payloads is the same as saying that you can non-deterministically scramble NaN values), I think it's useful to make it clear to the reader that the host is allowed by the spec to simply propagate the NaN bitpattern; this just avoids confusion, so I'd rather keep THE_HOST_WANTS_TO.

This removes the `THE_HOST_WANTS_TO` option, which appeared to give hosts
permission to interpret NaN bits how they "want to". The change here is to
say that all core-wasm NaN bitpatterns are interpreted as the same
component-model NaN value.

It's my understanding that the `random_nan_bits` function isn't meant
to be the precise algorithm that nondeterministic-profile implementations
must use, so this doesn't require hosts to do any new randomization work.

This change also fixes what appears to be a bug:
`lift_flat_variant`/`lower_flat_variant` were calling
`reinterpret_i32_as_float`/`reinterpret_float_as_i32` without performing
NaN scrambling. By making NaN scrambling be part of interpretation, we
ensure that it's performed anywhere interpretation is performed.
@sunfishcode sunfishcode force-pushed the sunfishcode/variant-float-nan-scrambling branch from 6a34d7b to af62ee5 Compare December 19, 2023 20:18
@sunfishcode
Copy link
Member Author

However, I don't think this PR causes "all core-wasm NaN bitpatterns are interpreted as the same component-model NaN value" since it still produces semantically distinct NaN values upon lifting and lowering.

I've now changed the PR to canonicalize on lifting, and scramble on lowering. This better illustrates how "all core-wasm NaN bitpatterns are interpreted as the same component-model NaN value".

I think it's useful to make it clear to the reader that the host is allowed by the spec to simply propagate the NaN bitpattern; this just avoids confusion, so I'd rather keep THE_HOST_WANTS_TO.

It's my understanding that your goal here is to avoid confusion among hosts about "do I need to randomize?"

My goal here is to avoid the confusion among hosts about "is it ok for me to interpret the bits, if I «want to»?" If they do that, they break virtualization in a subtle way.

Also, if we can say there's only one component-model NaN value, that would make it clear why wave doesn't need nan payload syntax.

I've also now added comments to this PR to call out how hosts can avoid skip canonicalization and randomization overhead entirely in practice.

Copy link
Member

@lukewagner lukewagner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like this idea of saying that there is only one "semantic" NaN value -- it's just that lowering scrambles and oh hey, it's technically valid not to canonicalize if you don't want to. Nice and thanks for the consideration!

Just stylistic nits:

design/mvp/CanonicalABI.md Outdated Show resolved Hide resolved
design/mvp/CanonicalABI.md Outdated Show resolved Hide resolved
design/mvp/CanonicalABI.md Outdated Show resolved Hide resolved
@lukewagner
Copy link
Member

Thanks again for thinking through this!

@lukewagner lukewagner merged commit aba0746 into WebAssembly:main Dec 21, 2023
1 check passed
@sunfishcode sunfishcode deleted the sunfishcode/variant-float-nan-scrambling branch December 21, 2023 16:42
sunfishcode added a commit to sunfishcode/wave that referenced this pull request Jan 4, 2024
With WebAssembly/component-model#279, component-model floating-point
types have a single NaN value. Canonicalization isn't required, and
may be omitted as an optimization, but users shouldn't depend on NaN
bitpatterns being preserved.

To help users avoid depending on NaN bitpatterns being preserved as
they propagate through component-model values, canonicalize NaN values
in Wave.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants