Add `@depositBits` and `@extractBits` builtins #15285

ominitay · 2023-04-14T15:26:19Z

This PR implements the `@depositBits` and `@extractBits` builtins, which correspond to the `pdep` and `pext` instructions in the x86 BMI2 extension (see #14995). On architectures where these instructions are unavailable, their behaviour is emulated; the same emulation applies to integers wider than the instructions support.
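For readers unfamiliar with `pdep` and `pext`, here is a minimal software sketch of what the two operations compute on `u64`. The names `depositBitsRef` and `extractBitsRef` are hypothetical and are not the PR's code; this is a bit-at-a-time reference model, not the emulation that the backends emit.

```zig
const std = @import("std");

/// Hypothetical reference model of PDEP: scatter the low bits of `source`
/// into the positions of the set bits of `mask`, lowest mask bit first.
fn depositBitsRef(source: u64, mask: u64) u64 {
    var result: u64 = 0;
    var remaining_source = source;
    var remaining_mask = mask;
    while (remaining_mask != 0) {
        // Isolate the lowest set bit still left in the mask.
        const lowest = remaining_mask & (~remaining_mask +% 1);
        if ((remaining_source & 1) != 0) result |= lowest;
        remaining_source >>= 1;
        remaining_mask &= remaining_mask - 1; // clear that mask bit
    }
    return result;
}

/// Hypothetical reference model of PEXT: gather the bits of `source` selected
/// by `mask` and pack them contiguously into the low bits of the result.
fn extractBitsRef(source: u64, mask: u64) u64 {
    var result: u64 = 0;
    var next_output_bit: u64 = 1;
    var remaining_mask = mask;
    while (remaining_mask != 0) {
        const lowest = remaining_mask & (~remaining_mask +% 1);
        if ((source & lowest) != 0) result |= next_output_bit;
        next_output_bit <<= 1;
        remaining_mask &= remaining_mask - 1;
    }
    return result;
}

test "reference deposit and extract on a small mask" {
    // The mask selects bits 1, 3, 5 and 7.
    try std.testing.expectEqual(@as(u64, 0b0010_1000), depositBitsRef(0b0110, 0b1010_1010));
    try std.testing.expectEqual(@as(u64, 0b0110), extractBitsRef(0b0010_1000, 0b1010_1010));
}
```

A model like this is slow, but it is easy to check by hand, which makes it a candidate known-good implementation to compare against.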

As far as I have observed, this implementation functions correctly. Each backend will need to implement emulation for the two builtins, and can optionally use dedicated instructions where available.

This is my first contribution to the compiler itself, so my code may well be non-idiomatic, or just plain terrible, so please take care to fully analyse what it is actually doing :) In particular, my big.int code probably needs plenty of critique and modification. Additionally, I'm not entirely confident about my additions in Liveness.zig.

I'll rebase this code onto master and clean up my commits once it is ready to merge.

To-do

Closes #14995

matu3ba

Mainly some ideas on how this PR could be better tested.

doc/langref.html.in

lib/std/math/big/int.zig

src/codegen/llvm.zig

matu3ba · 2023-04-14T20:21:26Z

lib/std/math/big/int_test.zig

@@ -2671,6 +2671,87 @@ fn popCountTest(val: *const Managed, bit_count: usize, expected: usize) !void {
    try testing.expectEqual(expected, val.toConst().popCount(bit_count));
 }

+test "big int extractBits" {


Some possible tests to gain more confidence that everything is correct:

- Minimum and maximum supported type sizes.
- 2- and 3-limb sizes.
- Random input, with a varying seed on your PC and a constant seed in CI, for both the mask and the original number: extract one bit sequence, then negate the mask to extract the other bit sequence; OR-ing the extracted bit sequences must be identical to the original.
- Same principle for deposit: set a random bit sequence, then negate the mask to set the other one; OR-ing the results must be identical to the original or destination number. Not sure how best to create the destination number.
- Unclear and open (not feasible to solve in this PR): how to get structured testing of UUM for bigint and bigfloat.

Random input can be taken like this:

    const RndGen = std.rand.DefaultPrng;
    var rnd = RndGen.init(42);
    var i: u32 = 0;
    while (i < 10_000) : (i += 1) {
        var rand_num = rnd.random().int(i64);
        try test__popcountdi2(rand_num);
    }
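One way to make the round-trip idea above precise is to deposit each extracted sequence back under the mask that produced it before OR-ing the two halves. The sketch below does this on `u64` with a fixed seed. Everything in it is an assumption made for illustration: `depositBitsRef`, `extractBitsRef` and `nextRandom` are hypothetical helpers, and a small inline SplitMix64 generator stands in for the standard library PRNG so that the example does not depend on where `std` places it in a given Zig version.

```zig
const std = @import("std");

// Hypothetical bit-at-a-time reference models; not the PR's code.
fn depositBitsRef(source: u64, mask: u64) u64 {
    var result: u64 = 0;
    var src = source;
    var m = mask;
    while (m != 0) {
        const lowest = m & (~m +% 1); // lowest set bit of the mask
        if ((src & 1) != 0) result |= lowest;
        src >>= 1;
        m &= m - 1;
    }
    return result;
}

fn extractBitsRef(source: u64, mask: u64) u64 {
    var result: u64 = 0;
    var out_bit: u64 = 1;
    var m = mask;
    while (m != 0) {
        const lowest = m & (~m +% 1);
        if ((source & lowest) != 0) result |= out_bit;
        out_bit <<= 1;
        m &= m - 1;
    }
    return result;
}

// SplitMix64 step, used only to get reproducible pseudo-random inputs.
fn nextRandom(state: *u64) u64 {
    state.* +%= 0x9E3779B97F4A7C15;
    var z = state.*;
    z = (z ^ (z >> 30)) *% 0xBF58476D1CE4E5B9;
    z = (z ^ (z >> 27)) *% 0x94D049BB133111EB;
    return z ^ (z >> 31);
}

test "extract then deposit round-trips under a mask and its complement" {
    var state: u64 = 42; // constant seed, as suggested for CI
    var i: u32 = 0;
    while (i < 10_000) : (i += 1) {
        const source = nextRandom(&state);
        const mask = nextRandom(&state);
        const kept = depositBitsRef(extractBitsRef(source, mask), mask);
        const rest = depositBitsRef(extractBitsRef(source, ~mask), ~mask);
        try std.testing.expectEqual(source & mask, kept);
        try std.testing.expectEqual(source, kept | rest);
    }
}
```

As written, the two reference models only check each other. To test a real implementation, one side of each call would be swapped for the function under test.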

Yeah, the issue with creating the destination value is that we need a known-good implementation to generate it. I guess we could just write this in regular Zig, though, and convert the result to a bigint.

I'm actually a bit skeptical of this now, since the actual opportunities for mistakes here would be around having multiple limbs. Generating random u64s likely wouldn't help us catch these, would it?

> Generating random u64s likely wouldn't help us catch these, would it?

I would expect the main problems to be in either the bit mask logic or the limbs not getting properly initialized. My suggestion was more of a brute-force approach, but listing the edge cases should also work.

Hmm. I think we should look at a more reasoned set of tests then. The current ones effectively just test the basic bitmask logic; it'd be good to add some more to ensure that the boundaries between limbs are handled correctly, and that operands of different sizes are handled correctly. I'm fairly confident in both of these, but it definitely can't hurt to strengthen this.

Uninitialised limbs aren't a worry though, since the output bigint is set to zero at the beginning.
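The two edge cases named above (limb boundaries and operands of different sizes) can be sketched with `u128` standing in for a two-limb big integer. This assumes 64-bit limbs and uses hypothetical generic reference helpers. It does not call the PR's `std.math.big.int` functions, so it only shows the shape of the cases and the expected values.

```zig
const std = @import("std");

// Hypothetical reference models, generic over an unsigned integer type.
fn depositBitsRef(comptime T: type, source: T, mask: T) T {
    var result: T = 0;
    var src = source;
    var m = mask;
    while (m != 0) {
        const lowest = m & (~m +% 1); // lowest set bit of the mask
        if ((src & 1) != 0) result |= lowest;
        src >>= 1;
        m &= m - 1;
    }
    return result;
}

fn extractBitsRef(comptime T: type, source: T, mask: T) T {
    var result: T = 0;
    var out_bit: T = 1;
    var m = mask;
    while (m != 0) {
        const lowest = m & (~m +% 1);
        if ((source & lowest) != 0) result |= out_bit;
        out_bit <<= 1;
        m &= m - 1;
    }
    return result;
}

test "mask straddling the 64-bit limb boundary" {
    const mask: u128 = @as(u128, 0xFF) << 60; // bits 60..67 span both limbs
    const source: u128 = @as(u128, 0xA5) << 60;
    try std.testing.expectEqual(@as(u128, 0xA5), extractBitsRef(u128, source, mask));
    try std.testing.expectEqual(source, depositBitsRef(u128, 0xA5, mask));
}

test "operands that occupy different numbers of limbs" {
    const source: u128 = 0xFFFF_FFFF_FFFF_FFFF; // fits in one 64-bit limb
    const mask: u128 = (@as(u128, 0xFF) << 64) | 1; // reaches into a second limb
    // Only bit 0 of the source sits under a set mask bit; bits 64..71 are zero.
    try std.testing.expectEqual(@as(u128, 1), extractBitsRef(u128, source, mask));
    // The mask has nine set bits and the low nine source bits are all ones.
    try std.testing.expectEqual(mask, depositBitsRef(u128, source, mask));
}
```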

matu3ba · 2023-04-14T21:39:51Z

I think there is a `zig fmt` issue in llvm.zig:

2023-04-14T21:15:45.4950062Z + stage3-release/bin/zig fmt --check .. --exclude ../test/cases/ --exclude ../build-release
2023-04-14T21:15:45.4950642Z ../src/codegen/llvm.zig
2023-04-14T21:15:45.5071189Z ##[error]Process completed with exit code 1.

ominitay · 2023-04-14T22:24:49Z

Huh, surprised Emacs didn't automatically fix that.

ominitay · 2023-04-17T20:59:43Z

New commit removes limbs_buffer requirement from big.int functions. It gets the job done, but I don't particularly like it. Would appreciate advice on cleaning that up.

ominitay · 2023-04-18T09:03:27Z

After thinking further about how signed ints are handled, there is probably an accidental sign extension when a signed int parameter has fewer bits than the other parameter: the signed int will coerce up to the wider type and be sign-extended in the process. I will test this later, add a behaviour test, and change the Sema code to effectively do an implicit bitcast of the arguments to unsigned before coercing them.
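The sign-extension concern can be reproduced without the builtins at all. The sketch below assumes Zig 0.11 or later, where `@bitCast` infers its result type from the destination. The variable names are illustrative only.

```zig
const std = @import("std");

test "widening a signed mask sign-extends it" {
    const narrow_mask: i8 = -16; // bit pattern 0b1111_0000

    // Implicit widening keeps the numeric value, so the upper bits fill with ones.
    const coerced: i64 = narrow_mask;
    const coerced_bits: u64 = @bitCast(coerced);

    // Reinterpreting as unsigned first and then widening zero-extends instead.
    const narrow_bits: u8 = @bitCast(narrow_mask);
    const zero_extended: u64 = narrow_bits;

    try std.testing.expectEqual(@as(u64, 0xFFFF_FFFF_FFFF_FFF0), coerced_bits);
    try std.testing.expectEqual(@as(u64, 0xF0), zero_extended);
}
```

With the sign-extended value used as a mask, 60 bits would be selected rather than the intended 4.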

ominitay · 2023-04-18T10:51:21Z

I see two choices here. The two builtins can do an implicit bitcast to get the semantics we want, or we can leave the bitcast to the caller. I personally think the conversion from signed -> unsigned should be handled by the builtins, as there's no intuitive reason why passing i64 for example shouldn't work.

ominitay · 2023-04-18T10:52:30Z

Not sure why macos-debug tests failed there too.

ominitay · 2023-04-18T17:50:27Z

Decided to disallow using signed integers at all with these builtins. This simplifies the implementation greatly (makes me less likely to break things, and probably makes it faster), and just makes more sense; we're dealing with raw bits after all.

andrewrk · 2023-04-18T21:21:01Z

It looks like the ReleaseFast builds that link libc and those that do not are racing each other to write the cache manifest, as both fail with the exact same error message:

fixed by #15351

ominitay · 2023-04-19T10:23:04Z

Forgot to enable the behaviour tests >_<

ominitay · 2023-06-17T18:01:48Z

Starting to come back to this now. One possible thought is that the compiler itself currently doesn't actually utilise these builtins, when it could use them for a subset of types at comptime. Obviously not a priority, but could perhaps be worthwhile to consider in the future?

This change implements depositBits and extractBits (equivalents of PDEP and PEXT) for Zig's big ints, laying the groundwork for the implementation of `@depositBits` and `@extractBits`. Tests have been added to check the behaviour of these two functions. The functions currently don't handle negative values (though negative values may be converted to two's complement externally), and aren't optimal in either memory use or performance.

andrewrk · 2023-10-19T01:05:43Z

It looks like this PR never reached "ready for review & merge" status. It's now bitrotted, so if you wish to continue this effort, please open a new PR against master branch.

ominitay · 2023-10-19T18:21:12Z

Will do! Have had a lot of commitments irl, but I'll try to find the time in a month-ish.

andrewrk · 2023-10-19T18:50:30Z

No pressure. Happy to take a look when you decide it is ready 👍

matu3ba reviewed Apr 14, 2023

ominitay force-pushed the pdeppext branch from 59b32a2 to 865ee43 Compare April 17, 2023 20:55

ominitay force-pushed the pdeppext branch 2 times, most recently from c17db00 to 61a14a6 Compare April 18, 2023 17:39

ominitay force-pushed the pdeppext branch from 61a14a6 to c0a3475 Compare April 18, 2023 17:52

ominitay force-pushed the pdeppext branch from 14f5b6e to 19abab9 Compare June 18, 2023 12:44

jacobly0 mentioned this pull request Jun 20, 2023

wasm: fix decl alignment #16103

Merged

ominitay added 12 commits June 20, 2023 13:16

std.math.big.int: Conversion from 2's complement

9862b07

Implements std.math.big.int.Mutable.convertFromTwosComplement, to match convertToTwosComplement.

Write docs for @depositBits and @extractBits

ad8bff8

Implement @depositBits and @extractBits

befa47f

Incomplete: currently only implemented for 64-bit-or-smaller integers for x86(-64) in the LLVM backend.

LLVM: Implement emulation for @depositBits

70d2dc4

LLVM: Implement emulation for @extractBits

98f70c0

std.math.big.int: Fix index out-of-bounds

472734e

Add behaviour tests for @depositBits and @extractBits

a4ae063

zig fmt

fb1cb9f

Replace u6 with Log2Limb

d4da312

big.int.depositBits/extractBits: Remove limbs_buffer

3015a6a

Removes the requirement to copy and modify `mask`, removing the need to clone `mask` into a `Mutable` bigint.

Disallow signed integer types for deposit/extract

c13a54f

ominitay added 6 commits June 20, 2023 13:16

Actually use deposit/extract behaviour test

b2bba7a

Enable langref tests for deposit and extract

3792b6e

Allow use of comptime_int with deposit/extract

1843795

Improve compile errors for negative values

af42a76

update comments

1e8c707

Bring changes up-to-date with master

3d6e308

jacobly0 force-pushed the pdeppext branch from 19abab9 to 3d6e308 Compare June 20, 2023 17:25

andrewrk closed this Oct 19, 2023

ominitay mentioned this pull request Jan 25, 2024

Implement @depositBits and @extractBits #18680

Closed
