Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add @depositBits and @extractBits builtins #15285

Closed
wants to merge 18 commits into from

Conversation

ominitay
Copy link
Contributor

@ominitay ominitay commented Apr 14, 2023

This PR implements the @depositBits and @extractBits builtins, which correspond to the pdep and pext instructions in the x86 BMI2 extension (see #14995). On architectures where these instructions are unavailable, their behaviour is emulated, as it is for integers larger than their otherwise supported sizes.

This implementation functions correctly, to my observations. Each backend will need to implement an emulation for the two builtins, and optionally leverage dedicated instructions where available.

This is my first contribution to the compiler itself, so my code may well be non-idiomatic, or completely terrible just in general, so please be careful to fully analyse what my code is actually doing :) Particularly, my big.int probably needs plenty of critique and modification. Additionally, I'm not actually quite confident about my additions in Liveness.zig.

I'll rebase this code onto master and clean up my commits once it is ready to merge.

To-do

  • Look at tidying up and possibly optimising the bigint algorithms.
  • Use llvm.ppc.pdepd and pextd when available.
  • Consider how the available 64-bit pext/pdep instructions could be used to compute values for larger integers (u128, etc).
  • Avoid mutating the mask in the bigint representation, so that we can avoid allocating a buffer and copying into it
  • Improve documentation
    • Allowed types
  • Review general code quality, this is my first compiler contribution after all.
  • Implement in the other (non-LLVM) backends (could be done after this PR is merged).
  • Improve test coverage, testing against random numbers.
  • Replace use of std.mem.set with @memset.

Closes #14995

Copy link
Contributor

@matu3ba matu3ba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mainly some ideas how this PR could be better tested.

doc/langref.html.in Show resolved Hide resolved
lib/std/math/big/int.zig Outdated Show resolved Hide resolved
lib/std/math/big/int.zig Outdated Show resolved Hide resolved
src/codegen/llvm.zig Outdated Show resolved Hide resolved
@@ -2671,6 +2671,87 @@ fn popCountTest(val: *const Managed, bit_count: usize, expected: usize) !void {
try testing.expectEqual(expected, val.toConst().popCount(bit_count));
}

test "big int extractBits" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some possible tests to have more confidence everything is correct:

  • min + max supported sized type
  • 2 and 3 limb sizes
  • random input on your pc for varying (in CI with constant) seed (for both mask and original number): extract bit sequence, negate mask to extract the other bit sequence, applying or of extracted bit sequences must be identical to original.
  • same principle: set random bit sequence, negate mask to set the other one, applying or must be identical to original or destination number. Not sure how to best create destination number.
  • unclear and open (not feasible to solve with this PR): how to get structured testing of UUM for bigint and bigfloat

Random input can be taken like this:

const RndGen = std.rand.DefaultPrng;
    var rnd = RndGen.init(42);
    var i: u32 = 0;
    while (i < 10_000) : (i += 1) {
        var rand_num = rnd.random().int(i64);
        try test__popcountdi2(rand_num);
    }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah issue with creating the destination value is that we need to have a known-good implementation to generate it. I guess we could just write this in regular Zig though and converting to a bigint.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm actually a bit skeptical of this now, since the actual opportunities for mistakes here would be around having multiple limbs. Generating random u64s likely wouldn't help us catch these, would it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generating random u64s likely wouldn't help us catch these, would it?

I would expect the main problems to be in either the bit mask logic or the limbs not getting properly initialized. My suggestion was more of an brute-force approach, but listing the edge cases should also work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I think we should look at a more reasoned set of tests then. The current ones effectively just test the basic bitmask logic, it'd be good to add some more to ensure that the boundaries between limbs are handled correctly, and that a difference in sizes is handled correctly. I'm fairly confident in both of these, but definitely can't hurt to strengthen this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uninitialised limbs aren't a worry though, since the output bigint is set to zero at the beginning.

@matu3ba
Copy link
Contributor

matu3ba commented Apr 14, 2023

zig fmt issue I think in llvm.zig:

2023-04-14T21:15:45.4950062Z + stage3-release/bin/zig fmt --check .. --exclude ../test/cases/ --exclude ../build-release
2023-04-14T21:15:45.4950642Z ../src/codegen/llvm.zig
2023-04-14T21:15:45.5071189Z ##[error]Process completed with exit code 1.

@ominitay
Copy link
Contributor Author

Huh, suprised Emacs didn't automatically fix that.

@ominitay
Copy link
Contributor Author

New commit removes limbs_buffer requirement from big.int functions. It gets the job done, but I don't particularly like it. Would appreciate advice on cleaning that up.

@ominitay
Copy link
Contributor Author

After further thinking about how signed ints are handled, there is probably an accidental sign extension if passing a signed int parameter with fewer bits than the other parameter -- the signed int will coerce up, causing a sign-extension accidentally. Will test this later, add a behaviour test, and change the Sema code to effectively do an implicit bitcast to unsigned arguments before coercing them.

@ominitay
Copy link
Contributor Author

I see two choices here. The two builtins can do an implicit bitcast to get the semantics we want, or we can leave the bitcast to the caller. I personally think the conversion from signed -> unsigned should be handled by the builtins, as there's no intuitive reason why passing i64 for example shouldn't work.

@ominitay
Copy link
Contributor Author

Not sure why macos-debug tests failed there too.

@matu3ba

This comment was marked as resolved.

@ominitay ominitay force-pushed the pdeppext branch 2 times, most recently from c17db00 to 61a14a6 Compare April 18, 2023 17:39
@ominitay
Copy link
Contributor Author

Decided to disallow using signed integers at all with these builtins. This simplifies the implementation greatly (makes me less likely to break things, and probably makes it faster), and just makes more sense; we're dealing with raw bits after all.

@andrewrk
Copy link
Member

Looks like linking libc and not linking libc for ReleaseFast are racing at writing the cache manifest against each other, as both fail with the very exact error message:

fixed by #15351

@ominitay
Copy link
Contributor Author

Forgot to enable the behaviour tests >_<

@ominitay
Copy link
Contributor Author

Starting to come back to this now. One possible thought is that the compiler itself currently doesn't actually utilise these builtins, when it could use them for a subset of types at comptime. Obviously not a priority, but could perhaps be worthwhile to consider in the future?

ominitay added 12 commits June 20, 2023 13:16
This change implements depositBits and extractBits (equivalents of PDEP
and PEXT) for Zig's bit ints. This change lays the groundwork for
implementation of `@depositBits` and `@extractBits`.

Tests have been added to check the behaviour of these two functions.

The functions currently don't handle negative values (though negative
values may be converted to twos complement externally), and
aren't optimal in either memory or performance.
Implements std.math.big.int.Mutable.convertFromTwosComplement, to match
convertToTwosComplement.
Incomplete: currently only implemented for 64-bit-or-smaller integers
for x86(-64) in the LLVM backend.
Removes the requirement to copy and modify `mask`, removing the need to
clone `mask` into a `Mutable` bigint.
@andrewrk
Copy link
Member

It looks like this PR never reached "ready for review & merge" status. It's now bitrotted, so if you wish to continue this effort, please open a new PR against master branch.

@andrewrk andrewrk closed this Oct 19, 2023
@ominitay
Copy link
Contributor Author

Will do! Have had a lot of commitments irl, but I'll try to find the time in a month-ish.

@andrewrk
Copy link
Member

No pressure. Happy to take a look when you decide it is ready 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add @pext and @pdep builtins
3 participants