Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix overflow exception in BitArray #8494

Conversation

jimbowhales
Copy link

(1 << count) - 1 underflows when count = 31 and type is Int32.

@straight-shoota
Copy link
Member

Could you please also add a test case that fails without the change?

@jimbowhales jimbowhales force-pushed the whales/bitarray-overflow-fix branch from d904778 to 4b94331 Compare November 20, 2019 13:57
@straight-shoota
Copy link
Member

I'd prefer specific test that not only ensure that the call does not fail but also returns the correct value.

@jimbowhales jimbowhales force-pushed the whales/bitarray-overflow-fix branch from 4b94331 to 91d36e6 Compare November 20, 2019 17:18
@jimbowhales
Copy link
Author

I did come across another bug writing the test for the first. Fixed too.

True-initialization and invert have broken the invariant, so give it
up and update == and hash to not assume trailing bits are zero.
@jimbowhales jimbowhales force-pushed the whales/bitarray-overflow-fix branch from 91d36e6 to b43fa51 Compare November 21, 2019 06:42

if count <= 32
BitArray.new(count).tap { |ba| ba.@bits[0] = bits.to_u32 }
BitArray.new(count).tap { |ba| ba.@bits[0] = bits.to_u32! }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see when this can overflow if bits &= (1 << count) &- 1 is working correctly. However, it is not.

(1 << count) is simply zero when count >= 32, because 1 is an int32. This doesn't matter though, since we don't care about the MSB bits being zero. So I think the bits &= (1 << count) &- 1 line can be removed and the to_u32! can stay. And I think the masking in the size <=32 case can be removed too. I'd like a comment left about leaving the trailing upper though.

Usages of (1 << count) in the rest of the file must be audited though. I think they're all fine, but i'd like a second pair of eyes.

# representation and their hashes for equivalent BitArrays after a downsize as the
# discarded bits may not have been zeroed.
return LibC.memcmp(@bits, other.@bits, malloc_size) == 0
return true if size == 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain why this entire change is needed? I don't understand why the last bit needs special handling. And if it does, there should be a comment in the code explaining why.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It needs special handling since unused bits at the end of the array are not guaranteed to be in a defined state, so they have to be masked out (i.e., set to zero) before comparing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, that's not correct. Pointer.malloc makes sure everything is zero. The comparison is broken because LibC.memcmp expects the number of bytes but it uses malloc_size which is in terms of UInt32. Changing it to use malloc_size * sizeof(UInt32) makes it work.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see. The method [] will copy bits from an existing bit array and just copy them over without zeroing the non-interesting part, that's why memcmp doesn't work.

I think we should fix [] to guarantee the invariant that unused bits are always zero.

Copy link
Contributor

@RX14 RX14 Dec 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really care which method is used, this PR is certainly a valid method.

Nice catch on the memcmp size being wrong though.

@HertzDevil
Copy link
Contributor

Resolved by #10809. (This PR assumes BitArrays can have unused bits set which is no longer true.)

@HertzDevil HertzDevil closed this Aug 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants