Skip to content
This repository has been archived by the owner on Feb 18, 2024. It is now read-only.

Consider using u64 or usize for Bitmap elements instead of u8 #585

Open
Dandandan opened this issue Nov 7, 2021 · 2 comments
Open

Consider using u64 or usize for Bitmap elements instead of u8 #585

Dandandan opened this issue Nov 7, 2021 · 2 comments

Comments

@Dandandan
Copy link
Collaborator

This should probably be quite a bit faster for doing many operations like comparison kernels etc.

I think there is also some potential for simplifying code, as some (unsafe) code now tries to create u64 chunks (see #578 , there are other many other places in the code doing this, i.e. bitwise ops).

Having the bitmap use u64 instead of u8 would reduce the need to do those conversions and also speed up some parts which work at u8 at a time (e.g. comparison kernels currently use u8) without the need to

Some references:

It would be quite an invasive change so I think it's good to discuss potential trade-offs first.

@jorgecarleitao
Copy link
Owner

jorgecarleitao commented Nov 7, 2021

:D ahaha, recent discussion on this: https://github.com/teratide/narrow/issues/23

I agree that it is worth trying out. I think that bitvec's semantics is different from arrow on a u64, though. Specifically, on a u64, the indexes of bits here need to be

[7, 6, 5, 4, 3, 2, 1, 0, 15, 14, 13, 12, 11, 10, 9, 8, ...]

because the semantics in arrow is over individual bytes.

@ritchie46
Copy link
Collaborator

Another option might be to create a safe API that allows working on slices of &[usize]. For instance the unsafe part could be isolated by a fn split_by_alignment<usize> -> (&[u8], &[usize], &[u8]).

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants