Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: in-place bitwise operations #38

Open
huguesb opened this issue Nov 21, 2024 · 6 comments
Open

Feature: in-place bitwise operations #38

huguesb opened this issue Nov 21, 2024 · 6 comments

Comments

@huguesb
Copy link

huguesb commented Nov 21, 2024

I'm doing some graph manipulation, with the graph being represented by adjacency list, where each vertex is an integer. Using BitSet as a replacement for HashSet<usize> is making things considerably faster, however the inability to do in-place bitwise operations on BitSet is really annoying and forcing me to do a lot of tedious for loops to add or remove entries.

@tower120
Copy link
Owner

Can you show example of " a lot of tedious for loops to add or remove entries."?

Also, if you do a graph - could this https://crates.io/crates/hibit_tree be more useful for you? You can use it as a Map<usize, Vert> and can do inter-set operations on it. It is faster then intesrsecting a set, and then accessing a HashMap by index.

This would be a more or less https://docs.rs/hibit_tree/0.1.0-alpha.3/hibit_tree/struct.SparseTree.html equal to hi_sparse_bitset::BitSet

@huguesb
Copy link
Author

huguesb commented Nov 22, 2024

Can you show example of " a lot of tedious for loops to add or remove entries."?

Mostly it's having to turn every a.extend(b) into an explicit for x in b { a.insert(x) } when I would have expected to be able to just do a |= b

Also, if you do a graph - could this https://crates.io/crates/hibit_tree be more useful for you? You can use it as a Map<usize, Vert> and can do inter-set operations on it. It is faster then intesrsecting a set, and then accessing a HashMap by index.

Neat! But not really relevant to my use case :) I'm storing vertices in a Vec, not a HashMap and I'm using the BitSet for reachability analysis

@tower120
Copy link
Owner

Ok - I'll see what I can do.
I may do some refactor as well, so let me ask a few questions:

  • Do you use BitSet or SmallBitSet?
  • What is your index-range? What max index you use within set?
  • Do you need compile-time configurable hierarchy depth? The shallower - the faster, the deeper - the higher index-range.

@huguesb
Copy link
Author

huguesb commented Nov 23, 2024

* Do you use `BitSet` or `SmallBitSet`?

BitSet

* What is your index-range? What max index you use within set?

I'm computing dependency graphs from python code so the max index is basically the number of Python modules under consideration, which varies by repository. For the big repo that I'm focusing on right now that's ~70k

* Do you need compile-time configurable hierarchy depth? The shallower - the faster, the deeper - the higher index-range.

No. In my case, the index range is only known at run-time.

While I have your attention and you're considering some changes, it would be nice to have some helper functions for serialization. Right now I'm doing the dumbest possible thing: storing each bitset as a length-prefixed sequence of varint, which means I need to iterate once to get the length with.iter().count(), and then a second time to get the indices. This makes ser/des unnecessarily slow, and the stored data high overhead (on the plus side, indices are sorted so it's highly compressible). Having an easy way to serialize and deserialize as a sequence of data blocks would be much preferrable. Bonus point if the API is not tied to any specific ser/des framework (fwiw I'm using speedy )

@tower120
Copy link
Owner

Are you aware of BitSet memory overhead? See memory layout scheme on doc's main page.

No. In my case, the index range is only known at run-time.

But the BitSet have limited range. It's depth always fixed, for performance. Currently there is 3 levels. And you can configure only block width. I'm talking about configuring it's depth as well. For example, if you would fit u16::MAX range - you could deal with 2level x 256bit configuration - that will both reduce hierarchy memory overhead and speed up all operations.

... length-prefixed sequence of varint ...

What is that?

need to iterate once to get the length with.iter().count()

There is currently no len counter at all. And it can be only implemented for physical BitSet (I mean not for the lazy-one, like
intersection result). For lazy-ones the fastest theoretical way to get len is by summing len of all data blocks block_iter().for_each(|block| block.len()).

If you actually need len in BitSet specifically - yes I can add that. Make an separate issue if that so - I may forget that.

Having an easy way to serialize and deserialize as a sequence of data blocks would be much preferrable.

You CAN iterate over datablocks now https://docs.rs/hi_sparse_bitset/0.6.1/hi_sparse_bitset/#datablocks with https://docs.rs/hi_sparse_bitset/0.6.1/hi_sparse_bitset/trait.BitSetInterface.html#method.block_iter. It returns https://docs.rs/hi_sparse_bitset/0.6.1/hi_sparse_bitset/struct.DataBlock.html . You should be able to serialize it. The problem is that there is no interface for pushing blocks for efficient deserialization. This I can add.

@huguesb
Copy link
Author

huguesb commented Nov 23, 2024

Are you aware of BitSet memory overhead? See memory layout scheme on doc's main page.

No. In my case, the index range is only known at run-time.

But the BitSet have limited range. It's depth always fixed, for performance. Currently there is 3 levels. And you can configure only block width. I'm talking about configuring it's depth as well. For example, if you would fit u16::MAX range - you could deal with 2level x 256bit configuration - that will both reduce hierarchy memory overhead and speed up all operations.

Yeah, I went with the _128 version to be safe. I could probably have a more optimized hierarchy but it was enough of an improvement over HashSet that I didn't bother tweaking further. If there was an easy way to pick the right hierarchy based on the max range at run-time I'd use that but I don't want to hard-code a small range at compile time.

... length-prefixed sequence of varint ...

What is that?

https://protobuf.dev/programming-guides/encoding/#varints

need to iterate once to get the length with.iter().count()

There is currently no len counter at all. And it can be only implemented for physical BitSet (I mean not for the lazy-one, like intersection result). For lazy-ones the fastest theoretical way to get len is by summing len of all data blocks block_iter().for_each(|block| block.len()).

If you actually need len in BitSet specifically - yes I can add that. Make an separate issue if that so - I may forget that.

Honestly, the only reason I need the length right now is as an implementation detail for serialization, and sometimes for debug printing so not a huge deal if there is a better serialization option

Having an easy way to serialize and deserialize as a sequence of data blocks would be much preferrable.

You CAN iterate over datablocks now https://docs.rs/hi_sparse_bitset/0.6.1/hi_sparse_bitset/#datablocks with https://docs.rs/hi_sparse_bitset/0.6.1/hi_sparse_bitset/trait.BitSetInterface.html#method.block_iter. It returns https://docs.rs/hi_sparse_bitset/0.6.1/hi_sparse_bitset/struct.DataBlock.html . You should be able to serialize it. The problem is that there is no interface for pushing blocks for efficient deserialization. This I can add.

That would be great!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants