Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set operations for multiple sets at a time #57

Closed
Kerollmops opened this issue Jun 2, 2020 · 2 comments · Fixed by #223
Closed

Set operations for multiple sets at a time #57

Kerollmops opened this issue Jun 2, 2020 · 2 comments · Fixed by #223

Comments

@Kerollmops
Copy link
Member

Kerollmops commented Jun 2, 2020

Hey,

I discovered this pure Rust version of the Roaring library of Daniel Lemire et al., performances are pretty good at least way better than my sdset library.

I was wondering if you had any rough idea of how to implement a set operation and more specifically an union on multiple set at a time.

I dived a little bit into the C implementation and found out that the main performance gain seems to be related to the fact that the library doesn't change the internal representation (and keep the bitmap one) until all sets operations have been applied, then the internal type is changed. Am I right?

https://github.com/RoaringBitmap/CRoaring/blob/59d70d010da5f606f1339fb4c4f200be11f590c6/src/roaring.c#L611-L629

Thank you for your great job!

@Kerollmops Kerollmops changed the title Set operations for multiple maps at a time Set operations for multiple sets at a time Jun 2, 2020
@Nemo157
Copy link
Collaborator

Nemo157 commented Jun 2, 2020

This seems pretty easy to do by having an iterator that walks across multiple bitmaps at once, returning all the containers at each key (similar to Pairs but expanded to more than just 2 bitmaps). Then taking the stores from the containers and doing the operation directly with them, before wrapping back into a container. Finally collecting all the (non-empty) resulting containers into a new bitmap.

@Kerollmops
Copy link
Member Author

Kerollmops commented Jun 2, 2020

Hey @Nemo157,

I totally understand what you proposed! I started working on the "hardest" part of the problem already and found out that this "multi-Pairs" type is not that easy to implement. So I would like you to take a quick look at what I achieved, there is many allocations, but most of those can't be removed, I believe.

https://github.com/Kerollmops/roaring-rs/blob/0d806a8dbcabab7f8f53d5dc3de8c321b2c94a69/src/bitmap/cmp.rs#L142-L196

EDIT I rewrote it with an heap and interior mutability, there is a little bit more boilerplate (PartialEq/ParitalOrd/Ord/Eq things) but overall it is faster, with less allocations and the main algorithm is easier to read.

https://github.com/Kerollmops/roaring-rs/blob/0a03dfb/src/bitmap/cmp.rs#L142-L236

@saik0 saik0 mentioned this issue Jan 12, 2022
3 tasks
@bors bors bot closed this as completed in 2828be5 Aug 30, 2022
not-jan pushed a commit to not-jan/roaring-rs that referenced this issue Aug 31, 2022
223: Implements multioperation for the bitmaps and tree maps r=Kerollmops a=irevoire

Fixes RoaringBitmap#57, closes RoaringBitmap#58, closes RoaringBitmap#109, closes RoaringBitmap#139, and closes RoaringBitmap#219.

There is a lot of performance improvement, but here is a before / after on the operations that were the faster currently (when we can do assign between owned bitmaps).

## And
```
group                                                    after                                  before
-----                                                    -----                                  ------
Successive And/Multi And Owned/census-income             1.00     14.6±0.25µs        ? ?/sec    15.42   224.9±0.76µs        ? ?/sec
Successive And/Multi And Owned/census-income_srt         1.00     14.2±0.25µs        ? ?/sec    3.98     56.4±8.22µs        ? ?/sec
Successive And/Multi And Owned/census1881                1.00     20.7±0.33µs        ? ?/sec    37.18   770.1±1.62µs        ? ?/sec
Successive And/Multi And Owned/census1881_srt            1.00     25.8±1.29µs        ? ?/sec    1.12     28.8±0.09µs        ? ?/sec
Successive And/Multi And Owned/weather_sept_85           1.00     60.7±2.48µs        ? ?/sec    2.15    130.2±2.96µs        ? ?/sec
Successive And/Multi And Owned/weather_sept_85_srt       1.00     48.3±2.21µs        ? ?/sec    2.32    112.2±1.07µs        ? ?/sec
Successive And/Multi And Owned/wikileaks-noquotes        1.00     24.4±0.50µs        ? ?/sec    2.73     66.6±0.27µs        ? ?/sec
Successive And/Multi And Owned/wikileaks-noquotes_srt    1.00     20.3±0.58µs        ? ?/sec    1.09     22.0±0.30µs        ? ?/sec
```

## Or
```
group                                                    after                                  before
-----                                                    -----                                  ------
Successive Or/Multi Or Owned/census-income               1.00    629.3±4.46µs        ? ?/sec    2.29  1441.4±41.36µs        ? ?/sec
Successive Or/Multi Or Owned/census-income_srt           1.00    582.5±1.81µs        ? ?/sec    1.61    937.8±4.03µs        ? ?/sec
Successive Or/Multi Or Owned/census1881                  1.00   1143.4±4.55µs        ? ?/sec    3.48      4.0±0.07ms        ? ?/sec
Successive Or/Multi Or Owned/census1881_srt              1.00    743.4±4.40µs        ? ?/sec    3.49      2.6±0.02ms        ? ?/sec
Successive Or/Multi Or Owned/weather_sept_85             1.00      2.9±0.02ms        ? ?/sec    1.06      3.1±0.01ms        ? ?/sec
Successive Or/Multi Or Owned/weather_sept_85_srt         1.00   1344.5±7.80µs        ? ?/sec    1.06  1426.5±38.08µs        ? ?/sec
Successive Or/Multi Or Owned/wikileaks-noquotes          1.00    476.3±4.43µs        ? ?/sec    5.27      2.5±0.01ms        ? ?/sec
Successive Or/Multi Or Owned/wikileaks-noquotes_srt      1.00    259.4±3.90µs        ? ?/sec    7.17   1860.0±3.30µs        ? ?/sec
```

Co-authored-by: saik0 <[email protected]>
Co-authored-by: Kerollmops <[email protected]>
Co-authored-by: Tamo <[email protected]>
Co-authored-by: Irevoire <[email protected]>
Co-authored-by: Tamo <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants