-
Notifications
You must be signed in to change notification settings - Fork 224
Conversation
a08e426
to
549ddbf
Compare
Codecov Report
@@ Coverage Diff @@
## main #442 +/- ##
=======================================
Coverage 79.87% 79.87%
=======================================
Files 371 371
Lines 22753 22758 +5
=======================================
+ Hits 18174 18178 +4
- Misses 4579 4580 +1
Continue to review full report at Codecov.
|
Cool idea! I think it is worth benchmarking: we now have 2 (smaller) iterations instead of one. I can PR a bench for slicing bitmaps. |
Yes, I was also wondering this. Perhaps there is some more instruction level parallelism as they are independent. But you are right. Let's benchmark |
I ran this benchmark: let offset = ((size as f64) * 0.2) as usize;
let len = ((size as f64) * 0.55) as usize;
c.bench_function(&format!("bitmap_count_zeros {}", log2_size), |b| {
b.iter(|| {
let r = bitmap.clone().slice(offset, len);
assert!(r.null_count() > 0);
})
}); Worst case side of the spectrumSo that we have a smaller chunk at the start and at the end of the array. We slice only 55% as this would be an almost worst case scenario (51% would be). So the result would be better if we slice bigger chunks. It seems we make a difference once we don't fit the cache anymore.
Best case side of the spectrumHere we slice a large chunk (85% of the array)
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome; very good result.
It may be worth creating a bench that you used, in case someone else would like to use it for future improvements.
549ddbf
to
3585481
Compare
Added 👍 |
Could you resolve the conflict? |
f51fff8
to
2a6093d
Compare
Good to go. |
Every time we slice an array, we count the null values. This PR does a small optimization so that we only count he null values of the smallest chunk of memory.