-
Notifications
You must be signed in to change notification settings - Fork 66
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FEA] Generating boolean column vector with roaring bitmap. #2113
Comments
This is related to NVIDIA/spark-rapids#10904 |
I don't think we have any GPU implementation for roaring bitmap yet. |
I have the experimental code porting parts of RoaringBitmap to CUDA, not a perfect fit for the need(only support int32 keys instead of int64 ones, the later is more challenging) and not open-sourced yet(still some perf optimization work in progress). Just want to join in to continue the discussion. Assuming we have it already, like in below: public static boolean[] filter(CuRoaring64Bitmap gpu_bmp, int start, int end) {
boolean[] ret = new boolean[len];
gpu_bmp.contains_range(start, end, ret); // better to be a batched function call for efficiency
return ret;
} there're still some questions - is 64 bits a hard requirement or 32 bits would also be fine sometimes(by remapping etc.)? where would be any idea? |
|
if it just serves as a on gpu bitmap based filter, seems sth worth to try by minimal engineering efforts. pls let me check whether https://github.com/NVIDIA/cuCollections could provide the required functionality or not in next few days. |
Currently we use roaring bitmap because it's highly efficient for compression and OR operation, which is quite important for previouse operation. |
RoaringBitmap is popular and highly efficient, well, it's 64bits version seems could be further improved according to my limited knowledge https://mp.weixin.qq.com/s/tJQoNRZ5UDJ_IASZLlhB4Q. Anyway, it does not matter that much as only a subset of roaring features required for this use case. @liurenjie1024 could u pls elaborate on expected set size and indices distribution(I mean the raw integers feed into the set), and to be filtered range length(end - start) for the target application? |
Hi, @pmixer
The expected set size depends on actual customer's file size. Simple bitset maybe not practical since 8 million rows requires 1M size, and that's only for one file. We may have thousands of files. |
Well, thousands of files means few GBs mem needed, may be a waste but possible to have a try before further optimization. Just FYI, currently there are https://github.com/NVIDIA/cuCollections/tree/dev/include/cuco/detail/trie/dynamic_bitset in cuCollection and https://nvidia.github.io/cccl/thrust/api/function_group__set__operations_1gacf5edd558fd96eee01b6425b3909c6b4.html in Thrust open-sourced for on-gpu set operations. |
Is your feature request related to a problem? Please describe.
Deletion vector is a widely used technique used for efficiently filtering data in columnar storage.
Describe the solution you'd like
We need to implement a gpu kernel to generate a boolean vector efficiently, following java code demo the logic in cpu:
Describe alternatives you've considered
Currently we use roaring bitmap due to it's efficient for compression and bitmap operations. We can use other gpu optimized bitmaps instead if someone could find one which is more gpu friendly.
The text was updated successfully, but these errors were encountered: