RLEDecoder::get_batch_with_dict may panic on bit-packed runs longer than 1024 #3029
Comments
num_values number computation in rle.rs is wrong
I've found the underlying cause of this: it is an accounting bug. In particular, if a bit-packed run is longer than 1024 values, the decoder may try to read more values from the underlying bit reader than the destination has capacity for. If the actual number of values is not a multiple of 8, the read will return more values than expected, because the length of a bit-packed run is inherently ambiguous. In that scenario the decoder panics when it tries to copy these values across. Will post a PR to fix shortly.
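To illustrate the accounting described above, here is a minimal standalone sketch. It is not the arrow-rs code or the eventual fix. The names `INDEX_BUF_LEN`, `declared_run_len`, `chunk_len`, and `plan_chunks` are hypothetical, and the 1024-entry scratch buffer is an assumption inferred from the panic message. The sketch relies on the fact that a bit-packed run header counts groups of 8 values, so the declared length can exceed the real number of values by up to 7.

```rust
/// Assumed size of the scratch buffer that holds decoded dictionary indices
/// (inferred from "the len is 1024" in the panic message).
const INDEX_BUF_LEN: usize = 1024;

/// A bit-packed run header stores a count of 8-value groups, so the
/// declared length is always a multiple of 8, even if fewer values are real.
fn declared_run_len(num_groups: usize) -> usize {
    num_groups * 8
}

/// Number of values that is safe to decode in one step: bounded by what the
/// run still claims to hold, what the caller still wants, and the buffer size.
fn chunk_len(remaining_in_run: usize, remaining_requested: usize) -> usize {
    remaining_in_run.min(remaining_requested).min(INDEX_BUF_LEN)
}

/// Hypothetical helper: the sequence of chunk sizes used to drain one
/// bit-packed run without ever exceeding the scratch buffer.
fn plan_chunks(num_groups: usize, requested: usize) -> Vec<usize> {
    let mut remaining_in_run = declared_run_len(num_groups);
    let mut remaining_requested = requested;
    let mut chunks = Vec::new();
    while remaining_in_run > 0 && remaining_requested > 0 {
        let n = chunk_len(remaining_in_run, remaining_requested);
        chunks.push(n);
        remaining_in_run -= n;
        remaining_requested -= n;
    }
    chunks
}
```

For example, 1030 real values need 129 groups, so the header declares 1032 values. `plan_chunks(129, 1030)` yields `[1024, 6]`: every chunk fits the buffer, and the 2 padding values are never copied out.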
Describe the bug
Reading a specific Parquet file triggers a panic: thread 'main' panicked at 'index out of bounds: the len is 1024 but the index is 1024', /Users/wolfvollprecht/Programs/arrow-rs/parquet/src/encodings/rle.rs:492:25
The max-index size computation is wrong.