parquet Performance Optimization: StructArrayReader Redundant Level & Bitmap Computation #1034
Labels
enhancement
Any new improvement worthy of an entry in the changelog
parquet
Changes to the parquet crate
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
StructArrayReader currently always computes repetition and definition level buffers, along with a null bitmask - see here. In the case where the struct array is not nullable, and has no nullable or repeated parents, this is redundant. The bitmask will be all true, and no parent array reader is going to consult the levels buffers. This situation will arise in the common case of a flat schema.
Describe the solution you'd like
Skip the definition and repetition level logic when the struct's definition level and repetition level are both 0.
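A minimal sketch of what such a fast path could look like, assuming a heavily simplified model with a single child column. The names `StructLevels` and `compute_struct_levels` are hypothetical and are not the parquet crate's API; the rule that a row is valid when the child's definition level reaches the struct's own level is likewise a simplification for illustration.

```rust
/// Hypothetical, simplified stand-in for what a struct reader derives from
/// its child. These are NOT the parquet crate's actual types.
#[derive(Debug, PartialEq)]
pub struct StructLevels {
    /// Definition levels, or `None` when no parent will consult them.
    pub def_levels: Option<Vec<i16>>,
    /// Repetition levels, or `None` when no parent will consult them.
    pub rep_levels: Option<Vec<i16>>,
    /// Validity mask (`true` = valid), or `None` when every row is valid.
    pub validity: Option<Vec<bool>>,
}

/// Assumed model: one child column whose levels are already decoded.
pub fn compute_struct_levels(
    struct_def_level: i16,
    struct_rep_level: i16,
    child_def_levels: &[i16],
    child_rep_levels: &[i16],
) -> StructLevels {
    // Fast path: the struct is not nullable and has no nullable or repeated
    // parents, so the mask would be all true and the level buffers unused.
    if struct_def_level == 0 && struct_rep_level == 0 {
        return StructLevels {
            def_levels: None,
            rep_levels: None,
            validity: None,
        };
    }

    // Slow path (simplified): a row counts as valid when the child's
    // definition level is at least the struct's own definition level.
    let validity: Vec<bool> = child_def_levels
        .iter()
        .map(|&d| d >= struct_def_level)
        .collect();

    StructLevels {
        def_levels: Some(child_def_levels.to_vec()),
        rep_levels: Some(child_rep_levels.to_vec()),
        validity: Some(validity),
    }
}
```

In this sketch, a flat schema takes the first branch and allocates nothing, while a nullable struct still gets its mask and level buffers as before.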
Describe alternatives you've considered
The logic could remain the same.