Describe the bug
Parquet supports the FIXED_LEN_BYTE_ARRAY physical type in addition to the BINARY physical type (which is basically a variable length byte array). cuDF can currently read Decimals and other types which are stored as FIXED_LEN_BYTE_ARRAY, but can't read the FIXED_LEN_BYTE_ARRAY data as binary data (or string). It should behave the same as BINARY.
Basically, this schema:
should be handled the same as:
This is required for NVIDIA/spark-rapids#7449
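To illustrate why the two physical types can be surfaced identically, here is a minimal sketch in plain Python (not cuDF's reader). It assumes PLAIN encoding as described in the Parquet format specification: BINARY (called BYTE_ARRAY in the spec) stores each value as a 4-byte little-endian length followed by the bytes, while FIXED_LEN_BYTE_ARRAY stores only the bytes, with the length taken from the schema. The function names and sample values are hypothetical.

```python
import struct
from typing import List


def decode_plain_byte_array(buf: bytes) -> List[bytes]:
    """Decode PLAIN-encoded BINARY (BYTE_ARRAY) values.

    Each value is a 4-byte little-endian length followed by that many bytes.
    """
    values = []
    pos = 0
    while pos < len(buf):
        (length,) = struct.unpack_from("<I", buf, pos)
        pos += 4
        if pos + length > len(buf):
            raise ValueError("truncated BYTE_ARRAY value")
        values.append(buf[pos:pos + length])
        pos += length
    return values


def decode_plain_flba(buf: bytes, type_length: int) -> List[bytes]:
    """Decode PLAIN-encoded FIXED_LEN_BYTE_ARRAY values.

    There is no length prefix; every value is exactly type_length bytes,
    where type_length comes from the column's schema.
    """
    if type_length <= 0 or len(buf) % type_length != 0:
        raise ValueError("buffer size is not a multiple of type_length")
    return [buf[i:i + type_length] for i in range(0, len(buf), type_length)]


# Hypothetical sample data: three 4-byte values, encoded both ways.
values = [b"\x00\x01\x02\x03", b"abcd", b"\xff\xfe\xfd\xfc"]
binary_page = b"".join(struct.pack("<I", len(v)) + v for v in values)
flba_page = b"".join(values)

from_binary = decode_plain_byte_array(binary_page)
from_flba = decode_plain_flba(flba_page, type_length=4)
print(from_binary == from_flba)
```

Both decoders yield the same list of byte strings, which is the behaviour requested here: once decoded, a FIXED_LEN_BYTE_ARRAY column with no logical type carries the same information as a BINARY column whose values all happen to have the same length.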
Steps/Code to reproduce bug
flba_binary_parquet.zip
The attached parquet file has one row with the following schema:
In cuDF Python:
Expected behavior
The output should be:
Environment overview (please complete the following information)
Ran ipython inside a Docker container.