[BUG] Misinterpretation of Parquet List schema with single GROUP child named "array" #13313
Labels: 0 - Backlog, bug, cuIO, libcudf
This bug is to track a (possible) misinterpretation of Parquet list schemas when stored in a legacy format. This is a follow-up to #13277.
This is specific to rules #3 and #4 in the Parquet `LogicalType` spec, which states:

Consider the following schema, from the Parquet file attached herewith:
`libcudf` seems to interpret this as `List<Int32>`:

By my reading of the spec, this should be interpreted as a `List<Struct<Int32>>`. Apache Spark seems to concur:
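
To make the reading of the spec concrete, here is a minimal Python sketch of how the list backward-compatibility rules could be applied to a LIST-annotated group. It is a paraphrase of the rules as I understand them, not libcudf's or Spark's actual code. The `Field` class, the `describe` and `list_type` helpers, and the `my_list` / `item` field names are hypothetical and only illustrate the shape of the schema named in the title (a single GROUP child named "array"). Rule #4 is treated here as the standard three-level layout, where the repeated group's single child is the element; that interpretation is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Field:
    """Hypothetical, simplified stand-in for a Parquet schema node."""
    name: str
    repetition: str                      # "required", "optional" or "repeated"
    physical_type: Optional[str] = None  # set for primitive leaves, None for groups
    children: List["Field"] = field(default_factory=list)

    @property
    def is_group(self) -> bool:
        return self.physical_type is None


def describe(node: Field) -> str:
    """Render a node as a type string, e.g. Int32 or Struct<Int32>."""
    if not node.is_group:
        return node.physical_type
    return "Struct<" + ", ".join(describe(c) for c in node.children) + ">"


def list_type(list_node: Field) -> str:
    """Resolve the element type of a LIST-annotated group (sketch only)."""
    repeated = list_node.children[0]
    assert repeated.repetition == "repeated"

    # Rule 1: the repeated field is not a group, so it is the element.
    if not repeated.is_group:
        return f"List<{describe(repeated)}>"
    # Rule 2: a repeated group with multiple fields is the element.
    if len(repeated.children) > 1:
        return f"List<{describe(repeated)}>"
    # Rule 3: a repeated group with one field, named "array" or
    # "<list name>_tuple", is itself the element (a one-field struct).
    if repeated.name in ("array", list_node.name + "_tuple"):
        return f"List<{describe(repeated)}>"
    # Rule 4 (assumed reading): standard three-level layout, so the
    # repeated group's single child is the element.
    return f"List<{describe(repeated.children[0])}>"


# Hypothetical legacy schema: LIST group -> repeated group "array" -> int32
legacy = Field("my_list", "required", children=[
    Field("array", "repeated", children=[Field("item", "required", "Int32")]),
])
# Hypothetical standard three-level schema for comparison
standard = Field("my_list", "required", children=[
    Field("list", "repeated", children=[Field("element", "required", "Int32")]),
])

print(list_type(legacy))    # List<Struct<Int32>>
print(list_type(standard))  # List<Int32>
```

Under this sketch, the only thing separating the two results is the name of the repeated group, which is why a reader that ignores rule #3 would flatten the legacy schema to `List<Int32>`.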