`from_arrow` handles empty/duplicate column names badly #11632
Labels
A-interop-arrow (Area: interoperability with other Arrow implementations, such as pyarrow)
bug (Something isn't working)
P-low (Priority: low)
python (Related to Python Polars)
Checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of Polars.
Reproducible example
Log output
No response
Issue description
pyarrow Tables allow duplicate column names, while we do not.
For empty column names, we handle this by replacing them with `column_0`, `column_1`, etc. For columns with duplicate names, we simply drop any duplicates (only keeping the last one).
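To make the described behavior concrete, here is a toy model in plain Python. It is not the Polars source: the helper name `model_current_names` is hypothetical, and numbering the placeholders by column position is an assumption on my part. It only shows how renaming empties and then keeping the last duplicate silently loses a column.

```python
def model_current_names(names):
    """Toy model of the behavior described above (hypothetical, not Polars code).

    Returns a mapping from each surviving column name to the index of the
    source column that ends up under that name.
    """
    # Empty names get a placeholder; using the position as the suffix is an assumption.
    renamed = [name if name != "" else f"column_{i}" for i, name in enumerate(names)]
    # A dict keeps one entry per key and later assignments overwrite earlier ones,
    # so for duplicate names only the last column survives.
    return {name: i for i, name in enumerate(renamed)}


surviving = model_current_names(["", "", "a", "a"])
print(surviving)
```

With four input columns, only three names survive in this model: the first `a` column (index 2) is dropped without any warning.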
Expected behavior
I propose we raise an error on duplicate column names, with the suggestion to specify the `schema` argument. Then we can treat empty column names (`""`) the same as any other column name.
Installed versions
main branch, pyarrow 13.0.0
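As a rough illustration of the proposal under Expected behavior, the check could look something like the sketch below. The function name `check_unique_column_names`, the exception type, and the error wording are all hypothetical. It works on a plain list of names, so that an empty name `""` is treated like any other name and is only rejected when it is duplicated.

```python
from collections import Counter


def check_unique_column_names(names):
    """Raise if any column name occurs more than once (hypothetical helper)."""
    counts = Counter(names)
    duplicates = [name for name, n in counts.items() if n > 1]
    if duplicates:
        # The message wording is an assumption; the point is to direct the user to `schema`.
        raise ValueError(
            f"column names must be unique, found duplicates: {duplicates!r}; "
            "specify the `schema` argument to rename the columns"
        )
    # No special-casing of "": a single empty name passes through unchanged.
    return list(names)


print(check_unique_column_names(["a", ""]))
```

Under this sketch, `["a", "a"]` and `["", ""]` both raise, while `["a", ""]` is accepted as-is.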