[FEA] Support spark.sql.parquet.binaryAsString=true #4040
Comments
Mini repro: Reading a hive parquet table:
Or reading a parquet file with string inside directly:
Adding Binary Type test
For binary Type data Summary
The way this flag works in Spark is that it exists specifically for backwards compatibility with older versions of Parquet (or rather, the versions of Parquet used in other systems like Hive). When I ran this in Spark on the CPU, a parquet file written from Spark that included a BINARY column was always read back as a BINARY column (even when This is why we still need #5416
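For anyone who wants to check the CPU behavior described above, here is a minimal sketch in PySpark. The option name `spark.sql.parquet.binaryAsString` is the real Spark SQL setting. The helper names `binary_as_string_conf` and `read_schema` are made up for illustration, and so is any path you pass in. This is not the repro from the issue, just one way to compare the schema Spark reports with the flag off and on.

```python
from typing import Dict

# Real Spark SQL option name; Spark's default value is "false".
BINARY_AS_STRING = "spark.sql.parquet.binaryAsString"


def binary_as_string_conf(enabled: bool) -> Dict[str, str]:
    """Build the conf entry that toggles the flag (hypothetical helper)."""
    return {BINARY_AS_STRING: "true" if enabled else "false"}


def read_schema(path: str, enabled: bool) -> str:
    """Read a parquet path with the flag set and return the schema as text.

    pyspark is imported lazily so the helper above can be used
    without Spark installed.
    """
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.master("local[1]").getOrCreate()
    for key, value in binary_as_string_conf(enabled).items():
        spark.conf.set(key, value)
    # simpleString() gives a compact form such as "struct<col:binary>".
    return spark.read.parquet(path).schema.simpleString()
```

Calling `read_schema(path, False)` and `read_schema(path, True)` on the same file should show whether the flag changes the reported column type. Per the comment above, a file written by Spark itself is expected to keep its BINARY column either way, so a file produced by Hive or a similar system is the more interesting input.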
Is your feature request related to a problem? Please describe.
A clear and concise description of what the problem is. Ex. I wish the RAPIDS Accelerator for Apache Spark would [...]
Support spark.sql.parquet.binaryAsString=true.