[BUG] parquet reader data corruption in nested schema after https://github.com/rapidsai/cudf/pull/13302 #9948
Labels: bug (something isn't working), cudf_dependency (depends on a new feature in cuDF)
After the string column changes in rapidsai/cudf#13302 were picked up, a customer with nested schemas reported data corruption: a struct<map<string, struct<...>>> column came back with bad keys in the inner map.
We bisected cuDF changes until we found the culprit and have worked with the author of that PR to produce a fix.
The symptom on our side was that the last offset in the offsets buffer of the keys string column was far too large, pointing past the end of the string data buffer. Reading through that offset produced garbage output that was carried through later processing and eventually written to file. The issue did not trigger compute-sanitizer in our attempts.
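For anyone trying to spot the same symptom, the check below is a minimal host-side sketch of the invariant that was violated: in an Arrow-style string column, offsets must be non-decreasing and the last offset must not exceed the size of the character buffer. `find_offset_problems` is a hypothetical helper written for illustration, not part of cuDF or the plugin, and it assumes the offsets have already been copied to the host as plain integers.

```python
from typing import List, Sequence


def find_offset_problems(offsets: Sequence[int], chars_size: int) -> List[str]:
    """Return a list of human-readable problems found in a string column's offsets.

    Hypothetical helper: `offsets` holds n + 1 entries for n strings and
    `chars_size` is the byte length of the string data buffer.
    """
    problems: List[str] = []
    if len(offsets) == 0:
        return problems  # an empty column has nothing to validate

    if offsets[0] < 0:
        problems.append(f"first offset {offsets[0]} is negative")

    # Each string's end must not come before its start.
    for i in range(1, len(offsets)):
        if offsets[i] < offsets[i - 1]:
            problems.append(
                f"offsets[{i}]={offsets[i]} is smaller than "
                f"offsets[{i - 1}]={offsets[i - 1]}"
            )

    # The symptom described above: the final offset points past the data buffer.
    if offsets[-1] > chars_size:
        problems.append(
            f"last offset {offsets[-1]} exceeds character buffer size {chars_size}"
        )
    return problems
```

Running a check like this on the inner map's keys column right after the Parquet read, rather than after later operators, should help separate reader corruption from downstream bugs.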
Fix PR: rapidsai/cudf#14557