I've opened #14252 to address this issue. Feel free to test against the PR to verify if the new behaviour matches Spark for JSON lines with incomplete records (or incomplete records, strings, field names, etc.).
Describe the bug
The following input has one invalid JSON record on line 3 (missing the closing }).
When reading this file with cuDF and specifying a schema with column number of type int and also specifying recover_with_null, the column returned is a struct instead of an int. It has null values for the first two records, followed by a struct with a child int containing the number 4.
It appears that line 3 is treated as valid and multiline, and line 4 is read as a child.
Steps/Code to reproduce bug
I am testing via Spark RAPIDS, but I suspect that this issue could be reproduced by adding this scenario to the TEST_F(JsonReaderTest, JSONLinesRecovering) test in cpp/tests/io/json_test.cpp.
Expected behavior
I would expect line 3 to be treated as invalid and return a NULL.
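To make the expected semantics concrete, here is a minimal sketch in plain Python (standard library only, not cuDF or Spark). The sample input is hypothetical, since the original file contents are not reproduced here; it only assumes four JSON lines with a number field, where line 3 lacks its closing brace. The helper name read_json_lines_recovering is also made up for illustration.

```python
import json

# Hypothetical input (the original file is not shown in this issue):
# four JSON lines, where line 3 is missing its closing brace.
SAMPLE = '{"number": 1}\n{"number": 2}\n{"number": 3\n{"number": 4}\n'


def read_json_lines_recovering(text, column):
    """Parse each line independently; an invalid line yields None (NULL)."""
    values = []
    for line in text.splitlines():
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            # An invalid record must not swallow the following line.
            values.append(None)
            continue
        # Only a JSON object can provide the requested column.
        values.append(record.get(column) if isinstance(record, dict) else None)
    return values


print(read_json_lines_recovering(SAMPLE, "number"))  # [1, 2, None, 4]
```

Under these assumptions, each line is an independent record: the broken line 3 becomes NULL, and line 4 is still read as a top-level int rather than as a child of line 3.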
Environment overview (please complete the following information)
N/A
Environment details
N/A
Additional context