Describe the bug
CUDF is in the middle of fixing a lot of the issues with nested parsing in JSON (rapidsai/cudf#16545).
Once that goes in we can start to see that we are not covering some corner cases in JSON as fully as we would like.
When parsing numbers out of an array, if any number in the array overflows, then the entire array needs to be null. We currently null only the overflowing item instead.
(1 to 20).map(upper => (1 to upper).map(i => "1" + ("0" * i)).mkString("""{"a":[""",",","]}")).toDF("json").repartition(1).selectExpr(
  "from_json(json, 'a ARRAY<BYTE>') as a_byte",
  "from_json(json, 'a ARRAY<SHORT>') as a_short",
  "from_json(json, 'a ARRAY<INT>') as a_int",
  "from_json(json, 'a ARRAY<LONG>') as a_long",
  "from_json(json, 'a ARRAY<DECIMAL(21,0)>') as a_decimal").show(false)
This is not true for an array of structs: there, an overflowing field does not invalidate the entire array.
(1 to 20).map(upper => (1 to upper).map(i => "1" + ("0" * i)).mkString("""{"a":[{"b":""","""},{"b":""","}]}")).toDF("json").repartition(1).selectExpr(
  "from_json(json, 'a ARRAY<STRUCT<b:BYTE>>') as a_byte",
  "from_json(json, 'a ARRAY<STRUCT<b:SHORT>>') as a_short",
  "from_json(json, 'a ARRAY<STRUCT<b:INT>>') as a_int",
  "from_json(json, 'a ARRAY<STRUCT<b:LONG>>') as a_long",
  "from_json(json, 'a ARRAY<STRUCT<b:DECIMAL(21,0)>>') as a_decimal").show(false)
Not sure why that is, but it is.
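To make the difference for a plain array of numbers concrete, here is a minimal plain-Scala sketch of the two null-handling policies for `ARRAY<BYTE>`. It is only a simplified model, not Spark's or the plugin's actual code: the object `ArrayOverflowSketch` and the helpers `parseByte`, `wholeArray`, and `perItem` are hypothetical names, and overflow is approximated by `String.toByte` failing.

```scala
import scala.util.Try

// Hypothetical model of the two policies; names are illustrative only.
object ArrayOverflowSketch {
  // Parse one number token as a BYTE; None if it does not fit in [-128, 127].
  def parseByte(token: String): Option[Byte] = Try(token.toByte).toOption

  // Policy the issue asks for: one overflowing element nulls the whole array.
  def wholeArray(tokens: Seq[String]): Option[Seq[Byte]] = {
    val parsed = tokens.map(parseByte)
    if (parsed.forall(_.isDefined)) Some(parsed.flatten) else None
  }

  // Per-item policy: only the overflowing elements become null.
  def perItem(tokens: Seq[String]): Seq[Option[Byte]] = tokens.map(parseByte)

  def main(args: Array[String]): Unit = {
    // Same token shape as the repro above: 10, 100, 1000
    val tokens = (1 to 3).map(i => "1" + ("0" * i))
    println(wholeArray(tokens)) // the array as a whole is null (None)
    println(perItem(tokens))    // only the last element is null
  }
}
```

With the tokens `10`, `100`, `1000`, the last value overflows a byte, so `wholeArray` yields no array at all while `perItem` keeps the first two elements and nulls only the third.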
revans2 changed the title from "[BUG] JSON Scan and StructToJson should invalidate an array on out of bounds" to "[BUG] JSON Scan and JsonToStruct should invalidate an array on out of bounds" on Sep 26, 2024