Update JsonToStructs and ScanJson to have white space normalization #10575
Signed-off-by: Robert (Bobby) Evans <[email protected]>
build
build
build
Minor comment, overall lgtm.
- def deepTransformView(cv: ColumnView, dt: Option[DataType] = None)
+ def deepTransformView(cv: ColumnView, dt: Option[DataType] = None,
+     nestedMismatchHandler: Option[(ColumnView, DataType) =>
+         (Option[ColumnView], ArrayBuffer[AutoCloseable])] = None)
Does the handler need to return a mutable ArrayBuffer? I think the handler could return an immutable Seq given how it's being used, and that seems more flexible and less error-prone than forcing an ArrayBuffer here.
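To make the suggestion concrete, here is a minimal, self-contained sketch of a handler that returns an immutable `Seq` while the caller keeps the only mutable buffer. It is not the plugin's code: `FakeView`, `FakeType`, `HandlerSketch`, `transform`, and `stringOnly` are hypothetical stand-ins for `ColumnView`, `DataType`, and the real `deepTransformView` logic, chosen so the example compiles without cudf or Spark.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for a cudf ColumnView: just a named, closeable resource.
final class FakeView(val name: String) extends AutoCloseable {
  var closed = false
  override def close(): Unit = { closed = true }
}

// Hypothetical stand-in for a Spark DataType.
sealed trait FakeType
case object StringT extends FakeType
case object IntT extends FakeType

object HandlerSketch {
  // The handler hands back an immutable Seq of the resources it created.
  type MismatchHandler =
    (FakeView, FakeType) => (Option[FakeView], Seq[AutoCloseable])

  // The caller owns the mutable buffer and appends whatever the handler returns.
  def transform(
      cv: FakeView,
      dt: FakeType,
      handler: Option[MismatchHandler],
      toClose: ArrayBuffer[AutoCloseable]): FakeView = {
    handler.map(h => h(cv, dt)) match {
      case Some((Some(replaced), created)) =>
        toClose ++= created
        replaced
      case Some((None, created)) =>
        toClose ++= created
        cv
      case None => cv
    }
  }

  // Example handler: replaces the view only when a string is requested.
  val stringOnly: MismatchHandler = (v, t) => t match {
    case StringT =>
      val out = new FakeView(v.name + "_as_string")
      (Some(out), Seq(out))
    case _ => (None, Seq.empty)
  }
}
```

With this shape the handler cannot accidentally mutate the caller's bookkeeping, and returning `Seq.empty` for the no-op case needs no allocation of a buffer.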
build
@jlowe please take another look
Looks like the ArrayBuffer -> Seq change only happened halfway, suggested some updates.
sql-plugin/src/main/scala/org/apache/spark/sql/rapids/GpuJsonReadCommon.scala (4 threads, outdated, resolved)
sql-plugin/src/main/scala/com/nvidia/spark/rapids/ColumnCastUtil.scala (1 thread, outdated, resolved)
build
@jlowe please take another look
This also contributes to #10491 in a very small way by adding in a few more tests.
Mostly it turns on white space normalization and tries to verify that it is doing the right thing, but there were some errors, so I filed more issues.
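For readers unfamiliar with the term, the sketch below shows the general idea of JSON white space normalization on the CPU: white space outside of string literals is dropped, and everything inside a string is left untouched. This is only an illustration under that assumption; `JsonWhitespaceSketch.normalize` is a hypothetical helper and not the GPU implementation this PR enables.

```scala
object JsonWhitespaceSketch {
  // Removes spaces, tabs, and newlines that appear outside of JSON string literals.
  // Input is assumed to be well-formed JSON; no validation is attempted.
  def normalize(json: String): String = {
    val out = new StringBuilder(json.length)
    var inString = false // currently inside a "..." literal
    var escaped = false  // previous character inside the literal was a backslash
    json.foreach { c =>
      if (inString) {
        out.append(c)
        if (escaped) escaped = false
        else if (c == '\\') escaped = true
        else if (c == '"') inString = false
      } else if (c == '"') {
        inString = true
        out.append(c)
      } else if (!(c == ' ' || c == '\t' || c == '\n' || c == '\r')) {
        out.append(c)
      }
    }
    out.toString
  }
}
```

Tracking the escape state matters because a `\"` inside a string must not be treated as the end of the literal; otherwise white space in the rest of the string would be stripped by mistake.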