-
Notifications
You must be signed in to change notification settings - Fork 241
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fallback non-UTC TimeZoneAwareExpression with zoneId [databricks] #10996
Fallback non-UTC TimeZoneAwareExpression with zoneId [databricks] #10996
Conversation
…one config Signed-off-by: Haoyang Li <[email protected]>
Signed-off-by: Haoyang Li <[email protected]>
build |
build |
@@ -1123,7 +1123,7 @@ abstract class BaseExprMeta[INPUT <: Expression]( | |||
if (!needTimeZoneCheck) return | |||
|
|||
// Level 2 check | |||
if (!isTimeZoneSupported) return checkUTCTimezone(this) | |||
if (!isTimeZoneSupported) return checkUTCTimezone(this, getZoneId()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- do we still need to check SQLConf.get.sessionLocalTimeZone? what if user set SQLConf.get.sessionLocalTimeZone to a non UTC timezone and did not set timezone in expression?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- there're still many invokation of isUTCTimezone() in the code, do we need to go through all of them and keep them consistent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- do we still need to check SQLConf.get.sessionLocalTimeZone? what if user set SQLConf.get.sessionLocalTimeZone to a non UTC timezone and did not set timezone in expression?
I believed that the expression's timezone is from the config and have higher priority, will test this behavior to see if it is correct. thanks.
- there're still many invokation of isUTCTimezone() in the code, do we need to go through all of them and keep them consistent?
Will check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested this behavior via spark ut:
expression will follow zoneId first after this pr, under both cpu and gpu.
package org.apache.spark.sql.rapids.suites
import java.util.{Calendar, TimeZone}
import org.apache.spark.sql.catalyst.expressions._
import org.apache.spark.sql.catalyst.util.DateTimeTestUtils.{PST, UTC, UTC_OPT}
import org.apache.spark.sql.catalyst.util.DateTimeUtils
import org.apache.spark.sql.rapids.utils.{RapidsJsonConfTrait, RapidsTestsTrait}
import org.apache.spark.sql.internal.SQLConf
import org.apache.spark.sql.types._
class RapidsJsonExpressionsSuite
extends JsonExpressionsSuite with RapidsTestsTrait with RapidsJsonConfTrait {
test("to_json with timestamp 2") {
val schema = StructType(StructField("t", TimestampType) :: Nil)
val c = Calendar.getInstance(TimeZone.getTimeZone(UTC))
c.set(2016, 0, 1, 0, 0, 0)
c.set(Calendar.MILLISECOND, 0)
val struct = Literal.create(create_row(c.getTimeInMillis * 1000L), schema)
checkEvaluation(
StructsToJson(Map.empty, struct, UTC_OPT),
"""{"t":"2016-01-01T00:00:00.000Z"}"""
)
SQLConf.get.setConfString("spark.sql.session.timeZone", "UTC")
// SQLConf.get.setConfString("spark.rapids.sql.enabled", "false")
checkEvaluation(
StructsToJson(Map.empty, struct, Option(PST.getId)),
"""{"t":"2015-12-31T16:00:00.000-08:00"}"""
)
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- there're still many invokation of isUTCTimezone() in the code, do we need to go through all of them and keep them consistent?
Verified that these use cases (mostly related to scan and file format) should read the config. There is no zoneId there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
…IDIA#10996) * Fallback non-UTC TimeZoneAwareExpression with zoneId instead of timeZone config Signed-off-by: Haoyang Li <[email protected]> * clean up Signed-off-by: Haoyang Li <[email protected]> --------- Signed-off-by: Haoyang Li <[email protected]>
Close #10995
In some
TimeZoneAwareExpression
that plugin supports only UTC, the expression will fallback according toGpuOverrides.isUTCTimezone
. In some case when we call an expression directly by passing a timezone as it's parameter, like in Spark UT:The fallback doesn't work because it didn't change any timezone config.
This PR uses the zoneId in the expr to check the timezone instead of reading the timezone config to check whether to fallback to cpu under non-utc timezone.