Add time zone config to set non-UTC [databricks] #9652
Changes from all commits: 0b112ed, cbba2e3, ee930f3, f1267de, d7bb5e6, 28591d4, a696b68, 4c8fa96, a449e95
```diff
@@ -77,6 +77,15 @@ def is_emr_runtime():
 def is_dataproc_runtime():
     return runtime_env() == "dataproc"
 
+def get_test_tz():
+    return os.environ.get('TZ', 'UTC')
+
+def is_utc():
+    return get_test_tz() == "UTC"
+
+def is_not_utc():
+    return not is_utc()
+
 _is_nightly_run = False
 _is_precommit_run = False
```

Inline review comment on `is_utc`: "I didn't realize [...]" Reply: "Done in #9482"
Q: where would all these funcs be used? Also, having the pytest conf rely on the run_pyspark script seems weird. Can you at least try `os.environ.get('PYSP_TEST_spark_sql_session_timeZone', 'UTC')` to make sure it could have a default?
These funcs will be used in pytest xfail: for an existing test case, before an operator supports non-UTC, add `xfail(is_not_utc())`. I tested that we can get this config in the `conftest`. `conftest` is an internal file of the integration tests, so it's safe to get the env variable from run_pyspark.
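A sketch of how these helpers plug into an xfail mark; the test name and reason string below are illustrative, not taken from the PR:

```python
import os

def get_test_tz():
    # Read the test time zone from the TZ environment variable, defaulting to UTC
    return os.environ.get('TZ', 'UTC')

def is_utc():
    return get_test_tz() == "UTC"

def is_not_utc():
    return not is_utc()

# In a test module, an operator that does not yet support non-UTC zones
# could then be marked like this (pytest shown for illustration):
#
#   @pytest.mark.xfail(condition=is_not_utc(),
#                      reason="operator does not support non-UTC time zones yet")
#   def test_cast_timestamp(): ...
```

When `TZ` is unset or `UTC`, the mark is inert and the test must pass; under any other zone, a failure is expected rather than fatal.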
Hmm, OK. For me, the pytest code itself should at least provide the defaults here. I would like to hear more feedback from other developers.
If it is not set, then an error is thrown:

This error will force us to set this config.
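The elided error is presumably what indexing `os.environ` directly raises when the variable is unset; `.get` with a fallback avoids it. A minimal illustration (the variable name is taken from the comment above, the rest is assumed):

```python
import os

# Simulate the variable being unset
os.environ.pop('PYSP_TEST_spark_sql_session_timeZone', None)

try:
    tz = os.environ['PYSP_TEST_spark_sql_session_timeZone']  # raises KeyError when unset
except KeyError:
    tz = None

# .get with a default never raises, which is the suggested safety net
tz_safe = os.environ.get('PYSP_TEST_spark_sql_session_timeZone', 'UTC')
```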
I just threw the code together, so a `get` with a default looks like a great addition for safety/robustness.
Can you elaborate on this? I'm confused how we can export one variable and read it in conftest but somehow can't do the same for another. Is something in the shell startup environment clobbering the TZ variable?
@res-life Wouldn't wrapping `TZ` with `PYSP_TEST_spark_sql_session_timeZone` be the same as the original way we did it in `is_tz_utc`? Java's `systemDefault` will respect the `TZ` environment variable.
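Python behaves the same way as Java's `systemDefault()` here: the C library's notion of local time keys off the `TZ` environment variable. A POSIX-only sketch, not from the PR:

```python
import os
import time

os.environ['TZ'] = 'Asia/Shanghai'
time.tzset()  # POSIX-only: re-read TZ into the process's time zone state

# Asia/Shanghai is UTC+8; time.timezone is seconds *west* of UTC
print(time.timezone)  # -28800
```

This is why exporting `TZ` before the JVM (or the Python process) starts is enough to shift the default zone, with no Spark config involved.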
Discussed offline with @res-life: let's add comments noting that we need to get the UTC time before the Spark session starts.
We cannot use the following code when the Spark session is not started.
Because `TZ` is rewritten to UTC in `data_gen.py`. I removed the above code; see my last commit: I use the env variable `TZ` now. I also updated `TimestampGen` to avoid generating out-of-range timestamps.
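One way to keep a generator inside the representable range, sketched with hypothetical bounds (this is not the plugin's actual `TimestampGen` code): shrink the min/max so that applying any zone offset, at most about 14 hours either way, cannot push a value out of range.

```python
from datetime import datetime, timezone, timedelta

# Hypothetical safety margin: comfortably larger than any real UTC offset (+/-14h)
MARGIN = timedelta(days=2)
MIN_TS = datetime.min.replace(tzinfo=timezone.utc) + MARGIN
MAX_TS = datetime.max.replace(tzinfo=timezone.utc) - MARGIN

def clamp_timestamp(ts):
    # Clamp a generated timestamp into the range that stays valid in any zone
    return min(max(ts, MIN_TS), MAX_TS)
```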