Add time zone config to set non-UTC [databricks] #9652

Merged
res-life merged 9 commits into NVIDIA:branch-23.12 on Nov 17, 2023

Collaborator

@res-life res-life commented Nov 7, 2023

Contributes #9627

Add time zone config for CI to set and test non-UTC time zone

Signed-off-by: Chong Gao [email protected]

@res-life res-life requested review from pxLi and GaryShen2008 November 7, 2023 02:30
@pxLi pxLi changed the title Add time zone config to set non-UTC Add time zone config to set non-UTC [databricks] Nov 7, 2023
@res-life
Collaborator Author

res-life commented Nov 7, 2023

build

@@ -77,6 +77,15 @@ def is_emr_runtime():
def is_dataproc_runtime():
    return runtime_env() == "dataproc"

def get_test_tz():
Collaborator

@pxLi pxLi Nov 7, 2023

Q: Where will all these functions be used? Also, having the pytest conf rely on the run_pyspark script seems weird.

Can you at least use os.environ.get('PYSP_TEST_spark_sql_session_timeZone', 'UTC') so that there is a default?

Collaborator Author

These functions will be used in pytest xfail markers. For an existing test case whose operator does not yet support non-UTC time zones, add xfail(is_non_utc()).

also pytest conf would rely on run_pyspark script seems weird

I tested that we can read this config in conftest.
conftest is an internal file of the integration tests (IT), so it is safe to read an environment variable set by run_pyspark.
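
As an illustration of the xfail usage described above, here is a minimal sketch. It assumes the environment variable name from this thread; the `env` parameter, the `non_utc_xfail` name, and the test body are hypothetical and not taken from the PR. `pytest.mark.xfail` takes a condition and a `reason`, and the condition is computed when the module is imported, so no Spark session is required.

```python
import os

import pytest

# Variable name taken from this thread; run_pyspark is said to export it.
TZ_ENV_VAR = "PYSP_TEST_spark_sql_session_timeZone"


def is_non_utc(env=os.environ):
    """Return True when the configured test time zone is not UTC.

    `env` is a hypothetical parameter so the check can be tried with a plain dict.
    """
    return env.get(TZ_ENV_VAR, "UTC") != "UTC"


# The condition is evaluated here, at import (collection) time.
non_utc_xfail = pytest.mark.xfail(
    is_non_utc(),
    reason="operator does not support non-UTC time zones yet",
)


@non_utc_xfail
def test_some_timestamp_operator():
    # Hypothetical test body standing in for a real integration test.
    assert 1 + 1 == 2
```

Once the operator supports non-UTC time zones, the marker would simply be removed from the test.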

Collaborator

conftest is internal file of IT, so it's safe to get Env variable from run_pyspark.

Hmm, OK. In my view, the pytest code itself should at least provide the defaults here.

I would like to hear more feedback from other developers~

Collaborator Author

If it is not set, an error is thrown:

>>> os.environ["PYSP_TEST_spark_sql_session_timeZone"]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<frozen os>", line 679, in __getitem__
KeyError: 'PYSP_TEST_spark_sql_session_timeZone'

This error forces us to set this config.

Collaborator

I just threw the code together so a get with a default looks like a great addition for safety/robustness.
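
The two lookups being compared in this thread can be sketched side by side. This is only an illustration under assumptions: the function names follow the diff, but `get_test_tz_strict` and the `env` parameter are hypothetical additions so that both behaviours can be tried without changing the real process environment.

```python
import os

# Variable name taken from this thread.
TZ_ENV_VAR = "PYSP_TEST_spark_sql_session_timeZone"


def get_test_tz_strict(env=os.environ):
    """Indexing raises KeyError when the variable is unset (forces it to be set)."""
    return env[TZ_ENV_VAR]


def get_test_tz(env=os.environ):
    """`.get` with a default falls back to UTC when the variable is unset."""
    return env.get(TZ_ENV_VAR, "UTC")


def is_utc(env=os.environ):
    """Return True when the configured test time zone is UTC."""
    return get_test_tz(env) == "UTC"
```

The strict form fails fast if the launcher script forgets to export the variable; the defaulted form keeps the pytest code usable on its own.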

Contributor

can not get the TZ env variable in conftest.py.

Can you elaborate on this? I'm confused how we can export one variable and read it in conftest but somehow can't do the same to another. Is something in the shell startup environment bashing the TZ variable?

Collaborator

@res-life Wouldn't wrapping TZ with PYSP_TEST_spark_sql_session_timeZone be the same as the original approach we used in is_tz_utc?

Java's ZoneId.systemDefault() respects the TZ environment variable.

import java.time.ZoneId;

public class Test {
    public static void main(String[] args) {
        System.out.println("time zone is " + ZoneId.systemDefault());
    }
}

$ export TZ="UTC" && java Test
time zone is UTC
$ export TZ="Asia/Shanghai" && java Test
time zone is Asia/Shanghai

Collaborator

Discussed offline with @res-life: let's add a comment explaining that we need to check for UTC before the Spark session starts.

Collaborator Author

We cannot use the following code before the Spark session has started:

jvm = spark.sparkContext._jvm
utc = jvm.java.time.ZoneId.of('UTC').normalized()
sys_tz = jvm.java.time.ZoneId.systemDefault().normalized()

Collaborator Author

@res-life res-life Nov 17, 2023

can not get the TZ env variable in conftest.py.

Because TZ was overwritten with UTC in data_gen.py:

os.environ['TZ'] = 'UTC'
time.tzset()

I removed the above code; see my last commit.
I now use the TZ environment variable,
and I updated TimestampGen to avoid generating out-of-range timestamps.
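
To make the cause concrete, here is a small sketch of how a module-level assignment can hide the value exported by the launcher. It uses a plain dict in place of `os.environ`, and `import_side_effect` is a hypothetical stand-in for the removed lines in data_gen.py; `time.tzset()` is left out because it is not needed to show the effect on the variable.

```python
def read_tz(env):
    """Read the TZ setting the way a later reader such as conftest would."""
    return env.get("TZ", "UTC")


def import_side_effect(env):
    """Hypothetical stand-in for the removed module-level code in data_gen.py."""
    env["TZ"] = "UTC"


# What the launcher script exported before any test module was imported.
env = {"TZ": "Asia/Shanghai"}

before = read_tz(env)
import_side_effect(env)  # happens as soon as the module is imported
after = read_tz(env)

print(before, after)  # prints: Asia/Shanghai UTC
```

Any reader that runs after the import sees UTC regardless of what was exported, which matches the symptom described in this thread.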

@pxLi pxLi added the test Only impacts tests label Nov 7, 2023
@res-life res-life requested a review from revans2 November 7, 2023 07:54
@res-life
Collaborator Author

res-life commented Nov 7, 2023

I saw the skip logic in data_gen.py: when the time zone is not UTC and the data gen contains a timestamp type, the tests are skipped.
Pushed a new commit.

@res-life
Collaborator Author

res-life commented Nov 7, 2023

build

@res-life
Collaborator Author

res-life commented Nov 7, 2023

I cherry-picked this PR into #9482

revans2
revans2 previously approved these changes Nov 7, 2023
Collaborator

@revans2 revans2 left a comment

Just some nits really.

@@ -77,6 +77,15 @@ def is_emr_runtime():
def is_dataproc_runtime():
    return runtime_env() == "dataproc"

def get_test_tz():
    return os.environ["PYSP_TEST_spark_sql_session_timeZone"]

def is_utc():
Collaborator

I didn't realize is_tz_utc existed. I think we should probably only keep one way of checking if the timezone is UTC or not. is_tz_utc has the problem that we need a spark session to make it work. That is fine, but it also makes it difficult to use it to skip a test unless it happens after the test starts to run. This patch removes all uses of is_tz_utc so perhaps we should also delete the implementation too.

Collaborator Author

Done in #9482:

  • Added a default value for PYSP_TEST_spark_sql_session_timeZone
  • Removed the now-unused is_tz_utc

@res-life
Collaborator Author

build

@res-life
Collaborator Author

The build was blocked by #9681

@res-life
Collaborator Author

build

@res-life
Collaborator Author

build

@res-life
Collaborator Author

res-life commented Nov 15, 2023

Premerge error:

[2023-11-15T08:29:12.100Z] - mixed blocking alloc with spill *** FAILED ***
[2023-11-15T08:29:12.102Z]   java.util.concurrent.TimeoutException:
[2023-11-15T08:29:12.102Z]   at com.nvidia.spark.rapids.HostAllocSuite$TaskThread$TaskThreadTrackingOp.get(HostAllocSuite.scala:107)
[2023-11-15T08:29:12.102Z]   at com.nvidia.spark.rapids.HostAllocSuite$AllocOnAnotherThread.waitForAlloc(HostAllocSuite.scala:218)
[2023-11-15T08:29:12.102Z]   at com.nvidia.spark.rapids.HostAllocSuite.$anonfun$new$50(HostAllocSuite.scala:705)
[2023-11-15T08:29:12.102Z]   at com.nvidia.spark.rapids.HostAllocSuite.$anonfun$new$50$adapted(HostAllocSuite.scala:703)
[2023-11-15T08:29:12.103Z]   at com.nvidia.spark.rapids.Arm$.withResource(Arm.scala:29)
[2023-11-15T08:29:12.103Z]   at com.nvidia.spark.rapids.HostAllocSuite.$anonfun$new$49(HostAllocSuite.scala:703)
[2023-11-15T08:29:12.103Z]   at com.nvidia.spark.rapids.HostAllocSuite.$anonfun$new$49$adapted(HostAllocSuite.scala:699)
[2023-11-15T08:29:12.103Z]   at com.nvidia.spark.rapids.Arm$.withResource(Arm.scala:29)
[2023-11-15T08:29:12.103Z]   at com.nvidia.spark.rapids.HostAllocSuite.$anonfun$new$48(HostAllocSuite.scala:699)
[2023-11-15T08:29:12.103Z]   at com.nvidia.spark.rapids.HostAllocSuite.$anonfun$new$48$adapted(HostAllocSuite.scala:690)

Get result in 1s, but a timeout exception occurred. It seems to be a random failure.
I'll re-run the premerge.

Refer to: #9671

@res-life
Collaborator Author

build

@res-life
Collaborator Author

@revans2 We need to merge this first.

revans2
revans2 previously approved these changes Nov 15, 2023
Collaborator

@revans2 revans2 left a comment

But I would like others to review/approve too before merging because I wrote a lot of this code.

@res-life res-life requested a review from winningsix November 16, 2023 01:57
@res-life
Collaborator Author

build

@res-life
Collaborator Author

build

@winningsix
Collaborator

@res-life As discussed, let's file a ticket to track the DB failure if you can reproduce it with the seed ID from the last round.

@res-life
Collaborator Author

build

@res-life
Collaborator Author

@jlowe @NVnavkumar Please help review, thanks.

@res-life
Collaborator Author

@res-life As discussed, let's file a ticket to track the DB failure if you can reproduce it with the seed ID from the last round.

@winningsix I did not reproduce it in the last premerge.
I also did not reproduce it on DB 3.2.1 using the seed ID. I will try DB 3.3.0.

Collaborator

@winningsix winningsix left a comment

LGTM

@res-life res-life merged commit 3ef145d into NVIDIA:branch-23.12 Nov 17, 2023
36 checks passed
@res-life
Collaborator Author

@winningsix DB 330 also did not reproduce the previous error.

Anyway, recording the previous error here:

[2023-11-16T09:54:48.168Z] [2023-11-16T09:54:09.082Z] =========================== short test summary info ============================
[2023-11-16T09:54:48.168Z] [2023-11-16T09:54:09.082Z] FAILED ../../src/main/python/delta_zorder_test.py::test_delta_dfp_reuse_broadcast_exchange[false-0][DATAGEN_SEED=1700125723, IGNORE_ORDER({'local': True})] - pyspark.errors.exceptions.AnalysisException: [UNRESOLVED_COLUMN.WITH_SUGGES...
[2023-11-16T09:54:48.169Z] [2023-11-16T09:54:09.082Z] FAILED ../../src/main/python/delta_zorder_test.py::test_delta_dfp_reuse_broadcast_exchange[false-1][DATAGEN_SEED=1700125723, IGNORE_ORDER({'local': True})] - pyspark.errors.exceptions.AnalysisException: [UNRESOLVED_COLUMN.WITH_SUGGES...
[2023-11-16T09:54:48.169Z] [2023-11-16T09:54:09.082Z] FAILED ../../src/main/python/delta_zorder_test.py::test_delta_dfp_reuse_broadcast_exchange[false-2][DATAGEN_SEED=1700125723, INJECT_OOM, IGNORE_ORDER({'local': True})] - pyspark.errors.exceptions.AnalysisException: [UNRESOLVED_COLUMN.WITH_SUGGES...
[2023-11-16T09:54:48.169Z] [2023-11-16T09:54:09.082Z] FAILED ../../src/main/python/delta_zorder_test.py::test_delta_dfp_reuse_broadcast_exchange[false-3][DATAGEN_SEED=1700125723, IGNORE_ORDER({'local': True})] - pyspark.errors.exceptions.AnalysisException: [UNRESOLVED_COLUMN.WITH_SUGGES...
[2023-11-16T09:54:48.169Z] [2023-11-16T09:54:09.082Z] FAILED ../../src/main/python/delta_zorder_test.py::test_delta_dfp_reuse_broadcast_exchange[false-4][DATAGEN_SEED=1700125723, IGNORE_ORDER({'local': True})] - pyspark.errors.exceptions.AnalysisException: [UNRESOLVED_COLUMN.WITH_SUGGES...
[2023-11-16T09:54:48.169Z] [2023-11-16T09:54:09.082Z] FAILED ../../src/main/python/delta_zorder_test.py::test_delta_dfp_reuse_broadcast_exchange[true-0][DATAGEN_SEED=1700125723, INJECT_OOM, IGNORE_ORDER({'local': True})] - pyspark.errors.exceptions.AnalysisException: [UNRESOLVED_COLUMN.WITH_SUGGES...
[2023-11-16T09:54:48.169Z] [2023-11-16T09:54:09.082Z] FAILED ../../src/main/python/delta_zorder_test.py::test_delta_dfp_reuse_broadcast_exchange[true-1][DATAGEN_SEED=1700125723, INJECT_OOM, IGNORE_ORDER({'local': True})] - pyspark.errors.exceptions.AnalysisException: [UNRESOLVED_COLUMN.WITH_SUGGES...
[2023-11-16T09:54:48.169Z] [2023-11-16T09:54:09.082Z] FAILED ../../src/main/python/delta_zorder_test.py::test_delta_dfp_reuse_broadcast_exchange[true-2][DATAGEN_SEED=1700125723, INJECT_OOM, IGNORE_ORDER({'local': True})] - pyspark.errors.exceptions.AnalysisException: [UNRESOLVED_COLUMN.WITH_SUGGES...
[2023-11-16T09:54:48.169Z] [2023-11-16T09:54:09.082Z] FAILED ../../src/main/python/delta_zorder_test.py::test_delta_dfp_reuse_broadcast_exchange[true-3][DATAGEN_SEED=1700125723, IGNORE_ORDER({'local': True})] - pyspark.errors.exceptions.AnalysisException: [UNRESOLVED_COLUMN.WITH_SUGGES...
[2023-11-16T09:54:48.169Z] [2023-11-16T09:54:09.082Z] FAILED ../../src/main/python/delta_zorder_test.py::test_delta_dfp_reuse_broadcast_exchange[true-4][DATAGEN_SEED=1700125723, IGNORE_ORDER({'local': True})] - pyspark.errors.exceptions.AnalysisException: [UNRESOLVED_COLUMN.WITH_SUGGES...
[2023-11-16T09:54:48.169Z] [2023-11-16T09:54:09.082Z] = 10 failed, 241 passed, 5 skipped, 21949 deselected, 5 xfailed, 9 warnings in 2695.10s (0:44:55) =

Labels
test Only impacts tests
5 participants