Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

LT Nightly tests: removing forked and adding overall timeout for 2 hrs #19792

Merged
merged 1 commit into from
Feb 3, 2021

Conversation

access2rohit
Copy link
Contributor

@access2rohit access2rohit commented Jan 26, 2021

Description

--forked option causes significant slowdown in pytest runs hence it is better to remove that option and simply increase timeout for large tensor tests

Checklist

Essentials

  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage
  • Code is well-documented

Testing

Done by creating the nightly CI docker build:
ci/build.py -e BRANCH=master --docker-registry mxnetci --platform ubuntu_cpu --docker-build-retries 3 --shm-size 500m /work/runtime_functions.sh build_ubuntu_cpu_large_tensor
and then ran inside the container:
ci/build.py -e BRANCH=master --docker-registry mxnetci --platform ubuntu_cpu --docker-build-retries 3 --shm-size 500m /work/runtime_functions.sh build_ubuntu_cpu_large_tensor

Final output of above command:

========== 201 passed, 11 skipped, 275 warnings in 4228.09s (1:10:28) ==========
2021-02-02 03:11:30,367 - root - INFO - Waiting for status of container 3364a354be6a for 600 s.
2021-02-02 03:11:30,552 - root - INFO - Container exit status: {'Error': None, 'StatusCode': 0}
2021-02-02 03:11:30,553 - root - INFO - Container exited with success 👍
2021-02-02 03:11:30,553 - root - INFO - Executed command for reproduction:

@mxnet-bot
Copy link

Hey @access2rohit , Thanks for submitting the PR
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:

  • To trigger all jobs: @mxnet-bot run ci [all]
  • To trigger specific jobs: @mxnet-bot run ci [job1, job2]

CI supported jobs: [miscellaneous, sanity, centos-gpu, windows-cpu, unix-cpu, unix-gpu, clang, windows-gpu, centos-cpu, website, edge]


Note:
Only following 3 categories can trigger CI :PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.

@lanking520 lanking520 added pr-awaiting-testing PR is reviewed and waiting CI build and test pr-work-in-progress PR is still work in progress and removed pr-awaiting-testing PR is reviewed and waiting CI build and test labels Jan 26, 2021
@szha szha requested a review from leezu January 27, 2021 03:10
@leezu
Copy link
Contributor

leezu commented Jan 27, 2021

Can you provide more details? Why would removing --fork make the tests run longer (ie. timeout increase is needed)?

@access2rohit
Copy link
Contributor Author

access2rohit commented Jan 27, 2021

Can you provide more details? Why would removing --fork make the tests run longer (ie. timeout increase is needed)?

@leezu
If run 1-2 tests with and without --forked then without forked takes 2-3 mins and finishes successfully but with forked it hangs till the timeout provided in pytest.ini isn't reached. With --forked option neither of the following timeout override options take effect:

@pytest.mark.timeout(3600)
export PYTEST_TIMEOUT=3600
--timeout=3600

It seems that with forked individual timeout is picked only from pytest.ini and even individual decorators don't seem to work only global timeout can be set in them which is not meaningful if individual timeout is 20 mins for each test.

@lanking520 lanking520 added pr-awaiting-testing PR is reviewed and waiting CI build and test and removed pr-work-in-progress PR is still work in progress labels Jan 27, 2021
Copy link
Contributor

@leezu leezu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for clarifying! LGTM

@lanking520 lanking520 added pr-work-in-progress PR is still work in progress pr-awaiting-testing PR is reviewed and waiting CI build and test and removed pr-awaiting-testing PR is reviewed and waiting CI build and test pr-work-in-progress PR is still work in progress labels Jan 27, 2021
@lanking520 lanking520 added pr-awaiting-testing PR is reviewed and waiting CI build and test pr-work-in-progress PR is still work in progress and removed pr-work-in-progress PR is still work in progress pr-awaiting-testing PR is reviewed and waiting CI build and test labels Feb 2, 2021
@lanking520 lanking520 added pr-work-in-progress PR is still work in progress pr-awaiting-testing PR is reviewed and waiting CI build and test and removed pr-awaiting-testing PR is reviewed and waiting CI build and test pr-work-in-progress PR is still work in progress labels Feb 2, 2021
@lanking520 lanking520 added pr-awaiting-testing PR is reviewed and waiting CI build and test and removed pr-work-in-progress PR is still work in progress labels Feb 2, 2021
@access2rohit access2rohit changed the title removing forked and adding overall timeout for 2 hrs LT removing forked and adding overall timeout for 2 hrs Feb 3, 2021
@access2rohit access2rohit changed the title LT removing forked and adding overall timeout for 2 hrs LTNightly tests: removing forked and adding overall timeout for 2 hrs Feb 3, 2021
@access2rohit access2rohit changed the title LTNightly tests: removing forked and adding overall timeout for 2 hrs LT Nightly tests: removing forked and adding overall timeout for 2 hrs Feb 3, 2021
@lanking520 lanking520 added pr-work-in-progress PR is still work in progress pr-awaiting-testing PR is reviewed and waiting CI build and test pr-awaiting-review PR is waiting for code review and removed pr-awaiting-testing PR is reviewed and waiting CI build and test pr-work-in-progress PR is still work in progress labels Feb 3, 2021
@Zha0q1 Zha0q1 merged commit 65beccd into apache:master Feb 3, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
pr-awaiting-review PR is waiting for code review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants