
Increase timeouts of tests that frequently timeout in CI #1228

Merged: 5 commits merged into rapidsai:branch-23.10 on Aug 29, 2023

Conversation

pentschev
Member

Over the past few weeks, some tests have timed out fairly frequently in CI, probably due to its load. This PR attempts to avoid those failures by increasing timeouts from 20 to 30 seconds.
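For illustration only, a minimal sketch of what such a bump could look like, assuming the affected tests enforce their limit through the pytest-timeout plugin; the actual dask-cuda tests (and their timeout mechanism) may differ, and the test name below is hypothetical.

```python
import time

import pytest


# Hypothetical test standing in for one that occasionally runs slowly on busy
# CI runners; the decorator bumps the per-test limit from 20 to 30 seconds.
@pytest.mark.timeout(30)  # previously timeout=20
def test_cluster_roundtrip_example():
    time.sleep(0.1)  # placeholder for the real cluster work
```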

@pentschev requested a review from a team as a code owner August 28, 2023 13:57
@github-actions bot added the python (python code needed) label Aug 28, 2023
@pentschev
Member Author

A sample of such timeouts can be seen in yesterday's nightly tests.

@pentschev added the bug (Something isn't working), 3 - Ready for Review (Ready for review by team), and non-breaking (Non-breaking change) labels Aug 28, 2023
@pentschev
Member Author

More examples of timeouts from last night: https://github.com/rapidsai/dask-cuda/actions/runs/6008227979/attempts/1

@wence-
Contributor

wence- commented Aug 29, 2023

Those timeouts sadly still persist in this PR. Do we need to bump them further?

@pentschev
Member Author

I'm setting up locally to see if I can reproduce them and check how long the average runtime is on our systems. I'm wondering whether this is only a matter of the timeout or whether there's something else hidden under the timeout error.

@pentschev requested a review from a team as a code owner August 29, 2023 10:24
@github-actions bot added the ci label Aug 29, 2023
@pentschev
Member Author

Or perhaps an easier and more accurate way to measure this is to (maybe temporarily) increase timeouts further and print durations of all tests in CI.

@pentschev added the 2 - In Progress (Currently a work in progress) label and removed the 3 - Ready for Review (Ready for review by team) label Aug 29, 2023
@pentschev
Member Author

Of course, now that the timeout has been increased there's nothing unusual: test_communicating_proxy_objects takes < 10s every time. I think we may need to leave --durations=0 in place for a few days and observe it during the nightly tests. It looks to me like the nightly tests fail because that's when utilization of the CI nodes is highest, with nightly tests running for all projects at once, and thus things get slower.
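For reference, a rough sketch of what that measurement could look like, assuming the suite is driven by pytest: passing --durations=0 makes pytest report the runtime of every test rather than only the slowest ones. The test path below is illustrative, not the project's actual CI invocation.

```python
import sys

import pytest

if __name__ == "__main__":
    # --durations=0 prints the duration of every test, so slow-but-passing
    # tests become visible in CI logs; the path below is illustrative.
    sys.exit(pytest.main(["--durations=0", "dask_cuda/tests"]))
```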

What do you think we should do here @wence- ?

@wence-
Contributor

wence- commented Aug 29, 2023

That sounds like a reasonable approach, yes.

@pentschev added the 3 - Ready for Review (Ready for review by team) label and removed the 2 - In Progress (Currently a work in progress) label Aug 29, 2023
@pentschev
Member Author

> That sounds like a reasonable approach, yes.

Alright, should we merge it as is, then? I'll keep an eye on tonight's run and see what I can find out.

@pentschev
Member Author

/merge

1 similar comment
@pentschev
Member Author

/merge

@pentschev
Member Author

@wence- I was trying to get this merged today, but it seems today's not the day to merge it. 😞

@rapids-bot bot merged commit 171fd2c into rapidsai:branch-23.10 Aug 29, 2023
@pentschev deleted the test-timeout-increase branch October 4, 2023 17:23
Labels
3 - Ready for Review (Ready for review by team), bug (Something isn't working), ci, non-breaking (Non-breaking change), python (python code needed)
4 participants