Fix test execution sometimes hanging (OSOE-129) #126

Closed
Piedone opened this issue Jun 1, 2022 · 11 comments
Labels
bug Something isn't working

Comments

Member

Piedone commented Jun 1, 2022

Sometimes, for seemingly no good reason, the Tests step hangs and times out if we let it run. Let's fix this.

  • See e.g.:
    • This build, logs:
      logs_364.zip. Happened on Ubuntu.
    • This one happened on Windows. Logs:
      logs_334.zip
    • Another one, ironically, happened when setting the workflow timeout as a stopgap to prevent runs from going on for 6 hours.
    • See others under canceled Actions (note that a run can be canceled at the user's request too; a run is only interesting if it took one hour and was canceled by the system).
  • It seems that the test run doesn't even start dotnet test. Maybe a command before that is hanging.
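
One way to narrow down which command hangs would be to give every step its own timeout and log how long each one took. Below is a minimal sketch in Python, not the workflow's actual code: `run_step`, the step names, and the timeout values are all hypothetical, and the real steps would be whatever runs before `dotnet test`.

```python
import subprocess
import sys
import time


def run_step(name, command, timeout_seconds):
    """Run one step and report whether it finished, failed, or timed out."""
    started = time.monotonic()
    try:
        completed = subprocess.run(command, timeout=timeout_seconds)
        status = "finished" if completed.returncode == 0 else "failed"
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising on timeout.
        status = "timed out"
    elapsed = time.monotonic() - started
    print(f"[{name}] {status} after {elapsed:.1f}s", flush=True)
    return status


if __name__ == "__main__":
    # Hypothetical stand-ins for the real steps; the second one simulates a hang.
    steps = [
        ("quick step", [sys.executable, "-c", "print('ok')"], 30),
        ("hanging step", [sys.executable, "-c", "import time; time.sleep(60)"], 1),
    ]
    for name, command, timeout_seconds in steps:
        run_step(name, command, timeout_seconds)
```

With a per-step log like this, a hung run would at least show the last step that started, instead of only the overall workflow timeout.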
Piedone added the bug label Jun 1, 2022
Member Author

Piedone commented Jun 21, 2022

This might have been fixed by accident after moving the test-executing code to PowerShell. We need to watch whether any new occurrences show up.

Member Author

Piedone commented Jun 27, 2022

Nope, here's a recent one.

Member Author

Piedone commented Jun 28, 2022

And here.

Member Author

Piedone commented Jun 28, 2022

Hmm, it seems that if we set maxParallelThreads = MaxParallelTests = 2 then it reliably gets stuck. See the previous ones, and here and here. This coincides with the Azure D2v2 VM used by GitHub-hosted runners having 2 vCores. However, when the issue was originally opened, we didn't have such a config. Also, this run was with a parallelism of 4. Maybe multiples of the vCore count cause an issue?

Member Author

Piedone commented Jun 28, 2022

Doesn't seem so. Here and here a run with a parallelism of 3 got stuck. Maybe it's the fifth test run that changes parallelism that always hangs.

Piedone changed the title from "Fix test execution sometimes hanging" to "Fix test execution sometimes hanging (OSOE-129)" Aug 8, 2022
Member Author

Piedone commented Aug 14, 2022

Perhaps now it's fixed? No new timeouts since June.

@sarahelsaig
Member

I had a non-repeatable timeout in SNOW-119 just 2 days ago.

Member Author

Piedone commented Aug 14, 2022

Member

sarahelsaig commented Aug 14, 2022

Ah, I didn't realize this issue was Actions-specific. Then disregard my previous comment. The timeout was in TeamCity.

Member Author

Piedone commented Aug 15, 2022

I see.

Member Author

Piedone commented Sep 2, 2022

There are still no such timeouts. I guess it solved itself, then. Good bug!

Piedone closed this as not planned Sep 2, 2022