SCT fails with OSError: [Errno 98] Address already in use
#6345
Comments
@aleksbykov isn't this related to the manager you added recently in #6302?
@soyacz, yes, it looks related to that code. It failed while trying to initialize the new process for the event counter. But it seems to be an internal Python error, so even if the process is initialized later, it will hit the same error, because the address will be in use anyway. Maybe I am wrong. As a workaround, I can add a try/except around launching this event process, so that a failure to start it doesn't fail the test. WDYT? I didn't catch this error with perf and longevity tests.
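A minimal sketch of the try/except workaround described above. The names `start_optional` and `start_fn` are hypothetical and not taken from the SCT code base, and the sketch assumes that the `OSError` surfaces in the process that launches the event counter:

```python
import errno
import logging

LOGGER = logging.getLogger(__name__)


def start_optional(start_fn):
    """Call start_fn(); tolerate only 'Address already in use'.

    start_fn is a hypothetical callable that launches the event-counter
    process. Returns True if it started, False if the address was busy.
    """
    try:
        start_fn()
    except OSError as exc:
        if exc.errno != errno.EADDRINUSE:
            # Any other OS error is unexpected and should still fail loudly.
            raise
        LOGGER.warning("Event counter not started, continuing without it: %s", exc)
        return False
    return True
```

The caller would then have to skip event counting whenever `False` is returned, which fits tests that don't count events anyway.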
We don't count events in ArtifactTests, so this could be a workaround.
Because we run only one test at a time on SCT runners. How about dropping the manager altogether and using a Queue/Pipe as a bus for add/get/remove counter (or even not getting counters at all, just get_stats straight from
One thing is for sure: in artifacts we could end up running multiple jobs on the same builder at the same time. So the "address already in use" failure sounds related to two tests running at the same time. We shouldn't have such a limit; it could backfire in multiple places.
Sounds like it's a bug in Python 3.9 and 3.10.
I'm opening a PR with a rebuild of the image; I think it would solve this one.
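A rough sketch of the Queue-as-a-bus idea, assuming a single worker that owns the counters and answers `get_stats` requests. The message format, `counter_loop`, and `run_in_child_process` are illustrative assumptions, not the SCT implementation:

```python
import multiprocessing
from collections import Counter

STOP = None  # sentinel message that tells the worker to exit


def counter_loop(commands, replies):
    """Own the counters; apply (action, name) messages read from `commands`."""
    counts = Counter()
    while True:
        message = commands.get()
        if message is STOP:
            break
        action, name = message
        if action == "add":
            counts[name] += 1
        elif action == "remove":
            counts.pop(name, None)
        elif action == "get_stats":
            # Reply with a plain dict snapshot of the current counters.
            replies.put(dict(counts))


def run_in_child_process():
    """Usage example: run the loop in a child process, no Manager involved."""
    commands = multiprocessing.Queue()
    replies = multiprocessing.Queue()
    worker = multiprocessing.Process(
        target=counter_loop, args=(commands, replies), daemon=True
    )
    worker.start()
    commands.put(("add", "some_event"))
    commands.put(("get_stats", None))
    stats = replies.get(timeout=10)
    commands.put(STOP)
    worker.join(timeout=10)
    return stats
```

Since the queues are built on pipes, this design does not need a listening address for the manager server at all.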
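For reference, the generic errno can be reproduced with two plain TCP sockets competing for one port. This only illustrates what "Address already in use" means at the OS level; it is not claimed to be the code path that failed in SCT, and `second_bind_errno` is a made-up helper name:

```python
import errno
import socket


def second_bind_errno():
    """Bind two sockets to the same loopback port; return the second bind's errno."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        first.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        first.listen(1)
        port = first.getsockname()[1]
        try:
            second.bind(("127.0.0.1", port))  # same address, still held by `first`
        except OSError as exc:
            return exc.errno
        return None
    finally:
        first.close()
        second.close()
```

Comparing the result with `errno.EADDRINUSE` avoids hard-coding the numeric value, which is 98 on Linux.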
Looks like we are using a very early release of Python 3.10 and we didn't get the revert that was done in python/cpython#98503.
Rebuilding the image with the latest release of Python 3.10.
Fixes: scylladb#6345
Ref: python/cpython#98503
Looks like we are using a very early release of Python 3.10 and we didn't get the revert that was done in python/cpython#98503.
Rebuilding the image with the latest release of Python 3.10.
Fixes: #6345
Ref: python/cpython#98503
(cherry picked from commit 61bd889)
Issue description
The artifacts-ubuntu2204-arm-test job failed with the error: OSError: [Errno 98] Address already in use
The error happened at the very beginning of the Run SCT Test () stage of the pipeline, when ClusterTester was being initialized - the test didn't really start.
How frequently does it reproduce?
It reproduced in many other artifacts jobs:
scylla-master:
scylla-enterprise:
and more
Installation details
Cluster size: 1 nodes (im4gn.xlarge)
Scylla Nodes used in this run: No resources left at the end of the run
OS / Image: ami-022c8ce295ce9ac4c (aws: undefined_region)
Test: artifacts-ubuntu2204-arm-test
Test id: 707e495b-0757-47c6-966e-4a613fa62c4f
Test name: scylla-master/artifacts/artifacts-ubuntu2204-arm-test
Test config file(s):
Logs and commands
$ hydra investigate show-monitor 707e495b-0757-47c6-966e-4a613fa62c4f
$ hydra investigate show-logs 707e495b-0757-47c6-966e-4a613fa62c4f
Logs:
Jenkins job URL
Argus