CI build failed for merged PR #3587
https://gitlab.icp.uni-stuttgart.de/espressomd/espresso/-/jobs/215951
That's odd. There haven't been any changes to the EK recently (or to ROCm) as far as I remember. Do you have an easy way of checking the CI logs to see when it popped up for the first time?
As root on elk, you can do `grep -rF 'ek_charged_plate (Failed)' /srv/gitlab/artifacts/*/*/*/2020_03_*/*/*/job.log`
To limit the results to the python branch, to ROCm builds and to the last 3 months: `grep -l -P 'Checking out .{6,12} as python' $(grep -l -F 'rocm-python3:latest' $(grep -l -F 'ek_charged_plate (Failed)' /srv/gitlab/artifacts/*/*/*/2020_0?_*/*/*/job.log))` There are only two results. Read files with
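For reference, the same three-way filter can be sketched as a small shell function. This is only an illustration, not something that exists on elk: the name `find_failed_rocm_jobs` and its two parameters are hypothetical, and the date restriction from the glob above is left to the caller (pass a narrower root directory). It uses `grep -E` rather than `-P`, which should be enough for the `.{6,12}` interval.

```shell
# Hypothetical helper (not part of the CI setup): print every job.log under
# a root directory that contains all three markers used in the greps above.
find_failed_rocm_jobs() {
    root="$1"       # artifacts root, e.g. /srv/gitlab/artifacts
    test_name="$2"  # ctest test name, e.g. ek_charged_plate
    find "$root" -type f -name job.log | sort | while IFS= read -r log; do
        # the test failed in this job
        grep -q -F "${test_name} (Failed)" "$log" || continue
        # the job ran in the ROCm image
        grep -q -F 'rocm-python3:latest' "$log" || continue
        # the job checked out the python branch
        grep -q -E 'Checking out .{6,12} as python' "$log" || continue
        printf '%s\n' "$log"
    done
}
```

Usage would look like `find_failed_rocm_jobs /srv/gitlab/artifacts ek_charged_plate`. Checking one file at a time is slower than the nested `$(...)` form, but it avoids a pitfall of that form: if an inner `grep -l` matches nothing, the outer `grep` gets no file arguments and waits on standard input.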
The `ln -s /opt/rocm/bin/hcc* /opt/rocm/hip/bin/` issue has been worked around by properly setting `HCC_PATH` on the CMake side. The shutdown issue has been worked around by replacing interrupts with polling (suggested at ROCm/roctracer#22 (comment)). Something is wrong with the destruction order in our code, but I cannot easily identify what. It's not the missing `cudaStreamDestroy` though.

Fixes #3620 (according to `ctest -R save_checkpoint_lb.cpu-p3m.cpu-lj-therm.lb_1 --repeat-until-fail 1000`).
Fixes #3587 (according to `ctest -R ek_charged_plate --repeat-until-fail 100`).

**TODO**
- https://github.com/espressomd/docker/blob/master/docker/rocm-python3/Dockerfile-latest needs to be updated to ROCm 3.3 once this pull request is merged.
https://gitlab.icp.uni-stuttgart.de/espressomd/espresso/pipelines/11478