secex-data volume hack is subject to races #835

Open
jlebon opened this issue Mar 13, 2023 · 6 comments

Comments

@jlebon
Member

jlebon commented Mar 13, 2023

On non-s390x, we rely on the --volume=secex-data:/data.secex:ro switch we pass to podman in cosa remote-session create to just create an empty volume. This logic, however, is subject to races if we're creating multiple remote sessions on the same non-s390x builder:

Error: creating named volume "secex-data": adding volume to state: name secex-data is in use: volume already exists

(In this case, this happened in the bump-lockfile job, which often gets executed in parallel for testing-devel and next-devel.)

Probably the simplest fix is to create the volume at provisioning time. That way it's consistent with s390x too.
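The race and the provisioning-time fix can be sketched with a stand-in for podman's volume store (a toy model only: `create_volume` and `ensure_volume` are hypothetical helpers, not cosa or podman code; a directory plays the role of the volume store since `mkdir` is atomic and, like `podman volume create`, fails when the name is already taken):

```shell
# Toy model of the race seen above: the named-volume store is a
# directory, and mkdir stands in for `podman volume create`.
store=$(mktemp -d)

create_volume() {               # hypothetical stand-in for `podman volume create $1`
  mkdir "$store/$1" 2>/dev/null
}

# Provisioning-time fix: create the volume once, up front, tolerating
# "already exists" -- later remote sessions then only ever mount an
# existing volume instead of racing to create it.
ensure_volume() {
  create_volume "$1" || true
}

ensure_volume secex-data
ensure_volume secex-data        # a second, racing session: no error now
echo "volume ready"
```

With the raw `create_volume`, every call after the first fails, which is exactly the "name secex-data is in use: volume already exists" error from the original report.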

@dustymabe
Member

I wonder if there is a race condition in podman itself here that they would be interested in fixing.

@jschintag
Contributor

While this does seem like a podman issue, as a workaround we could for now apply the same method we use on the s390x builder to keep the volume from being garbage collected in the first place. This goes against the reason we use the volume at all, namely to avoid doing configuration on other builders that is only really needed on one arch. But it should prevent this issue, since the volume would always be available.

```yaml
files:
  - path: /home/builder/.config/systemd/user/secex-data-keepalive.service
    mode: 0644
    user:
      name: builder
    group:
      name: builder
    contents:
      inline: |
        [Unit]
        Description=Run keepalive container for secex-data volume. See: https://github.com/containers/podman/issues/17051
        [Service]
        Type=oneshot
        ExecStart=podman run -d --replace --name secex-data-keepalive -v secex-data:/data.secex:ro registry.fedoraproject.org/fedora:36 sleep infinity
links:
  - path: /home/builder/.config/systemd/user/default.target.wants/secex-data-keepalive.service
    target: /home/builder/.config/systemd/user/secex-data-keepalive.service
    user:
      name: builder
    group:
      name: builder
```
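Why a keepalive container pins the volume can be modeled in a few lines (a toy model only, assuming prune-style garbage collection removes volumes that no container references; none of these names come from podman's code):

```shell
# Toy model: garbage collection removes only volumes that no container
# references, so a long-running keepalive container that mounts
# secex-data keeps the volume alive across prunes.
volumes="secex-data scratch"
refs="secex-data"        # the keepalive container mounts secex-data

prune() {
  kept=""
  for v in $volumes; do
    case " $refs " in
      *" $v "*) kept="$kept$v " ;;   # in use: survives the prune
      *)        echo "pruned $v" ;;
    esac
  done
  volumes=$kept
}

prune
echo "remaining: $volumes"
```

Only the unreferenced `scratch` volume is pruned; `secex-data` survives because the keepalive container counts as a reference.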

@dustymabe
Member

We'll be able to remove that "keep from being garbage collected" debt very soon: the fix for containers/podman#17051 is in podman 4.5.0, which is already in next and will be in testing and stable within a month. So we could put off fixing this until we drop that tech debt.

@jschintag want to confirm the fix for containers/podman#17051 works as you expected?

@jschintag
Contributor

I tested it and it works.

@dustymabe
Member

@jschintag is the original issue described by this ticket still a problem?

@jschintag
Contributor

I mean, as this is for non-s390x architectures, I would say yes, this could still happen. I did not hear anything about the race condition being fixed in podman. Did we ever even create an issue over at https://github.com/containers/podman/issues for this?
