Race condition when "skopeo copy" multiple tags into the same oci:directory at the same time #1028

Open
majek opened this issue Aug 18, 2020 · 4 comments
Labels
kind/feature A request for, or a PR adding, new functionality

Comments


majek commented Aug 18, 2020

I'm not sure what guarantees skopeo gives with regard to races. See:

marek@mrnew:/tmp$ (skopeo copy docker://registry.fedoraproject.org/fedora:30 oci:image:30  &); (skopeo copy docker://registry.fedoraproject.org/fedora:32 oci:image:32 &); (skopeo copy docker://registry.fedoraproject.org/fedora:33 oci:image:33)

... wait for them to finish...

marek@mrnew:/tmp$ jq . < image/index.json |grep name
        "org.opencontainers.image.ref.name": "latest"
        "org.opencontainers.image.ref.name": "31"
        "org.opencontainers.image.ref.name": "33"

I would expect to see the tag "32" there as well, but I presume it raced with the other downloads. Is this expected? Is it okay to run multiple "skopeo copy" into "oci:dir" at the same time?
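
One way to check for a lost tag after a batch of copies is to compare the tags that were requested against the `org.opencontainers.image.ref.name` annotations in `index.json`. The following is a minimal sketch, not part of skopeo: the function names `list_oci_tags` and `missing_oci_tags` are made up for illustration, and it assumes `jq` is installed.

```shell
#!/bin/sh
# Hypothetical helpers (not part of skopeo); they assume jq is installed.

# Print the tag annotations recorded in an OCI layout's index.json, one per line.
# Usage: list_oci_tags DIR
list_oci_tags() {
    jq -r '.manifests[]?.annotations["org.opencontainers.image.ref.name"] // empty' \
        "$1/index.json"
}

# Print every expected tag that is absent from the index.
# Returns non-zero if at least one tag is missing.
# Usage: missing_oci_tags DIR TAG...
missing_oci_tags() {
    dir=$1
    shift
    present=$(list_oci_tags "$dir")
    rc=0
    for tag in "$@"; do
        # -x matches whole lines, -F treats the tag as a literal string.
        if ! printf '%s\n' "$present" | grep -qxF -- "$tag"; then
            echo "$tag"
            rc=1
        fi
    done
    return $rc
}
```

For example, `missing_oci_tags image 30 32 33` prints each requested tag that did not make it into `image/index.json`, so a script can detect the race and retry the affected copy.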

@mtrmac mtrmac transferred this issue from containers/skopeo Aug 18, 2020
Collaborator

mtrmac commented Aug 18, 2020

Thanks for your report.

Handling concurrent writes hasn’t been an explicit design goal so far, and is non-obvious to achieve in general on Linux: mandatory file locking is not available, advisory file locking only works if every implementation cooperates, and the usual temp file + rename trick breaks even that.

Basically, c/image would have to invent its own private locking scheme for oci: directories, and hope that there isn’t any other concurrent writer.

Worse, there’s a design dichotomy between locking for the full duration of an operation (in which case the above series of copies would get no speed-up to speak of) and locking only for individual file writes (which would work for a group of add-only writers but could break pretty badly once something like #993 is added — blobs could be removed before an image is finished being written). There’s probably a way to design locking / in-progress state to support both fast concurrent writers and safety against concurrent deletes — but is that complexity really worth it?
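
For illustration, the "lock for the full duration of an operation" option can be approximated outside c/image with `flock(1)` from util-linux. This is only a sketch under stated assumptions: the `locked_copy` wrapper and the `<layout>.lock` file name are hypothetical conventions, not anything c/image implements, and the advisory lock protects only against writers that go through the same wrapper.

```shell
#!/bin/sh
# Hypothetical wrapper: hold an exclusive advisory lock for the whole copy.
# SKOPEO can be overridden (for example SKOPEO=echo) for a dry run.
# Usage: locked_copy LAYOUT_DIR SOURCE TAG
locked_copy() {
    layout=$1 src=$2 tag=$3
    # The lock file sits next to the layout rather than inside it, so the
    # layout itself stays a plain OCI directory. flock creates the file
    # if it does not exist and releases the lock when the command exits.
    flock -x "${layout}.lock" ${SKOPEO:-skopeo} copy "$src" "oci:${layout}:${tag}"
}
```

Starting several jobs such as `locked_copy image docker://registry.fedoraproject.org/fedora:32 32 &` and then calling `wait` makes the copies run one at a time. That is exactly the trade-off described above: cooperating writers no longer clobber `index.json`, but there is no speed-up to speak of.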

So, at this point, I’d recommend serializing the Skopeo invocations. Alternatively, if the goal is to transfer images using a file system, run a temporary docker/distribution server, copy the images there, and transfer the backing storage of the server. That would mostly avoid the concurrent delete problem (because there isn’t a single index to serialize, and deletes are not enabled there either :) ) and, more importantly, preserve the original representation and digests of the images instead of forcing a conversion to OCI.

Author

majek commented Aug 31, 2020

Git is a great example of concurrent access, synchronized through the disk, done right, so it is possible. As for the "temporary distribution server": any suggestions?

Collaborator

mtrmac commented Aug 31, 2020

`podman run -p 5000:5000 registry:2` with an appropriate storage volume, or the out-of-container equivalent.
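
Spelled out, that suggestion might look like the sketch below. Several details are assumptions rather than part of the comment above: the container name `tmp-registry`, the helper names `start_registry` and `push_tag`, and the use of `--dest-tls-verify=false` because a registry started this way serves plain HTTP. `/var/lib/registry` is the default storage path of the `registry:2` image. Depending on the host, the image may need to be written as `docker.io/library/registry:2`, and the volume may need an SELinux label option.

```shell
#!/bin/sh
# Sketch only. PODMAN and SKOPEO can be overridden (for example with echo)
# to print the commands instead of running them.

# Start a throwaway registry whose data lives in a host directory.
# STORAGE_DIR should be an absolute path.
# Usage: start_registry STORAGE_DIR
start_registry() {
    mkdir -p "$1" &&
    ${PODMAN:-podman} run -d --name tmp-registry -p 5000:5000 \
        -v "$1:/var/lib/registry" registry:2
}

# Copy one tag from a source repository into the local registry,
# keeping the last path component as the repository name.
# Usage: push_tag SOURCE_REPO TAG
push_tag() {
    ${SKOPEO:-skopeo} copy --dest-tls-verify=false \
        "$1:$2" "docker://localhost:5000/${1##*/}:$2"
}
```

After `start_registry /srv/registry-data`, calls such as `push_tag docker://registry.fedoraproject.org/fedora 32` can run concurrently, since the registry rather than a shared `index.json` arbitrates the writes. The storage directory is then what gets transferred.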

@mtrmac mtrmac added the kind/feature A request for, or a PR adding, new functionality label Dec 9, 2022
Collaborator

mtrmac commented Jan 2, 2024

> Worse, there’s a design dichotomy … could break pretty badly once something like #993 is added

After #2003, we now support deleting images from an oci: destination, so any implementation would need to handle that.
