Don't write duplicate entries e.g. symlinks to tar file #88

Open
fionera opened this issue Jan 15, 2025 · 5 comments

Comments

@fionera

fionera commented Jan 15, 2025

Currently, an rpm2tar result can contain multiple entries for the same target path. Because of this, it isn't possible to build e.g. an oci_layer out of it. The symlinks should be added to the collector's map of written files, and the collector should skip any later write for these paths.

@kellyma2
Collaborator

Do you have a reproducer?

@fionera
Author

fionera commented Jan 15, 2025

load("@bazeldnf//:deps.bzl", "rpmtree")
load("@rules_oci//oci:defs.bzl", "oci_image", "oci_load", "oci_push")

rpmtree(
    name = "sandbox",
    rpms = [
        "@binutils-0__2.41-38.fc40.x86_64//rpm",
        "@binutils-gold-0__2.41-38.fc40.x86_64//rpm",
    ],
    symlinks = {
        "/usr/bin/ld": "/usr/bin/ld.bfd",
    },
    visibility = ["//visibility:public"],
)
oci_image(
    name = "sandbox_image",
    base = "@distroless_base",
    entrypoint = [],
    tars = [
        ":sandbox",
    ],
    visibility = ["//visibility:private"],
    workdir = "/root",
)
oci_load(
    name = "load",
    image = ":sandbox_image",
    repo_tags = ["foo"],
)

    rpm(
        name = "binutils-0__2.41-38.fc40.x86_64",
        sha256 = "5dba5e8826c29a4b4d55fb506c9b6f929ded1e73259fce26630cf13f1f4d5715",
        urls = [
            "https://dl.fedoraproject.org/pub/fedora/linux/updates/40/Everything/x86_64/Packages/b/binutils-2.41-38.fc40.x86_64.rpm",
        ],
    )
    rpm(
        name = "binutils-gold-0__2.41-38.fc40.x86_64",
        sha256 = "02962db175354365a447c0cfd56c7f4902359dee3a4b302c9d123799b840f218",
        urls = [
            "https://dl.fedoraproject.org/pub/fedora/linux/updates/40/Everything/x86_64/Packages/b/binutils-gold-2.41-38.fc40.x86_64.rpm",
        ],
    )

@manuelnaranjo
Collaborator

So the overlap is in usr/lib/.build-id, which I think we should ignore anyway. How does the error manifest? What are your bazel calls that lead to the issue? Could you make a repro repo so we can test things?

@fionera
Author

fionera commented Jan 15, 2025

rpm2tar writes its symlinks ( https://github.com/rmohr/bazeldnf/blob/main/cmd/rpm2tar.go#L64 ) before the actual files from all rpms. But because the collector doesn't know that the symlinks have been written (they are added directly to the tarWriter), the original file is also added, e.g. /usr/bin/ld.

If you try to run the :load target, your Docker daemon will complain that there are multiple entries.

@manuelnaranjo
Collaborator

So ld comes from the symlinks you're passing into the rpmtree call and from one of the rpms (not from both). The only overlapping file is in the path I mentioned; I opened both rpms manually to check.
Maybe for the symlink you should add a tar that creates the symlink in another layer. You will still be wasting space in the image, as the ld binary is still there.
I would say that, to work on a fix, the first thing we need is either a repro repository or an e2e test like the other ones we have.
