This repository has been archived by the owner on Feb 28, 2023. It is now read-only.

integration tests fail to setup overlay within container launch #10

Closed
drahnr opened this issue Nov 17, 2020 · 15 comments · Fixed by #128

@drahnr
Contributor

drahnr commented Nov 17, 2020

During the implementation of #9 I ran into issues trying to create an overlay mount from within another container, which is part of the unit test harness.

This piece of code, https://github.com/paritytech/sccache/blob/bernhard-podman/src/bin/sccache-dist/build.rs#L273-L307, fails with the following error:

(shortened UUIDs to d9629, added newlines for readability)

 WARN 2020-11-17T09:31:07Z: sccache::dist::http::server: Res 2 error: run build failed,
caused by: Compilation execution failed,
caused by: Failed to mount overlay FS: 
overlayfs "/sccache-bits/build-dir/toolchains/d9629",
upperdir="/sccache-bits/build-dir/builds/d9629-1/upper",
workdir="/sccache-bits/build-dir/builds/d9629-1/work" -> "/sccache-bits/build-dir/builds/d9629-1/target":
Operation not permitted (os error 1) (
"/sccache-bits/build-dir/toolchains/d9629": exists,
upperdir: exists,
workdir: exists,
same-fs,
target: exists,
mapped-root)

To reproduce:

cargo t --features dist-tests test_dist_basic -- --nocapture

in branch bernhard-podman.

Context

The outer container is a rootless podman container.

podman has configurable storage backends (overlay, vfs, btrfs); the first and last were tried, without any effect.
Adding --privileged or --cap-add CAP_SYS_ADMIN was also tried for both backends, without effect.
Relevant code: https://github.com/paritytech/sccache/blob/bernhard-podman/tests/harness/mod.rs#L354-L387
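
The `mapped-root` marker in the error above suggests the process runs as uid 0 inside a user namespace rather than as real host root, which would be consistent with a rootless outer container. A minimal sketch of how that can be checked from the contents of `/proc/self/uid_map`, assuming the three-column format described in user_namespaces(7). The names `UidMapEntry`, `parse_uid_map` and `is_host_root_mapping` are hypothetical, chosen for illustration, and are not part of sccache or of the mount library it uses:

```rust
/// One line of /proc/<pid>/uid_map: `inside outside length`.
#[derive(Debug, PartialEq)]
struct UidMapEntry {
    inside: u32,
    outside: u32,
    length: u32,
}

/// Parse the text of a uid_map file. Returns None on malformed input.
/// (Hypothetical helper, for illustration only.)
fn parse_uid_map(contents: &str) -> Option<Vec<UidMapEntry>> {
    contents
        .lines()
        .filter(|line| !line.trim().is_empty())
        .map(|line| {
            // Each field is parsed as u32; a non-numeric field yields None.
            let mut fields = line.split_whitespace().map(|f| f.parse::<u32>().ok());
            let entry = UidMapEntry {
                inside: fields.next()??,
                outside: fields.next()??,
                length: fields.next()??,
            };
            Some(entry)
        })
        .collect()
}

/// True when the map is the single full-range identity mapping that the
/// initial user namespace reports, i.e. uid 0 here is uid 0 on the host.
fn is_host_root_mapping(entries: &[UidMapEntry]) -> bool {
    let full = UidMapEntry { inside: 0, outside: 0, length: u32::MAX };
    entries.len() == 1 && entries[0] == full
}
```

Passing the text of `/proc/self/uid_map`, read inside the outer container, to `parse_uid_map` and then to `is_host_root_mapping` would show whether uid 0 there is backed by host root. In a rootless container the first entry typically maps uid 0 to an unprivileged host uid, so the check returns false.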

@drahnr drahnr changed the title from "integration tests do not work in containers" to "integration tests fail to setup overlay within container launch" Nov 17, 2020
@drahnr drahnr self-assigned this Nov 17, 2020
@TriplEight
Contributor

I see you want to run podman within unprivileged podman, which is not really feasible with GitLab right now.
To start with, there's no clean way to implement a podman runner; I can provide you with some details.
Second, if there is a custom runner, it will be just one runner dedicated to a single project, which is maybe a bit of overkill, especially if there are other options.

Among the other options: GitLab is capable of running dind, or the second container as a service.

Both are GitLab/Docker hacks™, so they should work.

@TriplEight
Contributor

For reference:
containers/podman#3917
containers/podman#4131

@gww-parity
Contributor

I see no lowerdir option specified for overlayfs.

Regarding upperdir and workdir, they look like they are on the same fs, so that's good.

As I understood from https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e9be9d5e76e34872f0c37d72e25bc27fe9e2c54c, without lowerdir the whole overlayfs does not make sense?

@drahnr
Contributor Author

drahnr commented Nov 17, 2020

For reference:
containers/podman#3917
obsoleted by the second one :)
containers/podman#4131
and this one says it works for podman inside docker.

docker run --rm -ti \
--security-opt seccomp=unconfined \
--security-opt label=disable \
--cap-add SYS_ADMIN \
--cap-add SYS_RESOURCE \
--env STORAGE_DRIVER=vfs \
quay.io/podman/stable sh -c "podman run hello-world"

What we need is bubblewrap inside podman for step 1, and then to make sure that it works in the above as well (if we want to go that route).
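
If the test harness were to launch the outer container this way, the invocation could be assembled roughly as follows. This is only a sketch under stated assumptions: `podman_in_docker` is a hypothetical helper, not code from the harness, and `-ti` is dropped on the assumption that a test run has no TTY attached. The flags themselves are copied from the command above.

```rust
use std::process::Command;

/// Build (but do not run) the `docker run` invocation quoted above.
/// Hypothetical helper for illustration; not part of the sccache harness.
fn podman_in_docker(storage_driver: &str, inner_cmd: &str) -> Command {
    let mut cmd = Command::new("docker");
    cmd.args(["run", "--rm"])
        .args(["--security-opt", "seccomp=unconfined"])
        .args(["--security-opt", "label=disable"])
        .args(["--cap-add", "SYS_ADMIN"])
        .args(["--cap-add", "SYS_RESOURCE"])
        // Storage driver used by the inner podman; the quoted command uses vfs.
        .arg("--env")
        .arg(format!("STORAGE_DRIVER={}", storage_driver))
        .arg("quay.io/podman/stable")
        .args(["sh", "-c", inner_cmd]);
    cmd
}
```

Calling `podman_in_docker("vfs", "podman run hello-world").status()` would then execute it. Keeping the construction separate from the execution makes it possible to inspect the arguments in a unit test without a docker daemon.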

@drahnr
Contributor Author

drahnr commented Nov 17, 2020

I see no lowerdir option specified for overlayfs.

Regarding upperdir and workdir, they look like they are on the same fs, so that's good.

As I understood from https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e9be9d5e76e34872f0c37d72e25bc27fe9e2c54c, without lowerdir the whole overlayfs does not make sense?

I don't think this is correct.

Overlay::writable(
                        iter::once(overlay.toolchain_dir.as_path()),
                        upper_dir,
                        work_dir,
                        &target_dir,
                    ).mount()

has one lower dir, represented by the iterator, as required by the API description.
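
To make the relation to the kernel's mount options explicit: an overlay mount takes its layers as a `lowerdir=...,upperdir=...,workdir=...` option string, with multiple lower layers separated by `:`. A minimal sketch of that mapping follows, assuming nothing about how the library used above builds the string internally. `overlay_mount_options` is a hypothetical helper, and escaping of `,` or `:` inside paths is deliberately ignored.

```rust
use std::path::Path;

/// Build the option string for an overlay mount from its layers.
/// Hypothetical helper for illustration; not part of sccache or its mount library.
/// Returns None when no lower directory is given, since the kernel requires one.
fn overlay_mount_options<'a, I>(lowerdirs: I, upperdir: &Path, workdir: &Path) -> Option<String>
where
    I: IntoIterator<Item = &'a Path>,
{
    let lower: Vec<String> = lowerdirs
        .into_iter()
        .map(|p| p.display().to_string())
        .collect();
    if lower.is_empty() {
        return None;
    }
    Some(format!(
        "lowerdir={},upperdir={},workdir={}",
        lower.join(":"), // multiple lower layers are colon-separated
        upperdir.display(),
        workdir.display()
    ))
}
```

With the paths from the log, a single toolchain directory passed via `iter::once` ends up as the sole `lowerdir`, which matches the first path printed in the error message.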

@drahnr
Contributor Author

drahnr commented Nov 18, 2020

@drahnr
Contributor Author

drahnr commented Nov 18, 2020

Demo that reproduces the issue, for faster test cycles: https://github.com/drahnr/overlay-fs-gen

@drahnr
Contributor Author

drahnr commented Nov 20, 2020

Even the test binary fails as non-root (even on the host!), so it has to be a privileged container that is run as root or via the docker daemon.

@TriplEight
Contributor

Since this is about CI, it should be linked with #1

I've been thinking about our options; the most obvious is services: in GitLab CI. It provides a network-accessible service via Docker's legacy --link option, in a slow dind mode. It's not possible to run arbitrary code in a "service" interactively, but it is possible to override its ENTRYPOINT and CMD to run something as a server. This is mostly used for serving databases.

@TriplEight
Contributor

TriplEight commented Dec 28, 2020

Other options to run docker in the pipeline include running a shell executor, binding a socket, but they won't be optimal for us.

The remaining choices that I can think of right now:

  • run the service and the client inside of the same image, the major downside of it is we won't test networking here
  • have a kubernetes deployment

@drahnr
Contributor Author

drahnr commented Jan 4, 2021

Other options to run docker in the pipeline include running a shell executor, binding a socket, but they won't be optimal for us.

I think this is how it was done where I worked in the past, now that you put the links up.

* run the service and the client inside of the same image, the major downside of it is we won't test networking here

Not sure how we would test networking right now. That would need some additional machinery.

* have a kubernetes deployment

Yikes, that seems to be a bit of an overkill? My experience with kubernetes is limited though.


Idea: Docker can run arbitrary runtimes, and it might make sense to look into using Firecracker or Kata Containers instead of runc, Docker's default underlying runtime, which would give better host isolation.

@gww-parity
Contributor

Not sure how we would test networking right now. That would need some additional machinery.

I guess you mean, e.g., another process in the container acting as a proxy through which those parts would talk, so that the network traffic can be monitored/analysed via that proxy?

(Eventually instrumenting network calls via LD_PRELOAD :P)

@drahnr
Contributor Author

drahnr commented Jan 5, 2021

I guess you mean, e.g., another process in the container acting as a proxy through which those parts would talk, so that the network traffic can be monitored/analysed via that proxy?

Yes, but we should discuss that in a separate issue, and it's also a topic for a $(distant future) release.

@TriplEight
Contributor

TriplEight commented Jan 5, 2021

kubernetes

Yikes, that seems to be a bit of an overkill?

It might look like overkill (and I'd name docker-compose as a good-enough setup), but Kubernetes is more versatile and flexible, so we will win with it in the long term. Besides, making GitLab work with Kubernetes is easier than maybe everything else, and certainly more stable.

With Kubernetes we will be able to scale the setup and perform everything we will ever need, like testing networking, distributed multi-source caching etc.

I also don't have much experience with setting up k8s; luckily we have people to ask for help.

@TriplEight
Contributor

I just realized that we are blocked here.
In any case, we will need a green CI so that we can produce a containerized sccache and run it in CI with the dist tests.
