Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tests/e2e: Add Toolbox-specific test cases #7588

Merged

Conversation

HarryMichal
Copy link
Member

@HarryMichal HarryMichal commented Sep 10, 2020

In the past, Toolbox[0] has been affected by several of Podman's
bugs/changes of behaviour. This is one of the steps to assure that as
Podman progresses, Podman itself and subsequently Toolbox do not regress.
One of the other steps is including Toolbox's system tests in Podman's
gating systems (which and to what extent is yet to be decided on).

The tests are trying to stress parts of Podman that Toolbox needs for
its functionality: permission to handle some system files, correct
values/permissions/limits in certain parts, management of users and
groups, mounting of paths,.. The list is most likely longer and
therefore more commits will be needed to control every aspect of the
Toolbox/Podman relationship :).

Some test cases in test/e2e/toolbox_test.go rely on some tools being
present in the base image[1]. That is not the case with the common
ALPINE image or the basic Fedora image.

Some tests might be duplicates of already existing tests. I'm more in
favor of having those duplicates. Thanks to that it will be clear what
functionality/behaviour Toolbox requires.

[0] https://github.com/containers/toolbox
[1] https://github.com/containers/toolbox/#image-requirements

I'm opening the PR to get some feedback if these tests seem okay. I'm 100% sure that these are not all the tests cases we'd like to get covered but I think it's better to have something to start with and then pile on top of it :).

/cc @debarshiray @edsantiago

@openshift-ci-robot
Copy link
Collaborator

@HarryMichal: GitHub didn't allow me to request PR reviews from the following users: debarshiray.

Note that only containers members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

In the past, Toolbox[0] has been affected by several of Podman's
bugs/changes of behaviour. This is one of the steps to assure that as
Podman progresses, Podman itself and subsequently Toolbox do not regress.
One of the other steps is including Toolbox's system tests in Podman's
gating systems (which and to what extent is yet to be decided on).

The tests are trying to stress parts of Podman that Toolbox needs for
its functionality: permission to handle some system files, correct
values/permissions/limits in certain parts, management of users and
groups, mounting of paths,.. The list is most likely longer and
therefore more commits will be needed to control every aspect of the
Toolbox/Podman relationship :).

Some tests might be duplicates of already existing tests. I'm more in
favor of having those duplicates. Thanks to that it will be clear what
functionality/behaviour Toolbox requires.

[0] https://github.com/containers/toolbox

I'm opening the PR to get some feedback if these tests seem okay. I'm 100% sure that these are not all the tests cases we'd like to get covered but I think it's better to have something to start with and then pile on top of it :).

/cc @debarshiray @edsantiago

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.


session = podmanTest.Podman([]string{"run", "--dns", "none", ALPINE,
"rm", "/etc/resolv.conf", ";",
"touch", "/etc/resolv.conf"})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this syntax actually work? The usual convention seems to be:

    podmanTest...( "sh", "-c", "this;that;theother")

...but this was just a detour for bringing me to my favorite rant: "run something and only check exit code" is not a test. Some typos can lead to a command passing but not actually doing anything. A much better idea is to check for unusual strings. For instance:

sh -c "rm /etc/resolv.conf && touch -d '1970-01-01 00:02:03' /etc/resolv.conf && stat -c %s:%Y /etc/resolv.conv"

...and confirm that you get an exact match on 0:123

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this syntax actually work? The usual convention seems to be:

Yes, I was not sure entirely how to go about the commands in the entry-point. Your idea seems to be the right one.

...but this was just a detour for bringing me to my favorite rant: "run something and only check exit code" is not a test.

Roger! The next iteration should fix this. Thank you!

@HarryMichal HarryMichal force-pushed the add-toolbox-e2e-tests branch 5 times, most recently from 2803b85 to 5a2ea0f Compare September 22, 2020 17:34
@rhatdan
Copy link
Member

rhatdan commented Sep 22, 2020

@edsantiago
Copy link
Member

@rhatdan I'm waiting for CI churn

@mheon
Copy link
Member

mheon commented Sep 22, 2020

We may want to add fedoraToolbox to the cached images - it's a ~60mb pull for three separate tests.

@rhatdan
Copy link
Member

rhatdan commented Sep 22, 2020

Yes adding this to cached images makes sense.

@HarryMichal
Copy link
Member Author

I still need to finish the test-cases before this can be merged. I keep running into problems :).

Is it on me to cache the image or is that something done by you, guys? I wanted to avoid relying on the fedora-toolbox image because the way it is stored in the registry is stupid and requires a bit of maintenance (the full name of the image is fxx/fedora-toolbox where xx stands for the version number of Fedora instead of being called just fedora-toolbox). But for the maintenance, there are contacts to me or @debarshiray.

@TomSweeneyRedHat
Copy link
Member

Great idea @HarryMichal , thanks for taking this on.
Changes LGTM in general, but I'll lean on @edsantiago 's opinion.
Tests at the moment are not happy.

@HarryMichal HarryMichal force-pushed the add-toolbox-e2e-tests branch 13 times, most recently from 1d42761 to eb5ee25 Compare September 24, 2020 13:31
@HarryMichal
Copy link
Member Author

The tests are failing on just one test case and I'm not sure if it is due to the test case not being correct or it's actually hitting a bug. The --ulimit host option in podman create is supposed to set ulimits in the container to match the host but the hard limit in the container is twice is large than on the host.

@rhatdan, what do you think?

@edsantiago
Copy link
Member

@HarryMichal see setrlimit(2):

       The  soft  limit  is  the value that the kernel enforces for the corre‐
       sponding resource.  The hard limit acts  as  a  ceiling  for  the  soft
       limit:  an  unprivileged process may set only its soft limit to a value
       in the range from 0 up to the hard limit, and (irreversibly) lower  its
       hard   limit.    A  privileged  process  (under  Linux:  one  with  the
       CAP_SYS_RESOURCE capability in the initial  user  namespace)  may  make
       arbitrary changes to either limit value.

@edsantiago
Copy link
Member

@HarryMichal our CI environment changed today; you're going to need to rebase in order for us to merge this.

@HarryMichal HarryMichal force-pushed the add-toolbox-e2e-tests branch from 1508039 to e6b5c9d Compare October 5, 2020 20:12
@HarryMichal
Copy link
Member Author

I must say that the names of the tests are much clearer now :). Good work!

@rhatdan
Copy link
Member

rhatdan commented Oct 5, 2020

/lgtm
/hold

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 5, 2020
@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Oct 5, 2020
@edsantiago
Copy link
Member

Did someone press re-run on a failed test here? If so, I think we have a problem.

When I checked earlier this morning I saw a test failure that looked real. It is now gone, but there has been no push. That means this is a flake, and is going to come back to haunt us later on.

The errors are both in Toolsbox-specific testing, two different tests, each failed three times in a row. The failures are the ginkgo anti-helpful ones, but @HarryMichal I beg you to please look into them before merging this.

Copy link
Member

@edsantiago edsantiago left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to see if this blocks merging. I want to understand the failures because I really do not want them to make their way into master.

@openshift-ci-robot openshift-ci-robot removed the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 6, 2020

session = podmanTest.Podman([]string{"create", "--name", "test", "--userns=keep-id", "--user", "root:root", fedoraToolbox, "sh", "-c",
fmt.Sprintf("%s; %s; %s; sleep 1000", useradd, groupadd, usermod)})
session.WaitWithDefaultTimeout()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@HarryMichal I'm pretty sure the error is here: this is racy. Could you look into using WaitContainerReady()? You would need to echo READY before the sleep, and adjust the podman logs = nil test below, and also have to do this on all other tests; but I think that's the easiest safest way to proceed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WaitContainerReady() does what I was trying to accomplish by calling sleep. I did not know about it. Thank you! I'll adjust the code to make use of it.

@HarryMichal HarryMichal force-pushed the add-toolbox-e2e-tests branch from e6b5c9d to 566b553 Compare October 8, 2020 12:48
@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Oct 8, 2020
@HarryMichal
Copy link
Member Author

I tried to make use of WaitContainerReady() and while going through the tests I also got rid of ok, _ := session.GrepString() + Expect(ok).To(BeTrue()) and replaced those lines with Expect(session.OutputToString()).To(ContainSubstring()).

@HarryMichal
Copy link
Member Author

Ah! I forgot to adjust most of the tests to account for the fact the logs are not empty most of the time now because of the use of WaitContainerReady().

@HarryMichal HarryMichal force-pushed the add-toolbox-e2e-tests branch 2 times, most recently from 20cffe0 to c3c9551 Compare October 8, 2020 15:36
In the past, Toolbox[0] has been affected by several of Podman's
bugs/changes of behaviour. This is one of the steps to assure that as
Podman progresses, Podman itself and subsequently Toolbox do not regress.
One of the other steps is including Toolbox's system tests in Podman's
gating systems (which and to what extent is yet to be decided on).

The tests are trying to stress parts of Podman that Toolbox needs for
its functionality: permission to handle some system files, correct
values/permissions/limits in certain parts, management of users and
groups, mounting of paths,.. The list is most likely longer and
therefore more commits will be needed to control every aspect of the
Toolbox/Podman relationship :).

Some test cases in test/e2e/toolbox_test.go rely on some tools being
present in the base image[1]. That is not the case with the common
ALPINE image or the basic Fedora image.

Some tests might be duplicates of already existing tests. I'm more in
favour of having those duplicates. Thanks to that it will be clear what
functionality/behaviour Toolbox requires.

[0] https://github.com/containers/toolbox
[1] https://github.com/containers/toolbox/#image-requirements

Signed-off-by: Ondřej Míchal <[email protected]>
@HarryMichal HarryMichal force-pushed the add-toolbox-e2e-tests branch from c3c9551 to a1e1a3a Compare October 9, 2020 14:32
@openshift-ci-robot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: edsantiago, HarryMichal

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 12, 2020
@edsantiago
Copy link
Member

Nice work! Let's get this in.
/lgtm
/hold cancel

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 12, 2020
@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Oct 12, 2020
@openshift-merge-robot openshift-merge-robot merged commit ce7478b into containers:master Oct 12, 2020
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 24, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 24, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. ok-to-test
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants