DO NOT MERGE Flake handling: cache and prefetch images #2036
This is hard to review: all the _prefetch insertions were added manually after many iterations of manual code inspection and testing. I would appreciate a careful eye to make sure that I didn't accidentally add a Tested by instrumenting One question that has long troubled me and now bothers me even more: Is it absolutely necessary to use so many source images? E.g.
@@ -483,6 +507,7 @@ load helpers
}

@test "bud-http-context-dir-with-Dockerfile-post" {
  # FIXME FIXME FIXME: this is 100% identical to the -pre test above.
If you click on the expando (more context) above, you'll see that this test is identical to the -pre
one. What was the intention here? Is it OK to collapse them both into one?
I have no idea. @TomSweeneyRedHat @nalind Any ideas?
@@ -1662,12 +1772,14 @@ load helpers

@test "bud using gitrepo and branch" {
  target=gittarget
  # FIXME: this test takes a really long time. Is it necessary to do twice?
Almost 4 minutes, because it uses fedora as the base image and runs a dnf install. Is there some other, simpler test we can do instead? Can we create a new module under github.com/containers just for testing this? Or even just a side branch of containers/BuildSourceImage that just does FROM scratch and RUN echo got here?
How about switching it to:
run_buildah bud --format docker --layers -f tests/bud/shell/Dockerfile -t test git://github.com/containers/buildah#master
Runs in 3.5 seconds.
I don't see a problem with consolidating on a couple of images. I do think it is handy to make sure we still have tests with apt-get and dnf/yum. But getting this to the point that we test fewer distinct images would be fine.
☔ The latest upstream changes (presumably 45543bf) made this pull request unmergeable. Please resolve the merge conflicts.
4e129b1
to
2b16706
Compare
@edsantiago what's the scoop on this one?
2b16706
to
bd1c9f6
Compare
@rhatdan right now I'm just playing with it, wanting to see if/how it runs in CI, if it saves any time, if it reduces flakes. I've changed the title accordingly to
Never mind:
/hold
Show of hands: who here loves submitting a PR, then coming back hours later to find one job failed, then spending time poring over logs and finding a network error? Anyone? Anyone?

This is a lame attempt to minimize such flakes by caching commonly-used images and restoring them on demand. We introduce a new helper, _prefetch(), which podman-pulls an image the first time, podman-saves it, then on subsequent calls (for the same image) podman-loads it:

    @test foo {
        _prefetch alpine busybox
        ...tests that run buildah-from either
    }

This is an imperfect solution: it is incomplete and will grow more so over time as new tests are added. It is difficult to verify its coverage. I'm really unhappy with it but if it works, the Total Sum Of Unhappiness might decrease overall thanks to fewer flakes. If it doesn't work, it's trivial to remove _prefetch calls using a sed script. Shall we give it a chance?

Signed-off-by: Ed Santiago <[email protected]>
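The pull-once, save, then load-on-later-calls behavior described in this commit message can be sketched roughly as follows. This is only an illustration, not the helper that was merged: the cache directory variable PREFETCH_CACHE_DIR, the file naming scheme, and the omission of any storage-root options are all assumptions. The only external calls used are `podman pull`, `podman save -o`, and `podman load -i`.

```shell
# Hypothetical sketch of a caching helper; NOT the code from this PR.
# Assumed cache location (the variable name is made up for illustration).
PREFETCH_CACHE_DIR=${PREFETCH_CACHE_DIR:-${TMPDIR:-/tmp}/prefetch-cache}

_prefetch() {
    local img fname tarball
    mkdir -p "$PREFETCH_CACHE_DIR" || return 1

    for img in "$@"; do
        # Build a safe file name: '/' and ':' in the image name become '_'.
        fname=$(printf '%s' "$img" | tr '/:' '__')
        tarball="$PREFETCH_CACHE_DIR/$fname.tar"

        if [ -e "$tarball" ]; then
            # Cache hit: restore from local disk, no network access.
            podman load -i "$tarball" >/dev/null || return 1
        else
            # Cache miss: pull once over the network, then save for later calls.
            podman pull "$img" >/dev/null || return 1
            podman save -o "$tarball" "$img" >/dev/null || return 1
        fi
    done
}
```

A test would call `_prefetch alpine busybox` before any step that needs those images. The sketch assumes the pulled images land in the same image store that the tests read from; a real helper would probably need extra storage options for that, which are left out here.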
bd1c9f6
to
63e500b
Compare
Also: images json test: rewrite to actually check for keys instead of just number of lines. Reason: when using older podman to prefetch (in f29), 'history' key is lost, giving us 26 lines of output instead of 30. Signed-off-by: Ed Santiago <[email protected]>
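Checking for keys rather than counting output lines might look something like the sketch below. The helper name assert_json_keys is hypothetical, and the plain-text match on `"key":` is a simplification that assumes no whitespace between the key and the colon; the actual test in this PR may do it differently.

```shell
# Hypothetical helper: succeed only if every named key appears in the JSON text.
# Uses a fixed-string match on "key": rather than a real JSON parser.
assert_json_keys() {
    local json key
    json=$1
    shift
    for key in "$@"; do
        if ! printf '%s\n' "$json" | grep -qF "\"$key\":"; then
            echo "missing key: $key" >&2
            return 1
        fi
    done
}
```

Unlike a line count, a check of this kind names the key that disappeared, so a case like the lost 'history' key described above shows up directly in the failure message.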
63e500b
to
f0b7958
Compare
2115: Kill Travis and Enable Bors r=rhatdan a=cevich ***Depends on:*** #1848 #1971 #2036 #2121 Co-authored-by: Ed Santiago <[email protected]> Co-authored-by: Chris Evich <[email protected]>
Oh yikes. This was not supposed to happen: it has not been reviewed.
@nalind @TomSweeneyRedHat @rhatdan I'm sorry to request this, but could you find time to post-review this and let me know if you find glaring problems? It got merged last week due to my misunderstanding about how
I am fine with the change; my only concern is whether we are still exercising buildah pulling images from a registry.
Yes. None of the tests in
OK, let's let it go for now, and we will see if we have any issues going forward.
Fix two issues identified in containers#2036:

- the 'gitrepo and branch' test was pulling from a place that took four minutes; change it to our own repo, suggested by Dan, which takes just a few seconds.
  -- also, remove what I think is an unnecessary dup. If buildah can pull from a branch, it can pull from master.
- the httpd tests were really confusing, with lots of copy/pasted code differing in only small ways. Refactor to make the purpose of each test more apparent, and to make it easier to add new ones as needed.

Signed-off-by: Ed Santiago <[email protected]>
Fix three issues identified in containers#2036:

- the 'gitrepo and branch' test was pulling from a place that took four minutes; change it to our own repo, suggested by Dan, which takes just a few seconds.
  -- also, remove what I think is an unnecessary dup. If buildah can pull from a branch, it can pull from master.
- the httpd tests were really confusing, with lots of copy/pasted code differing in only small ways. Refactor to make the purpose of each test more apparent, and to make it easier to add new ones as needed.
- combine bud-http-context-dir-with-Dockerfile -pre and -post, since they were identical. (Context: they started off being different tests, with command-line options in different order, but as of containers#493 the -post form of options no longer works so the -post test is no longer relevant)

Signed-off-by: Ed Santiago <[email protected]>
2265: bud.bats - cleanup, refactoring r=rhatdan a=edsantiago

Fix two issues identified in #2036:

- the 'gitrepo and branch' test was pulling from a place that took four minutes; change it to our own repo, suggested by Dan, which takes just a few seconds.
  -- also, remove what I think is an unnecessary dup. If buildah can pull from a branch, it can pull from master.
- the httpd tests were really confusing, with lots of copy/pasted code differing in only small ways. Refactor to make the purpose of each test more apparent, and to make it easier to add new ones as needed.

Signed-off-by: Ed Santiago <[email protected]>

/kind cleanup

```release-note
no
```

Co-authored-by: Ed Santiago <[email protected]>