-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use more robust Maven image selection process #93
Use more robust Maven image selection process #93
Conversation
good idea I tried to evaluate the impact: i added https://github.com/jvm-repo-rebuild/reproducible-central/blob/master/bin/update-statistics.sh script that generates https://github.com/jvm-repo-rebuild/reproducible-central/blob/master/doc/buildspec-stats.txt I have an issue with current full rewrite: it takes ages to run (I even just hit my pull rate limit on Dockerhub :) ) and has a huge impact on values I imagine that keeping the existing algorithm would be ok, with adding at the end a check that the image exists, and if it does not exist, using the new algorithm with less options (because we don't need old values for old JDKs) we would have the best of all worlds, isn't it? |
Works for me! I added a commit with a new proposal. Note that I swapped out |
@@ -122,6 +122,14 @@ mvnBuildDocker() { | |||
*) | |||
mvnImage=maven:${mvnVersion}-eclipse-temurin-${jdk}-alpine | |||
esac | |||
if ! docker pull -q ${mvnImage} > /dev/null 2>&1; then | |||
for image in maven:{${mvnVersion},3}-eclipse-temurin-${jdk}-alpine; do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For recent JDKs and Maven versions these two variants seem to suffice.
if docker pull -q ${image} > /dev/null 2>&1; then | ||
mvnImage=${image} | ||
break | ||
fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realized that the previous revision could inspect more images than necessary. Are simple for-loop avoids that.
@@ -122,6 +122,14 @@ mvnBuildDocker() { | |||
*) | |||
mvnImage=maven:${mvnVersion}-eclipse-temurin-${jdk}-alpine | |||
esac | |||
if ! docker pull -q ${mvnImage} > /dev/null 2>&1; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Stephan202 I tested this (with bin/update-statistics.sh) and found that docker pull -q
fails when the image is already available
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting; locally I can't reproduce this:
$ docker --version
Docker version 20.10.20, build 9fdeb9c
$ docker pull -q maven:3.8.5-eclipse-temurin-17-alpine 2>&1 2>/dev/null; echo $?
docker.io/library/maven:3.8.5-eclipse-temurin-17-alpine
0
$ docker pull -q maven:3.8.5-eclipse-temurin-17-alpine 2>&1 2>/dev/null; echo $?
docker.io/library/maven:3.8.5-eclipse-temurin-17-alpine
0
$ docker pull -q maven:this-one-does-not-exist 2>&1 2>/dev/null; echo $?
1
Perhaps it depends on the version of Docker used (which one do you have installed locally?).
I can have a closer look at bin/update-statistics.sh
later; right now I'm commuting; don't want to pull new images while using mobile internet :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can have a closer look at
bin/update-statistics.sh
later.
Didn't happen yesterday, and today is full of meeting and events; I intend to circle back to this PR later this week. Sorry for the delay.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no worries, we all have a day job, I have the same type of delays that you patiently waited for
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just tested (and improved a little bit the script): you're right, docker pull -q
works, I probably was hit by my rate limit or anything stupid and transient
now, with the bin/update-statistics.sh
script, we can easily evaluate the impact of any change in image tag selection and see if it breaks any buildspec tool x jdk selection
I let you review and update the statistics script while changing rebuild.sh
so we can merge with safety
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Finally found a moment to get back to this :)
I saw that bin/update-statistics.sh
uses docker inspect
rather than docker manifest inspect
. So this means there are three options. I wrote the following script to compare them:
#!/usr/bin/env bash
# The image against which to test.
image=maven:3.6.3-openjdk-17-slim
print_header() {
local header="${1}"
echo -e "\n\033[0;34m${header}\033[0m"
}
print_result() {
test $? == 0 && echo -e '\033[0;32mSUCCESS\033[0m' || echo -e '\033[0;31mFAILURE\033[0m'
}
# Drop the image if it exists.
docker image rm -f "${image}"
print_header 'Running `docker inspect` while the image is locally absent.'
time docker inspect "${image}" > /dev/null
print_result
print_header 'Running `docker manifest inspect` while the image is locally absent.'
time docker manifest inspect "${image}" > /dev/null
print_result
print_header 'Running `docker pull -q` while the image is locally absent.'
time docker pull -q "${image}" > /dev/null
print_result
print_header 'Running `docker inspect` while the image is locally present.'
time docker inspect "${image}" > /dev/null
print_result
print_header 'Running `docker manifest inspect` while the image is locally present.'
time docker manifest inspect "${image}" > /dev/null
print_result
print_header 'Running `docker pull -q` while the image is locally present.'
time docker pull -q "${image}" > /dev/null
print_result
Based on the results, it seems to me that:
docker inspect
is by far the fastest, but only says whether the given image exists locally. It seems this command cannot be used inrebuild.sh
, unless there's some kind of "pre-population" of images. (Which might be worth the effort of implementing; see below.)docker manifest inspect
takes ~3 seconds if the image doesn't exist, and ~5 if it does, irrespective of how often the command is executed.docker pull -q
takes ~2 seconds if the image doesn't exist, and ~1.5 if is does, unless the image doesn't exist locally yet, in which case it of course takes longer (but that's a cost that needs to be paid regardless).
Based on this I see two options:
- Stick with
docker pull -q
inrebuild.sh
. - Introduce a
docker pull -q
-based "bootstrap phase" and usedocker inspect
inrebuild.sh
.
(The timings presented here may of course be specific to my setup; interested to hear whether you can reproduce these results.)
Co-authored-by: Stephan Schroevers <[email protected]> Co-authored-by: Hervé Boutemy <[email protected]>
yes, great: I merged and reworked the algorithm into 0b5d8be thanks a lot |
This is a proposal based on @hboutemy's suggestion here.