Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Docker build with Ubuntu mirrors #16098

Merged
merged 2 commits into from
Mar 23, 2023

Conversation

MiguelWeezardo
Copy link
Member

@MiguelWeezardo MiguelWeezardo commented Feb 13, 2023

Description

In the last few weeks, we've seen failures in the job that builds the Docker image where it couldn't connect to ports.ubuntu.com or archive.ubuntu.com. Adding mirrors to apt should make it more resilient against such failures.

These two mirrors were chosen by getting a list of mirrors that include ports using this: https://askubuntu.com/questions/428698/are-there-alternative-repositories-to-ports-ubuntu-com-for-arm
and then checking which ones are the fastest using netselect, executed on an ec2 instance in us-east-2:

$ sudo netselect -v -s10 -t20 $(cat valid.txt )
Running netselect to choose 10 out of 14 addresses.      
...............................................................................................
    0 http://mirrors-ca.muzzy.us/ubuntu/
  173 http://mirrors.ocf.berkeley.edu/ubuntu-ports/
  417 http://mirror.kumi.systems/ubuntu-ports/
  621 http://jp.mirror.coganng.com/ubuntu-ports/
  716 http://mirror.deace.id/ubuntu/
 1134 http://mirrors.tuna.tsinghua.edu.cn/ubuntu-ports/
 1155 http://ubuntu-mirror.cloud.mu/ubuntu-ports/
 1281 http://mirror.misakamikoto.network/ubuntu-ports/
 1460 http://in.mirror.coganng.com/ubuntu-ports/
 1550 http://mirror.nishi.network/ubuntu-ports/

Out of these, the first one is invalid as it doesn't work over http, returning a 403.

Additional context and related issues

#15807 (comment)
https://github.com/trinodb/trino/actions/runs/4004028582/jobs/6872683430

This and #15840 both attempt to solve the same problem, but this solution is preferred as it doesn't rely on cache entries existing. #15840 can still be used to speed up Docker builds occasionally, but it isn't reliable enough to solve this problem.

Release notes

(x) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text:

# Section
* Fix some things. ({issue}`issuenumber`)

@cla-bot cla-bot bot added the cla-signed label Feb 13, 2023
@MiguelWeezardo
Copy link
Member Author

Once I check all architectures build successfully, I will drop all commits except the "Add more mirrors" one

@MiguelWeezardo MiguelWeezardo added the dx Issues or pull requests related to developer experience label Feb 23, 2023
@MiguelWeezardo
Copy link
Member Author

@hashhar Do you think this approach is worth considering?

@MiguelWeezardo MiguelWeezardo requested review from losipiuk and martint and removed request for findepi and electrum March 8, 2023 15:20
@MiguelWeezardo
Copy link
Member Author

MiguelWeezardo commented Mar 8, 2023

@losipiuk Any chance of getting this merged?

@nineinchnick
Copy link
Member

Test Docker image step ran in 15 minutes, and it's usually around 9 minutes. Can you rebase, so we can see if that was an outlier?

docker build \
"${WORK_DIR}" \
--pull \
--platform "linux/$arch" \
-f Dockerfile \
-t "${TAG_PREFIX}-$arch" \
--build-arg "TRINO_VERSION=${TRINO_VERSION}"
rm -fr "${WORK_DIR}/default/apt/sources.list.d"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not keep not put those mirror files in default/etc and have all of those for build of images for each architecture. Would it break sth? I think the non-matching mirros woudl not be used. Then you do not need to modify this script at all.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was my initial approach, but if I remember correctly having all architectures defined at once made the apt update step longer since it tried to download package lists from all mirrors.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One performance improvement that might be worth checking out is having all mirrors defined while also using docker buildx to build multiple architectures in parallel. That way we'd have a single shared build context, and the parallel build gains might offset some of the cost of fetching all mirror package lists.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One performance improvement that might be worth checking out is having all mirrors defined while also using docker buildx to build multiple architectures in parallel. That way we'd have a single shared build context, and the parallel build gains might offset some of the cost of fetching all mirror package lists.

I have a WIP branch with this approach in branch michal/docker_ubuntu_mirrors_dockerx

@MiguelWeezardo
Copy link
Member Author

Test Docker image step ran in 15 minutes, and it's usually around 9 minutes. Can you rebase, so we can see if that was an outlier?

It's down to 11m23s in the second execution, still a little more than the usual.

Enabled: yes
Types: deb
URIs: https://mirrors.ocf.berkeley.edu/ubuntu/ https://mirror.kumi.systems/ubuntu/
Suites: jammy jammy-updates jammy-backports jammy-security
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will it always be jammy. Aren't they bumping debian version when there is new one in eclipse-temurin:17-jdk?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think once eclipse-temurin:17-jdk updates their base image we should update the mirror descriptors manually.
Other than pinning ourselves to a specific hash of eclipse-temurin we have no control over the distro version they use.
They might even switch from Ubuntu to something which doesn't use APT.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point is we will not notice that moment right?
Or will we get some error when they update?

Copy link
Member Author

@MiguelWeezardo MiguelWeezardo Mar 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error might not directly mention a mismatch of Ubuntu versions, but I expect it might look like a huge update with APT trying to pull every installed package twice. I think it will either timeout the job or take long enough for us to notice.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also wonder if using a tag instead of a hash for the base image is a good idea. I think it's better to use immutable hashes for build repeatability.

Copy link
Member

@hashhar hashhar Mar 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can probably build these mirrorlists dynamically during image build by something like:

source /etc/os-release
# the above exports a UBUNTU_CODENAME variable
sed -i 's/UBUNTU_CODENAME/${UBUNTU_CODENAME}/g' /etc/apt/sources.list.d/*

And in the mirrorlist we commit to repo we replace occurences of actual release name with UBUNTU_CODENAME.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also wonder if using a tag instead of a hash for the base image is a good idea

I think the point of not using specific version it to always get the most recent security patches in place

We can probably build these mirrorlists dynamically during image build by something like

This looks nice

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also wonder if using a tag instead of a hash for the base image is a good idea

I think the point of not using specific version it to always get the most recent security patches in place

I understand that was the intention, but it seems risky because a build of the same trino git hash might work today and fail tomorrow just because the tag was updated in between. I think a more repeatable approach would be to pin ourselves to a specific docker image hash and use dependabot to keep it up to date with the tag. In this case every docker image change would correspond with a commit in trino that we can roll back if it breaks something.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think repeatablitity of build is a very important thing we want to address here. Having subsequent builds use safe docker as a base, without manual intervention to bump base image version is more important imo.

Copy link
Member

@losipiuk losipiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Please squash the fixup

@MiguelWeezardo MiguelWeezardo force-pushed the michal/docker_ubuntu_mirrors branch 2 times, most recently from 2cf3887 to 8e5c991 Compare March 22, 2023 14:21
Future-proof against Ubuntu codename update

This approach works as long the following assumptions hold:
- the location of `/etc/os-release` file does NOT change
- the name of the UBUNTU_CODENAME environment variable does NOT change
- `eclipse-temurin` still uses Ubuntu as it's base
@MiguelWeezardo MiguelWeezardo force-pushed the michal/docker_ubuntu_mirrors branch from 8e5c991 to a8a7912 Compare March 22, 2023 14:22
@losipiuk losipiuk merged commit 1d400c3 into trinodb:master Mar 23, 2023
@github-actions github-actions bot added this to the 411 milestone Mar 23, 2023
@MiguelWeezardo MiguelWeezardo deleted the michal/docker_ubuntu_mirrors branch March 24, 2023 12:48
MiguelWeezardo added a commit to MiguelWeezardo/trino that referenced this pull request Apr 3, 2023
The recent merge of trinodb#16098
increased build robustness in the event of main Ubuntu repo failures,
at the cost of 2-3 minutes for downloading Ubuntu mirror indexes.

It turns out the 45 minute timeout was very close to the actual time
needed to perform the entire job, and extending it is
necessary to avoid cancelling the job.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla-signed dx Issues or pull requests related to developer experience
Development

Successfully merging this pull request may close these issues.

4 participants