Caching docker images for CI runs at GitHub

Issue description

If a CI configuration (i.e. a "GitHub workflow") requires a docker image to run, it downloads such images for each CI run. These repeated downloads of docker images, which are often hundreds of megabytes to gigabytes large, significantly slow down each CI run and consume vast amounts of network bandwidth.

Specific issue

Specifically, using the Sailfish-SDK images provided by Coderus for a CI run results in downloading a docker image between 1 GB and 3,5 GB in size (depending on the SDK / SailfishOS version to build for) up to three times (once for each of the supported architectures: aarch64, armv7hl and i486) from an external "docker registry" (here: Docker Hub). This affects the simple variant of using these images (by directly using the coderus/github-sfos-build "action") and the more sophisticated one alike.

As these images are downloaded by all users of Coderus' SailfishOS Platform SDK Docker images hosted at Docker Hub (the Docker "registry"), and Docker imposes progressively stricter "rate limiting" (i.e. limits on download volume and / or frequency, beyond which access is severely throttled unless someone pays for it), this may prevent the use of these images for CI runs in the future.

Issue analysis

Initial assessment

Caching "locally" means caching with the means GitHub provides, e.g. GitHub "actions". Ultimately all these solutions use GitHub's action/cache, which provides (as of 2023) 10 GB of cache and evicts cached items on an LRU basis or when an item has not been accessed for a week. But as some research shows, there are many variants of and indirections for utilising GitHub's action/cache.
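
As a rough plausibility check of the 10 GB limit against the image sizes mentioned earlier, the following sketch sums item sizes and compares them with the quota. The function name fits_in_cache and the variable CACHE_QUOTA_MB are made up for illustration, and the sizes are assumed to be the uncompressed image sizes; the cache may store items compressed, so this is an upper bound rather than an exact figure.

```shell
# Cache quota in MB (the 10 GB figure quoted above, as of 2023).
CACHE_QUOTA_MB=$((10 * 1024))

# Sum the given sizes (in MB), print the total and
# succeed only if the total fits into the quota.
fits_in_cache() {
  local total=0 size
  for size in "$@"; do
    total=$((total + size))
  done
  echo "$total"
  [ "$total" -le "$CACHE_QUOTA_MB" ]
}
```

For three images of 3,5 GB (3584 MB) each, fits_in_cache 3584 3584 3584 prints 10752 and returns a non-zero status, because 10752 MB exceed the 10240 MB quota.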

Alternative solutions

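Other "solutions", such as an external caching proxy server, are inherently not very effective, because the images still have to be transferred to GitHub's runners over the network for each CI run.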

Reducing the size of docker images is always a valid approach and has some potential (many docker images carry large amounts of unnecessary cruft), but it is time consuming and ultimately futile, because the way such images are created and distributed invites a "quick & dirty" approach (i.e. unoptimised images are much quicker and easier to create and distribute than optimised ones).

The only real alternative solution is to host container images "locally" at GitHub, i.e. at GitHub's container registry. For an introduction, see GitHub's documentation for creating, managing and distributing "GitHub packages".

Basic properties of GitHub's action/cache

  • The action/cache seems to be implicitly run in the context of the user runner. While a sudo su executed as part of a run: statement is effective for subsequent shell commands (tested with the Ubuntu-Linux runner environment provided by GitHub in 2023), I have not found a way to let an "action" run in a different user context.

  • The action/cache only accepts download targets (i.e. local paths) to be configured as items to cache, not download sources.

  • These first two properties of GitHub's action/cache prevent simply caching the images downloaded by the local docker instance, which are usually (in 2023) stored in /var/lib/docker/<Docker Storage Driver, e.g. overlay2>/ on Linux (utilising overlayfs, as recommended by Docker Inc. and preinstalled on Ubuntu 2x.yz), because /var/lib/docker and all its sub-directories are owned by the user and group root and provide no access for others. Adding the user runner to the group root does not help, because this only provides search permission in directories (i.e. the x bit is set for directories), but still no access to the files in /var/lib/docker/[<storage-driver>](https://docs.docker.com/storage/storagedriver/overlayfs-driver/)/.

  • The action/cache only caches items used in a successful CI job run. Sometimes it makes sense to always cache items which are known to be independent of the outcome of a CI run, e.g. classic prerequisites for it; this is exactly what the Sailfish-SDK images constitute for building software for SailfishOS at GitHub.

    Others have also noticed that long ago and trivially patched the original action/cache (e.g. [1], [2]), but very often this ultimately results in stale forks. Hence applying this trivial change by "live patching" is the only maintainable solution, which resulted in an improved version of the "live patching" approach.

    Unfortunately GitHub has not provided a way to adjust this behaviour by a CI configuration, although issue #92 (and subsequent issues #165, #334 etc.) was filed for GitHub's action/cache long ago.
    Edit: Mostly solved by the initial release of actions/cache/save and actions/cache/restore in December 2022, although this extension of the original action/cache still provides a larger feature set and is structurally analogous to GitHub's new actions/cache/save and actions/cache/restore. Using these two is now the recommended way of storing items in a cache regardless of whether the whole job succeeds or fails. "Live patching" GitHub's original action/cache to also cache when the job fails still has some appeal, because action/cache is simpler to use than the new actions/cache/save and actions/cache/restore; all three are maintained by GitHub and will continue to be. As their basic properties are the same (except for this point), the remainder of this document can stay unchanged.

    Plan: Enhance and release a "live patching" action, which downloads (actually: checks out), patches and transparently maps to the locally patched version of the original action/cache, and ultimately also publish it at the GitHub Marketplace.

Exploring the solution space

Pre-download the container images

The most trivial way to cope with action/cache's access limitations is to pre-download images explicitly. For this, one creates a download directory by issuing mkdir -p $GITHUB_WORKSPACE/<image-name>, downloads the image with some third party tool (the docker CLI commands do not allow for setting the download location), then executes a docker image load (or docker image import) and ultimately continues as before (e.g. instantiating and starting a docker container by docker run). The -p is only used to prevent an error when the directory already exists. $GITHUB_WORKSPACE resolves to /home/runner/work/<repository-name>/<repository-name> on Linux (yes, twice <repository-name>); GitHub calls this location "runner workspace", and it is naturally also the initial PWD.
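
The steps above can be sketched as two small shell functions. This is only an outline under assumptions: the function names image_dir and load_image are hypothetical, the download step itself is left to whichever third party tool is chosen, and the fallback to $PWD outside of a GitHub runner is added purely for convenience.

```shell
# Create the download directory for an image (if missing) and print its path.
# Falls back to the current directory when GITHUB_WORKSPACE is unset.
image_dir() {
  local dir="${GITHUB_WORKSPACE:-$PWD}/$1"
  mkdir -p "$dir"
  printf '%s\n' "$dir"
}

# Load a previously downloaded image tarball into the local docker instance.
load_image() {
  local tarball="$1"
  if [ ! -f "$tarball" ]; then
    echo "missing tarball: $tarball" >&2
    return 1
  fi
  docker image load --input "$tarball"
}

# Intended order of calls (image name and file name are placeholders):
#   dir=$(image_dir my-image)
#   ... download the image to "$dir/image.tar" with a third party tool ...
#   load_image "$dir/image.tar"
```

The directory printed by image_dir is the path one would configure as the item to cache.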

Unfortunately this approach does not work for large images (e.g. > 1 GB) due to the space constraints GitHub imposes on the runner's home directory. I have not pursued the idea of alleviating this by raising the quota, because that requires analysis (is it imposed by a classic quota, and can it be raised by using sudo?) and might be seen by GitHub as circumventing their constraints.
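
To notice such a space shortage before a download or a docker save fails halfway, one could check the free space of the target directory first. The following is a minimal sketch; the function names free_kib and has_free_space are hypothetical, and it assumes the POSIX output format of df, where the fourth column of the second line holds the available space.

```shell
# Print the free space (in KiB) of the file system holding directory $1.
free_kib() {
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed only if at least $2 KiB are free in directory $1.
has_free_space() {
  local avail
  avail=$(free_kib "$1")
  [ -n "$avail" ] && [ "$avail" -ge "$2" ]
}
```

A call such as has_free_space "$HOME" $((4 * 1024 * 1024)) would then test for 4 GiB of free space; this threshold is an arbitrary example, not a measured requirement.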

Mind that the git repository is also checked out to the "runner workspace" ($GITHUB_WORKSPACE) as root directory, so do pay attention to not clobber any files or directories of your source repository.

Suitable tools for downloading docker images to arbitrary locations in the local file-system:

  • Its source code is hosted at GitHub and carries no license.
  • Apparently unmaintained.
  • Does not provide releases or git tags.
  • Is a simple and small Python script (187 sloc, 7,3 KBytes), called docker_pull.py.
  • Its source code is hosted at GitHub and carries no license.
  • Created in 2022.
  • Does provide two releases (as of 2023-01-07) and git tags.
  • Written in Go, pre-compiled versions are 11,6 MBytes large.
  • Inspired by / an implementation in Go of docker-drag, the tool discussed one bullet point above.
  • http only?

Use an "action", which utilises action/cache

Suitable "actions" to cache downloaded docker images:

  • Its source code is hosted at GitHub and uses the MIT license.
  • Does provide two releases (as of 2023-01-07) and two corresponding git tags.
  • Written in Go.
  • Not much used.
  • Initially appeared to be an easy and elegant solution, but …
  • http only?
  • Its source code is hosted at GitHub and uses the MIT license.
  • Does provide stable releases and git tags (lots!).
  • Written in bash, heavily uses bash specific features.
  • Small, the two bash scripts summarised are < 600 sloc, < 15 KBytes.
  • Aimed at a different purpose: To cache docker images which are needed for building one's own image.
  • Initially appeared to be (ab)usable for solely caching the download of docker images, but a little analysis shows that one would have to dissect the main bash script and adapt it for this purpose: Currently a docker build call is unavoidable.
  • Its source code is hosted at GitHub and uses the Unlicense license.
  • Does provide two releases (as of 2023-01-07) and git tags.
  • Written in JavaScript.
  • Small, summarised < 700 sloc, < 25 KBytes.
  • Appears to be unmaintained.
  • Nobody seems to use it.
  • Appears to be easier to (ab)use for only caching the downloaded docker images than Build docker images using cache (discussed one bullet point above).
  • Its source code is hosted at GitHub and uses the MIT license.
  • Does provide a single git tag.
  • Written in TypeScript (Microsoft's superset of JavaScript).
  • Smallish, ca. 275 KiB comprising compiled JavaScript (three files), a bash script and an action.yaml.
  • Appears to be unmaintained.
  • Appears to be a generic caching solution for pulling external dependencies.
  • Claims to be adaptable, includes cache configurations for pip, npm and yarn.
  • Despite extensive documentation, I fail to quickly comprehend:
    • How to configure a different source (Docker Hub).
    • Whether it is also limited to downloads in the runner's "workspace".
  • Pulled (?) from the "GitHub marketplace" on 2023-01-08, see github.com/marketplace/actions/cached-dependencies. On 2023-01-07 it was still there, and it is still found via the search!?!
  • Its source code is hosted at GitHub and uses the MIT license.
  • Does provide stable releases and git tags (lots!).
  • Comprises a few TypeScript scripts (Microsoft's superset of JavaScript), which are compiled into two JavaScript scripts (main/index.js and post/index.js) each 1,17 MiB large (!), plus a tiny action.yaml file which calls these.
  • Appears to be well maintained.
  • Appears to be a generic caching solution for Docker images.
  • Explicitly mentions the use case "pull images from Docker Hub"!
  • Works fine technically, but uses docker save --output ~/.docker-images.tar, which results in write /home/runner/.docker_temp_XYZ: no space left on device even with the smallest SailfishOS Platform SDK images by Coderus (ca. 1 GB, but these pull in a few additional layers).

Down-selection of possible solutions to try

  1. Use Podman instead; it is preinstalled on GitHub's Ubuntu 22.04 runner image, too.
    When started by a non-root user, it uses $HOME/.local/share/containers/storage/ to store images, layers and their metadata, specifically the subdirectory <Storage Driver>-layers for the downloaded layers. This configuration can easily be adapted. But not all files are necessarily readable by the user, despite the user being their owner, because they have no permissions set (e.g. an /etc/shadow in a container image). Consequently GitHub's action/cache and actions/cache/save fail.
  2. Rootless Docker: https://github.com/ScribeMD/rootless-docker
    Very likely it exhibits the same issue as rootless Podman, which is described in the prior point.
  3. Docker Cache: https://github.com/ScribeMD/docker-cache
    Easily runs out of space on a GitHub runner, see details in its section.
  4. download-frozen-image-v2.sh: https://github.com/moby/moby/tree/master/contrib#readme
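
The permission issue described for rootless Podman in point 1 can be made visible by listing all entries below the storage directory whose owner lacks read permission. This is a minimal sketch: the function name list_owner_unreadable is hypothetical, and it only inspects the mode bits, so it reports the same entries regardless of which user runs it.

```shell
# List all entries below directory $1 whose owner has no read permission
# (i.e. the 0400 mode bit is not set), such as an /etc/shadow with mode 000.
list_owner_unreadable() {
  find "$1" ! -perm -400 -print
}
```

Called as list_owner_unreadable "$HOME/.local/share/containers/storage", an empty output would suggest that this particular obstacle to caching the directory does not apply to the images pulled so far.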