
docker layer caching support #26

Closed
deevus opened this issue Jan 25, 2018 · 47 comments

@deevus

deevus commented Jan 25, 2018

How can I enable caching of docker layers between builds? Caching is one of the biggest benefits of multi-stage builds but in CodeBuild it runs every step every time.

@josephvusich
Contributor

josephvusich commented Jan 27, 2018

CodeBuild does not currently have native support for Docker layer caching, though we are aware of the use case for it.

In the meantime, have you tried using docker save and docker load with CodeBuild's file cache functionality? You may be able to bring down your build time for complex layers in this way.
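A very rough sketch of that approach in a buildspec (the image name my-app and the cache directory /root/.docker-cache are assumptions; adjust to your project):

version: 0.2
phases:
  pre_build:
    commands:
      # Restore the image from the S3-backed file cache, if a previous build saved one
      - docker load -i /root/.docker-cache/my-app.tar || true
  build:
    commands:
      - docker build -t my-app .
  post_build:
    commands:
      # Save the freshly built image so the next build can load it
      - mkdir -p /root/.docker-cache
      - docker save -o /root/.docker-cache/my-app.tar my-app
cache:
  paths:
    - /root/.docker-cache/**/*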

@deevus
Author

deevus commented Jan 27, 2018

I'll try that and see how it goes. Thanks

@ewolfe

ewolfe commented Feb 1, 2018

@deevus I'm curious if you had any luck with docker save and docker load -- I'm also in need of caching docker layers between builds.

@deevus
Author

deevus commented Feb 1, 2018

@ewolfe I tried it and it doesn't seem very effective. The time it takes to load/save negates any benefits, at least in my case.

Here are the scripts I have written. If you try them out, perhaps you can find a way to make them work in your favour. They save all the generated images after the build, since the docker host is (I assume) empty when the build runs. Apologies for the lack of comments.

cache-load.sh

#!/bin/bash
set -e

echo 'Loading docker cache...'
mkdir -p "$IMAGE_CACHE_PATH"

# List every cached image archive (quote the glob so find expands it, not the shell)
DOCKER_IMAGES_CACHE=$(mktemp)
find "$IMAGE_CACHE_PATH" -name '*.tar.gz' > "$DOCKER_IMAGES_CACHE"

# Load each archive; delete any archive that fails to load so it gets re-created on save
while read -r file; do
    echo "$file"
    if ! docker load -i "$file"; then
        echo "Error loading docker image $file. Removing..."
        rm "$file"
    fi
done < "$DOCKER_IMAGES_CACHE"

rm "$DOCKER_IMAGES_CACHE"

cache-save.sh

#!/bin/bash
set -e

mkdir -p "$IMAGE_CACHE_PATH"

# IDs of every image currently present on the docker host
DOCKER_IMAGES_NEW=$(mktemp)
docker images -q --no-trunc | awk -F':' '{print $2}' | sort > "$DOCKER_IMAGES_NEW"

# IDs of every image already present in the cache directory
DOCKER_IMAGES_CACHE=$(mktemp)
find "$IMAGE_CACHE_PATH" -name '*.tar.gz' -printf '%f\n' | awk -F. '{print $1}' | sort > "$DOCKER_IMAGES_CACHE"

# Cached images that no longer exist locally, and local images not yet cached
DOCKER_IMAGES_DELETE=$(mktemp)
DOCKER_IMAGES_SAVE=$(mktemp)
comm -13 "$DOCKER_IMAGES_NEW" "$DOCKER_IMAGES_CACHE" > "$DOCKER_IMAGES_DELETE"
comm -23 "$DOCKER_IMAGES_NEW" "$DOCKER_IMAGES_CACHE" > "$DOCKER_IMAGES_SAVE"

if [ $(< "$DOCKER_IMAGES_DELETE" wc -l) -gt 0 ]; then
    echo Deleting docker images that are no longer current
    < "$DOCKER_IMAGES_DELETE" xargs -I % sh -c "echo Deleting extraneous image % && rm $IMAGE_CACHE_PATH/%.tar.gz"
    echo
fi

if [ $(< "$DOCKER_IMAGES_SAVE" wc -l) -gt 0 ]; then
    echo Saving missing images to docker cache
    < "$DOCKER_IMAGES_SAVE" xargs -I % sh -c "echo Saving image % && docker save % | gzip -c > '$IMAGE_CACHE_PATH/%.tar.gz'"
    echo
fi

rm "$DOCKER_IMAGES_NEW" "$DOCKER_IMAGES_CACHE" "$DOCKER_IMAGES_DELETE" "$DOCKER_IMAGES_SAVE"

I don't know if I'm missing something here but a couple of the intermediate containers still build from scratch anyway, which is what I was originally trying to avoid.

EDIT: You need to set IMAGE_CACHE_PATH in your buildspec to a path inside the directory you're caching to S3. Mine is set to /root/.docker-cache/$IMAGE_REPO_NAME
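For example, the wiring in the buildspec could look roughly like this (a sketch; the script names and the docker build command are assumptions, and $IMAGE_REPO_NAME is assumed to be set on the CodeBuild project):

version: 0.2
phases:
  pre_build:
    commands:
      - IMAGE_CACHE_PATH=/root/.docker-cache/$IMAGE_REPO_NAME ./cache-load.sh
  build:
    commands:
      - docker build -t $IMAGE_REPO_NAME .
  post_build:
    commands:
      - IMAGE_CACHE_PATH=/root/.docker-cache/$IMAGE_REPO_NAME ./cache-save.sh
cache:
  paths:
    - /root/.docker-cache/**/*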

@mastef

mastef commented Feb 1, 2018

Do you run these on PRE_BUILD and POST_BUILD respectively?

@deevus
Author

deevus commented Feb 1, 2018

Yes that's correct

@AdrieanKhisbe

I also need to cache the layers between builds, but my attempts have so far been unsuccessful (I tried to cache /var/lib/docker/overlay, a huge fail).


I have a question for you, @jvusich. When you say:

CodeBuild does not currently have native support for Docker layer caching, though we are aware of the use case for it.

Do you mean that this is somewhere on the codebuild roadmap? :)

@awsnitin
Contributor

Do you mean that this is somewhere on the codebuild roadmap? :)

As @jvusich mentioned, we are aware that this is a use case we do not support natively (without the custom workarounds mentioned in this issue). We've also heard about this use case from other customers as well. Our roadmaps are decided primarily based on customer requests and use cases, so it's effectively on our radar, but we cannot comment on when it will be addressed.

@jabalsad

jabalsad commented Feb 28, 2018

Thanks @deevus! Your handy shell script made this easier. I had to make a small change to properly cache all the layers in the build: docker save needs to be given the list of images that were used to build the final image, which you can get by running docker history.

I put my version of your script in these gists here:
cache-save.sh https://gist.github.com/jabalsad/fc72503243afa76e0fbbd1349a0e4023
cache-load.sh https://gist.github.com/jabalsad/52914db52eaa01002125da9c7f85bdc8
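The core of that change is roughly the following (a sketch, not the exact gist contents; the image tag my-app is an assumption, and this relies on the classic builder keeping intermediate images locally):

# Save the image together with the intermediate image IDs listed in its history,
# so that docker load restores the whole layer chain rather than just the final image.
docker save my-app $(docker history -q my-app | grep -v '<missing>') \
    | gzip -c > "$IMAGE_CACHE_PATH/my-app.tar.gz"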

I also put these lines in my buildspec.yml:

env:
  variables:
    IMAGE_CACHE_ROOT: /root/.docker-cache 
cache:
  paths:
    - /root/.docker-cache/*

@deevus
Author

deevus commented Feb 28, 2018

@jabalsad How well does it work with that change? If my original script was missing a bunch of layers, I would expect a decent improvement with your changes.

@jabalsad

jabalsad commented Mar 1, 2018

It speeds up the actual build significantly; however, the docker save and docker load commands now take a while to run, negating any speed improvements (possibly even slowing down the build).

The real reason I'm looking for the caching functionality is actually so that noop changes don't create a new image in ECR unnecessarily.

@monken

monken commented Mar 3, 2018

this worked for me:

version: 0.2
phases:
  pre_build:
    commands:
      - docker version
      - $(aws ecr get-login --no-include-email)
      - docker pull $CONTAINER_REPOSITORY_URL:$REF_NAME || true
  build:
    commands:
      - docker build --cache-from $CONTAINER_REPOSITORY_URL:$REF_NAME --tag $CONTAINER_REPOSITORY_URL:$REF_NAME --tag $CONTAINER_REPOSITORY_URL:ref-$CODEBUILD_RESOLVED_SOURCE_VERSION .
  post_build:
    commands:
      - docker push $CONTAINER_REPOSITORY_URL

@oba11

oba11 commented Mar 7, 2018

@monken This worked perfectly for me; my build time went from 23 mins 1 sec down to 1 min 8 secs.
You are a real lifesaver.

@adriansecretsource

@monken After a couple of hours of trying, I found your solution just perfect; I managed to decrease build time by 60%!

@healarconr

I think that the method mentioned by @monken (pull and cache-from) does not work with multi-stage builds because the pulled image does not have all the stages, but only the last one.
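One workaround (sketched below with a hypothetical $REPO and a stage named build; later comments in this thread flesh this out) is to tag and push the intermediate stage as well, so it can also be fed to --cache-from:

# Pull whatever cache images exist, build and push the intermediate stage under
# its own tag, then use both the stage image and the final image as cache sources.
docker pull $REPO:build-stage || true
docker pull $REPO:latest || true
docker build --target build -t $REPO:build-stage --cache-from $REPO:build-stage .
docker build -t $REPO:latest --cache-from $REPO:build-stage --cache-from $REPO:latest .
docker push $REPO:build-stage
docker push $REPO:latest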

@tvb

tvb commented Mar 26, 2018

@monken I can't get this to work. It keeps invalidating the cache, even at the base image 😢

@kiernan

kiernan commented Mar 26, 2018

@tvb double-check that your docker pull command is working; I noticed mine was failing due to not having the required IAM roles, yet the build continued because the pull command is in the pre_build stage.

@dileep-p

@monken Worked Perfectly.. 👍

@Rathgore

Rathgore commented May 6, 2018

For anyone looking for a simple solution that works with multi-stage builds, I made a pretty simple build script that was able to meet my requirements. I was looking for a solution that would:

  • Work with multi-stage builds
  • Require no changes to the Dockerfile
  • Fully cache all layers (intermediate and final)
  • Not require a separate builder image repo or Dockerfile
  • Not require saving/loading archived image files
  • Ensure cached intermediate layers automatically stay up to date
  • Not add significant overhead to the build time
  • Not require any dependencies outside the build script itself
  • Be easy to understand and maintain

The basic process is this:

  1. Attempt to pull a builder image tagged 'builder' from ECR.
  2. If the image does not exist, create it using Docker's --target option to only build the intermediate build stage from the multi-stage Dockerfile.
  3. If the image exists, pull it and rebuild it to bring in any changes. This rebuild step uses Docker's --cache-from option to cache from itself. This will pick up any changes to the build stage but will nicely result in a no-op if nothing has changed.
  4. Pull the latest tag of our target image from ECR, if it exists. This will cache the final build stage of our image. We now have Docker's cache primed with the build and final stages of our Dockerfile.
  5. Build our target Docker image using --cache-from twice to use the cache from both the build and final stages of the multi-stage build. If no code has changed, the entire build is a no-op from Docker's perspective.
  6. Push the target image to ECR.
  7. Push the newly created/updated builder image to ECR for use in subsequent builds.

Here's the basic script:

#!/usr/bin/env bash

readonly repo=${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com/${REPO_NAME}

# Attempt to pull existing builder image
if docker pull ${repo}:builder; then
    # Update builder image
    docker build -t ${repo}:builder --target build --cache-from ${repo}:builder .
else
    # Create new builder image
    docker build -t ${repo}:builder --target build .
fi

# Attempt to pull latest target image
docker pull ${repo}:latest || true

# Build and push target image
docker build -t ${repo}:latest --cache-from ${repo}:builder --cache-from ${repo}:latest .
docker push ${repo}:latest

# Push builder image
docker push ${repo}:builder

The conditional logic is mainly there for clarity. The entire caching pattern can be simplified as:

docker pull ${repo}:builder || true
docker build -t ${repo}:builder --target build --cache-from ${repo}:builder .

docker pull ${repo}:latest || true
docker build -t ${repo}:latest --cache-from ${repo}:builder --cache-from ${repo}:latest .

docker push ${repo}:latest
docker push ${repo}:builder

This solution has been working well for me, and dramatically reduced our build times. It works with multiple concurrent builds and if any of the --cache-from options point to images that don't exist, Docker will just continue the build and won't use the cache for that run. The overhead of the caching system is very low and is pretty simple to understand. Thanks to @monken and others for inspiration.

@judahb

judahb commented Jul 19, 2018

@monken just curious, is $REF_NAME pulling only the specific version/tag of that container in ECR, or are you pulling all the intermediate containers? If pulling all intermediates, can you describe how that works? That sounds good, but I'm not sure it will work for my use case.

@monken

monken commented Jul 25, 2018

@judahb it's only pulling the last container image (including all layers). It's more likely that you will have matching layers with the latest image than any image that's older. So there is probably not a huge gain in pulling all previous images.

@ngalaiko

I also pull HEAD~1 of the current branch to have something cached for the first build of the HEAD commit.

docker pull ${repo}:$(git rev-parse HEAD) || docker pull ${repo}:$(git rev-parse HEAD~1) || true

@SpainTrain

2101012

@dylanribb

dylanribb commented Sep 12, 2018

For those using docker-compose, here's how I've solved the problem by adapting @monken's answer:

docker-compose.yml:

version: "3.4"
services:
  app:
    image: my_app:latest
    build:
      context: .
      cache_from:
        - "${APP_REPOSITORY_URI:-my_app}:${DEPLOY_STAGE:-latest}" # Defaults are set so that if we build locally it will use the local cache
    environment:
      - APP_REPOSITORY_URI
      - DEPLOY_STAGE
# ... more docker-compose setup

buildspec.yml:

version: 0.2
env:
  variables:
    APP_IMAGE_REPO_NAME: "my_app"
phases:
  pre_build:
    commands:
      - $(aws ecr get-login --no-include-email --region $AWS_DEFAULT_REGION)
      - APP_REPOSITORY_URI=`aws ssm get-parameters --names "/$DEPLOY_STAGE/ECR/app-uri" --with-decryption --region $AWS_DEFAULT_REGION --output text --query Parameters[0].Value`
      - docker pull $APP_REPOSITORY_URI:$DEPLOY_STAGE || true
      - docker pull $WEB_REPOSITORY_URI:$DEPLOY_STAGE || true
  build:
    commands:
      - echo Build started on `date`
      - docker-compose build
      - echo Build completed on `date`
  post_build:
    commands:
      - docker tag $APP_IMAGE_REPO_NAME:latest $APP_REPOSITORY_URI:$DEPLOY_STAGE
      - docker tag $WEB_IMAGE_REPO_NAME:latest $WEB_REPOSITORY_URI:$DEPLOY_STAGE
      - echo Images tagged
      - docker push $APP_REPOSITORY_URI:$DEPLOY_STAGE
      - docker push $WEB_REPOSITORY_URI:$DEPLOY_STAGE
      - echo Images pushed to repositories

In our case DEPLOY_STAGE is defined in the CodeBuild project as an environment variable so that we can set up different project stages fairly easily but still do somewhat dynamic builds.

@eino-makitalo

Yes indeed.... waiting over 30 minutes with
Step 1/16 : FROM python:3.6
3.6: Pulling from library/python

This makes CodeBuild quite unusable...

@subinataws
Contributor

We have added support for local caching, including Docker layers. Documentation: https://docs.aws.amazon.com/codebuild/latest/userguide/build-caching.html#caching-local
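For anyone wiring this up outside the console: the cache is a project-level setting rather than a buildspec setting. In CloudFormation it looks roughly like this (a sketch of an AWS::CodeBuild::Project excerpt):

Cache:
  Type: LOCAL
  Modes:
    - LOCAL_DOCKER_LAYER_CACHE
    - LOCAL_SOURCE_CACHE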

@dev-walker

Sorry, it's not clear how the new feature works.

  • The docs say "You can use a Docker layer cache in the Linux environment only", and it works for me with the image aws/codebuild/ubuntu-base:14.04 but not with aws/codebuild/docker:18.09.

  • I set up a pipeline with CodeBuild, but it looks like the first run on each new day doesn't use the cache (all subsequent runs that day do).

Can anyone explain the reason? How can I make CodeBuild use the cache even if changes only go through the pipeline once a week?

@Rathgore

Rathgore commented Mar 15, 2019

My limited experience so far is that it caches for a very short period of time. If I start repeat builds within a few minutes of each other it seems to use the cache most of the time, but any longer than that and it usually doesn’t hit the cache at all.

@josephvusich
Contributor

@dev-walker As explained in the documentation, the build cache is kept on local storage for maximum performance. When there are long intervals with no builds running, that underlying storage may be retired. Your first few builds in the morning may need to re-warm the cache if you ran very few builds overnight.

@gabrielenosso

Can someone explain how to use @monken's script step by step?

Should I use it during the creation of the image or as the buildspec on CodeBuild? (Which would mean my image should have docker inside?)

Sorry, I am far from being a DevOps guy...

I am using a custom Windows image, pushed to AWS ECR.

@deleugpn
Contributor

deleugpn commented Jul 12, 2019

@josephvusich or @subinataws can we get documentation about local cache busting? Is it possible? Are there any plans to make it possible? Any recommended workaround?

I know I would love a longer cache, as I mentioned previously, but on very rare occasions I need to bust the local docker layer cache to get the build passing.

@subinataws
Contributor

@deleugpn - you can override the cache setting (or for that matter any project setting) when invoking startBuild. So if you don't want to use the local cache for a particular build run, select "no cache". Replied to you on Slack as well.
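With the CLI that would be something along these lines (a sketch; the project name is hypothetical):

# Disable the project's configured cache for this one build only
aws codebuild start-build --project-name my-project --cache-override type=NO_CACHE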

@aws aws deleted a comment from deleugpn Jul 16, 2019
@blorenz

blorenz commented Aug 14, 2019

I'm hosting my custom-built CodeBuild image on ECR and running off a base image hosted on ECR. The slow network transfer rate is what led me to caching. Local caching still seems to be a black box: it's great when it hits, but when it misses, it's unclear why exactly it missed. I have tried to get more insight into the PROVISIONING stage to no avail. What exactly is going on with caching in terms of expiry, and what is it caching? Could we have more visibility into the cache?

@StarpTech

StarpTech commented Dec 9, 2019

Today I tried to implement --cache-from in our pipelines, but it failed (commands were re-executed) because the local docker cache is not working. I could execute the same commands locally and benefit from the cache.

We use aws/codebuild/standard:3.0 and enabled privileged mode + the local docker cache. Any idea?

@dimitry

dimitry commented Dec 16, 2019

@gabrielenosso did you ever figure out how to adapt this to a Windows image?

@0xMH

0xMH commented Apr 8, 2020

@StarpTech Have you reached a solution?

@dlozano

dlozano commented Apr 26, 2020

Docker 18.09 allowed BuildKit's automatic pull of cache images. Blog post
docker-compose 1.25.1 allows using BuildKit and therefore automatic pulls from the cache.

Is there any plan to update from 1.24 to 1.25.x? It seems that would help with this issue.
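Once the image ships docker-compose 1.25.x, opting in would presumably just be a matter of two environment variables in the buildspec (a sketch):

env:
  variables:
    DOCKER_BUILDKIT: "1"            # enable BuildKit for docker build
    COMPOSE_DOCKER_CLI_BUILD: "1"   # make docker-compose delegate builds to the docker CLI (1.25.1+)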

@themsaid

What's the point of local cache if 15 minutes is the maximum life span? Serious question!

@ivanmartos

Is there any recommended way to cache docker layers for longer than the lifespan of a CodeBuild local cache?
My use case: for every build I pull a mysql image from the public Docker registry to execute some local integration tests. How can I cache this image between builds? (Let's say I have a build frequency of 2 times a day.)

@n1ru4l

n1ru4l commented Feb 8, 2021

Docker registries now allow layer caching. Unfortunately, this is not supported by ECR yet. aws/containers-roadmap#876

@kylemacfarlane

kylemacfarlane commented Apr 20, 2021

None of the examples in this thread worked for me but I managed to find a solution.

The key is to enable the BuildKit inline cache.

A cut down example:

phases:
  install:
    runtime-versions:
      docker: 19
  build:
    commands:
      - echo Logging in to Amazon ECR...
      - $(aws ecr get-login --no-include-email --region $AWS_REGION)

      - echo Pulling image for cache...
      - docker pull $REPO_URI:$IMAGE_TAG || true

      - echo Building image...
      - DOCKER_BUILDKIT=1 docker build --build-arg BUILDKIT_INLINE_CACHE=1 --tag $REPO_URI:$IMAGE_TAG --cache-from $REPO_URI:$IMAGE_TAG .

The first time this runs it will still rebuild entirely but it will place an inline cache in the finished image. Then every future build will use the inline cache and be much faster.

It doesn't seem to like some parallel stages, e.g. where you build in one stage and only copy out the final binary. It makes sense why those layers wouldn't be stored in the inline cache, so you either need to store intermediate images as shown earlier in the thread or make your Dockerfile more linear.

@tamsky
Contributor

tamsky commented Apr 22, 2021

Given that you're using --cache-from $REPO_URI:$IMAGE_TAG, the following commands are not required (unless you want to force a complete pull for every build.)

      - echo Pulling image for cache...
      - docker pull $REPO_URI:$IMAGE_TAG || true

buildkit now knows how to pull cache layers on demand.

More info:

@kylemacfarlane

Given that you're using --cache-from $REPO_URI:$IMAGE_TAG, the following commands are not required (unless you want to force a complete pull for every build.)

      - echo Pulling image for cache...
      - docker pull $REPO_URI:$IMAGE_TAG || true

I found that removing this doesn't work. After ~20 mins, once the CodeBuild cache expires, it does a full rebuild again and there's no indication of anything getting pulled down.

I guess you could grep docker images and only pull if needed, but you run the risk of slowing down frequent builds with an increasingly stale cache.
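That check could look something like this (a sketch, using the same variables as the example above):

# Only pull the cache image when it isn't already available locally
if ! docker image inspect "$REPO_URI:$IMAGE_TAG" >/dev/null 2>&1; then
    docker pull "$REPO_URI:$IMAGE_TAG" || true
fi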

@RonaldTechnative

Based on this bug report and this source, I came up with the solution below.

  • It's not perfect, because the caching is not defined in AWS CodeBuild itself and it requires installing buildx, but we can cache that (see the example).
  • Requires no changes to Docker. Uses full Docker semantics for caching.
  • Supports multi-stage builds (mode=max).
  • It's 'efficient' in that each layer is separately stored and fetched from S3.

version: 0.2

env:
  shell: bash
phases:
  install:
    commands:
      - mkdir -vp ~/.docker/cli-plugins
      - |
        [[ ! -e ~/.docker/cli-plugins/docker-buildx ]] \
        && curl -L -o ~/.docker/cli-plugins/docker-buildx https://github.com/docker/buildx/releases/download/v0.10.3/buildx-v0.10.3.linux-amd64
      - chmod a+rx ~/.docker/cli-plugins/docker-buildx
  build:
    commands:
      - |
          docker buildx build \
             --progress=plain \
             -f ./Dockerfile \
             --push -t image:hash \
             --cache-to type=s3,region=${AWS_REGION},bucket=${DOCKER_CACHE_BUCKET_NAME},mode=max,name="frontend" \
             --cache-from type=s3,region=${AWS_REGION},bucket=${DOCKER_CACHE_BUCKET_NAME},name="frontend" \
             .

cache:
  paths:
    - /root/.docker/cli-plugins

@jared-christensen

I documented what worked for me here; it's very close to @RonaldTechnative's solution.
https://jareddesign.medium.com/my-experience-getting-docker-images-to-cache-in-aws-codebuild-using-ecr-974c5d9428ec

Basically I did not have to install buildx, I just had to create a builder:
docker buildx create --use --name mybuilder --driver docker-container

@Janosch

Janosch commented Sep 16, 2024

Since an example for the registry cache storage backend is missing, I am providing mine. It is similar to the inline storage backend used by @Rathgore's script, but it supports multi-stage builds natively without having to manage the cache of the stages manually. Key features are:

  • cache and image are separated
  • cache artifact can be stored in the same registry as the image (or in a different registry)
  • supports multi-stage images out of the box using min/max mode (as in the S3 example of @RonaldTechnative)
  • has some additional configuration options like cache compression
version: 0.2
env:
  variables:
    DOCKER_BUILDKIT: 1
    REPOSITORY_URI: "xxx.amazonaws.com"
    IMAGE_NAME: "my-image"
phases:
  install:
    commands:
      - docker buildx create --name container --driver=docker-container --driver-opt default-load=true
  pre_build:
    commands:
      - aws ecr get-login-password | docker login --username AWS --password-stdin $REPOSITORY_URI
  build:
    commands:
      - IMAGE_URL=$REPOSITORY_URI/$IMAGE_NAME
      - > 
        docker build 
        --cache-to mode=max,image-manifest=true,oci-mediatypes=true,type=registry,ref=$IMAGE_URL:build-cache
        --cache-from type=registry,ref=$IMAGE_URL:build-cache -t $IMAGE_URL:mytag --builder=container .
  post_build:
    commands:
      - docker push $IMAGE_URL:mytag

Installing buildx was not necessary; I assume it already exists in the build environment aws/codebuild/amazonlinux2-x86_64-standard:5.0 which I use. The catch is that the registry cache backend only works with the default docker driver if it is configured to use the containerd image store. I did not manage to get that working, therefore I created a builder using the docker-container driver, which also supports the registry cache storage backend.

@droidlabour

Thanks @Janosch
