use k8s cluster default imagePullPolicy #50

arno01 · 2021-08-25T14:29:31Z

Reproducer

build docker image "test:123" which would just "echo 123" and "docker push" it to the registry;
create the deployment (send-manifest);
get the logs from the provider (provider lease-logs);
close the deployment
repeat first 4 steps except that do some modification to the image in step 1, say "echo 555", but keep the same image tag "test:123".

Update: this is mostly about the image pull policy for the images with the :latest and untagged images as seen from below discussion. Other tags, such as test:123 should stay test:123 1:1 (immutable).

Expected behavior:

Provider should pull the new image if it has :latest tag or untagged.

Actual behavior:

The provider will not pull the new image, it will start the old image.

Workaround:

There is no workaround to a changed default K8s behavior.

Provider in question:

I haven't tested different providers.

"equinix-metal-ams1","akash","mn2-ng","https://provider.provider-0.prod.ams1.akash.pub:8443","akash14c4ng96vdle6tae8r4hc2w4ujwrsh3x9tuudk0"

The text was updated successfully, but these errors were encountered:

arno01 · 2021-08-25T14:32:07Z

cc @dmikey

boz · 2021-08-25T14:47:08Z

@arno01, this is expected behavior. You need to use a different tag to deploy updates. Images are heavily cached, pushing changes to a single tag will cause both versions to be running very easily, even if we forced a pull every time.

arno01 · 2021-08-29T20:23:34Z

@arno01, this is expected behavior. You need to use a different tag to deploy updates. Images are heavily cached, pushing changes to a single tag will cause both versions to be running very easily, even if we forced a pull every time.

@boz this should be mentioned in the Akash Docs then, since I've already seen two cases where people would not figure why their image is not working at one provider but does at the other one. 1 2

And then this breaks the default Kubernetes behavior for the :latest image tag, since it should always be re-pulling images of this tag:

omit the imagePullPolicy and use :latest as the tag for the image to use; Kubernetes will set the policy to Always.

Source: https://kubernetes.io/docs/concepts/containers/images/#updating-images

I would actually be more inclined to let people specify in their deployment manifests the imagePullPolicy in order to get more flexible about this.

And then, most of people are going to be using same tags, e.g. nginx:stable, nginx:mainline, nginx:latest, nginx:1.21 (can replace nginx practically with any image).
The issue I see with that (the providers are heavily caching the images) is that the security issues in the libraries of the images (nor the apps themselves in some cases, i.e. nginx:stable, nginx:1, nginx:latest) will not get pushed as people re-deploy their deployments should they get alerted by CVE reports.

boz · 2021-08-29T21:44:47Z

it is best practice to use a version tag. we can add that to the docs.

boz · 2021-08-30T04:29:34Z

It says as much right in your source

Caution:
You should avoid using the latest tag when deploying containers in production, as it is harder to track which version of the image is running and more difficult to roll back to a working version.

Instead, specify a meaningful tag such as v1.42.0.

arno01 · 2021-08-31T10:19:44Z

@boz

It says as much right in your source

Caution:
You should avoid using the latest tag when deploying containers in production, as it is harder to track which version of the image is running and more difficult to roll back to a working version.

Instead, specify a meaningful tag such as v1.42.0.

I agree with that recommendation, however the concern I've raised in this issue has nothing to do with that recommendation.

This is a recommendation for running the production containers which many of people do not.
Many (probably even the majority) are going to be using no tag which defaults to :latest. (People are usually modifying their images then pushing them again and over again, with :latest tag.)
(Edit: And this is not only about the :latest tag. See Update 1 below)

As I'm typing this, there is a 3rd occurrence I'm seeing people are struggling because of this and I'm having to explain them the Akash providers are heavily caching the images, they should use a new tag for every new image update.

My point is not to lobby for using the :latest tag but rather to fix the imagePullPolicy which is not currently set to Always for the untagged images (or tagged with :latest tag) on the Akash providers.
I suggest to stick with the default imagePullPolicy
otherwise there are more and more people will be hitting this issue, there is no doubt about it.

Update 1:
And then, the same applies to the other than :latest image tags.
There are plenty of projects leverage CI/CD for constantly building and re-pushing the same tags, when that happens, the libraries are getting updated (i.e. openssl library in the nginx container).
In this case, the image tag stays always same, just as I've already described above https://github.com/ovrclk/akash/issues/1354#issuecomment-907866119
Which leads to security issues when Akash provider caches these images and is not re-pulling them.

Basically this deviation from the defaults is not just a nuisance but rather a security issue in not having to re-pull the images. (simplest example: nginx:stable won't get re-pulled, therefore, users won't get security updates on the redeployment nor other people who make their first nginx:stable deployment will likely to receive a stale cached copy of nginx:stable [which may turn out to not be as :stable as it sounds ;)])

zjor · 2021-10-16T09:05:26Z

100% agree with @arno01
I wish :latest worked out of the box

tidrolpolelsef · 2022-07-05T20:27:43Z

My recommendation back when mainnet launched was to block :latest, :stable etc. The labels don't actually mean anything unfortunately. For example, there is no guaranteed that :latest is in fact the latest.

andy108369 · 2022-07-14T22:40:52Z

For example, there is no guaranteed that :latest is in fact the latest.

It should guarantee that for the moment someone deploys the app, which isn't working that way as of now because of imagePullPolicy: IfNotPresent.
It won't mean it's the :latest some prolonged time from the moment someone deployed the app.

This is how K8s works and have always been working by default everywhere for the :latest tag, I've mentioned that above

I'd prefer sticking with the widely accepted defaults (for imagePullPolicy) rather than inventing something new.

And should we want :latest to actually mean they will always be :latest, we should probably look into implementing something like https://keel.sh which would automatically take care updating the images whenever there is a new behind the :latest tag. But there should be a toggle for that, i.e. services.<name>.image_auto_update: true for example.

As a compromise, the imagePullPolicy could be set via the SDL i.e. services.<name>.imagePullPolicy: always for example.

andy108369 · 2022-07-14T22:50:13Z

To get the default image pull policy, we should only remove this line https://github.com/ovrclk/akash/blob/v0.16.4/provider/cluster/kube/builder/workload.go#L56

From the doc

When you first create a Deployment, StatefulSet, Pod, or other object that includes a Pod template, then by default the pull policy of all containers in that pod will be set to IfNotPresent if it is not explicitly specified. This policy causes the kubelet to skip pulling an image if it already exists.

So K8s sets PullIfNotPresent by default, just like we do, except that when we do it, we ignore these special conditions:

Default image pull policy
When you (or a controller) submit a new Pod to the API server, your cluster sets the imagePullPolicy field when specific conditions are met:

if you omit the imagePullPolicy field, and the tag for the container image is :latest, imagePullPolicy is automatically set to Always;

if you omit the imagePullPolicy field, and you don't specify the tag for the container image, imagePullPolicy is automatically set to Always;

if you omit the imagePullPolicy field, and you specify the tag for the container image that isn't :latest, the imagePullPolicy is automatically set to IfNotPresent.

https://kubernetes.io/docs/concepts/containers/images/#imagepullpolicy-defaulting

As @tidrolpolelsef said, it's one of those weird config options where 'unset' has its own unique meaning :-)

andy108369 · 2022-11-28T14:07:32Z

PR akash-network/provider#54

andy108369 · 2022-11-28T20:49:22Z

FWIW, I've tested this further, apparently the HEAD requests are used for obtaining the image reference and they aren’t triggering the Docker Hub rate limiter upon Pod restart even with the imagePullPolicy: Always, as long as image reference haven't been updated on the remote. I.e., this triggers the rate limiter only when it actually pulls the new image, not just queries (HEAD) for the new one.

https://asciinema.org/a/541059

The Docker Hub API isn't too restrictive (100 pulls per 6 hours per IP address). Source

anilmurty · 2022-11-28T23:40:46Z

Wanted to update this thread after my discussion with @boz today:
Pros of supporting/ allowing users to use the "latest" tag in SDLs (and pulling a fresh uncached image when used or no tag specified):

Allows autoupdating of container images in cases like presearch
Allows us to potentially build a CI/CD style pipeline where new artifacts get deployed automatically to akash
Allows easier integration with solutions like Fleek.co
Is consistent with Kubernetes default behavior

For these reasons we are going to work to allow this. The only concern/ downside is making sure we document this behavior clearly and note that a running pod will not be updated unless the pod is terminated (forcing a new pod to be spun up, which will result in the "latest" latest image to be pulled down) - which I believe is how solutions like the presearch "autopdater" handle this anyway (their autoupdater is a fork of https://github.com/containrrr/watchtower)

andy108369 · 2022-11-28T23:51:46Z

forcing a new pod to be spun up, which will result in the "latest" latest image to be pulled down

And this (the pod restart) can be easily triggered from within the pod itself or the outside (through lease-shell, then kill <child process of PID 1> inside the pod).
Either way, this can also be scheduled, etc

vpavlin · 2024-08-19T08:25:55Z

Completely agree - this isn't just about latest, but any tag (test, dev, stage, rolling..) - I should have the freedom to be able to deploy from a tag and have deployed what image hash the tag is actually pointing to.

+100 for imagePullPolicy: always:)

boz closed this as completed Aug 25, 2021

arno01 reopened this Jun 11, 2022

andy108369 mentioned this issue Jul 15, 2022

use k8s cluster default imagePullPolicy akash-network/node#1668

Closed

andy108369 changed the title ~~provider would not pull the new image when the tag is same~~ use k8s cluster default imagePullPolicy Jul 15, 2022

andy108369 referenced this issue in akash-network/provider Nov 28, 2022

use k8s default imagePullPolicy (fixes ovrclk/akash#1354)

ca1b741

arno01 mentioned this issue Aug 31, 2021

use k8s cluster default imagePullPolicy #50

Open

troian added the repo/provider Akash provider-services repo issues label Feb 17, 2023

troian transferred this issue from akash-network/node Feb 17, 2023

andy108369 mentioned this issue Mar 31, 2023

feature request: support client-managed healthchecks (livenessProbe, readinessProbe) and imagePullPolicy #85

Open

andy108369 mentioned this issue Sep 1, 2023

Update deploy.yaml akash-network/awesome-akash#393

Merged

andy108369 mentioned this issue Aug 19, 2024

dria: add DKN compute node akash-network/awesome-akash#556

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

use k8s cluster default imagePullPolicy #50

use k8s cluster default imagePullPolicy #50

arno01 commented Aug 25, 2021 •

edited by andy108369

Loading

arno01 commented Aug 25, 2021

boz commented Aug 25, 2021

arno01 commented Aug 29, 2021 •

edited

Loading

boz commented Aug 29, 2021

boz commented Aug 30, 2021

arno01 commented Aug 31, 2021 •

edited

Loading

zjor commented Oct 16, 2021

tidrolpolelsef commented Jul 5, 2022

andy108369 commented Jul 14, 2022 •

edited

Loading

andy108369 commented Jul 14, 2022 •

edited

Loading

andy108369 commented Nov 28, 2022

andy108369 commented Nov 28, 2022 •

edited

Loading

anilmurty commented Nov 28, 2022

andy108369 commented Nov 28, 2022 •

edited

Loading

vpavlin commented Aug 19, 2024

use k8s cluster default imagePullPolicy #50

use k8s cluster default imagePullPolicy #50

Comments

arno01 commented Aug 25, 2021 • edited by andy108369 Loading

Reproducer

Expected behavior:

Actual behavior:

Workaround:

Provider in question:

arno01 commented Aug 25, 2021

boz commented Aug 25, 2021

arno01 commented Aug 29, 2021 • edited Loading

boz commented Aug 29, 2021

boz commented Aug 30, 2021

arno01 commented Aug 31, 2021 • edited Loading

zjor commented Oct 16, 2021

tidrolpolelsef commented Jul 5, 2022

andy108369 commented Jul 14, 2022 • edited Loading

andy108369 commented Jul 14, 2022 • edited Loading

andy108369 commented Nov 28, 2022

andy108369 commented Nov 28, 2022 • edited Loading

anilmurty commented Nov 28, 2022

andy108369 commented Nov 28, 2022 • edited Loading

vpavlin commented Aug 19, 2024

arno01 commented Aug 25, 2021 •

edited by andy108369

Loading

arno01 commented Aug 29, 2021 •

edited

Loading

arno01 commented Aug 31, 2021 •

edited

Loading

andy108369 commented Jul 14, 2022 •

edited

Loading

andy108369 commented Jul 14, 2022 •

edited

Loading

andy108369 commented Nov 28, 2022 •

edited

Loading

andy108369 commented Nov 28, 2022 •

edited

Loading