Skip to content
This repository has been archived by the owner on Nov 1, 2022. It is now read-only.

Disable Image Scanning with a flag #2524

Closed
wants to merge 2 commits into from

Conversation

johnsushant
Copy link
Contributor

@johnsushant johnsushant commented Oct 17, 2019

Fixes: #2411

@johnsushant johnsushant changed the title Disable Image Scanning with a flag [WIP] Disable Image Scanning with a flag Oct 17, 2019
@johnsushant johnsushant changed the title [WIP] Disable Image Scanning with a flag Disable Image Scanning with a flag Oct 17, 2019
@johnsushant johnsushant changed the title Disable Image Scanning with a flag [WIP] Disable Image Scanning with a flag Oct 17, 2019
@johnsushant johnsushant changed the title [WIP] Disable Image Scanning with a flag Disable Image Scanning with a flag Oct 17, 2019
@2opremio 2opremio requested a review from squaremo November 6, 2019 12:06
@2opremio
Copy link
Contributor

2opremio commented Nov 6, 2019

@johnsushant Thanks for this! Sorry it took so long to get back to you. It looks good, but I would like @squaremo to also take a look.

Also, could you please add a test?

@squaremo
Copy link
Member

squaremo commented Nov 27, 2019

A cosmetic thing: I would prefer the flag were named --automation-disable, to fit better with existing flags. (Loth as I am to have a double negative (disable=false), it's better to have boolean flags default to false.) This will entail swapping the clauses in the description of the flag around too -- "disables automated manifest updates, including scanning image registries"

@stefanprodan
Copy link
Member

We should make sure that when this flag is used Flux can run without Memcached. @johnsushant can you test this please?

@2opremio
Copy link
Contributor

2opremio commented Dec 6, 2019

To test this PR, you can use image 2opremio/flux:disable-image-scan-a7e5d409, and supply flag --disable-image-scan

@ctryti
Copy link

ctryti commented Dec 7, 2019

I just tested this image with --disable-image-scan (also double checked with --disable-image-scan=true), but I'm still seeing logs like this:

(during startup)
caller=memcached.go:74 component=memcached err="Error setting memcache servers to 'memcached': lookup _memcached._tcp.memcached on 10.23.240.10:53: no such host"
...
(every minute)
caller=memcached.go:156 component=memcached err="error updating memcache servers: lookup _memcached._tcp.memcached on 10.23.240.10:53: no such host"
caller=memcached.go:156 component=memcached err="error updating memcache servers: lookup _memcached._tcp.memcached on 10.23.240.10:53: no such host"

Here are the interesting bits from the deployment.yaml:

      containers:
      - args:
        - --git-poll-interval=2m
        - --ssh-keygen-dir=/var/fluxd/keygen
        - --git-url=ssh://git@our-git-repo
        - --git-branch=develop
        - --git-label=flux
        - --git-user=flux
        - --git-email=team@our-email
        - --git-path=comma,separated,list,of,our,charts
        - --manifest-generation=true
        - --registry-exclude-image=*
        - --sync-state=secret
        - --git-readonly=true
        - --listen-metrics=:3031
        - --disable-image-scan=true
        image: 2opremio/flux:disable-image-scan-a7e5d409

No memcached pod running.

@2opremio
Copy link
Contributor

@johnsushant would you mind if I take over and finish the PR?

@2opremio 2opremio self-assigned this Dec 16, 2019
@2opremio
Copy link
Contributor

I would prefer the flag were named --automation-disable, to fit better with existing flags.

@squaremo I am taking this over and, although initially I liked --automation-disable, I am not sure it's the best name. I don't think it's precise enough, because disabling scanning also impacts things such us fluxctl list-images which is not part of automation in strict terms.

@2opremio
Copy link
Contributor

2opremio commented Dec 17, 2019

Actually, I think we are mixing things up (at least I am). There are a few things impacted by this PR, which are related, but don't need to necessarily be enabled or disabled in a single block:

  1. Image automatic updating
  2. Image scanning (aka cache warming)
  3. Image caching / memcached
  4. fluxctl list-images
  5. fluxctl release --update-image (update an image to certain tag)
  6. fluxctl release --update-all-images (update all images to their latest tag)

1 depends on 2
2 depends on 3
4 and 5 currently depend on 2 (since they depend on the registry, which is currently cached and the cache is warmed up)
6 depends on 2 because without scanning/warming you cannot currently know what's the latest image

This PR currently disables (1) and (2) and, in that context, it makes sense to rename --disable-image-scan to --automation-disable (so, I take that back, sorry @squaremo )

No. bear in mind that disabling (2) will also impact the performance of (3) and (4) since the image information won't be cached by default.

In fact, I think that disabling (2) currently also breaks (4) and (5) since, AFAICT, the cache is only filled by the warmer. If the warmer is off, the cache will always be empty and thus, there will be no images to list and no

That can be fixed by filling the cache on demand instead of using the warmer.

We could also simply disable (3) (as stefan suggests) and use a live registry, but it will have a performance impact.

Actions:

  1. Add a proper error for (6) when scanning/automation is disabled
  2. Decide what to do about (4) and (5). Options are:
    a. Add an error like for (6)
    b. Use a live registry (no cached registry, as @stefanprodan suggests)
    c. Use a cache but fill it up on demand (since scanning is disabled).
    d. (c), but with an in-memory cache, to get rid of memcached (but this would be quite a bit of work)

I would go for 2.b) (if users want higher performance, they can re-enable automation). @squaremo @stefanprodan @hiddeco thoughts?

Also, I think I would keep the --disable-image-scan flag since it reflects the performance implications better.

We could also have two separate flags --disable-automation (which only disables automatic updates) and --disable-image-scan (which implies --disable-automation and also stops scanning images).

@hiddeco
Copy link
Member

hiddeco commented Dec 17, 2019

@2opremio I agree that 2.b would theoretically be the best solution but I am not quite sure it will hold in real environments.

The fact that some registries have become stricter and stricter with rate limits, combined with images with a substantial amount of tags, makes me think that instead of a 'performance penalty' we simply won't be able to fulfil the request on time, most of the time. This is however a theory and can be debunked quite easily by just trying it out, if it doesn't work my vote goes to 2.a.

@squaremo
Copy link
Member

The fact that some registries have become stricter and stricter with rate limits, combined with images with a substantial amount of tags, makes me think that instead of a 'performance penalty' we simply won't be able to fulfil the request on time, most of the time. This is however a theory and can be debunked quite easily by just trying it out

Originally, the flux server did fetch all the image data when requested, and it was unworkable for all the reasons Hidde mentions. You could say the experiment has already been performed.

I suggested --automation-disable on the basis that image repo scanning, fluxctl release and even fluxctl list-{images,workloads} are part of release automation. From that standpoint, returning an error in response to fluxctl list-images would be justifiable -- but it would be friendlier to return, say, the workload information and a placeholder for image information (as is done when the database hasn't been populated yet).

My interpretation of automation may not be the most common understanding, so may be a better name for the flag, but that's the way I think it should work in any case.

@@ -744,7 +745,9 @@ func main() {
cacheWarmer.Priority = daemon.ImageRefresh
cacheWarmer.Trace = *registryTrace
shutdownWg.Add(1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This Add() should had been inside the if block

@2opremio
Copy link
Contributor

Closing this in favor of #2745

@2opremio 2opremio closed this Jan 15, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Turn off the docker scan option completely.
6 participants