-
Notifications
You must be signed in to change notification settings - Fork 69
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Store digest of latest image in ImagePolicy status #368
base: main
Are you sure you want to change the base?
Conversation
Using the new `:digest` suffix in policy markers one can now instruct IAC to store the digest of an image instead of the tag in the given manifest. This makes use of the the `.status.imageDigest` field introduced in fluxcd/image-reflector-controller#368. closes #165 Signed-off-by: Max Jonas Werner <[email protected]>
@darkowlzz I pushed a much more lightweight approach where the ImagePolicyReconciler does the digest fetching now so that only the digest for the latest tag is fetched. |
8b3335a
to
0ffd955
Compare
This is now ready for another round of reviews. I adapted the PR description to properly reflect the changes. /cc @stefanprodan @darkowlzz |
Using the new `:digest` suffix in policy markers one can now instruct IAC to store the digest of an image instead of the tag in the given manifest. This makes use of the the `.status.imageDigest` field introduced in fluxcd/image-reflector-controller#368. closes #165 Signed-off-by: Max Jonas Werner <[email protected]>
Using the new `:digest` suffix in policy markers one can now instruct IAC to store the digest of an image instead of the tag in the given manifest. This makes use of the the `.status.imageDigest` field introduced in fluxcd/image-reflector-controller#368. closes #165 Signed-off-by: Max Jonas Werner <[email protected]>
Using the new `:digest` suffix in policy markers one can now instruct IAC to store the digest of an image instead of the tag in the given manifest. This makes use of the the `.status.imageDigest` field introduced in fluxcd/image-reflector-controller#368. closes #165 Signed-off-by: Max Jonas Werner <[email protected]>
Using the new `:digest` suffix in policy markers one can now instruct IAC to store the digest of an image instead of the tag in the given manifest. This makes use of the the `.status.imageDigest` field introduced in fluxcd/image-reflector-controller#368. closes #165 Signed-off-by: Max Jonas Werner <[email protected]>
@darkowlzz all comments are addressed. Please take a look. |
The new API field `.status.latestDigest` in the `ImagePolicy` kind stores the digest of the image referred to by the the `.status.latestImage` field. The setting of this field is governed by the newly introduced field `.spec.digestReflectionPolicy` which takes one of the values `Always` or `IfNotPresent`. See the updated documentation under `docs/spec/` for details. The new status field can be used to pin an image to an immutable descriptor rather than to a potentially moving tag, increasing the security of workloads deployed on a cluster. The goal is to make use of the digest in IAC so that manifests can be updated with the actual image digest. Signed-off-by: Max Jonas Werner <[email protected]>
This new field summarizes all data reflecting an image reference, i.e. the repository name, tag and digest. Since this change changes the API in a backwards-incompatible way, the new API version v1beta3 is introduced. Signed-off-by: Max Jonas Werner <[email protected]>
This way we circumvent issues with server-side apply so that users can explicitly change this field instead of having to remove it. The latter case might lead to the API server not removing it if another field manager is registered for that field, causing an unintended drift. This commit also aligns the v1beta3 API with the latest changes done in v1beta2. Signed-off-by: Max Jonas Werner <[email protected]>
We agreed to make the changes in the existing v1beta2 API version. Signed-off-by: Max Jonas Werner <[email protected]>
Signed-off-by: Max Jonas Werner <[email protected]>
Signed-off-by: Max Jonas Werner <[email protected]>
Signed-off-by: Max Jonas Werner <[email protected]>
Signed-off-by: Max Jonas Werner <[email protected]>
.spec.image has no relevance in the given package, anymore. Signed-off-by: Max Jonas Werner <[email protected]>
Signed-off-by: Max Jonas Werner <[email protected]>
These must have been leftovers from previous iterations of this test. Signed-off-by: Max Jonas Werner <[email protected]>
1. Default digestReflectionPolicy to "Never" and add a getter. With the getter method we will never encounter an empty policy even if defaulting hasn't taken place. 2. Make status.latestRef a pointer to align with status.observedPreviousRef. Having both fields be pointers makes it easier to use them in code so we only have to compare to nil and not the zero value. Signed-off-by: Max Jonas Werner <[email protected]>
The field hasn't been set properly before. Correct behaviour is backed by associated unit tests. Signed-off-by: Max Jonas Werner <[email protected]>
The updated documentation has gotten lost due to the back and forth with v1beta3. Signed-off-by: Max Jonas Werner <[email protected]>
4ba8d9c
to
5cfa766
Compare
api/v1beta2/imagepolicy_types.go
Outdated
@@ -42,8 +42,26 @@ type ImagePolicySpec struct { | |||
// ordered and compared. | |||
// +optional | |||
FilterTags *TagFilter `json:"filterTags,omitempty"` | |||
// DigestReflectionPolicy governs the setting of the `.status.latestDigest` field. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should have been
// DigestReflectionPolicy governs the setting of the `.status.latestDigest` field. | |
// DigestReflectionPolicy governs the setting of the `.status.latestRef` field. |
but maybe more specifically .status.latestRef.digest
as the description doesn't say anything about the digest. On that point, the description doesn't talk anything about the common/simple meaning of this field. Nor does the description of ReflectionPolicy
, the underlying type, provides any relevant information about the tag's digest. It mentions "a value from the registry in a certain resource field".
Wouldn't it be better to describe this field independent of what's in the status, a more general meaning for those who may not be familiar with status or why someone would set this. It's about including the tag digest in the resulting latest image.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going through the latest changes, the concern about the definition of the field depending on the status is still not addressed.
Wouldn't it be better to describe this field independent of what's in the status, a more general meaning for those who may not be familiar with status or why someone would set this. It's about including the tag digest in the resulting latest image.
?
// If the old latest image and new latest image don't match, set the old | ||
// image as the observed previous image. | ||
// NOTE: The following allows the previous image to be set empty when | ||
// there's a failure and a subsequent recovery from it. This behavior helps | ||
// avoid creating an update event as there's no previous image to infer | ||
// from. Recovery from a failure shouldn't result in an update event. | ||
if oldObj.Status.LatestImage != obj.Status.LatestImage { | ||
if oldObj.Status.LatestRef == nil || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kind of an edge case that occurs when updating to the new version with new fields. This check of latest ref to be nil results in the object to have an observed previous image (old field) and latest image (and latest ref as well) to be equal. This is because while migrating/updating the latest ref of old object is nil during migrating as it's a new field. The latest image (old field) still has the old data but since that's not checked here, it results in this scenario.
An example to reproduce it:
An ImagePolicy with an old version of the controller
spec:
imageRepositoryRef:
name: podinfo
policy:
semver:
range: 5.0.x
status:
conditions:
- lastTransitionTime: "2023-09-20T22:15:50Z"
message: Latest image tag for 'ghcr.io/stefanprodan/podinfo' resolved to 5.0.3
observedGeneration: 1
reason: Succeeded
status: "True"
type: Ready
latestImage: ghcr.io/stefanprodan/podinfo:5.0.3
observedGeneration: 1
After updating the controller and API to the new version, the following is seen
spec:
digestReflectionPolicy: Never
imageRepositoryRef:
name: podinfo
policy:
semver:
range: 5.0.x
status:
conditions:
- lastTransitionTime: "2023-09-20T22:15:50Z"
message: Latest image tag for 'ghcr.io/stefanprodan/podinfo' resolved to 5.0.3
observedGeneration: 1
reason: Succeeded
status: "True"
type: Ready
latestImage: ghcr.io/stefanprodan/podinfo:5.0.3
latestRef:
image: ghcr.io/stefanprodan/podinfo
tag: 5.0.3
observedGeneration: 1
observedPreviousImage: ghcr.io/stefanprodan/podinfo:5.0.3
Note that there's no observed previous ref.
Calling it an "edge case" because this may not happen when the controller restarts, the database is empty, and ImagePolicy attempts to reconcile before ImageRepository populates the database. Failing ImagePolicy erases the latest image data completely and next time it succeeds, everything appears fine as if it's a new object and everything gets set properly. But logically, it's a bug which can be reproduced when running this locally (with make run
) or controller data is backed by persistent storage. I was able to reproduce this without a PVC, which means it can happen in the usual deployments as well.
I think for proper update/migration, all the new fields should copy the value from the old fields first and later apply any mutation if relevant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fixed it and added a test case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went through the fix and it doesn't look like it addressed the core issue as I pointed out above. Now it just ensures that the reconciler would not touch the previous observations if the latest ref of old object was empty. This only works for objects that didn't had the old observed previous image in the status. It doesn't work for objects that had observed previous image.
For example, before the migration, I had the following object status:
status:
conditions:
- lastTransitionTime: "2023-10-03T11:58:48Z"
message: Latest image tag for 'ghcr.io/stefanprodan/podinfo' updated from 5.0.3
to 5.1.4
observedGeneration: 2
reason: Succeeded
status: "True"
type: Ready
latestImage: ghcr.io/stefanprodan/podinfo:5.1.4
observedGeneration: 2
observedPreviousImage: ghcr.io/stefanprodan/podinfo:5.0.3
After migration:
status:
conditions:
- lastTransitionTime: "2023-10-03T11:59:13Z"
message: Latest image tag for 'ghcr.io/stefanprodan/podinfo' resolved to 5.1.4
observedGeneration: 2
reason: Succeeded
status: "True"
type: Ready
latestImage: ghcr.io/stefanprodan/podinfo:5.1.4
latestRef:
image: ghcr.io/stefanprodan/podinfo
tag: 5.1.4
observedGeneration: 2
observedPreviousImage: ghcr.io/stefanprodan/podinfo:5.0.3
The deprecated observedPreviousImage
exists but not the new observedPreviousRef
.
I still think that the proper fix is what I suggested in my previous comment
I think for proper update/migration, all the new fields should copy the value from the old fields first and later apply any mutation if relevant.
I couldn't comment about the notification message inline as that's not modified in the changes. The current code doesn't change the |
|
||
func (r *ImagePolicyReconciler) fetchDigest(ctx context.Context, repo *imagev1.ImageRepository, latest string, obj *imagev1.ImagePolicy) (string, error) { | ||
ref := strings.Join([]string{repo.Spec.Image, latest}, ":") | ||
tagRef, err := name.ParseReference(ref) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that ImagePolicy parses the image and makes an external request similar to ImageRepository, I think it should also handle basic sanity checks and prevent unnecessary retries when it's clear that retrying won't change anything.
When a bad image reference is provided, say ghcr.io/stefanprodan/podinfo/foo:bar:zzz:qqq/aaa
the ImageRepository enters stalled state:
spec:
exclusionList:
- ^.*\.sig$
image: ghcr.io/stefanprodan/podinfo/foo:bar:zzz:qqq/aaa
interval: 1h
provider: generic
status:
canonicalImageName: ghcr.io/stefanprodan/podinfo
conditions:
- lastTransitionTime: "2023-09-21T01:10:56Z"
message: 'could not parse reference: ghcr.io/stefanprodan/podinfo/foo:bar:zzz:qqq/aaa'
observedGeneration: 17
reason: ImageURLInvalid
status: "True"
type: Stalled
- lastTransitionTime: "2023-09-21T01:10:56Z"
message: 'could not parse reference: ghcr.io/stefanprodan/podinfo/foo:bar:zzz:qqq/aaa'
observedGeneration: 17
reason: ImageURLInvalid
status: "False"
type: Ready
lastScanResult:
latestTags:
- latest
- 6.4.1
...
scanTime: "2023-09-21T01:06:09Z"
tagCount: 46
observedExclusionList:
- ^.*\.sig$
observedGeneration: 17
But the associated ImagePolicy fails and enters a retry loop
spec:
digestReflectionPolicy: IfNotPresent
imageRepositoryRef:
name: podinfo
policy:
alphabetical:
order: asc
status:
conditions:
- lastTransitionTime: "2023-09-21T01:10:56Z"
message: reconciliation in progress
observedGeneration: 12
reason: ProgressingWithRetry
status: "True"
type: Reconciling
- lastTransitionTime: "2023-09-21T01:10:56Z"
message: 'failed fetching digest of ghcr.io/stefanprodan/podinfo/foo:bar:zzz:qqq/aaa:latest:
failed parsing reference "ghcr.io/stefanprodan/podinfo/foo:bar:zzz:qqq/aaa:latest":
could not parse reference: ghcr.io/stefanprodan/podinfo/foo:bar:zzz:qqq/aaa:latest'
observedGeneration: 12
reason: Failed
status: "False"
type: Ready
latestImage: ghcr.io/stefanprodan/podinfo/foo:bar:zzz:qqq/aaa:latest
latestRef:
image: ghcr.io/stefanprodan/podinfo/foo:bar:zzz:qqq/aaa
tag: latest
observedGeneration: 12
observedPreviousImage: ghcr.io/stefanprodan/podinfo/foo:bar:zzz:qqq/aaa:latest
observedPreviousRef:
image: ghcr.io/stefanprodan/podinfo/foo:bar:zzz:qqq/aaa
tag: latest
With logs:
{"level":"error","ts":"2023-09-21T01:10:56.776Z","msg":"Reconciler error","controller":"imagepolicy","controllerGroup":"image.toolkit.fluxcd.io","controllerKind":"ImagePolicy","ImagePolicy":{"name":"podinfo","namespace":"default"},"namespace":"default","name":"podinfo","reconcileID":"7d5c92dc-aded-4bdc-a8b1-e7fb036e56a9","error":"failed fetching digest of ghcr.io/stefanprodan/podinfo/foo:bar:zzz:qqq/aaa:latest: failed parsing reference \"ghcr.io/stefanprodan/podinfo/foo:bar:zzz:qqq/aaa:latest\": could not parse reference: ghcr.io/stefanprodan/podinfo/foo:bar:zzz:qqq/aaa:latest"}
ImagePolicy doesn't check readiness of ImageRepository because it should be able to provide latest image based on the last valid configuration.
This behavior also wrongly sets the latest and observed previous ref of ImagePolicy to the bad image. I think it should only set valid values and avoid reconciling failing configurations so that the status will contain the last valid latest image even if the current configuration is failing.
It may be better to parse the image early in the ImagePolicy reconciler if digest is enabled and based on the that stall if the image reference is invalid. When the ImageRepository will be updated with good image reference, it'll trigger ImagePolicy to reconcile again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an issue that's not introduced through this PR so I've created a separate GH issue for it: #449
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#449 doesn't seem to be a bug/issue. That's how flux controllers work. Objects that have dependency on other objects usually take the last successful result. If a source object starts failing, kustomization will not fail, but use the last successful artifact from the source. This is why I mentioned
ImagePolicy doesn't check readiness of ImageRepository because it should be able to provide latest image based on the last valid configuration.
above. This is the reasoning behind the behavior, not a bug.
In my above comment, I'm describing a new issue introduced by this change because ImagePolicy reconciler now parses the image. This is a new behavior introduced by this change. Reading my comment again, I think the explanation is clear enough. Because of parsing image, ImagePolicy now enters into a failure state due to invalid image. I can explain again in some other way if the issue is really not clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#449 doesn't seem to be a bug/issue. That's how flux controllers work.
It is a bug. Look at the .status.latestImage
field value in the issue description. It's using the broken image reference and reports success on it. This is wrong (because obviously the tag hasn't been derived from that specific image reference) and misleading (because it doesn't tell the user that the image reference is actually wrong).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a bug, but that's not what #449 describes. It says that the condition should be not ready, but instead, it's a bug in where the image is read from. Bug and the issue title or description don't match.
Since the ImageRepository doesn't save the observed image, we can't reliably provide the correct image to ImagePolicy. The canonical image name still points to the right image, but that's not the same as the input image. We can either add a new field in ImageRepository, say observedImage
, that's updated only on successful reconciliation, or ImagePolicy reconciler should update the latest image only when ImageRepository is ready and skip reconciliation otherwise. I think a new observed image field would be better. This has to be addressed separately, out of scope of this change.
The above issue is not what I've tried to point out in this thread. The situation of getting into a failure retry loop is still due to a change introduced in this change and is still in scope of this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Besides @darkowlzz his remarks, I see no further blockers.
Signed-off-by: Max Jonas Werner <[email protected]>
Signed-off-by: Max Jonas Werner <[email protected]>
Signed-off-by: Max Jonas Werner <[email protected]>
While working on static HelmRepository OCI, Alerts and Providers support in CLI, I made some changes in fluxcd/flux2#4298 that allows to conditionally make objects reconciliable. I think we can use the same for ImagePolicy to allow reconciliation depending on the digest reflection policy. |
@makkes Hi, As you did all the work on this PR and the one in image-automation-controller (thank you) Is this something you still working on ? if not, should I try to understand more what is missing and write some change or is there any plan to make this feature happens ? thank you |
API changes
In addition to the latest selected tag, ImagePolicy now allows to also record the digest of the tag in
.status.latestRef.digest
. The setting of this field is governed by the newly introduced field.spec.digestReflectionPolicy
which takes one of the valuesAlways
,Never
orIfNotPresent
. See the updated documentation underdocs/spec/
for details.The new
.status.latestRef.digest
status field can be used to pin an image to an immutable descriptor rather than to a potentially moving tag, increasing the security of workloads deployed on a cluster.The goal is to make use of the digest in IAC so that manifests can be updated with the actual image digest.
ImagePolicy now gathers all information about the selected image in the
.status.latestRef
field, i.e..status.latestImage
is now split up into.status.latestRef.{image,tag,digest}
. The same is true for.status.observedPreviousImage
for which the accompanying new field is called.status.observedPreviousRef
. The existing fields are retained for backwards-compatibility.Implementation notes
Since both,
ImagePolicyReconciler
andImageRepositoryReconciler
now need to interact with registries, the interaction code (primarilysetAuthOptions
) was moved into the./internal/registry
package and theAuthOptionsGetter
type in it. AnAuthOptionsGetter
variable is created inmain.go
and injected into both reconcilers.closes #87
refs #214