Resolution fails when manually upgrading through legacy replaces chain beyond n+1 #1009

Closed
Tracked by #950
everettraven opened this issue Jul 3, 2024 · 3 comments · Fixed by #1017
@everettraven
Contributor

everettraven commented Jul 3, 2024

When manually stepping through version upgrades of a package with a ClusterExtension, following the replaces chain in the FBC channel, any upgrade beyond the first replaced version fails with the following status condition:

    - lastTransitionTime: "2024-07-03T19:44:12Z"
      message: 'error upgrading from currently installed version "0.1.0": no package
        "argocd-operator" matching version "0.3.0" in channel "alpha" found'
      observedGeneration: 4
      reason: ResolutionFailed
      status: "False"
      type: Resolved

In this case, I had successfully followed the replaces chain from v0.1.0 --> v0.2.0. When I then attempted to go from v0.2.0 --> v0.3.0, the upgrade failed with the resolution failure shown above.

To verify this is a valid upgrade path, you can see the channel and upgrade edges with:

docker run --rm -it quay.io/operator-framework/opm:latest render quay.io/operatorhubio/catalog:latest | jq -s '.[] | select( .schema == "olm.channel" ) | select( .package == "argocd-operator")'
Output
{
  "schema": "olm.channel",
  "name": "alpha",
  "package": "argocd-operator",
  "entries": [
    {
      "name": "argocd-operator.v0.0.11",
      "replaces": "argocd-operator.v0.0.9"
    },
    {
      "name": "argocd-operator.v0.0.12",
      "replaces": "argocd-operator.v0.0.11"
    },
    {
      "name": "argocd-operator.v0.0.13",
      "replaces": "argocd-operator.v0.0.12"
    },
    {
      "name": "argocd-operator.v0.0.14",
      "replaces": "argocd-operator.v0.0.13"
    },
    {
      "name": "argocd-operator.v0.0.15",
      "replaces": "argocd-operator.v0.0.14"
    },
    {
      "name": "argocd-operator.v0.0.2"
    },
    {
      "name": "argocd-operator.v0.0.3",
      "replaces": "argocd-operator.v0.0.2"
    },
    {
      "name": "argocd-operator.v0.0.4",
      "replaces": "argocd-operator.v0.0.3"
    },
    {
      "name": "argocd-operator.v0.0.5",
      "replaces": "argocd-operator.v0.0.4"
    },
    {
      "name": "argocd-operator.v0.0.6",
      "replaces": "argocd-operator.v0.0.5"
    },
    {
      "name": "argocd-operator.v0.0.8",
      "replaces": "argocd-operator.v0.0.6"
    },
    {
      "name": "argocd-operator.v0.0.9",
      "replaces": "argocd-operator.v0.0.8"
    },
    {
      "name": "argocd-operator.v0.1.0",
      "replaces": "argocd-operator.v0.0.15"
    },
    {
      "name": "argocd-operator.v0.10.0",
      "replaces": "argocd-operator.v0.9.2"
    },
    {
      "name": "argocd-operator.v0.10.1",
      "replaces": "argocd-operator.v0.10.0"
    },
    {
      "name": "argocd-operator.v0.2.0",
      "replaces": "argocd-operator.v0.1.0"
    },
    {
      "name": "argocd-operator.v0.2.1",
      "replaces": "argocd-operator.v0.2.0"
    },
    {
      "name": "argocd-operator.v0.3.0",
      "replaces": "argocd-operator.v0.2.1"
    },
    {
      "name": "argocd-operator.v0.4.0",
      "replaces": "argocd-operator.v0.3.0"
    },
    {
      "name": "argocd-operator.v0.5.0",
      "replaces": "argocd-operator.v0.4.0"
    },
    {
      "name": "argocd-operator.v0.6.0",
      "replaces": "argocd-operator.v0.5.0"
    },
    {
      "name": "argocd-operator.v0.7.0",
      "replaces": "argocd-operator.v0.6.0"
    },
    {
      "name": "argocd-operator.v0.8.0",
      "replaces": "argocd-operator.v0.7.0"
    },
    {
      "name": "argocd-operator.v0.9.0",
      "replaces": "argocd-operator.v0.8.0"
    },
    {
      "name": "argocd-operator.v0.9.1",
      "replaces": "argocd-operator.v0.9.0"
    },
    {
      "name": "argocd-operator.v0.9.2",
      "replaces": "argocd-operator.v0.9.1"
    }
  ]
}
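
For readers who want to check a replaces chain programmatically rather than by eye, here is a minimal sketch in Go. It is not operator-controller code: the `entry` type and the `upgradePath` helper are hypothetical names introduced for illustration, and the sketch assumes each bundle is replaced by at most one other bundle, which holds for the channel output above. It only follows `replaces` edges and ignores `skips` and `skipRange`.

```go
package main

import "fmt"

// entry mirrors one element of "entries" in an olm.channel blob.
// (Hypothetical type for this sketch, not a type from operator-controller.)
type entry struct {
	Name     string
	Replaces string
}

// upgradePath follows replaces edges forward from `from` until it reaches `to`.
// It returns the bundles to step through in order, or nil if `to` cannot be
// reached. Assumption: each bundle is replaced by at most one other bundle.
func upgradePath(entries []entry, from, to string) []string {
	// Index: replaced bundle name -> name of the bundle that replaces it.
	next := make(map[string]string, len(entries))
	for _, e := range entries {
		if e.Replaces != "" {
			next[e.Replaces] = e.Name
		}
	}

	var path []string
	seen := map[string]bool{from: true}
	for cur := from; cur != to; {
		n, ok := next[cur]
		if !ok || seen[n] {
			return nil // dead end, or a cycle in the channel data
		}
		seen[n] = true
		path = append(path, n)
		cur = n
	}
	return path
}

func main() {
	// A subset of the alpha channel entries shown above.
	entries := []entry{
		{Name: "argocd-operator.v0.1.0", Replaces: "argocd-operator.v0.0.15"},
		{Name: "argocd-operator.v0.2.0", Replaces: "argocd-operator.v0.1.0"},
		{Name: "argocd-operator.v0.2.1", Replaces: "argocd-operator.v0.2.0"},
		{Name: "argocd-operator.v0.3.0", Replaces: "argocd-operator.v0.2.1"},
	}
	for _, name := range upgradePath(entries, "argocd-operator.v0.1.0", "argocd-operator.v0.3.0") {
		fmt.Println(name)
	}
}
```

Feeding the full `entries` array from the `opm render` output into `upgradePath` gives the sequence of versions to set on the ClusterExtension, one step at a time.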
Reproduction Steps
  • Create a ClusterCatalog that references the operatorhub.io catalog image:
apiVersion: catalogd.operatorframework.io/v1alpha1
kind: ClusterCatalog
metadata:
  name: operatorhubio
spec:
  source:
    type: image
    image:
      ref: quay.io/operatorhubio/catalog:latest
      pollInterval: 24h
  • Create a ClusterExtension that installs the argocd-operator at v0.1.0:
apiVersion: olm.operatorframework.io/v1alpha1
kind: ClusterExtension
metadata:
  name: argocd
spec:
  packageName: argocd-operator
  installNamespace: default
  channel: alpha
  version: 0.1.0
  • Once the previously created ClusterExtension is successfully installed, manually update the version to v0.2.0. Note that you will have to disable the recently added CRD Upgrade Safety check for this to work, because of unknown changes to argo's CRDs:
apiVersion: olm.operatorframework.io/v1alpha1
kind: ClusterExtension
metadata:
  name: argocd
spec:
  packageName: argocd-operator
  installNamespace: default
  channel: alpha
  version: 0.2.0
  preflight:
    crdUpgradeSafety:
      disabled: true
  • Once the first manual upgrade is successfully installed, manually upgrade again to v0.3.0:
apiVersion: olm.operatorframework.io/v1alpha1
kind: ClusterExtension
metadata:
  name: argocd
spec:
  packageName: argocd-operator
  installNamespace: default
  channel: alpha
  version: 0.3.0
  preflight:
    crdUpgradeSafety:
      disabled: true

You should now be able to see the same resolution failure.

Full ClusterExtension output YAML
apiVersion: olm.operatorframework.io/v1alpha1
kind: ClusterExtension
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"olm.operatorframework.io/v1alpha1","kind":"ClusterExtension","metadata":{"annotations":{},"name":"argocd"},"spec":{"channel":"alpha","installNamespace":"default","packageName":"argocd-operator","preflight":{"crdUpgradeSafety":{"disabled":true}},"version":"0.3.0"}}
  creationTimestamp: "2024-07-03T19:41:50Z"
  finalizers:
  - olm.operatorframework.io/cleanup-unpack-cache
  - olm.operatorframework.io/delete-cached-bundle
  generation: 4
  name: argocd
  resourceVersion: "1979"
  uid: 86f0b4a0-d6a7-4856-8aa2-e2d39e40b2a9
spec:
  channel: alpha
  installNamespace: default
  packageName: argocd-operator
  preflight:
    crdUpgradeSafety:
      disabled: true
  upgradeConstraintPolicy: Enforce
  version: 0.3.0
status:
  conditions:
  - lastTransitionTime: "2024-07-03T19:41:52Z"
    message: ""
    observedGeneration: 3
    reason: Deprecated
    status: "False"
    type: Deprecated
  - lastTransitionTime: "2024-07-03T19:41:52Z"
    message: ""
    observedGeneration: 3
    reason: Deprecated
    status: "False"
    type: PackageDeprecated
  - lastTransitionTime: "2024-07-03T19:41:52Z"
    message: ""
    observedGeneration: 3
    reason: Deprecated
    status: "False"
    type: ChannelDeprecated
  - lastTransitionTime: "2024-07-03T19:41:52Z"
    message: ""
    observedGeneration: 3
    reason: Deprecated
    status: "False"
    type: BundleDeprecated
  - lastTransitionTime: "2024-07-03T19:44:12Z"
    message: 'error upgrading from currently installed version "0.1.0": no package
      "argocd-operator" matching version "0.3.0" in channel "alpha" found'
    observedGeneration: 4
    reason: ResolutionFailed
    status: "False"
    type: Resolved
  - lastTransitionTime: "2024-07-03T19:41:54Z"
    message: 'unpack successful: '
    observedGeneration: 3
    reason: UnpackSuccess
    status: "True"
    type: Unpacked
  - lastTransitionTime: "2024-07-03T19:43:53Z"
    message: Instantiated bundle argocd successfully
    observedGeneration: 3
    reason: Success
    status: "True"
    type: Installed

Note: this was originally found by OpenShift QE. I verified that the bug is reproducible, using a different package for the installation.

@joelanford
Member

Definitely a blocker, IMO. Any idea why this is happening? My initial suspicion is that we somehow have an incorrect understanding of the currently installed version?

@everettraven
Contributor Author

> Definitely a blocker, IMO. Any idea why this is happening? My initial suspicion is that we somehow have an incorrect understanding of the currently installed version?

That is my suspicion as well. It seems like the following call may be returning only the initially installed version:

    installedBundle, err := r.InstalledBundleGetter.GetInstalledBundle(ctx, &ext)

@kevinrizza kevinrizza self-assigned this Jul 8, 2024
@kevinrizza
Member

kevinrizza commented Jul 8, 2024

It's happening because the helm release for the upgraded bundle still has the first version label. When we install we set the version label:

        rel, err = ac.Install(ext.GetName(), ext.Spec.InstallNamespace, chrt, values, func(install *action.Install) error {
            install.CreateNamespace = false
            install.Labels = map[string]string{labels.BundleNameKey: bundle.Name, labels.PackageNameKey: bundle.Package, labels.BundleVersionKey: bundleVersion.String()}
            return nil
        }, helmclient.AppendInstallPostRenderer(post))
        if err != nil {
            setInstalledStatusConditionFailed(ext, fmt.Sprintf("%s:%v", ocv1alpha1.ReasonInstallationFailed, err))
            return ctrl.Result{}, err
        }
    case stateNeedsUpgrade:
        rel, err = ac.Upgrade(ext.GetName(), ext.Spec.InstallNamespace, chrt, values, helmclient.AppendUpgradePostRenderer(post))
        if err != nil {
            setInstalledStatusConditionFailed(ext, fmt.Sprintf("%s:%v", ocv1alpha1.ReasonUpgradeFailed, err))
            return ctrl.Result{}, err
        }
    case stateUnchanged:
        if err := ac.Reconcile(rel); err != nil {
            setInstalledStatusConditionFailed(ext, fmt.Sprintf("%s:%v", ocv1alpha1.ReasonResolutionFailed, err))

but we're not doing the same for the upgrade case.
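
To make the failure mode concrete, here is a small self-contained simulation in Go. It is only a sketch and not the actual fix: `release`, `upgradeAction`, `upgrade`, `withBundleVersion` and the `bundleVersionKey` label key are all hypothetical stand-ins, not Helm or operator-controller APIs. It assumes that labels from the previous release revision carry over on upgrade unless the caller passes new ones, which is what the stale version label described above suggests.

```go
package main

import "fmt"

// bundleVersionKey is a hypothetical label key used only in this sketch.
const bundleVersionKey = "bundle-version"

// release is a stand-in for a Helm release; only revision and labels matter here.
type release struct {
	Revision int
	Labels   map[string]string
}

// upgradeAction and upgradeOption mimic the functional-option style of the
// Install call quoted above. They are not the real Helm types.
type upgradeAction struct {
	Labels map[string]string
}

type upgradeOption func(*upgradeAction) error

// upgrade creates the next release revision.
// Assumption: labels of the previous revision carry over unless the caller
// supplies replacement values through an option.
func upgrade(prev release, opts ...upgradeOption) (release, error) {
	act := &upgradeAction{}
	for _, opt := range opts {
		if err := opt(act); err != nil {
			return release{}, err
		}
	}
	labels := make(map[string]string, len(prev.Labels)+len(act.Labels))
	for k, v := range prev.Labels {
		labels[k] = v
	}
	for k, v := range act.Labels {
		labels[k] = v // caller-supplied labels win over carried-over ones
	}
	return release{Revision: prev.Revision + 1, Labels: labels}, nil
}

// withBundleVersion is a hypothetical option that sets the version label.
func withBundleVersion(v string) upgradeOption {
	return func(a *upgradeAction) error {
		a.Labels = map[string]string{bundleVersionKey: v}
		return nil
	}
}

func main() {
	installed := release{Revision: 1, Labels: map[string]string{bundleVersionKey: "0.1.0"}}

	// Upgrade without a label option: the version label stays at the first version.
	stale, _ := upgrade(installed)
	// Upgrade with the label option: the version label tracks the new bundle.
	fresh, _ := upgrade(installed, withBundleVersion("0.2.0"))

	fmt.Println("without label option:", stale.Labels[bundleVersionKey])
	fmt.Println("with label option:   ", fresh.Labels[bundleVersionKey])
}
```

In this model, any code that derives the installed version from the release labels keeps seeing the first version after an unlabeled upgrade, which matches the "currently installed version \"0.1.0\"" in the reported error.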
