Cluster controller tries to delete all ports on BYO network #1679

Closed
mdbooth opened this issue Sep 25, 2023 · 0 comments · Fixed by #1680
Labels
kind/bug Categorizes issue or PR as related to a bug.

mdbooth commented Sep 25, 2023

/kind bug

What steps did you take and what happened:
Created a cluster with the following spec:

spec:
  cloudName: openstack
  identityRef:
    name: cloud-config
    kind: Secret
  managedSecurityGroups: false
  externalNetworkId: 14c15d33-175c-424e-88ba-361a875e0c5c
  router:
    name: mbooth-psi-vj9sq-external-router
  network:
    name: mbooth-psi-vj9sq-openshift
  subnet:
    name: mbooth-psi-vj9sq-nodes
  tags:
  - openshiftClusterID=mbooth-psi-vj9sq

Deleted the OpenStackCluster object.

Cluster deletion hangs attempting to delete ports in network mbooth-psi-vj9sq-openshift which are still in use by non-CAPO resources:

E0925 15:03:14.351642       1 controller.go:324] "Reconciler error" err=<
        failed to delete ports: delete port 09dcd789-77a2-42dc-baf1-28a1793cf016 of network "3f1e7429-d2eb-4ce8-837f-cbfe703fc03a" failed : Expected HTTP response code [202 204] when accessing [DELETE https://rhos-d.infra.prod.upshift.rdu2.redhat.com:13696/v2.0/ports/09dcd789-77a2-42dc-baf1-28a1793cf016], but got 409 instead
        {"NeutronError": {"type": "PortInUseAsTrunkParent", "message": "Port 09dcd789-77a2-42dc-baf1-28a1793cf016 is currently a parent port for trunk 0d1e18e7-429b-4974-9639-6e48797e08d5.", "detail": ""}}
 > controller="openstackcluster" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackCluster" OpenStackCluster="openshift-cluster-api/mbooth-psi-vj9sq" namespace="openshift-cluster-api" name="mbooth-psi-vj9sq" reconcileID=1b038454-fdad-42ea-94a9-db781bca812b

Offending code seems to be called from here:

if err = networkingService.DeletePorts(openStackCluster); err != nil {
	handleUpdateOSCError(openStackCluster, fmt.Errorf("failed to delete ports: %w", err))
	return reconcile.Result{}, fmt.Errorf("failed to delete ports: %w", err)
}

I suspect it should not be called unless we're deleting the network.

Also, ideally it wouldn't be called at all, but IIRC it's a contingency for the lack of robustness in the machine controller's port cleanup.
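
For illustration, the guard suggested above (only delete ports when the controller is also deleting the network) might look roughly like the sketch below. This is not the actual CAPO code or the fix that was merged: `clusterSpec`, its `ManagedNetwork` flag, `portDeleter`, `fakeDeleter`, and `deleteNetworkPorts` are all hypothetical names invented for the example, and how the real controller decides that a network is BYO is an assumption here.

```go
package main

import (
	"errors"
	"fmt"
)

// clusterSpec is a hypothetical, pared-down stand-in for the OpenStackCluster
// spec. ManagedNetwork is an assumed flag, not a real CAPO field: it is true
// only when the controller created the network itself.
type clusterSpec struct {
	NetworkName    string
	ManagedNetwork bool
}

// portDeleter is a hypothetical interface covering the one networking call
// this sketch needs.
type portDeleter interface {
	DeletePorts(networkName string) error
}

// errPortInUse mimics the 409 PortInUseAsTrunkParent failure from the log.
var errPortInUse = errors.New("port in use as trunk parent")

// fakeDeleter counts calls and always fails, like Neutron did in the report.
type fakeDeleter struct{ calls int }

func (f *fakeDeleter) DeletePorts(string) error {
	f.calls++
	return errPortInUse
}

// deleteNetworkPorts removes ports only when the controller owns the network.
// On a BYO network it returns nil without touching any port.
func deleteNetworkPorts(spec clusterSpec, svc portDeleter) error {
	if !spec.ManagedNetwork {
		return nil
	}
	if err := svc.DeletePorts(spec.NetworkName); err != nil {
		return fmt.Errorf("failed to delete ports: %w", err)
	}
	return nil
}

func main() {
	svc := &fakeDeleter{}

	// BYO network: the deleter is never invoked, so deletion cannot hang.
	byo := clusterSpec{NetworkName: "mbooth-psi-vj9sq-openshift"}
	fmt.Println(deleteNetworkPorts(byo, svc), svc.calls)

	// Managed network: ports are cleaned up and errors are still surfaced.
	managed := clusterSpec{NetworkName: "capo-managed", ManagedNetwork: true}
	fmt.Println(deleteNetworkPorts(managed, svc), svc.calls)
}
```

With a guard like this, ports that belong to non-CAPO resources on a pre-existing network are left alone, while the contingency cleanup still runs for networks the controller created.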

What did you expect to happen:
Network is deleted cleanly.

Environment:

  • Cluster API Provider OpenStack version (Or git rev-parse HEAD if manually built): v0.8.0
@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Sep 25, 2023