Route to kubevirt VMs using infra id as service label selector #2092

Merged
merged 1 commit into from
Mar 19, 2023

Conversation

davidvossel
Contributor

As we're looking to expand the kubevirt platform to manage VMs on external infra (an OCP cluster external to the one hypershift runs on), VMs from multiple clusters may end up in the same namespace. As a result, we need to start labeling the components we place in these namespaces with the unique HostedCluster.Spec.InfraID.

This serves two purposes for us.

  1. We can ensure our ingress passthrough services only route to VMs within a specific cluster
  2. We can clearly identify which resources associated with a HostedCluster need to be cleaned up on the external cluster (routes, services, VMs, etc.)
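
A minimal sketch of the first purpose, using plain Go maps rather than the real Kubernetes types. It shows how a selector that includes the infra ID label picks out only the VM pods of one cluster, even when pods from several clusters share a namespace. The `matches` and `selectPods` helpers, the pod names, and the sample infra IDs are illustrative assumptions; only the label key comes from this PR.

```go
package main

import "fmt"

// infraIDLabelKey is the label key used in this PR.
const infraIDLabelKey = "hypershift.kubevirt.io/infra-id"

// pod is a simplified stand-in for a VMI pod (hypothetical type).
type pod struct {
	name   string
	labels map[string]string
}

// matches reports whether every key/value pair in selector is present in labels,
// which is how an equality-based label selector behaves.
func matches(selector, labels map[string]string) bool {
	for k, v := range selector {
		if got, ok := labels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// selectPods returns the names of the pods that satisfy selector, in input order.
func selectPods(selector map[string]string, pods []pod) []string {
	var out []string
	for _, p := range pods {
		if matches(selector, p.labels) {
			out = append(out, p.name)
		}
	}
	return out
}

func main() {
	// Two guest clusters sharing one infra namespace (sample IDs are made up).
	pods := []pod{
		{name: "vm-a-1", labels: map[string]string{infraIDLabelKey: "cluster-a-id"}},
		{name: "vm-b-1", labels: map[string]string{infraIDLabelKey: "cluster-b-id"}},
		{name: "vm-a-2", labels: map[string]string{infraIDLabelKey: "cluster-a-id"}},
	}

	// A service selector scoped to cluster A only.
	selector := map[string]string{infraIDLabelKey: "cluster-a-id"}
	fmt.Println(selectPods(selector, pods))
}
```

Without the infra ID in the selector, every VM pod in the namespace would match, which is the cross-cluster routing problem described above.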

@openshift-ci openshift-ci bot requested review from csrwng and nirarg January 30, 2023 22:41
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 30, 2023
@davidvossel davidvossel mentioned this pull request Jan 30, 2023
4 tasks
@davidvossel
Contributor Author

/test kubevirt-e2e-kubevirt-aws-ovn

@davidvossel
Contributor Author

/hold

I want to see the kubevirt e2e lane pass before we consider merging this.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 30, 2023
@davidvossel
Contributor Author

/test kubevirt-e2e-kubevirt-aws-ovn

@davidvossel
Contributor Author

/test kubevirt-e2e-kubevirt-aws-ovn
/test e2e-aws

@davidvossel
Contributor Author

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 17, 2023
@davidvossel
Contributor Author

/cc @orenc1 maybe you can review this? What I've done is add a new label to the VMI pods that allows us to identify them by the HostedCluster (using infraID) that the VMs belong to. This will let us create Services and Network Policies that target a specific cluster within a namespace rather than all VMs in a namespace.

I think this level of identification for vms/vmis/pods could be important to us with your external infra work. It could be possible for people to use the same namespace on an external infra cluster to host VMs from multiple guest clusters.

}
service.Spec.Type = corev1.ServiceTypeClusterIP

service.Labels["hypershift.kubevirt.io/infra-id"] = hcp.Spec.InfraID
Contributor

The string "hypershift.kubevirt.io/infra-id" appears in 2 locations. Wouldn't it be better to put it in a variable, e.g. infraIDLabelKey, as you did in the MachineTemplateSpec function?

Contributor Author

oh, yeah definitely. I didn't realize i hadn't transitioned to a variable for that. good call.

Contributor

@orenc1 orenc1 left a comment

Thanks, it makes total sense to me to distinguish the ingress service to target the VMs that belong to the relevant guest cluster only.

i have just one nit, other than that, lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Feb 23, 2023
@orenc1
Contributor

orenc1 commented Feb 23, 2023

/hold

I've now figured out we have another problem if multiple ingress services and routes are created in the same infra namespace for multiple guest clusters: the service and the route are created with hardcoded names, default-ingress-passthrough-service and default-ingress-passthrough-route. In that case, each subsequent ingress service and route will overwrite the existing ones, making existing guest clusters unreachable.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 23, 2023
@orenc1
Contributor

orenc1 commented Feb 23, 2023

/lgtm cancel

@davidvossel
Contributor Author

this PR needs more work. We identified a few more areas where the infra id needs to be applied:

  1. make our label a const
  2. add the label so it's applied to LBs and PVCs mirrored into the guest as well
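
A minimal sketch of how the two items above could fit together: one constant for the label key, and one small helper that stamps it onto the label map of any resource (service, LB, PVC). The helper name `setInfraIDLabel` and the sample infra ID are assumptions for illustration, not code from this PR.

```go
package main

import "fmt"

// infraIDLabelKey is defined once so every caller uses the same string.
const infraIDLabelKey = "hypershift.kubevirt.io/infra-id"

// setInfraIDLabel returns labels with the infra ID label set.
// It allocates the map when labels is nil, so callers never write to a nil map.
// (Hypothetical helper, shown only to illustrate the idea.)
func setInfraIDLabel(labels map[string]string, infraID string) map[string]string {
	if labels == nil {
		labels = map[string]string{}
	}
	labels[infraIDLabelKey] = infraID
	return labels
}

func main() {
	// A resource with no labels yet, and one with existing labels.
	var serviceLabels map[string]string
	pvcLabels := map[string]string{"app": "storage"}

	serviceLabels = setInfraIDLabel(serviceLabels, "example-infra-id")
	pvcLabels = setInfraIDLabel(pvcLabels, "example-infra-id")

	fmt.Println(serviceLabels[infraIDLabelKey])
	fmt.Println(pvcLabels[infraIDLabelKey], pvcLabels["app"])
}
```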

@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Feb 23, 2023
@davidvossel
Contributor Author

/hold cancel

@orenc1 I updated this PR. We now add the infra id to both the infra LBs and PVCs in order to help associate those resources with the HostedCluster.

I had to make a change to the cloud-provider-kubevirt config here [1] for this to work for the LBs. It doesn't matter what order these PRs get merged in though. Hypershift will attempt to set the labels on the cloud config, but they won't get actually applied to the LBs until [1] is merged. It doesn't break anything though.

  1. Add id custom labels to infra Load balancers kubevirt/cloud-provider-kubevirt#206

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 27, 2023
@orenc1
Contributor

orenc1 commented Feb 28, 2023

/hold cancel

@orenc1 I updated this PR. We now add the infra id to both the infra LBs and PVCs in order to help associate those resources with the HostedCluster.

I had to make a change to the cloud-provider-kubevirt config here [1] for this to work for the LBs. It doesn't matter what order these PRs get merged in though. Hypershift will attempt to set the labels on the cloud config, but they won't get actually applied to the LBs until [1] is merged. It doesn't break anything though.

1. [Add id custom labels to infra Load balancers kubevirt/cloud-provider-kubevirt#206](https://github.com/kubevirt/cloud-provider-kubevirt/pull/206)

@davidvossel , looks like the ingress service and route still have a constant name:
service:

route:

if we're utilizing the same infra namespace for multiple guest clusters, we must have a unique service and route for each.

@davidvossel
Contributor Author

if we're utilizing the same infra namespace for multiple guest clusters, we must have a unique service and route for each.

ugh... you're right. I missed this.

@netlify

netlify bot commented Mar 2, 2023

Deploy Preview for hypershift-docs ready!

Name Link
🔨 Latest commit 51f8d7c
🔍 Latest deploy log https://app.netlify.com/sites/hypershift-docs/deploys/6408c25376605200085ebf46
😎 Deploy Preview https://deploy-preview-2092--hypershift-docs.netlify.app/reference/api

Comment on lines +778 to +790
cpService.Name = fmt.Sprintf("%s-%s",
	manifests.IngressDefaultIngressPassthroughServiceName,
	hcp.Spec.Platform.Kubevirt.GenerateID)

// Manifests for infra/mgmt cluster passthrough routes
cpPassthroughRoute := manifests.IngressDefaultIngressPassthroughRoute(hcpNamespace)

cpPassthroughRoute.Name = fmt.Sprintf("%s-%s",
	manifests.IngressDefaultIngressPassthroughRouteName,
	hcp.Spec.Platform.Kubevirt.GenerateID)

Contributor Author

@orenc1 i fixed the issue with the route and service not being unique by adding some unique generated strings to the end of these resources.
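
A minimal sketch of why the suffix matters, modeling one infra namespace as a map from resource name to owning cluster. With the fixed name, the second cluster replaces the first cluster's entry; with a per-cluster suffix, both entries coexist. The `uniqueName` helper, the cluster names, and the sample generated IDs are assumptions for illustration; the base name is the one quoted earlier in this thread.

```go
package main

import "fmt"

// Base name quoted in the review discussion.
const passthroughServiceBase = "default-ingress-passthrough-service"

// uniqueName appends a per-cluster generated ID to a base name,
// following the same "%s-%s" pattern as the snippet above.
func uniqueName(base, generateID string) string {
	return fmt.Sprintf("%s-%s", base, generateID)
}

func main() {
	// Each map stands in for the set of named objects in one infra namespace.
	fixed := map[string]string{}
	fixed[passthroughServiceBase] = "cluster-a"
	fixed[passthroughServiceBase] = "cluster-b" // replaces cluster-a's entry

	unique := map[string]string{}
	unique[uniqueName(passthroughServiceBase, "abc12")] = "cluster-a"
	unique[uniqueName(passthroughServiceBase, "xyz89")] = "cluster-b"

	fmt.Println("fixed name entries:", len(fixed))
	fmt.Println("unique name entries:", len(unique))
}
```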

@davidvossel
Contributor Author

/test kubevirt-e2e-kubevirt-aws-ovn

@davidvossel
Contributor Author

/hold

i just want to look at this a little closer

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 7, 2023
@davidvossel
Contributor Author

/hold cancel

I'm comfortable with this pr now.

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 7, 2023
@orenc1
Contributor

orenc1 commented Mar 7, 2023

me too.
i'll lgtm once the ci lane passes.

@davidvossel
Contributor Author

/test kubevirt-e2e-kubevirt-aws-ovn

@@ -680,6 +680,7 @@ type PlatformSpec struct {
}

// KubevirtPlatformSpec specifies configuration for kubevirt guest cluster installations
// +kubebuilder:validation:XValidation:rule="!has(oldSelf.generateID) || has(self.generateID)", message="Kubevirt GenerateID is required once set"
Contributor

what does it mean "is required once set" ?
if GenerateID is required, it must not be empty or missing in the first place.
so there is no meaning to "once set" in my opinion.

and the same goes for the validation message below, "is immutable once set": if that field can't be empty when the resource is created, the value is set on creation, and the "once set" is superfluous.

Contributor Author

here's an explanation, and it's where i took the example from. https://kubernetes.io/blog/2022/09/29/enforce-immutability-using-cel/#immutablility-after-first-modification

The idea is that I want GenerateID to be allowed to be set exactly once, and never changed after that. Including removing the value entirely.

The line you reference, !has(oldSelf.generateID) || has(self.generateID), is kind of odd. I believe it reads as: either the old object had no generateID, or the new object still has one. In other words, the value can't be removed once it has been set.
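
One way to sanity-check the intent is to model the update checks in plain Go, with a nil pointer standing for an unset field. `requiredOnceSet` mirrors the rule quoted above. The exact expression behind the "is immutable once set" message is not shown in this thread, so `immutableOnceSet` is an assumption about its behavior, and all function names here are hypothetical.

```go
package main

import "fmt"

// requiredOnceSet models !has(oldSelf.generateID) || has(self.generateID):
// if the old object had the field, the new object must still have it.
func requiredOnceSet(oldID, newID *string) bool {
	return oldID == nil || newID != nil
}

// immutableOnceSet models an assumed companion rule: when the field is present
// in both the old and new objects, its value must not change.
func immutableOnceSet(oldID, newID *string) bool {
	if oldID == nil || newID == nil {
		return true // nothing to compare; removal is handled by requiredOnceSet
	}
	return *oldID == *newID
}

// updateAllowed combines both checks.
func updateAllowed(oldID, newID *string) bool {
	return requiredOnceSet(oldID, newID) && immutableOnceSet(oldID, newID)
}

func main() {
	first, second := "abc12", "xyz89"

	fmt.Println("set for the first time:", updateAllowed(nil, &first))
	fmt.Println("left unchanged:", updateAllowed(&first, &first))
	fmt.Println("removed after being set:", updateAllowed(&first, nil))
	fmt.Println("changed after being set:", updateAllowed(&first, &second))
}
```

Under these assumptions the field can go from unset to set exactly once, and any later removal or change is rejected.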

@davidvossel
Contributor Author

/hold

i think this is good to go, but i want to run the new api field by another hypershift approver first

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 8, 2023
@davidvossel
Contributor Author

davidvossel commented Mar 8, 2023

/hold cancel

i talked with hypershift maintainers, we're good to go with this one.

@openshift-ci openshift-ci bot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. and removed do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Mar 8, 2023
Contributor

@orenc1 orenc1 left a comment

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 8, 2023
@orenc1
Contributor

orenc1 commented Mar 9, 2023

just want to confirm we are good with:
/test kubevirt-e2e-kubevirt-aws-ovn

and that we are not breaking:
/test e2e-aws

@orenc1
Contributor

orenc1 commented Mar 9, 2023

e2e-aws: looks like an infra issue:

error: failed to push image registry.build03.ci.openshift.org/ci-op-cr86fcsz/release:latest: unable to upload new layer (0): Patch "https://registry.build03.ci.openshift.org/v2/ci-op-cr86fcsz/release/blobs/uploads/58a71c99-1d8b-4aa6-8e2e-c9f3813bbe21?_state=g7fEb7XWMi0n_TYumAY0OFG45F5d-P5hIJh-cvv6b0x7Ik5hbWUiOiJjaS1vcC1jcjg2ZmNzei9yZWxlYXNlIiwiVVVJRCI6IjU4YTcxYzk5LTFkOGItNGFhNi04ZTJlLWM5ZjM4MTNiYmUyMSIsIk9mZnNldCI6MCwiU3RhcnRlZEF0IjoiMjAyMy0wMy0wOVQwOTo1ODozNC4wNjEyOTYxNDFaIn0%3D": operator "machine-config-operator" contained an invalid image-references file: no input image tag named "rhel-coreos"

retesting to see if it was resolved already
/test e2e-aws

@davidvossel
Contributor Author

/test e2e-aws

@openshift-ci
Contributor

openshift-ci bot commented Mar 9, 2023

@davidvossel: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-ibmcloud-roks | 51f8d7c | link | false | /test e2e-ibmcloud-roks |
| ci/prow/e2e-ibmcloud-iks | 51f8d7c | link | false | /test e2e-ibmcloud-iks |

@davidvossel
Contributor Author

/retest-required

Contributor

@orenc1 orenc1 left a comment

/approve
/lgtm
/thanks

@openshift-ci
Contributor

openshift-ci bot commented Mar 19, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: davidvossel, orenc1

@orenc1
Contributor

orenc1 commented Mar 19, 2023

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 19, 2023
@openshift-merge-robot openshift-merge-robot merged commit 3662800 into openshift:main Mar 19, 2023
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/control-plane-operator Indicates the PR includes changes for the control plane operator - in an OCP release area/hypershift-operator Indicates the PR includes changes for the hypershift operator and API - outside an OCP release lgtm Indicates that a PR is ready to be merged.

3 participants