Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial govmomi translation for machine actuator #64

Merged
merged 4 commits into from
Oct 8, 2018
Merged

Initial govmomi translation for machine actuator #64

merged 4 commits into from
Oct 8, 2018

Conversation

sidharthsurana
Copy link
Contributor

This patch adds a full govmomi based implementation at parity with the current terraform based implementation.

* Adds a new field "APIStatus" in the ProviderStatus for Cluster type
* Adds logic in the cluster controller to keep the APIStatus up to date

Resolves #62
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Oct 4, 2018
@k8s-ci-robot k8s-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Oct 4, 2018
@k8s-ci-robot
Copy link
Contributor

Hi @sidharthsurana. Thanks for your PR.

I'm waiting for a kubernetes-sigs or kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Oct 4, 2018
Also switch the APIStatus from an int to string based enum for better
readability aspect.
This patch add a full govmomi based implementation at parity with
the current terraform based implementation

Resolves #5
Resolves #31
Resolves #32
Copy link
Member

@alvaroaleman alvaroaleman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some nits, overall it looks very good!

What I would love to see is if you would update the files below clusterctl/examples so this can actually be tested :)

@@ -8,4 +12,19 @@ const (
LastUpdatedKey = "last-updated"
CreateEventAction = "Create"
DeleteEventAction = "Delete"
Provider_Datacenter = "datacenter"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for not using camelcase just for this five consts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch! No reason, just oversight. I will fix these.

content: |
{{ .Script }}
permissions: '0755'
encoding: base64{{ if .IsMaster }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: You can put the {{ if .IsMaster }} on a dedicated line and avoid additional whitespaces by writing it like this `{{- if .IsMaster }}´ - Makes it much more readable IMHO

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I completely forgot about the {{- prefix. I know this would work in jinja2 templating, but didn't realize it may work in the go template as well.

{{ .Script }}
permissions: '0755'
encoding: base64{{ if .IsMaster }}
- path: /etc/kubernetes/cloud-config/cloud-config.yaml
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this also required for the kubelet in order to be able to mount volumes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The existing terraform code, provisions in the same way and it seems to be working. However, I need to look closely in to the vsphere cloud provider to make sure if this is needed on the workers as well.

if err != nil {
return "", err
}
tmpconfig, err := vsphereutils.CreateTempFile(kubeconfig)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be cleaned up afterwards

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup! missed that. I thought I did clean it up but seems I forgot to do that here.

ncluster.ObjectMeta.Annotations[constants.KubeadmToken] = token
// Even though this time might be off by few sec compared to the actual expiry on the token it should not have any impact
ncluster.ObjectMeta.Annotations[constants.KubeadmTokenExpiryTime] = time.Now().Add(constants.KubeadmTokenTtl).Format(time.RFC3339)
_, err = vc.clusterV1alpha1.Clusters(cluster.Namespace).UpdateStatus(ncluster)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this must be Update rather than UpdateStatus, since an annotation is part of the main object, not of its status

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

@@ -39,7 +45,6 @@ func GetIP(_ *clusterv1.Cluster, machine *clusterv1.Machine) (string, error) {
return "", errors.New("could not get IP")
}
if ip, ok := machine.ObjectMeta.Annotations[constants.VmIpAnnotationKey]; ok {
glog.Infof("Returning IP from machine annotation %s", ip)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You did not actually introduce this func but I was wondering, is there a reason to get the IP from an annotation rather than just using the api? https://github.com/kubermatic/machine-controller/blob/dbbc29796d291d9329780efeccea6659f11227d3/pkg/cloudprovider/provider/vsphere/provider.go#L525-L543

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the reason it was done using an annotation was to cache that info so that we don't have to make very frequent vsphere API calls. Considering once the IP on the VM is allocated, usually it would stay that way for long time. But I think we do need to have a reconciliation loop that would periodically ensure that the correct IP is tracked.

return err
}
glog.Infof("Using session %v", usersession)
task, _ := vsphereutils.GetActiveTasks(machine)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know the GetActiveTasks never returns an error, but maybe either update the signature of GetActiveTasks to not actually contain an error as part of its return values or handle the error here instead of discarding?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! I will remove the error from the method signature

if nmachine.ObjectMeta.Annotations == nil {
nmachine.ObjectMeta.Annotations = make(map[string]string)
}
nmachine.ObjectMeta.Annotations[constants.VmIpAnnotationKey] = vmIP
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you store this in an annotation instead of nmachine.Status.ProviderStatus?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No particular reason, except for the fact that this is how it was stored in the terraform based implementation, thus I kept it like that. But I agree in general most of the annotations that we have right now, ideally should be moved to the ProviderStatus. Maybe we can handle that in a subsequent PR dedicated to populating the ProviderStatus for provider related info across the board.

// Update the cluster status with updated time stamp for tracking purposes
ncluster := cluster.DeepCopy()
status := &vsphereconfig.VsphereClusterProviderStatus{LastUpdated: time.Now().UTC().String()}
out, err := json.Marshal(status)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unhandled error

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch! Will fix that.

if nmachine.ObjectMeta.Annotations == nil {
nmachine.ObjectMeta.Annotations = make(map[string]string)
}
nmachine.ObjectMeta.Annotations[constants.VmIpAnnotationKey] = vmIP
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do store this in an annotation instead of machine.Status.ProviderStatus?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No particular reason, except for the fact that this is how it was stored in the terraform based implementation, thus I kept it like that. But I agree in general most of the annotations that we have right now, ideally should be moved to the ProviderStatus. Maybe we can handle that in a subsequent PR dedicated to populating the ProviderStatus for provider related info across the board.

Copy link
Contributor Author

@sidharthsurana sidharthsurana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the nice review @alvaroaleman
BTW see these changes

// machineActuator, err := vsphere.NewTerraformMachineActuator(client.ClusterV1alpha1(), si.Factory.Cluster().V1alpha1(), machineEventRecorder, *namedMachinesPath)
// if err != nil {
// glog.Fatalf("Could not create vSphere machine actuator: %v", err)
// }
machineActuator, err := vsphere.NewGovmomiMachineActuator(client.ClusterV1alpha1(), machineClientSet, si.Factory.Cluster().V1alpha1(), machineEventRecorder)
in this PR. This already switches the actuator implementation to the govmomi based implementation in this changeset. So you don't need any changes to the clusterctl/examples to use the govmomi implementation. You should just build a new image using this patch (run make images) and with that you should be able to use the existing clusterctl/examples.
I am pushing another revision with the fixes you requested.

Thanks!

ncluster.ObjectMeta.Annotations[constants.KubeadmToken] = token
// Even though this time might be off by few sec compared to the actual expiry on the token it should not have any impact
ncluster.ObjectMeta.Annotations[constants.KubeadmTokenExpiryTime] = time.Now().Add(constants.KubeadmTokenTtl).Format(time.RFC3339)
_, err = vc.clusterV1alpha1.Clusters(cluster.Namespace).UpdateStatus(ncluster)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

@@ -8,4 +12,19 @@ const (
LastUpdatedKey = "last-updated"
CreateEventAction = "Create"
DeleteEventAction = "Delete"
Provider_Datacenter = "datacenter"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch! No reason, just oversight. I will fix these.

content: |
{{ .Script }}
permissions: '0755'
encoding: base64{{ if .IsMaster }}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I completely forgot about the {{- prefix. I know this would work in jinja2 templating, but didn't realize it may work in the go template as well.

{{ .Script }}
permissions: '0755'
encoding: base64{{ if .IsMaster }}
- path: /etc/kubernetes/cloud-config/cloud-config.yaml
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The existing terraform code, provisions in the same way and it seems to be working. However, I need to look closely in to the vsphere cloud provider to make sure if this is needed on the workers as well.

if err != nil {
return "", err
}
tmpconfig, err := vsphereutils.CreateTempFile(kubeconfig)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup! missed that. I thought I did clean it up but seems I forgot to do that here.

if nmachine.ObjectMeta.Annotations == nil {
nmachine.ObjectMeta.Annotations = make(map[string]string)
}
nmachine.ObjectMeta.Annotations[constants.VmIpAnnotationKey] = vmIP
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No particular reason, except for the fact that this is how it was stored in the terraform based implementation, thus I kept it like that. But I agree in general most of the annotations that we have right now, ideally should be moved to the ProviderStatus. Maybe we can handle that in a subsequent PR dedicated to populating the ProviderStatus for provider related info across the board.

@@ -39,7 +45,6 @@ func GetIP(_ *clusterv1.Cluster, machine *clusterv1.Machine) (string, error) {
return "", errors.New("could not get IP")
}
if ip, ok := machine.ObjectMeta.Annotations[constants.VmIpAnnotationKey]; ok {
glog.Infof("Returning IP from machine annotation %s", ip)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the reason it was done using an annotation was to cache that info so that we don't have to make very frequent vsphere API calls. Considering once the IP on the VM is allocated, usually it would stay that way for long time. But I think we do need to have a reconciliation loop that would periodically ensure that the correct IP is tracked.

return err
}
glog.Infof("Using session %v", usersession)
task, _ := vsphereutils.GetActiveTasks(machine)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! I will remove the error from the method signature

if nmachine.ObjectMeta.Annotations == nil {
nmachine.ObjectMeta.Annotations = make(map[string]string)
}
nmachine.ObjectMeta.Annotations[constants.VmIpAnnotationKey] = vmIP
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No particular reason, except for the fact that this is how it was stored in the terraform based implementation, thus I kept it like that. But I agree in general most of the annotations that we have right now, ideally should be moved to the ProviderStatus. Maybe we can handle that in a subsequent PR dedicated to populating the ProviderStatus for provider related info across the board.

// Update the cluster status with updated time stamp for tracking purposes
ncluster := cluster.DeepCopy()
status := &vsphereconfig.VsphereClusterProviderStatus{LastUpdated: time.Now().UTC().String()}
out, err := json.Marshal(status)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch! Will fix that.

@alvaroaleman
Copy link
Member

Thanks for addressing my comments @sidharthsurana !

I just tried it out and got till unable to update bootstrap cluster endpoint: unable to get master IP: could not get IP. IMHO the way the GetIP func currently works makes this unusable in practise.

Maybe just:

  • Merge this PR
  • Create an issue for GetIP func to at least populate the annotation if it does not exist
  • Create an issue for moving all the annotations into the Status

Does that sound reasonable?

@alvaroaleman
Copy link
Member

Side question: Who are the contacts for approving in this repo? Right now only @frapposelli is listed as cluster-api-vsphere-maintainer, will there be someone else added?

I'd really like to contribute to the cluster-api-provider-vsphere but I am a little afraid my PRs will then sit for a very long time

@frapposelli
Copy link
Member

/ok-to-test

@k8s-ci-robot k8s-ci-robot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Oct 8, 2018
@frapposelli
Copy link
Member

@alvaroaleman both @sidharthsurana and @sflxn are working on getting their org membership and be added as maintainers, for the time being, I'm acting as the sole gatekeeper.

I'm starting to be involved full time on this so there shouldn't be much lag in reviewing/approving PRs, definitely it shouldn't discourage you from contributing :)

Last week I was out, hence the delays, if you ever find yourself in a blocking situation when submitting a PR here, ping me on the Kubernetes slack (I'm @fabio).

We also have a bi-weekly call every monday (like today!) at 1pm pacific, hope to see you there 👍

@frapposelli
Copy link
Member

/lgtm
/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: frapposelli, sidharthsurana

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Oct 8, 2018
@k8s-ci-robot k8s-ci-robot merged commit 265e758 into kubernetes-sigs:master Oct 8, 2018
@alvaroaleman
Copy link
Member

@frapposelli Thanks for the heads up, I'll join you in the call then!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants