
Support autoscaling #39

Merged 34 commits into main from add-autoscaler on Nov 16, 2022

Conversation

okozachenko1203 (Member) commented Nov 11, 2022

Fix #3

mnaser (Member) commented Nov 11, 2022

I believe in this case you've actually run the autoscaler on the workload cluster with a kubeconfig file to talk to the management cluster.

Even if things can be secured and locked down, this could become a security issue down the road. Therefore, I think it's better to run the autoscaler on the management cluster talking to the workload cluster instead.

So I think you'll have to remove it from the ClusterResourceSet and get rid of the kubeconfig option; for the actual deployment in the management cluster, you can mount the clustername-kubeconfig secret to get access to the cluster directly.
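The suggestion above can be sketched as a Deployment manifest built in the driver's own style (a Python dict). This is an illustrative sketch, not the driver's actual code: the helper name, namespace, image tag, and mount path are assumptions, while the `<clustername>-kubeconfig` secret (with key `value`) and the `--cloud-provider=clusterapi` flag follow Cluster API and upstream cluster-autoscaler conventions.

```python
# Hypothetical sketch: run the cluster-autoscaler in the management cluster,
# mounting the CAPI-generated <clustername>-kubeconfig secret so it can talk
# to the workload cluster directly (no kubeconfig for the management cluster
# ever leaves it). Names, paths, and the image tag are illustrative.

def build_autoscaler_deployment(cluster_name: str, namespace: str = "magnum-system") -> dict:
    mount_path = "/etc/kubernetes/kubeconfig"
    labels = {"app": f"{cluster_name}-autoscaler"}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": f"{cluster_name}-autoscaler", "namespace": namespace},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": "cluster-autoscaler",
                            "image": "registry.k8s.io/autoscaling/cluster-autoscaler:v1.25.0",
                            "command": [
                                "/cluster-autoscaler",
                                "--cloud-provider=clusterapi",
                                # Talk to the workload cluster via the mounted
                                # kubeconfig; CAPI stores it under the key "value".
                                f"--kubeconfig={mount_path}/value",
                                f"--node-group-auto-discovery=clusterapi:namespace={namespace}",
                            ],
                            "volumeMounts": [
                                {"name": "kubeconfig", "mountPath": mount_path, "readOnly": True}
                            ],
                        }
                    ],
                    "volumes": [
                        {
                            "name": "kubeconfig",
                            "secret": {"secretName": f"{cluster_name}-kubeconfig"},
                        }
                    ],
                },
            },
        },
    }
```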

mnaser (Member) left a comment:

See comment

okozachenko1203 (Member, Author) commented

> I believe in this case you've actually run the autoscaler on the workload cluster with a kubeconfig file to talk to the management cluster.
>
> Even if things can be secured and locked down, this could become a security issue down the road. Therefore, I think it's better to run the autoscaler on the management cluster talking to the workload cluster instead.
>
> So I think you'll have to remove it from the ClusterResourceSet and get rid of the kubeconfig option; for the actual deployment in the management cluster, you can mount the clustername-kubeconfig secret to get access to the cluster directly.

When we run the autoscaler in the management cluster, I wonder whether it means we need to deploy one autoscaler per Magnum cluster.
That was my concern, and the reason why I chose to run the autoscaler in the workload cluster and tried to create a ClusterRole with a resourceNames binding.
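For context, the per-cluster ClusterRole idea mentioned above might look like the sketch below. Everything here is hypothetical (the role name, node group name, and verbs are assumptions); one known limitation of this approach is that RBAC `resourceNames` cannot restrict `list`/`watch` verbs, which is part of why scoping an in-workload autoscaler this way is hard to lock down.

```python
# Illustrative sketch of a ClusterRole narrowed with resourceNames, so a
# per-cluster autoscaler could only touch its own CAPI objects. Not the
# driver's actual code; names are placeholders.

def build_scoped_cluster_role(cluster_name: str) -> dict:
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRole",
        "metadata": {"name": f"{cluster_name}-autoscaler"},
        "rules": [
            {
                "apiGroups": ["cluster.x-k8s.io"],
                "resources": ["machinedeployments", "machinedeployments/scale"],
                # resourceNames only constrains get/update/patch/delete style
                # verbs; list and watch cannot be restricted this way.
                "resourceNames": [f"{cluster_name}-default-worker"],
                "verbs": ["get", "update", "patch"],
            }
        ],
    }
```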

mnaser (Member) commented Nov 11, 2022

> I believe in this case you've actually run the autoscaler on the workload cluster with a kubeconfig file to talk to the management cluster. Even if things can be secured and locked down, this could become a security issue down the road. Therefore, I think it's better to run the autoscaler on the management cluster talking to the workload cluster instead. So I think you'll have to remove it from the ClusterResourceSet and get rid of the kubeconfig option; for the actual deployment in the management cluster, you can mount the clustername-kubeconfig secret to get access to the cluster directly.
>
> When we run the autoscaler in the management cluster, I wonder whether it means we need to deploy one autoscaler per Magnum cluster. That was my concern, and the reason why I chose to run the autoscaler in the workload cluster and tried to create a ClusterRole with a resourceNames binding.

Yes, indeed, we will actually need to create an autoscaler per CAPI cluster on the control plane, but I think the load added by this is fairly negligible, so we can be OK with it.

okozachenko1203 (Member, Author) commented

OK, I have changed the clusterAPIMode as you mentioned.
I also used a Flux HelmRelease for the autoscaler deployment.
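A Flux HelmRelease for the autoscaler, built in the driver's dict style, might look like the sketch below. This is an assumption-laden illustration, not the PR's actual resource: the release/namespace names and chart values (`clusterAPIMode`, `clusterAPIKubeconfigSecret`) follow the upstream cluster-autoscaler Helm chart as I understand it, where `kubeconfig-incluster` means "kubeconfig for the workload cluster, in-cluster config for the management cluster".

```python
# Hypothetical sketch of a Flux v2 HelmRelease deploying the upstream
# cluster-autoscaler chart per Magnum cluster. Chart values are based on the
# kubernetes/autoscaler Helm chart; names are illustrative.

def build_autoscaler_helm_release(cluster_name: str, namespace: str = "magnum-system") -> dict:
    return {
        "apiVersion": "helm.toolkit.fluxcd.io/v2beta1",
        "kind": "HelmRelease",
        "metadata": {"name": f"{cluster_name}-autoscaler", "namespace": namespace},
        "spec": {
            "interval": "5m",
            "chart": {
                "spec": {
                    "chart": "cluster-autoscaler",
                    "sourceRef": {
                        "kind": "HelmRepository",
                        "name": "autoscaler",
                        "namespace": namespace,
                    },
                }
            },
            "values": {
                "cloudProvider": "clusterapi",
                # Workload cluster via mounted kubeconfig, management cluster
                # via in-cluster config.
                "clusterAPIMode": "kubeconfig-incluster",
                "clusterAPIKubeconfigSecret": f"{cluster_name}-kubeconfig",
                "autoDiscovery": {"clusterName": cluster_name},
            },
        },
    }
```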

mnaser (Member) left a comment:

@okozachenko1203 good progress

Review comment on magnum_cluster_api/resources.py:

Now the get_object method has an assert. If this assertion fails, deletion fails as well as creation. By overriding the deletion method to use the ConfigMap name directly, we avoid the deletion failure; it also reduces API requests.
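The pattern described above could look like the sketch below. The class, client interface, and ConfigMap naming scheme are all hypothetical; the point is only that deletion references the deterministic name instead of fetching the object (and hitting the assert in `get_object`) first.

```python
# Illustrative sketch: delete the per-cluster ConfigMap by its known name
# rather than looking the object up first. Names and the client interface
# are placeholders, not the driver's real API.

class ClusterResourcesConfigMap:
    def __init__(self, client, cluster_name: str, namespace: str = "magnum-system"):
        self.client = client
        # The name is derived deterministically from the cluster name, so we
        # never need a GET (or the assert inside get_object) to delete it.
        self.name = f"{cluster_name}-cluster-resources"
        self.namespace = namespace

    def delete(self) -> None:
        # One API request, and no assertion can block the deletion path.
        self.client.delete_configmap(name=self.name, namespace=self.namespace)
```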
Review comment on magnum_cluster_api/resources.py:

Some Kubernetes resources follow DNS-1035 naming, but OpenStack COE resources do not.
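Since OpenStack COE resource names are not guaranteed to be valid DNS-1035 labels (lowercase alphanumerics and `-`, starting with a letter, ending alphanumeric, at most 63 characters), a driver typically has to sanitize them before using them as Kubernetes resource names. A best-effort helper, purely illustrative and not the driver's actual implementation:

```python
import re


def to_dns1035(name: str, max_len: int = 63) -> str:
    """Best-effort conversion of an arbitrary name into an RFC 1035 label:
    lowercase alphanumerics and '-', starting with a letter, ending with an
    alphanumeric, at most 63 characters. Illustrative helper only."""
    label = re.sub(r"[^a-z0-9-]", "-", name.lower())
    label = label.strip("-")
    if not label or not label[0].isalpha():
        label = "a" + label  # DNS-1035 labels must start with a letter
    return label[:max_len].rstrip("-")
```

In practice, drivers often prefer generating a fresh conforming name (e.g. a UUID-based one) and storing the mapping, since sanitizing can collide; the helper above just shows the constraint.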
mnaser (Member) left a comment:

@okozachenko1203 I pushed some changes.

I think the issue is that we're being hit by kubernetes-sigs/cluster-api#7088 -- without that in place, we're unable to use the metadata the way we want to.

I think what we should do is instead take advantage of update_cluster_status (or maybe update_node_group_status) to mutate and reconcile the annotations on the MachineDeployment.

Let's leave a comment pointing to the PR.
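The workaround described above amounts to reconciling the autoscaler's min/max-size annotations on the MachineDeployment from the driver's periodic status hook. A minimal sketch, assuming a Magnum node group with `min_node_count`/`max_node_count` attributes (the helper name is hypothetical; the annotation keys are the documented cluster-autoscaler Cluster API provider ones):

```python
# Sketch: build the annotation patch that update_cluster_status() (or
# update_node_group_status()) would apply to the MachineDeployment until
# kubernetes-sigs/cluster-api#7088 lands. Helper name is illustrative.

AUTOSCALER_MIN = "cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size"
AUTOSCALER_MAX = "cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size"


def autoscaler_annotation_patch(node_group) -> dict:
    """Return a merge patch setting the autoscaler scaling bounds from the
    Magnum node group's min/max node counts (annotation values are strings)."""
    return {
        "metadata": {
            "annotations": {
                AUTOSCALER_MIN: str(node_group.min_node_count),
                AUTOSCALER_MAX: str(node_group.max_node_count),
            }
        }
    }
```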

@okozachenko1203 okozachenko1203 changed the title [wip] Support autoscaling Support autoscaling Nov 16, 2022
@mnaser mnaser merged commit 93a6d8b into main Nov 16, 2022
@mnaser mnaser deleted the add-autoscaler branch November 16, 2022 20:50
Development

Successfully merging this pull request may close these issues.

Autoscaling