-
Notifications
You must be signed in to change notification settings - Fork 476
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement OpenStack Cloud Controller Manager support #423
Implement OpenStack Cloud Controller Manager support #423
Conversation
/assign @enxebre |
enhancements/cloud-controller-manager/openstack-cloud-controller-manager.md
Outdated
Show resolved
Hide resolved
enhancements/cloud-controller-manager/openstack-cloud-controller-manager.md
Outdated
Show resolved
Hide resolved
enhancements/cloud-controller-manager/openstack-cloud-controller-manager.md
Outdated
Show resolved
Hide resolved
enhancements/cloud-controller-manager/openstack-cloud-controller-manager.md
Outdated
Show resolved
Hide resolved
enhancements/cloud-controller-manager/openstack-cloud-controller-manager.md
Show resolved
Hide resolved
enhancements/cloud-controller-manager/openstack-cloud-controller-manager.md
Show resolved
Hide resolved
enhancements/cloud-controller-manager/openstack-cloud-controller-manager.md
Show resolved
Hide resolved
|
||
**Note:** Example of manual testing: https://asciinema.org/a/303399?speed=2 | ||
|
||
#### Write Cluster Cloud Controller Manager Operator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which team should be owning this long term? Would it make sense for the cloud-team to own this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, the cloud team. Because I expect there will be more cloud controller manager for other platforms.
But OpenStack team can maintain some related submodules by placing OWNERS file in appropriate places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have a clear understanding of how the code structure will be laid out to support generic CCCMO vs platform specific so that we can properly partition the work and have the correct OWNERS file?
enhancements/cloud-controller-manager/openstack-cloud-controller-manager.md
Show resolved
Hide resolved
enhancements/cloud-controller-manager/openstack-cloud-controller-manager.md
Show resolved
Hide resolved
enhancements/cloud-controller-manager/openstack-cloud-controller-manager.md
Show resolved
Hide resolved
Can you share any thoughts on the intersection of this component with the Infrastructure type? I assume there is no additional configurability needed for OpenStack? |
|
||
Actions: | ||
|
||
- Make sure Cinder CSI driver is supported in OpenShift. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why this CSI is related to CCM? seems those are different operator/repo?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now we use in-tree cloud provider, that manages Cinder volumes as well. Cloud controller manager can't do it, since this functionality was moved to Cinder CSI driver. So, to maintain feature parity with the in-tree cloud provider we need to enable Cloud Controller Manager and deploy Cinder CSI driver.
|
||
- Manually install CCM's [daemonset](https://github.com/kubernetes/cloud-provider-openstack/blob/master/manifests/controller-manager/openstack-cloud-controller-manager-ds.yaml) on a working OCP cluster deployed on OpenStack. | ||
|
||
- Update configuration of `kube-apiserver`, `kube-controller-manager` and `kubelet` by replacing `--cloud-provider openstack` with `--cloud-provider external` and removing `--cloud-config` parameters. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this operator installed optionally or mandatory?
I think you need have CCM running in order to make node
ready, which meaning this is not installed from OperateHub?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The operator will be installed mandatory. But depending on the platform, it will either deploy Cloud Controller Manager or not.
b048501
to
8f8c198
Compare
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: Fedosin The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
||
- Implement the operator, using [library-go](https://github.com/openshift/library-go) primitives. | ||
|
||
#### Build the operator image by OCP automation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we know which group in OCP will be responsible for the image building automation? Have we engaged them yet?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ART team, because it will be a cluster operator. They are not involved yet.
|
||
#### Write Cluster Cloud Controller Manager Operator | ||
|
||
The operator should be able to create, configure and manage different CCMs (including one for OpenStack) in OpenShift. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these operations the scope of the unit/CI tests? Are there more/others?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the main goal is to reach the feature parity with the in-tree cloud provider. So, if e2e-openstack tests pass, then we are good.
For sure, unit tests for the operator are required.
enhancements/cloud-controller-manager/openstack-cloud-controller-manager.md
Show resolved
Hide resolved
- Add CCM operator support in Cluster Version Operator. CCM operator will be deployed on early stages of installation and then it deploys and configures OpenStack CCM itself. | ||
|
||
- Change Kubelet configuration for OpenStack in [Machine Config Operator templates](https://github.com/openshift/machine-config-operator/blob/d044c74ea4b9900c269ee8de8131ed49ba6fedc8/templates/master/01-master-kubelet/openstack/units/kubelet.service.yaml#L30) to adopt external cloud provider. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we have someone from CVO and MCO review?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, 3CMO (cluster cloud controller manager operator) will be managed by CVO so their opinion is really appreciated.
There won't be so much interaction with MCO, I believe. We will need to change just one line of code there:
https://github.com/openshift/machine-config-operator/blob/master/templates/master/01-master-kubelet/openstack/units/kubelet.service.yaml#L30
But the more people look, the better :)
8f8c198
to
86fecb2
Compare
A lot of the transition for all out of tree providers is being worked with detail here #463. I'd suggest we conflate any Openstack specific info into 463 and close this PR to avoid confusion and scatter knowledge for the same topic. We can then later in time follow up on 463 as we add more providers. cc @Danil-Grigorev @Fedosin |
86fecb2
to
994cf2b
Compare
A: If CCM is not available, new nodes in the cluster will be left unschedulable. | ||
|
||
Q: Who talks to CCM? | ||
A: From architectural point of view, only Kube API server is able to communicate with CCM. [Source](https://kubernetes.io/docs/concepts/architecture/cloud-controller/#design). All requests to CCM are sent by `kubelet`. KCM doesn't interract with CCM. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kube-apiserver or kubelet (have not yet read the link) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kube-apiserver as written below.
A: From architectural point of view, only Kube API server is able to communicate with CCM. [Source](https://kubernetes.io/docs/concepts/architecture/cloud-controller/#design). All requests to CCM are sent by `kubelet`. KCM doesn't interract with CCM. | ||
|
||
Q: Does CCM provide an API? How is it hosted? HTTPS endpoint? Through load balancer? | ||
A: CCM provides a gRPC interface on port [10258](https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/cloud-provider/ports.go#L22). Like KCM, CCM supports high available setup using leader election (which is on by default). [Source](https://kubernetes.io/docs/tasks/administer-cluster/running-cloud-controller/#requirements). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this means that clients of ccm (kubelet? kube-apiserver ?) have to find the leader? I.e. you need a load balancer? Or is it a kubernetes service?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the former case you have to integrate with the cloud API LB, and also integrate with MCO and it gcp/azure-routes.sh mechanism. Those might even be duplicated because API server and ccm life-cycles don't match.
In the latter case, you depend on SDN. What would that mean? No scheduling, no SDN. No SDN, no ccm, no scheduling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or you tell kube-apiserver about all the IPs similarly to etcd-operator today. Grpc can handle multiple endpoints and find one that works.
A: CCM provides a gRPC interface on port [10258](https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/cloud-provider/ports.go#L22). Like KCM, CCM supports high available setup using leader election (which is on by default). [Source](https://kubernetes.io/docs/tasks/administer-cluster/running-cloud-controller/#requirements). | ||
|
||
Q: What are the thoughts about certificate management? | ||
A: Certificate management should be performed by the CCM operator, like it's done in the KCM operator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The consumers of ccm have to know the CA. How should that work? Note, that kcm has no API beyond the metrics and health endpoints, i.e. a very different and simpler setup. What you describe sounds more like kube-apiserver.
A: No. Control plane nodes only. [Source](https://kubernetes.io/docs/concepts/overview/components/#cloud-controller-manager) | ||
|
||
Q: How does SDN depend on CCM? | ||
A: Most likely they are not related. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See the LB requirement above.
|
||
2. How to migrate PVs created by the in-tree cloud provider? | ||
[CSIMigration](https://kubernetes.io/blog/2019/12/09/kubernetes-1-17-feature-csi-migration-beta/) looks like the best option, especially if it is GA in 1.19. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need a section about bootstrapping. Probably we need a bootstrap instance of ccm.
994cf2b
to
9c7789e
Compare
I close this PR now in favor of #463 |
No description provided.