-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make Galera service A/P #229
Make Galera service A/P #229
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
were we going to do a CRD change to make this configurable? would be handy for us to be able to deploy A/A so that we can in fact have CI / tempest jobs that do this and we can slowly try to improve codebases
@zzzeek my plan is to do that in a subsequent PR, to avoid any CRD change that would force a bump in openstack-operator. |
Depending on the CRD change, I think we should do it now. |
controllers/galera_controller.go
Outdated
service.Spec = pkgsvc.Spec | ||
if present { | ||
service.Spec.Selector[ActivePodSelectorKey] = activePod | ||
} | ||
err := controllerutil.SetOwnerReference(instance, service, r.Client.Scheme()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why the instance is not set as the Controller (same for L483)? currently the controller won't reconcile if the service gets deleted.
I guess we need it now for this PR as otherwise the controller will reconcile on the selector update and returns it to what it think it should be?
With this, there is a scenario where we'd loose the a-p selector. If someone deletes the service (for whatever reason), first of all the controller currently won't reconcile and re-create it. So a user would have to restart the mariadb operator, or do some other change to trigger a reconcile, but in this scenario, we don't have the active pod selector as it is a new service and there won't be a change to the galera cluster and the script won't run. With this we are in the multi master config.
Another approach could be (not sure yet which one is better) that the script adds an annotation to the statefulset on which one is the master (I guess on a pod restart it will always run at least once?), the controller would then reconcile and can update the service.
There is also a downside in this approach, the operator needs to run. If the mariadb-operator is down when there is a galera change, those would not be reflected properly. This could happen e.g. if the operator and the deployment pod which was the master are located on a worker which failed. Depending on how log it takes to re-schedule the operator the db would be not available ... so ... maybe it can be a mix, like checking if there is already the statefulset and take the info from there
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why the instance is not set as the Controller (same for L483)? currently the controller won't reconcile if the service gets deleted.
I'm not sure I got this one. Isn't it what lines 509 and 483 supposed to do? When I do e.g. oc describe service openstack
, I can see a label owner=mariadb-operator
.
I guess we need it now for this PR as otherwise the controller will reconcile on the selector update and returns it to what it think it should be?
Hmmm you're correct we'd lose this label in this case.
With this, there is a scenario where we'd loose the a-p selector. If someone deletes the service (for whatever reason), first of all the controller currently won't reconcile and re-create it. So a user would have to restart the mariadb operator, or do some other change to trigger a reconcile, but in this scenario, we don't have the active pod selector as it is a new service and there won't be a change to the galera cluster and the script won't run. With this we are in the multi master config.
You're correct, the label won't be recreated automatically. Restarting a pod would be sufficient for galera of a cluster change and to recreate the right label, although that would be a manual action.
Another approach could be (not sure yet which one is better) that the script adds an annotation to the statefulset on which one is the master (I guess on a pod restart it will always run at least once?), the controller would then reconcile and can update the service. There is also a downside in this approach, the operator needs to run. If the mariadb-operator is down when there is a galera change, those would not be reflected properly. This could happen e.g. if the operator and the deployment pod which was the master are located on a worker which failed. Depending on how log it takes to re-schedule the operator the db would be not available ... so ... maybe it can be a mix, like checking if there is already the statefulset and take the info from there
So I'm against trying to duplicate the state in the CR's status because I don't want to ensure that the states are in sync. However we could handle this deletion case by selecting any one of the healthy pod as the current active endpoint. This wouldn't conflict with the script as it only performs conditional writes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why the instance is not set as the Controller (same for L483)? currently the controller won't reconcile if the service gets deleted.
I'm not sure I got this one. Isn't it what lines 509 and 483 supposed to do? When I do e.g.
oc describe service openstack
, I can see a labelowner=mariadb-operator
.
They are adding an owner, but not as the controller. Owns(&corev1.Service{}).
in the NewControllerManagedBy()
will only reconcile on the ones are also the controller, like in https://github.com/openstack-k8s-operators/mariadb-operator/blob/main/controllers/galera_controller.go#L447 .
I guess we need it now for this PR as otherwise the controller will reconcile on the selector update and returns it to what it think it should be?
Hmmm you're correct we'd lose this label in this case.
With this, there is a scenario where we'd loose the a-p selector. If someone deletes the service (for whatever reason), first of all the controller currently won't reconcile and re-create it. So a user would have to restart the mariadb operator, or do some other change to trigger a reconcile, but in this scenario, we don't have the active pod selector as it is a new service and there won't be a change to the galera cluster and the script won't run. With this we are in the multi master config.
You're correct, the label won't be recreated automatically. Restarting a pod would be sufficient for galera of a cluster change and to recreate the right label, although that would be a manual action.
right, restart one of the pods would also work.
Another approach could be (not sure yet which one is better) that the script adds an annotation to the statefulset on which one is the master (I guess on a pod restart it will always run at least once?), the controller would then reconcile and can update the service. There is also a downside in this approach, the operator needs to run. If the mariadb-operator is down when there is a galera change, those would not be reflected properly. This could happen e.g. if the operator and the deployment pod which was the master are located on a worker which failed. Depending on how log it takes to re-schedule the operator the db would be not available ... so ... maybe it can be a mix, like checking if there is already the statefulset and take the info from there
So I'm against trying to duplicate the state in the CR's status because I don't want to ensure that the states are in sync. However we could handle this deletion case by selecting any one of the healthy pod as the current active endpoint. This wouldn't conflict with the script as it only performs conditional writes.
No, I am not saying to store it in the CR status. Right now the script updates the service/endpoint. it could also add an annotation to the statefulset object with the information. And the controller here takes it from there instead of the service. since the statefulset remains in the mentioned scenario, the information would survive.
But we could handle this in a follow up. just wanted to bring up that there are situations where we may get back to the a-a config.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why the instance is not set as the Controller (same for L483)? currently the controller won't reconcile if the service gets deleted.
I'm not sure I got this one. Isn't it what lines 509 and 483 supposed to do? When I do e.g.
oc describe service openstack
, I can see a labelowner=mariadb-operator
.They are adding an owner, but not as the controller.
Owns(&corev1.Service{}).
in theNewControllerManagedBy()
will only reconcile on the ones are also the controller, like in https://github.com/openstack-k8s-operators/mariadb-operator/blob/main/controllers/galera_controller.go#L447 .
Oh I see now... Thanks for pointing that out. I don't see why we shouldn't own it, I am going to update
the PR.
No, I am not saying to store it in the CR status. Right now the script updates the service/endpoint. it could also add an annotation to the statefulset object with the information. And the controller here takes it from there instead of the service. since the statefulset remains in the mentioned scenario, the information would survive.
But we could handle this in a follow up. just wanted to bring up that there are situations where we may get back to the a-a config.
I think it's sufficient to always set the first pod as the current active endpoint, as 1) if that information becomes incorrect, it is guaranteed to get updated by galera as soon as it detects that this galera node is unresponsive. 2) if it is a healthy pod, it can be used as the active endpoint anyway, and the script will still be able to perform failover next time it is needed.
controllers/galera_controller.go
Outdated
@@ -800,6 +814,7 @@ func (r *GaleraReconciler) SetupWithManager(mgr ctrl.Manager) error { | |||
Owns(&corev1.Endpoints{}). | |||
Owns(&corev1.ConfigMap{}). | |||
Owns(&corev1.ServiceAccount{}). | |||
Owns(&corev1.Service{}). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
already in L813
/retest |
c7af585
to
5fb18af
Compare
As discussed with @stuggi, we now do SetControllerReference on all Service CRs to get notified of deleting and react appropriately. |
The service created for a Galera cluster always balances traffic to all available Galera replicas, which means the cluster is effectively configured as an A/A service. In order to limit the burden on clients of supporting A/A semantics, change the way the service is configured to act as A/P instead. A new callback script is introduced in pods. It is called by mysqld automatically every time the Galera library detects a change in the cluster (node join, crash, network partition...). This callback script reconfigures the service CR via label selectors to drive traffic to a single pod at a time. Jira: OSPRH-7405
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dciabrin, stuggi The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
7d997dc
into
openstack-k8s-operators:main
This reverts commit ab95f15. The openstack-k8s-operators/mariadb-operator#229 switched the mariadb-operator to Active/Passive mode instead of multimaster. This means we don't need to force synchronization from the client config any more.
This reverts commit a5b0bf4. The openstack-k8s-operators/mariadb-operator#229 switched the mariadb-operator to Active/Passive mode instead of multimaster. This means we don't need to force synchronization from the client config any more.
This reverts commit a5b0bf4. The openstack-k8s-operators/mariadb-operator#229 switched the mariadb-operator to Active/Passive mode instead of multimaster. This means we don't need to force synchronization from the client config any more.
This reverts commit ab95f15. The openstack-k8s-operators/mariadb-operator#229 switched the mariadb-operator to Active/Passive mode instead of multimaster. This means we don't need to force synchronization from the client config any more.
This reverts commit 980c318. The openstack-k8s-operators/mariadb-operator#229 switched the mariadb-operator to deploy in Active/Passive mode per default, instead of multimaster.
This reverts commit 67b2877. The openstack-k8s-operators/mariadb-operator#229 switched the mariadb-operator to deploy in Active/Passive mode per default, instead of multimaster.
The openstack-k8s-operators/mariadb-operator#229 switched the mariadb-operator to deploy in Active/Passive mode per default, instead of multimaster. So we should no longer set `mysql_wsrep_sync_wait`. This reverts commit cf4f3f6.
Revert "Cinder wait for DB writes on reads" The openstack-k8s-operators/mariadb-operator#229 switched the mariadb-operator to deploy in Active/Passive mode per default, instead of multimaster. So we should no longer set mysql_wsrep_sync_wait. Related PR: openstack-k8s-operators/cinder-operator#401 This reverts commit cf4f3f6. Reviewed-by: John Fulton <[email protected]>
This reverts commit e17640c. No longer needed because of openstack-k8s-operators/mariadb-operator#229
The service created for a Galera cluster always balances traffic to all available Galera replicas, which means the cluster is effectively configured as an A/A service.
In order to limit the burden on clients of supporting A/A semantics, change the way the service is configured to act as A/P instead.
A new callback script is introduced in pods. It is called by mysqld automatically every time the Galera library detects a change in the cluster (node join, crash, network partition...). This callback script reconfigures the service CR via label selectors to drive traffic to a single pod at a time.
Jira: OSPRH-7405