Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

migrate from databaseUsername to databaseAccount and fully use MariaDBAccount #322

Conversation

zzzeek
Copy link
Contributor

@zzzeek zzzeek commented Feb 20, 2024

This moves heat to fully use MariaDBAccount based on the dev work being done for mariadb-operator:

Lead Jira: OSPRH-4095

  1. controller calls EnsureMariaDBAccount up front to make sure MariaDBAccount is present
  2. error code from EnsureMariaDBAccount is handled, Conditions are amended when error is returned
  3. controller calls NewDatabaseForAccount instead of NewDatabase
  4. GetAccountAndSecret is used to retrieve account /secret to populate template
  5. GetDatabaseByName() , normally used for delete finalizers, replaced with GetDatabaseByNameAndAccount
  6. CreateOrPatchAll() used to patch the Database, replacing CreateOrPatchDB / CreateOrPatchDBByName
  7. controller calls DeleteUnusedMariaDBAccountFinalizers when launched pods are definitely running on a new MariaDBAccount, returns error code if present
  8. PasswordSelectors that refer to database are removed
  9. all databaseUser replaced with databaseAccount inside of all XYZ_types.go
  10. all databaseUser replaced with databaseAccount inside of all kuttl/*.yaml
  11. all default databaseUser names become databaseAccount, replacing underscores with dashes inside of all XYZ_types.go
  12. all default databaseUser names become databaseAccount, replacing underscores with dashes inside of all kuttl/*.yaml
  13. MariaDBAccountSuiteTests are used in controller ginkgo tests if it has them
  14. Use configsecrets for database URLs; remove from job hash
  15. 184 is merged and replaces from go.mod are removed

@openshift-ci openshift-ci bot requested review from dprince and rabi February 20, 2024 19:57
@zzzeek zzzeek force-pushed the integrate_oo_accounts branch 7 times, most recently from a05e76e to 006db2d Compare February 21, 2024 03:08
@zzzeek zzzeek force-pushed the integrate_oo_accounts branch 3 times, most recently from 447e708 to 5985ad8 Compare February 27, 2024 20:48
@zzzeek zzzeek requested a review from gibizer February 28, 2024 00:18
@zzzeek
Copy link
Contributor Author

zzzeek commented Feb 28, 2024

@gibizer if you want to look at this one, with a few of these that had database username/password sent as env to all job templates, I had to change that since a rotation of username/password meant the dbsync job would run again, etc. I moved these to use configmaps with a "databaseConnection" token containing the URL in them the way nova/neutron/keystone do, but also, I moved from "configmap" to "configsecret" (also how you can see keystone doing it already) since these contain passwords in them and the whole point of what we're doing is trying to increase username/password security. so that has implications for the volume mounts which become "secret" mounts. designate is another one where I did this, which i haven't updated to the latest yet.

return ctrl.Result{}, err
}

// Add a prefix to the var name to avoid accidental collision with other non-secret
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this comment is wrong. the "prefix" thing is done in the designate repo, but here we aren't ujsing it

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, are we removing this comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like this merged before i got to change it :/

@zzzeek zzzeek force-pushed the integrate_oo_accounts branch from 572fcd7 to 316002f Compare March 6, 2024 14:39
Comment on lines +571 to +582
// remove finalizers from previous MariaDBAccounts for which we have
// switched.
// TODO(zzzeek) - It's not clear if this is called too early here.
// at the moment, heat_controller_test.go doesn't seem to have fixtures
// I can use to simulate getting all the way to the end of a reconcile
// for an instance. Basically this should be called when any pods have
// been restarted to run on an updated set of DB credentials, and the old
// ones are no longer needed. This would allow the scenario where
// a new MariaDBAccount is created and an old MariaDBAccount is marked
// deleted at once, where the finalizer will keep the old one around until
// it's safe to drop.
err = mariadbv1.DeleteUnusedMariaDBAccountFinalizers(ctx, helper, instance.Name, instance.Spec.DatabaseAccount, instance.Namespace)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, so the roadblock on the functional tests getting to the end of the deployment was with the gophercloud dependencies we have on calling Keystone. There's an old abandoned PR that I can dig up for reference, but we can focus on kuttl if we want to validate anything at the end of the deployment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so I have a suite of ginkgo tests that exercises the finalizers moving from one MariaDBAccount to another, once services are redeployed with an updated MariaDBAccount / username/password. That's the suite thats added in the tests here. It wont work if the DeleteUnusedMariaDBAccountFinalizers call above isn't called, since that's part of what it's testing.

Right now, having the call where it is is mostly fine, you can rotate a new MariaDBAccount in and all will be fine. The one thing you can't do here is delete the old MariaDBAccount before the new one is fully reconciled and running, without the risk of a brief race condition where existing pods are using the old database login that would be pulled out, since the finalizer here is removed slightly too early to be guaranteed of that. The issue would of course be worked out within a few seconds in the average case and wouldn't be noticed, and we as yet dont have any system that is actually rotating out old/new MariaDBAccount objects, so there will be no impact from this for now. However, the comment here should be kept that we would like to move this call downwards to be after we are ensured all pods have been re-deployed with new credentials.

return ctrl.Result{}, err
}

// Add a prefix to the var name to avoid accidental collision with other non-secret
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, are we removing this comment?

func (r *HeatEngineReconciler) getSecret(
ctx context.Context,
h *helper.Helper,
instance *heatv1beta1.HeatEngine,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a huge fan of this being duplicated for each controller. I believe the instance here would satisfy the client.Object interface, which would mean we can create a single function that isn't a method of the reconciler, where the instance passed in is of type client.Object.

But we can merge this now and then address that point in a follow up PR. I assume all of the other PR's are doing the same thing as defined here. So for consistency sake, it's best to just merge and we'll look at improving the implementation later.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, we get Status. That isn't a client.Object field. But we still might be able to find a common type that satisfies this requirement and enables us to make this code more generic. I'll dig into it and see if I can propose something that we can collab on.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK so all the other PRs are not changing code in the same way, because each operator seems to have a slightly different way of sharing state between the "top level" controller and the "subordinate" controllers, where for example we have the "nova" controller and its subordinates novaapi, novaconductor, or ironic and its subordinates ironicapi, ironicconductor, etc. they each pass along data differently and refer to their top level instance either explicitly or implicitly. also most of the controllers were already putting the database URL in a configSecret, rather than sending along a fixed DB username with PW environment to the container env directly.

in the case of heat, the code looked most similar to designate, which I also had to convert to not send username in container env, and for setting up the configmap (which I changed to configsecret) each subordinate controller has its very own getSecret method: https://github.com/openstack-k8s-operators/designate-operator/blob/073a542389ea53ae9f77efb03ffcc8fbab65b552/controllers/designateapi_controller.go#L789 https://github.com/openstack-k8s-operators/designate-operator/blob/073a542389ea53ae9f77efb03ffcc8fbab65b552/controllers/designateproducer_controller.go#L552 etc, so I emulated that. I would agree it would be better to have just one method that is shared.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still have three more of these PRs to do, so far every single one has had some new thing I've had to learn so we'll see how barbican / manila / telemetry go

@openshift-ci openshift-ci bot added the lgtm label Mar 7, 2024
Copy link
Contributor

openshift-ci bot commented Mar 7, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bshephar, zzzeek

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved label Mar 7, 2024
@openshift-merge-bot openshift-merge-bot bot merged commit f66e438 into openstack-k8s-operators:main Mar 7, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants