[tlse] TLS database connection #331

Merged
merged 1 commit into openstack-k8s-operators:main
Mar 14, 2024

Contributor

@stuggi stuggi commented Mar 7, 2024

The my.cnf file gets added to the secret holding the service configs. The content of my.cnf is centrally managed in the mariadb-operator and is retrieved by calling db.GetDatabaseClientConfig(tlsCfg).

Depends-On: openstack-k8s-operators/mariadb-operator#190
Depends-On: openstack-k8s-operators/mariadb-operator#191

Jira: OSPRH-4547
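
As a rough illustration of the description above, the sketch below assembles the data for a config secret with my.cnf next to the custom service config. It is not the operator's code: `renderClientConfig` is a hypothetical stand-in for `db.GetDatabaseClientConfig(tlsCfg)`, and the key names, the CA bundle path, and the exact `[client]` options are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical key names; the operator's real constants may differ.
const (
	customConfigKey = "custom.conf"
	myCnfKey        = "my.cnf"
)

// renderClientConfig is a hypothetical stand-in for
// db.GetDatabaseClientConfig(tlsCfg). It returns a [client] section and
// adds TLS options only when a CA bundle path is given.
func renderClientConfig(caBundlePath string) string {
	var b strings.Builder
	b.WriteString("[client]\n")
	if caBundlePath != "" {
		b.WriteString("ssl=1\n")
		b.WriteString("ssl-ca=" + caBundlePath + "\n")
	}
	return b.String()
}

// buildConfigData assembles the data that would be stored in the secret
// holding the service configs, with my.cnf alongside the custom config.
func buildConfigData(customServiceConfig, caBundlePath string) map[string]string {
	return map[string]string{
		customConfigKey: customServiceConfig,
		myCnfKey:        renderClientConfig(caBundlePath),
	}
}

func main() {
	// The CA bundle path here is only an example value.
	data := buildConfigData("[DEFAULT]\ndebug=true\n", "/etc/pki/ca-bundle.pem")
	fmt.Print(data[myCnfKey])
}
```

Keeping the my.cnf content behind a single function mirrors the idea that it is managed in one place and that services only store the result.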

@openshift-ci openshift-ci bot requested review from lewisdenny and vyzigold March 7, 2024 17:13

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/1f54dd75ccae4fb8a111ac316509f770

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 11m 54s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 07m 01s
telemetry-operator-multinode-autoscaling-tempest FAILURE in 1h 32m 25s

@jlarriba
Collaborator

jlarriba commented Mar 8, 2024

There is an error in the operator with the current changes while trying to run aodh:

2024-03-07T19:56:50Z	INFO	Controllers.Autoscaling	Reconciling Service 'aodh'	{"controller": "autoscaling", "controllerGroup": "telemetry.openstack.org", "controllerKind": "Autoscaling", "Autoscaling": {"name":"autoscaling","namespace":"openstack"}, "namespace": "openstack", "name": "autoscaling", "reconcileID": "13e65351-d3f3-425d-8578-52506d697264"}
2024-03-07T19:56:50Z	ERROR	Reconciler error	{"controller": "autoscaling", "controllerGroup": "telemetry.openstack.org", "controllerKind": "Autoscaling", "Autoscaling": {"name":"autoscaling","namespace":"openstack"}, "namespace": "openstack", "name": "autoscaling", "reconcileID": "13e65351-d3f3-425d-8578-52506d697264", "error": "Failed to get aodh database openstack  *v1beta1.Autoscaling openstack/autoscaling: MariaDBDatabase.mariadb.openstack.org \"aodh\" not found"}

customData := map[string]string{common.CustomServiceConfigFileName: instance.Spec.Aodh.CustomServiceConfig}

// the aodh controller currently creates the db with the user
db, err := mariadbv1.GetDatabaseByName(ctx, h, instance.Spec.Aodh.DatabaseUser)
Collaborator

I suspect that this should be instance.Spec.Aodh.DatabaseName

Contributor Author

This was intentional, as DatabaseUser is also used in https://github.com/openstack-k8s-operators/telemetry-operator/blob/main/controllers/aodh_controller.go#L159. I will have a look in a bit. I guess the issue is that the database is created in a different controller.

Collaborator

I think we are not consistent about what the database is called. I have submitted some changes to use DatabaseUser as the database name everywhere.

Contributor

@stuggi The split between autoscaling_controller.go and aodh_controller.go doesn't mean anything; it's one controller split into two files. The reason is that, historically, the autoscaling controller was a lot bigger (it included parts of the current metric storage controller).

Collaborator

After discussing with @vyzigold, we realized that the database name should be fixed, and that the issue was using DatabaseUser as its name. This has been fixed now, and the database is referred to everywhere by the constant autoscaling.DatabaseName.
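
To illustrate why a fixed name helps, here is a minimal sketch under stated assumptions. `databaseStore` is a hypothetical in-memory stand-in for the MariaDBDatabase objects in the cluster, `getDatabaseByName` only mimics the shape of a lookup such as `mariadbv1.GetDatabaseByName`, and the user value `aodh-user` is invented for the example. The constant's value `aodh` is taken from the logs in this thread.

```go
package main

import (
	"errors"
	"fmt"
)

// DatabaseName mirrors the idea of the autoscaling.DatabaseName constant
// discussed above; the value "aodh" comes from the logs in this thread.
const DatabaseName = "aodh"

var errNotFound = errors.New("not found")

// databaseStore is a hypothetical in-memory stand-in for the cluster's
// MariaDBDatabase objects, keyed by object name.
type databaseStore map[string]bool

// getDatabaseByName mimics a lookup by object name and returns a wrapped
// errNotFound when no object with that name exists.
func (s databaseStore) getDatabaseByName(name string) error {
	if !s[name] {
		return fmt.Errorf("MariaDBDatabase %q: %w", name, errNotFound)
	}
	return nil
}

func main() {
	// The database object is created under the fixed name.
	store := databaseStore{DatabaseName: true}

	// A user-configurable value (hypothetical here) may not match it.
	databaseUser := "aodh-user"
	fmt.Println(store.getDatabaseByName(databaseUser))

	// Looking up by the constant matches no matter what the user is called.
	fmt.Println(store.getDatabaseByName(DatabaseName))
}
```

When the code that creates the database and the code that looks it up share one constant, the two cannot drift apart the way a configurable field can.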

@jlarriba jlarriba force-pushed the tls_db branch 2 times, most recently from c1998f6 to 58fbe55 Compare March 8, 2024 09:38
@stuggi
Contributor Author

stuggi commented Mar 8, 2024

I forgot to relocate the DB creation. We need to create the DB before the config gets rendered.
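
The ordering constraint can be sketched as follows. This is a simplified model, not the controller's code: the `state` type, the step names, and the boolean requeue result are all hypothetical, and a real reconciler would return a controller-runtime result instead.

```go
package main

import "fmt"

// state is a hypothetical, simplified view of what one reconcile pass sees.
type state struct {
	dbReady bool     // whether the database has been created and reconciled
	steps   []string // steps executed so far, in order
}

// ensureDatabase records the DB step and reports whether the DB is ready.
func (s *state) ensureDatabase() bool {
	s.steps = append(s.steps, "ensure-db")
	return s.dbReady
}

// renderConfig records the config step. It must run after the DB step
// because the rendered secret includes the database client config.
func (s *state) renderConfig() {
	s.steps = append(s.steps, "render-config")
}

// reconcile runs the DB step first and returns requeue=true, without
// rendering any config, while the database is not ready yet.
func reconcile(s *state) (requeue bool) {
	if !s.ensureDatabase() {
		return true
	}
	s.renderConfig()
	return false
}

func main() {
	s := &state{}
	fmt.Println(reconcile(s), s.steps) // DB not ready: config is not rendered
	s.dbReady = true
	fmt.Println(reconcile(s), s.steps) // DB ready: config is rendered
}
```

Returning early while the database is not ready keeps the config step from reading a database object that does not exist yet.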


Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/5f5d47b08af847a88991665fb7224e7e

openstack-k8s-operators-content-provider FAILURE in 22m 13s
⚠️ podified-multinode-edpm-deployment-crc SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ telemetry-operator-multinode-autoscaling-tempest SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider

@stuggi
Contributor Author

stuggi commented Mar 8, 2024

recheck


Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/5ff5bda50b6d4e07b835e27207980586

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 59m 30s
podified-multinode-edpm-deployment-crc FAILURE in 1h 26m 49s
telemetry-operator-multinode-autoscaling-tempest FAILURE in 1h 33m 11s

@jlarriba
Collaborator

jlarriba commented Mar 8, 2024

There is something wrong with reconciliation; it never seems to end:

2024-03-08T13:25:58Z	INFO	Controllers.Autoscaling	Reconciling Service 'aodh'	{"controller": "autoscaling", "controllerGroup": "telemetry.openstack.org", "controllerKind": "Autoscaling", "Autoscaling": {"name":"autoscaling","namespace":"openstack"}, "namespace": "openstack", "name": "autoscaling", "reconcileID": "f86dd085-5b6e-43ee-b3c8-a71d029200be"}
2024-03-08T13:25:58Z	INFO	Controllers.Autoscaling	Database object for MariaDBDatabase aodh does not have a MariaDBAccount CR name configured. Assuming legacy use of the API, will use the same name for the MariaDBAccount.	{"controller": "autoscaling", "controllerGroup": "telemetry.openstack.org", "controllerKind": "Autoscaling", "Autoscaling": {"name":"autoscaling","namespace":"openstack"}, "namespace": "openstack", "name": "autoscaling", "reconcileID": "f86dd085-5b6e-43ee-b3c8-a71d029200be"}
2024-03-08T13:25:58Z	INFO	Controllers.Autoscaling	Applied new databasehostname openstack.openstack.svc to MariaDBDatabase aodh	{"controller": "autoscaling", "controllerGroup": "telemetry.openstack.org", "controllerKind": "Autoscaling", "Autoscaling": {"name":"autoscaling","namespace":"openstack"}, "namespace": "openstack", "name": "autoscaling", "reconcileID": "f86dd085-5b6e-43ee-b3c8-a71d029200be"}
2024-03-08T13:25:58Z	INFO	Controllers.Autoscaling	Waiting for MariaDBDatabase aodh to be fully reconciled	{"controller": "autoscaling", "controllerGroup": "telemetry.openstack.org", "controllerKind": "Autoscaling", "Autoscaling": {"name":"autoscaling","namespace":"openstack"}, "namespace": "openstack", "name": "autoscaling", "reconcileID": "f86dd085-5b6e-43ee-b3c8-a71d029200be", "ObjectType": "*v1beta1.MariaDBDatabase", "ObjectNamespace": "openstack", "ObjectName": "aodh"}

@stuggi
Contributor Author

stuggi commented Mar 8, 2024

> There is something wrong with reconciliation, it never seems to end:

The dbName had the wrong label. I changed it in my last update. Let's see.


Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/285636c1a12549e08066d375c181073b

openstack-k8s-operators-content-provider RETRY_LIMIT in 31m 11s
⚠️ podified-multinode-edpm-deployment-crc SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ telemetry-operator-multinode-autoscaling-tempest SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider

@stuggi
Contributor Author

stuggi commented Mar 8, 2024

recheck


Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/382b384dfb114029a25750cf4bf7e2b9

openstack-k8s-operators-content-provider TIMED_OUT in 31m 17s
⚠️ podified-multinode-edpm-deployment-crc SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ telemetry-operator-multinode-autoscaling-tempest SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider

@jlarriba
Collaborator

Yesterday a PR was submitted to move the database to the new "DBAccounts" API: #333.

@stuggi I suggest waiting until that is merged and then rebasing this on top of it. WDYT?

@stuggi
Contributor Author

stuggi commented Mar 11, 2024

> Yesterday a PR was submitted to move the database to the new "DBAccounts" API: #333.
>
> @stuggi I suggest waiting until that is merged and then rebasing this on top of it. WDYT?

Yes, sounds good.

@stuggi stuggi marked this pull request as draft March 11, 2024 10:06
@jlarriba
Collaborator

> > Yesterday a PR was submitted to move the database to the new "DBAccounts" API: #333.
> > @stuggi I suggest waiting until that is merged and then rebasing this on top of it. WDYT?
>
> Yes, sounds good.

@stuggi #333 was merged, could you please rebase on top of it?

@jlarriba
Collaborator

/unhold


Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/31cb6ddfefb4439aa8b5a19d2442d4e7

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 56m 52s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 04m 27s
telemetry-operator-multinode-autoscaling-tempest FAILURE in 1h 33m 02s

@stuggi stuggi marked this pull request as ready for review March 12, 2024 16:01
@openshift-ci openshift-ci bot requested review from csibbitt and jlarriba March 12, 2024 16:01

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/e2b696addf0041edb8b19b750498622d

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 29m 49s
podified-multinode-edpm-deployment-crc RETRY_LIMIT in 59m 01s
✔️ telemetry-operator-multinode-autoscaling-tempest SUCCESS in 1h 01m 51s

@jlarriba
Collaborator

recheck


Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/78b9731f24f2401c8aedb06fddb22baf

openstack-k8s-operators-content-provider FAILURE in 19m 55s
⚠️ podified-multinode-edpm-deployment-crc SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ telemetry-operator-multinode-autoscaling-tempest SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider

@stuggi
Contributor Author

stuggi commented Mar 12, 2024

recheck

@stuggi
Contributor Author

stuggi commented Mar 13, 2024

recheck

@stuggi
Contributor Author

stuggi commented Mar 13, 2024

    case.go:364: failed in step 2-deploy
    case.go:366: Internal error occurred: failed calling webhook "mautoscaling.kb.io": failed to call webhook: Post "https://telemetry-operator-controller-manager-service.openstack-operators.svc:443/mutate-telemetry-openstack-org-v1beta1-autoscaling?timeout=10s": no endpoints available for service "telemetry-operator-controller-manager-service"

@stuggi
Contributor Author

stuggi commented Mar 13, 2024

/test telemetry-operator-build-deploy-kuttl

@openshift-ci openshift-ci bot added the lgtm label Mar 14, 2024
Contributor

openshift-ci bot commented Mar 14, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jlarriba, stuggi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot openshift-merge-bot bot merged commit 5590312 into openstack-k8s-operators:main Mar 14, 2024
6 checks passed