-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[OSPRH-8406] Switch to using ScrapeConfigs #446
[OSPRH-8406] Switch to using ScrapeConfigs #446
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: vyzigold The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
err = r.Client.Delete(ctx, svcMonitor) | ||
for _, monitor := range monitorList.Items { | ||
if object.CheckOwnerRefExist(instance.ObjectMeta.UID, monitor.ObjectMeta.OwnerReferences) { | ||
err = r.Client.Delete(ctx, monitor) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will delete all ServiceMonitors owned by the telemetry-operator. It's here to cleanup the ServiceMonitors created by previous versions of the operator. Do we want this here? Or is there some better place to do this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be on the reconcileUpdate func, however we are not yet using it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll create the function and add it there then since I have the code ready. We can figure out how / when to call the function later I guess.
} | ||
} | ||
if !rabbitmqExists { | ||
err = r.Client.Delete(ctx, svcMonitor) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The deletion of ServiceMonitors (or ScrapeConfigs) for non-existing RabbitMQs isn't needed with ScrapeConfigs. Previously each instance of RabbitMQ would have its own ServiceMonitor, which would need to be deleted when the RabbitMQ instance was deleted. Now there is only one ScrapeConfig for all RabbitMQ instances. The targets inside the ScrapeConfig are updated each reconciliation loop, so it's always up to date.
2683248
to
08754a1
Compare
53f8446
to
37b4032
Compare
ObjectMeta: metav1.ObjectMeta{ | ||
Name: fmt.Sprintf("%s-%s", instance.Name, ceilometerServerName), | ||
Name: fmt.Sprintf("%s-ceilometer", instance.Name), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should be using telemetry.ServiceName here instead of instance.Name. The instance is called metric-storage, so the ScrapeConfigs get created as metric-storage-ceilometer and such. I think it is a much better idea to have them created as telemetry-ceilometer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
37b4032
to
c8316f5
Compare
Build failed (check pipeline). Post https://review.rdoproject.org/zuul/buildset/3c2df121d625416fadb295798a1f691d ❌ openstack-k8s-operators-content-provider FAILURE in 8m 57s |
We recently discovered issues with authentication, IPv6 and ServiceMonitors in STF. This PR is proactively switching to use ScrapeConfigs instead of ServiceMonitors. The functinality should be equivalent to before. Old ServiceMonitors owned by the MetricStorage controller are deleted. There is a slight difference in the labels associated with the collected metrics. - The Node Exporter metrics are now missing the "job" label, which didn't seem useful and it follows how ceilometer and rabbit metrics are collected. - Ceilometer and RabbitMQ metrics don't have the "service" label anymore, because ScrapeConfigs don't have the information to create that label. Instead they now have the "instance" label. The "instance" label is now used to differentiate between different Rabbit clusters in dashboards instead of the "service" label. I used this opportunity to move the ScrapeConfig creation code into its own function, following the example of dashboard code.
c8316f5
to
5eb13cf
Compare
/lgtm |
9004ebc
into
openstack-k8s-operators:main
We recently discovered issues with authentication, IPv6 and ServiceMonitors in STF. This PR is proactively switching to use ScrapeConfigs instead of ServiceMonitors. The functinality should be equivalent to before. Old ServiceMonitors owned by the MetricStorage controller are deleted.
There is a slight difference in the labels associated with the collected metrics.
The "instance" label is now used to differentiate between different Rabbit clusters in dashboards instead of the "service" label.
I used this opportunity to move the ScrapeConfig creation code into its own function, following the example of dashboard code.