[kubernetes_state] - Collect job metrics #686
Conversation
Thanks for the rebasing, I hate when that happens. Sorry.
Last one, then we're ready for final testing and merging.
Once the tagging logic is reworked, can you please make sure `job` and `namespace` tags are reported as they should? I'll work on some unit tests during QA week.
kubernetes_state/check.py
Outdated
@@ -217,17 +223,37 @@ def kube_job_complete(self, message, **kwargs):
         for metric in message.metric:
             tags = []
             for label in metric.label:
-                tags.append(self._format_tag(label.name, label.value))
+                trimmed_job = self._trim_job_tag(label.value)
+                tags.append(self._format_tag(label.name, trimmed_job))
Last thing before merge: the good tag logic was the one you deleted. You should:
- iterate over the labels
- if the label is `job`, trim, format and append
- else format and append

In your version of the code, if a `namespace` value matches the pattern, it will be incorrectly trimmed.
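The suggested loop can be sketched as below. The helpers `_format_tag` and `_trim_job_tag` exist as methods in the check; the stand-alone versions and the trimming pattern here are hypothetical stand-ins for illustration only:

```python
import re

def _format_tag(name, value):
    # Stand-in for the check's _format_tag helper.
    return "{}:{}".format(name, value)

def _trim_job_tag(value):
    # Stand-in for the check's _trim_job_tag helper. Example pattern:
    # strip a long trailing numeric suffix such as "-1510041600" that
    # CronJobs append to the names of the jobs they spawn.
    return re.sub(r"-\d{4,}$", "", value)

def tags_for(labels):
    """Build tags from (name, value) label pairs, trimming only `job`."""
    tags = []
    for name, value in labels:
        if name == "job":
            # Only the job label gets trimmed...
            tags.append(_format_tag(name, _trim_job_tag(value)))
        else:
            # ...every other label (e.g. namespace) is formatted as-is,
            # even if its value happens to match the trimming pattern.
            tags.append(_format_tag(name, value))
    return tags
```

With this shape, a namespace whose value matches the pattern is left untouched, which is exactly the bug the review comment points out.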
Apologies. I misunderstood your original comment, but as I read it again, I see what you're talking about.
Thanks for your patience
Thanks for the changes, that LGTM!
I'll test it Monday morning and merge, in time for 5.17.
Hi Chris! I tested how the metric reacts with […]. One can plot the value difference to get the right value, but that is unintuitive. I'll patch the check to submit a rate instead. In the meantime, do you think submitting a rate will be OK for your use case? One can go back to the absolute number by computing the […].
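For context, the "value difference" mentioned above is the usual way to read a cumulative counter: successive readings are turned into per-interval increments. A minimal sketch (not code from the check) of that conversion:

```python
def increments(samples):
    """Convert successive cumulative counter readings into per-interval
    deltas, skipping intervals where the counter went backwards (a reset)."""
    deltas = []
    prev = None
    for value in samples:
        if prev is not None and value >= prev:
            deltas.append(value - prev)
        prev = value
    return deltas
```

Submitting a rate does this bookkeeping for you, which is why it is friendlier than plotting raw cumulative values.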
Hi Chris, I just pushed a commit using the […].
I'll be merging before the feature freeze, but I'd love some feedback during this week, as we will be in release candidate. Thanks for your contribution, and sorry for the back and forth; ksm metrics are quite tricky.
🚢
Hey Xavier, thanks for adjusting and merging this. I haven't used the […] yet. I will pull these changes down and give them a go. Thanks again for your help on this and for merging it before the 5.17 freeze.
Hi Chris,
To handle that possible race condition, we'd need to switch to storing old job names and ignoring them once they've been reported as succeeded, but that would be non-trivial. I'd love some feedback on whether the current code works OK for your use case. Regards
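As an illustration only (none of this is in the shipped check), the non-trivial approach of remembering already-reported job names could look like:

```python
class JobDeduper:
    """Sketch: remember job names already reported as succeeded,
    so a job is only ever reported once even if its metric keeps
    appearing in later scrapes."""

    def __init__(self):
        self._reported = set()

    def should_report(self, job_name):
        # Report each job name exactly once.
        if job_name in self._reported:
            return False
        self._reported.add(job_name)
        return True
```

The cost is that this set grows unboundedly unless old entries are expired, which is part of what makes the change non-trivial in a long-running agent.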
Hey @xvello, I tried this out and it should work for us. Thanks again for your input and help. Is there an expected release date for 5.17? I assume it's soon. Thanks
Thanks for your input, Chris! RC testing is progressing steadily, so 5.17 should be out in a few days. Regards
What does this PR do?
Add basic Job metrics from Kubernetes State
See #653 for background info
Motivation
We'd currently like to monitor when jobs fail. In the future, we may want to monitor duration, but that's out of scope.
Testing Guidelines
An overview on testing is available in our contribution guidelines.
Versioning
- manifest.json
- CHANGELOG.md
Additional Notes
Anything else we should know when reviewing?