Add more galera metrics #10675
Hello, this PR is a draft. Let us know when it's ready for review.
Some remarks:
👍 for this useful PR, with one open question about the metric type:
mysql/datadog_checks/mysql/const.py
'wsrep_replicated_bytes': ('mysql.galera.wsrep_replicated_bytes', GAUGE),
'wsrep_received_bytes': ('mysql.galera.wsrep_received_bytes', GAUGE),
'wsrep_received': ('mysql.galera.wsrep_received', GAUGE),
'wsrep_local_state': ('mysql.galera.wsrep_local_state', GAUGE),
'wsrep_local_cert_failures': ('mysql.galera.wsrep_local_cert_failures', GAUGE),
I'm wondering if these 5 new metrics shouldn't be defined as MONOTONIC, because they are all counters of "how many XX (events or bytes) occurred since the node started".
@jmeunier28: what do you think of this?
Good question. I would say that makes sense for things that are events, e.g. wsrep_local_cert_failures should be a count.
I think wsrep_local_state makes sense as a gauge because it's a state that fluctuates between defined numbers. As far as bytes written/received go, I think it also makes sense to just leave them as a GAUGE, assuming the last number you receive from the query is a total, which it looks like it is.
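To make the gauge versus monotonic count distinction concrete, here is a minimal sketch in plain Python of how each type would treat the same series of raw samples. This is not the Agent's actual aggregation code: the class names `Gauge` and `MonotonicCount`, the sample values, and the choice to report 0 on the first sample or after a counter reset are all assumptions made for illustration.

```python
class Gauge:
    """Reports each sampled value unchanged (suits a state such as wsrep_local_state)."""

    def submit(self, sample):
        return sample


class MonotonicCount:
    """Reports the increase since the previous sample (suits an event counter)."""

    def __init__(self):
        self.previous = None

    def submit(self, sample):
        # Assumption for this sketch: the first sample and any decrease
        # (e.g. the node restarted and the counter reset) report 0.
        if self.previous is None or sample < self.previous:
            delta = 0
        else:
            delta = sample - self.previous
        self.previous = sample
        return delta


# Hypothetical raw values of an ever-growing counter, with a restart before 3.
samples = [10, 12, 12, 17, 3, 5]

gauge = Gauge()
counter = MonotonicCount()
gauge_points = [gauge.submit(s) for s in samples]
count_points = [counter.submit(s) for s in samples]

print(gauge_points)
print(count_points)
```

With a gauge the graph shows the running total itself; with a monotonic count it shows how many new events occurred between two collections, which is usually what one wants to alert on for failures.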
OK thanks, I updated wsrep_local_cert_failures to the MONOTONIC type and left the others as GAUGE.
> I think wsrep_local_state makes sense as a gauge bc its a state that fluctuates between defined numbers.

Ha yes, I didn't check precisely enough last time.
But you're totally right, this metric will go up and down, so gauge 👍

> As far as bytes written/received go, I think it also makes sense to just leave it as a GAUGE assuming the last number you receive from the query is a Total, which it looks like it is

Yes, mysql/mariadb returns the total traffic sent since the start of the process.
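Because the byte metrics are running totals, a write rate has to be derived from two samples. The following is a minimal sketch under stated assumptions: the function names, the sample numbers, and the sizing rule (write rate multiplied by the time window the gcache should cover) are illustrative and are not part of this PR.

```python
BYTE_METRICS = ("wsrep_replicated_bytes", "wsrep_received_bytes")


def write_rate_bytes_per_sec(first, second, interval_seconds):
    """Average replication traffic between two samples of the running totals."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = sum(second[name] - first[name] for name in BYTE_METRICS)
    return delta / interval_seconds


def estimate_gcache_bytes(rate_bytes_per_sec, window_seconds):
    """Rule-of-thumb size: enough room to hold `window_seconds` of writes."""
    return rate_bytes_per_sec * window_seconds


# Hypothetical totals sampled 60 seconds apart.
first = {"wsrep_replicated_bytes": 1_000_000, "wsrep_received_bytes": 2_000_000}
second = {"wsrep_replicated_bytes": 1_600_000, "wsrep_received_bytes": 2_600_000}

rate = write_rate_bytes_per_sec(first, second, 60)
gcache = estimate_gcache_bytes(rate, 3600)  # cover one hour of writes
print(rate, gcache)
```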
lgtm
@jmeunier28 thanks for your review. What is the next step to merge this PR?
@aymeric-ledizes sorry for the late response here, but if you want to address #10675 (comment) then I can get your PR merged. Thanks for adding these new metrics! They look super useful!
4edcabf to 9adb0ce
Thanks for the approval @jmeunier28. How do we merge the PR? Are the labels OK?
* add more useful galera metrics for mysql
* add wsrep_received, wsrep_local_state and wsrep_local_cert_failures
* set wsrep_local_cert_failures as MONOTONIC
What does this PR do?
This PR adds more useful Galera metrics:
wsrep_local_state
wsrep_replicated_bytes
wsrep_received_bytes
wsrep_received
wsrep_local_cert_failures
Motivation
wsrep_local_state: to determine the Galera state (Synced, Joined, Donor/Desynced, Joining)
wsrep_received, wsrep_replicated_bytes and wsrep_received_bytes: to estimate the gcache size, we need to know the amount of data in transit
wsrep_local_cert_failures: to monitor the "conflicts" that we never want to have on a production cluster (because they cause commits to be refused)
Additional Notes
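As an illustration of the wsrep_local_state motivation, here is a small sketch that turns the numeric gauge value into the state names listed above. The numeric codes follow the Galera documentation for `wsrep_local_state`; the helper name `describe_local_state` and the "Unknown" fallback are hypothetical and not part of this PR.

```python
# Numeric codes as documented by Galera for wsrep_local_state.
WSREP_LOCAL_STATE_NAMES = {
    1: "Joining",
    2: "Donor/Desynced",
    3: "Joined",
    4: "Synced",
}


def describe_local_state(value):
    """Map a wsrep_local_state value (int or numeric string) to its name."""
    return WSREP_LOCAL_STATE_NAMES.get(int(value), "Unknown")


print(describe_local_state(4))
print(describe_local_state("2"))
```

A monitor on this gauge could then alert whenever a node stays away from 4 (Synced) for longer than expected.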
Review checklist (to be filled by reviewers)
changelog/ and integration/ labels attached