Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add more galera metrics #10675

Merged
merged 3 commits into from
Dec 7, 2021
Merged

Conversation

aymeric-ledizes
Copy link
Contributor

@aymeric-ledizes aymeric-ledizes commented Nov 18, 2021

What does this PR do?

This PR add more useful Galera metrics:
wsrep_local_state
wsrep_replicated_bytes
wsrep_received_bytes
wsrep_received
wsrep_local_cert_failures

Motivation

wsrep_local_state to determine the Galera state (Synced,Joined,Donor/Desynced,Joining)
wsrep_received, wsrep_replicated_bytes and wsrep_received_bytes to estimate the gcache size we need to know the amount of data transiting
wsrep_local_cert_failures to monitor the "conflicts" that we never want to have (because it refuses commits) on a production cluster

Additional Notes

Review checklist (to be filled by reviewers)

  • Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
  • PR title must be written as a CHANGELOG entry (see why)
  • Files changes must correspond to the primary purpose of the PR as described in the title (small unrelated changes should have their own PR)
  • PR must have changelog/ and integration/ labels attached

@hithwen
Copy link
Contributor

hithwen commented Nov 18, 2021

Hallo, this PR is a draft. Let us know when it's ready for review

Copy link
Contributor

@frivoire frivoire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some remarks:

mysql/datadog_checks/mysql/const.py Outdated Show resolved Hide resolved
mysql/datadog_checks/mysql/const.py Show resolved Hide resolved
mysql/datadog_checks/mysql/const.py Show resolved Hide resolved
@hithwen hithwen requested a review from a team November 22, 2021 12:54
@aymeric-ledizes aymeric-ledizes marked this pull request as ready for review November 22, 2021 14:58
@aymeric-ledizes aymeric-ledizes requested a review from a team as a code owner November 22, 2021 14:58
@codecov
Copy link

codecov bot commented Nov 22, 2021

Codecov Report

Merging #10675 (9adb0ce) into master (b9af988) will increase coverage by 0.01%.
The diff coverage is n/a.

Flag Coverage Δ
mysql 86.96% <ø> (+0.72%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Copy link
Contributor

@frivoire frivoire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 for this useful PR !
with one open question about metric type:

Comment on lines 238 to 242
'wsrep_replicated_bytes': ('mysql.galera.wsrep_replicated_bytes', GAUGE),
'wsrep_received_bytes': ('mysql.galera.wsrep_received_bytes', GAUGE),
'wsrep_received': ('mysql.galera.wsrep_received', GAUGE),
'wsrep_local_state': ('mysql.galera.wsrep_local_state', GAUGE),
'wsrep_local_cert_failures': ('mysql.galera.wsrep_local_cert_failures', GAUGE),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if these 5 new metrics shouldn't be defined as MONOTONIC, because they are all counters of "how many XX (events or bytes) occurred since the node started".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jmeunier28 : what do you think of this ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good question I would say that makes sense for things that are events e.g.: wsrep_local_cert_failures should be a count

I think wsrep_local_state makes sense as a gauge bc its a state that fluctuates between defined numbers. As far as bytes written/received go, I think it also makes sense to just leave it as a GAUGE assuming the last number you receive from the query is a Total, which it looks like it is

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok thanks, I updated wsrep_local_cert_failures to MONOTONIC type and let the others as GAUGE

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think wsrep_local_state makes sense as a gauge bc its a state that fluctuates between defined numbers.

Ha yes, I didn't check precisely enough last time.
But you're totally right, this metric will go up and down, so gauge 👍

As far as bytes written/received go, I think it also makes sense to just leave it as a GAUGE assuming the last number you receive from the query is a Total, which it looks like it is

Yes, mysql/mariadb returns the total traffic sent since the start of the process.

jmeunier28
jmeunier28 previously approved these changes Nov 30, 2021
Copy link
Contributor

@jmeunier28 jmeunier28 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@aymeric-ledizes
Copy link
Contributor Author

@jmeunier28 thanks for your review what is the next step to merge this PR ?

@jmeunier28
Copy link
Contributor

@aymeric-ledizes sorry for late response here, but if you want to address #10675 (comment) then I can get your PR merged

Thanks for adding these new metrics! They look super useful!

@aymeric-ledizes
Copy link
Contributor Author

Thanks for the approve @jmeunier28 how to merge the PR ? Are the labels ok ?

@jmeunier28 jmeunier28 merged commit 806d816 into DataDog:master Dec 7, 2021
cswatt pushed a commit that referenced this pull request Jan 5, 2022
* add more galera useful metric for mysql

* add wresp_received, wresp_local_state and wresp_local_cert_failures

* set wsrep_local_cert_failures as MONOTONIC
@yzhan289 yzhan289 changed the title add more galera useful metrics for mysql Add more galera metrics Jan 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants