Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix MariaDB InnoDB pending flush metric parsing #13658

Merged

Conversation

Jack97
Copy link
Contributor

@Jack97 Jack97 commented Jan 9, 2023

What does this PR do?

This PR fixes the parsing of InnoDB pending flush metrics for modern MariaDB hosts (10.8+).

Motivation

We have been observing the following error when metrics are collected from our MariaDB host:

2023-01-09 10:25:16 UTC | CORE | ERROR | (pkg/collector/worker/check_logger.go:69 in Error) | check:mysql | Error running check: [{"message": "list index out of range", "traceback": "Traceback (most recent call last):\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/checks/base.py\", line 1122, in run\n    self.check(instance)\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/mysql/mysql.py\", line 248, in check\n    raise e\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/mysql/mysql.py\", line 229, in check\n    self._collect_metrics(db, tags=tags)\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/mysql/mysql.py\", line 378, in _collect_metrics\n    results.update(self.innodb_stats.get_stats_from_innodb_status(db))\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/mysql/innodb_metrics.py\", line 207, in get_stats_from_innodb_status\n    results['Innodb_pending_log_flushes'] = long(row[4])\nIndexError: list index out of range\n"}]

The issue is within the output of SHOW /*!50000 ENGINE*/ INNODB STATUS, the section containing the metrics for pending flushes is subtly different for newer versions of MariaDB than it is for MySQL.

Additional Notes

N/A

Review checklist (to be filled by reviewers)

  • Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
  • PR title must be written as a CHANGELOG entry (see why)
  • Files changes must correspond to the primary purpose of the PR as described in the title (small unrelated changes should have their own PR)
  • PR must have changelog/ and integration/ labels attached
  • If the PR doesn't need to be tested during QA, please add a qa/skip-qa label.

@alexandre-normand
Copy link
Contributor

Hey @Jack97 . Thanks for the contribution 🙇 . This change looks good but may I ask how you've tested the fix? It looks like there's no official docker image yet for MariaDB 11.0.0. Ideally, we'd be able to add the version so the tests run in CI for that version.

@Jack97
Copy link
Contributor Author

Jack97 commented Jan 9, 2023

Hi @alexandre-normand, no worries at all. I tested the change with a crude python script that modified the InnoDBMetrics class to scan the output that we receive from our database. I verified that it didn't throw the error we've observed, and that it appended the expected metrics to the results dictionary.

I looked at adding these expectations as unit tests, but the existing tests are fairly sparse for this area of the codebase so I decided against.

We're currently running 10.9.4, but the change has been present since since 10.8. So it should be possible to test an affected version - any preference on which version(s) should be added? 10.8, 10.9, and 10.10 are all candidates.

@alexandre-normand
Copy link
Contributor

Hi @alexandre-normand, no worries at all. I tested the change with a crude python script that modified the InnoDBMetrics class to scan the output that we receive from our database. I verified that it didn't throw the error we've observed, and that it appended the expected metrics to the results dictionary.

I looked at adding these expectations as unit tests, but the existing tests are fairly sparse for this area of the codebase so I decided against.

We're currently running 10.9.4, but the change has been present since since 10.8. So it should be possible to test an affected version - any preference on which version(s) should be added? 10.8, 10.9, and 10.10 are all candidates.

Ah, thanks for clarifying, @Jack97 . We probably don't want an explosion of versions so having the latest official release (10.10) is probably what makes the most sense.

@Jack97 Jack97 force-pushed the fix/mariadb-innodb-pending-flush-metrics branch from 3f6282f to c6f8e4e Compare January 10, 2023 09:02
Copy link
Contributor

@iliakur iliakur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally we'd have unit tests for this, but if database monitoring folks are ok without, so are we ;)

@iliakur iliakur merged commit f97050f into DataDog:master Mar 27, 2023
steveny91 pushed a commit that referenced this pull request Mar 30, 2023
* Fix MariaDB InnoDB pending flush metric parsing

* Add MariaDB 10.10 to test version matrix

---------

Co-authored-by: Jack Robertson <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants