Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DBMON-2051: Adding a deadlock metric to MySQL databases #16904

Merged
merged 32 commits into from
Feb 28, 2024
Merged
Show file tree
Hide file tree
Changes from 28 commits
Commits
Show all changes
32 commits
Select commit Hold shift + click to select a range
f6da4ae
DBMON-2051: Adding a deadlock metric to Maria DB
Feb 19, 2024
f0a7bcf
removing print outs
kozlovb Feb 20, 2024
402fa6f
removed prints
kozlovb Feb 20, 2024
cdf9819
trying to create a proper deadlock
kozlovb Feb 20, 2024
89728de
Fix thread pool limit
kozlovb Feb 20, 2024
af655a4
Add a check
kozlovb Feb 20, 2024
9f04c81
Add wait to let update db threads to finish
kozlovb Feb 20, 2024
c87e65c
Add select query
kozlovb Feb 20, 2024
e1c2b7a
fix query
kozlovb Feb 20, 2024
6b1eb22
revert changes to conftest
kozlovb Feb 21, 2024
0c6b230
fix test
kozlovb Feb 21, 2024
478063b
Add events
kozlovb Feb 21, 2024
03efd98
remove obsolete diff
kozlovb Feb 21, 2024
deeddbb
cleaned up pr
kozlovb Feb 22, 2024
dc5693d
moved tests to the end
kozlovb Feb 22, 2024
6c7dc2e
Put back empty lines
kozlovb Feb 22, 2024
1492844
put back empty line
kozlovb Feb 22, 2024
7441ad0
added a changelog
kozlovb Feb 22, 2024
edd227c
changed pr number in changelog
kozlovb Feb 22, 2024
8abc8bf
removed patched config
kozlovb Feb 22, 2024
ffd08e1
reverting back conftest
kozlovb Feb 22, 2024
5c4e7ba
reverted common py
kozlovb Feb 22, 2024
49582d0
applied linter
kozlovb Feb 22, 2024
daac79f
Added the correct changelog
kozlovb Feb 22, 2024
ee10ca3
add a check for innnodb
kozlovb Feb 23, 2024
e295f28
removed superfluous method
kozlovb Feb 23, 2024
8e89bf4
removed whitespace
kozlovb Feb 23, 2024
8bce957
changed to monotonic_count
kozlovb Feb 23, 2024
e72f952
removed check from constructor
kozlovb Feb 26, 2024
126881e
modified variable names
kozlovb Feb 26, 2024
77d44a1
fixed changelog and test error
kozlovb Feb 27, 2024
e513c14
improved the changelog
kozlovb Feb 28, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions mysql/changelog.d/16904.added
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
DBMON-2051: Adding a deadlock metric to Maria DB
kozlovb marked this conversation as resolved.
Show resolved Hide resolved
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not include the ticket in the changelog since this is mostly for customers and they can't have access to our Jira board

Copy link
Contributor Author

@kozlovb kozlovb Feb 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, now, I remembered in my previous company we had two separate entries one for customers and one internal with the ticket. Ok, let me remove it, thx.

13 changes: 10 additions & 3 deletions mysql/datadog_checks/mysql/mysql.py
Original file line number Diff line number Diff line change
Expand Up @@ -56,6 +56,7 @@
from .innodb_metrics import InnoDBMetrics
from .metadata import MySQLMetadata
from .queries import (
QUERY_DEADLOCKS,
QUERY_USER_CONNECTIONS,
SQL_95TH_PERCENTILE,
SQL_AVG_QUERY_RUN_TIME,
Expand Down Expand Up @@ -140,6 +141,8 @@ def __init__(self, name, init_config, instances):
# go through the agent internal metrics submission processing those tags
self._non_internal_tags = copy.deepcopy(self.tags)
self.set_resource_tags()
with self._connect() as db:
self._is_innodb_engine_enabled = self._is_innodb_engine_enabled(db)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to put this under def check after db connection is established

with self._connect() as db:
.
We can also rename the function _is_innodb_engine_enabled to something like _check_innodb_engine_enabled.
In the __init__ constructor, we can just initialize a variable self.innodb_engine_enabled = None as None to indicate the check has not been done.

Copy link
Contributor Author

@kozlovb kozlovb Feb 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thx, for renaming will do.

as for moving it to check - that was my initial idea too. But I didn't like that I had to add db as an argument to runtime_queries. Although, now, I greped the code and no one is calling this from outside of the class. I thought if it's without _ then it's public and I cant change the signature.

The other option is to modify _check_innodb_engine_enabled, I could've check there if self.conn is None or not, and if not open a connection there. Pls let me know which one is better or something else.

What I don't like in my solution and kind of in existing one is the absence of clarity - how one can know that _check_innodb_engine_enabled has to be called and not _is_innodb_engine_enabled. Same for the runtime_queries property vs _runtime_queries. May be we call _runtime_queries -> _runtime_queries_cached and same for _is_innodb_engine_enabled -> _is_innodb_engine_enabled_cached. Like this it would be clear that this field might be None.


def execute_query_raw(self, query):
with closing(self._conn.cursor(pymysql.cursors.SSCursor)) as cursor:
Expand Down Expand Up @@ -345,6 +348,9 @@ def runtime_queries(self):

queries = []

if self._is_innodb_engine_enabled:
queries.extend([QUERY_DEADLOCKS])

if self.performance_schema_enabled:
queries.extend([QUERY_USER_CONNECTIONS])

Expand Down Expand Up @@ -445,9 +451,10 @@ def _collect_metrics(self, db, tags):
with tracked_query(self, operation="variables_metrics"):
results.update(self._get_stats_from_variables(db))

if not is_affirmative(
self._config.options.get('disable_innodb_metrics', False)
) and self._is_innodb_engine_enabled(db):
if (
not is_affirmative(self._config.options.get('disable_innodb_metrics', False))
and self._is_innodb_engine_enabled
):
with tracked_query(self, operation="innodb_metrics"):
results.update(self.innodb_stats.get_stats_from_innodb_status(db))
self.innodb_stats.process_innodb_stats(results, self._config.options, metrics)
Expand Down
13 changes: 13 additions & 0 deletions mysql/datadog_checks/mysql/queries.py
Original file line number Diff line number Diff line change
Expand Up @@ -74,6 +74,19 @@
SELECT plugin_status
FROM information_schema.plugins WHERE plugin_name='group_replication'"""

QUERY_DEADLOCKS = {
'name': 'information_schema.INNODB_METRICS.lock_deadlocks',
'query': """
SELECT
count as deadlocks
FROM
information_schema.INNODB_METRICS
WHERE
NAME='lock_deadlocks'
""".strip(),
'columns': [{'name': 'mysql.innodb.deadlocks', 'type': 'gauge'}],
}

QUERY_USER_CONNECTIONS = {
'name': 'performance_schema.threads',
'query': """
Expand Down
1 change: 1 addition & 0 deletions mysql/metadata.csv
Original file line number Diff line number Diff line change
@@ -1,3 +1,3 @@
metric_name,metric_type,interval,unit_name,per_unit_name,description,orientation,integration,short_name,curated_metric
mysql.info.schema.size,gauge,,mebibyte,,"Size of schemas in MiB",0,mysql,mysql schema size,
mysql.info.table.rows.read,count,,row,,"Total number of rows read per table (Percona userstat only)",0,mysql,mysql table rows read,
Expand All @@ -12,6 +12,7 @@
mysql.innodb.current_row_locks,gauge,,lock,,The number of current row locks.,-1,mysql,current row locks,
mysql.innodb.data_reads,gauge,,read,second,The rate of data reads.,0,mysql,data reads,
mysql.innodb.data_writes,gauge,,write,second,The rate of data writes.,0,mysql,data writes,
mysql.innodb.deadlocks,monotonic_count,,lock,,"The number of deadlocks.",0,mysql,number of deadlocks,
mysql.innodb.mutex_os_waits,gauge,,event,second,The rate of mutex OS waits.,0,mysql,mutex os waits,
mysql.innodb.mutex_spin_rounds,gauge,,event,second,The rate of mutex spin rounds.,0,mysql,mutex spin rounds,
mysql.innodb.mutex_spin_waits,gauge,,event,second,The rate of mutex spin waits.,0,mysql,mutex spin waits,
Expand Down
82 changes: 82 additions & 0 deletions mysql/tests/test_query_activity.py
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@
from copy import copy
from datetime import datetime
from os import environ
from threading import Event

import mock
import pymysql
Expand Down Expand Up @@ -478,6 +479,87 @@ def _expected_dbm_job_err_tags(dbm_instance):
]


@pytest.mark.integration
@pytest.mark.usefixtures('dd_environment')
def test_if_deadlock_metric_is_collected(aggregator, dd_run_check, dbm_instance):
check = MySql(CHECK_NAME, {}, [dbm_instance])
dd_run_check(check)
deadlock_metric = aggregator.metrics("mysql.innodb.deadlocks")

assert len(deadlock_metric) == 1, "there should be one deadlock metric"


@pytest.mark.skipif(environ.get('MYSQL_FLAVOR') == 'mariadb', reason='Deadock count is not updated in MariaDB')
@pytest.mark.integration
@pytest.mark.usefixtures('dd_environment')
def test_deadlocks(aggregator, dd_run_check, dbm_instance):
check = MySql(CHECK_NAME, {}, [dbm_instance])
dd_run_check(check)
deadlocks_start = 0
deadlock_metric_start = aggregator.metrics("mysql.innodb.deadlocks")

assert len(deadlock_metric_start) == 1, "there should be one deadlock metric"

deadlocks_start = deadlock_metric_start[0].value

first_query = "SELECT * FROM testdb.users WHERE id = 1 FOR UPDATE;"
second_query = "SELECT * FROM testdb.users WHERE id = 2 FOR UPDATE;"

def run_first_deadlock_query(conn, event1, event2):
conn.begin()
try:
conn.cursor().execute("START TRANSACTION;")
conn.cursor().execute(first_query)
event1.set()
event2.wait()
conn.cursor().execute(second_query)
conn.cursor().execute("COMMIT;")
except Exception:
# Exception is expected due to a deadlock
pass
conn.commit()

def run_second_deadlock_query(conn, event1, event2):
conn.begin()
try:
event1.wait()
conn.cursor().execute("START TRANSACTION;")
conn.cursor().execute(second_query)
event2.set()
conn.cursor().execute(first_query)
conn.cursor().execute("COMMIT;")
except Exception:
# Exception is expected due to a deadlock
pass
conn.commit()

bob_conn = _get_conn_for_user('bob')
fred_conn = _get_conn_for_user('fred')

executor = concurrent.futures.thread.ThreadPoolExecutor(2)

event1 = Event()
event2 = Event()

futures_first_query = executor.submit(run_first_deadlock_query, bob_conn, event1, event2)
futures_second_query = executor.submit(run_second_deadlock_query, fred_conn, event1, event2)
futures_first_query.result()
futures_second_query.result()
# Make sure innodb is updated.
time.sleep(0.3)
bob_conn.close()
fred_conn.close()
executor.shutdown()

dd_run_check(check)

deadlock_metric_end = aggregator.metrics("mysql.innodb.deadlocks")

assert (
len(deadlock_metric_end) == 2 and deadlock_metric_end[1].value - deadlocks_start == 1
), "there should be one new deadlock"


def _get_conn_for_user(user, _autocommit=False):
return pymysql.connect(host=HOST, port=PORT, user=user, password=user, autocommit=_autocommit)

Expand Down
1 change: 1 addition & 0 deletions mysql/tests/variables.py
Original file line number Diff line number Diff line change
Expand Up @@ -90,6 +90,7 @@
'mysql.innodb.buffer_pool_read_requests',
'mysql.innodb.buffer_pool_reads',
'mysql.innodb.buffer_pool_utilization',
'mysql.innodb.deadlocks',
]

COMPLEX_INNODB_VARS = [
Expand Down
Loading