Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Collect SHOW ENGINE INNODB MUTEX #147

Open
grypyrg opened this issue Aug 1, 2016 · 8 comments
Open

Collect SHOW ENGINE INNODB MUTEX #147

grypyrg opened this issue Aug 1, 2016 · 8 comments

Comments

@grypyrg
Copy link

grypyrg commented Aug 1, 2016

Collect the output of SHOW ENGINE INNODB MUTEX.

Example Output

Example output:

Type    Name    Status
InnoDB  dict0mem.c:92   os_waits=5
InnoDB  dict0mem.c:92   os_waits=15
InnoDB  dict0mem.c:92   os_waits=3
InnoDB  dict0mem.c:92   os_waits=6
InnoDB  dict0mem.c:92   os_waits=14
InnoDB  dict0mem.c:92   os_waits=1
InnoDB  dict0mem.c:92   os_waits=1
InnoDB  dict0mem.c:92   os_waits=1
InnoDB  trx0purge.c:250 os_waits=8
InnoDB  trx0rseg.c:210  os_waits=89
InnoDB  trx0rseg.c:210  os_waits=94
InnoDB  trx0rseg.c:210  os_waits=92
InnoDB  trx0rseg.c:210  os_waits=97
InnoDB  ibuf0ibuf.c:537 os_waits=9
InnoDB  ibuf0ibuf.c:534 os_waits=3467068
InnoDB  ibuf0ibuf.c:531 os_waits=353548
InnoDB  dict0dict.c:716 os_waits=22363
InnoDB  trx0sys.c:196   os_waits=564
InnoDB  trx0sys.c:1327  os_waits=40
InnoDB  log0log.c:775   os_waits=40086
InnoDB  log0log.c:771   os_waits=14899464
InnoDB  buf0buf.c:1221  os_waits=462
InnoDB  buf0buf.c:1186  os_waits=113179298
InnoDB  fil0fil.c:1645  os_waits=912159
InnoDB  srv0start.c:1301    os_waits=189856
InnoDB  srv0srv.c:1027  os_waits=3079
InnoDB  srv0srv.c:1024  os_waits=323266475
InnoDB  combined buf0buf.c:935  os_waits=803521
InnoDB  dict0dict.c:1750    os_waits=2
InnoDB  dict0dict.c:1750    os_waits=1167
InnoDB  dict0dict.c:1750    os_waits=266
InnoDB  dict0dict.c:1750    os_waits=10
InnoDB  dict0dict.c:1750    os_waits=93
InnoDB  dict0dict.c:1750    os_waits=28
InnoDB  dict0dict.c:1750    os_waits=845
InnoDB  dict0dict.c:1750    os_waits=118
InnoDB  dict0dict.c:1750    os_waits=76
InnoDB  dict0dict.c:1750    os_waits=35788
InnoDB  fil0fil.c:1303  os_waits=9
InnoDB  dict0dict.c:1750    os_waits=6
InnoDB  dict0dict.c:1750    os_waits=40
InnoDB  dict0dict.c:1750    os_waits=494
InnoDB  dict0dict.c:1750    os_waits=1208
InnoDB  dict0dict.c:1750    os_waits=74346
InnoDB  dict0dict.c:1750    os_waits=3333
InnoDB  dict0dict.c:1750    os_waits=114169
InnoDB  fil0fil.c:1303  os_waits=2
InnoDB  fil0fil.c:1303  os_waits=77
InnoDB  fil0fil.c:1303  os_waits=1
InnoDB  fil0fil.c:1303  os_waits=6
InnoDB  trx0purge.c:246 os_waits=2146
InnoDB  dict0dict.c:1750    os_waits=10202084
InnoDB  dict0dict.c:739 os_waits=441
InnoDB  dict0dict.c:739 os_waits=23
InnoDB  dict0dict.c:739 os_waits=25
InnoDB  dict0dict.c:739 os_waits=171
InnoDB  dict0dict.c:729 os_waits=78
InnoDB  fil0fil.c:1303  os_waits=14533
InnoDB  log0log.c:832   os_waits=345967
InnoDB  btr0sea.c:178   os_waits=32118152
InnoDB  trx0i_s.c:1379  os_waits=51
InnoDB  combined buf0buf.c:936  os_waits=279594950

Notes

To Implement This

  • Collect name and the amount of os_waits
  • Many duplicate name occurs, we need to sum all the os_waits for these.

Warnings

  • on larger buffer pools, it can take +1 second to finish . So this should not be ran with an interval of 1 minute or so
  • Based on feedback of Percona Server Lead, It does not impact production in 5.6 and newer. (<5.6 was not checked) (no locks, but does use CPU)
  • maybe disable this feature by default.

Processed Example

Example (ignore second column with %, that is used to represent easier in text):

$ cat data/mutexes.rst | gawk -f pcs-innodb-mutexes-sum.awk
trx0purge.c:250                 0.0 %   8
ibuf0ibuf.c:537                 0.0 %   9
trx0sys.c:1327                  0.0 %   40
dict0mem.c:92                   0.0 %   46
trx0i_s.c:1379                  0.0 %   51
dict0dict.c:729                 0.0 %   78
trx0rseg.c:210                  0.0 %   372
buf0buf.c:1221                  0.0 %   462
trx0sys.c:196                   0.0 %   564
dict0dict.c:739                 0.0 %   660
trx0purge.c:246                 0.0 %   2146
srv0srv.c:1027                  0.0 %   3079
fil0fil.c:1303                  0.0 %   14628
dict0dict.c:716                 0.0 %   22363
log0log.c:775                   0.0 %   40086
srv0start.c:1301                0.0 %   189856
log0log.c:832                   0.0 %   345967
ibuf0ibuf.c:531                 0.0 %   353548
combined buf0buf.c:935          0.1 %   803521
fil0fil.c:1645                  0.1 %   912159
ibuf0ibuf.c:534                 0.4 %   3467068
dict0dict.c:1750                1.3 %   10434073
log0log.c:771                   1.9 %   14899464
btr0sea.c:178                   4.1 %   32118152
buf0buf.c:1186                  14.5%   113179298
combined buf0buf.c:936          35.9%   279594950
srv0srv.c:1024                  41.5%   323266475
@roman-vynar
Copy link
Contributor

Related #135

@SuperQ
Copy link
Member

SuperQ commented Aug 1, 2016

Great, thanks for the detailed feature request.

@siavashs
Copy link
Contributor

I can look into this. Should we call the metric engine_innodb_mutex_os_waits_total{name="<name>"}?

@siavashs
Copy link
Contributor

I did a short test on the following servers:

  • 5.7.20 MySQL Community Server (GPL)
  • 5.7.19-17 Percona Server (GPL), Release '17', Revision 'e19a6b7b73f'
  • 5.5.5-10.2.9-MariaDB-10.2.9+maria~jessie mariadb.org binary distribution

The outputs seem consistent but different from what was previously mention on this issue, for example:

+--------+---------------------------+-----------------------------+
| Type   | Name                      | Status                      |
+--------+---------------------------+-----------------------------+
| InnoDB | SRV_SYS                   | spins=682,waits=22,calls=66 |
| InnoDB | TRX_SYS                   | spins=31,waits=1,calls=47   |
| InnoDB | rwlock: dict0dict.cc:1112 | waits=1                     |
| InnoDB | rwlock: log0log.cc:753    | waits=9                     |
+--------+---------------------------+-----------------------------+

so os_waits is waits now.
Also names like SRV_SYS, TRX_SYS, ... have multiple statuses. Should we also parse these rows and expose them as multiple metrics?

@grypyrg
Copy link
Author

grypyrg commented Oct 20, 2017

Also names like SRV_SYS, TRX_SYS, ... have multiple statuses. Should we also parse these rows and expose them as multiple metrics?

@siavashs : Ideally, yes.

@siavashs
Copy link
Contributor

According to MySQL documentation the status metrics were changed as of MySQL 5.7.8.
I think the best way to handle these metrics for current and future versions of MySQL is to define them like this:

engine_innodb_mutex_total{name="<name>", status="<status>"}

@siavashs
Copy link
Contributor

Some samples:

From my local test instance(with spins, calls):

# HELP mysql_engine_innodb_mutex_total InnoDB mutex and rw-lock statistics
# TYPE mysql_engine_innodb_mutex_total counter
mysql_engine_innodb_mutex_total{name="BUF_POOL_FLUSH_STATE",status="calls"} 273
mysql_engine_innodb_mutex_total{name="BUF_POOL_FLUSH_STATE",status="spins"} 467
mysql_engine_innodb_mutex_total{name="BUF_POOL_FLUSH_STATE",status="waits"} 2
mysql_engine_innodb_mutex_total{name="FIL_SYSTEM",status="calls"} 1
mysql_engine_innodb_mutex_total{name="FIL_SYSTEM",status="spins"} 3
mysql_engine_innodb_mutex_total{name="FIL_SYSTEM",status="waits"} 0
mysql_engine_innodb_mutex_total{name="FLUSH_LIST",status="calls"} 11
mysql_engine_innodb_mutex_total{name="FLUSH_LIST",status="spins"} 5
mysql_engine_innodb_mutex_total{name="FLUSH_LIST",status="waits"} 0
mysql_engine_innodb_mutex_total{name="LOG_SYS",status="calls"} 104
mysql_engine_innodb_mutex_total{name="LOG_SYS",status="spins"} 221
mysql_engine_innodb_mutex_total{name="LOG_SYS",status="waits"} 0
mysql_engine_innodb_mutex_total{name="SRV_SYS_TASKS",status="calls"} 3
mysql_engine_innodb_mutex_total{name="SRV_SYS_TASKS",status="spins"} 6
mysql_engine_innodb_mutex_total{name="SRV_SYS_TASKS",status="waits"} 0
mysql_engine_innodb_mutex_total{name="rwlock: log0log.cc:901",status="waits"} 6

From a busy db(with duplicate names):

# HELP mysql_engine_innodb_mutex_total InnoDB mutex and rw-lock statistics
# TYPE mysql_engine_innodb_mutex_total counter
mysql_engine_innodb_mutex_total{name="rwlock: btr0sea.cc:230",status="waits"} 689290
mysql_engine_innodb_mutex_total{name="rwlock: dict0dict.cc:1188",status="waits"} 7335
mysql_engine_innodb_mutex_total{name="rwlock: dict0dict.cc:2691",status="waits"} 841071
mysql_engine_innodb_mutex_total{name="rwlock: fil0fil.cc:1392",status="waits"} 282042
mysql_engine_innodb_mutex_total{name="rwlock: hash0hash.cc:353",status="waits"} 6815
mysql_engine_innodb_mutex_total{name="rwlock: ibuf0ibuf.cc:575",status="waits"} 2181
mysql_engine_innodb_mutex_total{name="rwlock: log0log.cc:907",status="waits"} 21390
mysql_engine_innodb_mutex_total{name="sum rwlock: buf0buf.cc:1455",status="waits"} 5.549235e+06

From the second sample MySQL is showing sum rwlock: buf0buf.cc:1455, should we drop sum from the label values(since we have _total in metric name)?

@grypyrg
Copy link
Author

grypyrg commented Oct 23, 2017

From the second sample MySQL is showing sum rwlock: buf0buf.cc:1455, should we drop sum from the label values(since we have _total in metric name)?

Ideally every buf0buf.cc:1455 row gets it's own label, as each occurrence of it is actually a different mutex, but a different instance of the same one (for example every buffer pool instance has the same mutexes).
Being able to see the values separate shows you the difference between these mutexes. However, I don't see a good way to figure out just by having the output of SHOW ENGINE INNODB MUTEX.

Hence I suggested to just sum them up. In a way, it's also interesting to have a way to show that there were multiple instances and that is somehow done with adding sum to the name. Not sure if there's a better way to do it.

I suggest to follow this guide.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants