LuaJIT platform profiler documentation
Introduce a new document on LuaJIT platform profiler

* LuaJIT platform profiler is a new feature implemented in Tarantool 2.10.0.
  The document describes the profiler's behavior as of this and next Tarantool versions.
* The document is placed in the Tooling chapter.

Closes #2587
Showing 2 changed files with 321 additions and 0 deletions.
@@ -11,4 +11,5 @@ to work with Tarantool. | |
tcm/index | ||
interactive_console | ||
luajit_memprof | ||
luajit_sysprof | ||
luajit_getmetrics |
@@ -0,0 +1,320 @@ | ||
.. _luajit_sysprof: | ||
|
||
LuaJIT platform profiler | ||
======================== | ||
|
||
The default profiling options for LuaJIT are not fine-grained enough to
give a full understanding of performance. For example, ``perf`` is only
able to show the host stack, so all the Lua calls are seen as a single
``pcall()``. Conversely, the ``jit.p`` module provided with LuaJIT
is not able to give any information about the host stack.
|
||
Starting from version :doc:`2.10.0 </release/2.10.0>`, Tarantool | ||
has a built-in module called ``misc.sysprof`` that implements a
LuaJIT sampling profiler (which we will just call *the profiler* | ||
in this section). The profiler is able to capture both guest and | ||
host stacks simultaneously, along with virtual machine states, so | ||
it can show the whole picture. | ||
|
||
The following profiling modes are available: | ||
|
||
* **Default**: only virtual machine state counters. | ||
* **Leaf**: shows the last frame on the stack. | ||
* **Callchain**: performs a complete stack dump (``CALLGRAPH`` in the API).
|
||
The profiler comes with a default parser, which produces output in
a format suitable for ``flamegraph.pl``.
|
||
.. contents:: | ||
:local: | ||
:depth: 2 | ||
|
||
.. _profiler_usage: | ||
|
||
Working with the profiler | ||
------------------------- | ||
|
||
Usage of the profiler involves two steps: | ||
|
||
1. :ref:`Collecting <profiler_usage_get>` a binary profile of
   stacks (hereafter, *binary sampling profile* or *binary profile*
   for short).
2. :ref:`Parsing <profiler_usage_parse>` the collected binary
   profile to get a human-readable profiling report.
|
||
.. _profiler_usage_get: | ||
|
||
Collecting binary profile | ||
~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
To collect a binary profile for a particular part of the Lua and C code, | ||
you need to place this part between two ``misc.sysprof`` functions, | ||
namely, ``misc.sysprof.start()`` and ``misc.sysprof.stop()``, and | ||
then execute the code under Tarantool. | ||
|
||
Below is a chunk of Lua code named ``test.lua`` to illustrate this. | ||
|
||
.. _profiler_usage_example01: | ||
|
||
.. code-block:: lua
    :linenos:

    local function payload()
        local function fib(n)
            if n <= 1 then
                return n
            end
            return fib(n - 1) + fib(n - 2)
        end
        return fib(32)
    end
    payload()
    local res, err = misc.sysprof.start({mode = 'C', interval = 1, path = 'sysprof.bin'})
    assert(res, err)
    payload()
    res, err = misc.sysprof.stop()
    assert(res, err)

The Lua code for starting the profiler (line 11 in the
``test.lua`` example above) is:
|
||
.. code-block:: lua

    local res, err = misc.sysprof.start({mode = 'C', interval = 1, path = 'sysprof.bin'})

where ``mode`` is a profiling mode, ``interval`` is a sampling interval
in milliseconds, and ``path`` is the name of the binary file where
profiling events are written (``sysprof.bin`` in this example).
|
||
If the operation fails, for example if it is not possible to open | ||
a file for writing or if the profiler is already running, | ||
``misc.sysprof.start()`` returns ``nil`` as the first result, | ||
an error-message string as the second result, | ||
and a system-dependent error code number as the third result. | ||
If the operation succeeds, ``misc.sysprof.start()`` returns ``true``. | ||
|
||
The Lua code for stopping the profiler (line 14 in the
``test.lua`` example above) is:

.. code-block:: lua

    local res, err = misc.sysprof.stop()

If the operation fails, for example if there is an error when the
file descriptor is being closed or if there is a failure during
reporting, ``misc.sysprof.stop()`` returns ``nil`` as the first
result, an error-message string as the second result,
and a system-dependent error code number as the third result.
If the operation succeeds, ``misc.sysprof.stop()`` returns ``true``.
|
||
.. _profiler_usage_generate: | ||
|
||
To generate the file with the platform profile in binary format
(in the :ref:`test.lua code example above <profiler_usage_example01>`
the file name is ``sysprof.bin``), execute the code under Tarantool:

.. code-block:: console

    $ tarantool test.lua

Tarantool collects the profiling events in ``sysprof.bin``, puts
the file in its :ref:`working directory <cfg_basic-work_dir>`,
and closes the session.
|
||
.. _profiler_usage_parse: | ||
|
||
Parsing binary profile and generating profiling report | ||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
.. _profiler_usage_parse_command: | ||
|
||
After getting the platform profile in binary format, the next step is
to parse it to get a human-readable profiling report. You can do this
via Tarantool by using the following commands
(mind the hyphen ``-`` before the filename):

.. code-block:: console

    $ tarantool -e 'require("sysprof")(arg)' - sysprof.bin > tmp
    $ curl -O https://raw.githubusercontent.com/brendangregg/FlameGraph/refs/heads/master/flamegraph.pl
    $ perl flamegraph.pl tmp > sysprof.svg

where ``sysprof.bin`` is the binary profile
:ref:`generated earlier <profiler_usage_generate>` by ``tarantool test.lua``.
The first command writes the parsed profile to ``tmp``, the second
downloads ``flamegraph.pl``, and the third renders ``tmp`` as an SVG image.
(Warning: there is a slight behavior change here, the ``tarantool -e ...``
command was slightly different in Tarantool versions prior to Tarantool 2.8.1.)
The resulting SVG image contains a flame graph of the collected stacks and
can be opened in any modern web browser for analysis.
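
Before rendering, it can be useful to check that the parsed profile is not
empty. The following is a minimal sketch, assuming that ``tmp`` uses the
folded-stack convention read by ``flamegraph.pl``: one stack per line,
frames separated by semicolons, followed by a space and a sample count.
The function name ``folded_total`` is hypothetical and is not part of
Tarantool or FlameGraph.

.. code-block:: c

    #include <assert.h>
    #include <stdio.h>

    /*
    ** Sums the sample counts of a folded-stack stream. The count of a line
    ** is taken to be the run of digits that follows the last space on that
    ** line (assumed format: "frame;frame;frame COUNT").
    */
    unsigned long long folded_total(FILE *in)
    {
        unsigned long long total = 0;
        unsigned long long cur = 0;
        int in_number = 0; /* Nonzero while reading digits after a space. */
        int ch;

        assert(in != NULL);
        while ((ch = fgetc(in)) != EOF) {
            if (ch == '\n') {
                /* End of line: accept the pending number, if any. */
                if (in_number)
                    total += cur;
                in_number = 0;
                cur = 0;
            } else if (ch == ' ') {
                /* A new candidate count starts after every space. */
                in_number = 1;
                cur = 0;
            } else if (in_number && ch >= '0' && ch <= '9') {
                cur = cur * 10 + (unsigned long long)(ch - '0');
            } else {
                /* A non-digit means the space was inside a frame name. */
                in_number = 0;
            }
        }
        /* The last line may lack a trailing newline. */
        if (in_number)
            total += cur;
        return total;
    }

Calling ``folded_total()`` on a stream opened with ``fopen("tmp", "r")``
gives the total number of samples in the report. A result of zero suggests
that the profiled section was too short for the chosen sampling interval.
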
|
||
How to investigate Lua code with the help of profiling reports
always depends on the code itself, so there can be no universal
recommendations in this regard. Nevertheless, you can see some typical
observations in the :ref:`Profiling report analysis example <profiler_analysis>` later.
|
||
.. _profiler_api: | ||
|
||
The C API | ||
~~~~~~~~~ | ||
|
||
The platform profiler provides a low-level C interface: | ||
|
||
* ``int luaM_sysprof_set_writer(sp_writer writer)``. Sets the writer function for sysprof.
* ``int luaM_sysprof_set_on_stop(sp_on_stop on_stop)``. Sets the on-stop callback for sysprof to clear resources.
* ``int luaM_sysprof_set_backtracer(sp_backtracer backtracer)``. Sets the backtracing function.
  If the ``backtracer`` argument is ``NULL``, the default backtracer is set.

There is no need to call the configuration functions more than once if you
start and stop the profiler several times in a single program. Also, it is
not necessary to configure sysprof for the Default mode; however, you MUST
configure it for the other modes.

* ``int luaM_sysprof_start(lua_State *L, const struct luam_Sysprof_Options *opt)``
* ``int luaM_sysprof_stop(lua_State *L)``
* ``int luaM_sysprof_report(struct luam_Sysprof_Counters *counters)``. Writes profiling counters for each vmstate.

All of the functions return 0 on success and an error code on failure.
|
||
The configuration C types are: | ||
|
||
.. code-block:: c

    /* Profiler configurations. */
    /*
    ** Writer function for profile events. Must be async-safe, see also
    ** `man 7 signal-safety`.
    ** Should return amount of written bytes on success or zero in case of error.
    ** Setting *data to NULL means end of profiling.
    ** For details see <lj_wbuf.h>.
    */
    typedef size_t (*sp_writer)(const void **data, size_t len, void *ctx);
    /*
    ** Callback on profiler stopping. Required for correctly cleaning
    ** at VM finalization when profiler is still running.
    ** Returns zero on success.
    */
    typedef int (*sp_on_stop)(void *ctx, uint8_t *buf);
    /*
    ** Backtracing function for the host stack. Should call `frame_writer` on
    ** each frame in the stack in the order from the stack top to the stack
    ** bottom. The `frame_writer` function is implemented inside the sysprof
    ** and will be passed to the `backtracer` function. If `frame_writer` returns
    ** NULL, backtracing should be stopped. If `frame_writer` returns not NULL,
    ** the backtracing should be continued if there are frames left.
    */
    typedef void (*sp_backtracer)(void *(*frame_writer)(int frame_no, void *addr));

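
The writer contract described in the comments above can be illustrated with
a minimal sketch. It assumes a hypothetical context structure,
``fd_writer_ctx``, that holds an already opened file descriptor (the
document does not prescribe what ``ctx`` points to), and it repeats the
``sp_writer`` typedef locally so that it compiles without the LuaJIT
headers. The only system call used is ``write(2)``, which is
async-signal-safe.

.. code-block:: c

    #include <assert.h>
    #include <errno.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <unistd.h>

    /* Local copy of the typedef quoted above, for a standalone build. */
    typedef size_t (*sp_writer)(const void **data, size_t len, void *ctx);

    /* Hypothetical writer context: just an open file descriptor. */
    struct fd_writer_ctx {
        int fd;
    };

    /*
    ** Writes `len` bytes from `*data` to the descriptor stored in `ctx`.
    ** Returns the number of bytes written, or zero on error. On error,
    ** `*data` is set to NULL to signal the end of profiling.
    */
    size_t fd_writer(const void **data, size_t len, void *ctx)
    {
        const struct fd_writer_ctx *wctx = ctx;
        const uint8_t *pos = *data;
        size_t total = 0;

        while (total < len) {
            ssize_t n = write(wctx->fd, pos + total, len - total);
            if (n < 0 && errno == EINTR)
                continue; /* Interrupted by a signal: retry. */
            if (n <= 0) {
                *data = NULL; /* Tell the profiler to stop streaming. */
                return 0;
            }
            total += (size_t)n;
        }
        return total;
    }

A function with this signature could then be passed to
``luaM_sysprof_set_writer()``, with the context supplied through the
``ctx`` field of the profiler options. Retrying on ``EINTR`` matters here
because the profiler is signal-driven, so interrupted writes are to be
expected.
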
Profiler options are the following: | ||
|
||
.. code-block:: c

    struct luam_Sysprof_Options {
        /* Profiling mode. */
        uint8_t mode;
        /* Sampling interval in msec. */
        uint64_t interval;
        /* Custom buffer to write data. */
        uint8_t *buf;
        /* The buffer's size. */
        size_t len;
        /* Context for the profile writer and final callback. */
        void *ctx;
    };

Profiling modes:
|
||
.. code-block:: c

    /*
    ** DEFAULT mode collects only data for luam_Sysprof_Counters, which is stored
    ** in memory and can be collected with luaM_sysprof_report after profiler
    ** stops.
    */
    #define LUAM_SYSPROF_DEFAULT 0
    /*
    ** LEAF mode = DEFAULT + streams samples with only top frames of host and
    ** guest stacks in format described in <lj_sysprof.h>
    */
    #define LUAM_SYSPROF_LEAF 1
    /*
    ** CALLGRAPH mode = DEFAULT + streams samples with full callchains of host
    ** and guest stacks in format described in <lj_sysprof.h>
    */
    #define LUAM_SYSPROF_CALLGRAPH 2

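
To show how the options structure and the mode constants fit together,
here is a minimal sketch of a helper that fills ``luam_Sysprof_Options``.
The helper ``make_sysprof_options`` is hypothetical and not part of the
API, and the structure and constants are repeated locally so that the
sketch compiles on its own. The sanity checks (a non-zero interval and a
non-empty buffer for the streaming modes) are assumptions about what the
profiler expects rather than documented requirements.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Local copies of the declarations quoted above. */
    #define LUAM_SYSPROF_DEFAULT 0
    #define LUAM_SYSPROF_LEAF 1
    #define LUAM_SYSPROF_CALLGRAPH 2

    struct luam_Sysprof_Options {
        uint8_t mode;
        uint64_t interval;
        uint8_t *buf;
        size_t len;
        void *ctx;
    };

    /*
    ** Hypothetical helper: fills `opt` and returns 0, or returns -1 if the
    ** arguments look unusable (the checks are assumptions, see above).
    */
    int make_sysprof_options(struct luam_Sysprof_Options *opt, uint8_t mode,
                             uint64_t interval_ms, uint8_t *buf, size_t len,
                             void *ctx)
    {
        if (opt == NULL || mode > LUAM_SYSPROF_CALLGRAPH || interval_ms == 0)
            return -1;
        /* LEAF and CALLGRAPH stream samples, so they need a buffer. */
        if (mode != LUAM_SYSPROF_DEFAULT && (buf == NULL || len == 0))
            return -1;
        opt->mode = mode;
        opt->interval = interval_ms;
        opt->buf = buf;
        opt->len = len;
        opt->ctx = ctx;
        return 0;
    }

The filled structure would then be passed to ``luaM_sysprof_start()``
together with the Lua state. Keeping validation in one place makes it
easier to tell a misconfiguration apart from an error reported by the
profiler itself.
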
Counters structure for ``luaM_sysprof_report``:
|
||
.. code-block:: c

    struct luam_Sysprof_Counters {
        uint64_t vmst_interp;
        uint64_t vmst_lfunc;
        uint64_t vmst_ffunc;
        uint64_t vmst_cfunc;
        uint64_t vmst_gc;
        uint64_t vmst_exit;
        uint64_t vmst_record;
        uint64_t vmst_opt;
        uint64_t vmst_asm;
        uint64_t vmst_trace;
        /*
        ** XXX: Order of vmst counters is important: it should be the same as the
        ** order of the vmstates.
        */
        uint64_t samples;
    };

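
The raw counters are easier to read as shares of the total number of
samples. The sketch below assumes that the structure has already been
filled, for example by ``luaM_sysprof_report()``, and repeats its
declaration locally so that it compiles without the LuaJIT headers. The
functions ``vmstate_share`` and ``print_vmstate_report`` are hypothetical
helpers, not part of the API.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Local copy of the structure quoted above. */
    struct luam_Sysprof_Counters {
        uint64_t vmst_interp;
        uint64_t vmst_lfunc;
        uint64_t vmst_ffunc;
        uint64_t vmst_cfunc;
        uint64_t vmst_gc;
        uint64_t vmst_exit;
        uint64_t vmst_record;
        uint64_t vmst_opt;
        uint64_t vmst_asm;
        uint64_t vmst_trace;
        uint64_t samples;
    };

    /* Percentage of `samples` spent in one state; 0 if nothing was sampled. */
    double vmstate_share(uint64_t count, uint64_t samples)
    {
        if (samples == 0)
            return 0.0;
        return 100.0 * (double)count / (double)samples;
    }

    /* Prints one line per VM state: name, raw counter, share in percent. */
    void print_vmstate_report(FILE *out, const struct luam_Sysprof_Counters *c)
    {
        const struct {
            const char *name;
            uint64_t value;
        } rows[] = {
            {"INTERP", c->vmst_interp}, {"LFUNC", c->vmst_lfunc},
            {"FFUNC", c->vmst_ffunc},   {"CFUNC", c->vmst_cfunc},
            {"GC", c->vmst_gc},         {"EXIT", c->vmst_exit},
            {"RECORD", c->vmst_record}, {"OPT", c->vmst_opt},
            {"ASM", c->vmst_asm},       {"TRACE", c->vmst_trace},
        };
        size_t i;

        assert(out != NULL && c != NULL);
        fprintf(out, "samples: %llu\n", (unsigned long long)c->samples);
        for (i = 0; i < sizeof(rows) / sizeof(rows[0]); i++) {
            fprintf(out, "%-7s %10llu %6.2f%%\n", rows[i].name,
                    (unsigned long long)rows[i].value,
                    vmstate_share(rows[i].value, c->samples));
        }
    }

The state names printed here follow the keys of the table returned by
``misc.sysprof.report()`` in the Lua API.
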
Caveats: | ||
|
||
* Providing writers, backtracers, and so on in the Default mode is pointless,
  since it only collects counters.
* There is NO default configuration for sysprof, so the ``luaM_Sysprof_Configure`` | ||
must be called before the first run of the sysprof. Mind the async-safety. | ||
|
||
The Lua API | ||
~~~~~~~~~~~ | ||
|
||
* ``misc.sysprof.start(opts)`` | ||
* ``misc.sysprof.stop()`` | ||
* ``misc.sysprof.report()`` | ||
|
||
The first two functions return two values, ``res`` and ``err``.
On success, ``res`` is ``true`` and ``err`` is ``nil``; on failure,
``res`` is ``nil`` and ``err`` contains an error message.
|
||
``misc.sysprof.report`` returns a Lua table containing the | ||
following counters: | ||
|
||
.. code-block:: text

    {
        "samples" = int,
        "INTERP" = int,
        "LFUNC" = int,
        "FFUNC" = int,
        "CFUNC" = int,
        "GC" = int,
        "EXIT" = int,
        "RECORD" = int,
        "OPT" = int,
        "ASM" = int,
        "TRACE" = int
    }

The ``opts`` table passed to ``misc.sysprof.start`` can contain the
following fields:
|
||
.. code-block:: lua

    {
        mode = 'D'/'L'/'C', -- 'D' = DEFAULT, 'L' = LEAF, 'C' = CALLGRAPH
        interval = 10, -- sampling interval in msec.
        path = '/path/to/file' -- location to store profile data.
    }

``mode`` must always be provided; ``interval`` and ``path`` are optional.
The default interval is 10 msec, and the default path is ``sysprof.bin``.