diff --git a/doc/tooling/index.rst b/doc/tooling/index.rst index b078a6a6a..218586f31 100644 --- a/doc/tooling/index.rst +++ b/doc/tooling/index.rst @@ -11,4 +11,5 @@ to work with Tarantool. tcm/index interactive_console luajit_memprof + luajit_sysprof luajit_getmetrics diff --git a/doc/tooling/luajit_sysprof.rst b/doc/tooling/luajit_sysprof.rst new file mode 100644 index 000000000..0c2b94192 --- /dev/null +++ b/doc/tooling/luajit_sysprof.rst @@ -0,0 +1,320 @@ +.. _luajit_sysprof: + +LuaJIT platform profiler +======================== + +The default profiling options for LuaJIT are not fine-grained enough to +give a full picture of performance. For example, ``perf`` is only +able to show the host stack, so all the Lua calls are seen as a single +``pcall()``. Conversely, the ``jit.p`` module provided with LuaJIT +is not able to give any information about the host stack. + +Starting from version :doc:`2.10.0 `, Tarantool +has a built-in module called ``misc.sysprof`` that implements a +LuaJIT sampling profiler (which we will just call *the profiler* +in this section). The profiler is able to capture both guest and +host stacks simultaneously, along with virtual machine states, so +it can show the whole picture. + +The following profiling modes are available: + +* **Default**: only virtual machine state counters. +* **Leaf**: shows the last frame on the stack. +* **Callchain**: performs a complete stack dump. + +The profiler comes with a default parser, which produces output in +a format suitable for ``flamegraph.pl``. + +.. contents:: + :local: + :depth: 2 + +.. _profiler_usage: + +Working with the profiler +------------------------- + +Using the profiler involves two steps: + +1. :ref:`Collecting ` a binary profile of + stacks (hereafter, *binary sampling profile*, or *binary profile* + for short). +2. :ref:`Parsing ` the collected binary + profile to get a human-readable profiling report. + +.. 
_profiler_usage_get: + +Collecting binary profile +~~~~~~~~~~~~~~~~~~~~~~~~~ + +To collect a binary profile for a particular part of the Lua and C code, +you need to place this part between two ``misc.sysprof`` functions, +namely, ``misc.sysprof.start()`` and ``misc.sysprof.stop()``, and +then execute the code under Tarantool. + +Below is a chunk of Lua code named ``test.lua`` to illustrate this. + +.. _profiler_usage_example01: + +.. code-block:: lua + :linenos: + + local function payload() + local function fib(n) + if n <= 1 then + return n + end + return fib(n - 1) + fib(n - 2) + end + return fib(32) + end + + payload() + + local res, err = misc.sysprof.start({mode = 'C', interval = 1, path = 'sysprof.bin'}) + assert(res, err) + + payload() + + res, err = misc.sysprof.stop() + assert(res, err) + +The Lua code for starting the profiler -- as in line 13 in the +``test.lua`` example above -- is: + +.. code-block:: lua + + local res, err = misc.sysprof.start({mode = 'C', interval = 1, path = 'sysprof.bin'}) + +where ``mode`` is a profiling mode, ``interval`` is a sampling interval in +milliseconds, and ``path`` is the name of the binary file (here, ``sysprof.bin``) +where profiling events are written. + +If the operation fails, for example if it is not possible to open +a file for writing or if the profiler is already running, +``misc.sysprof.start()`` returns ``nil`` as the first result, +an error-message string as the second result, +and a system-dependent error code number as the third result. +If the operation succeeds, ``misc.sysprof.start()`` returns ``true``. + +The Lua code for stopping the profiler -- as in line 18 in the +``test.lua`` example above -- is: + +.. 
code-block:: lua + + local res, err = misc.sysprof.stop() + +If the operation fails, for example if there is an error when the +file descriptor is being closed or if there is a failure during +reporting, ``misc.sysprof.stop()`` returns ``nil`` as the first +result, an error-message string as the second result, +and a system-dependent error code number as the third result. +If the operation succeeds, ``misc.sysprof.stop()`` returns ``true``. + +.. _profiler_usage_generate: + +To generate the file with the platform profile in binary format +(in the :ref:`test.lua code example above ` +the file name is ``sysprof.bin``), execute the code under Tarantool: + +.. code-block:: console + + $ tarantool test.lua + +Tarantool collects the profiling events in ``sysprof.bin``, puts +the file in its :ref:`working directory `, +and closes the session. + +.. _profiler_usage_parse: + +Parsing binary profile and generating profiling report +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. _profiler_usage_parse_command: + +After getting the platform profile in binary format, the next step is +to parse it to get a human-readable profiling report. You can do this +with Tarantool by using the following commands +(mind the hyphen ``-`` before the filename): + +.. code-block:: console + + $ tarantool -e 'require("sysprof")(arg)' - sysprof.bin > tmp + $ curl -O https://raw.githubusercontent.com/brendangregg/FlameGraph/refs/heads/master/flamegraph.pl + $ perl flamegraph.pl tmp > sysprof.svg + +where ``sysprof.bin`` is the binary profile +:ref:`generated earlier ` by ``tarantool test.lua``. +(Warning: there is a slight behavior change here: the ``tarantool -e ...`` +command was slightly different in Tarantool versions prior to Tarantool 2.8.1.) +The resulting SVG image contains a flame graph of the collected stacks and can be +opened in any modern web browser for analysis. 
+ +As for investigating the Lua code with the help of profiling reports, +it is always code-dependent and there can't be one hundred percent definite +recommendations in this regard. Nevertheless, you can find some hints +in the :ref:`Profiling report analysis example ` later. + +.. _profiler_api: + +The C API +~~~~~~~~~ + +The platform profiler provides a low-level C interface: + +.. code-block:: c + + /* Sets the writer function for sysprof. */ + int luaM_sysprof_set_writer(sp_writer writer); + + /* Sets the on-stop callback for sysprof to clear resources. */ + int luaM_sysprof_set_on_stop(sp_on_stop on_stop); + + /* + ** Sets the backtracing function. If the backtracer argument is NULL, + ** the default backtracer is set. + */ + int luaM_sysprof_set_backtracer(sp_backtracer backtracer); + + int luaM_sysprof_start(lua_State *L, const struct luam_Sysprof_Options *opt); + + int luaM_sysprof_stop(lua_State *L); + + /* Writes profiling counters for each vmstate. */ + int luaM_sysprof_report(struct luam_Sysprof_Counters *counters); + +All of the functions return 0 on success and an error code on failure. + +There is no need to call the configuration functions again if you start and +stop the profiler several times in a single program. Configuring sysprof is not +necessary for the default mode, but one MUST configure it for the other modes. + +The configuration C types are: + +.. code-block:: c + + /* Profiler configurations. */ + /* + ** Writer function for profile events. Must be async-safe, see also + ** `man 7 signal-safety`. + ** Should return amount of written bytes on success or zero in case of error. + ** Setting *data to NULL means end of profiling. + ** For details see . + */ + + typedef size_t (*sp_writer)(const void **data, size_t len, void *ctx); + /* + ** Callback on profiler stopping. Required for correctly cleaning + ** at VM finalization when profiler is still running. + ** Returns zero on success. 
+ */ + typedef int (*sp_on_stop)(void *ctx, uint8_t *buf); + /* + ** Backtracing function for the host stack. Should call `frame_writer` on + ** each frame in the stack in the order from the stack top to the stack + ** bottom. The `frame_writer` function is implemented inside the sysprof + ** and will be passed to the `backtracer` function. If `frame_writer` returns + ** NULL, backtracing should be stopped. If `frame_writer` returns a non-NULL + ** value, the backtracing should be continued if there are frames left. + */ + typedef void (*sp_backtracer)(void *(*frame_writer)(int frame_no, void *addr)); + +The profiler options are the following: + +.. code-block:: c + + struct luam_Sysprof_Options { + /* Profiling mode. */ + uint8_t mode; + /* Sampling interval in msec. */ + uint64_t interval; + /* Custom buffer to write data. */ + uint8_t *buf; + /* The buffer's size. */ + size_t len; + /* Context for the profile writer and final callback. */ + void *ctx; + }; + +Profiling modes: + +.. code-block:: c + + /* + ** DEFAULT mode collects only data for luam_Sysprof_Counters, which is stored + ** in memory and can be collected with luaM_sysprof_report after the profiler + ** stops. + */ + #define LUAM_SYSPROF_DEFAULT 0 + /* + ** LEAF mode = DEFAULT + streams samples with only top frames of host and + ** guest stacks in format described in + */ + #define LUAM_SYSPROF_LEAF 1 + /* + ** CALLGRAPH mode = DEFAULT + streams samples with full callchains of host + ** and guest stacks in format described in + */ + #define LUAM_SYSPROF_CALLGRAPH 2 + +The counters structure for ``luaM_sysprof_report()``: + +.. code-block:: c + + struct luam_Sysprof_Counters { + uint64_t vmst_interp; + uint64_t vmst_lfunc; + uint64_t vmst_ffunc; + uint64_t vmst_cfunc; + uint64_t vmst_gc; + uint64_t vmst_exit; + uint64_t vmst_record; + uint64_t vmst_opt; + uint64_t vmst_asm; + uint64_t vmst_trace; + /* + ** XXX: Order of vmst counters is important: it should be the same as the + ** order of the vmstates. 
+ */ + uint64_t samples; + }; + +Caveats: + +* Providing writers, backtracers, etc. in the Default mode is pointless, since + it only collects counters. +* There is NO default configuration for sysprof outside the Default mode, so the + configuration functions (``luaM_sysprof_set_writer()`` and the others listed above) + must be called before the first run of sysprof. Mind the async-safety. + +The Lua API +~~~~~~~~~~~ + +* ``misc.sysprof.start(opts)`` +* ``misc.sysprof.stop()`` +* ``misc.sysprof.report()`` + +The first two functions return ``res`` and ``err``: on success, ``res`` is +``true`` and ``err`` is ``nil``; on failure, ``err`` contains an error message. + +``misc.sysprof.report`` returns a Lua table containing the +following counters: + +.. code-block:: text + + { + "samples" = int, + "INTERP" = int, + "LFUNC" = int, + "FFUNC" = int, + "CFUNC" = int, + "GC" = int, + "EXIT" = int, + "RECORD" = int, + "OPT" = int, + "ASM" = int, + "TRACE" = int + } + +The ``opts`` parameter of ``misc.sysprof.start`` can contain the +following fields: + +.. code-block:: text + + { + mode = 'D'/'L'/'C', -- 'D' = DEFAULT, 'L' = LEAF, 'C' = CALLGRAPH + interval = 10, -- sampling interval in msec. + path = '/path/to/file' -- location to store profile data. + } + +``mode`` MUST always be provided; ``interval`` and ``path`` are optional. +The default interval is 10 msec, and the default path is ``sysprof.bin``.
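The ``sp_writer`` contract from the C API section (async-signal-safe, returns the number of written bytes or zero on error, sets ``*data`` to NULL to end profiling) can be illustrated with a minimal sketch. It is not the profiler's built-in writer. The ``fd_writer_ctx`` structure and the ``fd_writer`` function are hypothetical names introduced here, and the ``sp_writer`` typedef is copied from the listing above so that the fragment compiles without the LuaJIT headers.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Copied from the C API listing above; normally provided by the LuaJIT headers. */
typedef size_t (*sp_writer)(const void **data, size_t len, void *ctx);

/* Hypothetical writer context: a file descriptor opened by the caller. */
struct fd_writer_ctx {
	int fd;
};

/*
** Hypothetical writer: flushes `len` bytes starting at `*data` to ctx->fd.
** It calls only write(2), which is async-signal-safe. On success it returns
** `len`. On an unrecoverable error it sets `*data` to NULL (end of
** profiling) and returns zero.
*/
static size_t fd_writer(const void **data, size_t len, void *ctx)
{
	const struct fd_writer_ctx *wctx = ctx;
	const uint8_t *pos = *data;
	size_t total = 0;

	while (total < len) {
		const ssize_t n = write(wctx->fd, pos + total, len - total);
		if (n > 0) {
			/* Partial write: continue from where it stopped. */
			total += (size_t)n;
			continue;
		}
		if (n < 0 && errno == EINTR)
			continue; /* Interrupted by a signal: retry. */
		*data = NULL; /* Ask the profiler to stop. */
		return 0;
	}
	return total;
}
```

Under these assumptions, such a function would be registered with ``luaM_sysprof_set_writer(fd_writer)``, and a pointer to a ``struct fd_writer_ctx`` would be passed through the ``ctx`` field of ``struct luam_Sysprof_Options``. Restricting the writer to ``write(2)`` keeps it safe to call from the sampling signal handler.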