-
-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BTF for kernel modules #2315
BTF for kernel modules #2315
Conversation
7c2db59
to
e130b1b
Compare
This pull request fixes 1 alert when merging e130b1b into 05cfa8e - view on LGTM.com fixed alerts:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Functionality-wise, this is great! I do worry about how it scales with modules/complexity of BTF though, a lot of the work is eager/iterates over everything. There may be a future "make all this lazy" issue, especially if we start dumping module types.
Indeed, I have the same impression. I gave it some more thought and came with some observations/ideas:
Does this make sense? Any ideas how we could do even better? |
Okay, how about this:
This would save us from walking all BTF types, though we'd still have to iterate over all modules because of the ID, FD tuple.
I think there's value in this syntax extension regardless! However, I think it should be |
And what about parsing
While I see your point, I think we should stay consistent with other probe types. E.g. tracepoints have the form |
That's what I meant by "filter to attachable" above. I didn't realize it contains the module name, so, sure, we don't need kallsyms for this filtering. Most reasons to look at kallsyms are relevant for kprobe anyway, which doesn't matter here.
Sure, my opinion is very weakly held. 😃 Besides, this module specifier would be mostly useful for wildcard resolution (duplicate symbols are not something that people would normally encounter), in which case function/pattern at the end makes sense. |
e130b1b
to
a2be663
Compare
Version 2 with several major improvements:
More implementation details can be found in commit messages. The PR is now quite large, so I'm happy to break it if it simplifies reviewing. Note: there is a BPF verifier bug which causes incorrect attachment when dealing with multiple functions from different modules (or vmlinux) having the same name. In practice, it will always attach to the first function of that name found in kallsyms. I'm working on a fix but I think that we can proceed with this independently as the situation should occur rarely. If someone finds it useful, we can show a warning or mention it in the reference guide. |
a2be663
to
0ccc434
Compare
0ccc434
to
92f1d9f
Compare
586591f
to
9b6c4f8
Compare
9b6c4f8
to
834f846
Compare
This pull request fixes 1 alert when merging 834f846 into d470220 - view on LGTM.com fixed alerts:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the as-lazy-as-possible approach! The new interactions between FieldAnalyser and ClangParser were a little hard to figure out but I think that's just me not being familiar with the older code.
More implementation details can be found in commit messages.
I wish there was a nice architectural diagram somewhere that we could change as part of this rework. Oh well.
I'm very much of the opinion that we should merge this and bake it pre-release instead of trying to iron out every possible edge case in the PR.
When listing kfunc params, the params are indented with two spaces. The test checks that there exists a line starting with whitespaces, but that also matches the newline character from the previous line, leading to false negative results. Fix this by checking for spaces only (instead of any whitespace). Also, remove an incorrect grep call which would filter out the lines with parameters.
There are 2 places where we basically reimplement the same thing that libbpf does: - finding and parsing vmlinux BTF data - this is implemented by btf__load_vmlinux_btf - finding BTF id of a function type in BTF::resolve_args - this can be done using btf__find_by_name_kind Using the mentioned libbpf's functions allows to remove large portions of code.
This is a preparation step for supporting BTF for kernel modules. Several notable refactorings are done: - Operations based on iterating through all the types in vmlinux BTF are refactored in a way that the BTF to iterate may be supplied as a parameter. The original methods are called with the vmlinux BTF, i.e., there should be no functional change for now. At the same time, BTF::btf is renamed to BTF::vmlinux_btf. - A function for looking up a BTF entry is created. It'll be later expanded by searching the kernel modules BTF, too. - The above changes often require to pass a BTF id along with the source BTF containing the id definition. To facilitate this, a new type BTF::BTFId is created, which encapsulates these information.
When parsing /sys/kernel/debug/tracing/available_filter_functions, store also the module name for each symbol. Note that multiple modules may define a function with the same name. BPFtrace::traceable_funcs_ becomes a map function-->set of modules. This will be useful for handling BTF for modules as we'll know which modules are used and won't have to iterate BTF for all of them.
Almost all the features that require BTF are now supported for kernel modules. These include namely: - k(ret)func are available for module functions, including access to function arguments by name - listing of types defined in modules The only feature that is not supported is automatic resolution of types defined in modules. The reason is that this requires BTF type dumping, which is complicated for modules as we would have to deal with type redefinition conflicts. Also, for a lot of modules, dumping may take too much time, so it is better to resolve this differently anyway. The solution will come in the upcoming commits. There is a major change in the way BTF is parsed. Instead of parsing it directly from the constructor of BPFtrace, it is rather parsed on-demand (BTF is now stored inside a unique_ptr in the BPFtrace class). The reason for this is that we first parse the attachpoints, gather the list of kernel modules that are used, and then only parse BTF for those modules. This keeps the approach quite fast as we usually need to parse only several modules. BTF for vmlinux is parsed every time. The list of modules for the attached functions is obtained from the list of traceable functions (available_filter_functions from tracefs). When wildcards are used, this approach causes a chicken-and-egg problem in the probe matcher as it doesn't have BTF when parsing the attachpoints for the first time (to get the list of modules). So, we let it read the list of symbols for k(ret)func from available_filter_functions if BTF is not available. Note that this is not entirely precise (not all traceable functions have a BTF entry) but it is sufficient to get the list of modules.
Even though bpftrace represents enums using int type, the size of an enum must be parsed in a different way from BTF.
Resolve struct fields directly when parsing BTF instead of offloading the job to ClangParser. This allows to work with types whose definitions cannot be resolved using the Clang parser, e.g. types defined in kernel modules (as struct def dumping from module BTFs is not supported now). Bitfields and data_loc fields are not supported for now. Also, since querying BTF may now take longer with kernel modules, FieldAnalyser always tries to resolve types from DWARF first.
If ClangParser has nothing to resolve (i.e. there are no user includes, no user type definitions, and no types to resolve from BTF), it doesn't have to run, which will save some execution time. This requires some changes to FieldAnalyser, the most important one is that it only adds those types to BPFtrace::btf_set_ which it cannot resolve on its own. In practice, this means that ClangParser will run in a much smaller amount of cases than before. Note: this brings some speedup in FieldAnalyser thanks to the fact that it now tracks types using SizedType (instead of type name) and doesn't have to call BTF::type_of that often.
The tests check basic BTF-based features, namely attachement to kfunc, access to args by name, and resolution of struct types from BTF. The tests attach to the nft_trans_alloc_gfp function from the nf_tables module which is called every time a new nftables table is created. Each test creates a temporary table called "bpftrace" which is afterwards deleted. Requires nf_tables module to be loaded and 'nft' command to be installed in /usr/sbin/nft, otherwise the tests are skipped. The module is also manually loaded in CI to make sure that the tests are executed.
Since FieldAnalyser is getting more and more functionality, it makes sense to make a separate unit test suite for it and move existing tests there. There are 2 kinds of tests corresponding to the main responsibilities of FieldAnalyser: - Tests for resolution of function args from BTF/DWARF (originally located in SemanticAnalyser tests). - Tests for resolution of struct type fields from BTF/DWARF (originally located in ClangParser tests). These are also expanded to check that FieldAnalyser is now able to correctly parse struct types (without running ClangParser).
When BTF for a kfunc cannot be loaded, fail with appropriate error message instead of generic "loading program failed". In addition, when trying to attach to multiple attachpoints, do not fail hard if one of them does not exist (this behaviour is same to kprobes).
With BTF for kernel modules supported, it may be useful to allow users to specify the module along with the function name for kfuncs. This brings several new possibilities such as: - attaching to all functions in a given module - specifying a module in case there are multiple functions/types with the same name "vmlinux" can be used as module name to pull functions from vmlinux. Specifying the module is optional - if no module is specified, all loaded modules are searched. This should guarantee backwards compatibility of all existing scripts. The only visible change from the current behaviour is in function listing since the format "kfunc:module:function" is yielded now: # bpftrace -l 'kfunc:vfs_*' kfunc:vmlinux:vfs_cancel_lock kfunc:vmlinux:vfs_clean_context kfunc:vmlinux:vfs_cleanup_quota_inode ... This is also reflected in the reference guide and the manpage. There is one known issue - when trying to attach to multiple functions of the same name (in different modules), BPF verifier only allows to attach the first one and fails on attaching the others. We try to detect such a situation and ask the user to specify a module explicitly, to at least partially mitigate the issue.
The BTF class sets the libbpf print function which is not the right place to do it for 2 reasons: (1) when using BTF as a library, there is no reason to change libbpf's behaviour and (2) other parts of bpftrace use libbpf, too. Therefore, move setting of the print function to main.cpp.
834f846
to
f0a9daf
Compare
There is a nice picture in internals_development.md, although it will be partially outdated by now. Still, I can have a look into it while updating the development guide. |
This pull request fixes 1 alert when merging f0a9daf into 216d2cb - view on LGTM.com fixed alerts:
|
I read through most of this PR and it looks quite reasonable. |
Thanks for the reviews, let's not prolong this and merge. |
This introduces support for BTF information for kernel modules.
BTF for kernel modules is nowadays shipped by many distros (it is exposed via
/sys/kernel/btf/<module-name>
). Using it enables BTF-depending features for modules, in particular:This is quite tricky to test, though, as it either requires to insmod a custom module before running the tests or to rely on some common module that should be available on most systems. In the end, I went with the second way - the runtime tests hook into the
nft_trans_alloc_gfp
function from thenf_tables
module which is pretty common and is invoked by thenft
utility. The module is now also loaded in CI to make sure that the tests are executed there.This PR also features one important change - FieldAnalyser now fully resolves structure types from BTF/DWARF (including struct fields) and Clang parser is only called if there are some user-defined types/includes or if FieldAnalyser was not able to resolve something (for instance, it doesn't support bitfields at the moment). This should bring some execution speedup and is also crucial for type resolution from kernel modules since BTF dumping doesn't work properly for them.
In addition, there are several minor changes done:
REQUIRES
fields (all must be satisfied for the test to run)FieldAnalyser
were moved into a separate test suite (they were previously scattered betweenClangParser
andSemanticAnalyser
tests)Checklist
man/adoc/bpftrace.adoc
and if needed indocs/reference_guide.md
CHANGELOG.md