Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

uprobes: make C++ symbol demangling explicit #2657

Merged
merged 7 commits into from
Jul 20, 2023

Conversation

viktormalik
Copy link
Contributor

Introduce a new part of uprobe specification which allows to explicitly enable language-specific features. For now, C++ prefixes (:c++ and :cpp) are supported. Specifying one of these turns on symbol demangling which is now disabled by default.

Some examples of the current behaviour

# cat cpptest.cpp
extern "C" int fun() { return 1; }
void fun(int x) {}
void function() {}
class Foo { void fun() {} };
class Bar { void fun() {} };

# nm cpptest
0000000000401126 T fun
0000000000401148 T _Z3funi
0000000000401131 T _Z8functionv
0000000000401184 W _ZN3Bar3funEv
0000000000401178 W _ZN3Foo3funEv

# 
bpftrace -e 'u:./cpptest:fun { exit(); }'
Attaching 1 probe...    (C "fun" only)

bpftrace -e  'u:./cpptest:c++:fun { exit(); }
Attaching 3 probes...   ("fun", "fun(int)", "function")

bpftrace -e 'u:./cpptest:c++"fun(int)" { exit(); }' 
Attaching 1 probe...    ("fun(int)")

bpftrace -e 'u:./cpptest:c++:*fun { exit(); }'
Attaching 5 probes...   (all the functions)

bpftrace -e 'u:./cpptest:c++:"*::fun" { exit(); }'
Attaching 2 probes...   ("Foo::fun" and "Bar::fun")

While this is not perfect (e.g. the second example should not match function), it's better than what it was before (see #2186). I'd prefer using this as a starting point and then following with more PRs rather than doing everything at once.

Missing features include:

  • more sophisticated matching of mangled symbols (i.e. fix the above 2nd example)
  • better support for namespaces (e.g. support ::fun)
  • support for demangling in probe listing

Note that this is a breaking change b/c demangling is not done by default now. However, since it was never documented and it sometimes had incorrect and unexpected behaviour, I believe that this is the correct step forward.

Resolves #2186.

Checklist
  • Language changes are updated in man/adoc/bpftrace.adoc and if needed in docs/reference_guide.md
  • User-visible and non-trivial changes updated in CHANGELOG.md
  • The new behaviour is covered by tests

@viktormalik viktormalik requested review from ajor, danobi and fbs as code owners June 27, 2023 12:05
@mmarchini
Copy link
Contributor

overall LGTM, definitely a breaking change as you said (but that seems fine in this case). Have you looked into how much work it would be to fix the 2nd example? That's the only missing thing I think would be worth including in this PR if the fix is not a large patch.

@viktormalik
Copy link
Contributor Author

Have you looked into how much work it would be to fix the 2nd example? That's the only missing thing I think would be worth including in this PR if the fix is not a large patch.

For this particular case, it's as simple as adding an extra ( to the end of the query. There are other cases, though, that'll require more thinking. Let me have a look, if we manage to review and merge this in the meantime, I'll send the fix as a follow-up PR ASAP.

@viktormalik
Copy link
Contributor Author

Have you looked into how much work it would be to fix the 2nd example? That's the only missing thing I think would be worth including in this PR if the fix is not a large patch.

For this particular case, it's as simple as adding an extra ( to the end of the query. There are other cases, though, that'll require more thinking. Let me have a look, if we manage to review and merge this in the meantime, I'll send the fix as a follow-up PR ASAP.

It turned out not to be too complicated to fix the bugs that I know of, so I updated this PR (see 17beb2c).

src/parser.yy Outdated Show resolved Hide resolved
src/probe_matcher.cpp Outdated Show resolved Hide resolved
docs/reference_guide.md Outdated Show resolved Hide resolved
Add a new parameter to ProbeMatcher::get_matches_in_stream that allows
to control whether symbols should be demangled. For now, always set the
parameter to true.

The purpose of this is to prepare for an improved handling of symbol
demangling where we will only turn it on in certain cases.
The BPFtrace::resolve_uname function doesn't search for function symbols
which causes BPFtrace::add_probe to always perform a ProbeMatcher
search, even if the attach target exists in the binary. This is rather
inefficient so better expand the symbol search to functions, too.

Note that this is a potentially breaking change for binaries combining C
and C++ symbols. The reason is that if the attachment target precisely
matches a symbol name in the binary, the ProbeMatcher search will now be
skipped and we'll potentially not attach to some mangled C++ functions
that we attached to before. This situation should occur very rarely,
though.
Add a new part of probe definition which allows to specify the language
in which the traced binary is written. This will allow to enable
language-specific features.

For now, allow C++ prefix "cpp".
At this point, specifying the prefix has no effect. In future, giving
one of the above will allow C++ symbol demangling.

There are now four parts of uprobe specification and some of them are
optional, which may create tricky situations:
- 'uprobe:bin:func' attaches to 'bin' and uses no language-specific
  features,
- 'uprobe:bin:func ... -p PID' attaches to 'bin' with given PID (if the
  process runs 'bin') and uses no language-specific features,
- 'uprobe:cpp:func ... -p PID' attaches to PID and uses C++ features,
  this one is distinguished from the previous one by using one of the
  supported language prefixes as the second part of the attachpoint,
- 'uprobe:func ... -p PID' attaches to PID and uses no language-specific
  features.
The previous commit introduced a new part of uprobe specification - the
language prefix. For now, only C++ prefixes are supported.

This commits changes the behaviour of uprobes by only allowing symbol
demangling when the :cpp prefix is specified. This saves some
computation time and gives users a better control over bpftrace
behaviour. In addition, symbol demangling sometimes produces confusing
results - demangling support automatically adds leading and trailing
wildcards, which means that more functions than expected are sometimes
matched. Now, this behaviour does not happen for default uprobes.

Note that this is a breaking change b/c it is now required to specify
the :cpp prefix to enable symbol demangling (it was always enabled
previously).
When matching demangled symbols from the binary, we always add a
trailing wildcard to the search input b/c the demangled name contains
also the argument list (e.g. "fun(void)"). This, however, produces false
matches as "fun" would also match e.g. "function(void)".

This commit takes a different approach. Rather than adding a trailing
wildcard, remove the argument list from the demangled name that we're
matching against, unless the user explicitly specified the '(' character
in the search input meaning that he wants to match against function
arguments, too.

This allows to remove addition of the trailing wildcard. At the same
time, we remove addition of leading wildcard as it has no effect.

Expand unit tests to cover this situation.
For now adds just basic information about demangling and the C++ uprobe
prefix.
@danobi danobi merged commit 2a9db89 into bpftrace:master Jul 20, 2023
@viktormalik viktormalik deleted the cxx-demangling branch July 21, 2023 08:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Mangled symbols overlap cannot be disambiguated
3 participants