-
-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce builtin to access percpu kernel data #3596
Conversation
When emitting debug info for global variables, emit real types instead of types that are internally used in codegen. This mainly applies to: - pointers which are internally represented as integers - structs which are internally represented as byte arrays Similarly to kernel functions, we will need to emit debug info for global variables that matches their BTF in the kernel so we need the actual types.
When emitting debug info for structs (which will be translated into BTF), the name *must not* contain the "struct" (or "union") prefix, otherwise our emitted debug info will not match the kernel BTF (which doesn't use the prefix) and the programs where the match is required will be rejected.
c21ec0f
to
85a2603
Compare
85a2603
to
570eb87
Compare
Since I don't generally like adding new builtins, I've got to ask: what are the benefits of this over reading from the kernel PERCPU data manually, a la: https://github.com/sched-ext/scx/blob/912d6e01c1538cb5ec7e6363332caef66f7765ba/scheds/rust/scx_layered/integration/layer_llc.bt#L22-L27 No bpftrace changes required and with user defined functions, the above approach could be wrapped in a simple stdlib function to make it easier to use. |
570eb87
to
cdf2075
Compare
Mostly convenience, at the moment. It will take a while until we get user-defined functions and a standard library and writing this manually is much more code. On top of that, we get some verifier checks for free, e.g. the mentioned check that users cannot do pointer arithmetic on the returned value without a prior null-check. For stdlib bpftrace function, we'd have to define those checks on our own. That being said, I understand you reluctance to add new helpers but I feel like the standard library is still quite some time away and, in the meantime, we'll need to address some real-world use-cases which require new builtins. I'm wondering if we could establish a plan that once the standard library is in place, we will remove some (potentially many) builtin functions in favor of stdlib functions? That would allow us to add builtins for the time being but we would also be transparent about the fact that they will go away from the language eventually. |
cdf2075
to
138d142
Compare
I can get behind this 👍 |
Is it possible for this to be seamless for users? Maybe we could detect what "builtin" functions are being used and automatically include them from the standard library (if there are no conflicts with other user-defined functions) |
Yeah, something like that. It very much depends on the include model that we end up with but I'm sure we'll be able to find a solution which is transparent for users. |
CallInst *IRBuilderBPF::CreatePerCpuPtr(Value *var, | ||
Value *cpu, | ||
const location &loc) | ||
{ | ||
// void *bpf_per_cpu_ptr(const void *percpu_ptr, u32 cpu) | ||
// Return: | ||
// A pointer pointing to the kernel percpu variable on | ||
// cpu, or NULL, if cpu is invalid. | ||
FunctionType *percpuptr_func_type = FunctionType::get( | ||
GET_PTR_TY(), { GET_PTR_TY(), getInt64Ty() }, false); | ||
return CreateHelperCall(libbpf::BPF_FUNC_per_cpu_ptr, | ||
percpuptr_func_type, | ||
{ var, cpu }, | ||
"per_cpu_ptr", | ||
&loc); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like the bpf_this_cpu_ptr()
variant can never return null according to docs. Given that user needs to do null checks if accessing fields, probably worthwhile to use the this
variant when possible. So less null checks necessary. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, absolutely, I missed that one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we can bake this into a semantic_analyser check. It's a bit confusing to users that they have to null check one variant and not the other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not easy, I think. First, I wouldn't put it to semantic analyser but rather to a separate pass. And there, you'd have to introduce a context-sensitive analysis - you'd have to check that all field accesses of the pointer returned by the helper are done within a context where the pointer is not equal to NULL. I can't see a simple way of doing that.
Alternatively, we could inject the check ourselves into the AST or in codegen. But then, what would be the value for the other branch? Zero? That's a valid value for some fields so we would basically give users incorrect values.
The good thing here is that bpftrace will immediately inform you about the missing check so users don't need to remember anything, just add the check when the tool requires it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The good thing here is that bpftrace will immediately inform you about the missing check so users don't need to remember anything, just add the check when the tool requires it.
Yeah that's true. Maybe just update the adoc to say you don't have to null check when you are not supplying the cpu in the second arg.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like something we can improve on in the verifier. If the CPU number is statically known and in the range of valid CPUs, verifier should be able to let user omit null check. cc @4ast
src/ast/passes/codegen_llvm.cpp
Outdated
std::string err; | ||
auto type = bpftrace_.btf_->get_var_type(var_name); | ||
if (type.IsNoneTy()) { | ||
throw FatalUserException("Could not resolve variable \"" + var_name + | ||
"\" from BTF"); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be done in semantic analyser so there's highlighting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, makes sense.
tests/runtime/call
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test missing symbol also?
138d142
to
64ee57e
Compare
64ee57e
to
f36ace9
Compare
tests/semantic_analyser.cpp
Outdated
@@ -1366,6 +1366,24 @@ TEST(semantic_analyser, call_kaddr) | |||
test("kprobe:f { kaddr(123); }", 1); | |||
} | |||
|
|||
TEST(semantic_analyser, call_percpu_kaddr) | |||
{ | |||
test("kprobe:f { percpu_kaddr(\"process_counts\"); }"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is process_counts
from host? I don't see it in btf data source
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it's from the host. It exists on every Linux system but it's still probably better to have it in data_source.
There exist "percpu" global variables in the kernel which contain a distinct value for each CPU. In BPF, access to these variables is done via the bpf_per_cpu_ptr and bpf_this_cpu_ptr helpers. Both accept a pointer to a percpu ksym and the former also accepts a CPU number. The ksym is an extern variable with a BTF entry matching the BTF of the corresponding kernel variable. Since we now use libbpf to do the loading, it is sufficient to emit a global variable declaration with a proper BTF and libbpf will take care of the rest. Introduce new bpftrace builtin percpu_kaddr to access the percpu data. The helper has two forms: percpu_kaddr("symbol_name") <- uses bpf_this_cpu_ptr percpu_kaddr("symbol_name", N) <- uses bpf_per_cpu_ptr where N is the CPU number. The former variant retrieves the value for the current CPU. A tricky part is that bpf_per_cpu_ptr may return NULL if the supplied CPU number is higher than the number of the CPUs. The BPF program should perform a NULL-check on the returned value, otherwise it is rejected by the verifier. In practice, this only happens if pointer arithmetics is used (i.e. a struct field is accessed). Since it is quite complex to detect a missing NULL-check in bpftrace, we instead let verifier do it and just display the potential verifier error in a nicer manner. The check if the global variable exists is done in semantic analyser to get better error highlighting. Therefore, for testing the new builtin in semantic analyser tests, we need to add a symbol (process_counts) into tests/data/data_source.c to get it into our mock BTF. The problem here is that pahole places only percpu variables into BTF (and none other) so the symbol must be in the ".data..percpu" section. To do that, we need to make data_source.o a relocatable file (using compiler's -c option), otherwise the linker would put the symbol back to ".data". This in turn breaks DWARF generation which seems to need a linked binary so we link data_source.o into data_source during the DWARF generation step.
f36ace9
to
efb61f6
Compare
All comments should be addressed. Adding |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Nice work!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yay, great to finally have this!
Introduce new builtin
percpu_kaddr
to access kernel percpu variables. It leverages thebpf_per_cpu_ptr
helper which accepts a pointer to a percpu ksym and a CPU number. The ksym is an extern variable with a BTF entry matching the BTF of the corresponding kernel variable. Since we now use libbpf to do the loading, it is sufficient to emit a global variable declaration with a proper BTF (debug info) and libbpf will take care of the rest.The helper has two forms:
where N is the CPU number. The former variant retrieves the value for the current CPU.
For instance, this allows to access the
process_counts
percpu variable:A tricky part is that
bpf_per_cpu_ptr
may return NULL if the supplied CPU number is higher than the number of the CPUs. The BPF program should perform a NULL-check on the returned value, otherwise it is rejected by the verifier. In practice, this only happens if pointer arithmetics is used (i.e. a struct field is accessed). Since it is quite complex to detect a missing NULL-check in bpftrace, we instead let verifier do it and just display the potential verifier error in a nicer manner:Resolves #1837.
Checklist
man/adoc/bpftrace.adoc
CHANGELOG.md