-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
libbpf-tools: Add experimental BTFGen integration #3889
libbpf-tools: Add experimental BTFGen integration #3889
Conversation
Hi! Thanks for working on this! I didn't manage to get it working locally for some reason (I think I followed your instructions but I got no BTFs). Makefile was also failing until I did But here are some overall thoughts:
where opts are bpf_object_open_opts. Now, to the question about bpftool. Given bpftool is now mirrored into Github repo, I think we should add bpftool as submodule and build it from sources. This bypasses any need to pre-checking it, any architecture questions. We'll just build it on demand. WDYT? |
d7a19d8
to
0b1b193
Compare
Perhaps you're running a system with an older glibc and bpftool can't be run there? Or you forgot to uncompress the BTF files?
That's weird, the Makefile should create it https://github.com/kinvolk/bcc/blob/mauricio/btfgen-integration/libbpf-tools/Makefile#L135
Even if I agree with this, I'm not in charge of BTFHub, let's ping @rafaeldtinoco on this one.
I tried to do it but hit a problem. I'm using a variable
to get the list of BTF files and create dependencies between them and the .bpf.o files. This variable is populated when the Makefile is read, so if I include a step to clone btfhub that variable will be empty. Right now I don't have a clear solution for this on mind, do you any idea?
I used
I agree. Is it better to receive an array of bytes or would it be better to receive the
I agree, it's not so nice to have a binary checked on repo. I'll update it once btfgen makes it to the mirror. |
No, it was my usual dev server, where everything BPF related works just fine, but I think I missed the uncompression step, which probably explains why I didn't get any BTFs packaged.
@rafaeldtinoco, any opinion? I'm trying to minimize the amount of data that needs to be downloaded during build if btfgen is enabled. Having per-architecture repository or one repository with per-architecture submodule (if that's more convenient) would make sure we don't download unnecessary gigabytes of data.
You are trying to do it declaratively though Makefile rules. It would be easier and simpler to do it imperatively. When we need to rebuild tools, do an explicit step of checking out or git pull'ing latest BTFs, then find all relevant files, minimize BTFs, and then link them into final binaries.
Yeah, I cross-compiling is always harder and should be solved later (if there is enough interest).
I think raw bytes are move convenient (especially for applications that will embed minimized BTFs as is, without compression). It's analogous to
@qmonnet can you please help with the sync of bpftool into Github to unblock this for btfgen stuff? |
Sure thing! I was waiting for your latest libbpf update, so I'd pick the |
5a95333
to
0057aa5
Compare
I handled some of the feedback and updated all the tools that are compatible with it. I'll try to handle missing comments shortly. |
0057aa5
to
7ad5b11
Compare
I ended up splitting the makefile in two, moving the btfgen specific logic to a new one. The first makefile clones and updates the btfhub repository (if needed) and btfgen.mk creates all "reduced" BTF files using declarative rules. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, this dropped off my radar for a while, sorry for later review.
I did the first pass, I intend to play a bit more with it.
One more thing besides my comments, I didn't notice any .gitignore changes, we'll need to make sure all the temporary stuff is gitignored properly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So few more things after I played with this locally.
-
Ignore what I said about .gitignore, you've put everything under .output, so all good.
-
Some dependencies are off in Makefile, I get a lot of errors similar to below:
/bin/sh: /data/users/andriin/bcc/libbpf-tools/.output/bpftool/bpftool: No such file or directory
make[1]: *** [btfgen.mk:29: /data/users/andriin/bcc/libbpf-tools/.output/min_core_btfs/amzn/2/x86_64/4.14.101-91.76.amzn2.x86_64.btf] Error 127
Please check that and fix.
But even when it does succeed building, I don't see _binary_min_core_btfs_tar_gz_start
symbol and running under GDB I get -EOPNOTSUPP from ensure_core_btf(). How did you test this? Am I missing some extra steps. I just did this:
$ make ENABLE_MIN_CORE_BTFS=1 -j90
@rafaeldtinoco, ping? Any thoughts? |
Sorry, was recovering from surgery, I'll visit this tomorrow and provide some feedback. |
7ad5b11
to
6c8a0a1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some dependencies are off in Makefile, I get a lot of errors similar to below:
There was a missing dependency on bpftool.
How did you test this?
I tested on a clean VM (Ubuntu 20.04.4) and the compilation worked fine by just using make ENABLE_MIN_CORE_BTFS=1 -j$(nproc)
.
Are the BTF files being generated?
$ find .output/min_core_btfs/ -name "*.btf" | xargs du -ch | tail -n 1
3.3M total
Are symbols present on .output/min_core_btf_tar.o
?
$ objdump --syms .output/min_core_btf_tar.o
.output/min_core_btf_tar.o: file format elf64-x86-64
SYMBOL TABLE:
0000000000000000 l d .data 0000000000000000 .data
0000000000015f39 g .data 0000000000000000 _binary_min_core_btfs_tar_gz_end
0000000000015f39 g *ABS* 0000000000000000 _binary_min_core_btfs_tar_gz_size
0000000000000000 g .data 0000000000000000 _binary_min_core_btfs_tar_gz_start
$ make ENABLE_MIN_CORE_BTFS=1 -j90
I'm only trying with 8 or 16 cores. Perhaps there was still some dependencies problems in the Makefile that don't have effect with a lower number of cores 🤔. Also, have you tried to run make clean
and compile again?
there were BTFs downloaded, but there were no symbols. After pulling your latest changes I do see symbols now, so much have been some race condition in Makefile or something. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's almost ready, just few remaining small issues.
@@ -1,4 +1,5 @@ | |||
/.output | |||
/btfhub-archive |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: let's call it just "btfhub"? archive implies it's outdated and not maintained anymore. So for consistency with MAkefile, let's call it "btf-hub" here and $BTF_HUB in Makefile?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about this one. There are two repositories related to btfhub: https://github.com/aquasecurity/btfhub & https://github.com/aquasecurity/btfhub-archive/. We are only using the later, so I think it's fine to call it BTFHUB_ARCHIVE to avoid confusion with the first one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's confusing, but ok, no problem
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, we've split the scripts/tooling/docs from the actual archive. There were major discussion where to keep the BTF files back then, and, to be honest, github was what made it to have a higher reach (among projects)... Unfortunately we don't keep different arches (maybe they could be kept as diff branches). I'm just afraid of breaking other projects (as we currently have 4 or 5 projects relying in BTFHUB now).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, separate branches is an interesting idea! You can still master branch that merge all the arch-specific branches each time you get an update. So no one should break, but anyone who wants only architecture-specific BTFs can do that without paying for all the other architectures.
6c8a0a1
to
360cbc3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is one more tiny leak of resources, please fix. But otherwise looks and seem to work great (I don't have system without BTF to check that part, though).
@davemarchevsky, @yonghong-song with that minor fix it should be ready to go in
360cbc3
to
6e8fa3f
Compare
Thank you @mika for pointing this PR. In general, project wide, do we have some table about the compatibility with different versions of external dependencies ? If I remember correct, @copyninja was already having some integration issues when building recent versions of |
LGTM @mauriciovasquezbernal can you rebase? Will press button after |
bpftool now has a mirror on Github[0], let's pull it as a submodule and compile when needed instead of shipping the bpftool binary directly. [0]: https://github.com/libbpf/bpftool Signed-off-by: Mauricio Vásquez <[email protected]
This commit adds an experimental BTFGen[0] integration to allow some of the libbpf-tools to run in systems that don't provide BTF information. The whole process consist of two parts: (1) generating and embedding the BTF file within the tools binary and (2) using those when the tool is run. The first part is done by using the Makefile, it generates the reduced BTF files for the different eBPF objects of the tools, those files are then compressed and a C header file with its content is created. The second part is handled by a new C file that provides the logic to uncompress and save the BTF file according to the Linux distribution and kernel version where the tools is being run. [0] https://lore.kernel.org/bpf/[email protected]/ Signed-off-by: Mauricio Vásquez <[email protected]>
Use save_min_core_btf() to uncompress and save the BTF file for the current system. If the file is available, then use it when opening the object. Signed-off-by: Mauricio Vásquez <[email protected]>
Signed-off-by: Mauricio Vásquez <[email protected]>
6e8fa3f
to
b95542c
Compare
@davemarchevsky I rebased and pushed. |
This PR enables an experimental integration with BTFGen. The goal of this is to extend the range of systems where these tools can be run, in this case to allow running some of the tools in systems that don't expose BTF information.
How to use?
Snip from README.md:
FAQ:
All the following numbers were gathered on a AMD Ryzen 7 3700X, 32GB RAM, SN750 NMVe system.
Why disabled by default?
Cloning btfhub-archive and generating the reduced BTF files requires some extra time. Since not all users are interested in this feature let's keep it disabled by default.
What's the compilation time overhead?
The overhead is about 30 seconds when compiling with
-j16
:What's the size overhead of this?
It's about 100KB per binary:
Why to compress the files?
The size of the uncompressed BTF files is about 3.3MB, compressed they are only ~100KB.
Also libz is already a dependency of these tools, so we can use it a runtime without introducing a new one.
The compression ratio is so high because many of the generated BTF files are the same. Even if we can try to fix it directly in bpftool, it's easier in the time being to use a compression process.
What's the runtime overhead?
Decompressing and saving the file takes around ~10ms. On systems that expose BTF the overhead should be minimal as there's some logic to return early avoiding decompressing and so on.
Why do you say only some tools?
Tools that use fentry/fexit hooks can't work with this because they require the kernel to expose BTF information.
Why only execsnoop and opensnoop are updated?
I'll update other tools once this approach makes sense for the maintainers.
TODO
/cc @anakryiko