Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multiple candidates for *btf.Func error when loading bpf objects #723

Closed
harai opened this issue Jun 27, 2022 · 17 comments
Closed

multiple candidates for *btf.Func error when loading bpf objects #723

harai opened this issue Jun 27, 2022 · 17 comments
Labels
bug Something isn't working

Comments

@harai
Copy link

harai commented Jun 27, 2022

Describe the bug

I got the following error when I tried to load BPF objects:

	objs := bpfObjects{}
	if err := loadBpfObjects(&objs, nil); err != nil {
		log.Fatalf("loading objects: %v", err)
	}

output:

2022/06/27 11:26:24 loading objects: field FentryDoSysMprotect: program fentry_do_sys_mprotect: attach Tracing/TraceFEntry: find target for fentry __x64_sys_mprotect: type __x64_sys_mprotect: multiple candidates for *btf.Func

I found multiple __x64_sys_mprotect entries in /sys/kernel/btf/vmlinux:

$ bpftool btf dump file /sys/kernel/btf/vmlinux format raw | grep __x64_sys_mprotect
[11698] FUNC '__x64_sys_mprotect' type_id=11464 linkage=static
[23528] FUNC '__x64_sys_mprotect' type_id=6062 linkage=static

However, one of them is named as __unused:

...
[2] INT 'long unsigned int' size=8 bits_offset=0 nr_bits=64 encoding=(none)
...
[175] STRUCT 'pt_regs' size=168 vlen=21
        'r15' type_id=2 bits_offset=0
        'r14' type_id=2 bits_offset=64
        'r13' type_id=2 bits_offset=128
        'r12' type_id=2 bits_offset=192
        'bp' type_id=2 bits_offset=256
...
[4215] CONST '(anon)' type_id=175
...
[4220] PTR '(anon)' type_id=4215
...
[6062] FUNC_PROTO '(anon)' ret_type_id=36 vlen=1
        'regs' type_id=4220
...
[11461] FWD 'pt_regs' fwd_kind=struct
[11462] CONST '(anon)' type_id=11461
[11463] PTR '(anon)' type_id=11462
[11464] FUNC_PROTO '(anon)' ret_type_id=36 vlen=1
        '__unused' type_id=11463
...
[11698] FUNC '__x64_sys_mprotect' type_id=11464 linkage=static
...
[23528] FUNC '__x64_sys_mprotect' type_id=6062 linkage=static
...

I don't know much about BTF, but I think this entry should be ignored when parsing to avoid this kind of error.

To Reproduce

$ go version
go version go1.18.3 linux/amd64

$ uname -a
Linux ip-10-5-0-115.ap-northeast-1.compute.internal 5.10.112-108.499.amzn2.x86_64 #1 SMP Wed Apr 27 23:39:40 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

$ cat /etc/issue
\S
Kernel \r on an \m

$ bpftool btf dump file /sys/kernel/btf/vmlinux format raw | grep __x64_sys_mprotect
[11698] FUNC '__x64_sys_mprotect' type_id=11464 linkage=static
[23528] FUNC '__x64_sys_mprotect' type_id=6062 linkage=static

Expected behavior

BTF entries named as __unused are simply ignored when parsing.

@harai harai added the bug Something isn't working label Jun 27, 2022
@harai
Copy link
Author

harai commented Jun 27, 2022

FUNC entry with __unused parameter might be the correct one because the only __x64_sys_mprotect FUNC entry in Ubuntu has the parameter.

I have no idea how to handle this kind of problem.

@lmb
Copy link
Collaborator

lmb commented Jun 27, 2022 via email

@harai
Copy link
Author

harai commented Jun 27, 2022

@lmb

FUNC entry with __unused parameter might be the correct one because the only __x64_sys_mprotect FUNC entry in Ubuntu has the parameter.

This machine is Ubuntu 20.04 LTS with newer kernel installed:

$ uname -a
Linux harai-XPS-13-9300 5.13.0-48-generic #54~20.04.1-Ubuntu SMP Thu Jun 2 23:37:17 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

@ti-mo
Copy link
Collaborator

ti-mo commented Jun 28, 2022

@lmb I think you missed this text in the OP: 5.10.112-108.499.amzn2.x86_64

My host kernel doesn't have any duplicates FUNCs that I can immediately spot, but a bunch of syscalls point to an FWD pt_regs:

[15747] FWD 'pt_regs' fwd_kind=struct
[15748] CONST '(anon)' type_id=15747
[15749] PTR '(anon)' type_id=15748
[15750] FUNC_PROTO '(anon)' ret_type_id=40 vlen=1
        '__unused' type_id=15749
[15751] FUNC '__ia32_sys_rseq' type_id=15750 linkage=static
[15752] FUNC '__x64_sys_rseq' type_id=15750 linkage=static
[15753] FUNC '__ia32_sys_setuid16' type_id=15750 linkage=static
[15754] FUNC '__x64_sys_setuid16' type_id=15750 linkage=static

while a bunch of others point to a satisfied pt_regs:

[191] STRUCT 'pt_regs' size=168 vlen=21
        'r15' type_id=1 bits_offset=0
        'r14' type_id=1 bits_offset=64
        'r13' type_id=1 bits_offset=128
... <snip> ...
[3955] CONST '(anon)' type_id=191
[3960] PTR '(anon)' type_id=3955
[7163] FUNC_PROTO '(anon)' ret_type_id=40 vlen=1
        'regs' type_id=3960
[7166] FUNC '__x64_sys_arch_prctl' type_id=7163 linkage=static
[7635] FUNC '__x64_sys_iopl' type_id=7163 linkage=static
[7637] FUNC '__x64_sys_ioperm' type_id=7163 linkage=static
[7725] FUNC '__x64_sys_modify_ldt' type_id=7163 linkage=static

This seems to be about 50/50, I have 234 with satisfied pt_regs and 198 with FWDs. 🤔

It's not clear to me what we should do in case there's a duplicate. I have plenty of in-use syscalls (even __x64_sys_bpf 🙂) with __unused, so simply ignoring them is not an option.

@harai Could you experiment by replacing the call to spec.TypeByName(typeName, &targetFunc) in findTargetInKernel() to a call to spec.AnyTypesByName(typeName) and picking one out of the two results? Do both Funcs work for attaching fentry?

@harai
Copy link
Author

harai commented Jun 29, 2022

@ti-mo Since I've found many syscalls with such duplication, I investigated recvmsg, which I know well enough.

recvmsg with __unused

Although the attached function is called when entering the syscall, I got an error when I tried to read struct pt_regs *regs value:

ev->fd = PT_REGS_PARM1_CORE(regs);

error:

2022/06/29 10:28:35 loading objects: field SysRecvfrom: program sys_recvfrom: load program: permission denied: invalid bpf_context access off=0 size=8 (6 line(s) omitted)

recvmsg with regs

This works fine. No errors occur when reading regs value.

@harai
Copy link
Author

harai commented Jun 29, 2022

I run into this issue on Amazon Linux 2. Since it offers a newer kernel, I've tested if updating the kernel resolves the issue:

sudo amazon-linux-extras install kernel-5.15
$ uname -a
Linux ip-10-1-1-209.ap-northeast-1.compute.internal 5.15.43-20.123.amzn2.x86_64 #1 SMP Fri May 27 00:28:44 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

And it does. No duplicated entries after the update:

$ bpftool btf dump file /sys/kernel/btf/vmlinux format raw | grep sys_mprotect
[15669] FUNC '__ia32_sys_mprotect' type_id=15442 linkage=static
[15670] FUNC '__x64_sys_mprotect' type_id=15442 linkage=static
$ bpftool btf dump file /sys/kernel/btf/vmlinux format raw | grep sys_recvmsg
[15692] FUNC '__ia32_compat_sys_recvmsg' type_id=15442 linkage=static
[15693] FUNC '__ia32_sys_recvmsg' type_id=15442 linkage=static
[15694] FUNC '__x64_sys_recvmsg' type_id=15442 linkage=static
[82393] FUNC '__sys_recvmsg' type_id=82392 linkage=static
[82395] FUNC '__sys_recvmsg_sock' type_id=82394 linkage=static
[82397] FUNC '___sys_recvmsg' type_id=82396 linkage=static
[82399] FUNC '____sys_recvmsg' type_id=82398 linkage=static

So if you are running Amazon Linux 2 and run into this issue, updating kernel might be the easiest way to avoid it.

@harai
Copy link
Author

harai commented Jun 29, 2022

Updating to the newer kernel caused another issue.

Now the only recvmsg entry is with __unused parameter:

[15439] FWD 'pt_regs' fwd_kind=struct
[15440] CONST '(anon)' type_id=15439
[15441] PTR '(anon)' type_id=15440
[15442] FUNC_PROTO '(anon)' ret_type_id=34 vlen=1
        '__unused' type_id=15441
...
[15694] FUNC '__x64_sys_recvmsg' type_id=15442 linkage=static

and I always get the error mentioned previously when I tried to read regs:

2022/06/29 14:48:40 loading objects: field SysRecvfrom: program sys_recvfrom: load program: permission denied: invalid bpf_context access off=0 size=8 (5 line(s) omitted)

I suspect FWD entry has something to do with this behavior.

@lmb
Copy link
Collaborator

lmb commented Jul 5, 2022

Can you dump the full verifier log with log.Fatalf("loading objects: %+v", err) please?

@harai
Copy link
Author

harai commented Jul 6, 2022

@lmb I got the following full verifier log with %+v:

2022/07/06 18:52:14 loading objects: permission denied:
        arg#0 type is not a struct
        Unrecognized arg#0 type PTR
        ; int BPF_PROG(sys_recvfrom, struct pt_regs *regs) {
        0: (79) r6 = *(u64 *)(r1 +0)
        func '__x64_sys_recvfrom' arg0 type FWD is not a struct
        invalid bpf_context access off=0 size=8
        processed 1 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0

@lmb
Copy link
Collaborator

lmb commented Jul 8, 2022

I think you're right, the problem is that 15439 is a forward declaration. Do you have an sys_recvfrom entry that doesn't have a forward declaration?

This could be a bug in how pahole generates BTF at kernel build time.

lmb added a commit to lmb/ebpf that referenced this issue Jul 8, 2022
Add context if the last verifier log line indicates a context access
error.

See cilium#723
@harai
Copy link
Author

harai commented Jul 9, 2022

@lmb No. This is the only entry.

ti-mo pushed a commit that referenced this issue Jul 19, 2022
Add context if the last verifier log line indicates a context access
error.

See #723
@lmb
Copy link
Collaborator

lmb commented Jul 20, 2022

@harai a bit of a stab in the dark: do you know whether the kernels were compiled with gcc or with clang?

I think this might be a case for [email protected]

@harai
Copy link
Author

harai commented Jul 20, 2022

@lmb I got the following kernel configuration:

$ less /boot/config-$(uname -r)
#
# Automatically generated file; DO NOT EDIT.
# Linux/x86_64 5.15.43-20.123.amzn2.x86_64 Kernel Configuration
#
CONFIG_CC_VERSION_TEXT="gcc10-gcc (GCC) 10.3.1 20210422 (Red Hat 10.3.1-1)"
CONFIG_CC_IS_GCC=y
CONFIG_GCC_VERSION=100301
CONFIG_CLANG_VERSION=0
CONFIG_AS_IS_GNU=y
CONFIG_AS_VERSION=23500
CONFIG_LD_IS_BFD=y
CONFIG_LD_VERSION=23500
CONFIG_LLD_VERSION=0
CONFIG_CC_CAN_LINK=y
CONFIG_CC_CAN_LINK_STATIC=y
CONFIG_CC_HAS_ASM_GOTO=y
CONFIG_CC_HAS_ASM_INLINE=y
CONFIG_CC_HAS_NO_PROFILE_FN_ATTR=y
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_TABLE_SORT=y
CONFIG_THREAD_INFO_IN_TASK=y
...

@harai
Copy link
Author

harai commented Oct 22, 2022

@lmb Kernel developers are working on this issue now:

@lmb
Copy link
Collaborator

lmb commented Oct 24, 2022

That's great news! I'll follow the mailing list post and close this issue for now.

@lmb lmb closed this as completed Oct 24, 2022
@harai
Copy link
Author

harai commented Nov 11, 2022

This issue has been fixed in Linux 5.15.78.

@lmb
Copy link
Collaborator

lmb commented Nov 11, 2022

Great, thanks for pushing this upstream!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants