Skip to content
This repository has been archived by the owner on Jun 18, 2024. It is now read-only.

memory leak in scx_ops_enable #142

Closed
xingyiz opened this issue Feb 14, 2024 · 3 comments
Closed

memory leak in scx_ops_enable #142

xingyiz opened this issue Feb 14, 2024 · 3 comments

Comments

@xingyiz
Copy link

xingyiz commented Feb 14, 2024

Syzkaller found a memory leak in scx_ops_enable. Looking at the code, it seems that the allocation is being allocated and freed properly so we're not exactly sure what causes the unreferenced object, but this issue has popped up quite frequently from Syzkaller that we thought you might want to take a look.

Commit version: 65c2651
Config file:
config.txt

Bug report:

BUG: memory leak
unreferenced object 0xffff88800b935600 (size 64):
  comm "scx_serialise", pid 736, jiffies 4295779783 (age 80.593s)
  hex dump (first 32 bytes):
    c0 07 68 83 ff ff ff ff 08 56 93 0b 80 88 ff ff  ..h......V......
    08 56 93 0b 80 88 ff ff 00 00 00 00 00 00 00 00  .V..............
  backtrace:
    [<00000000cf39be81>] kmemleak_alloc_recursive include/linux/kmemleak.h:42 [inline]
    [<00000000cf39be81>] slab_post_alloc_hook mm/slab.h:766 [inline]
    [<00000000cf39be81>] slab_alloc_node mm/slub.c:3478 [inline]
    [<00000000cf39be81>] __kmem_cache_alloc_node+0x1f1/0x310 mm/slub.c:3517
    [<00000000a2522a89>] kmalloc_trace+0x26/0xc0 mm/slab_common.c:1098
    [<00000000ffb5fd6b>] kmalloc include/linux/slab.h:600 [inline]
    [<00000000ffb5fd6b>] kzalloc include/linux/slab.h:721 [inline]
    [<00000000ffb5fd6b>] scx_ops_enable+0x234/0x1610 kernel/sched/ext.c:3632
    [<00000000975486ca>] bpf_struct_ops_link_create+0x21e/0x290 kernel/bpf/bpf_struct_ops.c:914
    [<00000000da36584a>] link_create kernel/bpf/syscall.c:4938 [inline]
    [<00000000da36584a>] __sys_bpf+0x3684/0x4970 kernel/bpf/syscall.c:5453
    [<00000000418ac434>] __do_sys_bpf kernel/bpf/syscall.c:5487 [inline]
    [<00000000418ac434>] __se_sys_bpf kernel/bpf/syscall.c:5485 [inline]
    [<00000000418ac434>] __x64_sys_bpf+0x48/0x60 kernel/bpf/syscall.c:5485
    [<0000000090b269c5>] do_syscall_x64 arch/x86/entry/common.c:52 [inline]
    [<0000000090b269c5>] do_syscall_64+0x45/0xf0 arch/x86/entry/common.c:83
    [<0000000094d75c5b>] entry_SYSCALL_64_after_hwframe+0x6e/0x76

A reproducer was found for this problem:

# {Threaded:false Repeat:true RepeatTimes:0 Procs:1 Slowdown:1 Sandbox:none SandboxArg:0 Leak:true NetInjection:false NetDevices:true NetReset:true Cgroups:true BinfmtMisc:true CloseFDs:true KCSAN:false DevlinkPCI:false NicVF:false USB:false VhciInjection:false Wifi:false IEEE802154:false Sysctl:true Swap:true UseTmpDir:true HandleSegv:true Repro:false Trace:false LegacyOptions:{Collide:false Fault:false FaultCall:0 FaultNth:0}}
r0 = socket$inet6_udplite(0xa, 0x2, 0x88)
setsockopt$IP6T_SO_SET_REPLACE(r0, 0x29, 0x40, &(0x7f00000040c0)=@security={'security\x00', 0xe, 0x7fffffe, 0x360, 0xffffffff, 0xf0, 0xf0, 0x0, 0xffffffff, 0xffffffff, 0x300, 0x300, 0x300, 0xffffffff, 0x4, 0x0, {[{{@ipv6={@mcast1, @dev, [], [], 'syzkaller1\x00', 'vxcan1\x00'}, 0x0, 0xa8, 0xf0}, @common=@unspec=@IDLETIMER={0x48, 'IDLETIMER\x00', 0x0, {0x0, 'syz1\x00'}}}, {{@ipv6={@local, @loopback, [], [], 'veth1_to_hsr\x00', 'wg1\x00'}, 0x0, 0xa8, 0xd0}, @common=@inet=@SET1={0x28}}, {{@ipv6={@empty, @private2, [], [], 'lo\x00', 'dummy0\x00'}, 0x0, 0xa8, 0xd0}, @common=@unspec=@STANDARD={0x28}}], {{'\x00', 0x0, 0xa8, 0xd0}, {0x28}}}}, 0x3c0) (rerun: 64)
@Byte-Lab
Copy link
Collaborator

Do we need to be calling kobject_put()?

@Byte-Lab
Copy link
Collaborator

Hmm, and it looks like we're not actually calling kfree() on the disable path. Let me set up the reproducer and write up a patch.

Byte-Lab added a commit that referenced this issue Feb 14, 2024
We're making two mistakes with respect to object lifecycle for
scx_root_kobj:

- We're not calling kobject_put() if kobject_init_and_add() returns an
  error code. According to the API for that function, we should be.

- We weren't kfree()'ing scx_root_kobj on the scx_ops_disable_workfn()
  path.

Address both of these. With these changes, we no longer trigger the
memory leak detected by Syzkaller in
#142. Without this patch,
we see the following with memleak enabled:

unreferenced object 0xffff8dd78a99f000 (size 64):
  comm "runner", pid 2383, jiffies 4294697813 (age 128.878s)
  hex dump (first 32 bytes):
    78 17 5f be ff ff ff ff 08 f0 99 8a d7 8d ff ff  x._.............
    08 f0 99 8a d7 8d ff ff 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<0000000034bda486>] __kmem_cache_alloc_node+0x109/0x1f0
    [<00000000a91612ee>] kmalloc_trace+0x29/0xa0
    [<00000000bfe2a4b7>] bpf_scx_reg+0x135/0xd80
    [<000000000e3a5728>] bpf_struct_ops_link_create+0xd7/0x130
    [<00000000b14e1608>] __sys_bpf+0x2a9/0x530
    [<000000003140300e>] __x64_sys_bpf+0x17/0x20
    [<00000000fa410384>] do_syscall_64+0x4d/0xf0
    [<000000008cbc0d51>] entry_SYSCALL_64_after_hwframe+0x6f/0x77

With it, we see no such entries.

Signed-off-by: David Vernet <[email protected]>
Byte-Lab added a commit that referenced this issue Feb 14, 2024
We're making two mistakes with respect to object lifecycle for scx_root_kobj:

- We're not calling kobject_put() if kobject_init_and_add() returns an
  error code. According to the API for that function, we should be.

- We weren't kobject_putt()'ing scx_root_kobj on the
  scx_ops_disable_workfn() path. We were calling kobject_del(), but this
  doesn't actually call the release callback under the hood. It just
  unlinks the kobject from the hierarchy.

Address both of these. With these changes, we no longer trigger the memory leak detected by Syzkaller in
#142. Without this patch, we see the following with memleak enabled:

unreferenced object 0xffff8dd78a99f000 (size 64):
  comm "runner", pid 2383, jiffies 4294697813 (age 128.878s)
  hex dump (first 32 bytes):
    78 17 5f be ff ff ff ff 08 f0 99 8a d7 8d ff ff  x._.............
    08 f0 99 8a d7 8d ff ff 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<0000000034bda486>] __kmem_cache_alloc_node+0x109/0x1f0
    [<00000000a91612ee>] kmalloc_trace+0x29/0xa0
    [<00000000bfe2a4b7>] bpf_scx_reg+0x135/0xd80
    [<000000000e3a5728>] bpf_struct_ops_link_create+0xd7/0x130
    [<00000000b14e1608>] __sys_bpf+0x2a9/0x530
    [<000000003140300e>] __x64_sys_bpf+0x17/0x20
    [<00000000fa410384>] do_syscall_64+0x4d/0xf0
    [<000000008cbc0d51>] entry_SYSCALL_64_after_hwframe+0x6f/0x77

With it, we see no such entries.

Fixes: e7a7781 ("scx: Add /sys/kernel/sched_ext interface")
Signed-off-by: David Vernet <[email protected]>
Byte-Lab added a commit that referenced this issue Feb 14, 2024
We're making two mistakes with respect to object lifecycle for scx_root_kobj:

- We're not calling kobject_put() if kobject_init_and_add() returns an
  error code. According to the API for that function, we should be.

- We weren't kobject_putt()'ing scx_root_kobj on the
  scx_ops_disable_workfn() path. We were calling kobject_del(), but this
  doesn't actually call the release callback under the hood. It just
  unlinks the kobject from the hierarchy.

Address both of these. With these changes, we no longer trigger the memory leak detected by Syzkaller in
#142. Without this patch, we see the following with memleak enabled:

unreferenced object 0xffff8dd78a99f000 (size 64):
  comm "runner", pid 2383, jiffies 4294697813 (age 128.878s)
  hex dump (first 32 bytes):
    78 17 5f be ff ff ff ff 08 f0 99 8a d7 8d ff ff  x._.............
    08 f0 99 8a d7 8d ff ff 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<0000000034bda486>] __kmem_cache_alloc_node+0x109/0x1f0
    [<00000000a91612ee>] kmalloc_trace+0x29/0xa0
    [<00000000bfe2a4b7>] bpf_scx_reg+0x135/0xd80
    [<000000000e3a5728>] bpf_struct_ops_link_create+0xd7/0x130
    [<00000000b14e1608>] __sys_bpf+0x2a9/0x530
    [<000000003140300e>] __x64_sys_bpf+0x17/0x20
    [<00000000fa410384>] do_syscall_64+0x4d/0xf0
    [<000000008cbc0d51>] entry_SYSCALL_64_after_hwframe+0x6f/0x77

With it, we see no such entries.

Fixes: e7a7781 ("scx: Add /sys/kernel/sched_ext interface")
Signed-off-by: David Vernet <[email protected]>
Byte-Lab added a commit that referenced this issue Feb 14, 2024
We're making two mistakes with respect to object lifecycle for scx_root_kobj:

- We're not calling kobject_put() if kobject_init_and_add() returns an
  error code. According to the API for that function, we should be.

- We weren't kobject_putt()'ing scx_root_kobj on the
  scx_ops_disable_workfn() path. We were calling kobject_del(), but this
  doesn't actually call the release callback under the hood. It just
  unlinks the kobject from the hierarchy.

Address both of these. With these changes, we no longer trigger the memory leak detected by Syzkaller in
#142. Without this patch, we see the following with memleak enabled:

unreferenced object 0xffff8dd78a99f000 (size 64):
  comm "runner", pid 2383, jiffies 4294697813 (age 128.878s)
  hex dump (first 32 bytes):
    78 17 5f be ff ff ff ff 08 f0 99 8a d7 8d ff ff  x._.............
    08 f0 99 8a d7 8d ff ff 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<0000000034bda486>] __kmem_cache_alloc_node+0x109/0x1f0
    [<00000000a91612ee>] kmalloc_trace+0x29/0xa0
    [<00000000bfe2a4b7>] bpf_scx_reg+0x135/0xd80
    [<000000000e3a5728>] bpf_struct_ops_link_create+0xd7/0x130
    [<00000000b14e1608>] __sys_bpf+0x2a9/0x530
    [<000000003140300e>] __x64_sys_bpf+0x17/0x20
    [<00000000fa410384>] do_syscall_64+0x4d/0xf0
    [<000000008cbc0d51>] entry_SYSCALL_64_after_hwframe+0x6f/0x77

With it, we see no such entries.

Fixes: e7a7781 ("scx: Add /sys/kernel/sched_ext interface")
Signed-off-by: David Vernet <[email protected]>
@Byte-Lab
Copy link
Collaborator

Thanks a lot for the report and for finding this bug. This should be addressed with #143. Could you please let us know if you still observe this issue?

htejun pushed a commit that referenced this issue Feb 17, 2024
Recently, when running './test_progs -j', I occasionally hit the
following errors:

  test_lwt_redirect:PASS:pthread_create 0 nsec
  test_lwt_redirect_run:FAIL:netns_create unexpected error: 256 (errno 0)
  #142/2   lwt_redirect/lwt_redirect_normal_nomac:FAIL
  #142     lwt_redirect:FAIL
  test_lwt_reroute:PASS:pthread_create 0 nsec
  test_lwt_reroute_run:FAIL:netns_create unexpected error: 256 (errno 0)
  test_lwt_reroute:PASS:pthread_join 0 nsec
  #143/2   lwt_reroute/lwt_reroute_qdisc_dropped:FAIL
  #143     lwt_reroute:FAIL

The netns_create() definition looks like below:

  #define NETNS "ns_lwt"
  static inline int netns_create(void)
  {
        return system("ip netns add " NETNS);
  }

One possibility is that both lwt_redirect and lwt_reroute create
netns with the same name "ns_lwt" which may cause conflict. I tried
the following example:
  $ sudo ip netns add abc
  $ echo $?
  0
  $ sudo ip netns add abc
  Cannot create namespace file "/var/run/netns/abc": File exists
  $ echo $?
  1
  $

The return code for above netns_create() is 256. The internet search
suggests that the return value for 'ip netns add ns_lwt' is 1, which
matches the above 'sudo ip netns add abc' example.

This patch tried to use different netns names for two tests to avoid
'ip netns add <name>' failure.

I ran './test_progs -j' 10 times and all succeeded with
lwt_redirect/lwt_reroute tests.

Signed-off-by: Yonghong Song <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Tested-by: Eduard Zingerman <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants