Skip to content

Commit

Permalink
fix bitmap corruption on close_range() with CLOSE_RANGE_UNSHARE
Browse files Browse the repository at this point in the history
commit 9a2fa14 upstream.

copy_fd_bitmaps(new, old, count) is expected to copy the first
count/BITS_PER_LONG bits from old->full_fds_bits[] and fill
the rest with zeroes.  What it does is copying enough words
(BITS_TO_LONGS(count/BITS_PER_LONG)), then memsets the rest.
That works fine, *if* all bits past the cutoff point are
clear.  Otherwise we are risking garbage from the last word
we'd copied.

For most of the callers that is true - expand_fdtable() has
count equal to old->max_fds, so there's no open descriptors
past count, let alone fully occupied words in ->open_fds[],
which is what bits in ->full_fds_bits[] correspond to.

The other caller (dup_fd()) passes sane_fdtable_size(old_fdt, max_fds),
which is the smallest multiple of BITS_PER_LONG that covers all
opened descriptors below max_fds.  In the common case (copying on
fork()) max_fds is ~0U, so all opened descriptors will be below
it and we are fine, by the same reasons why the call in expand_fdtable()
is safe.

Unfortunately, there is a case where max_fds is less than that
and where we might, indeed, end up with junk in ->full_fds_bits[] -
close_range(from, to, CLOSE_RANGE_UNSHARE) with
	* descriptor table being currently shared
	* 'to' being above the current capacity of descriptor table
	* 'from' being just under some chunk of opened descriptors.
In that case we end up with observably wrong behaviour - e.g. spawn
a child with CLONE_FILES, get all descriptors in range 0..127 open,
then close_range(64, ~0U, CLOSE_RANGE_UNSHARE) and watch dup(0) ending
up with descriptor torvalds#128, despite torvalds#64 being observably not open.

The minimally invasive fix would be to deal with that in dup_fd().
If this proves to add measurable overhead, we can go that way, but
let's try to fix copy_fd_bitmaps() first.

* new helper: bitmap_copy_and_expand(to, from, bits_to_copy, size).
* make copy_fd_bitmaps() take the bitmap size in words, rather than
bits; it's 'count' argument is always a multiple of BITS_PER_LONG,
so we are not losing any information, and that way we can use the
same helper for all three bitmaps - compiler will see that count
is a multiple of BITS_PER_LONG for the large ones, so it'll generate
plain memcpy()+memset().

Reproducer added to tools/testing/selftests/core/close_range_test.c

Cc: [email protected]
Signed-off-by: Al Viro <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
  • Loading branch information
Al Viro authored and gregkh committed Sep 4, 2024
1 parent 463074c commit e807487
Show file tree
Hide file tree
Showing 2 changed files with 25 additions and 17 deletions.
30 changes: 13 additions & 17 deletions fs/file.c
Original file line number Diff line number Diff line change
Expand Up @@ -41,27 +41,23 @@ static void free_fdtable_rcu(struct rcu_head *rcu)
#define BITBIT_NR(nr) BITS_TO_LONGS(BITS_TO_LONGS(nr))
#define BITBIT_SIZE(nr) (BITBIT_NR(nr) * sizeof(long))

#define fdt_words(fdt) ((fdt)->max_fds / BITS_PER_LONG) // words in ->open_fds
/*
* Copy 'count' fd bits from the old table to the new table and clear the extra
* space if any. This does not copy the file pointers. Called with the files
* spinlock held for write.
*/
static void copy_fd_bitmaps(struct fdtable *nfdt, struct fdtable *ofdt,
unsigned int count)
static inline void copy_fd_bitmaps(struct fdtable *nfdt, struct fdtable *ofdt,
unsigned int copy_words)
{
unsigned int cpy, set;

cpy = count / BITS_PER_BYTE;
set = (nfdt->max_fds - count) / BITS_PER_BYTE;
memcpy(nfdt->open_fds, ofdt->open_fds, cpy);
memset((char *)nfdt->open_fds + cpy, 0, set);
memcpy(nfdt->close_on_exec, ofdt->close_on_exec, cpy);
memset((char *)nfdt->close_on_exec + cpy, 0, set);

cpy = BITBIT_SIZE(count);
set = BITBIT_SIZE(nfdt->max_fds) - cpy;
memcpy(nfdt->full_fds_bits, ofdt->full_fds_bits, cpy);
memset((char *)nfdt->full_fds_bits + cpy, 0, set);
unsigned int nwords = fdt_words(nfdt);

bitmap_copy_and_extend(nfdt->open_fds, ofdt->open_fds,
copy_words * BITS_PER_LONG, nwords * BITS_PER_LONG);
bitmap_copy_and_extend(nfdt->close_on_exec, ofdt->close_on_exec,
copy_words * BITS_PER_LONG, nwords * BITS_PER_LONG);
bitmap_copy_and_extend(nfdt->full_fds_bits, ofdt->full_fds_bits,
copy_words, nwords);
}

/*
Expand All @@ -79,7 +75,7 @@ static void copy_fdtable(struct fdtable *nfdt, struct fdtable *ofdt)
memcpy(nfdt->fd, ofdt->fd, cpy);
memset((char *)nfdt->fd + cpy, 0, set);

copy_fd_bitmaps(nfdt, ofdt, ofdt->max_fds);
copy_fd_bitmaps(nfdt, ofdt, fdt_words(ofdt));
}

static struct fdtable * alloc_fdtable(unsigned int nr)
Expand Down Expand Up @@ -330,7 +326,7 @@ struct files_struct *dup_fd(struct files_struct *oldf, int *errorp)
open_files = count_open_files(old_fdt);
}

copy_fd_bitmaps(new_fdt, old_fdt, open_files);
copy_fd_bitmaps(new_fdt, old_fdt, open_files / BITS_PER_LONG);

old_fds = old_fdt->fd;
new_fds = new_fdt->fd;
Expand Down
12 changes: 12 additions & 0 deletions include/linux/bitmap.h
Original file line number Diff line number Diff line change
Expand Up @@ -248,6 +248,18 @@ static inline void bitmap_copy_clear_tail(unsigned long *dst,
dst[nbits / BITS_PER_LONG] &= BITMAP_LAST_WORD_MASK(nbits);
}

static inline void bitmap_copy_and_extend(unsigned long *to,
const unsigned long *from,
unsigned int count, unsigned int size)
{
unsigned int copy = BITS_TO_LONGS(count);

memcpy(to, from, copy * sizeof(long));
if (count % BITS_PER_LONG)
to[copy - 1] &= BITMAP_LAST_WORD_MASK(count);
memset(to + copy, 0, bitmap_size(size) - copy * sizeof(long));
}

/*
* On 32-bit systems bitmaps are represented as u32 arrays internally, and
* therefore conversion is not needed when copying data from/to arrays of u32.
Expand Down

0 comments on commit e807487

Please sign in to comment.