
Memory corruption in hardlink tool #3330

Open
a-a-danilov opened this issue Dec 18, 2024 · 2 comments

Comments

@a-a-danilov

hardlink fails with various memory corruption errors (e.g. `double free or corruption (!prev)`, `munmap_chunk(): invalid pointer`, `malloc(): invalid size (unsorted)`) on large datasets.

The output of valgrind hardlink ... shows the following trace:

==3793== Invalid write of size 8
==3793==    at 0x484E823: memset (vg_replace_strmem.c:1390)
==3793==    by 0x10FCC2: read_all (all-io.h:67)
==3793==    by 0x111589: get_digest (fileeq.c:444)
==3793==    by 0x11182F: get_cmp_data (fileeq.c:493)
==3793==    by 0x111978: ul_fileeq (fileeq.c:520)
==3793==    by 0x10EA0C: visitor (hardlink.c:1143)
==3793==    by 0x10F990: main (hardlink.c:1528)
==3793==  Address 0x4a983c0 is 0 bytes after a block of size 1,312 alloc'd
==3793==    at 0x4841878: malloc (vg_replace_malloc.c:446)
==3793==    by 0x11143E: get_digest (fileeq.c:427)
==3793==    by 0x11182F: get_cmp_data (fileeq.c:493)
==3793==    by 0x111978: ul_fileeq (fileeq.c:520)
==3793==    by 0x10EA0C: visitor (hardlink.c:1143)
==3793==    by 0x10F990: main (hardlink.c:1528)
==3793== 
==3793== Syscall param read(buf) points to unaddressable byte(s)
==3793==    at 0x4986B71: read (in /lib64/libc.so.6)
==3793==    by 0x10FCDC: read_all (all-io.h:69)
==3793==    by 0x111589: get_digest (fileeq.c:444)
==3793==    by 0x11182F: get_cmp_data (fileeq.c:493)
==3793==    by 0x111978: ul_fileeq (fileeq.c:520)
==3793==    by 0x10EA0C: visitor (hardlink.c:1143)
==3793==    by 0x10F990: main (hardlink.c:1528)
==3793==  Address 0x4a983c0 is 0 bytes after a block of size 1,312 alloc'd
==3793==    at 0x4841878: malloc (vg_replace_malloc.c:446)
==3793==    by 0x11143E: get_digest (fileeq.c:427)
==3793==    by 0x11182F: get_cmp_data (fileeq.c:493)
==3793==    by 0x111978: ul_fileeq (fileeq.c:520)
==3793==    by 0x10EA0C: visitor (hardlink.c:1143)
==3793==    by 0x10F990: main (hardlink.c:1528)

(the source code at commit 25e62e5 was compiled with CFLAGS=-g)

During my own investigation I found at least two problematic places in lib/fileeq.c and misc-utils/hardlink.c.

  1. Function ul_fileeq_set_size in lib/fileeq.c rounds down both the readsiz and blocksmax variables when a larger readsize is used

    util-linux/lib/fileeq.c

    Lines 273 to 280 in 25e62e5

            /* enlarge readsize for large files */
            if (nreads > maxdigs)
                    readsiz = filesiz / maxdigs;
            break;
    }

    eq->readsiz = readsiz;
    eq->blocksmax = filesiz / readsiz;

    In the unfortunate case get_digest may request one extra block when reading the tail of the file.

The following commands reproduce this error:

mkdir -p short
dd if=/dev/urandom of=short/file0 bs=43007 count=1
for x in {1..255}; do cp -a short/file0 short/file$x; done
hardlink --dry-run --io-size 1k --cache-size 328k short

We create 256 identical files, each 43007 bytes in size. With the reduced io-size and cache-size we get the following debug output with ULFILEEQ_DEBUG=16:

27116: ulfileeq:       EQ: [0x559c7e28f1a0]: set sizes: filesiz=43007, maxblocks=41, readsiz=1048

Note that 43007 > 32 + 41*1048 = 43000.

This problem can be fixed with the following patch:

diff --git a/lib/fileeq.c b/lib/fileeq.c
index 1f7f90ccb..37c6cdea5 100644
--- a/lib/fileeq.c
+++ b/lib/fileeq.c
@@ -272,12 +272,12 @@ size_t ul_fileeq_set_size(struct ul_fileeq *eq, uint64_t filesiz,
                nreads = filesiz / readsiz;
                /* enlarge readsize for large files */
                if (nreads > maxdigs)
-                       readsiz = filesiz / maxdigs;
+                       readsiz = (filesiz+maxdigs-1) / maxdigs;
                break;
        }
 
        eq->readsiz = readsiz;
-       eq->blocksmax = filesiz / readsiz;
+       eq->blocksmax = (filesiz+readsiz-1) / readsiz;
 
        DBG(EQ, ul_debugobj(eq, "set sizes: filesiz=%ju, maxblocks=%" PRIu64 ", readsiz=%zu",
                                eq->filesiz, eq->blocksmax, eq->readsiz));
  2. The second problem is less obvious. The per-file cache size may change during iterations inside visitor, leading to small values of blocksmax when memory is allocated for the data->blocks buffers in early iterations, and large values of blocksmax during get_digest calls in later iterations:
    /* calculate per file max memory use */
    nnodes = count_nodes(master);
    if (!nnodes)
            continue;

    /* per-file cache size */
    memsiz = opts.cache_size / nnodes;

    /* filesiz, readsiz, memsiz */
    ul_fileeq_set_size(&fileeq, master->st.st_size, opts.io_size, memsiz);

    The overall idea of changing the readsiz value during iterations inside visitor looks very bad to me. If we call ul_fileeq_set_size and change the readsiz value, then we should also invalidate and recalculate all cached digests. I think ul_fileeq_set_size should be called only once during iterations inside visitor for consistency.

The following commands reproduce this error:

mkdir -p resize
truncate -s 2560 resize/a0 resize/b0
echo '20: 01' | xxd -r - resize/a0
echo '20: 02' | xxd -r - resize/b0
touch -d @1271828183 resize/a0
touch -d @1314159265 resize/b0
for x in {1..255}; do cp -a resize/a0 resize/a$x; cp -a resize/b0 resize/b$x; done
hardlink --dry-run --ignore-time --keep-oldest --io-size 1k --cache-size 32k resize

We create two sets of 256 identical files that share the same 32-byte header but differ in their first-block digests. If we reduce the io-size and cache-size and order the iterations by file modification time, we get the following debug output with ULFILEEQ_DEBUG=16:

28653: ulfileeq:       EQ: [0x562bab4441a0]: set sizes: filesiz=2560, maxblocks=2, readsiz=1536
...
28653: ulfileeq:       EQ: [0x562bab4441a0]: set sizes: filesiz=2560, maxblocks=3, readsiz=1024

Note that blocksmax=2 was used for the memory allocation of all file candidates, while blocksmax=3 was used during digest calculation for the files in the second batch of identical files.

@karelzak
Collaborator

1. Function `ul_fileeq_set_size` in `lib/fileeq.c` rounds down both `readsiz` and `blocksmax` variables when larger readsize is used
   https://github.com/util-linux/util-linux/blob/25e62e5e57f2e8cd04a63c7a9499ce9c81047925/lib/fileeq.c#L273-L280
   
   In the unfortunate case `get_digest` may request one extra block when reading the tail of the file.

It seems you're right. The calculation should be more robust. I'll make sure to use your suggestion.

2. The second problem is not so obvious. The size of per-file cache size may change during iterations inside `visitor` leading to small values of `blocksmax` during memory allocation for `data->blocks` buffers at early iterations and large values of `blocksmax` during `get_digest` calls at later iterations
   https://github.com/util-linux/util-linux/blob/25e62e5e57f2e8cd04a63c7a9499ce9c81047925/misc-utils/hardlink.c#L1093-L1101
   
   The overall idea of changing the `readsiz` value during iterations inside `visitor` looks very bad to me. If we call `ul_fileeq_set_size` and change the `readsiz` value, then we should also invalidate and recalculate all cached digests. I think `ul_fileeq_set_size` should be called only once during iterations inside `visitor` for consistency.

Each node in the tree contains files of the same size, and ul_fileeq_set_size() is called when we switch from one node (size) to another. All settings should be reset by ul_fileeq_data_deinit(). This means the ul_fileeq_set_size() settings should be valid for all files within the next for (other = master->next; other != NULL; other = other->next) loop.

The following commands reproduce this error:

I have no time to try it now, but I will do it later. Thanks for all the reproducers!

28653: ulfileeq:       EQ: [0x562bab4441a0]: set sizes: filesiz=2560, maxblocks=2, readsiz=1536
...
28653: ulfileeq:       EQ: [0x562bab4441a0]: set sizes: filesiz=2560, maxblocks=3, readsiz=1024

Note that blocksmax=2 was used for the memory allocation of all file candidates, while blocksmax=3 was used during digest calculation for the files in the second batch of identical files.

There should be an ul_fileeq_data_deinit() call between the EQ: [0x562bab4441a0]: set sizes: calls, and it deallocates the data->blocks. What I'm not sure about is eq->buf_a and eq->buf_b. These buffers are allocated only once in get_buffer() but not reset in ul_fileeq_data_deinit(). It seems fragile.

karelzak added a commit to karelzak/util-linux-work that referenced this issue Dec 26, 2024
The current code rounds down the values for readsiz and blocksmax,
which is incorrect. The sizes must be large enough to match the files.

Addresses: util-linux#3330
Signed-off-by: Karel Zak <[email protected]>
@karelzak
Collaborator

I think ul_fileeq_set_size() needs to deallocate eq->buf_a and eq->buf_b if already allocated.

karelzak added a commit that referenced this issue Jan 3, 2025
The current code rounds down the values for readsiz and blocksmax,
which is incorrect. The sizes must be large enough to match the files.

Addresses: #3330
Signed-off-by: Karel Zak <[email protected]>
(cherry picked from commit 70c1ffb)