Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

systemd test-copy test fails for lseek(SEEK_HOLE/SEEK_DATA) – punched holes not reported as such #13261

Closed
nabijaczleweli opened this issue Mar 28, 2022 · 6 comments
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)

Comments

@nabijaczleweli
Copy link
Contributor

System information

Type Version/Name
Distribution Name Debian
Distribution Version Bullseye
Kernel Version Linux tarta 5.10.0-11-amd64 #1 SMP Debian 5.10.92-2 (2022-02-28) x86_64 GNU/Linux
Architecture amd64
OpenZFS Version zfs-2.1.2-1.1

Describe the problem you're observing

$ ./build/test-copy
/* test_copy_file */
/* test_copy_file_fd */
/* test_copy_tree */
/* test_copy_bytes */
/* test_copy_bytes_regular_file */
test_copy_bytes_regular_file_one try_reflink=no max_bytes=18446744073709551615
test_copy_bytes_regular_file_one try_reflink=yes max_bytes=18446744073709551615
test_copy_bytes_regular_file_one try_reflink=no max_bytes=1000
test_copy_bytes_regular_file_one try_reflink=yes max_bytes=1000
test_copy_bytes_regular_file_one try_reflink=no max_bytes=32000
test_copy_bytes_regular_file_one try_reflink=yes max_bytes=32000
/* test_copy_atomic */
/* test_copy_proc */
/* test_copy_holes */
Assertion 'lseek(fd_copy, 0, SEEK_HOLE) == 0' failed at src/test/test-copy.c:363, function int test_copy_holes(void)(). Aborting.
Aborted

Describe how to reproduce the problem

Clone https://github.com/systemd/systemd (I'm at 6b72105a780c0b88c81f979ce6d5651c5dee9b86) but that shouldn't matter, meson build, ninja -C build test.

Include any warning/errors/backtraces from the system logs

The test in question is

        assert_se(fstat(fd, &stat) >= 0);
        blksz = stat.st_blksize;
        buf = alloca0(blksz);

        /* We need to make sure to create hole in multiples of the block size, otherwise filesystems (btrfs)
         * might silently truncate/extend the holes. */

        assert_se(lseek(fd, blksz, SEEK_CUR) >= 0);
        assert_se(write(fd, buf, blksz) >= 0);
        assert_se(lseek(fd, 0, SEEK_END) == 2 * blksz);
        /* Only ftruncate() can create holes at the end of a file. */
        assert_se(ftruncate(fd, 3 * blksz) >= 0);
        assert_se(lseek(fd, 0, SEEK_SET) >= 0);

        assert_se(copy_bytes(fd, fd_copy, UINT64_MAX, COPY_HOLES) >= 0);

        /* Test that the hole starts at the beginning of the file. */
        assert_se(lseek(fd_copy, 0, SEEK_HOLE) == 0);
        /* Test that the hole has the expected size. */
        assert_se(lseek(fd_copy, 0, SEEK_DATA) == blksz);
        assert_se(lseek(fd_copy, blksz, SEEK_HOLE) == 2 * blksz);
        assert_se(lseek(fd_copy, 2 * blksz, SEEK_DATA) < 0 && errno == ENXIO);

        /* Test that the copied file has the correct size. */
        assert_se(fstat(fd_copy, &stat) >= 0);
        assert_se(stat.st_size == 3 * blksz);

Which creates a file consisting of [128k hole (sought)] + [128k data] + [128k hole (truncated)].

The copied data is correct, I've verified that manually, but the lseek(SEEK_DATA/SEEK_HOLE) always return no holes. Consider the strace for ext4:

writev(2, [{iov_base="/* test_copy_holes */", iov_len=21}, {iov_base="\n", iov_len=1}], 2) = 22
umask(077)                              = 022
getpid()                                = 2864651
openat(AT_FDCWD, "/tmp/ext4/test-copy-hole-fd-aobaiF", O_RDWR|O_CREAT|O_EXCL|O_CLOEXEC, 0600) = 3
umask(022)                              = 077
umask(077)                              = 022
getpid()                                = 2864651
openat(AT_FDCWD, "/tmp/ext4/test-copy-hole-fd2-JX1HkD", O_RDWR|O_CREAT|O_EXCL|O_CLOEXEC, 0600) = 4
umask(022)                              = 077
fallocate(3, FALLOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE, 0, 1) = 0
fstat(3, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
lseek(3, 4096, SEEK_CUR)                = 4096
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
lseek(3, 0, SEEK_END)                   = 8192
ftruncate(3, 12288)                     = 0
lseek(3, 0, SEEK_SET)                   = 0
/* copy_bytes start */
lseek(3, 0, SEEK_CUR)                   = 0
lseek(3, 0, SEEK_DATA)                  = 4096
lseek(4, 0, SEEK_CUR)                   = 0
lseek(4, 0, SEEK_END)                   = 0
ftruncate(4, 4096)                      = 0
lseek(4, 0, SEEK_END)                   = 4096
lseek(3, 4096, SEEK_HOLE)               = 8192
lseek(3, 4096, SEEK_SET)                = 4096
copy_file_range(3, NULL, 4, NULL, 4096, 0) = 4096
lseek(3, 0, SEEK_CUR)                   = 8192
lseek(3, 8192, SEEK_DATA)               = -1 ENXIO (No such device or address)
lseek(3, 0, SEEK_END)                   = 12288
lseek(4, 0, SEEK_CUR)                   = 8192
lseek(4, 0, SEEK_END)                   = 8192
ftruncate(4, 12288)                     = 0
lseek(4, 0, SEEK_END)                   = 12288
lseek(3, 12288, SEEK_HOLE)              = -1 ENXIO (No such device or address)
/* copy_bytes end */
lseek(4, 0, SEEK_HOLE)                  = 0
lseek(4, 0, SEEK_DATA)                  = 4096
lseek(4, 4096, SEEK_HOLE)               = 8192
lseek(4, 8192, SEEK_DATA)               = -1 ENXIO (No such device or address)
fstat(4, {st_mode=S_IFREG|0600, st_size=12288, ...}) = 0

with the one for ZFS:

umask(077)                              = 022
getpid()                                = 2847592
openat(AT_FDCWD, "/var/tmp/test-copy-hole-fd-KnDiDM", O_RDWR|O_CREAT|O_EXCL|O_CLOEXEC, 0600) = 3
umask(022)                              = 077
umask(077)                              = 022
getpid()                                = 2847592
openat(AT_FDCWD, "/var/tmp/test-copy-hole-fd2-H1aAFM", O_RDWR|O_CREAT|O_EXCL|O_CLOEXEC, 0600) = 4
umask(022)                              = 077
fallocate(3, FALLOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE, 0, 1) = 0
fstat(3, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
lseek(3, 131072, SEEK_CUR)              = 131072
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0....", 131072) = 131072
lseek(3, 0, SEEK_END)                   = 262144
ftruncate(3, 393216)                    = 0
lseek(3, 0, SEEK_SET)                   = 0
/* copy_bytes start */
lseek(3, 0, SEEK_CUR)                   = 0
lseek(3, 0, SEEK_DATA)                  = 0
lseek(3, 0, SEEK_HOLE)                  = 393216
lseek(3, 0, SEEK_SET)                   = 0
copy_file_range(3, NULL, 4, NULL, 393216, 0) = 393216
lseek(3, 0, SEEK_CUR)                   = 393216
lseek(3, 393216, SEEK_DATA)             = -1 ENXIO (No such device or address)
lseek(3, 0, SEEK_END)                   = 393216
lseek(3, 393216, SEEK_HOLE)             = -1 ENXIO (No such device or address)
/* copy_bytes end */
lseek(4, 0, SEEK_HOLE)                  = 393216
writev(2, [{iov_base="Assertion 'lseek(fd_copy, 0, SEEK_HOLE) == 0' failed at src/test/test-copy.c:361, function int test_copy_holes(void)(). Aborting.", iov_len=129}, {iov_base="\n", iov_len=1}], 2) = 130

I don't think this is a big deal, or a deal at all (since, well, in the default configuration with compress=lz4 the whole file becomes a hole), but I'm posting this before I comment it out.

@nabijaczleweli nabijaczleweli added the Type: Defect Incorrect behavior (e.g. crash, hang) label Mar 28, 2022
@rincebrain
Copy link
Contributor

rincebrain commented Mar 28, 2022

#12724 says hello, I suspect.

@behlendorf
Copy link
Contributor

And #12746 which has been applied to the master branch, but not the 2.1 release branch.

@nabijaczleweli
Copy link
Contributor Author

nabijaczleweli commented Mar 29, 2022

Heh, setting zfs_dmu_offset_next_sync=1 proceeds for one more syscall:

lseek(4, 0, SEEK_HOLE)                  = 0
lseek(4, 0, SEEK_DATA)                  = -1 ENXIO (No such device or address)

because it fails on the next assertion:

Assertion 'lseek(fd_copy, 0, SEEK_DATA) == blksz' failed at src/test/test-copy.c:362, function int test_copy_holes(void)(). Aborting.

Which is expected with compress=zstd; setting compress=off makes the test work.

Doing getrandom(buf, blksz, 0) (or memset(buf, 1, blksz)) before the write has the test succeeding in all configurations.

@behlendorf
Copy link
Contributor

Right, that's the expected behavior when compression is enabled.

@nabijaczleweli
Copy link
Contributor Author

nabijaczleweli commented Mar 29, 2022

Posted systemd fix in systemd/systemd#22902, and default configuration was fixed to what it expects (do we want to pull it into 2.1.5?), so closing.

@behlendorf
Copy link
Contributor

behlendorf commented Mar 29, 2022

We held off pulling it in to 2.1 release right away until we better understood any real world performance impact. It's now been in master for almost 4 months and we haven't heard and complaints, so I think it'd be reasonable to pull #12746 in for 2.1.5.

nabijaczleweli pushed a commit to nabijaczleweli/zfs that referenced this issue Mar 29, 2022
Strict hole reporting was previously disabled by default as a
performance optimization.  However, this has lead to confusion
over the expected behavior and a variety of workarounds being
adopted by consumers of ZFS.  Change the default behavior to
always report holes and force the TXG sync.

Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Upstream-commit: 05b3eb6
Ref: openzfs#13261
Closes openzfs#12746
behlendorf added a commit that referenced this issue Apr 1, 2022
Strict hole reporting was previously disabled by default as a
performance optimization.  However, this has lead to confusion
over the expected behavior and a variety of workarounds being
adopted by consumers of ZFS.  Change the default behavior to
always report holes and force the TXG sync.

Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Upstream-commit: 05b3eb6
Ref: #13261
Closes #12746
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)
Projects
None yet
Development

No branches or pull requests

3 participants