Copying files with reflink does not respect holes added with fallocate(FALLOC_FL_PUNCH_HOLE).
Describe how to reproduce the problem
To reproduce on a dataset with recordsize=128k:
dd if=/dev/random of=out bs=1M count=1 status=none
zpool sync
fallocate -p -o 262144 -l 524288 out
cp --debug out out.2
diff -q out out.2
Output:
'out' -> 'out.2'
copy offload: unknown, reflink: yes, sparse detection: unknown
Files out and out.2 differ
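To narrow the comparison to the punched range rather than the whole file, a small helper along these lines may be useful. It is only a sketch: `range_is_zero` is a hypothetical name, not part of any existing tool, and it assumes `head -c` is available (GNU coreutils or busybox).

```shell
# range_is_zero FILE OFFSET LENGTH
# Hypothetical helper: succeeds (exit 0) if no non-NUL byte is read back
# from the given byte range of FILE.
# Caveat: a range that lies past the end of FILE also counts as "zero".
range_is_zero() {
    # tail -c +N starts output at byte N (1-indexed), hence OFFSET + 1.
    nonzero=$(tail -c +"$(( $2 + 1 ))" "$1" | head -c "$3" | tr -d '\000' | wc -c)
    [ "$nonzero" -eq 0 ]
}

# Intended use with the reproducer above (not executed here):
#   range_is_zero out   262144 524288 && echo "out: hole reads as zeros"
#   range_is_zero out.2 262144 524288 || echo "out.2: stale data in punched range"
```

With the behaviour reported here, the first check would be expected to succeed and the second to fail.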
The difference:
$ diff -u <(hexdump -C out) <(hexdump -C out.2)
--- /dev/fd/63 2024-03-19 08:34:33.496600405 -0400
+++ /dev/fd/62 2024-03-19 08:34:33.494600367 -0400
@@ -16382,8 +16382,32774 @@
0003ffd0 39 0d 00 a3 91 2f 00 b6 a8 79 6d cd c9 83 d8 42 |9..../...ym....B|
0003ffe0 b2 42 4e c3 9b 79 aa 96 66 68 8d 14 57 4b 98 32 |.BN..y..fh..WK.2|
0003fff0 1f e7 36 ff 63 c0 b6 53 fd 81 cb 9e c5 bb 39 f9 |..6.c..S......9.|
-00040000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
-*
+00040000 5b 1c ca ce 08 10 72 cb b9 a6 a2 c0 15 84 79 93 |[.....r.......y.|
+00040010 6a c1 55 9f cb 30 bb a9 a2 05 8d cb 7b a0 a3 42 |j.U..0......{..B|
+00040020 7a 32 05 9a 2a f5 29 91 9f 48 25 12 ac 4e 6b 09 |z2..*.)..H%..Nk.|
+00040030 ac 04 21 13 43 89 e1 96 c3 11 f1 dd e0 31 3c e4 |..!.C........1<.|
+00040040 1d db de 92 f1 67 6d dc d1 d4 5d 72 ae d9 de 99 |.....gm...]r....|
... snip
+000bffc0 d7 2c 5d bb a0 3b 32 11 37 d1 24 49 b8 0d 88 fc |.,]..;2.7.$I....|
+000bffd0 ea 79 9d df 25 ae 3d 16 c6 fd 5c 64 b2 9f 56 f2 |.y..%.=...\d..V.|
+000bffe0 4e d6 5d 4c a9 0b 83 47 51 ac 06 5b ec 0c 49 61 |N.]L...GQ..[..Ia|
+000bfff0 de 7c 87 0d e8 bc 8e f4 e3 b2 ef 07 96 3c fd a6 |.|...........<..|
000c0000 e7 4e 28 f9 dc f8 f8 41 8b 1a d1 62 9d 4c f3 93 |.N(....A...b.L..|
000c0010 66 88 ad ef 46 ef 78 11 19 08 c0 cb 3c 1a d0 ce |f...F.x.....<...|
000c0020 ab ba c3 c5 38 c3 77 95 88 65 d5 b0 28 d5 61 93 |....8.w..e..(.a.|
The issue does not happen if zpool sync is added between fallocate and cp.
Include any warning/errors/backtraces from the system logs
Nothing of note
Possible root cause
Punched holes for full records are processed only at sync time. The L1 block(s) for the affected range get dirtied, but none of the L0 blocks get dirtied.
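As a quick check of which records the reproducer's punch covers in full, the arithmetic can be sketched as follows. `punched_records` is a hypothetical helper, and it assumes that only records lying entirely inside the punched range are freed as whole records, as the description above implies.

```shell
# punched_records OFFSET LENGTH RECORDSIZE
# Hypothetical helper: prints the record indices and byte range of the
# records that lie entirely inside [OFFSET, OFFSET + LENGTH).
punched_records() {
    offset=$1 length=$2 recsize=$3
    first=$(( (offset + recsize - 1) / recsize ))   # first fully covered record
    end=$(( (offset + length) / recsize ))          # one past the last covered record
    if [ "$end" -gt "$first" ]; then
        printf 'records %d..%d, bytes 0x%x..0x%x\n' \
            "$first" "$(( end - 1 ))" "$(( first * recsize ))" "$(( end * recsize - 1 ))"
    else
        echo 'no full records'
    fi
}

# Values from the reproducer: fallocate -p -o 262144 -l 524288, recordsize=128k
punched_records 262144 524288 131072
# prints: records 2..5, bytes 0x40000..0xbffff
```

The byte range matches the region where the hexdump diff shows stale data in out.2 (00040000 through 000bffff).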
When holes are read back, dbuf_read_hole detects that the record is within a freed range via dnode_block_freed and synthesizes a zeroed record. Note that unsynced holes still have the original non-hole blkptr set on the parent L1 block, so until sync clears the pointer, reads must specifically skip it whenever the dbuf is not cached.
dmu_read_l0_bps does not check for holes and relies on the L0 record being dirtied to detect unsynced changes, so zfs_clone_range gets it wrong. It 1) accepts the file as clean and 2) copies the stale blkptrs as part of the clone.
Adding some zdb outputs to the reproducer clearly shows the blocks are being miscopied: