-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SEEK_DATA fails randomly #11900
Comments
@zmedico fyi |
Any chance #11910 is related? |
See https://bugs.chromium.org/p/chromium/issues/detail?id=1198679 for more of the nitty-gritty debugging. |
@ismell it's not clear to me from the comment above, but does the patch you posted above avoid the issue in your testing? One plausible way you might try and reproduce this is to write to the file while performing the |
No the patch above was just to confirm that the pointer was at the start of the file. I haven't yet been able to find an easy way to reproduce the problem. There should be no writers to the input file at this point. I'll keep working on trying to find an easy way to reproduce it. Is there anything else I should be tracing that will help? |
One thing you could try to rule out the dirty data case would be to add an |
Thanks for the tip. So I wrote a script that calls No patch
With fsyncdiff --git a/src/portage_util_file_copy_reflink_linux.c b/src/portage_util_file_copy_reflink_linux.c
index 352342c06..004b8618a 100644
--- a/src/portage_util_file_copy_reflink_linux.c
+++ b/src/portage_util_file_copy_reflink_linux.c
@@ -140,8 +140,10 @@ do_lseek_data(int fd_out, int fd_in, off_t *off_out) {
/* Use lseek SEEK_DATA/SEEK_HOLE for sparse file support,
* as suggested in the copy_file_range man page.
*/
- off_t offset_data, offset_hole;
+ off_t offset_data, offset_hole, current;
+ fsync(fd_in);
+ current = lseek(fd_in, 0, SEEK_CUR);
offset_data = lseek(fd_in, *off_out, SEEK_DATA);
if (offset_data < 0) {
if (errno == ENXIO) {
It looks like adding |
@ismell if you're comfortable rebuilding ZFS for the testing I could provide a debug patch which should give us a little more insight. |
I'm sure I can figure it out :) |
In that case, would you mind testing commit f73e0a6 from the https://github.com/behlendorf/zfs/tree/zfs-holey branch. It implements a possible fix by closing a small race. It's only been lightly tested since I wasn't able to reproduce the issue. |
Sorry for the long cycle time. I was waiting until I was forced to reboot my computer. So I tried
This normally reproduced within a few iterations. So maybe it was fixed already? |
To be clear you're saying that even without the path you're unable to reproduce the issue now? There weren't any other recent changes in this area which I would have expected to fix this. |
The dn_struct_rwlock lock which prevents the dnode structure from changing should be held over the dirty check and the subsequent call to dnode_next_offset(). This ensures that a clean dnode can't be dirtied before the the data/hole is located. Furthermore, refactor the code to provide a dnode_is_dirty() helper function which performs the multilist_link_active() check while holding the correct multilist lock. Signed-off-by: Brian Behlendorf <[email protected]> Issue openzfs#11900
So I went back to the original version:
It looks like I can't reproduce the issue anymore...
I'm wondering if rebooting helped clear what ever bad state I was in... |
I've started hitting a similar issue which manifests as More precisely:
I eventually somehow figured I should try downgrading coreutils from 9.0 -> 8.32(-r1 in Gentoo) which made the issue disappear: rebuilding Go worked! coreutils 9.0 changed some default behaviours around reflinking, which ZFS doesn't have, and sparse files, but I tried forcing them off in the I then decided to bisect and arrived at:
I've straced Portage (just for the lseek syscall) while installing Go and it is filled with:
TL;DR: I seem to be hitting this bug as a result of changes in coreutils which exposes the problem. Please let me know if you feel this is a different issue and I should open a new bug. system information (emerge --info)Portage 3.0.23 (python 3.10.0-candidate-2, default/linux/amd64/17.1/hardened, gcc-11.2.0, glibc-2.33-r7, 5.14.7-adry x86_64) ================================================================= System uname: Linux-5.14.7-adry-x86_64-11th_Gen_Intel-R-_Core-TM-_i7-11800H_@_2.30GHz-with-glibc2.33 KiB Mem: 16105692 total, 1991680 free KiB Swap: 0 total, 0 free Timestamp of repository gentoo: Sun, 26 Sep 2021 20:51:27 +0000 Head commit of repository gentoo: fe49ef8e5b7b42f9fcb1f7bf36fbf4c571b8b120 EDIT: Interestingly, upstream made a kind-of-related remark in a commit after the release: Upstream commit mentioning ZFStests: cp/sparse-perf: make more robust and add zfs comments
@behlendorf I've tried applying the patch you suggested to ZFS 2.1.1 but unfortunately, while it applies, it ends up failing at build time. Is there any chance you're able to please rebase it? The failure looks like:
|
The dn_struct_rwlock lock which prevents the dnode structure from changing should be held over the dirty check and the subsequent call to dnode_next_offset(). This ensures that a clean dnode can't be dirtied before the data/hole is located. Furthermore, refactor the code to provide a dnode_is_dirty() helper function which performs the multilist_link_active() check while holding the correct multilist lock. Signed-off-by: Brian Behlendorf <[email protected]> Issue openzfs#11900
@thesamesam thanks for continuing the investigation of this issue. I've rebased my https://github.com/behlendorf/zfs/tree/zfs-holey branch and resolved the build failure (commit 1722451). If you can give it a try that would be great. |
@behlendorf Thanks! (Note: I hadn't been able to reproduce this on other ZFS'd machines with pretty similar setups, so I/O speed might be a factor, which I guess is obvious given it's racey). Applied to 2.1.1 and built fine (rebuilt initramfs and so on, and rebooted) but no luck. Same corrupted I'm happy to enable more debugging as necessary, especially given I know this is tricky given you can't reproduce it on your end. Here's a partial strace during this cp line from the ebuild (I've just picked out the bits around copying the corrupted binaries, but I can get you more if it's useful):
EDIT: I did out of curiosity try |
openzfs/zfs#11900 Signed-off-by: Georgy Yakovlev <[email protected]>
That's very surprising. When set ZFS should always report the correct holes at the cost of performance. Were you able to reproduce this behavior with both the unmodified and patched code? |
Yeah, I just tried again unpatched and it happens regardless of the value of the toggle (0 or 1). Happy to try enabling more debugging if can think of a way to toggle it on/off easily during the merge. |
Looking at the strace output what appears to be happening is:
The story is similar for |
Ah, it gets renamed during the build process:
Here's the output for 10 lines after every instance of the path being mentioned:
You're welcome to the full 550MB of strace spew if you want it rather than me trying to clumsily filter, but I appreciate that's no holiday for you either! |
The dn_struct_rwlock lock which prevents the dnode structure from changing should be held over the dirty check and the subsequent call to dnode_next_offset(). This ensures that a clean dnode can't be dirtied before the data/hole is located. Furthermore, refactor the code to provide a dnode_is_dirty() helper function which checks the dnode for any dirty records to determine its dirtiness. Finally, hold the znodes range lock to prevent any raciness with concurrent reads or writes. Signed-off-by: Brian Behlendorf <[email protected]> Issue openzfs#11900
@thesamesam would you mind trying this updated patch with your test setup, 6338227. It would be great if you could try it with |
This can probably be closed now, unless someone's reproduced it since the fixes? |
When using lseek(2) to report data/holes memory mapped regions of the file were ignored. This could result in incorrect results. To handle this zfs_holey_common() was updated to asynchronously writeback any dirty mmap(2) regions prior to reporting holes. Additionally, while not strictly required, the dn_struct_rwlock is now held over the dirty check to prevent the dnode structure from changing. This ensures that a clean dnode can't be dirtied before the data/hole is located. The range lock is now also taken to ensure the call cannot race with zfs_write(). Furthermore, the code was refactored to provide a dnode_is_dirty() helper function which checks the dnode for any dirty records to determine its dirtiness. Reviewed-by: Matthew Ahrens <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Rich Ercolani <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue openzfs#11900 Closes openzfs#12724
In addition to flushing memory mapped regions when checking holes, commit de198f2 modified the dirty dnode detection logic to check the dn->dn_dirty_records instead of the dn->dn_dirty_link. Relying on the dirty record has not be reliable, switch back to the previous method. Signed-off-by: Brian Behlendorf <[email protected]> Issue openzfs#11900 Closes openzfs#12745
This reverts commit 4b3133e671b958fa2c915a4faf57812820124a7b upstream. See #14753 - possible corruption again, very similar symptoms to the nightmare that was #11900 and same area of code. We can safely revert it as it's an optimisation rather than a bugfix in itself. Keen Gentoo users may remember the following commits (in Gentoo): - 8e5626d - 9fb275f - 95c250a Bug: openzfs/zfs#14753 Bug: openzfs/zfs#11900 Signed-off-by: Sam James <[email protected]>
openzfs/zfs#11900 Signed-off-by: Georgy Yakovlev <[email protected]>
openzfs/zfs#12724 openzfs/zfs#11900 Signed-off-by: Georgy Yakovlev <[email protected]>
Seems corruption still happening with coreutils-9 even with the SEEK patch applied Bug: https://bugs.gentoo.org/815469 openzfs/zfs#11900 Signed-off-by: Georgy Yakovlev <[email protected]>
Preliminary investigation suggests this patch (and earlier iterations) may introduce regressions. Let's yank it for now while we continue the investigation with upstream. Bug: https://bugs.gentoo.org/815469 See: openzfs/zfs#11900 Signed-off-by: Sam James <[email protected]>
Preliminary investigation suggests this patch (and earlier iterations) may introduce regressions. Let's yank it for now while we continue the investigation with upstream. Bug: https://bugs.gentoo.org/815469 See: openzfs/zfs#11900 Signed-off-by: Sam James <[email protected]>
take two Bug: https://bugs.gentoo.org/815469 openzfs/zfs#11900 Signed-off-by: Georgy Yakovlev <[email protected]>
This reverts commit 4b3133e671b958fa2c915a4faf57812820124a7b upstream. See #14753 - possible corruption again, very similar symptoms to the nightmare that was #11900 and same area of code. We can safely revert it as it's an optimisation rather than a bugfix in itself. Keen Gentoo users may remember the following commits (in Gentoo): - 8e5626dc90e4e6166c2e296371b6ff5a9d13a8c4 - 9fb275f656de639e25acc9497b70b4cae593d35d - 95c250a3f3986b2bc2091dd3981ff1e1d3de0c73 Bug: openzfs/zfs#14753 Bug: openzfs/zfs#11900 Signed-off-by: Sam James <[email protected]>
As a mitigation until more is understood and fixes are tested & reviewed, change the default of zfs_dmu_offset_next_sync from 1 to 0, as it was before 05b3eb6d232009db247882a39d518e7282630753 upstream. There are no reported cases of The Bug being hit with zfs_dmu_offset_next_sync=1: that does not mean this is a cure or a real fix, but it _appears_ to be at least effective in reducing the chances of it happening. By itself, it's a safe change anyway, so it feels worth us doing while we wait. Note that The Bug has been reproduced on 2.1.x as well, hence we do it for both 2.1.13 and 2.2.1. Bug: openzfs/zfs#11900 Bug: openzfs/zfs#15526 Bug: https://bugs.gentoo.org/917224 Signed-off-by: Sam James <[email protected]>
This helps mitigate a data corruption bug. This was previously defaulted to zero prior to upstream commit 05b3eb6d232009db247882a39d518e7282630753, and it is already a tunable, so doing this seems safe. Initially data corruption was thought to be introduced with v2.2, but further upstream investigation currently believes that changes in v2.2 only exacerbated the issue that already existed. A longer-term fix is likely to be openzfs/zfs#15571, though that is not yet merged. The zfs_2_1 package has already backported that, so do not apply the tunable default change there. Positioning of `extraPatches` is to avoid merge conflicts with NixOS#269097. Patch is nearly identical to the [Gentoo][1] patch, but better patch formatting. See openzfs/zfs#11900 See openzfs/zfs#15526 [1]: https://gitweb.gentoo.org/repo/gentoo.git/tree/sys-fs/zfs-kmod/files/zfs-kmod-2.2.1-Disable-zfs_dmu_offset_next_sync-tunable-by-default.patch
This helps mitigate a data corruption bug. This was previously defaulted to zero prior to upstream commit 05b3eb6d232009db247882a39d518e7282630753, and it is already a tunable, so doing this seems safe. Initially data corruption was thought to be introduced with v2.2, but further upstream investigation currently believes that changes in v2.2 only exacerbated the issue that already existed. A longer-term fix is likely to be openzfs/zfs#15571, though that is not yet merged. The zfs_2_1 package has already backported that, so do not apply the tunable default change there. Positioning of `extraPatches` is to avoid merge conflicts with NixOS#269097. Patch is nearly identical to the [Gentoo][1] patch, but better patch formatting. See openzfs/zfs#11900 See openzfs/zfs#15526 [1]: https://gitweb.gentoo.org/repo/gentoo.git/tree/sys-fs/zfs-kmod/files/zfs-kmod-2.2.1-Disable-zfs_dmu_offset_next_sync-tunable-by-default.patch (cherry picked from commit 9670ddc)
As a mitigation until more is understood and fixes are tested & reviewed, change the default of zfs_dmu_offset_next_sync from 1 to 0, as it was before 05b3eb6d232009db247882a39d518e7282630753 upstream. There are no reported cases of The Bug being hit with zfs_dmu_offset_next_sync=1: that does not mean this is a cure or a real fix, but it _appears_ to be at least effective in reducing the chances of it happening. By itself, it's a safe change anyway, so it feels worth us doing while we wait. Note that The Bug has been reproduced on 2.1.x as well, hence we do it for both 2.1.13 and 2.2.1. Bug: openzfs/zfs#11900 Bug: openzfs/zfs#15526 Bug: https://bugs.gentoo.org/917224 Signed-off-by: Sam James <[email protected]>
System information
Describe the problem you're observing
lseek
+SEEK_DATA
is randomly returning-ENXIO
.Code showing the problem: https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/third_party/portage_tool/src/portage_util_file_copy_reflink_linux.c;l=145
I patched it with the following:
On a successful run this is what the strace looks like(depthcharge.elf):
When a failure happens this is what it looks like (netboot.elf):
man lseek
shows:Describe how to reproduce the problem
I don't have a simple way to reproduce this yet. It happens as part of a
portage
build. I have attached a full strace log. The problem shows up randomly, so running the same command multiple times will cause different files to fail to copy.Include any warning/errors/backtraces from the system logs
depthcharge-strace-patch1.log
The text was updated successfully, but these errors were encountered: