[5.3.11] WARNING: CPU: include/os/linux/kernel/linux/simd_x86.h:204 zvol_fini+0x2463e/0x5ef0b [zfs] #9608
There are different traces and modules/calls beyond raidz involved (from the full log): fletcher_4_*, __switch_to_asm, raidz_will_scalar_work, vdev_raidz_reconstruct
@zachariasmaladroit the warning indicates an error was returned when trying to save the vector registers. Unfortunately, it doesn't log the exact error number returned. Are the warnings consistently reproducible after a system reset? The strange thing is that I don't see any relevant kernel changes between the mainline 5.3.6 and 5.3.11 tags. Would it be possible for you to confirm you do not see this warning with the 5.3.6 kernel and the same (or newer) ZFS version? @Fabian-Gruenbichler would you mind taking a look at this to see if there's something I overlooked?
Yes, they are; they appear every time the module is loaded.
That's the route I will have to go either way. I probably deleted the sources, kernel, and modules already, but luckily the root filesystem and /boot are not on ZFS, so basic access and system maintenance are still possible; yep, will do. The only config change from 5.3.6 to 5.3.11 I remember was the TSX option, which I explicitly disabled (for some strange reason it didn't ask for it with 5.3.6 when upgrading from 5.2*). It's unsafe on Haswell, so it must not be used: https://en.wikipedia.org/wiki/Transactional_Synchronization_Extensions Thanks
I'll try to reproduce it. @zachariasmaladroit could you share your kernel config and USE/C* flags?
Cannot reproduce here with Gentoo, 5.3.12 with and without that BMQ patchset, current ZoL master.
Just to be sure, have you run a memtest recently?
I am experiencing the same issue on 5.3.11.a-1-hardened (Arch's linux-hardened kernel) with ZoL master built yesterday. I reproduced it in a virtual machine on a Xeon Broadwell machine with ECC memory. I do not believe this is a hardware fault, as multiple machines running the same image were affected, some with ECC. All affected machines have encrypted storage. I got to this bug because I am investigating a possible data corruption or memory corruption issue, as unexpected segfaults and crashes have been detected but not yet root caused.
.config for the kernel is here: https://git.archlinux.org/svntogit/packages.git/tree/trunk/config?h=packages/linux-hardened |
Example from the same VM of possible storage/memory corruption: https://pastebin.com/ckJjQm0X I'm still not certain the above is caused by ZoL master, but adding it as a data point. Once again, ECC RAM and multiple machines exhibiting this on 5.3.11-hardened + ZoL.
Nope, it's highly unlikely that the system has issues; there weren't any stability issues during gaming on Windows, and no EDAC errors were reported under Linux (before I was unable to fully work with it once that error occurred). The probability is very high that it's the build system changes and the things happening "in the background", so to speak (during the module build and the restructuring or addition of files, features, etc. in the repo). The error was also shown with a newly built 5.3.6 kernel. I did some digging related to the toolchain (I'm also using a custom hardened toolchain enabled via USE flags) and system components, but GCC, binutils, etc. weren't changed in that time window, or those messages would have had to show up earlier, which didn't happen. Next I checked zfs-kmod, and the last known "good" builds were from October 13th 👍
The "broken" builds, which completed successfully but showed these messages, were from November 19th:
I ran kdiff3 on the build logs of both builds (luckily both were still there), and several files appear to differ significantly. Keeping in mind the stability aspect of the stable (non-master) branches, and that there shouldn't be many changes added that alter behavior in the background (build system, caching of variables, etc.), 0.8.2 was built (luckily, 5.3 kernel support is there as well, otherwise I would have been forced to downgrade). Atypically, the build took exceptionally long (in a rather "bad" way):
A few master-branch build times for "reference":
(the worst from master was on August 30th, 2018). The build went fine, and the following dmesg output is everything reported with it:
I got the message as well with 5.3.12-ck1. Thanks for trying to reproduce; it looks very likely that it's build-system and/or hardened-toolchain related. Recent changes might have introduced it between 2c6fa6e and 8221bcf, it seems... Some assorted system variable info:
Since the "Linux 4.14, 4.19, 5.0+ compat: SIMD save/restore" changes were introduced on October 24th, and the "working" kernel without those errors was built before that (October 13th), it looks like a regression, or additional changes are necessary to get it working. https://github.com/zfsonlinux/zfs/blob/master/include/os/linux/kernel/linux/simd_x86.h#L204 was mentioned in the error message/warning, so I simply checked for changes to that file.
Related to #9515, since that will re-add SIMD support to the 0.8 branch, and it gives me some ideas to check build logs for messages. Thanks.
@zachariasmaladroit is your kernel also including some "hardening" patch set? If so:
@bugcommenter also, to make sure: what do you mean by "encrypted storage"? dm-crypt? ZoL native encryption?
@Fabian-Gruenbichler no, it was a vanilla 5.3.6 kernel from kernel.org
CC: @ryao since he maintains the Gentoo ebuilds and perhaps knows of something related to the hardened toolchain. PIE/PIC and stack-protector might be something special, though.
@Fabian-Gruenbichler I do not experience any issues with data or kernel bugs with the default Arch kernel. The differences are both the patchset and the kernel config, the latter of which I linked above. By encrypted storage, I mean ZoL native encryption; however, I was seeing issues on both encrypted and unencrypted datasets in the pool, as far as I can tell.
@bugcommenter able to reproduce with Arch and zfs-linux-hardened-git; I will continue to narrow it down further to find the cause (currently compiling linux-hardened with the stock config to see whether it's the config or the patches).
Compiling the 5.3.13 hardened kernel with the stock config + make olddefconfig results in a kernel without the issue. Strangely, the kernel used by @zachariasmaladroit does not share any of the delta w.r.t. Kconfig values. https://bugzilla.kernel.org/show_bug.cgi?id=205663 might potentially be relevant; currently test-compiling the hardened kernel + hardened config + this patch.
The patch mentioned in the previous comment does not help; here's the decoded Oops from Arch linux-hardened 5.3.13:
vs. the one(s) from the original report:
Possibly two similar, but not completely identical, issues? Or just an artifact of different (hardening) patches...
@bugcommenter for Arch, it seems like SLAB_CANARY is causing the issue to be triggered. If I make the state allocations bigger than the regular 4k/sizeof(union fpregs_state) (which is already more than enough), it does not trigger anymore (a different SLAB code path is taken). I am not yet sure whether the bug is in SLAB_CANARY, our new state storage code, or the fletcher code/helpers ;)
And it's also reproducible on "normal" 5.3 kernels booted with "slub_debug=Z" (which also adds extra stuff around allocations, similar to SLAB_CANARY). Whether it's a full-fledged panic or just warnings seems to depend on which SIMD instruction set is available (e.g., QEMU CPU type 'host' vs. 'kvm64'). On systems with just SSE2, loading zcommon leads to a panic. On systems supporting AVX, the initial benchmarking spews warnings identical to the panic, but only the initial benchmarking code seems to be affected; regular ZFS operations afterwards continue to work fine with no side effects observed. @bugcommenter @zachariasmaladroit can you confirm that your systems don't have AVX support? (/proc/cpuinfo)
FWIW, here's a full trace with slub_debug=FPUZ on the stock Arch 5.3.13 kernel, triggered by "modprobe zcommon" on a VM with AVX support:
loading the
Were you able to log the
The lack of console warnings after the benchmark is likely because of |
@Fabian-Gruenbichler CPU feature flags from my test environment: |
@behlendorf it's a bug in simd_x86.h - a rather simple one at that:
kmalloc(_node) allocations are usually aligned to the allocation size if that is a power of 2, and at least 8-byte aligned in general. Enabling slub_debug, KASAN, SLAB_CANARY, etc. makes this "usually" void: the allocated memory regions where we store the FPU state are not correctly aligned, and thus state saving fails. There is an upstream patch set that attempts to make kmalloc always aligned, but it's not yet finalized. In any case, we need a solution that works for all kernels anyway ;) I'll open up a PR to switch this part over to kmem_cache, which allows specifying a 64-byte alignment. Unfortunately, that means losing node locality, since the SPL wrappers don't support kmem_cache_alloc_node, unless I am missing something. But let's continue the discussion there in a minute or two ;)
@Fabian-Gruenbichler my system definitely has avx avx2 support |
fxsave and xsave require the target address to be 16-/64-byte aligned. kmalloc(_node) does not (yet) offer such fine-grained control over alignment[0,1], even though it does "the right thing" most of the time for power-of-2 sizes. unfortunately, alignment is completely off when using certain debugging or hardening features/configs, such as KASAN, slub_debug=Z or the not-yet-upstream SLAB_CANARY. resort to (spl_)kmem_cache_alloc, which enables us to specify an explicit alignment. CPU/NUMA-node locality is lost, since spl_kmem_cache does not have kmem_cache_alloc_node support. Fixes: openzfs#9608 0: https://lwn.net/Articles/787740/ 1: https://lore.kernel.org/linux-block/[email protected]/ Signed-off-by: Fabian Grünbichler <[email protected]>
fxsave and xsave require the target address to be 16-/64-byte aligned. kmalloc(_node) does not (yet) offer such fine-grained control over alignment[0,1], even though it does "the right thing" most of the time for power-of-2 sizes. unfortunately, alignment is completely off when using certain debugging or hardening features/configs, such as KASAN, slub_debug=Z or the not-yet-upstream SLAB_CANARY. spl_kmem_cache_alloc does not expose NUMA/node-aware allocation, since this is linux-specific code the stock kernel SLAB interface via 'kmem_cache_create/alloc_node/free' is a better fit here. since the SPL wrapper/implementation re-defines the kernel methods, we temporarily undefine them if necessary. Fixes: openzfs#9608 0: https://lwn.net/Articles/787740/ 1: https://lore.kernel.org/linux-block/[email protected]/ Signed-off-by: Fabian Grünbichler <[email protected]>
fxsave and xsave require the target address to be 16-/64-byte aligned. kmalloc(_node) does not (yet) offer such fine-grained control over alignment[0,1], even though it does "the right thing" most of the time for power-of-2 sizes. unfortunately, alignment is completely off when using certain debugging or hardening features/configs, such as KASAN, slub_debug=Z or the not-yet-upstream SLAB_CANARY. Use alloc_pages_node() instead which allows us to allocate page-aligned memory. Since pages are a minimum of 4K this is guaranteed to satisfy the alignment constraints. Fixes: openzfs#9608 0: https://lwn.net/Articles/787740/ 1: https://lore.kernel.org/linux-block/[email protected]/ Signed-off-by: Fabian Grünbichler <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]>
fxsave and xsave require the target address to be 16-/64-byte aligned. kmalloc(_node) does not (yet) offer such fine-grained control over alignment[0,1], even though it does "the right thing" most of the time for power-of-2 sizes. unfortunately, alignment is completely off when using certain debugging or hardening features/configs, such as KASAN, slub_debug=Z or the not-yet-upstream SLAB_CANARY. Use alloc_pages_node() instead which allows us to allocate page-aligned memory. Since fpregs_state is padded to a full page anyway, and this code is only relevant for x86 which has 4k pages, this approach should not allocate any unnecessary memory but still guarantee the needed alignment. Fixes: openzfs#9608 0: https://lwn.net/Articles/787740/ 1: https://lore.kernel.org/linux-block/[email protected]/ Signed-off-by: Fabian Grünbichler <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]>
fxsave and xsave require the target address to be 16-/64-byte aligned. kmalloc(_node) does not (yet) offer such fine-grained control over alignment[0,1], even though it does "the right thing" most of the time for power-of-2 sizes. unfortunately, alignment is completely off when using certain debugging or hardening features/configs, such as KASAN, slub_debug=Z or the not-yet-upstream SLAB_CANARY. Use alloc_pages_node() instead which allows us to allocate page-aligned memory. Since fpregs_state is padded to a full page anyway, and this code is only relevant for x86 which has 4k pages, this approach should not allocate any unnecessary memory but still guarantee the needed alignment. 0: https://lwn.net/Articles/787740/ 1: https://lore.kernel.org/linux-block/[email protected]/ Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Fabian Grünbichler <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #9608 Closes #9674
This issue is most likely related in some way to SIMD acceleration
System information
architecture supports EDAC, ECC RAM, Xeon processor
Describe the problem you're observing
The following scary error message, "WARNING: CPU: include/os/linux/kernel/linux/simd_x86.h:204 zvol_fini+0x2463e/0x5ef0b [zfs]", or different variants of it, are shown.
Since the upgrade to 5.3.11 (yesterday) this message appears; it didn't with 5.3.6 and a prior state of the master branch. All of my data and backups are on ZFS pools, and I'm hesitant and have been holding off on importing the pools due to concerns about data integrity.
(Data corruption has been mentioned in prior issue entries [before the reworked SIMD acceleration support was committed] when not switching to the scalar/non-accelerated state, with prime tests. I'm not certain it's related to this, but I wanted to make sure that data integrity/safety is ensured before progressing further, importing the pools, and continuing backups.)
Describe how to reproduce the problem
Upgraded from 5.3.6 to 5.3.11, and that error message appears as soon as the zfs module + dependencies are loaded (sys-fs/zfs & sys-fs/zfs-kmod, both in the 9999 variants of Gentoo main-tree portage ebuilds).
Include any warning/errors/backtraces from the system logs
sample
full log (valid 1 year):
https://pastebin.com/GEeFTPWY dmesg_zfs_entry_SYSCALL_64_after_hwframe_5.3.11