AVX2 not available for RAIDZ or Fletcher algorithms on Ubuntu 22.04 #15223

Open
koelmel opened this issue Aug 30, 2023 · 14 comments
Labels
Type: Defect Incorrect behavior (e.g. crash, hang) Type: Performance Performance improvement or performance problem

Comments

@koelmel

koelmel commented Aug 30, 2023

System information

Type | Version/Name
Ubuntu | 22.04 LTS
Distribution Name | Ubuntu
Distribution Version | 22.04
Kernel Version | 5.15.0-82-generic and 6.2.0-31-generic
Architecture | x86
OpenZFS Version | 2.1.12-1 (self compiled) and 2.1.9-2ubuntu1.1

Describe the problem you're observing

After booting the new Ubuntu kernel 5.15.0-82-generic, I noticed that AVX2 is no longer available for the RAIDZ or Fletcher algorithms. I see this on two systems: a dedicated AMD Epyc Zen3 system (which also received the updated amd64-microcode package, version 3.20191218.1ubuntu2.2, raising the microcode version from 0xa001173 to 0xa0011d1) and a VM hosted on an AMD Epyc Zen3 system (an openSUSE 15.4 system whose kernel and microcode package were not updated).
Because AVX2 is not detected, the fastest available implementation is now "ssse3".
Because there was also a microcode patch for Zen3 systems, I booted the previous kernel 5.15.0-79-generic on the dedicated AMD Epyc Zen3 system with the updated microcode package still installed. With that kernel, AVX2 is available again.

Describe how to reproduce the problem

# Boot Ubuntu kernel 5.15.0-82-generic on an AMD Epyc Zen3 system (e.g. an AMD EPYC 7443P CPU, or a VM hosted on such a system)
cat /sys/module/zfs/parameters/zfs_vdev_raidz_impl
#output is: "cycle [fastest] original scalar sse2 ssse3" instead of expected "cycle [fastest] original scalar sse2 ssse3 avx2"
cat /sys/module/zcommon/parameters/zfs_fletcher_4_impl
#output is: "[fastest] scalar superscalar superscalar4 sse2 ssse3" instead of expected "[fastest] scalar superscalar superscalar4 sse2 ssse3 avx2"
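
The check above can be automated. The following is a minimal sketch, assuming only the output format shown in the two `cat` commands (space-separated names, with the active selection in square brackets). The helper names `parse_impl_list` and `missing_impls` are hypothetical and not part of OpenZFS.

```python
def parse_impl_list(text):
    """Split a ZFS *_impl parameter value into (selected, available).

    The value is a space-separated list of implementation names; the
    currently selected one is wrapped in square brackets.
    """
    selected = None
    available = []
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            selected = token
        available.append(token)
    return selected, available


def missing_impls(text, expected=("avx2",)):
    """Return the expected implementation names that are not listed."""
    _, available = parse_impl_list(text)
    return [name for name in expected if name not in available]


# Strings copied from the report above.
broken = "cycle [fastest] original scalar sse2 ssse3"
working = "cycle [fastest] original scalar sse2 ssse3 avx2"
print(missing_impls(broken))   # ['avx2']
print(missing_impls(working))  # []
```

On a live system, the input would be the content of /sys/module/zfs/parameters/zfs_vdev_raidz_impl or /sys/module/zcommon/parameters/zfs_fletcher_4_impl.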

Include any warning/errors/backtraces from the system logs

@koelmel koelmel added the Type: Defect Incorrect behavior (e.g. crash, hang) label Aug 30, 2023
@koelmel
Author

koelmel commented Aug 30, 2023

The new Ubuntu 22.04 LTS HWE kernel 6.2.0-31-generic with the official Ubuntu ZFS version 2.1.9-2ubuntu1.1 shows the problem as well.
Other kernel modules (such as raid6) still use AVX2, and the CPU flags still include avx and avx2.

@rom4nik

rom4nik commented Sep 3, 2023

I think I'm seeing the same (or a very similar) issue on Debian 11 and 12 using zfs-dkms packages from the contrib repos. What's interesting to me is that the current and previous (2.1.12, 2.1.11) zfs-dkms AUR packages on Arch Linux work well and list avx2 in the raidz/fletcher4 impls.

On Debian I'm seeing very high (near 100%) multicore CPU usage during reads (writes too) from an encrypted dataset on RAID-Z2 pool. htop with hiding kernel threads disabled shows multiple z_rd_int_0 rows reaching 100% each. perf top produces results like below:

Overhead  Shared Object                                                   Symbol
  16.73%  [kernel]                                                        [k] gcm_pclmulqdq_mul
  11.64%  [kernel]                                                        [k] kfpu_end
   4.46%  [kernel]                                                        [k] kfpu_begin

/sys/module/zfs/parameters/zfs_vdev_raidz_impl and /sys/module/zcommon/parameters/zfs_fletcher_4_impl don't list avx2 as available, same in /proc/spl/kstat/zfs/{vdev_raidz,fletcher_4}_bench.

CPUs: Ryzen 5600G (baremetal), 5700X (baremetal) and i7-8650U (in VMs using host-passthrough as CPU model). In all cases /proc/cpuinfo contains avx2 and aes.
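
To compare what the kernel reports per CPU against what ZFS lists, the flags can be pulled out of /proc/cpuinfo. This is a rough sketch, assuming the usual x86 layout of that file (`processor` and `flags` keys separated from their values by a colon). The function name `cpu_flags` and the sample text are made up for illustration.

```python
def cpu_flags(cpuinfo_text):
    """Return {processor_number: set_of_flags} from /proc/cpuinfo-style text."""
    flags = {}
    current = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator line between processors
        key = key.strip()
        if key == "processor":
            current = int(value)
        elif key == "flags" and current is not None:
            flags[current] = set(value.split())
    return flags


# Hypothetical, heavily shortened sample; real flag lists are much longer.
sample = (
    "processor\t: 0\n"
    "flags\t\t: fpu sse2 ssse3 avx avx2 aes\n"
    "\n"
    "processor\t: 1\n"
    "flags\t\t: fpu sse2 ssse3 avx avx2 aes\n"
)
for cpu, cpu_flag_set in sorted(cpu_flags(sample).items()):
    print(cpu, {"avx2", "aes"} <= cpu_flag_set)

# On a live system:
# with open("/proc/cpuinfo") as fh:
#     per_cpu = cpu_flags(fh.read())
```

If avx2 shows up here for every CPU but is absent from the ZFS impl lists, the CPU itself is not the limiting factor.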

Distros tested:

  • Debian 11:
    • kernel 5.10.191-1 (5.10.0-25-amd64)
    • zfs-kmod-2.0.3-9+deb11u1
  • Debian 12:
    • kernel 6.1.38-4 (6.1.0-11-amd64)
    • zfs-kmod-2.1.11-1
  • Arch:
    • kernel 6.1.51-1-lts
    • zfs-kmod-2.1.12-1 / zfs-kmod-2.1.11-1

(maybe relevant for troubleshooting ideas: #9215)

@rincebrain
Contributor

My suspicion, based on another report someone gave me once, was that on some systems, it wasn't correctly detecting certain newer architecture features in the compile-time checks, and so compiling them out entirely, leading to FPU functions that are using much less efficient implementations.

I'm kind of tempted to either refactor the existing Linux kfpu_begin/end to include which things it thinks are supported or expose it in /proc or something to make it easier to catch that, assuming of course it is the issue at hand...

I'll be at home after tomorrow and in a position to test these theories.

@rincebrain
Contributor

rincebrain commented Sep 5, 2023

The problem appears to be that boot_cpu_has(X86_FEATURE_OSXSAVE) is returning 0, and the avx checks are the ones that depend on that indirectly with __ymm_enabled...

e: I suspect that this is the issue that we're seeing, so when that lands, it should go away. But that doesn't help people now, now does it...

e2: I think the patch in the above link will fix it once it lands, but if we want this to work in the interim, I don't see a good option other than parsing the feature bits ourselves, or doing what they do: unconditionally make the check pass, and assume that everyone who wants to check it also has further checks that would break if it wasn't actually true.

So something like

#if 0
/**
 * We can't have nice things on Linux.
 * See #15223 for why we can't use this.
 */
#if defined(X86_FEATURE_OSXSAVE)
        has_osxsave = !!this_cpu_has(X86_FEATURE_OSXSAVE);
#else
        has_osxsave = B_FALSE;
#endif
        if (!has_osxsave) {
                return (B_FALSE);
        }
#endif

in __simd_state_enabled instead of the current contents around OSXSAVE checking.

I don't think we can use this_cpu_has because that still fails on cpu0...
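
The dependency described in this comment can be illustrated with a small model. This is a simplified userspace sketch, not the OpenZFS source: the function names are hypothetical and only loosely mirror `__simd_state_enabled` and `__ymm_enabled`, and the inputs that the kernel would obtain from the CPU are passed in as plain arguments. The XCR0 bit positions (bit 1 for SSE state, bit 2 for AVX state) are the ones defined by the x86 architecture.

```python
XSTATE_SSE = 0x2  # XCR0 bit 1: SSE (XMM) register state
XSTATE_AVX = 0x4  # XCR0 bit 2: AVX (upper YMM) register state
XSTATE_SSE_AVX = XSTATE_SSE | XSTATE_AVX


def simd_state_enabled(has_osxsave, xcr0, state):
    """Model of the gate: without OSXSAVE, XCR0 is never consulted."""
    if not has_osxsave:
        return False
    return (xcr0 & state) == state


def avx2_available(cpu_has_avx2, has_osxsave, xcr0):
    """AVX2 is only offered if the CPU has it AND YMM state is enabled."""
    return cpu_has_avx2 and simd_state_enabled(has_osxsave, xcr0, XSTATE_SSE_AVX)


# Healthy system: OSXSAVE reported, OS enabled x87+SSE+AVX state (XCR0 = 0x7).
print(avx2_available(True, True, 0x7))   # True
# The regression discussed here: the OSXSAVE feature check returns 0,
# so AVX2 is rejected even though the CPU and XCR0 are fine.
print(avx2_available(True, False, 0x7))  # False
```

The second call shows why ssse3 remains available while avx2 disappears: in this model only the AVX-family checks pass through the OSXSAVE gate.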

@koelmel
Author

koelmel commented Sep 6, 2023

With an Ubuntu 22.04 LTS test VM and a CPU passthrough configuration, I have seen the problem only on host systems with AMD CPUs (Zen3), not with an Intel CPU (checked with a Haswell E5-1630 v3).
But @rom4nik seems to have the problem on an Intel i7-8650U as well.
It's good that the problem will hopefully be solved in the "midterm".
It might be interesting to determine what changed in the Ubuntu kernel (from 5.15.0-79-generic to 5.15.0-82-generic) to break the existing ZFS AVX check.

@rincebrain
Contributor

I literally already linked the patch discussing the bug and the previous patch breaking it.

@lsylipei

lsylipei commented Sep 7, 2023

This also happens on my system. I'm running Ubuntu 22.04 with the 5.15.0-83-generic kernel. ZFS is 2.1.5, and my CPU is a Xeon Gold 6154.
There is no avx2 for fletcher and raidz. I also don't have zfs_fletcher_4_impl in /sys/module/zfs/parameters.

@rom4nik

rom4nik commented Sep 13, 2023

It seems that on kernel 6.1.52-1 (6.1.0-12-amd64 on Debian 12) AVX2 works again, checked on 5600G and i7-8650U.

The patch mentioned earlier has landed in stable tree at 6.1.50: https://cdn.kernel.org/pub/linux/kernel/v6.x/ChangeLog-6.1.50

@rincebrain
Contributor

FWIW, Ubuntu still hasn't pulled torvalds/linux@2c66ca3, though they pulled torvalds/linux@b81fac906a8f in 6.2.0-30.30.

@Fabian-Gruenbichler
Contributor

FWIW, the next version of Proxmox kernels (6.2.16-14) will contain the cherry-picked fix (already confirmed to fix the regression, but currently still in internal testing):

https://git.proxmox.com/?p=pve-kernel.git;a=commit;h=9ba0dde971e6153a12f94e9c7a7337355ab3d0ed

also already reported on the Ubuntu side, so should be fixed there at some point in the near future as well: https://bugs.launchpad.net/bugs/2034745

@rincebrain rincebrain added the Type: Performance Performance improvement or performance problem label Sep 19, 2023
@lowjoel

lowjoel commented Sep 27, 2023

(Un)interestingly, this actually causes owners of CPUs with AVX2 to run into #10846.

In my case, encryption+sha512 checksums+raidz2: I had a workload where a VM was downloading a Steam game, and I saw all txg syncing grind to a halt. perf top shows that there's a lot of time spent in gcm_pclmulqdq_mul and mutex_spin_on_owner. Some VMs weren't happy that all I/O started timing out and promptly crashed 😢

@lsylipei

lsylipei commented Nov 1, 2023

The problem is fixed with the release of kernel 5.15.0-88.98 on Ubuntu 22.04.

@Alyssumi

> It seems that on kernel 6.1.52-1 (6.1.0-12-amd64 on Debian 12) AVX2 works again, checked on 5600G and i7-8650U.
>
> The patch mentioned earlier has landed in stable tree at 6.1.50: https://cdn.kernel.org/pub/linux/kernel/v6.x/ChangeLog-6.1.50

I just upgraded to Debian 12, but I'm still getting the same. My CPU is G4400. Anything I'm missing?

❯ uname -r
6.1.0-23-amd64
❯ cat -p /sys/module/zfs/parameters/zfs_fletcher_4_impl
[fastest] scalar superscalar superscalar4 sse2 ssse3
❯ cat -p /sys/module/zfs/parameters/zfs_vdev_raidz_impl
cycle [fastest] original scalar sse2 ssse3

@rom4nik

rom4nik commented Aug 16, 2024

> G4400

Intel ARK doesn't mention AVX2 support for this CPU: https://ark.intel.com/content/www/us/en/ark/compare.html?productIds=88179,124968
