ZFS grinds to a halt on Asahi while generating machine id when using luks device #13431
Comments
It seems to be reporting a lot of errors from the LUKS device (error 5, which I first thought was a checksum error). Does it do that if you hand it a non-LUKS device? I am not running Asahi on my M1, but the AArch64 VM on it has been humming along perfectly fine for a bit. I'll give it a go with LUKS, but I suspect I may have trouble reproducing this inside a VM on there, no matter how HW assisted. Edit: Nah, 5 is just generic EIO.
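For anyone else decoding the number in those zio messages, the errno name can be looked up rather than guessed. A minimal Python sketch, assuming a Linux host where the C library's error numbers match the kernel's; `describe_errno` is a helper name made up for this example:

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Return 'NAME: message' for a numeric errno value."""
    # errno.errorcode maps numbers to symbolic names on the current platform.
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


print(describe_errno(5))
```

On Linux this reports EIO, the generic I/O error, which says nothing by itself about checksums.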
No. ZFS alone works just fine, and ZFS with native encryption works too. I'd think it was a dm-crypt issue, but I can put btrfs and ext4 on the LUKS device without any problem. The error messages also lead me to believe it's more likely a ZFS bug.
Here's a fun one: if you have a LUKS device on the machine and a zpool not on that LUKS device, do ZFS and/or the FS on the LUKS device throw errors if you do a bunch of IO to both at once? My completely wild guess would be that the SIMD enable/disable dance is not sufficiently saving/restoring state, so two kernel-land consumers are stepping on each other. Let me see if I can see anything obviously awry. Worst case, I just boot Asahi on my M1 and play with it.
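The "bunch of IO to both at once" test could be scripted roughly as below. This is only a sketch under assumptions: the two directories are whatever mount points you pick (one on the LUKS-backed filesystem, one on the zpool), the function names are invented for the example, and the read-back may be served from the page cache, so a clean run here does not prove the on-disk data is good. Kernel-side errors would still show up in the journal, or as an `OSError` raised from the write or `fsync`.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor


def write_and_verify(directory: str, files: int = 8, size: int = 1 << 20) -> int:
    """Write random files into `directory`, read them back, return mismatch count."""
    mismatches = 0
    for i in range(files):
        data = os.urandom(size)
        path = os.path.join(directory, f"stress_{i}.bin")
        with open(path, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # push the data down to the block layer
        with open(path, "rb") as f:
            read_back = f.read()
        if hashlib.sha256(read_back).digest() != hashlib.sha256(data).digest():
            mismatches += 1
        os.remove(path)
    return mismatches


def stress_both(dir_a: str, dir_b: str, rounds: int = 4,
                files: int = 8, size: int = 1 << 20) -> dict:
    """Run write_and_verify against both directories at the same time."""
    totals = {dir_a: 0, dir_b: 0}
    with ThreadPoolExecutor(max_workers=2) as pool:
        for _ in range(rounds):
            fut_a = pool.submit(write_and_verify, dir_a, files, size)
            fut_b = pool.submit(write_and_verify, dir_b, files, size)
            totals[dir_a] += fut_a.result()
            totals[dir_b] += fut_b.result()
    return totals


# Hypothetical usage, with mount points of your own choosing:
# print(stress_both("/mnt/luks-ext4", "/mnt/zpool-test"))
```

Threads are enough here because the file writes and `fsync` calls release the interpreter lock while they block in the kernel, so both filesystems see load simultaneously.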
Something tells me it's related to the M1 primitives for crypto. Can you run with software decryption? I'm assuming hardware support exists at all.
Even the generic AArch64 CPU that the acceleration interface in macOS provides on an M1 has a bunch of the normal ARM crypto extensions, so yes, I'd say that's a safe bet.
@jittygitty I am running the latest git kernel, 5.18. So I just use the grep/sed loop that's posted before running configure? I'll give it a shot. What other effects does setting the license to GPL have? Any downsides?
@jittygitty Assuming I did this correctly, swapping CDDL with GPL did not make a difference here. Is there a way to verify that acceleration is disabled? To be clear, are you suggesting that there may be some overlap, with ZFS trying to use acceleration "on top" of what LUKS has already done? Note: I am not trying to use ZFS native encryption on top of LUKS. Where else would ZFS be using acceleration then, checksumming? Sorry if I am misunderstanding.
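On the "is there a way to verify" question, one partial check is to look at which cipher implementations the kernel crypto API has registered, since that is what dm-crypt draws on. The sketch below parses the `/proc/crypto` text format (blank-line-separated blocks of `key : value` lines) and lists the drivers for one algorithm by priority. The function names are made up for this example, and `xts(aes)` is only an assumption about the cipher the LUKS volume uses. It says nothing about ZFS, which ships its own checksum and crypto code rather than going through this API.

```python
from __future__ import annotations


def parse_proc_crypto(text: str) -> list[dict]:
    """Split /proc/crypto-style text into one dict per registered algorithm."""
    entries = []
    for block in text.strip().split("\n\n"):
        entry = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                entry[key.strip()] = value.strip()
        if entry:
            entries.append(entry)
    return entries


def implementations(text: str, algorithm: str) -> list[tuple[str, int]]:
    """Return (driver, priority) pairs for `algorithm`, highest priority first."""
    found = [
        (e.get("driver", "?"), int(e.get("priority", "0")))
        for e in parse_proc_crypto(text)
        if e.get("name") == algorithm
    ]
    return sorted(found, key=lambda pair: pair[1], reverse=True)


# Hypothetical usage on the affected machine:
# with open("/proc/crypto") as f:
#     print(implementations(f.read(), "xts(aes)"))
```

The kernel generally prefers the highest-priority driver for a given algorithm name, so the first entry is the one most likely in use; a driver name that mentions hardware extensions at the top of the list would suggest acceleration is still active on the dm-crypt side.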
@derzahla In your case, since you tried building as GPL and it made no difference, it likely isn't related. Or the code doesn't check for kernel acceleration paths on ARM, or the Linux kernel doesn't make good use of the available ARM hardware acceleration? (Building the module as GPL should not disable acceleration; it should technically give you the best path to acceleration without the "workarounds" forced on us, which are explained in the Ars Technica article below. And yes, as far as I know LUKS could definitely be SLOW if it's not using acceleration for its encryption. ZFS, even without native encryption on top, might still use some acceleration for compression, checksums, or other things; I'm no expert as to where or how. Anyway, I just noticed #13431 (comment), so it seems he was thinking along similar lines, initially at least.) But maybe read the article below to get an idea of how the "workarounds" implemented to avoid certain "GPL-only" exports (EXPORT_SYMBOL_GPL) can cause various problems: "Removing access to that symbol therefore requires module developers to reinvent their own state-preservation code individually. This increases the likelihood of catastrophic error within the kernel itself, since improperly restored state could cause a later kernel operation to crash." https://arstechnica.com/gadgets/2020/01/linus-torvalds-zfs-statements-arent-right-heres-the-straight-dope/ In your case you may want to take a look at: In fact, read @rincebrain's own comments in that thread, since they may be helpful to you.
@rincebrain Do you know whether building as GPL, as Brian mentioned in 11357, removes "all" workarounds and any out-of-kernel system state preservation/restoration, etc.? Or is there still some ZFS state management?
I don't actually think that mattered on AArch64. I think it was only on x86 that they moved the FPU save/restore calls outside of the visible symbol set.
aarch64's issue with GPL symbols was in preemptible kernels (#8545). And @jittygitty, no offense intended, but this extra noise you're putting everywhere about things you don't fully understand is quite irritating. You ask about Btrfs and other simple issues without even doing a basic level of research.
@zfsbot Initially I had a similar thought to @rincebrain, whose comment I linked. After the response from @derzahla I concluded, and told him, that it likely wasn't some strange "state" issue. So I don't think our initial thoughts were completely impossible. Of course I admit I'm no 'expert' on FPU, RCU, etc., but I did think the kernel may have been built as preempt. Apologies if I somehow missed that this was definitely not preempt. (I thought it was quite common these days even on desktop installations, for low-latency audio etc.) (Sorry, I can't recall which too-simple btrfs question I asked. Where?)
System information
Distribution: Arch
Kernel version: 5.17.0-rc7-asahi-next-20220310-5-2-ARCH
Describe the problem you're observing
ZFS grinds to a halt while attempting to install onto it when the zpool is on a LUKS device.
Describe how to reproduce the problem
Include any warning/errors/backtraces from the system logs
^ pacstrap hangs forever on initializing machine id
journald is murdered by zio logs at thousands of lines per second. Small sample below.
This is what happens:
I get similar results if I attempt to rsync a large number of files rather than using pacstrap.