0.6.5 regression: zpool import -d /dev/disk/by-id hangs #3866
I don't get any kernel panic or out-of-memory errors (other than the ENOMEM shown by strace, which I believe is misleading as per the code). The process hangs hard (likely in D state) and I can't escape with Ctrl-C on the only console I have. I should probably run it in the background so I can troubleshoot further. When I press Ctrl-Alt-Del, the system reboots fine, which suggests that this is not a kernel panic or a full system hang. |
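For anyone trying to confirm the "likely in D state" guess once the command runs in the background, here is a minimal sketch that reads the state letter from `/proc/<pid>/stat`. The helper names are made up for illustration, and a typical initrd will not ship Python, so treat it as a description of what to look for rather than something to paste into the debug shell.

```python
def parse_proc_state(stat_line):
    """Return the one-letter process state from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the LAST ')' before reading the state.
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def read_state(pid):
    """Read the state of a live process (hypothetical helper, Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        return parse_proc_state(f.read())


# Sample lines, not captured from the affected system.
sample = "1234 (zpool) D 1 1234 1234 0 -1 4194560"
tricky = "99 (my (odd) name) S 1 99 99 0 -1 0"
print(parse_proc_state(sample), parse_proc_state(tricky))
```

A `D` here means uninterruptible sleep, which would explain why Ctrl-C has no effect.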
why can't I attach a text file here? |
I think I can vaguely remember an issue saying something about the import going into an endless loop or something. Should have been fixed in one of the point releases, so what version exactly are you using? And where do you get ZoL from - package or source? |
Sorry, I forgot to mention this. I upgraded from 0.6.4 to 0.6.5. I am on Gentoo, so it's built from source. This worked perfectly fine in 0.6.4. |
Ah. I don't know if @ryao used my init scripts in that (I've rewritten the init scripts from scratch because we had five versions, all different, which made them impossible to maintain). He has mentioned that he was going back to "some other" means, but I don't know the details. Is there no point release (such as |
This is early boot (initrd) trying to poke to see which pools are available for import. So, no init scripts are in the picture at this time. Pretty much:
Same thing yields profit in 0.6.4...:) |
0.6.5.2 just hit the portage. But do we know if it will fix this or not? |
@FransUrbo The scripts in the repository are in Gentoo at the moment, but I intend to merge #3800 by the end of the week. |
Since this is in the initrd, my init scripts are not at fault. Gentoo uses completely different code in its initrd, so it's not my initrd code either… Can't say. BUT |
@FransUrbo That fix was backported. Gentoo skipped from 0.6.5 + that fix to 0.6.5.2. |
@devsk Are you using genkernel? The genkernel zfs branch might work for you: https://gitweb.gentoo.org/proj/genkernel.git/log/?h=zfs It is designed to read the cachefile from the pool and import all pools using that. It solves reliability problems involving the cachefile and initramfs archives. It has not been merged yet because it is missing support for generating the scsi, usb and wwn symlinks in /dev/disk/by-id and the other /dev/disk symlinks. That should be resolved later this month. |
No, I am doing a manual 'debug' boot where nothing else happens other than the 3 things I mentioned above. So, it's not related to initrd packaging. The module version matches the loaded kernel, and the same version of the user space tools (zpool) is being used to import. |
@ryao: I do use genkernel, but I am not auto-importing any pools. I am doing a very simple modprobe zfs and zpool import (which is supposed to list pools and not import them) from the debug shell that genkernel drops you into with 'debug' on the kernel cmdline. |
I suggest asking for help in #zfsonlinux on freenode. It will be quicker than going back and forth in the issue. |
Chris at the mailing list had this analysis:
The ioctls are sort of listed in include/sys/fs/zfs.h. If I'm (Again if I'm understanding this correctly what strace represents as eg
If I'm reading the code right, I'm not convinced that ENOMEM means |
I upgraded to 0.6.5.2 and the issue is gone. Does anybody have any idea, from the information above, what might have happened and which fix actually resolved this for me? We can close this bug as a dup of that issue. |
Right, in this context it means that user space needs to pass a bigger buffer for the kernel to use. @devsk my best guess is that you were hitting #3652 / #3785. This regression manifested itself in quite a few different ways but was resolved by 5592404. I'm happy to close this as a duplicate of that issue. |
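The "ENOMEM means pass a bigger buffer" convention described above can be sketched as a retry loop. This is only an illustration under stated assumptions: `fake_kernel_call` is a made-up stand-in for the real ioctl, and the sizes and doubling policy are arbitrary, not what the zpool tools actually do.

```python
import errno


def fake_kernel_call(buf_size, needed=4096):
    """Hypothetical stand-in for an ioctl: ENOMEM until the buffer is big enough."""
    if buf_size < needed:
        raise OSError(errno.ENOMEM, "buffer too small")
    return b"x" * needed


def call_with_growing_buffer(call, initial_size=256, max_size=1 << 20):
    """Retry `call` with a doubled buffer size each time it reports ENOMEM."""
    size = initial_size
    while size <= max_size:
        try:
            return call(size), size
        except OSError as exc:
            if exc.errno != errno.ENOMEM:
                raise  # any other error is a real failure
            size *= 2
    # Without a cap, a call that never accepts any size would loop forever.
    raise MemoryError("gave up growing the buffer")


data, final_size = call_with_growing_buffer(fake_kernel_call)
print(final_size, len(data))
```

Seen this way, an ENOMEM line in strace is not by itself a sign that the machine ran out of memory; it can simply be one round of this handshake.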
At boot up, one of the scripts is doing 'zpool import -d /dev/disk/by-id' for listing all the pools available for import and it hangs there forever.
This worked fine in 0.6.4.
It does not matter whether -d is provided or not. A plain zpool import hangs as well.
I did a quick debug boot of the system (it drops me into a shell in the initrd) and the strace revealed that some ioctls are failing with ENOMEM. Since I can't log this to a file, I took a photo of the strace run. Have a look.
The hang is in ioctl(3, __IOC(0, 0x5a, 0x06, 0x00), the same call that fails with ENOMEM earlier in the strace. There is also ioctl(3, __IOC(0, 0x5a, 0x05, 0x00), 0x7fff4fd7e660) = -1, which fails with ENOENT.
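To make the `__IOC(dir, type, nr, size)` notation from strace easier to read, here is a small sketch that packs the four fields into the raw request number. It assumes the generic Linux ioctl bit layout (nr in bits 0-7, type in bits 8-15, size in bits 16-29, dir in bits 30-31); `decode_ioc` is a made-up helper name, not part of strace or ZFS.

```python
def decode_ioc(direction, ioc_type, nr, size):
    """Pack strace's __IOC(dir, type, nr, size) fields into a request number.

    Assumes the generic Linux layout: nr bits 0-7, type bits 8-15,
    size bits 16-29, dir bits 30-31.
    """
    request = (direction << 30) | (size << 16) | (ioc_type << 8) | nr
    return {"request": request, "magic": chr(ioc_type), "nr": nr}


hanging = decode_ioc(0, 0x5a, 0x06, 0x00)   # the call that hangs
enoent = decode_ioc(0, 0x5a, 0x05, 0x00)    # the call that returns ENOENT
print(hex(hanging["request"]), hanging["magic"], hanging["nr"])
print(hex(enoent["request"]), enoent["magic"], enoent["nr"])
```

The type byte 0x5a is ASCII 'Z', and the nr field is the position of the call in the ioctl list in include/sys/fs/zfs.h, counting from zero, so the two calls are entries 6 and 5 of that list.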