zfs-import script does not honor the cachefile #3777
Comments
I'm not exactly sure I understand what you're saying… Are you saying that it imports the pool TWICE?!
@ryao I think you meant to say "A pool should not be auto-imported unless it was imported previously and that fact has been persisted in the cachefile". At least, that is what your justification seems to point to. However, this would prevent auto-importing pools which are not configured to use a cachefile, or which are configured to use a non-standard cachefile location. Since the cachefile is on its way to obsolescence, this does not seem like a very good idea. Perhaps the problem that the other Gentoo user experienced could be addressed differently. FWIW I use a similar setup (root, i.e. /, in one pool and two directories under root in two other pools, all auto-imported on boot) and it works for me without issues under ArchLinux. It worked when I was using a cachefile (and saved it to the kernel RAM image) and it still works now that I have set cachefile to none for all pools (and removed it from the kernel RAM image).
@FransUrbo The import/export is not like mounting or RAID assembly. An import says "this pool belongs to this system, so do not touch it", while an export says "this pool no longer belongs to this system and is fair game". The way that the import persists across reboots is by something called a verbatim import, which reads the details of the imported pools from the cachefile. The fact that an import is being done explicitly in zfs-import makes using accurate terminology here difficult. What should be done at boot is a verbatim import from the cachefile, rather than a normal import. A normal import should never be done by automated scripts.

@Bronek People have talked about getting rid of the cachefile in the past, but it is required for things like multipath to be handled sanely, it accelerates the boot process, and it has no alternative that offers similar semantics. I used to think that the cachefile should be eliminated, but I found the rationale for eliminating it to be invalid upon actually trying to make that work. Rather than trying to get rid of the cachefile and causing regressions, we should be modifying our systems to better integrate with it. For instance, initramfs archives could read the cachefile from the bootfs by doing a read-only forced import, mounting the bootfs, grabbing the cachefile, unmounting, exporting, and then doing the verbatim import from the cachefile. This would eliminate the need to have the cachefile in the initramfs and make things behave like they should. initramfs generators such as Gentoo's genkernel do not currently do this, but they should in the future. As for why things work fine on Arch Linux: it uses systemd, which is properly using the cachefile: https://github.com/zfsonlinux/zfs/blob/master/etc/systemd/system/zfs-import-cache.service.in
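As a concrete illustration of the bootstrap sequence just described, here is a minimal sketch. This is not the actual genkernel code; the pool name, boot dataset, and error handling are assumptions:

```sh
#!/bin/sh
# Hypothetical initramfs fragment: recover the cachefile from the boot
# pool, then do the verbatim import from it.
ROOT_POOL="rpool"              # assumed pool name
BOOTFS="rpool/ROOT/gentoo"     # assumed boot dataset
CACHE="/etc/zfs/zpool.cache"

# 1. Forced, read-only import without mounting any datasets (-N).
zpool import -f -N -o readonly=on "${ROOT_POOL}" || exit 1

# 2. Mount the boot dataset read-only and grab its cachefile.
mount -o ro -t zfs "${BOOTFS}" /mnt || exit 1
cp "/mnt${CACHE}" "${CACHE}"
umount /mnt

# 3. Export, then verbatim-import everything the cachefile lists.
zpool export "${ROOT_POOL}"
zpool import -c "${CACHE}" -aN
```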
If the first cache file is accurate enough to import the pool, there's no NEED to get a second cache file and import it again. If the import was successful the first time, there's no need to do it again.
@ryao I do not have a cachefile, and yet all my pools are mounted automatically on boot:
I agree with @ryao. This is of particular importance when you consider a SAN environment, where a pool may be visible to multiple hosts (this is very common in virtualization environments, and for certain high-availability setups). You do not necessarily want the first system to boot to grab all of the visible pools. Hosts should consult the cachefile to determine which pools should be imported at boot. This is the way Solaris works, and Linux should follow.
It makes sense. I was thinking about using hostid (you could mount read-only, compare the hostid stored in the pool, and then remount read-write if it's yours), but then it occurred to me that I'd rather get rid of hostid and keep the cachefile than the other way around.
If all you want to do is ignore ("not import") certain pools, use ZFS_POOL_EXCEPTIONS.
I still have no idea what this ticket is about! Nothing in the comments by @ryao makes any sense…!
Everything @ryao has said makes sense, and he's right. I can see how it doesn't make sense when your plan (#3526) involves getting rid of the cachefile completely. I wish I had seen that earlier, because it is, in my opinion, a very bad idea. Let me explain why: the cache file as a concept has existed since the beginning of ZFS. From the manpage, zpool(1M), we learn that:

"All pools in this cache are automatically imported when the system boots."
The contents of the cachefile are effectively controlled by which pools you import and export. When you import a pool, its details are saved in the cache and the system will reimport the pool every time the system boots, until you export it. There is never a need to export a pool until you are ready to import it on another system. It was simple, and beautiful, and all of your previously imported pools could be imported with one command:
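The command itself was elided from the comment; judging from the systemd unit linked earlier in the thread, it is presumably the cachefile-driven import, along these lines:

```sh
# Verbatim-import every pool recorded in the default cache file,
# without mounting datasets (-N)
zpool import -c /etc/zfs/zpool.cache -aN
```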
This is compared to your new code which takes:
185 lines. And your code is incorrect. It tries to import all available pools. You say pools can be ignored by setting ZFS_POOL_EXCEPTIONS. For system administrators: are they supposed to update that variable every time a new pool appears on their SAN? You say the variable can be modified to accept wildcards, but that just gives you the same situation in reverse. Now I have to ask: is the user supposed to update the variable every time they plug in a USB zpool that they want mounted at boot? Is a system administrator supposed to update that variable every time they attach a new pool to their SAN that they want automatically imported at boot? These are exactly the sort of challenges ZFS was designed to eliminate. It's the same reason we don't have to update /etc/fstab. I say to you: we already had a solution to this problem, a cache file. Do not eliminate it. And let's fix the zfs-import script.
This change modifies the import service to use the default cache file to reimport pools at boot. This fixes code that exhaustively searched the entire system and imported all visible pools. Using the cache file is in keeping with the way ZFS has always worked, and is how it is written in the man page (zpool(1M,8)):

All pools in this cache are automatically imported when the system boots.

Importantly, the cache contains important information for importing multipath devices, and helps control which pools get imported in more dynamic environments like SANs, which may have thousands of visible and constantly changing pools, which the ZFS_POOL_EXCEPTIONS variable is not equipped to handle.

The change also stops the service from exporting pools at shutdown. Exporting pools is only meant to be performed by the administrator of the system.

Closes openzfs#3777
Closes openzfs#3526
As I said in #3800 (comment), the correct fix is to set:
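(The exact snippet was elided from the comment; given the ZPOOL_CACHE option introduced in the commit below, it is presumably along these lines:)

```sh
# Presumed setting: force the init script to use the cache file,
# and only the cache file.
ZPOOL_CACHE="/etc/zfs/zpool.cache"
```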
in the defaults config file (/etc/default/zfs). That setup will enforce the use of the cache file, and ONLY use the cache file (i.e., try once with that ONLY).
* Move code from the SYSV init and initramfs scripts that is better served in a common code base library (zfs-functions).
* Dracut has the function emergency_shell(), which does a little more. For those that don't have it, just run '/bin/sh -i -l'.
* zfs_set_ifs() needs to be a global function. Use this in zfs_run_cmd() before we run the actual command.

Fixing some issues discovered:

* Use "${VAR}" instead of "$VAR".
* Add some comments to functions that didn't have any.
* Create zfs_run_cmd(), which records stderr, exit code etc.
* Create check_boolean() to check if a user config option is set ('yes', 'Yes', 'YES' or any combination thereof) OR '1'. Anything else is considered 'unset'. (See the sketch after this list.)
* Put the check for zfs debugging in check_zfs_debug(). Use this in zfs_action() to check if we should output the message we're called with.
* Move the code snippet that fetches and mounts recursive filesystems into the new recursive_mount_filesystems() function.
* Minor fix/change to read_mtab():
  * Strip control characters (space - \040) from /proc/mounts GLOBALLY, not just the first occurrence.
  * Don't replace unprintable characters ([/-. ]) with underscores for use in the variable name. No need, just remove them altogether.
* Instead of using ZFS_CMD+zfs_action()+zfs_log_failure_msg() etc. each and every time, use the more generic zfs_action() that does most of this for us.
  * Downside: won't accept pools with spaces in them any more :(. Need to find a way to solve this (can't use arrays, because we're not running under bash!).
* Apparently "var > 0" doesn't work. Use 'var -gt 0' instead. But that means that 'var' needs to be set.
* (Re)set IFS in zfs_action() as the first step so that command line options don't get clobbered.
* Fix checking if a pool is imported in import_pool().
* Remove "sort". Sometimes not available, and not really necessary.
* Add a ZPOOL_CACHE option (commented out) and documentation in the defaults file to indicate that the default in zfs-functions can be overwritten.
  * Needed when/if enforcing the use of a cache file is needed/wanted.
* Remove some 'local' variables in do_import().

Signed-off-by: Turbo Fredriksson <[email protected]>
Closes openzfs#3782
Closes openzfs#3777
Closes zfsonlinux/pkg-zfs#166
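A minimal sketch of the check_boolean() helper described above, under the stated rules ('yes' in any capitalization, or '1', counts as set; anything else is 'unset'). The actual implementation in zfs-functions may differ:

```sh
# Return 0 if the given config value means "enabled", 1 otherwise.
check_boolean()
{
    local value="$1"
    # Lowercase the value so 'yes', 'Yes', 'YES', 'yEs', ... all match.
    case "$(echo "$value" | tr '[:upper:]' '[:lower:]')" in
        yes|1) return 0 ;;  # option is set
        *)     return 1 ;;  # anything else is considered unset
    esac
}

# Usage:
#   check_boolean "$ZFS_DEBUG" && echo "debugging enabled"
```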
* Use "${VAR}" instead of "$VAR". * Create zfs_run_cmd() which records stderr, exit code etc. * Create check_boolean() to check if a user config option is set ('yes', 'Yes', 'YES' or any combination there of) OR '1'. Anything else is considered 'unset'. * Put the check for zfs debugging in check_zfs_debug(). Use this in zfs_action() to check if we should output the message we're called with. * Move code snippet that fetches and mounts recursive filesystems into the new recursive_mount_filesystems() function. * Minor fix/change to read_mtab(): * Strip control characters (space - \040) from /proc/mounts GLOBALY, not just first occurrence. * Don't replace unprintable characters ([/-. ]) for use in the variable name with underscore. No need, just remove them all together. * Instead of using ZFS_CMD+zfs_action()+zfs_log_failure_msg() etc each and every time, use the more generic zfs_action() that does most of this for us. * Downside: Won't accept pools with spaces in them any more :(. Need to find a way to solve this (can't use arrays, because we're not running under bash!). * Apparently "var > 0" doesn't work. Use 'var -gt 0' instead. But that means that 'var' needs to be set. * (Re)set IFS in zfs_action() as first step so that command line options doesn't get clobbered. * Fix checking if pool is imported in import_pool(). * Remove "sort". Sometimes not available, and not really nessesary. * Add ZPOOL_CACHE option (commented out) and documentation in the defaults file to indicate that the default in zfs-functions can be overwritten. * Needed when/if enforcing the use of a cache file is needed/wanted. * Remove some 'local' variables in do_import(). Signed-off-by: Turbo Fredriksson <[email protected]> Closes openzfs#3782 Closes openzfs#3777 Closes zfsonlinux/pkg-zfs#166
And as I said in #3800 (comment), @FransUrbo's suggested fix doesn't work because:
The following patch from the genkernel zfs branch makes genkernel read the cachefile from the pool and verbatim-import pools from it. It will be merged to genkernel HEAD after review has been completed. For systems where an initramfs is not used to load the pool, the boot scripts should just load from the cachefile. That should be a fairly simple matter, since they already have access to the cachefile. I am fairly confident that the cachefile issues that people have had are caused by the initramfs having its own copy of the cachefile. This is analogous to what the current systemd unit files do.
I don't see the point of doing it this way! If the root pool could be imported read-only, why bother with all this?
Also, it doesn't take failure into account - there's no check for a failed import!
Oh, and the third problem is that if the read-only import succeeded but there is no cache file in there (because the user has elected not to have one), everything fails… even though the pool could be imported!
@ryao I like this approach.
... the pool might be unusable for read-write, e.g. because it is on a SAN and is being accessed by a different node at this time. I can think of other reasons. This does not prevent reading one file in a well-known location, though.
good point
Also a good point, but this probably calls for a different boot switch (see the edit below). There are basically two approaches, and I like the one presented by @ryao better than what ZOL is doing now, i.e. trying to import all available pools (including those in ZVOLs).

EDIT: @ryao's solution requires "root=ZFS=$DATASET" to be provided at boot. I like the way it's explicit: you know what to expect just by glancing at the kernel options, and it happens to fit my expectation of "how ZFS should work" as a lowly user (I managed some FC SANs in the past, if that matters). Having said that, I can imagine a whole class of users who might want ZFS to automagically import all available pools. It seems possible (i.e. intuitive to understand) to perform auto-import when there is no dataset name following root=ZFS. However, when there is one, I would really prefer that auto-import did not take place.
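For illustration, the explicit variant would put something like this on the kernel command line (pool and dataset names hypothetical):

```sh
# Bootloader entry (illustrative): boot from the named dataset only,
# rather than auto-importing every visible pool
linux /vmlinuz-linux root=ZFS=rpool/ROOT/arch rw
```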
If you don't want any pools to auto-import at boot (other than the root pool), just don't enable the zfs-import init script. If you want a subset of all available pools to import and you don't want to (or can't) set up exceptions, ok, that is a problem. But there's a much better way to do this than to rewrite the whole shebang (which works for the large majority as it is). BTW, most (all?) distributions (not Gentoo??) have always required a separate, non-ZFS /boot.
There you go. My /boot is ext2; my root and two more pools are ZFS; and, unfortunately, at startup time I often have other ZFS pools attached which I am not comfortable importing (e.g. because they are in ZVOLs, or because I left USB storage attached when starting - they tend to be random). It might be an unusual scenario (but I think, perhaps naively, that it's not), and one which, apparently, the cachefile is designed to serve best. I am not saying that other use cases are not better served by auto-import. FWIW my distribution is ArchLinux, and the maintainer of the ZFS modules there is not very active on this list, so I will probably have to massage the scripts myself. For this I really, really need to understand what I'm doing.
* Use "${VAR}" instead of "$VAR". * Create zfs_run_cmd() which records stderr, exit code etc. * Create check_boolean() to check if a user config option is set ('yes', 'Yes', 'YES' or any combination there of) OR '1'. Anything else is considered 'unset'. * Put the check for zfs debugging in check_zfs_debug(). Use this in zfs_action() to check if we should output the message we're called with. * Move code snippet that fetches and mounts recursive filesystems into the new recursive_mount_filesystems() function. * Minor fix/change to read_mtab(): * Strip control characters (space - \040) from /proc/mounts GLOBALY, not just first occurrence. * Don't replace unprintable characters ([/-. ]) for use in the variable name with underscore. No need, just remove them all together. * Instead of using ZFS_CMD+zfs_action()+zfs_log_failure_msg() etc each and every time, use the more generic zfs_action() that does most of this for us. * Downside: Won't accept pools with spaces in them any more :(. Need to find a way to solve this (can't use arrays, because we're not running under bash!). * Apparently "var > 0" doesn't work. Use 'var -gt 0' instead. But that means that 'var' needs to be set. * (Re)set IFS in zfs_action() as first step so that command line options doesn't get clobbered. * Fix checking if pool is imported in import_pool(). * Remove "sort". Sometimes not available, and not really nessesary. * Add ZPOOL_CACHE option (commented out) and documentation in the defaults file to indicate that the default in zfs-functions can be overwritten. * Needed when/if enforcing the use of a cache file is needed/wanted. * Remove some 'local' variables in do_import(). * Add a ZFS_POOL_IMPORT to the defaults config. This is a semi colon separated list of pools to import ONLY. Signed-off-by: Turbo Fredriksson <[email protected]> Closes openzfs#3782 Closes openzfs#3777 Closes zfsonlinux/pkg-zfs#166
If there's only a few, known pools (which it sounds like) that you don't want imported, use ZFS_POOL_EXCEPTIONS. In #3559, I've now added a ZFS_POOL_IMPORT option - a semicolon-separated list of pools to import ONLY.
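For illustration, the two knobs would sit in /etc/default/zfs roughly like this. Pool names are hypothetical; the semicolon-separated format for ZFS_POOL_IMPORT is from the commit above, and the same separator is assumed for ZFS_POOL_EXCEPTIONS:

```sh
# /etc/default/zfs (illustrative)
ZFS_POOL_EXCEPTIONS="usbpool;scratch"   # pools to skip at boot
ZFS_POOL_IMPORT="rpool;tank"            # or: import ONLY these pools
```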
@FransUrbo The check for a failed import will be added. Thanks for pointing it out. As for editing configuration files, it should not be necessary. The zpool import command should make all the changes needed to ensure a persistently imported pool. That is how it works on Illumos, FreeBSD and systemd systems. Gentoo is going back to this soon. It should not work any other way by default.
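A minimal sketch of such a failure check in the init script, assuming the ZPOOL_CACHE default discussed above (the eventual fix may differ):

```sh
# Verbatim import from the cache file; bail out loudly on failure
# instead of silently continuing the boot.
ZPOOL_CACHE="${ZPOOL_CACHE:-/etc/zfs/zpool.cache}"
if ! zpool import -c "${ZPOOL_CACHE}" -aN; then
    echo "zfs-import: import from ${ZPOOL_CACHE} failed" >&2
    exit 1
fi
```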
@FransUrbo There is code in genkernel that will allow people to drop to a shell and execute a manual import of the pool if a failure occurs from the pool not being imported. It had not been intended to be used this way, but now that we are switching to reading the cachefile from the system, it will be. I am looking into including instructions for handling the root pool missing from the cachefile, or the cachefile being missing entirely, as that is a real concern. In particular, telling the user to execute…
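The exact instructions were elided above, but a plausible recovery sequence from such a shell, assuming a root pool named rpool, would be a forced manual import followed by regenerating the cachefile entry:

```sh
# From the emergency shell: import the root pool by hand (-N: do not
# mount), then have ZFS rewrite the cachefile entry for it.
zpool import -f -N rpool
zpool set cachefile=/etc/zfs/zpool.cache rpool
```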
This change modifies the import service to use the default cache file to perform a verbatim import of pools at boot. This fixes code that searched all devices and imported all visible pools. Using the cache file is in keeping with the way ZFS has always worked, how Solaris, Illumos, FreeBSD, and systemd perform imports, and is how it is written in the man page (zpool(1M,8)):

All pools in this cache are automatically imported when the system boots.

Importantly, the cache contains important information for importing multipath devices, and helps control which pools get imported in more dynamic environments like SANs, which may have thousands of visible and constantly changing pools, which the ZFS_POOL_EXCEPTIONS variable is not equipped to handle. Verbatim imports prevent rogue pools from being automatically imported and mounted where they shouldn't be.

The change also stops the service from exporting pools at shutdown. Exporting pools is only meant to be performed explicitly by the administrator of the system.

The old behavior of searching and importing all visible pools is preserved and can be switched on by heeding the warning and toggling the ZPOOL_IMPORT_ALL_VISIBLE variable in /etc/default/zfs.

Signed-off-by: James Lee <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #3777
Closes #3526
Bug Fixes:

* Fix CPU hotplug openzfs/spl#482
* Disable dynamic taskqs by default to avoid deadlock openzfs/spl#484
* Don't import all visible pools in zfs-import init script openzfs/zfs#3777
* Fix use-after-free in vdev_disk_physio_completion openzfs/zfs#3920
* Fix avl_is_empty(&dn->dn_dbufs) assertion openzfs/zfs#3865

Signed-off-by: Nathaniel Clark <[email protected]>
Change-Id: I36347630be2506bee4ff0a05f1b236ba2ba7a0ae
Reviewed-on: http://review.whamcloud.com/16877
Reviewed-by: Alex Zhuravlev <[email protected]>
Reviewed-by: Andreas Dilger <[email protected]>
Tested-by: Jenkins
Tested-by: Maloo <[email protected]>
A pool should not be imported unless it was already imported on the system, i.e. recorded in the cachefile; this avoids issues with multipath. I had not scrutinized the zfs-import script too closely when @FransUrbo proposed it, and unfortunately this got by me. Another Gentoo developer who set up a system with a separate pool for /home has the boot process breaking on him because the script is not finding it; if the cachefile's contents were properly obeyed, this would not be an issue. The scanning behavior should not be the default: the script should import only when it knows an import is acceptable, rather than all the time. E.g. a pool on a USB key would be auto-imported by this, which is likely not what the system administrator would want.
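As an aside, one way to check which pools the cachefile currently claims for a system is to dump it with zdb, which, run with no arguments, prints the cached pool configuration (the -U form for an alternate cachefile path is shown as well):

```sh
# Show the pools recorded in the default cachefile
zdb
# Or inspect a specific cachefile
zdb -U /etc/zfs/zpool.cache
```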