hostid deprecation causes zpool.cache mismatch and zpool import failure #2794
Comments
Change the zpool program to skip its hostid mismatch check in the same way that libzfs already does. Invoked imports fail if the ZPOOL_CONFIG_HOSTID nvpair is missing in the /etc/zfs/zpool.cache file, which can happen as of the /etc/hostid deprecation in commit openzfs/spl@acf0ade. Closes: openzfs#2794
@dajhorn Thanks for getting to the bottom of this. This clearly explains why this caused issues on some systems and not others.
I think we should definitely do this. For better or worse the upstream code is designed to use a hostid of 0 to disable these checks and it would be desirable to remain compatible with that logic.
Sadly we can't disable these checks entirely. For sites which are using ZFS in a legitimate failover configuration, it's the only multimount protection they have, at least until a robust system like the one described in #745 is implemented.
I like this idea a lot. Unifying the behavior between user space and kernel space will simplify things. For example, we'd be able to remove the conditional logic here and here if we provided a Can you propose a patch for this?
Change the zpool program to skip its hostid mismatch check in the same way that libzfs already does. Invoked imports fail if the ZPOOL_CONFIG_HOSTID nvpair is missing in the /etc/zfs/zpool.cache file, which can happen as of the /etc/hostid deprecation in commit openzfs/spl@acf0ade. Signed-off-by: Darik Horn <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes openzfs#2794
The old boot.spl.hostid option was not working correctly due to an upstream bug. Instead, we now create the /etc/hostid file so that all applications (including the ZFS kernel modules, ZFS user-space applications and other unrelated programs) pick up the same system-wide host ID. Note that glibc (and by extension, the `hostid` program) also respects the host ID configured in /etc/hostid, if it exists.

The hostid option is now mandatory when using ZFS. Without it, ZFS requires you to force-import your pools before you can use them, which is undesirable because forcing disables some of the checks that ZFS performs to make sure a pool is safe to import. The /etc/hostid file must also exist when booting the initrd, before the SPL kernel module is loaded, so that ZFS picks up the hostid correctly.

The complexity in creating the /etc/hostid file comes from having to write the host ID as a 32-bit binary value, taking into account the endianness of the machine, while using only shell commands and/or simple utilities (to avoid exploding the size of the initrd).
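To make the binary format concrete, here is a minimal sketch of the encoding step. The commit itself does this with shell utilities; Python is used here only for illustration, and the function names `encode_hostid` and `write_hostid` are hypothetical, not taken from any of the referenced projects.

```python
import sys


def encode_hostid(hostid: int, byteorder: str = sys.byteorder) -> bytes:
    """Return a 32-bit host ID as 4 raw bytes.

    byteorder defaults to the byte order of the machine running this code,
    matching the "endianness of the machine" requirement described above.
    """
    if not 0 <= hostid <= 0xFFFFFFFF:
        raise ValueError("hostid must fit in 32 bits")
    return hostid.to_bytes(4, byteorder)


def write_hostid(path: str, hostid: int) -> None:
    """Write the host ID to `path` as a 4-byte binary value (illustrative helper)."""
    with open(path, "wb") as f:
        f.write(encode_hostid(hostid))
```

For instance, `encode_hostid(0x00bab10c, "little")` yields the bytes `0c b1 ba 00`, while `"big"` yields `00 ba b1 0c`, which is why the byte order has to be taken into account when generating the file.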
According to openzfs/spl@acf0ade openzfs/zfs#2794 the hostid handling is not needed anymore. If /etc/hostid does not exist, then spl treats it as 0 and continues operation. Closes #60 Closes #31
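The fallback described in this commit message can be sketched as follows. This is an illustration only, not the SPL's code: it assumes the file holds a 4-byte value in the machine's native byte order (as described earlier in this thread), and `read_hostid` is a hypothetical name.

```python
import sys


def read_hostid(path: str = "/etc/hostid") -> int:
    """Return the host ID stored at `path`, or 0 if the file does not exist."""
    try:
        with open(path, "rb") as f:
            raw = f.read(4)
    except FileNotFoundError:
        # Missing file: fall back to zero, as the commit message describes.
        return 0
    if len(raw) != 4:
        # Assumption of this sketch: a truncated file is treated like a missing one.
        return 0
    return int.from_bytes(raw, sys.byteorder)
```

A result of 0 here is the case that leaves the pool cache without a recorded host ID, which is what the issue above is about.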
Commit openzfs/spl@acf0ade deprecates the /etc/hostid file, relaxes its handler, and sets a default of zero. The new default breaks userland imports by causing the /etc/zfs/zpool.cache file to be updated without a ZPOOL_CONFIG_HOSTID nvpair.

When the pool cache is in this state, invoked pool imports always fail. For example:
This happens because:
- The zpool program assumes that ZPOOL_CONFIG_HOSTID exists in the configuration at https://github.com/zfsonlinux/zfs/blob/master/cmd/zpool/zpool_main.c#L1919
- The zpool program still calls gethostid() from the system library, which causes an SPA mismatch if a generated value is returned at https://github.com/zfsonlinux/zfs/blob/master/cmd/zpool/zpool_main.c#L1921
- Unlike zpool, the libzfs library uses zero to skip its mismatch check at https://github.com/zfsonlinux/zfs/blob/master/lib/libzfs/libzfs_status.c#L222
- The zfs module uses zero to skip a VERIFY at https://github.com/zfsonlinux/zfs/blob/master/module/zfs/spa_config.c#L407

Automatic pool import is unaffected because it runs in the kernel on a different code path. This corner case is easier to notice when the proposal in #2779 is enabled (i.e., zfs_autoimport_disable=1 is the system default).

Fuzzing this behavior on a test bench sometimes causes the additional disappearance of ZPOOL_CONFIG_HOSTNAME, which causes assertion failures later.

Solaris 11 updates its /etc/zfs/zpool.cache file identically when its /etc/hostid file is forced to zero using the "_I________" string, but imports are not broken when the hostid is missing or the hostname is empty.

For code consistency, zpool could be patched to skip its mismatch check on zero too.

Alternatively, given that zero is a valid hostid on Linux and seems to be something special on Solaris, the solution could be one or more of:
- [...] /etc/hostid file actually exists.
- [...] gethostid() so that it behaves like zone_get_hostid() in the SPL.
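The zero-skip behavior proposed for zpool can be sketched as below. This is a simplified illustration of the idea, not the actual patch: the pool configuration is modeled as a plain mapping rather than an nvlist, and `HOSTID_KEY` and `hostid_mismatch` are hypothetical names standing in for the ZPOOL_CONFIG_HOSTID lookup.

```python
from typing import Mapping

# Hypothetical stand-in for the ZPOOL_CONFIG_HOSTID nvpair name.
HOSTID_KEY = "hostid"


def hostid_mismatch(config: Mapping[str, int], system_hostid: int) -> bool:
    """Report whether the pool was last used by a different host.

    A missing or zero recorded hostid skips the check instead of
    failing, mirroring the behavior the issue attributes to libzfs.
    """
    pool_hostid = config.get(HOSTID_KEY, 0)
    if pool_hostid == 0:
        # No hostid recorded in the cache: nothing to compare against.
        return False
    return pool_hostid != system_hostid
```

With this logic, a cache entry written without a hostid no longer blocks an invoked import, while a pool that records a different non-zero hostid is still flagged, so the multimount protection for failover setups is kept.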