NAS-115860 / 22.12 / fix importing zpools on SCALE #8972

Merged 5 commits on May 18, 2022


@yocalebo (Contributor) commented May 17, 2022

Discovered and witnessed on an internal M50 (no expansion shelf) as well as on an R50b with an ES60 expansion shelf (in the field). The short version: the /dev/disk/by-partuuid symlinks haven't been created yet by the time the raw devices are populated inside /dev/.

Since we import the zpools specifying both /dev/disk/by-partuuid AND /dev/, zpool imports the disks via gptid, but if it can't find one, it chooses a random raw device instead. The device letters for raw devices aren't guaranteed across reboots and often change, so when the zpool is imported, certain devices show up as "missing" and other drives are put in their place.

This is painful because the zpool ends up imported in an unhealthy state. This PR adds a simple helper script that gets called as a prerequisite to the ExecStart entries in the ix-zfs.service file. Testing this on the M50 fixed the problem: the zpool imported by gptid and came up healthy.

NOTE: This is still "flawed" and not something that I want to do but our hand is forced. The "proper" solution is for openzfs to actually integrate with systemd and use the proper mechanism(s) to automagically handle this.

ix-zfs.service has a time limit of 15 minutes, while we allow udev "events" to be received for a maximum of 10 minutes. This worked on the R50b and the M50, so I'm leaving those time limits for now.
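For readers who want a feel for the "wait until things go quiet, but give up after a deadline" idea, here is a minimal sketch. It is not the helper script from this PR: it polls the symlink directory rather than listening for udev events, and `wait_for_quiet`, `partuuid_links`, the 5-second quiet window and the polling interval are all hypothetical names and values chosen for illustration. Only the 10-minute ceiling comes from the description above.

```python
import os
import time


def partuuid_links(path="/dev/disk/by-partuuid"):
    """Return the current set of symlink names, or an empty set if the
    directory does not exist yet (hypothetical helper)."""
    try:
        return frozenset(os.listdir(path))
    except FileNotFoundError:
        return frozenset()


def wait_for_quiet(snapshot, quiet_secs=5.0, max_secs=600.0, interval=0.5,
                   clock=time.monotonic, sleep=time.sleep):
    """Return True once snapshot() has stayed unchanged for quiet_secs.

    Return False if max_secs elapses first. The clock and sleep callables
    are injectable so the logic can be exercised without real waiting.
    """
    deadline = clock() + max_secs
    last = snapshot()
    last_change = clock()
    while clock() < deadline:
        sleep(interval)
        current = snapshot()
        now = clock()
        if current != last:
            # Something appeared or disappeared: restart the quiet window.
            last, last_change = current, now
        elif now - last_change >= quiet_secs:
            return True
    return False


if __name__ == "__main__":
    settled = wait_for_quiet(partuuid_links)
    print("settled" if settled else "timed out")
```

The overall deadline has to be shorter than the service's own time limit, otherwise systemd would kill the unit before the helper gets a chance to give up on its own.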

BONUS: adding After=systemd-udev-settle.service is what I initially thought would be the "solution", but that makes systemd emit a deprecation warning. Furthermore, the documentation uses big scary verbiage about how this is absolutely the wrong approach, so I've refrained from doing that.

@yocalebo yocalebo requested a review from a team May 17, 2022 17:21
@bugclerk bugclerk changed the title fix importing zpools on SCALE NAS-115860 / 22.12 / fix importing zpools on SCALE May 17, 2022
@sonicaj (Member) left a comment

If you think it's good, I don't mind. I just didn't see a reason to let a list/deque grow to a huge length when we are only interested in the last 2 events at most.

@yocalebo (Contributor, Author)

> If you think it's good, I don't mind. I just didn't see a reason to let a list/deque grow to a huge length when we are only interested in the last 2 events at most.

That's a valid point. Let me make sure .append() doesn't block once it's full.
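As a quick check of that concern, assuming the container in question is a `collections.deque` created with `maxlen=2`: appending to a full bounded deque neither blocks nor raises, it silently drops the item at the opposite end. (A full `queue.Queue` behaves differently, since its `put()` blocks by default.) The event strings below are made up for illustration.

```python
from collections import deque

# Keep only the two most recent events; older ones fall off the left side.
recent = deque(maxlen=2)

for event in ("add sda1", "add sdb1", "add sdc1"):
    recent.append(event)  # returns immediately, even when the deque is full

print(list(recent))  # ['add sdb1', 'add sdc1']
```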

@yocalebo (Contributor, Author)

retest this please

@themylogin (Contributor)

@yocalebo @sonicaj why did we remove flake8-import-order?

@themylogin (Contributor)

Also, why is this a script outside of middleware?
