"Importing ZFS pool xyz Out of memory" crash at boot. #3863
Comments
Might this be related to #3866?
Hello Frans, I have not yet tried to mount the zraid1 pool by hand inside Linux to see if it complains about memory the same way it does during boot. I am, however, testing other ZFS RAID options instead of zraid1. So far with zraid10 (2x1TB + 2x1TB), I have already copied 700GB of data, and the computer boots just fine into Linux. In contrast, with zraid1 the computer will not boot if I have more than 10GB of data on the zpool (4x1TB). There is a definite issue with zraid1 and out-of-memory problems, at least with 4 hard drives. I will also try zraid1 with 2 and 3 hard drives to reach a conclusion.
@DannCos from what you're describing it sounds as if we're allocating a significant amount of working memory during the import. Two quick questions might help us narrow this down.
@behlendorf Am I correct to assume "sudo shutdown -r now" properly unmounts the zpool before actually rebooting? I never read we were supposed to export the zpool before rebooting the OS. I say this because you might be correct. The issue only happens when rebooting the computer, but only under three simultaneous conditions:
I created and destroyed my pool many times to reach the following conclusions:
Also, one more thing that supports your assumption: after the OS boot hangs with "out of memory", I shut down the computer, insert another 2GB of RAM, and the OS now boots correctly. I shut down the computer again, remove the extra 2GB of RAM, and the OS continues to boot correctly. I then copy another chunk of data to the pool, reboot, and it hangs again. I add the 2GB of RAM again and it boots correctly; I remove the 2GB of RAM and it continues to boot correctly. So every time I copy a chunk of data to the pool and reboot, it hangs. slabtop will not tell me anything, because every time I export/import the pool inside the OS, it mounts properly.
@DannCos it's entirely safe to just pull the plug on the system but when you do so there may be some pending work which needs to be completed during import and subsequent mount. This shouldn't take a significant amount of memory but clearly something unexpected is going on. Unfortunately, the only way we're going to be able to get to the bottom of this is to get some debugging on where the memory is being used at the time of the OOM. Or even just a back trace from the console. If you can drop to a recovery shell after the boot fails and run
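The command in the comment above is cut off in this copy of the thread, and the following is not a reconstruction of it. It is only a rough sketch of one way to see which kernel slab caches hold the most memory, assuming Python is available in whatever shell you end up in and that the text comes from `/proc/slabinfo` (the same source `slabtop` reads). The function names `rank_slab_caches` and `print_report` are made up for this example, and the size is approximated as `num_objs * objsize`, which ignores per-slab overhead.

```python
def rank_slab_caches(slabinfo_text, top=5):
    """Return (cache_name, approx_bytes) pairs, largest first.

    Expects text in the /proc/slabinfo 2.x layout:
    name <active_objs> <num_objs> <objsize> ...
    """
    usage = []
    for line in slabinfo_text.splitlines():
        # Skip blank lines, the version banner, and the "# name ..." header.
        if not line.strip() or line.startswith(("slabinfo", "#")):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue
        name = fields[0]
        num_objs, objsize = int(fields[2]), int(fields[3])
        # Approximation: allocated objects times object size.
        usage.append((name, num_objs * objsize))
    usage.sort(key=lambda item: item[1], reverse=True)
    return usage[:top]


def print_report(path="/proc/slabinfo", top=10):
    """Print the largest caches; reading the file normally needs root."""
    with open(path) as fh:
        for name, size in rank_slab_caches(fh.read(), top=top):
            print(f"{name:30s} {size / (1024 * 1024):10.1f} MiB")
```

Calling `print_report()` as root prints the top caches. As far as I know, ZFS on Linux also tracks some of its own caches under `/proc/spl/kmem/slab`, which this sketch does not read, so a small total here would not rule ZFS out.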
Hello
I replicated the following scenario in 2 different computers running the following specs:
-Computer one has one zpool of 4 x 500GB in zraid1 (4 mirrors).
-Computer two has one zpool of 4 x 1000GB in zraid1 (4 mirrors).
Issue:
After a certain amount of space is filled on your zpool (please jump to the end of this post to find how much was needed to trigger this issue) and you reboot for whatever reason not related to this problem, you find an "Out of memory" warning at boot time that appears while importing the zpool and prevents linux from completing the boot. The error reads
"Importing ZFS pool zpool1000gb Out of memory: Kill process XYZ or sacrifice child
Out of memory: Kill process YZX or sacrifice child
Out of memory: Kill process ZYX or sacrifice child"
-On computer one (4x500GB), the issue occurred after 180GB of data.
-On computer two (4x1000GB), the issue occurred after ONLY 10GB of data.
In both cases, I had to insert another stick of ram (from 4GB to 6GB) for the computer to boot properly.
In both cases, there were no issues while copying the data.
I found this very weird, as I have another server with 8 x 2TB in zraid10 with 16GB of RAM, running with the zpool at 90% capacity, and this never happens.
cheers
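For anyone collecting console output from a failed boot, here is a small sketch for pulling the "Out of memory" lines quoted in the report out of a saved log. It assumes the log is already saved as text. The optional `(name) score N` part of the pattern is my assumption about what fuller kernel messages look like, since the report shows only abbreviated lines. `OOM_RE` and `find_oom_kills` are names invented for this example.

```python
import re

# Matches the abbreviated form quoted in the report:
#   Out of memory: Kill process XYZ or sacrifice child
# and, as an assumption, a fuller form such as:
#   Out of memory: Kill process 1234 (zpool) score 900 or sacrifice child
OOM_RE = re.compile(
    r"Out of memory: Kill process (?P<pid>\S+)"
    r"(?: \((?P<name>[^)]*)\))?"
    r"(?: score (?P<score>\d+))?"
)


def find_oom_kills(log_text):
    """Return (pid, name, score) tuples; name and score are None when absent."""
    return [
        (m.group("pid"), m.group("name"), m.group("score"))
        for m in OOM_RE.finditer(log_text)
    ]
```

Seeing which processes the kernel picks, and in what order, can hint at whether the memory is held by a userspace process or by the kernel itself, in which case killing processes frees very little.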