Odd MD pseudo-crash related to ZFS memory issue? #1619
Comments
On 07/30/2013 10:48 PM, cousins wrote:
Why do you suspect ZFS?
I saw similar issues many times with md raid. tamas
Hi Tamas, we've been having a fair amount of trouble with ZFSonLinux that makes me a bit gun-shy: #1179 and openzfs/spl#247. I've used MD for over 10 years on many systems and I've never seen this behavior before. I admit that I have no proof that ZFS had anything to do with this; that is why I'm asking if anyone has seen anything like it. Just trying to get more information. Steve
I saw similar (not exactly the same) issues when there was a failing HDD (check SMART), a bad SATA/power connector, or a flaky HBA or its driver. In a similar case with HW RAID I saw timeouts in both the Linux and the controller logs. In such a case disabling the HDD write cache can help (if it's a HW RAID array).
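Not from the original thread, but as a hedged illustration of the checks suggested above, here is a minimal Python sketch. It assumes `smartctl` (smartmontools) and `hdparm` are installed, and the device names are placeholders, not devices from this report.

```python
#!/usr/bin/env python3
# Sketch only: device names below are placeholders; adjust for your system.
import subprocess

DEVICES = ["/dev/sda", "/dev/sdb"]  # hypothetical member disks

def smart_health(dev):
    """Return smartctl's overall-health line for a device (requires smartmontools)."""
    out = subprocess.run(["smartctl", "-H", dev],
                         capture_output=True, text=True).stdout
    for line in out.splitlines():
        if "overall-health" in line or "SMART Health Status" in line:
            return line.strip()
    return "no health line found"

def disable_write_cache(dev):
    """Disable the on-disk write cache via `hdparm -W 0` (usually only sensible behind HW RAID)."""
    subprocess.run(["hdparm", "-W", "0", dev], check=True)

if __name__ == "__main__":
    for dev in DEVICES:
        print(dev, "->", smart_health(dev))
        # disable_write_cache(dev)  # uncomment deliberately; this changes drive behavior
```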
Even if this was caused by ZFS, without additional information to go on there's not much that can be done.
Just checking to see if anybody else has had anything similar to this happen:
I have a large ZFS pool (60 4 TB disks in six groups of 10-disk raidz2's) that I have been rsyncing data to from a couple of other systems. While on vacation and checking in (I know, always a bad idea) I noticed one of the rsyncs had hung but the others were still going. While investigating I found that certain commands would give me "I/O error" messages. Then I found that the mirrored OS volume was degraded, with /dev/sda2 having been thrown out; sda1 and sda3 were still active in their mirrors though. /usr/bin and /usr/sbin showed I/O errors and my vacation wasn't much fun for a while.
I eventually booted from a DVD and poked around. The md devices were fine. The underlying hardware was fine. The file system was fine. I booted into the OS (CentOS 6.4) again and everything was fine. I didn't have to add /dev/sda2 back into the mirror and have it sync. It was just fine.
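Not part of Steve's report, but a small sketch of how the mirror state could be verified: parse /proc/mdstat for degraded arrays and, if a member really had been kicked out, re-add it with `mdadm --add`. The md and partition names in the comment are taken from the description above and are otherwise assumptions.

```python
#!/usr/bin/env python3
# Sketch: report md arrays whose member-status string shows a missing device ("_").
import re

def degraded_arrays(mdstat_path="/proc/mdstat"):
    """Yield (array, status) pairs where the [UU...] status contains '_'."""
    with open(mdstat_path) as f:
        text = f.read()
    current = None
    for line in text.splitlines():
        m = re.match(r"^(md\d+)\s*:", line)
        if m:
            current = m.group(1)
        status = re.search(r"\[([U_]+)\]", line)
        if current and status and "_" in status.group(1):
            yield current, status.group(1)

if __name__ == "__main__":
    for array, status in degraded_arrays():
        print(f"{array} is degraded: [{status}]")
        # To re-add a dropped member (example only):
        #   mdadm /dev/md0 --add /dev/sda2
```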
My guess (along with a tech-support person from the vendor we bought the hardware from) is that ZFS somehow tromped on memory and put the root volume in a very weird state. Looking at the logs, I don't see any entries since the 19th.
Has anyone seen anything similar to this?
Thanks,
Steve