This repository has been archived by the owner on Feb 26, 2020. It is now read-only.

System hang #41

Closed
wengole opened this issue May 6, 2011 · 7 comments

@wengole

wengole commented May 6, 2011

I have been experiencing frequent total system hangs whilst using the 0.6.0 RCs. Is there somewhere else I should be looking too? Below is all I can find related to the hangs in the kernel logs:

May 5 19:03:21 san kernel: [ 381.857685] SPL: Showing stack for process 3044
May 5 19:03:21 san kernel: [ 381.857696] Pid: 3044, comm: zpool Tainted: P 2.6.38-8-generic #42-Ubuntu
May 5 19:03:21 san kernel: [ 381.857702] Call Trace:
May 5 19:03:21 san kernel: [ 381.857730] [] ? spl_debug_dumpstack+0x27/0x40 [spl]
May 5 19:03:21 san kernel: [ 381.857747] [] ? kmem_alloc_debug+0x11d/0x130 [spl]
May 5 19:03:21 san kernel: [ 381.857857] [] ? zfs_ioc_pool_get_history+0xaf/0x120 [zfs]
May 5 19:03:21 san kernel: [ 381.857871] [] ? pool_namecheck+0x5e/0x160 [zcommon]
May 5 19:03:21 san kernel: [ 381.857932] [] ? zfsdev_ioctl+0xe8/0x1b0 [zfs]
May 5 19:03:21 san kernel: [ 381.857943] [] ? do_vfs_ioctl+0x8f/0x360
May 5 19:03:21 san kernel: [ 381.857952] [] ? vfs_write+0x123/0x180
May 5 19:03:21 san kernel: [ 381.857959] [] ? sys_ioctl+0x91/0xa0
May 5 19:03:21 san kernel: [ 381.857967] [] ? system_call_fastpath+0x16/0x1b

@behlendorf
Contributor

The kernel logs are absolutely the place to look. However, the message you posted above is just an unrelated warning and doesn't indicate what the problem is. If you haven't been testing with the latest -rc4 release candidate, please try it. We've resolved a couple of bugs which can result in system hangs:

https://github.com/downloads/behlendorf/spl/spl-0.6.0-rc4.tar.gz
https://github.com/downloads/behlendorf/zfs/zfs-0.6.0-rc4.tar.gz
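
For checking the kernel logs as suggested above, a small filter can help separate SPL/ZFS messages from the rest. This is only a sketch: the function name zfs_log_lines is made up for illustration, and the pattern assumes messages are prefixed with SPL: or ZFS:, or carry a stack frame tagged [spl], [zfs], or [zcommon], as in the trace posted earlier.

```shell
#!/bin/sh
# Hypothetical helper (not part of SPL/ZFS): keep only kernel log lines
# that look SPL/ZFS related. Reads log text on stdin.
zfs_log_lines() {
    # Match "SPL:" / "ZFS:" prefixed messages, or stack frames from the
    # spl, zfs, or zcommon modules.
    grep -E 'SPL:|ZFS:|\[(spl|zfs|zcommon)\]'
}
```

Typical use would be dmesg | zfs_log_lines, or zfs_log_lines < /var/log/kern.log on a system that writes kernel messages to that file (an assumption; the path varies by distribution).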

@wengole
Author

wengole commented May 6, 2011

Just upgraded to the latest packages in Darik's PPA and I'm still getting system hangs every time I try to load the module. Well, at least I think that's when it happens. This is the sequence:

  • System hangs
  • Reboot system
  • Login via SSH
  • Everything's fine
  • zpool status ... SSH session becomes unresponsive
  • Console shows a stack trace and has frozen

Unfortunately the stack trace doesn't get saved anywhere that I can see, so it's hard to submit it.

My system is running Ubuntu Server 11.04 with the 2.6.38-8-generic kernel.

@wengole
Author

wengole commented May 7, 2011

I managed to take a picture of (at least part of) the kernel panic on the console. I hope it helps:
https://picasaweb.google.com/lh/photo/VLOsHcPMUJ7RV5eHBnXMZyx8BRHVeF9eujsD7Y6Ailc?feat=directlink

@behlendorf
Contributor

That's exactly what we needed to know to fix this. It looks like a stack overflow caused by a scrub you started. Can you try setting the zfs_no_scrub_io=1 module option? This should allow you to import the pool and then manually stop the scrub with zpool scrub -s pool, until we can come up with a proper long-term fix.
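
The workaround above could be sketched roughly as follows. This is an illustration, not a tested procedure: it assumes the zfs module is not loaded yet, that the option is passed on the modprobe command line, and that the pool is called tank (a placeholder). The function name scrub_workaround and the RUN variable are made up for the example; RUN defaults to echo so the commands are only printed.

```shell
#!/bin/sh
# Sketch of the scrub workaround. Assumptions: the zfs module is not yet
# loaded, and the pool name is passed as the first argument.
# RUN defaults to "echo" (dry run); call with RUN= as root to execute.
scrub_workaround() {
    pool="$1"
    run="${RUN-echo}"

    # Load the module with scrub I/O disabled.
    $run modprobe zfs zfs_no_scrub_io=1 || return 1

    # Import the pool, then stop the in-progress scrub.
    $run zpool import "$pool" || return 1
    $run zpool scrub -s "$pool" || return 1
}
```

Running scrub_workaround tank prints the three commands for review; RUN= scrub_workaround tank would execute them. If the module is loaded automatically at boot, the option would have to be set before that point instead, for example through the distribution's module configuration.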

@wengole
Author

wengole commented May 7, 2011

Thanks. I managed to get it going again, and now I know not to do any scrubbing. At least for the time being.

@psynophile

I'm seeing the same thing here. Even the same panic. I was able to use the zfs_no_scrub_io option that's listed here to cancel the scrub and mount the pool, but now I'm getting massively slow and glitchy SSH sessions and it doesn't look like anything else is going on (top reports more than half of my RAM is available and load averages are below 0).

I'm using Ubuntu 10.10 with kernel 2.6.35-28-server 64-bit on a 24T RAID card from Areca (ARC-1680) with JBOD configured on the card. I compiled 0.6.0-rc4.

If there's any other output or logs that I can provide, just let me know and thanks for working on this.

@behlendorf
Contributor

Closing issue; several stack fixes have been merged into master to address this.


3 participants