raidz1: device IO failure when zfs filesystem is full #742
Maybe this is a special issue with loopback devices in general. I tried to reproduce the problem with "real" devices in a VirtualBox VM (different kernel, 2.6.32), but no problems so far. Only the raidz1 based on loopback devices won't work; it killed the whole VM:

```
May 13 16:00:55 debian kernel: [ 1016.353645] BUG: unable to handle kernel NULL pointer dereference at (null)
```

(followed by several dozen repeated "Message from syslogd@debian" broadcasts)
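For reference, a minimal sketch of the kind of loopback-backed raidz1 setup described above; the backing file paths, sizes, and pool name are illustrative assumptions, not taken from the original report:

```bash
# Create three ~100 MB backing files (paths are hypothetical).
for i in 0 1 2; do
    dd if=/dev/zero of=/var/tmp/zdisk$i bs=1M count=100
done

# Attach them to loop devices.
for i in 0 1 2; do
    losetup /dev/loop$i /var/tmp/zdisk$i
done

# Build a three-device raidz1 pool on the loop devices.
# -f overrides complaints about existing labels on reused files.
zpool create -f pool1 raidz1 /dev/loop0 /dev/loop1 /dev/loop2
zpool status pool1
```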
Regarding running on loopback devices, you might try using the latest master source instead of -rc8. There were a few changes merged which improved memory management and may avoid certain deadlock scenarios similar to what you've described.
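For anyone wanting to try master, a rough sketch of building from git at the time, assuming the kernel headers and build dependencies are already installed and that the matching SPL tree is built first; exact steps may differ on your distribution:

```bash
# Build and install SPL from master first.
git clone https://github.com/zfsonlinux/spl.git
cd spl && ./autogen.sh && ./configure && make && sudo make install && cd ..

# Then build and install ZFS from master.
git clone https://github.com/zfsonlinux/zfs.git
cd zfs && ./autogen.sh && ./configure && make && sudo make install
```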
This should be resolved in the master source, which will be tagged as -rc11.
I'm still stress testing zfs with tools like bonnie, iozone, dbench, and some self-made beastliness on some small (~100 MB) loopback devices. ;-)
I think I've found a problem regarding device management in raidz1 mode: if the file system is full, "zpool status -v" reports file errors in the metadata:
```
root@heros:~/test# zpool status -v
  pool: pool1
 state: ONLINE
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
   see: http://zfsonlinux.org/msg/ZFS-8000-HC
  scan: scrub repaired 0 in 0h0m with 0 errors on Sun May 13 13:16:16 2012
config:

errors: Permanent errors have been detected in the following files:
```
The pool ends up in this irreversible state of blocked IO.
So far, this happens in two- and three-device raidz1 configurations, but not in one-, two-, or three-device mirrors or plain striped pools.
The error occurs with and without continuous scrubbing (one scrub started per second).
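To make the trigger concrete, a hedged sketch of the fill-and-scrub sequence implied by the report; the pool name and default /pool1 mount point are assumptions:

```bash
# Fill the pool until writes fail with ENOSPC;
# dd exits nonzero at that point, so keep going.
dd if=/dev/zero of=/pool1/fill bs=1M || true

# Optionally keep starting scrubs, roughly one per second,
# as in the report. Re-issuing scrub on a busy pool errors,
# so discard that noise.
while true; do
    zpool scrub pool1 2>/dev/null
    sleep 1
done &

# Inspect the resulting pool state.
zpool status -v pool1
```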