-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ASSERT "atomic_read(&ZTOI(zp)->i_count) > 0)" in zfs_sb_teardown #1536
Comments
To clarify, the exact way I got this assert the previous time was to:
When the send stream was finished (confirmed by observing "zfs send -v" output), the assert occurred. I'm going to try this a few more times and also take a peek at the pool history. |
I've created the small patch dweeezil/zfs@5cbf1fcd to address this for the time being. I'll likely turn it into a pull request if it's the right thing to do. |
As part of zfs_sb_teardown() there is an assertion that all inodes which are part of the zsb->z_all_znodes list have at least one reference on them. This is always true for the standard unmount case but there are two other cases where it is not strictly true. * zfs_ioc_rollback() - This is the most common case and it results from the fact that we aren't unmounting the filesystem. During a normal unmount the MS_ACTIVE flag will be cleared on the super block causing iput_final() to evict the inode when its reference count drops to zero. However, during a rollback MS_ACTIVE remains set since we're rolling back a live filesystem and need to preserve the existing super block. This allows inodes with a zero reference count to stay in the cache thereby violating the assertion. * destroy_inode() / zfs_sb_teardown() - There exists a small race between dropping the last reference on an inode and removing it from the zsb->z_all_znodes list. This is unlikely to occur but could also trigger the assertion which is incorrect. The inode may safely have a zero reference count in this case. Since allowing a zero reference count on the inode is expected and safe for both of these cases the simplest thing to do is remove the ASSERT. This code is only enabled for default builds so removing this entirely is a very safe change. Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Chris Dunlop <[email protected]> Signed-off-by: Tim Chase <[email protected]> Closes openzfs#1417 Closes openzfs#1536
As part of zfs_sb_teardown() there is an assertion that all inodes which are part of the zsb->z_all_znodes list have at least one reference on them. This is always true for the standard unmount case but there are two other cases where it is not strictly true. * zfs_ioc_rollback() - This is the most common case and it results from the fact that we aren't unmounting the filesystem. During a normal unmount the MS_ACTIVE flag will be cleared on the super block causing iput_final() to evict the inode when its reference count drops to zero. However, during a rollback MS_ACTIVE remains set since we're rolling back a live filesystem and need to preserve the existing super block. This allows inodes with a zero reference count to stay in the cache thereby violating the assertion. * destroy_inode() / zfs_sb_teardown() - There exists a small race between dropping the last reference on an inode and removing it from the zsb->z_all_znodes list. This is unlikely to occur but could also trigger the assertion which is incorrect. The inode may safely have a zero reference count in this case. Since allowing a zero reference count on the inode is expected and safe for both of these cases the simplest thing to do is remove the ASSERT. This code is only enabled for default builds so removing this entirely is a very safe change. Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Chris Dunlop <[email protected]> Signed-off-by: Tim Chase <[email protected]> Closes openzfs#1417 Closes openzfs#1536
I came across this while tracking down the failure reported in #1518. When running with --enable-debug, I can get this assert rather easily:
The problem of #1518 appeared to me to be part of the forced unmount/remount that happens when a zfs receive is sent to a mounted file system. While trying to reproduce the NULL dereference (which I couldn't do), I tried doing various things to make a the receive fail such as cd-ing to the mounted directory on the received side, polluting the mounted FS with new files, etc. The best I could get so far is the above assertion.
I was initially running dweeezil/zfs@357fa8e [which is the same code as my actual pull request dweeezil/zfs@8592f00 but is my working branch with individual commits] and then reverted to 0377189 which is the last master I've pulled and still was able to get the assert.
The text was updated successfully, but these errors were encountered: