Use after free by zio_done #6401
Comments
Hm, as I was writing that bug report, I also had a closely related crash:
So it looks like something is actively freeing a mutex while it is being used by the I/O thread?
And I had another one in the same mutex reference:
This is again in the same place, spl_mutex_lockdep_on_maybe().
This race happens when a mutex is used for refcounting, i.e. when it is embedded in the ref-counted structure it protects. To prevent it, these two lines should be reversed (see the comment above).
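To make the ordering problem concrete, here is a minimal userspace sketch using pthreads. It is not the SPL code: `sketch_mutex_t`, `m_held`, and every `sketch_*` function are hypothetical names made up for illustration, and `m_held` merely stands in for whatever bookkeeping a mutex wrapper keeps next to the lock. The point it tries to show is that the real unlock must be the last access to the mutex, because the moment it is released another thread may drop the final reference and free the structure that contains it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a kmutex_t-like wrapper; NOT the SPL definition. */
typedef struct {
	pthread_mutex_t m_mutex;
	int m_held;		/* bookkeeping kept next to the lock */
} sketch_mutex_t;

/* The mutex is embedded in the ref-counted object it protects. */
typedef struct {
	sketch_mutex_t lock;
	int refcount;
} sketch_obj_t;

static void sketch_mutex_enter(sketch_mutex_t *mp)
{
	pthread_mutex_lock(&mp->m_mutex);
	mp->m_held = 1;
}

static void sketch_mutex_exit(sketch_mutex_t *mp)
{
	/* All bookkeeping that touches *mp happens while the lock is held. */
	mp->m_held = 0;
	/* The unlock is the last access to *mp in this function. */
	pthread_mutex_unlock(&mp->m_mutex);
	/*
	 * Touching *mp here would be the bug: another thread may already
	 * have taken the lock, dropped the last reference and freed it.
	 */
}

/* Drop one reference; returns 1 if this call freed the object. */
static int sketch_obj_rele(sketch_obj_t *obj)
{
	int last;

	sketch_mutex_enter(&obj->lock);
	assert(obj->refcount > 0);
	last = (--obj->refcount == 0);
	sketch_mutex_exit(&obj->lock);

	if (last) {
		pthread_mutex_destroy(&obj->lock.m_mutex);
		free(obj);
	}
	return last;
}

static void *sketch_releaser(void *arg)
{
	return (void *)(intptr_t)sketch_obj_rele(arg);
}

/* Two threads each drop one of two references; returns the number of frees. */
int sketch_run_two_releasers(void)
{
	pthread_t t[2];
	void *res;
	int i, frees = 0;
	sketch_obj_t *obj = malloc(sizeof (*obj));

	assert(obj != NULL);
	pthread_mutex_init(&obj->lock.m_mutex, NULL);
	obj->lock.m_held = 0;
	obj->refcount = 2;

	for (i = 0; i < 2; i++)
		pthread_create(&t[i], NULL, sketch_releaser, obj);
	for (i = 0; i < 2; i++) {
		pthread_join(t[i], &res);
		frees += (int)(intptr_t)res;
	}
	return frees;
}
```

If the `m_held = 0` store were moved after `pthread_mutex_unlock()`, the thread that dropped the first reference could write into memory the other thread has just freed, which is the kind of access a debug_pagealloc kernel turns into an immediate crash.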
Thanks! Reversing the two lines is an effective fix for me, and looking at them, the current order does look very incorrect indeed.
It was unfortunately overlooked since it was part of a larger change. I think the right way to move forward here would be to get a minimal fix into the 0.7 release branch and master, then work on getting the larger change reviewed and merged to master. @ironMann it would be great if you could open a PR with the minimal fix. It sounds like @verygreen can confirm it does resolve the issue as expected.
Flipping those two lines will indeed fix this, though I think you need to flip the top two lines as well. Otherwise, the lock/unlock will become unbalanced in lockdep.
Prevent race on accessing kmutex_t when the mutex is embedded in a ref counted structure. Issue openzfs/zfs#6401 Signed-off-by: Gvozden Neskovic <[email protected]>
@verygreen kudos for finding this, also for the intro to @tuxoko I believe you're right 👍
Prevent race on accessing kmutex_t when the mutex is embedded in a ref counted structure. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Chunwei Chen <[email protected]> Signed-off-by: Gvozden Neskovic <[email protected]> Closes openzfs/zfs#6401 Closes #637
I started doing Lustre-on-ZFS testing on a kernel with debug_pagealloc enabled (the option that unmaps memory once it is freed, to catch use-after-free cases), and I have been hitting a lot of crashes like this one.
This is an otherwise standard RHEL 7.2 kernel, version 3.10.0-327.22.2.el7, with just a lot of debug options enabled and mutex_lock_nested converted to EXPORT_SYMBOL (without GPL).
In particular, LOCK_DEP is enabled.
I will try plain ZFS to see if I can arrive at a plain POSIX reproducer, but I wanted to post it here in case somebody sees an obvious problem right away.
This is in
So it seems the mutex itself is freed by the time we are trying to check it? This certainly is not great.