-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Two bugfixes for 2.0.6 #12346
Two bugfixes for 2.0.6 #12346
Conversation
This reverts commit 13fac09. Per the discussion in openzfs#11531, the reverted commit---which intended only to be a cleanup commit---introduced a subtle, unintended change in behavior. Care was taken to partially revert and then reapply 10b3c7f which would otherwise have caused a conflict. These changes were squashed in to this commit. Reviewed-by: Brian Behlendorf <[email protected]> Suggested-by: @chrisrd Suggested-by: [email protected] Signed-off-by: Antonio Russo <[email protected]> Closes openzfs#11531 Closes openzfs#12227
Callers of zfs_file_get and zfs_file_put can corrupt the reference counts for the file structure resulting in a panic or a soft lockup. When zfs send/recv runs, it will add a reference count to the open file, and begin to send or recv the stream. If the file descriptor is closed, then when dmu_recv_stream() or dmu_send() return we will call zfs_file_put to remove the reference we placed on the file structure. Unfortunately, because zfs_file_put() uses the file descriptor to lookup the file structure, it may end up finding that the file descriptor table no longer contains the file struct, thus leaking the file structure. Or it might end up finding a file descriptor for a different file and blindly updating its reference counts. Other failure modes probably exists. This change reworks the zfs_file_[get|put] interface to not rely on the file descriptor but instead pass the zfs_file_t pointer around. Reviewed-by: Matthew Ahrens <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Mark Maybee <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Co-authored-by: Allan Jude <[email protected]> Signed-off-by: George Wilson <[email protected]> External-issue: DLPX-76119 Closes openzfs#12299
bb394d0
to
f6d53b3
Compare
I'm pinging this because I'll have some time this weekend to see if this MR resolves #11688, but I'd like some expert review before running it on my desktop. |
When logging TX_SETATTR, we could otherwise fail to initialize part of the corresponding ZIL record depending on which fields are present in the xvattr. Initialize the creation time and the AV scan timestamp to zero so that uninitialized bytes are not written to the ZIL. This was found using KMSAN. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Mark Johnston <[email protected]> Closes openzfs#12383
When allocating a record, we round up the allocation size to a multiple of 8. In this case, any padding bytes should be zeroed, otherwise the contents of uninitialized memory are written to the ZIL. This was found using KMSAN. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Mark Johnston <[email protected]> Closes openzfs#12383
When logging a TX_WRITE record in the case where file data has to be copied from the DMU, we pad the log record size to a multiple of 8 bytes. In this case, any padding bytes should be zeroed, otherwise the contents of uninitialized memory are written to the ZIL. This was found using KMSAN. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Mark Johnston <[email protected]> Closes openzfs#12383
It seems nothing ensures that this array is zeroed when a dnode is freshly allocated, so in principle it retains the values from the previous allocation. In practice it seems to be the case that the fields should end up zeroed, but we can zero the field anyway for consistency. This was found using KMSAN. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Mark Johnston <[email protected]> Closes openzfs#12383
Sorry for the late response - I got assigned to look at this while I was on vacation. Can you re-submit so buildbot fires off again? Looks like a bunch of the builders never completed. |
dda1f2c
to
b54695c
Compare
I pushed these again--I haven't had a chance to look at the failures. |
Description
This MR cherry picks
The second was not completely clean (a very minor fixup due to a631283, which change the ZFS device allocation code, making
zfsdev_minor_alloc
static). @grwilson, could you glance at this and make sure nothing more serious happened since 2.0.5 that would require more changes to your MR?(Obviously, this shouldn't be merged until the aforementioned MR is actually finalized)(Both commits are now been merged.)Motivation and Context
I suspect these two patches might resolve a bug I'm experiencing since 2.0, and I'd like feedback on the cherry pick before I run it on the production machine --- the only way I have figured out how to reproduce the bug.
How Has This Been Tested?
ZTS locally (a VM). I opened this MR partly for the CI infrastructure feedback.
Types of changes
Checklist:
Signed-off-by
.