Fixed data integrity issue when underlying disk returns error to zfs #12443
@@ -1179,7 +1179,8 @@ zil_lwb_flush_vdevs_done(zio_t *zio)
 		ASSERT3P(zcw->zcw_lwb, ==, lwb);
 		zcw->zcw_lwb = NULL;
 
-		zcw->zcw_zio_error = zio->io_error;
+		if (zio->io_error != 0)
+			zcw->zcw_zio_error = zio->io_error;
 
 		ASSERT3B(zcw->zcw_done, ==, B_FALSE);
 		zcw->zcw_done = B_TRUE;
@@ -1253,6 +1254,24 @@ zil_lwb_write_done(zio_t *zio)
 	 * written out.
 	 */
 	if (zio->io_error != 0) {
+		/*
+		 * Copy the write error to zcw, because the zil_lwb_write_done
+		 * error is not propagated to zil_lwb_flush_vdevs_done, which
+		 * will cause zil_commit_impl to return without committing
+		 * the data.
+		 * Refer https://github.com/openzfs/zfs/issues/12391
Review comment: @problame thank you for the comments. I have updated the PR.
+		 * for more details.
+		 */
+		zil_commit_waiter_t *zcw;
Review comment: I'm curious: this block of code you're adding looks similar to the block of code in the "flush done" function (i.e. starting at line 1173), but I see some differences, such as the fact that the block you're adding doesn't call:
nor does it set
Are these differences intentional? It's been a while since I've been in this code, so I'm just curious whether we should be using exactly the same logic in both cases, here and in the flush function. In this error case, do we still call the "flush done" function? I presume not, which is why this change is needed, but please correct me if I'm wrong.
+		for (zcw = list_head(&lwb->lwb_waiters); zcw != NULL;
+		    zcw = list_next(&lwb->lwb_waiters, zcw)) {
+			mutex_enter(&zcw->zcw_lock);
+			ASSERT(list_link_active(&zcw->zcw_node));
+			ASSERT3P(zcw->zcw_lwb, ==, lwb);
+			zcw->zcw_zio_error = zio->io_error;
Review comment: AFAIK this is the first place where we might set
Review comment: it is in
so I think we could do this verification here too.
+			mutex_exit(&zcw->zcw_lock);
+		}
+
 		while ((zv = avl_destroy_nodes(t, &cookie)) != NULL)
 			kmem_free(zv, sizeof (*zv));
 		return;
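The interaction between the two hunks can be modelled in isolation. The following is a minimal sketch, not the actual ZFS code: `sketch_waiter_t`, `sketch_write_done`, `sketch_flush_done`, and `sketch_commit_error` are hypothetical stand-ins for the waiter structure and the two done callbacks. It assumes, as the code comment in the diff states, that the flush-done callback can run with `io_error == 0` even though the earlier write failed.

```c
#include <errno.h>

/* Hypothetical stand-in for zil_commit_waiter_t; only the error field is modelled. */
typedef struct sketch_waiter {
	int zio_error;
} sketch_waiter_t;

/* Models the loop added to the write-done path: record a write failure on the waiter. */
void
sketch_write_done(sketch_waiter_t *w, int io_error)
{
	if (io_error != 0)
		w->zio_error = io_error;
}

/*
 * Models the flush-done path. With use_fix == 0 the assignment is
 * unconditional (old behaviour); with use_fix != 0 the stored error is
 * only overwritten by a nonzero error (behaviour after the diff).
 */
void
sketch_flush_done(sketch_waiter_t *w, int io_error, int use_fix)
{
	if (!use_fix || io_error != 0)
		w->zio_error = io_error;
}

/* Runs both callbacks in order and returns the error the waiter would observe. */
int
sketch_commit_error(int use_fix, int write_err, int flush_err)
{
	sketch_waiter_t w = { 0 };

	sketch_write_done(&w, write_err);
	sketch_flush_done(&w, flush_err, use_fix);
	return (w.zio_error);
}
```

Under these assumptions, `sketch_commit_error(0, EIO, 0)` yields 0, so the waiter would see success despite the failed write, while `sketch_commit_error(1, EIO, 0)` keeps `EIO`.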
Review comment: There should be a comment here explaining why we need to do this check.
Also, we should VERIFY that `zcw->zcw_zio_error == 0` before overwriting it with `zio->io_error`. IIUC, we can assert that because we don't issue the flush if the write fails.
We should actually `VERIFY` because a) it's not on a hot code path and b) it's critical for correctness.
The comment should address the assertion as well.
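The suggestion above could look roughly like the sketch below. It is an illustration only: `SKETCH_VERIFY`, `sketch_zcw_t`, and `sketch_record_write_error` are hypothetical names, and the macro is a simplified always-on check written for this example rather than the ZFS `VERIFY` macro itself. The assumption being checked is the one the reviewer states, namely that no error has been stored on the waiter before the write-done path stores one.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical always-on check: aborts regardless of debug or release build. */
#define	SKETCH_VERIFY(cond)						\
	do {								\
		if (!(cond)) {						\
			fprintf(stderr, "VERIFY failed: %s\n", #cond);	\
			abort();					\
		}							\
	} while (0)

/* Hypothetical stand-in for the waiter; only the error field is modelled. */
typedef struct sketch_zcw {
	int zcw_zio_error;
} sketch_zcw_t;

/*
 * Store a write error on the waiter. The flush is assumed not to be
 * issued after a failed write, so nothing should have stored an error
 * here yet; the check makes that assumption explicit before overwriting.
 */
void
sketch_record_write_error(sketch_zcw_t *zcw, int io_error)
{
	SKETCH_VERIFY(zcw->zcw_zio_error == 0);
	zcw->zcw_zio_error = io_error;
}
```

An always-on check fits here for the reasons given in the comment: the error path is not hot, and a silently overwritten error would affect correctness.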