You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This issue was confirmed privately with hx235 and the explanations on behavior and reproduction steps are provided by her. Opening this issue for traceability.
Assuming disable_wal = true, atomic_flush = false in the example below:
Expected behavior
Assuming no disk corruptions, manual removal of SST files, etc. the recovered database is expected to contain CF2 with mutations 8, 11.
There is no hole in CF2 since we did not skip any operation (e.g. if we skipped 8 and have 11, then it would have been an invalid recovery).
This is a valid recovery despite that there exists data loss to other CFs. Such data loss is fine since atomic flush is false and CFs are independent from each other.
Actual behavior
The current crash-recovery-verification in crash tests actually does not behave correctly and it will treat it as incorrect recovery.
The reason is crash-recovery-verification assumes everything up to db→GetLatestSequenceNumber() is recovered, but this example shows that it is not the case for CFs 1, 3. In this case, the ExpectedState was replayed up to 11 and will have below data:
CF1 - 1, 2, 3, 4, 5, 6, 10
CF2 - 8, 11, explicit flush call just for CF2 after mutation with sequence number 11
CF3 - 7, 9
This will lead to an error in VerifyOrSyncValue() and signal that the recovery of the database has failed, while the database did recover all the data it could with due efforts.
Steps to reproduce the behavior
The above failure can be reproduced by running db stress manually:
(1) Run db stress under a debugger with —column_families=3 —disable_wal=1 –atomic_flush=0 —max_key=15 —flush_one_in=6 (and all the other xx_one_in=0 for simplicity) —writepercent=95 (and all the other xxpercent to be in total 5%) —threads=1 —sync_fault_injection=1
(2) Kill or exit the debugger when it finishes flush call on CF2’s 11 while all the other CFs are unflushed and contain seqnos between 8 and 11.
(3) Re-run db stress with the same settings but with —destroy_db_initially=0.
(4) The verification will fail for CF1 or CF3 on seqno before 11 saying something like “value_from_expected:.... Value not found: NotFound: Crash-recovery verification failed :(”
The text was updated successfully, but these errors were encountered:
I wonder if there is more detail on the use case or possible bugs. Typically atomic_flush would be enabled when WAL is disabled and crash-recovery consistency is needed. Otherwise inconsistency across column families will happen by design. I suppose a user could attempt to reconcile the inconsistency by inspecting the SST file seqnos. Or they might not care about consistency. But these users are hypothetical AFAIK, unless I'm told otherwise.
For possible bugs, disabling WAL seems to reduce the number of places a hole could occur, compared to having WAL enabled without atomic_flush. Maybe I'm wrong.
This issue was confirmed privately with hx235 and the explanations on behavior and reproduction steps are provided by her. Opening this issue for traceability.
Assuming disable_wal = true, atomic_flush = false in the example below:
Expected behavior
Assuming no disk corruptions, manual removal of SST files, etc. the recovered database is expected to contain CF2 with mutations 8, 11.
There is no hole in CF2 since we did not skip any operation (e.g. if we skipped 8 and have 11, then it would have been an invalid recovery).
This is a valid recovery despite that there exists data loss to other CFs. Such data loss is fine since atomic flush is false and CFs are independent from each other.
Actual behavior
The current crash-recovery-verification in crash tests actually does not behave correctly and it will treat it as incorrect recovery.
The reason is crash-recovery-verification assumes everything up to db→GetLatestSequenceNumber() is recovered, but this example shows that it is not the case for CFs 1, 3. In this case, the ExpectedState was replayed up to 11 and will have below data:
CF1 - 1, 2, 3, 4, 5, 6, 10
CF2 - 8, 11, explicit flush call just for CF2 after mutation with sequence number 11
CF3 - 7, 9
This will lead to an error in VerifyOrSyncValue() and signal that the recovery of the database has failed, while the database did recover all the data it could with due efforts.
Steps to reproduce the behavior
The above failure can be reproduced by running db stress manually:
(1) Run db stress under a debugger with —column_families=3 —disable_wal=1 –atomic_flush=0 —max_key=15 —flush_one_in=6 (and all the other xx_one_in=0 for simplicity) —writepercent=95 (and all the other xxpercent to be in total 5%) —threads=1 —sync_fault_injection=1
(2) Kill or exit the debugger when it finishes flush call on CF2’s 11 while all the other CFs are unflushed and contain seqnos between 8 and 11.
(3) Re-run db stress with the same settings but with —destroy_db_initially=0.
(4) The verification will fail for CF1 or CF3 on seqno before 11 saying something like “value_from_expected:.... Value not found: NotFound: Crash-recovery verification failed :(”
The text was updated successfully, but these errors were encountered: