-
Notifications
You must be signed in to change notification settings - Fork 6.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't fsync wal before deletion when only single CF exists #3840
Labels
up-for-grabs
Up for grabs
Comments
solicomo
added a commit
to solicomo/rocksdb
that referenced
this issue
Nov 1, 2018
Summary: Call `SyncClosedLogs()` only if there are more than one column families. See facebook#3840 for more info. The fix is pretty simple. We only need to change 0 to 1. But the hard part of this PR is fixing the test failures uncovered by this fix. - The first test failure came up is `DBFlushTest.SyncFail`. It is the easiest one. This test is trying to test the behavior of `SyncClosedLogs` when the file system is inactive. So it expects `SyncClosedLogs` can be called. To make sure `SyncClosedLogs` can be called, we have to create/open RocksDB with at least 2 column families. Extra '/' is to blame for all the following test failures. - The next test failure is `FaultInjectionTestSplitted.FaultTest`. After several steps, this test will try to delete all the files which is not fully synced. `FaultInjectionTestEnv` is used in this test case. `FaultInjectionTestEnv` is tracking all the files opened and deleted. Unfortunately, when a file is opened, `FaultInjectionTestEnv` is told that the file name is like `testrocksdb-26040/fault_test_2273688893793634580/000003.log`, but when that file is deleted, `FaultInjectionTestEnv` is told that the file name is like `testrocksdb-26040/fault_test_2273688893793634580//000003.log`. See the extra '/'? This extra '/' is added to the file name [on line 459 in `db_impl_files.cc`](https://github.com/facebook/rocksdb/blob/v5.15.10/db/db_impl_files.cc#L459). But why this test case passed before? Maybe it passed "by mistake". - `DBWALTestWithEnrichedEnv.SkipDeletedWALs` is the last test failure. After I fixed the extra '/' issue, I got this test failure. This test case was added in this commit: facebook@d595492 This commit tries to skip deleted WALs during recovery. If I am right, it works only when `allow_2pc` is enabled. But in this test case, `allow_2pc` is not enabled at all. Again, why this test case passed before? I think it is the extra '/' who should be blamed.
ajkr
pushed a commit
to cockroachdb/rocksdb
that referenced
this issue
Apr 10, 2019
) Summary: Call `SyncClosedLogs()` only if there are more than one column families. Update several unit tests (in `fault_injection_test` and `db_flush_test`) correspondingly. See facebook#3840 for more info. Pull Request resolved: facebook#4460 Differential Revision: D12896377 Pulled By: riversand963 fbshipit-source-id: f49afdaec32568f12f001219a3aec1dfde3b32bf
ajkr
pushed a commit
to cockroachdb/rocksdb
that referenced
this issue
Apr 16, 2019
) Summary: Call `SyncClosedLogs()` only if there are more than one column families. Update several unit tests (in `fault_injection_test` and `db_flush_test`) correspondingly. See facebook#3840 for more info. Pull Request resolved: facebook#4460 Differential Revision: D12896377 Pulled By: riversand963 fbshipit-source-id: f49afdaec32568f12f001219a3aec1dfde3b32bf
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Calling
SyncClosedLogs
only necessary for system-crash recovery in DBs with multiple CFs: https://github.com/facebook/rocksdb/blob/5.8.fb/db/db_impl_compaction_flush.cc#L109However right now we will always fsync wal even only a single CF exists.
Basically the only code fix needed is to change the condition from
versions_->GetColumnFamilySet()->NumberOfColumnFamilies() > 0
toversions_->GetColumnFamilySet()->NumberOfColumnFamilies() > 1
, but that fix might uncover some test failures so please fix the tests too.The text was updated successfully, but these errors were encountered: