[BREAKING] opt(compactions): Improve compaction performance #1574
Conversation
2 issues found. 1 rule errored during the review.
levels.go
Outdated
// concurrently, only iterating over the provided key range, generating tables.
// This speeds up the compaction significantly.
func (s *levelsController) subcompact(it y.Iterator, kr keyRange, cd compactDef,
	inflightBuilders *y.Throttle, res chan *table.Table) {
Returned channels or channel arguments should generally have a direction.
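A minimal, self-contained sketch of what that suggestion would mean for `subcompact`'s `res` parameter: the function only ever sends completed tables on `res`, so declaring it send-only (`chan<- *table.Table`) lets the compiler enforce the direction. The `produce` and `table` names below are hypothetical stand-ins, not Badger's actual code.

```go
package main

import "fmt"

type table struct{ id int }

// produce only sends on res, so declaring it as chan<- *table documents
// the direction and makes an accidental receive a compile-time error.
func produce(n int, res chan<- *table) {
	for i := 0; i < n; i++ {
		res <- &table{id: i}
	}
	close(res)
}

func main() {
	res := make(chan *table)
	go produce(3, res)
	for t := range res {
		fmt.Println("got table", t.id)
	}
}
```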
@@ -66,7 +66,7 @@ var (
 	vlogMaxEntries   uint32
 	loadBloomsOnOpen bool
 	detectConflicts  bool
-	compression      bool
+	zstdComp         bool
Avoid global variables to improve readability and reduce complexity
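For illustration, here is a hypothetical sketch of the pattern this lint rule points toward: grouping package-level flag variables into an options struct that is passed explicitly. The `benchOptions` and `run` names are made up for this example and are not what the PR actually does.

```go
package main

import "fmt"

// benchOptions collects the knobs that were previously package-level vars.
type benchOptions struct {
	vlogMaxEntries   uint32
	loadBloomsOnOpen bool
	detectConflicts  bool
	zstdComp         bool
}

// run receives its configuration explicitly rather than reading globals.
func run(opts benchOptions) {
	fmt.Printf("zstd compression enabled: %v\n", opts.zstdComp)
}

func main() {
	run(benchOptions{
		vlogMaxEntries:   10000,
		loadBloomsOnOpen: true,
		zstdComp:         true,
	})
}
```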
Implement multiple ideas for speeding up compactions:

1. Dynamic Level Sizes: https://rocksdb.org/blog/2015/07/23/dynamic-level.html
2. L0 to L0 compactions: https://rocksdb.org/blog/2017/06/26/17-level-based-changes.html
3. Sub Compactions: Split up one compaction into multiple sub-compactions using key ranges, which can be run concurrently (a minimal sketch follows this description).
4. If a table being generated at Li overlaps with >= 10 tables at Li+1, finish the table. This helps avoid big overlaps and expensive compactions later.
5. Update compaction priority based on the priority of the next level, prioritizing compactions of lower levels over upper levels, resulting in an always healthy LSM tree structure.

With these changes, we can load 1B entries (160GB of data) into Badger (without the Stream framework) in 1h25m at 31 MB/s. This is a significant improvement over current master.

Co-authored-by: Ibrahim Jarif <[email protected]>

fix(tests): Writebatch, Stream, Vlog tests (#1577)

This PR fixes the following issues/tests:
- Deadlock in write batch - Use atomic to set the value of `writebatch.error`
- Vlog Truncate test - Fix issues with empty memtables
- Test options - Set memtable size
- Compaction tests - Acquire lock before updating level tables
- Vlog Write - Truncate the file size if the transaction cannot fit in vlog size
- TestPersistLFDiscardStats - Set numLevelZeroTables=1 to force compaction

This PR also fixes the failing bank test by adding an index cache to the bank test.
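Of the five ideas above, sub-compactions (point 3) are the easiest to show in isolation. The following is a hypothetical Go sketch, not the PR's actual `subcompact` code: one compaction's key range is split into sub-ranges, each sub-range is compacted in its own goroutine, and the resulting tables are collected over a channel. `keyRange`, `splitRange`, and `buildTable` are illustrative names only.

```go
package main

import (
	"fmt"
	"sync"
)

type keyRange struct{ left, right string }

// splitRange carves the keyspace into n sub-ranges by first letter; a real
// implementation would derive split keys from the overlapping tables' bounds.
func splitRange(n int) []keyRange {
	const alphabet = "abcdefghijklmnopqrstuvwxyz"
	step := len(alphabet) / n
	subs := make([]keyRange, 0, n)
	for i := 0; i < n; i++ {
		lo, hi := i*step, (i+1)*step
		if i == n-1 {
			hi = len(alphabet)
		}
		subs = append(subs, keyRange{left: string(alphabet[lo]), right: string(alphabet[hi-1])})
	}
	return subs
}

// buildTable stands in for iterating one sub-range and writing an SSTable.
func buildTable(kr keyRange) string {
	return fmt.Sprintf("table[%s..%s]", kr.left, kr.right)
}

func main() {
	res := make(chan string)

	var wg sync.WaitGroup
	for _, sub := range splitRange(4) {
		wg.Add(1)
		go func(kr keyRange) { // each sub-range is compacted concurrently
			defer wg.Done()
			res <- buildTable(kr)
		}(sub)
	}
	go func() {
		wg.Wait()
		close(res)
	}()

	for t := range res {
		fmt.Println("finished", t)
	}
}
```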