
Drain unnecessary levels when level_compaction_dynamic_level_bytes=true #11340

Closed
cbi42 wants to merge 8 commits

Conversation

@cbi42 (Member) commented Mar 30, 2023

Summary: When a user migrates to level compaction + level_compaction_dynamic_level_bytes=true, or when a DB shrinks, there can be unnecessary levels in the DB. Before this PR, there was no way to remove these levels except with a manual compaction. These extra unnecessary levels make it harder to guarantee max_bytes_for_level_multiplier and can cause extra space amplification. This PR boosts the compaction score for these levels so that RocksDB automatically drains them. Together with #11321, this makes migration to level_compaction_dynamic_level_bytes=true automatic, without requiring the user to do a one-time full manual compaction. Credit: this PR is modified from #3921.
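
For illustration, here is a minimal, self-contained sketch of the draining idea under simplified assumptions (example sizes, a made-up score boost, and no L0 handling; this is not the actual RocksDB scoring code):

#include <cstdint>
#include <iostream>
#include <vector>

int main() {
  const double multiplier = 10.0;             // max_bytes_for_level_multiplier
  const uint64_t base_target = 256ull << 20;  // max_bytes_for_level_base
  // Current bytes in L1..L6 (L0 is scored separately and omitted here).
  std::vector<uint64_t> bytes = {5ull << 20, 0, 0, 0,
                                 200ull << 20, 2000ull << 20};

  // Derive per-level targets from the last level upward, as dynamic leveling
  // does: the first level whose target drops to base_target becomes the base.
  int base_level = 0;
  uint64_t target = bytes.back();
  for (int i = static_cast<int>(bytes.size()) - 1; i >= 0; --i) {
    base_level = i;
    if (target <= base_target) break;
    target = static_cast<uint64_t>(target / multiplier);
  }

  // Levels above the base level should end up empty; boosting their score
  // above 1 lets the regular compaction picker drain them in the background.
  std::vector<double> score(bytes.size(), 0.0);
  for (int i = 0; i < base_level; ++i) {
    if (bytes[i] > 0) {
      score[i] = 1.0 + static_cast<double>(bytes[i]) / (1ull << 20);
    }
  }
  for (size_t i = 0; i < score.size(); ++i) {
    std::cout << "L" << (i + 1) << " score = " << score[i] << "\n";
  }
  return 0;
}

In the actual change the boosted score simply competes with regular compaction scores, so the draining happens through normal background compactions rather than a manual compaction.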

Test plan:

  • New unit tests
  • python3 tools/db_crashtest.py whitebox --simple which randomly sets level_compaction_dynamic_level_bytes in each run.

@facebook-github-bot (Contributor): @cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot (Contributor): @cbi42 has updated the pull request. You must reimport the pull request before landing.

@cbi42 requested a review from ajkr on March 31, 2023 18:57.
@facebook-github-bot (Contributor): @cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@ajkr (Contributor) left a comment:

Great PR, thanks!

Comment on lines 1117 to 1118
// Otherwise L5 will always be eligible for compaction and
// WaitForCompact(true) below will be stuck waiting.

@ajkr (Contributor):
Waiting means sleeping, not infinitely rewriting L5, right?

@cbi42 (Member, Author) Apr 4, 2023:

There will be an infinite rewrite of L5, since it always has an eligible compaction score but nothing is written to the last level during each compaction of L5. This causes the WaitForCompact(true) in this unit test to wait forever.

@ajkr (Contributor):
Hm that isn't ideal. We should see if there's a way to not pick a draining compaction when the output level is not necessarily lower.

@cbi42 (Member, Author):
Done, I've updated the lowest_unnecessary_level_ logic to exclude the penultimate level when per-key placement is enabled. Note that this problem may still exist in general with level compaction + per-key placement: it can happen when the penultimate level is over its target size but holds only hot data.
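
For context, a hedged sketch of that exclusion (a hypothetical helper for illustration only; the name and exact semantics of the real lowest_unnecessary_level_ logic differ):

#include <cassert>
#include <cstdint>
#include <vector>

// Pick the deepest populated level above the base level as a draining
// candidate, but skip the penultimate level when per-key placement is
// enabled, since compacting it may only rewrite hot data in place and
// never make it empty. Returns -1 if there is nothing to drain.
int DeepestDrainableLevel(const std::vector<uint64_t>& level_bytes,
                          int base_level, bool per_key_placement) {
  const int num_levels = static_cast<int>(level_bytes.size());
  int candidate = -1;
  for (int lvl = 1; lvl < base_level; ++lvl) {
    if (per_key_placement && lvl == num_levels - 2) {
      continue;  // penultimate level can receive hot data back; not drainable
    }
    if (level_bytes[lvl] > 0) {
      candidate = lvl;
    }
  }
  return candidate;
}

int main() {
  // L0..L6; only the penultimate level (L5) and the last level (L6) hold
  // data, and the base level is the last level, so L5 looks "unnecessary".
  std::vector<uint64_t> bytes = {0, 0, 0, 0, 0, 100ull << 20, 1000ull << 20};
  assert(DeepestDrainableLevel(bytes, /*base_level=*/6, /*pkp=*/true) == -1);
  assert(DeepestDrainableLevel(bytes, /*base_level=*/6, /*pkp=*/false) == 5);
  return 0;
}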

Comment on lines -9194 to -9216
options.level_compaction_dynamic_level_bytes = true;
Reopen(options);
// note that last level (L6) should be empty
ASSERT_EQ("1,0,0,1,2", FilesPerLevel());
verify_db();

// turning the options on and off should both be safe
options.level_compaction_dynamic_level_bytes = false;
Reopen(options);
MoveFilesToLevel(1);
ASSERT_EQ("0,1,0,1,2", FilesPerLevel());
verify_db();

// newly flushed file is also pushed down
options.level_compaction_dynamic_level_bytes = true;
Reopen(options);
ASSERT_EQ("0,0,1,1,2", FilesPerLevel());
verify_db();

// files will be pushed down to last level (L6)
options.allow_ingest_behind = false;
Reopen(options);
ASSERT_EQ("0,0,0,1,1,2", FilesPerLevel());

@ajkr (Contributor):
Is this deleted because it's difficult to adapt? Maybe the assertions can just verify the number of files in the bottom level (L5 since num_levels=6) is zero before the final Reopen(), and nonzero after it.

@cbi42 (Member, Author):
I did find it difficult to adapt, since there is an unnecessary level in the LSM which can be drained in the background. I could not simply disable auto compaction, since that would also prevent trivially moving files down the LSM during DB open. However, I eventually found a better reason to remove the test :) Otherwise, I would just verify the number of files in the last level as you suggested.

@ajkr (Contributor):
Yeah, IIRC there was a comment elsewhere in this PR referring to the better reason. But allow_ingest_behind feels so close to working with leveled compaction that we might as well start testing it when convenient and when there are known risks. Maybe the test can say support is not officially released, since on rare occasions people infer what's supported by looking at test expectations.

@cbi42 (Member, Author):
just verify the number of files in the bottom level (L5 since num_levels=6) is zero before the final Reopen(), and nonzero after it.

I tried fixing this, but realized it does not work (without adding support for level compaction + ingest behind), as the second-to-last level can become unnecessary and be compacted down even when the last level is reserved for ingest behind. Making it work requires changes that almost amount to supporting allow_ingest_behind with level compaction in this PR. I filed a task and plan to follow up with a separate PR for official support.

db/db_compaction_test.cc (review comment resolved)
@facebook-github-bot (Contributor): @cbi42 has updated the pull request. You must reimport the pull request before landing.

@cbi42 (Member, Author) left a comment:
Thanks for the review!

@facebook-github-bot (Contributor): @cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@cbi42 force-pushed the drain-unnecessary-levels branch from ed0e801 to 697e66f on April 4, 2023 18:53.
@facebook-github-bot (Contributor): @cbi42 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot (Contributor): @cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@cbi42 force-pushed the drain-unnecessary-levels branch from 697e66f to 91d0486 on April 5, 2023 15:42.
@facebook-github-bot (Contributor): @cbi42 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot (Contributor): @cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@cbi42 force-pushed the drain-unnecessary-levels branch from 91d0486 to ddb7479 on April 6, 2023 17:09.
@facebook-github-bot (Contributor): @cbi42 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot (Contributor): @cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot (Contributor): @cbi42 merged this pull request in b3c43a5.

facebook-github-bot pushed a commit that referenced this pull request Jun 16, 2023
…11525)

Summary:
after #11321 and #11340 (both included in RocksDB v8.2), migration from `level_compaction_dynamic_level_bytes=false` to `level_compaction_dynamic_level_bytes=true` is handled automatically by RocksDB and requires no manual compaction from the user. The option is made true by default as it has several advantages: 1. a better space amplification guarantee (a more stable LSM shape); 2. compaction that is more adaptive to write traffic; 3. automatic draining of unneeded levels. The wiki is updated with more detail: https://github.com/facebook/rocksdb/wiki/Leveled-Compaction#option-level_compaction_dynamic_level_bytes-and-levels-target-size.

The PR mostly contains fixes for unit tests, as they assumed `level_compaction_dynamic_level_bytes=false`. The most notable changes are commits f742be3 and b1928e4, which override the default option in DBTestBase to still set `level_compaction_dynamic_level_bytes=false` by default. This helps reduce the changes needed for unit tests. I think this default option override in unit tests is okay, since the behavior of `level_compaction_dynamic_level_bytes=true` is tested by explicitly setting the option. Also, `level_compaction_dynamic_level_bytes=false` may be more desirable in unit tests, as it makes it easier to create a desired LSM shape.

The comment for option `level_compaction_dynamic_level_bytes` is updated to reflect this change and the change made in #10057.

Pull Request resolved: #11525

Test Plan: `make -j32 J=32 check` several times to try to catch flaky tests due to this option change.

Reviewed By: ajkr

Differential Revision: D46654256

Pulled By: cbi42

fbshipit-source-id: 6b5827dae124f6f1fdc8cca2ac6f6fcd878830e1
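
For reference, opting into the behavior explicitly looks like this (standard RocksDB C++ API; the DB path is a placeholder):

#include <cassert>

#include <rocksdb/db.h>
#include <rocksdb/options.h>

int main() {
  rocksdb::Options options;
  options.create_if_missing = true;
  options.compaction_style = rocksdb::kCompactionStyleLevel;
  // Made the default by the change above; shown explicitly for clarity.
  options.level_compaction_dynamic_level_bytes = true;

  rocksdb::DB* db = nullptr;
  rocksdb::Status s = rocksdb::DB::Open(options, "/tmp/rocksdb_example", &db);
  assert(s.ok());
  // With #11321 and #11340, a pre-existing leveled DB opened this way is
  // migrated automatically; unnecessary levels drain in background compactions.
  delete db;
  return 0;
}
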
tabokie pushed a commit to tabokie/rocksdb that referenced this pull request Aug 14, 2023
…rue` (facebook#11340)

Summary:
When a user migrates to level compaction + `level_compaction_dynamic_level_bytes=true`, or when a DB shrinks, there can be unnecessary levels in the DB. Before this PR, there was no way to remove these levels except with a manual compaction. These extra unnecessary levels make it harder to guarantee max_bytes_for_level_multiplier and can cause extra space amplification. This PR boosts the compaction score for these levels so that RocksDB automatically drains them. Together with facebook#11321, this makes migration to `level_compaction_dynamic_level_bytes=true` automatic, without requiring the user to do a one-time full manual compaction. Credit: this PR is modified from facebook#3921.

Pull Request resolved: facebook#11340

Test Plan:
- New unit tests
- `python3 tools/db_crashtest.py whitebox --simple` which randomly sets level_compaction_dynamic_level_bytes in each run.

Reviewed By: ajkr

Differential Revision: D44563884

Pulled By: cbi42

fbshipit-source-id: e20d3620bd73dff22be18c5a91a07f340740bcc8
Signed-off-by: tabokie <[email protected]>