WIP engine: defer compactions while ingesting SSTs #35236
Conversation
Sort of hacky. I don't have a suggestion for something better, though. Perhaps @ajkr does. I'll be interested in seeing some performance numbers to justify this work.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajkr and @danhhz)
Force-pushed from c76f696 to c2242ee.
Can you remind me what the overlap properties of these to-be-ingested files are? I remember they are supposed to cover narrow ranges of the key-space and not have a lot of overlap - do they have any?

They actually overlap each other quite a bit. They cover a narrow chunk of the overall key-space that shouldn't be seeing much traffic yet, since they usually belong to a table or index that is being ingested but is not yet public for traffic. So they don't overlap with things I'd expect to be in the memtable, but they absolutely can overlap with each other.
This is somewhat inspired by the "What's the fastest way to load data into RocksDB?" entry in https://github.com/facebook/rocksdb/wiki/RocksDB-FAQ.

That's interesting. If the L0 files in an L0->L1 compaction only cover a narrow portion of the keyspace, I'd expect them to overlap with a small number of L1 files, which would make for a reasonably low write-amp compaction. I wonder if you have RocksDB logs we can look at?
This approach could also replace #34258, even without the compaction-trigger adjustment: just dynamically bumping the slowdown and stop triggers during ingest would remove the risk of hitting them. Since we know we're the ones producing all the SSTs, we know that imposing a write slowdown or stall won't help (and will certainly hurt OLTP traffic).
I wonder if it would be simpler to just bump the default slowdown and stop write thresholds. Do we ever want to fully stop writes due to too many L0 sstables? |
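To make the "bump the slowdown and stop thresholds" idea concrete, here is a minimal Go sketch. The option names are real mutable RocksDB options (normally changed at runtime via `DB::SetOptions`), but the `db` type, the `SetOptions` wrapper, and the numeric values are illustrative assumptions, not this PR's actual code or tuned numbers.

```go
package main

import "fmt"

// db stands in for the storage engine handle. In RocksDB the real call is
// DB::SetOptions, which takes a map of mutable option names to string values;
// this stub just records what was set.
type db struct{ opts map[string]string }

func (d *db) SetOptions(o map[string]string) {
	for k, v := range o {
		d.opts[k] = v
	}
}

// bumpL0Triggers raises the L0 thresholds so a bulk ingestion does not trip
// write slowdowns or stalls. The values are hypothetical placeholders.
func bumpL0Triggers(d *db) {
	d.SetOptions(map[string]string{
		"level0_file_num_compaction_trigger": "200", // defer L0->L1 compactions
		"level0_slowdown_writes_trigger":     "200", // avoid write slowdown
		"level0_stop_writes_trigger":         "256", // avoid a full write stall
	})
}

func main() {
	d := &db{opts: map[string]string{}}
	bumpL0Triggers(d)
	fmt.Println(d.opts["level0_stop_writes_trigger"]) // 256
}
```

Setting the stop trigger well above the slowdown trigger preserves the usual safety margin: even while relaxed, a hard stall remains a last resort rather than something an ingestion can trip immediately.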
Force-pushed from c2242ee to 42a4659.
Immediately before ingesting an SST, we reconfigure RocksDB's compaction settings to allow more files in L0 and avoid the slowdown triggers. We then wait a minute to revert back to the normal compaction settings on the next sync loop. On subsequent ingestions, if we've already reconfigured the compaction settings, we simply note the time to extend the revert deadline. Release note (performance improvement): reduce write amplification during bulk index ingestion.
Force-pushed from 42a4659 to c4f5b9c.