Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP engine: defer compactions while ingesting SSTs #35236

Closed
wants to merge 1 commit into from

Conversation

dt
Copy link
Member

@dt dt commented Feb 27, 2019

Immediately before ingesting an SST, we reconfigure RocksDBs compaction settings to allow more files in L0 and avoid the slowdown triggers.
We then wait a minute to revert back to the normal compaction settings on the next sync loop.
On subsequent ingestions, if we've already reconfigured the compaction settings, we simply note the time to extend the revert deadline.

Release note (performance improvement): reduce write amplification during bulk index ingestion.

@dt dt requested review from danhhz, ajkr and a team February 27, 2019 15:27
@dt dt added the do-not-merge bors won't merge a PR with this label. label Feb 27, 2019
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Collaborator

@petermattis petermattis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sort of hacky. I don't have a suggestion for something better, though. Perhaps @ajkr does. I'll be interested in seeing some performance numbers to justify this work.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajkr and @danhhz)

@dt dt force-pushed the rocks-skip-compact-during-ingest branch 2 times, most recently from c76f696 to c2242ee Compare February 27, 2019 15:38
@ajkr
Copy link
Contributor

ajkr commented Feb 27, 2019

Can you remind me what is the overlap properties of these to-be-ingested files? I remember they are supposed to cover narrow ranges of the key-space and not have a lot of overlap - do they have any?

@dt
Copy link
Member Author

dt commented Feb 27, 2019

They actually overlap each-other quite a bit.

They cover a narrow chunk of the overall key-space that shouldn't be seeing much traffic yet, since they usually belong to a table or index that is being ingested but is not yet public for traffic. So they don't overlap with things I'd expect to be in the memtable, but the absolutely can overlap with each other.

@dt
Copy link
Member Author

dt commented Feb 27, 2019

this is somewhat inspired by https://github.com/facebook/rocksdb/wiki/RocksDB-FAQ's "What's the fastest way to load data into RocksDB?" and PrepareForBulkLoad ()

@ajkr
Copy link
Contributor

ajkr commented Feb 27, 2019

That's interesting. If the L0 files in an L0->L1 only cover a narrow portion of the keyspace, I'd expect it to overlap with a small number of L1 files, which would form a reasonably low write-amp compaction. I wonder if you have RocksDB logs we can look at?

@dt
Copy link
Member Author

dt commented Feb 27, 2019

This approach could also replace #34258 even if we didn't include adjusting the compaction trigger: just dynamically bumping the slowdown and stop triggers up during ingest would remove the risk of hitting them and since we know we're the ones making all the SSTs, we know that imposing a write-slowdown or stall won't help (and will certainly hurt OLTP traffic).

@petermattis
Copy link
Collaborator

I wonder if it would be simpler to just bump the default slowdown and stop write thresholds. Do we ever want to fully stop writes due to too many L0 sstables?

Immediately before ingesting an SST, we reconfigure RocksDBs compaction settings to allow more files in L0 and avoid the slowdown triggers.
We then wait a minute to revert back to the normal compaction settings on the next sync loop.
On subsequent ingestions, if we've already reconfigured the compaction settings, we simply note the time to extend the revert deadline.

Release note (performance improvement): reduce write amplification during bulk index ingestion.
@dt dt force-pushed the rocks-skip-compact-during-ingest branch from 42a4659 to c4f5b9c Compare April 3, 2019 21:06
@tbg tbg added the X-noremind Bots won't notify about PRs with X-noremind label Jun 19, 2019
@dt dt closed this Jun 25, 2019
@dt dt deleted the rocks-skip-compact-during-ingest branch June 28, 2019 12:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
do-not-merge bors won't merge a PR with this label. X-noremind Bots won't notify about PRs with X-noremind
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants