-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kvcoord: increase max_intent_bytes default to 4MB #66859
kvcoord: increase max_intent_bytes default to 4MB #66859
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is a user-visible setting, the release notes should be explicit about which setting is being changed, what the old default was, what the new default is, and why it is being changed (in terms that a casual user can understand).
Looks like you'll also need to regenerate the documentation for the setting. make buildshort
should do the trick, I believe.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @aliher1911 and @andreimatei)
pkg/kv/kvclient/kvcoord/txn_interceptor_pipeliner.go, line 72 at r1 (raw file):
// // Note: Default value was arbitrarily set to 256KB but practice showed that // it could be raised higher. See #54029 for more details.
This should elaborate a bit, for example by mentioning that exceeding the limit causes ranged intent resolution which can be very disruptive to the workload due to slow range scans while holding broad latches, and that we'd rather want to use a bit more memory than fall of this performance cliff for small-ish transactions.
pkg/kv/kvclient/kvcoord/txn_interceptor_pipeliner.go, line 76 at r1 (raw file):
"kv.transaction.max_intents_bytes", "maximum number of bytes used to track locks in transactions", 1<<20, /* 1 MB */
Is this the right value, or should we go even higher? I would think something like 4 or 8 MB might be better.
The comment above mentions that it is critical to avoid memory blowup during Raft EndTxn processing. What limits does that impose? I would think 4-8 should still be fine?
fb8cede
to
fc3d05d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @erikgrinaker)
pkg/kv/kvclient/kvcoord/txn_interceptor_pipeliner.go, line 72 at r1 (raw file):
Previously, erikgrinaker (Erik Grinaker) wrote…
This should elaborate a bit, for example by mentioning that exceeding the limit causes ranged intent resolution which can be very disruptive to the workload due to slow range scans while holding broad latches, and that we'd rather want to use a bit more memory than fall of this performance cliff for small-ish transactions.
Done.
pkg/kv/kvclient/kvcoord/txn_interceptor_pipeliner.go, line 76 at r1 (raw file):
Previously, erikgrinaker (Erik Grinaker) wrote…
Is this the right value, or should we go even higher? I would think something like 4 or 8 MB might be better.
The comment above mentions that it is critical to avoid memory blowup during Raft EndTxn processing. What limits does that impose? I would think 4-8 should still be fine?
Looks like it is capped by 64MB in raft as of now, so we should be fine with 4M.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 3 files at r2.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @andreimatei and @erikgrinaker)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the commit msg and PR description say 1MB, but they're lying
We usually say something like "Touches #54029". Some people say "informs"; I personally never liked that word.
Also please try to keep the commit msg in sync with the PR description.
Reviewable status:
complete! 2 of 0 LGTMs obtained (waiting on @erikgrinaker)
d48c288
to
86b8068
Compare
bors r=andreimatei,erikgrinaker |
Build failed: |
86b8068
to
3dd7814
Compare
Previously limit was set conservatively to 256KB. Having a conservative limit can cause intent cleanup batches to condense aggressively resorting to range cleanups which are more expensive. Increasing defaults will reduce contention during cleanup at the expense of increased memory for large transactions on tx_coord_sender. Release note (performance improvement): Increase default value for kv.transaction.max_intents_bytes cluster setting from 256KB to 4MB to improve transaction performance for large transactions at the expense of increased memory usage. Transactions above this size limit use slower cleanup mode for commits and aborts.
3dd7814
to
defd0de
Compare
bors r=andreimatei,erikgrinaker |
PR failed to land because of integration test flaking. This behaviour was actually caused by change in limit as it pushed default value for range transaction resolution above what test was writing and it slowed down test enough to start timing out. Now test is temporarily disabled. |
Build succeeded: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR failed to land because of integration test flaking. This behaviour was actually caused by change in limit as it pushed default value for range transaction resolution above what test was writing and it slowed down test enough to start timing out. Now test is temporarily disabled.
Wait, which test are we talking about? Why did it slow down?
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 2 stale)
There's a test that does 100000k transaction on a single range. Before the change it was switching to range resolve and after the change this is no longer the case and it was doing point resolve which was 10%-ish slower when running locally but never failing. In CI is was flaking. |
100M different transactions? Is that right? And each transaction is... large (such that the condensing limit kicks in)? What test is that? I don't think we've ever realized that the range resolution can be a latency win. It'd be interesting to measure this. |
Any talk of backport here? I assume not -- we can advise users to update manually, and we probably don't want to change defaults retroactively. |
I think we don't backport this since there's a workaround to just set it manually. |
Previously limit was set conservatively to 256KB. Having a
conservative limit can cause intent cleanup batches to condense
aggressively resorting to range cleanups which are more expensive.
Increasing defaults will reduce contention during cleanup at the
expense of increased memory for large transactions on
tx_coord_sender.
Release note (performance improvement): Increase default value for
kv.transaction.max_intents_bytes cluster setting from 256KB to 4MB
to improve transaction performance for large transactions at the
expense of increased memory usage. Transactions above this size
limit use slower cleanup mode for commits and aborts.
#54029