perf: investigate loadgen/kv slowdown (#15797)

Comments
@petermattis pointed out to me that the generators hash the sequence numbers verbatim and don't use their own seed to make their hashes unique. So this is simply expected: one has no overlapping writes, the other only overlapping writes. Duh!
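A minimal Go sketch of the effect being described, using a hypothetical `keyForSeq` helper (not the actual loadgen code): if generators hash sequence numbers verbatim, two generators emitting the same sequence numbers collide on the same keys; mixing in a distinct per-generator seed makes their key spaces disjoint.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
)

// keyForSeq is a hypothetical helper illustrating the fix: mix a
// per-generator seed into the hash so that two generators hashing the
// same sequence numbers no longer produce the same keys.
func keyForSeq(seed, seq uint64) uint64 {
	h := fnv.New64a()
	var buf [8]byte
	binary.BigEndian.PutUint64(buf[:], seed)
	h.Write(buf[:])
	binary.BigEndian.PutUint64(buf[:], seq)
	h.Write(buf[:])
	return h.Sum64()
}

func main() {
	// Without distinct seeds, every generator maps sequence number 100 to
	// the same key; with distinct seeds the keys differ.
	fmt.Println(keyForSeq(1, 100) == keyForSeq(1, 100)) // same seed: true
	fmt.Println(keyForSeq(1, 100) == keyForSeq(2, 100)) // distinct seeds: false
}
```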
Contention vs. no contention explains some of the difference between the two workloads.
I would expect latencies to increase with higher concurrency, but throughput to remain relatively steady. Instead, throughput drops through the floor.
Not sure why, but with
Looks like what is happening is that 2 concurrent writers to the same key for a 1PC transaction will eventually cause a WriteTooOldError. I think we could handle the retry in evaluateTxnWriteBatch.
It would bypass the timestamp cache - the "normal" txn path does this too, but doesn't commit, so if there's a bug there it maybe doesn't manifest. For example:
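To ground why bypassing the timestamp cache is dangerous, here is a toy model (these types are illustrative, not CockroachDB's actual data structures): the cache records the highest timestamp at which each key was read, and a write below that high-water mark would retroactively change history a reader has already observed, so it must be refused or pushed to a higher timestamp.

```go
package main

import "fmt"

// tsCache is a toy model of the read timestamp cache: the highest
// timestamp at which each key has been read.
type tsCache map[string]int64

// canWriteAt reports whether a write at ts is admissible under the toy
// model: writing at or below a key's high-water read timestamp would
// alter history already observed by a reader, so it is rejected.
func (c tsCache) canWriteAt(key string, ts int64) bool {
	return ts > c[key]
}

func main() {
	c := tsCache{"a": 100}
	fmt.Println(c.canWriteAt("a", 90))  // false: would rewrite observed history
	fmt.Println(c.canWriteAt("a", 101)) // true: above the read high-water mark
}
```

A retry path that commits without re-consulting this cache could admit the `ts=90` write above, which is the class of bug being discussed.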
As @tschottdorf points out, this likely isn't safe. Seems like we'd want to retry the 1PC at a higher level so that we go through the timestamp cache again. Fixes cockroachdb#15797
Retrying at a higher timestamp in evaluateTxnWriteBatch isn't obviously safe. @tschottdorf In your example, where is the timestamp moved after the timestamp cache has been checked? If I'm reading this correctly, the 2PC retry uses the original Transaction's timestamp, not one influenced by a possible WriteTooOldError.
The unbounded retry scenario could be mitigated by returning the WriteTooOldError if the transaction's epoch is less than N, and doing the 2PC retry for later epochs.
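A sketch of that epoch-gated policy (the cutoff constant and function name are made up for illustration; this is not the actual CockroachDB code): early epochs surface the error so the client retries the cheap 1PC path, and once the epoch reaches the cutoff we fall back to the 2PC retry, bounding the number of 1PC attempts.

```go
package main

import "fmt"

// maxOnePCEpoch is an illustrative cutoff (the "N" from the comment
// above), not a real CockroachDB constant.
const maxOnePCEpoch = 3

// shouldReturnWriteTooOld sketches the proposal: for epochs below the
// cutoff, return the WriteTooOldError so the client retries in 1PC mode;
// for later epochs, fall back to the 2PC retry instead.
func shouldReturnWriteTooOld(epoch int) bool {
	return epoch < maxOnePCEpoch
}

func main() {
	for epoch := 0; epoch < 5; epoch++ {
		if shouldReturnWriteTooOld(epoch) {
			fmt.Printf("epoch %d: return WriteTooOldError (client retries 1PC)\n", epoch)
		} else {
			fmt.Printf("epoch %d: fall back to 2PC retry\n", epoch)
		}
	}
}
```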
@bdarnell @petermattis' suggestion was:
and that's also the code in https://github.com/cockroachdb/cockroach/pull/15863/files?w=1#diff-0f4e61b240de001f1b97a87fc31af043R4057. That would mean that my example would do the following:
OK, now that I wrote that, I see that you're probably assuming that there's a problem with the current code - no, probably not, though we are laying down an intent at a higher timestamp in the 2PC code, no? At least if I remember correctly how that path behaves.
Yes, @petermattis's suggestion was what I was referring to as "retrying in evaluateTxnWriteBatch", and it's unsafe (definitely in this example with DelRange, and probably without DelRange too). My second paragraph (where I asked where the timestamp was moved) was because I thought your DelRange example was raising a concern about the current 2PC code. There's not a problem when a 1PC txn retries as 2PC because the request's txn timestamp isn't moved (as far as I can see).
Wouldn't the 2PC example go like this:
The corresponding hook that catches it on commit is here (also copying the
I think we're in agreement here but we're just confusing each other because we're not being clear about when we're talking about the current code and when we're talking about hypothetical changes. #15797 (comment) describes the current behavior, and everything is fine. The retry in #15863 is unsafe because it advances the timestamp inside evaluateTxnWriteBatch.

We currently handle WriteTooOldErrors by doing two things: falling back to 2PC mode and returning an error to the client.Txn to retry. The proposal here is to separate the two: retry at least once while staying in 1PC mode before falling back to the much more expensive 2PC.
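The proposed separation can be sketched as a toy retry loop (all names here are stand-ins, not the real CockroachDB code): on a write-too-old condition, retry while staying in 1PC mode up to a small budget, and only then fall back to the expensive 2PC path.

```go
package main

import (
	"errors"
	"fmt"
)

// errWriteTooOld stands in for CockroachDB's WriteTooOldError.
var errWriteTooOld = errors.New("write too old")

// make1PCAttempts returns a toy 1PC evaluation that fails with
// write-too-old on its first `failures` calls, then succeeds.
func make1PCAttempts(failures int) func() error {
	return func() error {
		if failures > 0 {
			failures--
			return errWriteTooOld
		}
		return nil
	}
}

// commit sketches the proposal: retry in 1PC mode up to max1PCRetries
// extra times on write-too-old, and only then fall back to 2PC, which
// lays down intents and goes through the full transaction machinery.
func commit(attempt func() error, max1PCRetries int) (mode string, err error) {
	for i := 0; i <= max1PCRetries; i++ {
		if err = attempt(); !errors.Is(err, errWriteTooOld) {
			return "1PC", err
		}
	}
	return "2PC", nil
}

func main() {
	mode, _ := commit(make1PCAttempts(1), 2)
	fmt.Println(mode) // within the 1PC retry budget: prints 1PC
	mode, _ = commit(make1PCAttempts(5), 2)
	fmt.Println(mode) // budget exhausted: prints 2PC
}
```

The design point is that the fallback decision is made locally, so a transiently contended 1PC txn never pays the 2PC cost.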
@tschottdorf I'm going to bump this issue over to you as you're more familiar with this area of the code than I am. #15863 clearly isn't the right approach. |
See cockroachdb#15797 for context. The main problem here are `WriteTooOldError`s, which lay down an intent. If this happens frequently, performance suffers dramatically since there is almost always an intent on the key, forcing everyone into conflict resolution (which is sure to lay down another intent thanks to `WriteTooOld`).

With this change we don't lay down intents for this workload, but the 1PC transactions could be starved.

We could do even better (making the hack in cockroachdb#15863 correct): If we knew that the 1PC transaction wasn't previously used for any reads (or it's SNAPSHOT), we could in theory commit it directly again from the Store (saving the roundtrip to the gateway/client). If it's SERIALIZABLE and there were previous reads, we can't simulate a restart internally and the client must do it properly.

It also stands to reason that on a restart, the client shouldn't take the existing timestamp but simply use `hlc.Now()`, which is less stale.

Neither of the last two paragraphs is explored in this PR. Here, we should discuss whether it's OK to expose 1PC txns to starvation in the first place.

```
go run $GOPATH/src/github.com/cockroachdb/loadgen/kv/main.go --cycle-length 1 --concurrency $c --duration 30s

// this PR, c=10
_elapsed___errors____________ops___ops/sec(cum)__p95(ms)__p99(ms)_pMax(ms)_____seq(begin/end)
   30.0s        0          14905          496.8     16.3     17.8     29.4          0 / 0

// this PR and master, c=1
_elapsed___errors____________ops___ops/sec(cum)__p95(ms)__p99(ms)_pMax(ms)_____seq(begin/end)
   30.0s        0          22204          740.1      1.6      3.5     15.7          0 / 0

// master, c=10
_elapsed___errors____________ops___ops/sec(cum)__p95(ms)__p99(ms)_pMax(ms)_____seq(begin/end)
   30.0s        0           3185          106.2    369.1    604.0   1811.9          0 / 0
```
Since we didn't like #17121 and we also don't want to put in a lot of complexity for the 1PC path (which ultimately users only see when they have quite well-adjusted insert patterns), nothing has been done so far. One thought is that users could explicitly opt out of starvation avoidance for such workloads; that is, the transaction doesn't lay down intents in that case.

Or perhaps there's value in exploring a general heuristic in which intents are laid down only after the n-th write. Say, for n=20, if my txn writes 2 keys, it would only start writing intents after having retried 10 times. If the transaction writes 400 keys, it'll write intents for writes 20-400 and any writes in restarts after that. But that leaves a question for the optimal value of n.
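The n-th-write heuristic above can be sketched as follows (the threshold constant and helper are made up for illustration): count writes cumulatively across restarts, and only start laying intents once that count exceeds n, which for a 2-key txn with n=20 means intents appear after 10 retries, matching the arithmetic in the comment.

```go
package main

import "fmt"

// intentThreshold is the illustrative n from the discussion above;
// writes beyond this cumulative count start laying down intents.
const intentThreshold = 20

// laysIntent reports whether a given write lays down an intent under the
// proposed heuristic: only once the txn's cumulative write count (across
// restarts) exceeds the threshold.
func laysIntent(cumulativeWrites int) bool {
	return cumulativeWrites > intentThreshold
}

func main() {
	// A txn writing 2 keys per attempt starts laying intents on writes
	// 21 and 22, i.e. only after 10 full retries.
	writes, attempt := 0, 0
	for !laysIntent(writes + 1) {
		attempt++
		writes += 2
	}
	fmt.Println(attempt) // prints 10
}
```

The open question flagged above maps directly onto picking `intentThreshold`: too low and contended workloads keep hitting intents, too high and large transactions are exposed to starvation for longer.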
@jordanlewis mentioned today that TPC-C exhibits contention on rows. Probably better to motivate improvements here based on that benchmark which is somewhat more realistic than a single key. |
Yes. There are two primary examples of that in TPC-C.
Since we've been testing with just 1 Warehouse for the time being, the first one can probably be thought of as similar to |
To amend that last statement, there are actually so many things different about TPCC and |
The TPCC examples don't look like they could ever pass for 1PC txns, but they would indeed be harmed by not laying down intents there. I'm inclined to say there isn't much we're going to be able to do here in the foreseeable future. A real client using TPCC would likely be advised to run an individual txn to bump the ID, though (risking creation of IDs that are never used, which seems OK). That makes it a 1PC txn again, which would then profit from better handling of 1PC txns that run into a WriteTooOld.
From cockroachdb/loadgen#50:
Something odd I've noticed is that this:
is much slower than this:
Unless I'm mistaken, the eight clients "never" intersect, so the only difference is that in one example they're each hitting one key of their own, and in the latter eight keys of their own. Perhaps there is more range parallelism in the latter, but you wouldn't expect it. The difference disappears with --concurrency=1.

@petermattis for triage.