Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] storage: retry 1PC transactions on WriteTooOld errors #15863

Conversation

petermattis
Copy link
Collaborator

As @tschottdorf points out, this likely isn't safe. Seems like we'd want
to retry the 1PC at a higher level so that we go through the timestamp
cache again.

Fixes #15797

As @tschottdorf points out, this likely isn't safe. Seems like we'd want
to retry the 1PC at a higher level so that we go through the timestamp
cache again.

Fixes cockroachdb#15797
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@tbg
Copy link
Member

tbg commented Jul 17, 2017

Closing as not the right approach. PR is referenced from #15797, so we won't forget about it.

@tbg tbg closed this Jul 17, 2017
tbg added a commit to tbg/cockroach that referenced this pull request Jul 19, 2017
See cockroachdb#15797 for context.

The main problem here are `WriteTooOldError`s, which lay down an intent. If
this happens frequently, performance suffers dramatically since there is almost
always an intent on the key, forcing everyone into conflict resolution (which
is sure to lay down another intent thanks to `WriteTooOld`).

With this change we don't lay down intents for this workload, but the 1PC
transactions could be starved.

We could do even better (making the hack in cockroachdb#15863 correct): If we knew that
the 1PC transaction wasn't previously used for any reads (or it's SNAPSHOT), we
could in theory commit it directly again from the Store (saving the roundtrip
to the gateway/client). If it's SERIALIZABLE and there were previous reads, we
can't simulate a restart internally and the client must do it properly.

It also stands to reason that on a restart, the client shouldn't take the
existing timestamp but simply use `hlc.Now()` which is less stale.

Neither of the last two paragraphs are explored in this PR. Here, we should
discuss whether it's OK to expose 1PC txns to starvation in the first place.

```
go run $GOPATH/src/github.com/cockroachdb/loadgen/kv/main.go --cycle-length 1 --concurrency $c --duration 30s
// this PR, c=10
_elapsed___errors____________ops___ops/sec(cum)__p95(ms)__p99(ms)_pMax(ms)_____seq(begin/end)
   30.0s        0          14905          496.8     16.3     17.8     29.4              0 / 0
// this PR and master, c=1
_elapsed___errors____________ops___ops/sec(cum)__p95(ms)__p99(ms)_pMax(ms)_____seq(begin/end)
   30.0s        0          22204          740.1      1.6      3.5     15.7              0 / 0
// master, c=10
_elapsed___errors____________ops___ops/sec(cum)__p95(ms)__p99(ms)_pMax(ms)_____seq(begin/end)
   30.0s        0           3185          106.2    369.1    604.0   1811.9              0 / 0
```
@petermattis petermattis deleted the pmattis/1pc-write-too-old-retry branch August 3, 2017 17:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants