
Push Txns & Resolve Intents on Conflicts #72

Merged
merged 3 commits into master from spencerkimball/resolve-write-intents on Sep 21, 2014

Conversation

spencerkimball
Member

Automatically push transactions on write intent failures and resolve
conflicting write intents. This is done at the level of
Store.ExecuteCmd on command failure. Write/write conflicts are
pushed and resolved, but we must still return the write intent
error to the client so it can retry with a new client command
ID, ensuring idempotence. This way, the original request (the one
which hit the conflict) will always receive a write intent error via
the response cache, while follow-on requests with a new ClientCmdID
can succeed and will continue to succeed on retries. Read/write
conflicts are retried automatically instead of returning the write
intent error, because they don't have the same idempotency issues
with the client command ID.

Reorganized a bit of code between store and range to move range-specific
checks into range (addition of Range.AddCmd method). Also added the
ability to submit commands to Raft without always waiting for
completion; this is used to "fire and forget" resolve intent commands.
Making sure we clean up pending entries in the "read queue" is slightly
tricky in Range.addReadWriteCmd.

Moved writeIntentError and writeTooOldError from mvcc.go to the proto/
subdirectory, as they're communicated across RPCs from server to
clients. Added the conflicting Key and a Resolved flag to
WriteIntentError, allowing the client to decide whether to back off
(if resolve failed) or retry immediately.

@spencerkimball
Member Author

@bdarnell @tschottdorf @cockroachdb/developers

// If the original client didn't wait, log execution errors so they're
// surfaced somewhere.
if !wait && err != nil {
	log.V(1).Infof("non-synchronous execution of %s with %+v failed: %v", cmd.Method, cmd.Args, err)
}
Contributor

A log.V(1) is a pretty weak form of "surfaced somewhere"; I'd make this at least a log.Info. I might also log at V(1) even if wait is true.

Member Author

I changed this to a warning and added a V(1) info in Range.executeCmd to always log execution of commands.

@bdarnell
Copy link
Contributor

LGTM

}
return rng.ReadWriteCmd(method, args, reply)

return wiErr
Contributor

We may want to add a log here to see how frequently this happens.

We could also potentially resubmit the command here immediately, without a new ClientCmdID, to optimize latency (we'd need to work around the respCache).

Member Author

I added a V(1) log for the method as a whole. I figured we can get a bit fancier with optimizations in future changes; right now I wanted to make sure we keep our idempotency guarantees. We'll need to start optimizing all of this at some point, but let's do it right, using appropriate benchmarks, and avoid premature optimization.

@jiangmingyang
Contributor

LGTM

spencerkimball added a commit that referenced this pull request Sep 21, 2014
…intents

Push Txns & Resolve Intents on Conflicts
@spencerkimball spencerkimball merged commit efc7718 into master Sep 21, 2014
@spencerkimball spencerkimball deleted the spencerkimball/resolve-write-intents branch September 21, 2014 19:13
tbg referenced this pull request in tbg/cockroach Aug 31, 2016
We have been seeing long startup times which disappear spontaneously. During a
restart of the beta cluster, the following goroutine was observed, which suggests
that we were spending a lot of time GCing replicas on startup.

    engine              ??:0                     _Cfunc_DBIterNext(#324, #323, 0, 0, 0, 0, 0, 0, 0)
    engine              rocksdb.go:1135          (*rocksDBIterator).Next(#235)
    storage             replica_data_iter.go:104 (*ReplicaDataIterator).Next(#316)
    storage             store.go:1748            (*Store).destroyReplicaData(#109, #317, 0, 0)
    storage             store.go:841             (*Store).Start.func2(0x101b, #300, 0x36, 0x40, #301, 0x36, 0x40, #315, 0x3, 0x4, ...)
    storage             store.go:734             IterateRangeDescriptors.func1(#306, 0x40, 0x41, #307, 0x92, 0x92, #341, 0, 0x186c, 0x4000, ...)
    engine              mvcc.go:1593             MVCCIterate(#329, #68, #47, #81, #232, 0x9, 0x10, #233, 0xb, 0x10, ...)
    storage             store.go:738             IterateRangeDescriptors(#330, #196, #47, #81, #195, #179, #110)
    storage             store.go:867             (*Store).Start(#109, #330, #196, #179, #185, 0x1)
    server              node.go:405              (*Node).initStores(#78, #330, #196, #98, 0x1, 0x1, #179, 0, #55)
    server              node.go:330              (*Node).start(#78, #330, #196, #42, #129, #98, 0x1, 0x1, 0, 0, ...)
    server              server.go:431            (*Server).Start(#5, #330, #196, #95, 0x1)
    cli                 start.go:368             runStart(#34, #178, 0, 0x9, 0, 0)
    cobra               command.go:599           (*Command).execute(#34, #177, 0x9, 0x9, #34, #177)
    cobra               command.go:689           (*Command).ExecuteC(#33, #70, #343, #72)
    cobra               command.go:648           (*Command).Execute(#33, #71, #343)
    cli                 cli.go:96                Run(#64, 0xa, 0xa, 0, 0)
    main                main.go:37               main()
soniabhishek pushed a commit to soniabhishek/cockroach that referenced this pull request Feb 15, 2017
…onversion

Changes to get the correct start time and end time for random wait ca…
3 participants