-
Notifications
You must be signed in to change notification settings - Fork 236
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
node_confirms on GET #1750
Comments
There are some peculiarities with a specific customer that make this more relevant than it might seem. For this customer inbound updates are persisted first to a Riak msgStore, and are then queued from the store to be applied, and periodically re-queued until the apply processing loop succeeds. Success is determined by: a) either making the update; The desire with adding node_confirms to GET requests is to make part (b) stricter. We want this to process to work (safely) even when multiple nodes have temporary failure, but not when a rogue node has been added to a load balanced group due to admin error, and always be resilient to data loss when any single node suffers permanent failure and loss of persisted storage. |
No changes yet to support option on specific request, just respecting of bucket property. See #1750 Tested https://github.com/basho/riak_test/blob/mas-i1750-nodeconfirms/tests/node_confirms_vs_pw.erl
The node_confirms bucket property is respected on PUT. A PUT may succeed (in that it has been written to Riak) but will error when insufficient vnodes have responded from different nodes. This allows us to be sure that the PUT will not disappear on a single node failure.
More information - https://github.com/basho/riak_kv/blob/develop-2.9/docs/Node-Diversity.md
If a PUT is made and errors due to node_confirms. After the PUT, a GET can be made and succeed - but at that stage, the object may still be on a single node. How does the application know if it has safely stored the data on two nodes - especially as this may now be a different process reading the object, and so not be aware of the potentially failed PUT i.e. re-PUT until a success response is not an option.
It would be useful to have a GET request, where success indicates that even on a single node failure, it can be safely assumed that the read change will not be lost from the history.
PR=2 can be used as an alternative, but this will fail in lots of cases when node diversity does exist. The other issue with PR=2 and PW=2 is when a node is started, but not joined to the cluster, and perhaps due to admin error is introduced into a load-balanced group. Now PR=2, and PW=2 will succeed, but we haven't actually ensured the data is stored in more than one place.
It is possible to add node_confirms to GET. However, this would not naturally confirm that the PUT has reached two nodes - counting parameter on GETs count the number of nodes consulted, but the answer may still only come from one node. However, if node_confirms=2 on GET, on success the application would know that:
The cluster using node_confirms = 2 on GET and PUT would still have a degree of partition tolerance. however, some preflists on some partitions (especially minority partitions) may not have node diversity and would error.
The text was updated successfully, but these errors were encountered: