Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

write durability: always commit a write to both kube and spicedb, or neither #16

Merged
merged 2 commits into from
Sep 6, 2023

Conversation

ecordell
Copy link
Collaborator

@ecordell ecordell commented Aug 28, 2023

Closes #3

This adds a durable saga that writes to spicedb and kube, with the goal of ensuring that a write happens in both, or neither, but not just one or the other.

There are two methods of writing implemented: a pessimistic lock that prevents other requests from attempting to create same object at the same time, and an optimistic lock that detects when there are conflicts and rolls back or forward as needed.

Pessimistic outline:

  • A create namespace foo call comes in from user:evan
  • Compute a workflow hash i.e. xxhash(create, namespace, foo)
  • Write to spicedb:
    • TOUCH workflow:xxhash(create,namespace,foo)#id@workflow_id:caca56e8-388b-46ca-bf2a-7fe325defe68
    • TOUCH namespace:foo#creator@user:evan
    • with a precondition:
      • operation: OPERATION_MUST_NOT_MATCH
      • filter: `workflow:xxhash(create,namespace,foo)#id@workflow_id:*
  • If the SpiceDB write fails, fail the workflow (return error to user if not async)
  • In a loop:
    • Attempt to write to kube
    • If write succeeds, remove workflow lock tuple, return success to user
    • If kube resp is IsAlreadyExists, remove workflow lock tuple, return success to user (lock tuple ensures no one else did this write - assuming all traffic is going through the proxy)
    • If kube resp is any other error, return error to user, remove both tuples, return error to user
    • If there is some other error where the kube response can't be retrieved, continue the loop

Optimistic outline:

  • A create namespace foo call comes in from user:evan
  • Write record to SpiceDB
    • If SpiceDB write fails, fail the request. The client can retry / fix the error, and no data has been written to either.
    • If the SpiceDB write succeeds, but the proxy sees the step as failed (i.e. because the process failed), the write is rolled back, and an error is returned to the user to try again.
  • Write record to Kube
    • If Kube write fails:
      • Check to see if object already exists in Kube:
        • If so, work is done.
        • If object does not exist, revert the SpiceDB write.

There are pros and cons to each approach, for now both are supported and we can configure them per request type or per instance of the proxy.

The durability of this function means that inputs, outputs, and progress state are stored in a sqlite database. The goal is to be robust to service failures (SpiceDB and Kube API) and process failures (network dies, process crashes and restarts).

The tests make use of failpoints to inject faults at specific places, and then verify that either both writes effectively happened, or neither did.

This initial implementation just deals with namespace objects but should be fairly straightforward to make generic for other types. I'm assuming we'll spend time on that in #6.

@github-actions github-actions bot added area/dependencies Affects dependencies area/tooling Affects the dev or user toolchain area/core labels Aug 28, 2023
this adds a durable saga that writes to spicedb and kube.

if the kube write fails, the spicedb write is reverted.

if the program crashes, the process picks up where it left off when it
restarts by reading state from a sqlite db.
magefiles/test.go Outdated Show resolved Hide resolved
pkg/proxy/durable.go Outdated Show resolved Hide resolved
pkg/proxy/server.go Outdated Show resolved Hide resolved
e2e/proxy_test.go Outdated Show resolved Hide resolved

// paul creates chani's namespace
Expect(failpoint.Disable("github.com/authzed/spicedb-kubeapi-proxy/pkg/proxy/panicKubeWrite")).To(Succeed())
_, err = paulClient.CoreV1().Namespaces().Create(ctx, &corev1.Namespace{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the task has been made durable, wouldn't it be restarted where it left off before crashing? If so, I wouldn't expect this call to succeed, because after disabling panicKubeWrite, chani's Create call should have eventually succeeded after process restart.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens in this test is that task runner catches panics and returns records it as a failure of the WriteToKube activity. So the workflow continues, and because it errored, runs CheckKube and finds that the record doesn't exist, and rolls back the write.

I think you're right that that's what would happen if we crashed the whole process, and maybe we should work on a test harness to allow us to actually test that. Or maybe we can contribute something to durabletask to disable panic recovery for tests?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would explain it, but it seems weird to have various ways to recover from an error:

  • one that handles scenario where the process does not crash
  • one that handles scenario where the process crashes

Plus there is a record of the task being failed, so why would the workflow ignore it? Or are we implicitly telling it that "we are ok" because we run CheckKube task?

pkg/proxy/durable.go Outdated Show resolved Hide resolved
pkg/proxy/authz.go Show resolved Hide resolved
pkg/proxy/authz.go Outdated Show resolved Hide resolved
pkg/proxy/authz.go Show resolved Hide resolved
pkg/proxy/authz.go Show resolved Hide resolved
This was referenced Aug 30, 2023
@ecordell ecordell changed the title write to both kube and spicedb write durability: always commit a write to both kube and spicedb, or neither Aug 31, 2023
@ecordell ecordell force-pushed the durable-writes branch 4 times, most recently from e9c8ce0 to db94273 Compare September 1, 2023 21:52
pkg/proxy/server.go Outdated Show resolved Hide resolved
pkg/proxy/server.go Show resolved Hide resolved
magefiles/test.go Show resolved Hide resolved
pkg/failpoints/failpoints_on.go Outdated Show resolved Hide resolved
pkg/proxy/durable_activities.go Outdated Show resolved Hide resolved
e2e/proxy_test.go Outdated Show resolved Hide resolved
e2e/proxy_test.go Show resolved Hide resolved
err := CreateNamespace(ctx, chaniClient, chaniNamespace)
Expect(err).ToNot(BeNil())
// pessimistic locking reports a conflict, optimistic locking reports already exists
Expect(k8serrors.IsConflict(err) || k8serrors.IsAlreadyExists(err)).To(BeTrue())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why reporting different errors? this would be breaking the contract depending on which implementation is used - I think both implementations should yield the same result.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Different operations are happening in kube and spicedb in the two cases, I'm mostly just passing the errors back and not trying to obfuscate. I can look into normalizing them.

e2e/proxy_test.go Show resolved Hide resolved
e2e/proxy_test.go Outdated Show resolved Hide resolved
@vroldanbet vroldanbet mentioned this pull request Sep 4, 2023
this also removes the failpoint library in favor of a quick local
version that doesn't require transforming the codebase.
@ecordell ecordell merged commit 323ad85 into main Sep 6, 2023
@github-actions github-actions bot locked and limited conversation to collaborators Sep 6, 2023
@vroldanbet vroldanbet deleted the durable-writes branch September 6, 2023 10:57
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area/core area/dependencies Affects dependencies area/tooling Affects the dev or user toolchain
Projects
None yet
Development

Successfully merging this pull request may close these issues.

proper dual writes: ensure a write happens in both spicedb and kube, or neither
2 participants