Fault injection #310
Comments
Made some progress for 1.1 by adding tests for dynamic membership. Removing the 1.1 label, as we will revisit this issue after 1.1.
We should aim to do more of this in 1.2.
We should also add coverage for the places where we support exponential backoff. Specifically in
The need for this tool is getting stronger, as we are seeing more people pumping a serious amount of data into Pachyderm and bugs are being caught in production. Most of those bugs only manifest when certain requests fail due to transient network failures, which can be simulated by such a tool.
Here's how Kubernetes does it: https://github.com/kubernetes/kubernetes/tree/master/pkg/client/chaosclient. I think this makes sense for us as a first step as well. Basically, the faults are injected on the client side: in our case it will be code under
We have a recent need for testing of this nature: This will be a critical path for a customer's deployment.
Most mature distributed systems projects have some sort of fault injection framework for testing purposes. Some examples:
There are also more general-purpose tools, such as Jepsen and Chaos Monkey, that can be used to inject network faults.
Using tools such as these will help us identify bugs that would otherwise be found in production.