[EPIC] A reusable fault-injector and resolver #139
Comments
In order to make progress on this EPIC/Project, I want to use the 2022 Summerhackdays. My plan for the hackdays:
Preparation:
Hackdays:
Regarding testing the internal backend, which talks to the Kubernetes API and uses the k8s client: we can use a fake client.
https://medium.com/the-phi/mocking-the-kubernetes-client-in-go-for-unit-testing-ddae65c4302
https://pkg.go.dev/k8s.io/client-go/kubernetes/fake#Clientset.AppsV1
Example:
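package internal // assumption: the package that defines K8Client

import (
	"context"
	"testing"

	"github.com/stretchr/testify/require"
	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes/fake"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/clientcmd"
	"k8s.io/client-go/tools/clientcmd/api"
)

// testClientConfig stubs clientcmd.ClientConfig; the test only needs Namespace().
type testClientConfig struct {
	namespace          string
	namespaceSpecified bool
	err                error
}

func (c *testClientConfig) Namespace() (string, bool, error) {
	return c.namespace, c.namespaceSpecified, c.err
}

func (c *testClientConfig) RawConfig() (api.Config, error) {
	panic("implement me")
}

func (c *testClientConfig) ClientConfig() (*rest.Config, error) {
	panic("implement me")
}

func (c *testClientConfig) ConfigAccess() clientcmd.ConfigAccess {
	panic("implement me")
}

func Test_GetBrokerPodNames(t *testing.T) {
	// given: a fake clientset holding a single pod in the current namespace
	k8Client := K8Client{Clientset: fake.NewSimpleClientset(), ClientConfig: &testClientConfig{namespace: "default"}}
	_, err := k8Client.Clientset.CoreV1().Pods(k8Client.GetCurrentNamespace()).Create(context.TODO(), &v1.Pod{
		Spec: v1.PodSpec{},
	}, metav1.CreateOptions{})
	require.NoError(t, err)

	// when
	names, err := k8Client.GetBrokerPodNames()

	// then
	require.NoError(t, err)
	require.NotNil(t, names)
}

https://www.youtube.com/watch?v=reDCJYbxtRg&ab_channel=CNCF%5BCloudNativeComputingFoundation%5D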
Happy to announce that the EPIC is done and we have released v1.0.0: https://github.com/zeebe-io/zeebe-chaos/releases/tag/zbchaos-v1.0.0
Motivation
Currently we have several shell scripts to execute chaos experiments with chaostoolkit. The scripts work well, but maintaining them is rather hard, especially for people who might not be familiar enough with bash.
This is why we already migrated some of them to a Kotlin-based chaos worker. But we haven't done this for scripts which directly interact with the Kubernetes API. The problem here is that we would need an executable CLI to reference them in the chaostoolkit experiments as well, so that they can be run locally. Furthermore, I would say the interaction with Kubernetes is easier/better in Go.
Solution
We create a new Go CLI with cobra. The CLI can be executed by chaostoolkit locally. Furthermore, we use the Zeebe Go worker API so that we can register on the testbench. We use the Go Kubernetes client to interact with the Kubernetes API, and use retry functionality, as we do in the shell scripts, to make the experiments less flaky.
A side benefit is that we become more familiar with Go and the Go client we provide.
Todo's left:
Inventory
In order to see what is left and missing, here is a table of scripts/functionality and the related mapping in zbchaos:

| Script | zbchaos command | Notes |
|---|---|---|
| apply_net_admin.sh | zbchaos disconnect | |
| await-message-correlation.sh | zbchaos verify steady-state --awaitResult --processModelPath | since we can define the model and await the completion |
| await-processes-with-result.sh | zbchaos verify steady-state --awaitResult | |
| connect-leaders.sh | zbchaos connect brokers | |
| connect-standalone-gateway.sh | zbchaos connect gateway | |
| deploy-different-versions.sh | zbchaos deploy process | |
| deploy-model.sh | zbchaos verify steady-state | |
| corrupt* | | |
| disconnect-leaders-one-way.sh | zbchaos disconnect brokers --one-direction | |
| disconnect-leaders.sh | zbchaos disconnect brokers | |
| disconnect-standalone-gateway.sh | zbchaos disconnect gateway --all | |
| publish-message.sh | zbchaos publish | This command also supports specifying different partitions and different message names. |
| shutdown-gracefully-partition.sh | zbchaos restart | This command allows specifying a broker via nodeId or via partitionId and role. |
| start-instance-on-partition-with-version.sh | zbchaos verify steady-state --version | |
| start-many-instances.sh | | |
| stress-cpu.sh | zbchaos stress gateway/broker --cpu | |
| terminate-partition.sh | zbchaos terminate | This command allows specifying a broker via nodeId or via partitionId and role. |
| terminate-workers.sh | zbchaos terminate worker | |
| util* | | |
| verify-readiness.sh | zbchaos verify readiness | |
| verify-steady-state.sh | zbchaos verify steady-state | |
| zbctl-start-instances.sh | | start-many-instances.sh |
In order to understand which experiments are supported right now with zbchaos and which are missing, I will list them in the following table. Be aware that I will only mention the Production-S experiments, since these are the only experiments that we have automated.
What else is missing:
The current Kotlin worker also does some other things that we need to port before we can remove it completely.
zbchaos to deploy workers which can complete instances.
zbchaos commands, instead of referencing the scripts. This can be done incrementally: Adjust all chaos experiments to use the zbchaos tool #237
zbchaos
Done
Q2 2022 KR A reusable fault-injector and resolver is implemented and used in the Zeebe E2E and chaos tests