Examples, Flakes: Wait for Shard's VReplication Engine to Open #12560
Signed-off-by: Matt Lord <[email protected]>
if vtctlclient --server=localhost:15999 Workflow -- "${keyspace}" listall &>/dev/null; then
    break
fi
Does listall here exit with 1, or some other non-zero value, if the list is not valid?
I'm not sure what you mean by "the list is not valid", but:
$ vtctlclient Workflow -- commerce listall; echo $?
Workflow Error: rpc error: code = Unknown desc = no primary found for shard commerce/0
1
$ vtctlclient Workflow -- commerce listall; echo $?
Workflow Error: rpc error: code = Unknown desc = TabletManager.VReplicationExec on zone1-0000000100 error: vreplication engine is closed: vreplication engine is closed
1
$ vtctlclient Workflow -- foobar listall; echo $?
Workflow Error: rpc error: code = Unknown desc = node doesn't exist: /vitess/global/keyspaces/foobar/shards/
1
$ vtctlclient Workflow -- commerce listall; echo $?
No workflows found in keyspace commerce
0
lgtm
…sio#12560) * Examples: wait for shard's vreplication engine to open Signed-off-by: Matt Lord <[email protected]> * Minor comment changes Signed-off-by: Matt Lord <[email protected]> --------- Signed-off-by: Matt Lord <[email protected]>
… (#12581) * Examples: wait for shard's vreplication engine to open * Minor comment changes --------- Signed-off-by: Matt Lord <[email protected]>
Description
The local examples CI workflow is still flaky, failing ~35% of the time, and it always fails here (or in the same way in later VReplication steps):
You can see an example here (see any of the previous 4 failed runs): https://github.com/vitessio/vitess/actions/runs/4346133011
The CI test specifically waits for the shard to be able to successfully serve a query through the vtgate, but for some reason that doesn't always work as expected in the CI. I have not been able to reproduce it locally, so the cause may be a bash version difference or some other missing context variable, or perhaps the vtgate is able to serve queries while the VReplication engine takes much longer to open (GitHub runners being very slow at times is a common factor in our tests). Given that users can also run into this when running the steps as a sequence:
./101...; ./201...; ./202...; ./203...
I added a VReplication engine check to the wait_for_healthy_shard library function.
With the changes in this PR, the workflow ran successfully 10 times in a row in the CI. Each workflow run contains 3 subtests (1 each for the consul, etcd, and k8s topos), so the base test really passed 30 times in a row: https://github.com/vitessio/vitess/actions/runs/4347322200
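For readers who want to see the shape of such a check, here is a minimal sketch of a retry loop built around the Workflow listall command quoted in the review thread. It is not the code merged in this PR. The helper names retry_until_success and wait_for_vreplication_engine, the VREPL_WAIT_ATTEMPTS and VREPL_WAIT_DELAY variables, and their defaults are all assumptions made for illustration. The sketch relies only on the behaviour shown in the thread: listall exits 0 once the engine is open (even with no workflows) and exits 1 otherwise.

```shell
# Hypothetical helper (not from the PR): run a command repeatedly until it
# exits 0, giving up after a fixed number of attempts.
#   $1 = maximum attempts, $2 = seconds to sleep between attempts,
#   remaining arguments = the command to run.
retry_until_success() {
  local attempts="$1"
  local delay="$2"
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # Discard output; only the exit status matters here.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Hypothetical wrapper: wait until the shard's VReplication engine answers
# the listall command. The server address is the one used in the PR snippet;
# the attempt count and delay defaults are assumptions.
wait_for_vreplication_engine() {
  local keyspace="$1"
  retry_until_success "${VREPL_WAIT_ATTEMPTS:-60}" "${VREPL_WAIT_DELAY:-1}" \
    vtctlclient --server=localhost:15999 Workflow -- "${keyspace}" listall
}
```

A caller might run `wait_for_vreplication_engine commerce || exit 1` before starting a VReplication step. Bounding the number of attempts keeps a permanently closed engine from hanging the script.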
Related Issue(s)
Checklist
Deployment Notes