srv_topo_timeout of 1s too aggressive #8010

Closed
aquarapid opened this issue Apr 30, 2021 · 1 comment

@aquarapid
Contributor

A recent change (#7278) introduced a timeout for srv_topo operations, with a default of 1 second. That seems reasonable, but it has had the unintended side effect of causing topo timeouts, especially during vtgate startup (tablet wait/healthcheck), for users whose topology is spread across multiple DCs.

In the interest of sane defaults, I suggest we raise the default to 5 seconds; those who need a lower value can set it explicitly.

For discoverability, here is the type of error you would see:

E0430 16:34:01.359471       1 resilient_server.go:311] GetSrvKeyspaceNames(context.Background.WithDeadline(2021-04-30 16:34:30.359023101 +0000 UTC m=+30.040465073 [28.999518338s]), xxxx4a1) failed: deadline exceeded: /vitess/tst006/global/cells/xxxx4a1/CellInfo (no cached value, caching and returning error)
F0430 16:34:01.359543       1 vtgate.go:157] gateway.WaitForTablets failed: deadline exceeded: /vitess/tst006/global/cells/xxxx4a1/CellInfo
aquarapid added a commit to planetscale/vitess that referenced this issue Apr 30, 2021
@deepthi
Member

deepthi commented Apr 30, 2021

Fixed by #8011

@deepthi deepthi closed this as completed Apr 30, 2021